A weekly challenge to help you learn to prepare data and use Tableau Prep
2021: Week 28 - Solution
Solution by Tom Prowse and you can download the workflow here.
The challenge this week was to look at all of the results from penalty shootouts in the Football World Cup and European Championships and then analyse how good or bad the teams have been.
Step 1 - Input Data
The first task is to input both tables from the Excel document: one for the World Cup and one for the Euros.
These could be input via the wildcard union feature, but the field names don't quite match up, so I've brought both tables in separately and then used a union step.
As you can see, some of the fields need to be merged. We can do this within the union step by selecting a field and then clicking the + on the second field that you want to merge into it:
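For anyone following along outside Tableau Prep, the union-with-merged-fields idea can be sketched roughly as below. This is only an illustration: the column names, team names and rows are made up, since the real headers are not listed in this post.

```python
# Hypothetical rows: these column names are assumptions, not the real headers.
world_cup = [{"Winner": "Team A", "Loser": "Team B", "Event Year": 2006}]
euros = [{"Winning Team": "Team A", "Losing Team": "Team C", "Event Year": 2021}]

# Map each mismatched header onto the field it should be merged into.
RENAMES = {"Winning Team": "Winner", "Losing Team": "Loser"}


def union(tables):
    """Stack the tables, merging renamed fields and tagging each row with its source."""
    rows = []
    for table_name, table in tables.items():
        for row in table:
            merged = {RENAMES.get(key, key): value for key, value in row.items()}
            merged["Table Names"] = table_name  # mirrors the field a union step adds
            rows.append(merged)
    return rows


combined = union({"World Cup": world_cup, "Euros": euros})
```

After the union every row uses the same field names, which is the effect the + merge has in the union step.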
Step 2 - Clean Fields
The next step is to do some general cleaning of the competition, dates and teams. The following changes have been made:
Competition
To parse the competition from the Table Names field (this is created by the union), we can use the split functionality to return everything after the last '/'.
This removes all of the table information and leaves us with the competition name.
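As a rough Python equivalent of that split, assuming the Table Names value looks something like a path (the example value below is invented for illustration):

```python
def competition_from_table_name(table_name: str) -> str:
    """Return everything after the last '/' in the table name."""
    # rpartition splits on the LAST '/'; if there is no '/', the whole string is returned.
    return table_name.rpartition("/")[2].strip()


# Hypothetical table name, not taken from the real workbook.
competition = competition_from_table_name("some/table/World Cup")
```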
Dates
To calculate the correct date of each match we need to combine the Date and the Event Year fields. Currently, every date row has a year of 2021, which is wrong for most of the matches. We can solve this with a calculation that keeps the day and month from Date and takes the year from Event Year.
This will correctly format our date with the correct field type and year value.
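The same idea can be sketched in Python. This is not the calculation used in the workflow, just one possible way to swap the year, and the example values are made up:

```python
from datetime import date


def fix_match_date(parsed: date, event_year: int) -> date:
    """Keep the day and month from the parsed date, take the year from Event Year."""
    return parsed.replace(year=event_year)


# Hypothetical values: a date that was read in with 2021 as its year.
fixed = fix_match_date(date(2021, 7, 3), 1990)
```

Because 2021 is not a leap year, no parsed date can be 29 February, so the year swap should not produce an invalid date here.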
Teams
Some of the rows in our Team fields (winners & losers) have additional leading and trailing spaces that we don't need. Using the Clean functionality within Tableau Prep we can easily remove these so that we have a nice clean list of teams.
After this cleaning our table now looks like this:
Step 3 - Combine Winners & Losers
The next step is to combine the penalty takers for both the winners and the losers into a single column. You could do this via a pivot (see other examples from posts on social media) or you could split into two branches, one for winners and one for losers.
I have gone for the split into branches approach, so we now have two aggregate steps, one for Winners and one for Losers.
The two branches are identical apart from switching the Winner and Loser fields.
Within the aggregation tool we want to return a single row for each of the Winning team's takers, therefore we just aggregate the following fields:
Then we can exclude any null values and create a 'Winner' string to show which branch the rows are from.
The same process can then be repeated for the Losers branch. Then we can union the results together to get a table with each of our takers in a single column:
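A minimal sketch of the two-branch approach in Python is shown below. The field names and rows are hypothetical, because the exact fields used in the aggregate steps are not listed in the text.

```python
# Hypothetical rows and column names, for illustration only.
shootout_rows = [
    {"Match": 1, "Penalty Number": 1, "Winner": "Team A", "Loser": "Team B",
     "Winning Team Taker": "Player 1 scored", "Losing Team Taker": "Player 2 scored"},
    {"Match": 1, "Penalty Number": 2, "Winner": "Team A", "Loser": "Team B",
     "Winning Team Taker": "Player 3 missed", "Losing Team Taker": None},
]


def branch(rows, team_field, taker_field, label):
    """One branch: keep one side's takers, drop nulls and tag the rows."""
    return [
        {"Match": r["Match"], "Penalty Number": r["Penalty Number"],
         "Team": r[team_field], "Penalty Taker": r[taker_field], "Result": label}
        for r in rows
        if r[taker_field] is not None
    ]


# Union of the two branches: every taker now sits in one Penalty Taker column.
takers = (branch(shootout_rows, "Winner", "Winning Team Taker", "Winner")
          + branch(shootout_rows, "Loser", "Losing Team Taker", "Loser"))
```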
Step 4 - Combine German Teams
At this stage we can group the West Germany & Germany teams by using a manual grouping feature. This can be done by selecting both of the teams and grouping them.
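Outside Tableau Prep, a manual grouping like this is usually just a lookup table. A small sketch, where `TEAM_GROUPS` and `group_team` are names invented for this example:

```python
# Map each member of the group onto the name the group should take.
TEAM_GROUPS = {"West Germany": "Germany"}


def group_team(team: str) -> str:
    """Return the grouped team name, leaving ungrouped teams unchanged."""
    return TEAM_GROUPS.get(team, team)


grouped = [group_team(t) for t in ["West Germany", "Germany", "Italy"]]
```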
Step 5 - Win %
We are now in a position to start answering the questions from the requirements. First, we can calculate the win % for each of the teams.
To do this we want to aggregate how many times a team has won or lost a shootout by using the following aggregation tool setup:
From here we can then calculate the total number of shootouts each team has been in by using a Fixed LOD calculation:
Then keep only the winning teams and calculate the Win %:
Round(([Shootouts]/[Total Shootouts])*100,0)
And finally we can rank these from highest to lowest:
We then have our first output:
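To make the Step 5 logic concrete, here is a rough Python sketch of the same sequence: count shootouts per team and result, total them per team (the role of the Fixed LOD), keep the winning rows, then rank. The teams and results are made up, and the ranking shown gives tied teams the same rank, which is an assumption about how ties should be handled.

```python
from collections import Counter

# Hypothetical (team, result) pairs, one per team per shootout.
results = [
    ("Team A", "Winner"), ("Team A", "Winner"), ("Team A", "Loser"),
    ("Team B", "Winner"), ("Team B", "Loser"),
    ("Team C", "Winner"),
    ("Team D", "Loser"),
]

shootouts = Counter(results)                            # per team and result
total_shootouts = Counter(team for team, _ in results)  # per team, like the Fixed LOD

# Keep only teams with at least one win, then apply the Win % formula.
win_pct = {
    team: round(shootouts[(team, "Winner")] / total_shootouts[team] * 100)
    for team in total_shootouts
    if shootouts[(team, "Winner")] > 0
}

# Rank from highest to lowest; tied percentages share the same rank.
ordered = sorted(win_pct.values(), reverse=True)
win_rank = {team: ordered.index(pct) + 1 for team, pct in win_pct.items()}
```

One thing to watch if you compare against the Tableau Prep output: Python's `round` sends exact halves to the nearest even number, so a value such as 62.5 may round differently from Tableau's ROUND.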
Step 6 - Scored %
Next we follow a similar process, but this time we want to calculate the scored % for each team.
This time we need to calculate whether a penalty was scored or missed. This can be found in the Penalty Taker field with the following calculations:
Penalties Scored
IF CONTAINS([Penalty Taker],'scored') THEN 1
ELSE 0
END
Penalties Missed
IF CONTAINS([Penalty Taker],'missed') THEN 1
ELSE 0
END
From here we want to calculate the total penalties scored & missed by each team:
Then calculate the Scored %:
ROUND(
[Penalties Scored]
/
([Penalties Missed]+[Penalties Scored])*100,0)
And rank the %s:
We then have output number 2:
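The Step 6 flags and Scored % can be sketched in Python as follows. The rows are invented, and the sketch assumes the Penalty Taker text contains the word 'scored' or 'missed' in lower case, as the calculations above imply.

```python
# Hypothetical (team, penalty taker text) rows.
taker_rows = [
    ("Team A", "Player 1 scored"),
    ("Team A", "Player 2 missed"),
    ("Team A", "Player 3 scored"),
    ("Team B", "Player 4 scored"),
]

# Total the scored and missed flags for each team.
totals = {}
for team, taker in taker_rows:
    team_total = totals.setdefault(team, {"scored": 0, "missed": 0})
    team_total["scored"] += 1 if "scored" in taker else 0
    team_total["missed"] += 1 if "missed" in taker else 0

# Scored % = scored / (missed + scored) * 100, rounded to a whole number.
scored_pct = {
    team: round(t["scored"] / (t["missed"] + t["scored"]) * 100)
    for team, t in totals.items()
    if t["missed"] + t["scored"] > 0  # guard against dividing by zero
}
```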
Step 7 - Penalty Position
The last part of the challenge is to see what the most successful penalty position number is. We can create another branch from where we calculated whether a penalty was scored or missed and total this for each penalty position number:
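As a rough sketch of that final branch, the totals per penalty position could be built like this in Python. The rows are made up and the name `by_position` is just for this example.

```python
from collections import defaultdict

# Hypothetical (penalty position number, penalty taker text) rows.
penalties = [
    (1, "Player 1 scored"),
    (1, "Player 2 missed"),
    (2, "Player 3 scored"),
    (2, "Player 4 scored"),
]

# Total scored and missed penalties for each position number.
by_position = defaultdict(lambda: {"scored": 0, "missed": 0})
for position, taker in penalties:
    by_position[position]["scored"] += int("scored" in taker)
    by_position[position]["missed"] += int("missed" in taker)
```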
You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help!