2023: Week 39 - Solution


Solution by Tom Prowse and you can download the workflow here.



The challenge this week is looking at Andy Murray's 200 grand slam match wins and preparing the data so that we can visualise in a Tableau dashboard.

Step 1 - Input Tables

Within our input we have a file containing details of the matches between 2000 & 2023. For this we can use the union multiple tables option within the input step where we want to include all file names that start with 20


We can then remove fields that aren't required so that the table looks like this:



Step 2 - Matches & Players of Interest

First we want to filter so that we are only looking at Grand Slam matches. Using the Series field we can select 'Grand Slam' and keep only this value. 

We can then create a new step and then create another filter where we only keep the 4 players from the requirements in the winner field.


We can then create a single row for each of these 4 players by using an aggregate step where we group by Winner: 


Step 3 - Player Wins & Losses

Now we have a list of the 4 players that we want to compare, we can now join this list back with our list of grand slam matches. 

We can join the Players of Interest using an inner join on the Winner field. 



Then we can create a new string field which determines that these are the matches that the players have won.

Next, we can give each of the wins an order so that we can identify the first 200 wins for each player. To rank the wins we can use an analytical calculation to group by Player of Interest then rank the Date in an ascending order. 

Win Number


Then we can filter the Win Number with the range from 0-200


At this stage our table looks like this: 



Step 4 - Match Losses

Now we have the first 200 wins we can fill in the missing games which resulted in a loss. To identify these games we need to find the date that the 200th win was won, then use this to return all games that are before this date.

We can identify the 200th win using an aggregation where we find the max date for each player: 



Then we can join back to our original game list where Loser = Player of Interest and Date <= 200th win. 


Then we can create a Loss string field and our table should look like this: 



Step 5 - Win & Loss

Now we have identified the wins and losses we can bring these together into a single table. Before doing this we first need to calculate the time between each of the wins.

From the win branch we can create a new step, before the aggregation, and then calculate the previous win date for each row.

Previous Win Date 
{ PARTITION [Player of Interest] : { ORDERBY [Win Number] asc : LOOKUP([Date],-1)}}

We can then calculate the days between each of these dates: 

Days Since Last Win
DATEDIFF('day',[Previous Win Date],[Date])

After we've calculated the time between each of the wins we can then bring the table together with the losses using a union step. This will 'stack' each of the tables together into a single table that looks like this: 


You can download the output from here.

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text