2024: Week 41 - Solution



Solution by Tom Prowse and you can download the workflow here


Step 1 - Input All Years

The first step is to input all of the sheets for each years worth of data. We can do this using the 'union multiple tables' functionality within the input step where we want to include all sheets with the matching pattern '20*': 


We can then merge some of the columns together so that we have a complete table with no mismatched fields. The fields we need to merge are: 

  • Total & Total Earnings
  • Salary/Winnings & Salary/winnings
  • Country, Nationality, & Nation
After merging the table should look like this: 




Step 2 - Create Years and Monetary Amounts

Next we want to rename the Table Names field to Year and make sure it's a number field.

Then we can turn to the monetary amounts by pivoting the Salary/Winnings, Total Earnings, and Endorsements fields using a Columns to Rows pivot: 


This allows us to then remove the following in these fields: 

Remove $ sign
REPLACE([Pivot1 Values],'$','')

Remove *
REPLACE([Remove $ sign],'*','')

Then we can split the numbers and text by using a split on the ' ' (space): 



Then we can rename the split fields to Numeric & Units then remove any of the other fields that we just created. 

Finally we can calculate the value based on the number and text with this calculation: 

Values 
IF [Units]='million'
THEN [Numeric]*POWER(10,6)
ELSE [Numeric]
END

We're now ready to pivot the data back 'the other way' so that we have a column for each of the metrics with the values: 



At this stage the table should look like this:




Step 3 - Join Source

Finally we want to make sure that the source of the data is included on each of the rows. For this we can use a cross-join to join the source data row to every row of our data source. To set up the join correctly we need a field that matches on both sides, therefore we can create a dummy field to join on. In our case we have used a number 1.

We can then use an inner join on this dummy field from both sides within our join: 


Finally we can remove the dummy fields and we are ready to output our table: 



You can view the output here.

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2023: Week 1 The Data Source Bank

2024: Week 1 - Prep Air's Flow Card

How to...Handle Free Text