2021: Week 29 - Solution

 


Solution by Tom Prowse and you can download the workflow here.


This week we joined up with Workout Wednesday crew to create a joint challenge that helped us to visualise the Tokyo 2020 Olympic event calendar. The first part of the challenge was to make sure that the data is structured in the correct way so that it makes it easier when we come to visualise at a later stage. 

Step 1 - Format Date

The first step is to correctly format our date time field by combining the Date and Time field. To do this we want to use the MakeDateTime function which will allow us to bring together both the Date and Time fields into a single field. 

The date half of this calculation is easy, all we need to do is change the data type to a Date and this will automatically be picked up for us. 

The time half is a little bit trickier. First we have some time fields that are 'xx', we need to replace these with a number in the same format as the other rows, eg '0:00'. Next, we want to split the time into hours and mins, we can do this by splitting on the ':' delimiter.

Once we have changed the newly split Hour & Min fields into a number, we can use the following calculation to create our Date Time: 

UK Date Time 
MAKEDATETIME([Date],MAKETIME([Hour],[Mins],00)) 

Our data should now look something like this:


Step 2 - Parse Events

The next step is to break apart each of the events so that they are on a separate row. Before bringing them into a separate row, we want to use the split functionality to split all of the events into separate fields. They are separated by a ',' so we can use a custom split to split all values: 


Now we have all the events into separate fields, we can remove the original event field and then use a pivot step to bring these into a single field. 

The pivot is a Columns to Rows pivot where we can use the wildcard pivot to bring in all fields with the word 'Split': 


After the pivot there are  a few null values, in the event split field, that can be excluded and the pivot names field can also be removed. The table should look something like this:


Step 3 - Group Sports

The next step is to group our sports so they have a general group. We can do this using Tableau Prep's grouping features. First, you'll want to duplicate the Sport field and rename to Sport Group, then use the group by Common Characters and then manually group the remaining outliers. 


You should have 34 Sport Groups and 48 Sports.

Step 4 - Combine Venues

Next we want to combine the Venue's data set so that we have the location for each of the events. Before joining the Venue table, we first want to split to location field into a latitude and longitude field by using a split on a ',' so we have two fields. 

We can then join the tables together using the Sport & Venue fields from both tables: 


As a result of the join our table should look like this: 



Step 5 - Medal Ceremony

The final task of this week's challenge is to find out if there is going to be a medal ceremony or not. This is quite hard to do as some sports don't identify whether it's a medal event or not, however we can assume that any event that contains 'Gold Medal' or 'Victory Ceremony' should be included.

The calculation that we use to identify these are: 

Medal Ceremony? 
CONTAINS([Events Split],'Victory Ceremony') 
OR 
CONTAINS([Events Split],'Gold Medal')

After this calculation we are ready to output our data which should look like this: 


The full output can be downloaded here

Once you have completed the challenge head over to the Workout Wednesday challenges (Tableau or PowerBI) so that you can use this data set to create & visualise your Japan 2020 Olympics event calendar!  

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text