2024: Week 31 - Olympics Special

Challenge by: Zak Saucede

For the final challenge in Preppin' Data's sports themed month we're turning our attention towards the Olympics. More specifically, we're looking at the Women's Heptathlon, using data from the World Athletics Championship in August 2023.  

Zak, a Data School Consultant from DS39, recently created the viz below, after realising that despite ranking in the Top 3 for 5 out of the 7 events, it wasn't enough for Anna Hall to win gold. Therefore, he wanted to do further analysis to see if it's better to be a specialist in a few events, or a generalist across all the events.

Click here to interact with the viz

Inputs

1. Results Table - at a first glance, our data looks very clean, and analysis ready - but 3 data points in a single row is going to require some attention, as well as the position being in brackets for each event

2. Event Lookup Table - Let's add in some additional detail about the events, such as making them more human readable 

Requirements

  • Input the data
  • Rename the fields in the results table
    • POS - Position
    • ATHLETE - Athlete
    • NAT - Nationality
    • POINTS - Total Points
  • There's a duplicate field for the 800m Run event, remove this
  • Fill down the values in the first 4 fields so no nulls remain (help)
  • Remove the letters from the Total Points field
  • Pivot the data so there is a row for each event
  • Reshape the data so there is a field for:
    • Event Score
    • Event Time/Distance
    • Rolling Total
  • Join on the Event Lookup Table
  • Split off the Position information that is contained within brackets in both the Event Time/Distance and Rolling Total fields
    • Rename these to Event Position and Position After Event
  • Sort the rows in Position and Event No. order
  • Output the data

Output


  • 13 fields
    • Position
    • Athlete
    • Nationality
    • Total Points
    • Event Name
    • Event Type
    • Event No.
    • Event Time/Distance
    • Event Points
    • Event Position
    • Rolling Total Points
    • Position After Event
  • 126 rows (127 including headers)

You can view the output here.

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2023: Week 1 The Data Source Bank

2024: Week 1 - Prep Air's Flow Card

How to...Handle Free Text