2024: Week 29 - Formula 1 Qualifying Special

Challenge by: Dan Wade

Dan is part of Data School cohort DS42 and has recently set himself a challenge to create 24 Formula 1 vizzes, one for each race week of the season. Naturally, this means he's had to do a lot of data preparation so he's been very generous to also prepare a challenge for our Sports themed July. Over to Dan to explain the challenge.


In Formula 1, each driver participates in a qualifying session which sets the starting grid for the race. A good qualifying result can give a driver a better starting position, which can be crucial for a strong race performance and being in front reduces the risk of being involved in accidents.

Qualifying is made up of three rounds, Q1, Q2 and Q3. It begins with Q1, where all 20 drivers compete for 18 minutes; the five slowest are eliminated and placed in positions 16-20. Q2 follows, lasting 15 minutes, with the remaining 15 drivers vying for the fastest times; the five slowest are then eliminated and placed in positions 11-15. Finally, in Q3, the top 10 drivers from Q2 compete for 12 minutes to set the fastest lap times, with the quickest driver earning pole position and the others filling positions 2-10.

The data set contains every single lap-time recorded in the session including all warm-ups and cool down laps. Your task is to arrange this data to produce an output of
the final qualifying result.

Tip: You can identify when a session ends and another starts by observing a larger gap in start times

Inputs

The data this week comes from the OpenF1 API.
  1. Qualifying Lap Times 
  2. Drivers and Constructor Details 

Requirements

  • Input the data
  • Clean the date_start field to make it useable as a date
  • Identify Q1 times
    • Identify the earliest date_start of the session (This is shortly after Q1 began)
    • Calculate when Q1 ended using the guidelines above
      • Allow an extra 2 minutes to catch all drivers
    • Filter to keep only Q1 times
      • Identify the quickest lap time per driver
      • Rank the drivers AND filter to keep only the 5 slowest (P16-P20)
  • Identify Q2 times
    • Filter out Q1 times
    • Identify the earliest date_start of the session
    • Calculate when Q2 ended using the guidelines above 
      • Allow an extra 2 minutes to catch all drivers
    • Use a calculated field to identify Q2 times and Q3 times
    • Filter to keep only Q2 times
      • Identify the quickest lap time per driver
      • Rank the drivers AND filter the 5 slowest (P11-P15)
  • Identify Q3 times
    • From the calculated field where you identified Q3 times, filter out the Q2 times
    • Identify the quickest lap time per driver
    • Rank the drivers (P1-P10)
  • Union Q1, Q2 and Q3 together
  • Join to your driver data
  • Sort to get the correct order for the output
  • Output the data

Output

  • 6 fields
    • Position
    • driver_number
    • driver_code
    • driver_name
    • constructor_sponsor_name
    • lap_duration
  • 20 rows (21 including headers)
You can view the outputs here.

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2024: Week 1 - Prep Air's Flow Card

How to...Handle Free Text

2023: Week 1 The Data Source Bank