2024: Week 29 - Formula 1 Qualifying Special
Challenge by: Dan Wade
Dan is part of Data School cohort DS42 and has recently set himself a challenge to create 24 Formula 1 vizzes, one for each race week of the season. Naturally, this means he's had to do a lot of data preparation so he's been very generous to also prepare a challenge for our Sports themed July. Over to Dan to explain the challenge.
In Formula 1, each driver participates in a qualifying session which sets the starting grid for the race. A good qualifying result can give a driver a better starting position, which can be crucial for a strong race performance and being in front reduces the risk of being involved in accidents.
Qualifying is made up of three rounds, Q1, Q2 and Q3. It begins with Q1, where all 20 drivers compete for 18 minutes; the five slowest are eliminated and placed in positions 16-20. Q2 follows, lasting 15 minutes, with the remaining 15 drivers vying for the fastest times; the five slowest are then eliminated and placed in positions 11-15. Finally, in Q3, the top 10 drivers from Q2 compete for 12 minutes to set the fastest lap times, with the quickest driver earning pole position and the others filling positions 2-10.
The data set contains every single lap-time recorded in the session including all warm-ups and cool down laps. Your task is to arrange this data to produce an output of
the final qualifying result.
Tip: You can identify when a session ends and another starts by observing a larger gap in start times
Inputs
Requirements
- Input the data
- Clean the date_start field to make it useable as a date
- Identify Q1 times
- Identify the earliest date_start of the session (This is shortly after Q1 began)
- Calculate when Q1 ended using the guidelines above
- Allow an extra 2 minutes to catch all drivers
- Filter to keep only Q1 times
- Identify the quickest lap time per driver
- Rank the drivers AND filter to keep only the 5 slowest (P16-P20)
- Identify Q2 times
- Filter out Q1 times
- Identify the earliest date_start of the session
- Calculate when Q2 ended using the guidelines above
- Allow an extra 2 minutes to catch all drivers
- Use a calculated field to identify Q2 times and Q3 times
- Filter to keep only Q2 times
- Identify the quickest lap time per driver
- Rank the drivers AND filter the 5 slowest (P11-P15)
- Identify Q3 times
- From the calculated field where you identified Q3 times, filter out the Q2 times
- Identify the quickest lap time per driver
- Rank the drivers (P1-P10)
- Union Q1, Q2 and Q3 together
- Join to your driver data
- Sort to get the correct order for the output
- Output the data
Output
- 6 fields
- Position
- driver_number
- driver_code
- driver_name
- constructor_sponsor_name
- lap_duration
- 20 rows (21 including headers)