2020: Week 47
Challenge by: Jenny Martin
Prep Air want to do some analysis of flight delays to and from its key destinations. After many discussions with the airport, they finally agreed to share this data. However, it's not in the best structure, so we'll definitely need to do some prep before our analysis can begin. (It's almost like they're afraid of what we'll find!)
A special thank you to Michael this week for sharing a similarly structured dataset with us that sparked the idea for this challenge!
Inputs
We have 2 inputs this week:
Information on the delayed flights, separated across multiple lines
Aggregated view of flights which were not delayed
Requirements
- Input the data
- Aggregate the data so that you have 1 row per flight delay, instead of the current 3 rows
- Make sure all Airport codes are valid. Group those which are not.
- Calculate the total delay and number of delayed flights for each Airport, for each journey type
- Combine with information on flights which were not delayed
- Calculate the average delay and % of flights which were delayed for each Airport, for each journey type
- Output the data
Output
- 4 fields
- Airport
- Type
- % Flights Delayed
- Avg Delay (mins)
- 10 rows (11 including headers)
The full output can be downloaded here.
After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja, @JennyMartinDS14, @JonathanAllenby & @TomProwse1
You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help!