2022: Week 42 - Missing Training Data

Challenge by: Jenny Martin

Recently, my colleague Joe Stokes brought me an interesting challenge that I knew needed to be turned into a Preppin' Data! 

He was working with sports player training data. Since the athletes wouldn't take part in the same training sessions every single day, this was leading to a lot of gaps in the data. As there was a metric which calculated the average across all sessions for each day, this could lead to some misleading conclusions, with a lot of variation over time.

Instead, what we want is to take the value from the previous session, even if the player didn't partake in the session on that day. This helps to keep the average a bit more stable. It also means we have data for every day, even if the player were off sick that day.


Input

Our input is very simple, with a row for each training session the player participated in and their resulting score.

Requirements

  • Input the data
  • For each player and each session, we want to know what the date is of the next session
    • E.g. for Player 1 who had an Agility session on 4th Jan, their next Agility session was 6th Jan - so we have missing data on 5th Jan
  •  For the most recent training session in the dataset, assign the next session as the maximum date in the dataset
  • Scaffold the data so we have a row for each player and each session
    • Careful of duplicates!
  • Create a flag to indicate whether the score comes from an actual session, or is carried over from the previous session
  • Exclude all weekends (Saturdays & Sundays) from the dataset
    • Players need time off too!
  • Output the data

Output

  • 5 fields
    • Player
    • Session
    • Session Date
    • Score
    • Flag
  • 6,750 rows (6,751 including headers)
You can download the full output here

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 


Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text