2024: Week 42 - Strictly Come Dancing
Challenge by: Jenny Martin
Autumn is always the time of year that Strictly Come Dancing returns to our TVs in the UK. I always find it interesting which songs are chosen for the couples to dance to - particularly when there are repeats. With 22 series so far, some repetition is not surprising, so I set about gathering a dataset that would allow me to see what the most common song choices are.
Inputs
I used this as an opportunity to learn to web scrape Wikipedia using Python (with a lot of help from ChatGPT!), so the resulting table is a combination of each dance from each series.
Requirements
- Input the data
- One thing the data is missing is a year field for when the Series took place
  - Series 1 and 2 were both in 2004
  - All following series happen annually
  - Series 3 in 2005 etc.
- The web scraping isn't quite perfect and the table headers are repeated throughout the dataset. Make sure these are removed
- Split the Week field into a numeric value and put any extra detail in a Theme Week field
- Split this Theme Week field further, so that if it's the Final/Semi Final/Quarter Final this detail is in a Stage field instead
- The Score field is made up of the Total Score and the individual judges' scores. Since the number of judges can vary depending on the series/week, split the Score field into these 2 fields
- In certain weeks there can be a group dance. These can be identified by the word group or marathon in the Dance field. Update the Couple field to be Group and ensure there is only 1 row for these dances so the music choice is only counted once
- There can be more than 1 song in the Music field. Make sure there is a row for each song, as well as the song and artist being in separate fields
- You may notice we have some additional fields such as Film and Musical. These correspond with the theme weeks. Since there will only be a maximum of 1 theme per week, combine these fields into 1
- Remove unnecessary fields
- Output the data
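For anyone working outside Tableau Prep, a few of the steps above can be sketched in plain Python. This is only a rough sketch, not the official solution: the function names are made up for illustration, and the raw layouts it expects (a Week value like "Week 3: Movie Week", a Score value like "24 (6, 6, 6, 6)", and header rows that simply repeat the column names) are assumptions about the scraped table that you should check against the real input. The key used to de-duplicate group dances is also an assumption.

```python
from __future__ import annotations

import re


def series_to_year(series: int) -> int:
    """Series 1 and 2 were both in 2004; every later series is annual."""
    return 2004 if series <= 2 else 2002 + series


def drop_repeated_headers(rows: list[dict]) -> list[dict]:
    """Keep a row only if at least one value differs from its column name.

    Assumes repeated headers show up as rows such as {"Series": "Series", ...}.
    """
    return [r for r in rows if any(str(v) != k for k, v in r.items())]


def split_week(week: str) -> tuple[int, str | None, str | None]:
    """Return (week number, stage, theme) from a raw Week value.

    Assumed layout: "Week <n>" optionally followed by ": <detail>".
    """
    match = re.search(r"\d+", week)
    if match is None:
        raise ValueError(f"no week number in {week!r}")
    detail = week[match.end():].strip(" :-") or None
    # Final / Semi Final / Quarter Final go to Stage; anything else is a theme.
    if detail and "final" in detail.lower():
        return int(match.group()), detail, None
    return int(match.group()), None, detail


def split_score(score: str) -> tuple[int, str]:
    """Split an assumed "total (j1, j2, ...)" string into total and judges' scores."""
    match = re.fullmatch(r"\s*(\d+)\s*\((.*)\)\s*", score)
    if match is None:
        raise ValueError(f"unexpected score layout: {score!r}")
    return int(match.group(1)), match.group(2).strip()


def collapse_group_dances(rows: list[dict]) -> list[dict]:
    """Set Couple to "Group" for group/marathon dances and keep one row per dance."""
    seen = set()
    result = []
    for row in rows:
        if re.search(r"group|marathon", row["Dance"], flags=re.IGNORECASE):
            row = {**row, "Couple": "Group"}
            # Assumed key: one group dance per series, week, dance and music.
            key = (row["Series"], row["Week"], row["Dance"], row["Music"])
            if key in seen:
                continue
            seen.add(key)
        result.append(row)
    return result
```

Keeping the judges' scores as a single string avoids having to decide how many judge columns to create, which matters because the number of judges varies by series and week.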
Output
- 13 fields
  - Year
  - Series
  - Week
  - Stage
  - Theme
  - Theme Detail
  - Couple
  - Score
  - Judges Scores
  - Dance
  - Song
  - Artist
  - Result
- 2,524 rows (2,525 including headers)
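If you want a quick sanity check before sharing, something like the following compares your result against the field list and row count above. It is a minimal sketch: `check_output` is a hypothetical helper, and it assumes you can supply your output's header as a list of strings and its data row count (excluding the header row).

```python
EXPECTED_FIELDS = [
    "Year", "Series", "Week", "Stage", "Theme", "Theme Detail", "Couple",
    "Score", "Judges Scores", "Dance", "Song", "Artist", "Result",
]
EXPECTED_ROWS = 2524  # data rows, not counting the header


def check_output(header: list, row_count: int) -> list:
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    missing = [f for f in EXPECTED_FIELDS if f not in header]
    extra = [f for f in header if f not in EXPECTED_FIELDS]
    if missing:
        problems.append(f"missing fields: {missing}")
    if extra:
        problems.append(f"unexpected fields: {extra}")
    if row_count != EXPECTED_ROWS:
        problems.append(f"expected {EXPECTED_ROWS} rows, got {row_count}")
    return problems
```

Field order is not checked here, since the challenge only lists which fields should be present.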
After you finish the challenge, make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja, @JennyMartinDS14 & @TomProwse1
You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help!