2020: Week 43

Challenge by: Jenny Martin

Sometimes when researching one Preppin' Data idea, you encounter a rather hideous data structure that makes you wonder, "could Prep handle this?" Suddenly you're on a completely different tangent to the challenge you were originally planning and you've got a chunky Prep workflow that just begs to be turned into a challenge itself. So here we are, looking at the most popular baby names for boys and girls in England and Wales in 2019. 

Inputs

The data itself comes from the Office for National Statistics:


There is one input for boys names and one input for girls names. As you can see, each month is its own table and they are laid out next to each other in the Excel sheet. Not an ideal input for Tableau Desktop! Pay particular attention to May and August which have additional rows as there have been ties in the rankings. 

Requirements

  • Input the data
  • I recommend starting with the boys names
    • Remove totals
    • Pivot to create a month field
    • Bring all the months together
    • Perhaps using the new grouping functionality and reusing steps may be useful to avoid tedious repetition
    • Still stuck? Check the hint below
  • For the girls names, carry out the above steps
  • For additional rows, that have not been picked up as part of the main tables:
    • Pivot so that information from headers is not lost
    • Restructure the data so it has the headings: Rank, Name, Count
    • Join it to cleaned girls names to bring through month information
    • Combine it with girls names
  • Output the monthly rankings as a single datasource
  • Aggregate to a year level and calculate new rankings
  • Output the top 10 baby names for boys and girls in 2019 in England and Wales



Hint: the data interpreter could be very useful in this challenge!

Outputs

Download our outputs here.

Monthly rankings output:
  • 5 fields
    • Gender
    • Month
    • Rank
    • Name
    • Count
  • 244 rows (245 including headers)
2019 Rankings output:
  • 4 fields
    • Gender
    • 2019 Rank
    • Name
    • Count
  • 20 rows (21 including headers)
After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14@JonathanAllenby & @TomProwse1

You can also post your solution on the brand new Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 





Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text