2021: Week 12 - Maldives Tourism

Challenge By: Jenny Martin

One of the best things about being a Dr Prepper is that people are always bringing interesting datasets to your attention. A little while ago, Tableau Zen Master Lorna Brown showed me a dataset with all kinds of information on tourism in the Maldives. This database has a lot of data on different Key Economic Indicators, but as you can imagine, it also has a bit of a quirky structure! For inspiration as to why we might want to clean this data up, check out Lorna's viz below:


Input

Our input is very wide this week, with 136 fields and only 28 rows. It covers tourism in the Maldives from 2010 to 2020. The source of this data is here but you can download it in the usual way from here.


Requirements

  • Input the data
  • Pivot all of the month fields into a single column (help)
  • Rename the fields and ensure that each field has the correct data type
  • Filter out the nulls (help)
  • Filter our dataset so our Values are referring to Number of Tourists
  • Our goal now is to remove all totals and subtotals from our dataset so that only the lowest level of granularity remains. Currently we have Total > Continents > Countries, but we don't have data for all countries in a continent, so it's not as simple as just filtering out the totals and subtotals. Plus in our Continents level of detail, we also have The Middle East and UN passport holders as categories. If you feel confident in your prep skills, this (plus the output) should be enough information to go on, but otherwise read on for a breakdown of the steps we need to take:
    • Filter out Total tourist arrivals
    • Split our workflow into 2 streams: Continents and Countries
      • Hint: the hierarchy field will be useful here
    • Split out the Continent and Country names from the relevant fields (help)
    • Aggregate our Country stream to the Continent level (help)
    • Join the two streams together and work out how many tourists arrivals there are that we don't know the country of (help)
    • Add in a Country field with the value "Unknown" (help)
    • Union this back to here we had our Country breakdown (help)
  • Output the data

Output


  • 4 fields
    • Month
    • Breakdown
    • Country
    • Number of Tourists
  • 1,826 rows (1,827 including headers)

The full output can be downloaded here.

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 


Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text