2024: Week 50 - Solution
Solution by Tom Prowse and you can download the workflow here.
Step 1 - Reshape Data
Our first task is to combine our two tables so that we have a single table to work from. This involves a few different steps and joins to get the data in the right shape.
First we want to input both of the tables and join these together using an inner join where country = entity.
From here we can remove the entity field and rename the Life Expectancy at Birth to Country: Life Expectancy at Birth.
We also want to include the continent information, so from the List of Countries input, we can create an aggregate step and group by Continent. This will create a list of all the continents which we can use to join back to our Life Expectancy table.
With the second join this time we want to use an inner join where entity = continent:
Now we have both of these tables ready, we can join them together using a left join where we include all of the values from the Country table. We want this join to occur when on the Year and Continent fields.
Step 2 - Life Expectancy Higher than Continent Average
First we want to filter the data so we only have values for 1950 - 2020. We can do this by using a range of values filter on the Year field:
Then we can calculate whether it's above average or not:
Above Continent Avg?
IF [Country: Life expectancy at birth]>[Continent: Life expectancy at birth]
THEN 1
ELSE 0
END
Then we can calculate the % change by using a lookup calculation:
% Change
We can then make sure this is rounded to 1dp:
% Change
ROUND(100*[% Change],1)
At this stage the table should look like this:
Step 3 - Top Countries per Continent
The last step is to identify the top three countries per continent with the highest percentage increase in life expectancy.
For this we first need to aggregate to a country and continent level so we need to use an aggregate step where we group by Country and Continent, and then find the Max % Change, Sum of Number of Rows and Sum of Above Continent Avg?
From here we can rename Number of Rows to Number of Years, and Above Continent Avg? to Number of Years above Avg.
We can then use these fields to calculate the % of Years Above Continent Avg:
% Years Above Continent Avg
ROUND(
100*[Number of Years Above Avg]
/
[Number of Years]
,1)
Then finally we can rank the customers based on the % change:
Rank
Then from here we want to keep numbers 1, 2, & 3 from the rank field to identify our top 3 countries from each continent.
Our final table should look like this:
After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja, @JennyMartinDS14 & @TomProwse1
You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help!