2022: Week 4 - Solution



Solution by Tom Prowse and you can download the workflow here


This is the next instalment of our introductory challenges throughout January. For this challenge we looked at how each of our students are getting to and from school, and if this is sustainable or not!

Step 1 - Input Data

We have been given two sets of data for this challenge. The first is the data set that we have used throughout all of the Jan challenges, and the second is details about how each of the students gets to school. 

These two tables will join together using an inner join where id = Student id.


After the join we can remove any fields that we don't need including the id and any parental fields. The table should look like this: 



Step 2 - Pivot Days

The next step is to transform our table so that instead of having a column for each of the days, we want these on individuals rows instead. To make this change we can use columns to rows pivot with each of the day fields within the pivot values: 


Our table has now gone from being wide, to taller and thinner: 



Step 3 - Remove Spelling Mistakes

Next we need to check the Method of Travel field for any spelling mistakes. Tableau Prep contains so nice features that will try to identify these and then group them together automatically. 

Within the group functionality we can use the group by pronunciation to group together the majority of the different types. 


However there is one 'Walk' that doesn't get picked up, therefore we can manually group these by highlighting both, then pressing Group Values. We now should have 12 different Methods of Transport: 


Step 4 - Sustainable or Non-Sustainable

The next grouping that we need to make is based on whether or not the Method of Transport has a motor or not. If it has a motor then we are saying that it's non-sustainable and if it has no motor then it's sustainable.

For this we can use the manual grouping by selecting all the methods that have a motor and grouping them together. 

First we want to duplicate the Method of Travel field, and then in the duplicated field highlight the options for each group, then right click and group values. The two groups should have the following values: 

Sustainable
  • Bicycle
  • Dad's Shoulders
  • Hopped
  • Jumped
  • Mum's Shoulders
  • Scooter
  • Skipped
  • Walk
Non-Sustainable
  • Aeroplane
  • Car
  • Helicopter
  • Van
Our table should now look like this:


Step 5 - Total each Method

The next step is where we create a total for each of the different methods. To do this we want to use an aggregate step where we group by Sustainable?, Method of Travel, and Weekday, then Sum Number of Rows: 


After the aggregation the table should look like this: 



Step 6 - % of Total

The final step this week is to calculate the % of total for each travel method. For this we need to first find the total trips made on each day, and then divide this by each of the methods for that day. 

Trips per day  

To calculate the total amount of trips that were taken on each day we need to use a Fixed LOD, which is grouped by Weekday and totals the sum of Number of Rows. 

Next we can use that total to divide by each of our methods: 

% of trips per day

ROUND(([Number of Rows (Aggregated)]/[Trips per day]),2)

This will give us our % of total and round this to 2 decimal places. 

Finally we want to rename the Number of Rows field to Number of Trips and we are ready to output our data: 



You can download the full output here

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text