2022: Week 17 - Solution


Solution by Tom Prowse and you can download the workflow here


The challenge this week is to show how data prep can really help you out in situations where you are working across different levels of aggregation and want to save yourself from making complicated calculations within Tableau Desktop.

Step 1 - Location Spelling Errors & Timestamps

First we need to input the Streaming table into our workflow and then we can starting cleaning the spelling mistakes within the Location field. Currently, the value 'Edinurgh' is spelt incorrectly, so we can double click on this value within the Profile Pane and update it to the correct spelling 'Edinburgh'.

Next, we can focus on formatting the date in the correct Data type, by changing it from a String to a Date & Time field. This will successfully remove the T & Z from 't' field so we can also rename the field to 'Timestamp'. 

At this stage our data should now look like this:


Step 2 - Content Type

Now we have our Streaming table cleaned, we can aggregate the view so that we know the total duration of streams across the different location, content type, users, and timestamps. The aggregate tool is setup like this (Group UserID, Timestamp, Location, Content Type. Sum Duration): 


From here we can start to update the content type field based on the details in the requirements. There are various types of content (Primary, Secondary, & Preserved) so we want to split this out into separate branches for Primary and another for all of the 'other' types.

Primary Branch
The first branch is filter where the Location is Cardiff, Edinburgh, or London. You can highlight these three values in the profile pane and press Keep Only. After filtering, we can then update the Content Type field so that all values are 'Primary'.

Then the second step on this branch is to calculate each of the users first streaming month. Therefore, we can use a fixed LOD to find the minimum timestamp for each user:

Min Active Month 


And then convert this field to a Month level using the following calculation:

Min Active Month 
DATE(DATETRUNC('month',[Min Active Month]))

Secondary Branch
We can repeat the same process but on a separate branch and this time we want to Exclude Cardiff, Edinburgh, and London from the Location list. Then update the null values in the Content Type to Secondary. We should now have a mix of Secondary and Preserved content types. 

Again we can repeat the minimum month calculations, but this time we want to find the minimum month for each user, content type, and location combination. 


Then we can convert this date to a month level:

Min Active Month
DATE(DATETRUNC('month',[Min Active Month]))

At this stage our workflow should be split into two separate branches that look a little like this: 


Now we need to combine these branches together by using a Union step to combine them into a single table. The data should now look like this:


Step 3 - Join Avg Pricing

Finally we want to combine the Avg Pricing table into our workflow. Before joining the tables we need to ensure that the Month field is formatted in the same way, so we need to update the Month data type to a Date in both inputs/branches.

We can then join the tables together, ensuring that we make sure that all values from the Streaming branch are included because we want to include all of the content types (the Preserved content will drop out with an inner join), and we can join on Month = Min Active Month and Content Type = Content Type: 


The final steps are to remove any fields that are no longer required, and then replace the null values in the Avg Price field with '14.98'. 

After these changes our table should look like this and we are ready to output our data:


You can download the full output here

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 


Popular posts from this blog

2024: Week 1 - Prep Air's Flow Card

2023: Week 1 The Data Source Bank

How to...Handle Free Text