2023: Week 4 - Solution
Solution by Tom Prowse and you can download the workflow here.
This is the final week of our introductory challenges so hopefully you are starting to build up knowledge of different techniques that you can use within Tableau Prep.
We're looking at customer acquisitions this month and combining different tabs from a spreadsheet so we can consolidate and analyse the data from different tables.
Step 1 - Input Data
Step 2 - Merge Demographic
Now we have all of the data input into our workflow we can start to clean the fields so we can start the analysis.
First, some of the demographic field headers have different spellings and therefore aren't matched from the union. To overcome this we want to merge the fields together so that we remove the nulls and have a single column with all of the demographic date.
To merge the fields all we need to do is select the fields (using ctrl on Windows), then right-click, and then select Merge.
Merge the fields will create a single field and you may need to rename but we should now have one field only for Demographic:
Step 3 - Join Date
Note how we have first created the date as a string. This will allow us to use the month as a written month and then use the date type functionality within Tableau Prep to change it into a date field.
Tableau Prep is clever enough to recognise that this string is actually in a date format, therefore when we update the data type we now have a numeric Date field.
Our table should now look like this:
Step 4 - Pivot Demographics
We are now in a position to transform the shape of our table so that each of the different demographic values is in a separate column. Currently, these are all in a single field so it makes it harder to compare each demographic for each customer, therefore we want to pivot so that each value is it's own field.
We can use a Rows to Columns pivot, with Demographic in the Pivot Fields, and a Max Value in the aggregate:
After ensuring each of the new fields is the correct data type (Eg, Date of Birth), our table should now look like this:
We now have a single row for each customer with their demographic information in separated fields.
Step 5 - Remove Duplicates
The final step this week is to remove any times that a customer appears more than once. If this is the case then we want to only return the earliest joining date.
To ensure we only have one row per customer, we can use an aggregation where we group by ID, Account Type, Date of Birth, and Ethnicity, and then return the Min for Joining Date:
This is the final step of the challenge and we are now ready to output our data: