2023: Week 4 - Solution


Solution by Tom Prowse and you can download the workflow here.


 

This is the final week of our introductory challenges so hopefully you are starting to build up knowledge of different techniques that you can use within Tableau Prep. 

We're looking at customer acquisitions this month and combining different tabs from a spreadsheet so we can consolidate and analyse the data from different tables. 

Step 1 - Input Data

After connecting to our New Customers data source, we then want to bring in the sheet for January (or any month). Within the input step, we can control if we include a single table or multiple table by selecting the radio button. 

When selecting 'Union Multiple Tables' this allows us to create a union straight from the input step based on various conditions. In our case we want to include multiple tables from where the filename matches New Customers.xlsx, but in some cases you can limit the tables that you bring in by setting a matching pattern.


Note, depending on your version of Tableau Prep, this interface may look different

After setting up the wildcard union, you can see that there are now 12 tables included: 


As a result our table should now look like this:





Step 2 - Merge Demographic

Now we have all of the data input into our workflow we can start to clean the fields so we can start the analysis. 

First, some of the demographic field headers have different spellings and therefore aren't matched from the union. To overcome this we want to merge the fields together so that we remove the nulls and have a single column with all of the demographic date.

To merge the fields all we need to do is select the fields (using ctrl on Windows), then right-click, and then select Merge.


Merge the fields will create a single field and you may need to rename but we should now have one field only for Demographic: 


Step 3 - Join Date

Next we want to create a joining date for each of the customer IDs. Currently, the joining date is split into a Day, Month (in words) from the Table Names field, and we know from the requirements that the year is 2023. Therefore we need to format at date field from these separate fields. 

Joining Date 
[Joining Day]+" "+[Table Names]+" "+"2023"

Note how we have first created the date as a string. This will allow us to use the month as a written month and then use the date type functionality within Tableau Prep to change it into a date field. 

Tableau Prep is clever enough to recognise that this string is actually in a date format, therefore when we update the data type we now have a numeric Date field.

Our table should now look like this: 


Step 4 - Pivot Demographics

We are now in a position to transform the shape of our table so that each of the different demographic values is in a separate column. Currently, these are all in a single field so it makes it harder to compare each demographic for each customer, therefore we want to pivot so that each value is it's own field. 

We can use a Rows to Columns pivot, with Demographic in the Pivot Fields, and a Max Value in the aggregate: 


After ensuring each of the new fields is the correct data type (Eg, Date of Birth), our table should now look like this: 


We now have a single row for each customer with their demographic information in separated fields.

Step 5 - Remove Duplicates

The final step this week is to remove any times that a customer appears more than once. If this is the case then we want to only return the earliest joining date. 

To ensure we only have one row per customer, we can use an aggregation where we group by ID, Account Type, Date of Birth, and Ethnicity, and then return the Min for Joining Date: 


This is the final step of the challenge and we are now ready to output our data: 


You can download the output from here

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2024: Week 1 - Prep Air's Flow Card

2023: Week 1 The Data Source Bank

How to...Handle Free Text