2023: Week 31 - Solution



Solution by Tom Prowse and you can download the workflow here.



This is our first week where we revisit another HR focused month with some challenges provided by Kelly Gilbert

For the first challenge we look at filling in missing IDs from various different tables. 

Step 1 - Lookup Table

First we want to create a lookup table to identify the unique IDs from each table. We can input both tables and create an aggregate step after each step. Within these aggregations we want to make a unique list of the employee id's and the guid's so we can group by both of these fields: 


We can then use a union step to bring both branches together and create a long table with all the IDs: 


From there we can remove the Table Names field and then filter the table to remove any fields that are Null in both the employee id and guid fields. We can filter using this calculation: 

NOT ISNULL([employee_id])
AND 
NOT ISNULL([guid])

Then finally we can use another aggregation field to remove any duplicate values by grouping by employee_id and guid: 


At this stage our table should look like this: 



Step 2 - Dimension Table

We now want to utilise the newly created lookup table to fill in any missing ids from the original table. 

First we want to use an outer join between the dim table and the lookup table, where we include all of the rows from the dim table, and join on the guid. 



This provides us with the lookup table and dim table, so we can merge both of the ID fields and both of the employee id fields to remove some null values.

However, there are still some remaining null values in the guid field, therefore we can utilise the lookup table again by joining it using the same outer join but this time joining on employee_id: 


Again, we can merge both of the employee_id and guid fields so that we have a complete table with no missing ID fields.

Our first output should look like this:



Step 3 - Monthly Table

The final stage of the challenge involves us following the same process again, but this time for the monthly table. We want to repeat the following steps, but on a separate branch and on the monthly table: 

1. Outer join with the lookup table using the guid field then merge both of the id fields:


2. Outer join with the lookup table using the employee_id field and then merge both of the id fields: 


After this the output should look like this: 



You can download the output from here.

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @kelly_gilbert@AbirammT@Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text