2020: Week 32 - Solution
This solution is from Tom Prowse and you can download our full solution here.
This week we have seen lots of different ways of completing this challenge, so I thought it would be good to discuss a couple different options! We love seeing the different solutions so please keep sharing with on Twitter or on the Tableau Forums.
Solution 1 - LODs
The first step is to create a self join with the original data source, which will increase the amount of data but allow us to match up the manager names with the row IDs. Before, completing the self-join, on one of the branches we want to remove all null values from the store manager field, so our data will look like this:
Once we have removed the null values, we are ready to join back to the original data source by using the following join condition:
Notice how our number of rows has now increased due to the various different matches occurring as a result of the join.
We now have two Store Manager fields, one with nulls and one without the nulls. Therefore, we can use the merge functionality within Tableau Prep to combine these fields into one. To merge the fields, you can just drag the field on top of another within the profile pane.
As you may have noticed it looks like we have filled in the null values, however there is some duplication and some managers have the incorrect field.
To overcome the duplication, we want to find the max Row-ID-1 (removed nulls branch) for each Row-ID (Original Branch) by using this calculation:
This allows us to find the lowest Row ID for each manager and our data should look like this:
We now want to filter to remove any duplicated/wrongly assigned manager and store combinations. To do this we just need to filter where our Fixed LOD = Row ID -1.
Finally we can remove any fields that are no longer required and we should be left with the following output:
Solution 2 - Joins & Aggregates
Our second solution this week comes from Carl and he utilises joins and aggregates to overcome the challenge without the use of LODs.
The first step is going to identify the fields that we want to aggregate at a later stage. Therefore we create the following calculation:
Sum Me
IF ISNULL([Store Manager]) THEN 0ELSE 1END
We're now ready to do a self-join similar to our previous solution, but this time we aren't going to remove the nulls beforehand. The join looks like this:#
Again we are greatly increasing the amount of rows that we have but we will use these later.
Next we are going to rename some fields just so that it is a little more clear moving forwards:
- Sum Me-1 to Don't Sum Me
- Sum Me to Don't Sum
- Don't Sum Me to Sum Me
Make sure you keep them in this order!
After the renaming, we can then remove any additional fields that we no longer need. These include:
- Don't Sum
- RowID-1
- Store Manager-1
- Store-1
- Sales Target-1
Next up we want to use an Aggregation tool to total the Sum Me field by each Row ID. The aggregation looks like this:
We can then want to join back to the original workflow, using the Row-ID:
Next we want to use an aggregation again, but this time grouping by the Sum Me field and returning the Max of Store Manager:
Again, we can now join back onto the original workflow, but this time we will use the Sum Me field:
Just as a recap, this stage of the workflow should look like this:
Now we are almost complete, all that is left is to use an aggregate again to return the required output. We group by Store and Store Manager, and then Avg Sales Target:
After the aggregation we should now be ready to output the data, which should look like this:
The solutions and full outputs can be found here.
Hopefully this gave you a couple of ways around solution this challenge within Tableau Prep and we look forward to seeing more solutions in the future!
After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja, @JennyMartinDS14, @JonathanAllenby & @TomProwse1
You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help!