2021: Week 20 - Solution
Solution by Tom Prowse and you can download the full workflow here.
Continuing the calculations theme for May, this week we looked at numbers. The aim was to make creating a control chart in Tableau Desktop a lot easier by doing some of the table calculations in Prep before bringing the data into Desktop.
Step 1 - Create Mean & Standard Dev
The first task is to create the mean and standard deviation for the complaint field. To do this we want to duplicate the complaints field, as we want to use this twice in the next aggregation step.
The aggregation step is where we are going to create the mean and stdev for each week. Therefore, we group by Week and calculate the Avg and Stdev for our two complaints fields:
Step 2 - Upper & Lower Control Limits
Now we have the mean and stdev, we can calculate the upper and lower control limits for our chart. These are calculated by adding or subtracting the stdev from the mean:
Upper Control Limit
[Mean] + [Standard Deviation]
Lower Control Limit
[Mean] - [Standard Deviation]
Then we can calculate the variance between these two limits:
Variation (1SD)
[Upper Control Limit]-[Lower Control Limit]
These calculations are for 1 standard deviation away from the mean, so we need to repeat these calculations for 2 & 3 stdevs as well. To do this we just multiply the stdev by 2 or 3:
Upper Control Limit (2SD)
[Mean] + (2*[Standard Deviation])
Lower Control Limit (2SD)
[Mean] - (2*[Standard Deviation])
Then we can calculate the variance between these two limits:
Variation (2SD)
[Upper Control Limit (2SD)]-[Lower Control Limit(2SD)]
Upper Control Limit (3SD)
[Mean] + (3*[Standard Deviation])
Lower Control Limit (3SD)
[Mean] - (3*[Standard Deviation])
Then we can calculate the variance between these two limits:
Variation (3SD)
[Upper Control Limit (3SD)]-[Lower Control Limit(2SD)]
Step 3 - Join Original Data
Now we have calculated all of the limits, we can join this back onto our original input step. This will bring the limits into each of the complaints rows.
The join condition is Week = Week:
Our data now looks like this:
Step 4 - Calculate Outliers
The next step is to calculate whether or not each of the complaints are outside of our given limits and therefore an outlier. Again we need to have three different calculations, one for each SD:
Outlier? (1SD)
IF [Complaints] < [Lower Control Limit]
OR [Complaints] > [Upper Control Limit]
THEN 'Outside'
ELSE 'Within'
END
This is then repeated for the other two SD as well:
Outlier? (2SD)
IF [Complaints] < [Lower Control Limit (2SD)]
OR [Complaints] > [Upper Control Limit (2SD)]
THEN 'Outside'
ELSE 'Within'
END
Outlier? (3SD)
IF [Complaints] < [Lower Control Limit (3SD)]
OR [Complaints] > [Upper Control Limit (3SD)]
THEN 'Outside'
ELSE 'Within'
END
This creates a flag field to show whether each of the complaints are inside or outside. For example, this could be used to colour the different marks when using in Tableau Desktop.
Step 5 - Create Outputs
The final step this week is to filter for only outliers and then export these to a single document with three different sheets. To do this we are going to need three different branches (one for each SD) and then use the output to Excel function.
For the output we want to create three separate output steps, which will create the different sheets that we require. The file name will be the same in each but each one will update a different sheet in the Excel document:
As a result our output should look like this:
Here's our full output for comparison. Remember we don't care about the order of the columns or rows as Tableau Desktop will import them in whatever order they come in!
After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja, @JennyMartinDS14 & @TomProwse1
You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help!