2021: Week 20 - Solution

Solution by Tom Prowse and you can download the full workflow here


Continuing the calculations theme for May, this week we looked at numbers. The aim was to make creating a control chart in Tableau Desktop a lot easier by doing some of the table calculations in Prep before bringing the data into Desktop. 

Step 1 - Create Mean & Standard Dev

The first task is to create the mean and standard deviation for the complaint field. To do this we want to duplicate the complaints field, as we want to use this twice in the next aggregation step. 

The aggregation step is where we are going to create the mean and stdev for each week. Therefore, we group by Week and calculate the Avg and Stdev for our two complaints fields:


Step 2 - Upper & Lower Control Limits

Now we have the mean and stdev, we can calculate the upper and lower control limits for our chart. These are calculated by adding or subtracting the stdev from the mean:

Upper Control Limit
[Mean] + [Standard Deviation]

Lower Control Limit
[Mean] - [Standard Deviation]

Then we can calculate the variance between these two limits: 

Variation (1SD)
[Upper Control Limit]-[Lower Control Limit]

These calculations are for 1 standard deviation away from the mean, so we need to repeat these calculations for 2 & 3 stdevs as well. To do this we just multiply the stdev by 2 or 3: 

Upper Control Limit (2SD)
[Mean] + (2*[Standard Deviation])

Lower Control Limit (2SD)
[Mean] - (2*[Standard Deviation])

Then we can calculate the variance between these two limits: 

Variation (2SD)
[Upper Control Limit (2SD)]-[Lower Control Limit(2SD)]

Upper Control Limit (3SD)
[Mean] + (3*[Standard Deviation])

Lower Control Limit (3SD)
[Mean] - (3*[Standard Deviation])

Then we can calculate the variance between these two limits: 

Variation (3SD)
[Upper Control Limit (3SD)]-[Lower Control Limit(2SD)]

Step 3 - Join Original Data

Now we have calculated all of the limits, we can join this back onto our original input step. This will bring the limits into each of the complaints rows. 

The join condition is Week = Week: 


Our data now looks like this:


Step 4 - Calculate Outliers

The next step is to calculate whether or not each of the complaints are outside of our given limits and therefore an outlier. Again we need to have three different calculations, one for each SD:

Outlier? (1SD) 
IF [Complaints] < [Lower Control Limit] 
OR [Complaints] > [Upper Control Limit] 
THEN 'Outside' 
ELSE 'Within' 
END 

This is then repeated for the other two SD as well: 

Outlier? (2SD) 
IF [Complaints] < [Lower Control Limit (2SD)] 
OR [Complaints] > [Upper Control Limit (2SD)] 
THEN 'Outside' 
ELSE 'Within' 
END 

Outlier? (3SD) 
IF [Complaints] < [Lower Control Limit (3SD)] 
OR [Complaints] > [Upper Control Limit (3SD)] 
THEN 'Outside' 
ELSE 'Within' 
END 

This creates a flag field to show whether each of the complaints are inside or outside. For example, this could be used to colour the different marks when using in Tableau Desktop. 

Step 5 - Create Outputs 

The final step this week is to filter for only outliers and then export these to a single document with three different sheets. To do this we are going to need three different branches (one for each SD) and then use the output to Excel function.

Within each of the clean steps, we want to filter for 'Outside' only from the outlier field, and then remove other SD outliers and control limits. This can be repeated for each SD so you are left with 10 fields in each output.

For the output we want to create three separate output steps, which will create the different sheets that we require. The file name will be the same in each but each one will update a different sheet in the Excel document: 


As a result our output should look like this: 


If you want to test the values you have produced, Carl has built the Control Chart using Table Calculations in Desktop and shared here:
Here's our full output for comparison. Remember we don't care about the order of the columns or rows as Tableau Desktop will import them in whatever order they come in!

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help!

Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text