2022: Week 11 - Solution

Solution by Tom Prowse and you can download the workflow here.

This week we looked at one of the frequently occurring problems in data preparation - missing data and dealing with the gaps!

We returned to the Prep School for this weeks challenge so let's see how to solve it!

Step 1 - Input Data

First we need to input our table of data which is a single CSV file, and looks like this when input into Tableau Prep:


Notice the null values for Lesson Name and Subject, these are the areas that we want to identify and fill in with the correct information from above. 

Step 2 - Weekday Aggregate 

After identifying the null values, we can then create an aggregate field to show us what Teacher and Lesson Time combinations happen on each day, whilst retaining the Min value for the others.

The aggregate tool is set up as follows: 



As the same lesson repeats on each week, we can remove the Week field at this stage.

After the aggregate our table should now look like this: 


Notice how we no longer have the weekly breakdown, but a snapshot of what happens across each day of the week. 

Step 3 - Join Original Table

This is the stage where we can use this new weekly snapshot table to fill in the missing values from our original data set.

We need to create a join from our latest step with the first step that we created: 


The join conditions need to be set up so that we are joining where Weekday = Weekday and Teacher = Teacher. As a result this will return the 40 rows that were in our original data table: 



Step 4 - Clean Fields After Join

After the join we have a few fields that have been duplicated so we can remove these from our table. We want to remove the following: 
  • Lesson Name-1 
  • Subject-1
  • Weekday-1
  • Teacher-1
  • Lesson Time-1
We can also rename the following fields:
  • Attendance to Avg Attendance per Subject & Lesson
  • Attendance-1 to Attendance
Our table should now look like this: 



Step 5 - Extract Lesson Time

The final step is to extract the time from the Lesson Time field. This is currently formatted as a Date field, therefore we can use the following calculation to convert it to a string, and then extract the right side of the string: 

Time
RIGHT(STR([Lesson Time]),8)

We can then remove the Lesson Time field and our table is ready to output:


You can download the full output here

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text