2020: Week 36 - Solution

 


The solution this week is by Tom Prowse and you can download our workflow from here

This week we took a look a school scheduling and if we had enough teachers and if the school is asking too much from them!

Step 1 - Students per Year per Subject 

The first step this week is to calculate how many students we have in each year and each subject. Using the Students table, we need to split the subject field so that we have a column for each subject. We do this by using a custom split - All values using the separator '/':


After this we have a column for each Subject: 


We now want to bring these all back into one column by using a pivot tool. We use a Columns to Rows pivot, using the wildcard pivot on 'Subject': 


After the pivot we can remove the Pivot Names, rename Subject Split to Subject, and Exclude any empty fields from the Subject field. Our table should now look like this: 


The final step in this part of the challenge is to use an aggregate tool to find the total number of students, per age and subject. We group by Subject & Age, then Count Name: 


We can then rename the Name field to No. of Students: 


Step 2 - Teaching Hours Required

The next stage of the challenge is to calculate how many hours each student needs based on their age. For this we use the Hours table, and similar to the Students table, we are going to use the Split functionality, but this time we will split the Age field. An automatic Split should work here. 

After splitting the Age, we want to Pivot these into a single column. Again we will use a Columns to Rows pivot but this time use 'Age' as the wildcard. 


After the pivot we can tidy the fields by removing the Pivot Names and renaming Age Group Split to Age. We should now have a row for each age and how many hours they require for teaching:


Now we have worked out how many hours are required for each age group, we can join this onto our original workflow. We can use the Age field from both tables to do this: 


Our workflow now looks like this:


Our data table looks like this: 


Step 3 - Room Capacity

The next task for the challenge is to work out the room capacity for each subject. This time we want to use the Rooms table, but we don't need to split any fields instead we want to use an aggregation. Within the aggregation we want to group by Subject and then Sum the capacity, giving us the capacity needed for each subject:


Now we have the capacity by each subject, we can join this to our original workflow. To do this we want to use an inner join on Subject: 


Now that we have combined the Rooms data, we can calculate how many rooms are needed using this calculation: 

Rooms Needed
CEILING([No of Students]/[Capacity])

Using the ceiling function will round the numbers up so that we don't have decimals or anyone missing out on a room. 

We then want to calculate the number of teaching hours that are needed. We use this calculation: 

Teaching Hours Needed
[Rooms Needed]*[Hours teaching per week]

At this stage we have calculated all that we need to do and can move onto the next step. Our table now looks like this:


Step 4 - Teacher Availability 

We are now going to focus on the final table of the input, which holds the information about each of the teachers. Like we have done previously, we will start by splitting the Working Days field using the split functionality, auto split should work but to be 100% sure you can use a custom split on All value using a ',' as a separator.

Next, we pivot the split values using a columns to rows pivot and 'Working Days' as our wildcard. 


After the pivot we want to tidy a few of the fields up by removing Pivot Names, renaming Working Days Split to Working Days, and excluding any empty values from Working Days. Our table should now look like this, with a row for each day a teacher will work: 


At this stage we need to split the workflow into two different sections. One section will calculate the total amount of working days and subjects for each teacher, and the second section will remove any extra rows by returning a single row for each teacher, subject and age group combination.

Section 1
For this aggregation we want to Group by Name, and then Countd on Working Days and Subject: 


We now have how many working days each teacher has, and also the number of subjects that they teach: 


Section 2
For the second aggregation we are going to group by Name, Age, & Subject. This will give us a row for each different combination: 


This has reduced our number of rows from 25 to 7 and we are now ready to join them back together. 


We can use an inner join and use Name as the condition: 


After the join, we can now calculate how many potential hours per subject each teacher has. We used this calculation: 

Potential Hours per Subject
([Working Days]/[No. of Subjects])*6

We can then work out what the range of ages each teacher is going to teach. We want to find the Min and Max for each Ages Taught category, therefore we can use an Auto Split on this field, which break this out into two fields for us. We can simply rename these fields from 'Ages Taught - Split 1' to 'Min Age' and 'Ages Taught - Split 2' to 'Max Age'. 

Our table now looks like this, and is ready to join to our original workflow: 


Step 5 - Metrics for Output

The final stage of this week's challenge is to bring all of the data together and calculate the metrics that we need for the output. 

First we want to join the Teachers table onto our original workflow. We can do this with the following join conditions: 


Note: Make sure you have the '>= / <=' for Age, as this will keep them within range!

We can then remove any additional fields that we no longer need and should be left with the following table: 


To calculate the metrics, we need to use a couple of aggregations. First we want to return one row for each Name & Subject combination, and return the Avg Potential Hours per Subject (this will remove duplication when aggregating), and Sum Rooms Needed & Teaching Hours Needed:


We now only have one row for each Teacher and Subject combination: 


Then finally we want to aggregate so that we only have one row for each subject. The setup looks like this: 


Now we have one row per subject, we can calculate the % Utilised with this calculation: 

% Utilised
Round(([Teaching Hours Needed]/[Potential Hours per Subject])*100)

That's the final step and we are now ready to output the data: 


After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja, @JennyMartinDS14, @JonathanAllenby & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text