2021: Week 17 - Solution

 


Solution by Tom Prowse and you can download the workflow here


This week's challenge came from Pat Lucas and we looked at a classic data preparation challenge based around timesheets. There were lots of detailed requirements in this one so let's see how we solved it!

Step 1 - Remove Totals

After inputting the data set, we first want to remove any rows that contained a total in the Project field. This is nice and easy as we can select the three rows ('Overall Total for Dan/Sam/George') and then right-click and exclude them from our workflow. 


Step 2 - Pivot Dates

Next we want to transform the structure of our table, and move all of the dates into a single column by using a Columns to Rows pivot. We have used the 'Wildcard Pivot' on a '/' to automatically bring in all of the date fields: 


Step 3 - Split Name, Age & Area

We are now ready to split the Name, Age, Area of Work field into separate columns. We can't use a single custom split in this case as there are multiple separators, however using an automatic split will bring these into three separate columns. 

After the split we can rename and remove any unnecessary fields so that we have Name, Age, Area, & Hours. We also want to exclude 'Annual Leave' from the Hours field and then change the data type to a decimal number. 


Step 4 - Total Hours

Next we want to calculate the total number of hours for each area, date, and employee by using an aggregation step. The setup will look like this: 


Step 5 - Calculate Total Hours & Days per Person

We are now ready to calculate the number of days and hours each person has worked. Again we are going to use an aggregation tool for this, with the following setup: 


From here we can then calculate the avg hours each person worked per day: 

Avg Hours 
[Hours]/[Date]

Our table now looks like this: 


Step 6 - Remove Chats

For this next stage we want to go back to the first aggregation tool and create a new branch from here. Within this step we can remove the 'Chats' rows from the Area field so we are left with Special Projects and Client only. 


Step 7 - Calculate Total Hours per Area

After removing Chats, we want to calculate the total hours per person and area using an aggregation tool with the following setup: 


Step 8 - Calculate Total Overall Hours

Next we want to create a new branch with an aggregation tool to calculate the total hours across all projects. For this we just need to group by person and sum hours. 


Then we can join both of these totals together using the Name field: 


Step 9 - Calculate % of Total

Using the two different totals we can calculate the % of total hours for each area: 

% of Total 
ROUND(([Hours]/[Total Hours])*100)


Step 10 - Avg  Hours 

The final step is to calculate the avg hours for client work only. On the same step that we created the % of Total we can Keep Only the 'Client' rows from the Area field.

Then we want to join this step with our Avg Hours step from the branch earlier in the workflow.  


Again we are going to join the branches using the Name field: 


After completing some final tidying our table should now look like our desired output: 



You can download the full outputs here.


After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text