2023: Week 34 - Solution

Video Solution


Solution by Tom Prowse and you can download the workflow here.



We are onto the penultimate challenge of our HR month and we are looking at turnover rates this week.

Step 1 - Every employee & every month

After inputting our data source, we first want to aggregate the data so we have a single row for each employee, their hired date and their latest leave date. Within the aggregation we can group by dc_nbr, employee_id and hire_date, then find the max leave_date: 



We then want to find the start of the month when an employee has left the company. This will be used when we are creating the new rows for the months between their hire and leave dates: 

leave_date_start_of_month 
DATE(DATETRUNC('month',[leave_date]))

Then we can fill in any null values with a standard date of 2023-06-30: 

leave_date_start_of_month 
DATE(IFNULL([leave_date_start_of_month],#2023-06-01#))

We are now ready to create a row for each of the months between the hire and leave dates for each employee. To create this we can use the New Rows step where we create 1 new row for each month where the hire_date<=leave_date_start_of_month.


After creating the rows the table should look like this: 




Step 2 - Number of employees who have left

Next, we want to calculate how many employees have left the company and how many are still present at the end of each month.

First, we want to replace the dummy leave date, that we used to fill out the rows, with a null value if an employee hasn't actually left the company: 

leave_date_start_of_month
IF ISNULL([leave_date])
THEN NULL
ELSE [leave_date_start_of_month]
END

Then we want to create a flag to identify the final month that an employee worked before leaving: 

ee_leaving 
IF [leave_date_start_of_month]=[month]
THEN 1
ELSE 0
END

Before we do the employee count, we first need to find the end of each month when an employee left by using this calculation: 

month_end_date 
DATE(DATEADD('day',-1,(DATEADD('month',1,DATETRUNC('month',[month])))))

Now that we have calculated all of these we can calculate the number of unique employee ids present at the end of the month: 

ee_count
IF 
(ISNULL([leave_date]) OR
[leave_date]>=[month_end_date])
THEN 1
ELSE 0
END

We now have some flag fields that will allow us to aggregate and create some counts of employees based on each DC and month. Within the aggregation step we want to group by dc_nbr and month_end_date, then sum ee_count and ee_leaving. This will get how many employees are employed or have left for each month.



Step 3 - Fill in all months

Next we need to make sure that we have filled in all of the months between the start and end for each dc.

First we want to find the min and max month end date for each dc so we need to duplicate the month_end_date and then rename each of the month_end_date fields to max and min. 

We can then use an aggregation step to group by dc_nbr then calculate the min and max of the dates. 


We can then find the first of the month for each the min and max with the following calculations: 

min
DATE(DATETRUNC('month',[min]))

max
DATE(DATETRUNC('month',[max]))

Then we can fill in the missing months between the min and max so we have a row for each month. 


Then we need to repeat the calculation to find the end of month date (like we did previously): 

month_end_date
DATE(DATEADD('day',-1,(DATEADD('month',1,DATETRUNC('month',[month])))))

We can now join this back to our original data where we aggregated and found the counts per dc and month. Using an outer join where we return all of the months and has a condition where dc_nbr = dc_nbr and month_end_date = month_end_date


After the join we can replace any null values in the ee_count and ee_leaving fields with 0 using the ZN function.

At this stage the table will look like this: 



Step 4 - Turnover Calculation

Finally, we are ready to calculate the turnover calculation based on what is stated within the requirements. We need to calculate the following fields: 

ee count for last month: 
ee_count_pm 
{ PARTITION [dc_nbr]: { ORDERBY [month_end_date] asc : LOOKUP([ee_count],-1)}}

ee count from 12 months ago
ee_count_p12
{ PARTITION [dc_nbr]: { ORDERBY [month_end_date] asc : LOOKUP([ee_count],-12)}}

Average number of employees this month
avg_ee_month
ROUND(([ee_count_pm]+[ee_count])/2,6)

Average number of employees over the past 12 months
avg_ee_p12
ROUND(([ee_count_p12] + [ee_count]) / 2,6)

Then we can calculate a moving calculation to calculate the number of employees who have left over the last 12 months

ee_leaving_p12


Then finally we can calculate the monthly turnover and turnover over the last 12 months: 

turnover_month
ROUND([ee_leaving] / [avg_ee_month] * 100,2)

turnover_p12
ROUND([ee_leaving_p12] / [avg_ee_p12] * 100, 2)

We can then remove any fields that we don't require and prepare the data for our output: 


You can download the output from here.

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @kelly_gilbert@emmajones115@Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2024: Week 1 - Prep Air's Flow Card

How to...Handle Free Text

2023: Week 1 The Data Source Bank