2022: Week 36 - Solution

Solution by Tom Prowse and you can download the workflow here.


The challenge this week looks at filling in missing data when trying to create a calendar in Tableau Desktop. This is a common problem that many people face and filling in the blanks can often be more difficult than expected. 

Lets see how we can use data prep to make this a little easier!

Step 1 - Start & End Date

The first step is to identify a Start & End date for our data set. We want this to be dynamic based on the dates within the input so this could change every time we connect a new file. 

First we need to duplicate the Schedule_Date field. This will allow us to use it twice within the aggregation where we can find the Min and the Max. 


From here we can then make sure we are looking at the start or end of month based on the min and max dates.

To make sure our start date is the start of the month, we can use the following calculation to make the 16/06/2022 into 01/06/2022: 

Start Date
DATE(DATETRUNC('year',[scheduled_date]))

Our end date needs to be the last day of the year. To do this we first need to find the first day of the year: 

End Date 
DATE(DATETRUNC('year',[scheduled_date-1]))

Then from here we can find the last day of the previous year, then add 1 year to this date. 

End Date

DATE(DATEADD('day',-1,DATEADD('year',1,[End])))

After these calculations we should now have our Start and End dates: 


The last task is to pad out the rows between the start and end dates. We need a row for each day between these.

We can do this by using the New Rows step within Tableau Prep with the following setup: 


We can then remove the Start and End dates and we should have a row for each day: 


Step 2 - One Employee per Day

Next up we want to create a field that contains each of our employees. To do this we want to go back to the start of our flow and create a new branch. 

From here we can combine the First and Last Name of each employee using a string calculation: 

full_name
[first_name]+" "+[last_name]

Then use an aggregation step to create a unique list of each employee (1 row for each): 

We now want to use this list to make sure that each employee has been assigned to each of the days in our other list. We can do this by using a cross join on each using a dummy join field. On each branch we need to create a field that is the same on both (we use 1) and then join these using an inner join: 


Notice how the number of rows increases massively. This is because each employee now has a row for each day between our start and end dates. 


Step 3 - Calendar Schedule

Now we have each employee assigned to each day, we can join the original schedule and then assign an employee to the correct day.

First we want to join our original table to the latest join using an outer join where we retain all of the dates between our Start and End date. The join conditions are where emp_id = emp_id and schedule_date = Date:


From here we can remove any unneeded fields (emp_id, first_name, last_name) and remove the -1 from any field names. Our table should look like this: 


Then we can create a flag to see if the employee has been scheduled for this day: 

scheduled
NOT ISNULL([scheduled_date])

Finally we make sure our schedule_date field is complete by ensuring there is a date for each Null value: 

schedule_date'
IFNULL([scheduled_date],[Date])

Our final table should look like this:


You can download the full output here


After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14, @TomProwse1 @VizWhat


You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help.


Popular posts from this blog

2024: Week 1 - Prep Air's Flow Card

How to...Handle Free Text

2023: Week 1 The Data Source Bank