2021: Week 19 - Solution
Solution by Tom Prowse and you can download the full workflow here.
We continued our fundamental challenges this week with a focus on string calculations. We are trying to find out some more detail on what is going on with the project over runs in Prep Air by parsing out the key information from the commentary log.
Step 1 - Create Week
The first step that we have done this week is to create a 'Week X' field. We have already been provided with the week number so we need to add the string 'Week' before each one. We use the following calculation for this:
Week
'Week '+ str([Week])
Our week rows now look like this:
Step 2 - Split Commentary
The next step is to break apart each of the different projects from the commentary field. To do this we want to split each one into a separate row by using the split functionality. Each project/task starts with a pair of square brackets (eg, [NLS/Op-Sc]) so we can use the '[' to split each of the different projects.
Our custom split function looks like this:
After the split there should be each project as a separate field for each week:
Ideally we want all of the different projects to be in a single row, therefore we can use a columns to rows pivot to do this. Using a wildcard pivot will allow us to pick up any fields with the word 'Split' in them meaning that this will work for any future projects as well:
After the pivot we also want to remove any blank fields that have occurred from the split/pivot steps.
Step 3 - Parse Project & Sub-Project
Now we have all of the projects in a single column we can start to parse out the key information using the following steps:
1, Split Detail
Using a custom split on ']' we can separate the detail text from the commentary:
As a result our data looks like this:2, Split Project, Sub Project & Task
Again we can use the split functionality to split the project and sub-project. This time we want to use '/' as the separator:
Then the final clean is to remove the punctuation from the Project Code field. Tableau Prep has in-built functionality to do this so it's super simple
We now have a clean Project Code field, so the next task is to again use the split functionality to split apart the sub-project and task field. Instead of using a custom split, an automatic split will work here so we should now have two columns that look like this:
Step 4 - Join Lookups
Step 5 - Owner Code
Step 6 - Days Noted
An alternative would be to use a Regex function with the following calculation:
Days Noted
REGEXP_EXTRACT([Detail],'.*?(\d+)\sdays.*')
After extracting the number of days, we are then ready to output the data.