2023: Week 12 - Solution
Solution by Tom Prowse and you can download the workflow here.
This weeks challenge is a long one, so hopefully you stayed with it and chipped away at the different techniques that were required to solve the challenge.
Step 1 - UK Bank Holidays
This will allow us to fill down the year for each row and bank holiday date
From here we can tidy the table by keeping the relevant information. First we want to find the max source row number for each year:
Max
Then we can filter to keep only where the Max = the Source Row Number-1 and finally remove the Max, Source Row Number-1, Source Row Number, and Year field then rename Year-1 to Year. Our table should look like this:
Then finally we can remove the nulls from the Date field and create a date based on the Date and Year fields.
UK Bank Holiday
Then after removing the unneeded fields we should have a list of all UK Bank Holiday dates:
Step 2 - Combine UK New Customers
We can then join this table onto our existing workflow ensure that all of the rows from the UK Bank Holiday table is included. This will mean that we need to use a left (or right) join where UK Bank Holiday = Date:
After the join our table should look like this:
Step 3 - Reporting Days
Next we want to identify the reporting days as per the requirements:
This allows us to identify the weekends (Sat & Sun) and the values with no date within the UK Bank Holiday field are reporting days.
From here we can then split the workflow to assign customers based on the next reporting day that they joined. We can have separate branches for this:
Joined on a Non-Reporting Day
All we need to do here is filter the Reporting Day field to Keep Only 'N'.
We then need to make sure we have all of the reporting days in this list so that we can reassign to the closest reporting day. For this we need a separate branch where we filter the Reporting Day field to exclude 'N'.
Then rename Date to Reporting Date and remove all other fields. Once we have the list of reporting dates, we can then join this back to the branch of non-reporting days. This join needs to be an inner join where the Date (from the Non-Reporting branch) < Reporting Date (from the 2nd branch).
This allows us to then have all the reporting dates after the given date, so we can then use a Fixed LOD to identify the minimum - this will identify the next reporting date:
Once we remove the Min field our table should look like this:
Reporting Days
We need a final branch which will include all of the reporting days. Within this branch we want to filter the Reporting Day field to keep only 'Y' and then rename Date to Reporting Date.
From here we can combine both branches by using a Union so that we have a single table that looks like this:
We can then aggregate the view so we have a total number of customers for each reporting date:
Then find the last day of each month:
Then as per the requirements, we want to remove the January 2024 months by excluding this from Reporting Month field.
Then finally calculate the Reporting Day by using this calculation:
Reporting Day
{ PARTITION [Reporting Month]: { ORDERBY [Reporting Date] asc : ROW_NUMBER()}}
After these calculations our table should look like this:
Step 4 - ROI Data
We're now ready to include the ROI New Customer table. Within this input we can remove the reporting day and rename the Reporting Month and New Customer fields with an ROI prefix using the Rename Fields functionality:
We can now align the ROI data to the UK Reporting Days. First we can use a left join to on Reporting Date to ensure that we're matching all the dates from the UK branch and the ones that match in ROI:
We can also do the inverse by using an outer right join on the same fields on a separate branch:
After this join we can rename the Reporting Date-1 field to Reporting Date and then keep only ROI Reporting Month, ROI New Customers, and Reporting Date.
We can then bring both branches together by joining where the UK Reporting Date > ROI Reporting Date.
We can then find the next day by finding the minimum:
Min
Then filter the table where [Min]=[Reporting Date]
We can then combine this back with the other Reporting Days via a Union and the table should look like this:
Step 5 - Final Cleaning
Finally we need to tidy the table a little more. We can replace any nulls in the New Customer & ROI New Customers field with a 0.
Then finally we want to create a flag to identify where the reporting months differ between the two systems.
For this we want to aggregate where we group by Reporting Month, Reporting Day, and Reporting Date, then Sum New Customers and ROI New Customers, and also return the Max ROI Reporting Month.
Then finally we can use this to create a misalignment flag:
Then we are ready to output our table that should look like this: