2024: Week 5 - Solution
Solution by Tom Prowse and you can download the workflow here.
This is the final week of our beginner month and we are focussing on joins and the different types of conditions that we can create from them.
Step 1 - Output 1
After inputting all of our data sources, we are first going to focus on creating the first output from the requirements. For this we first need to create a table that includes information about the Ticket sales and customers.
To get this information we join together the Ticket Sales table with the Customers table using an inner join on the Customer ID field
We then want to include the information about each of the flights, so we can combine the Flights table by using another inner join on the Date and Flight Number fields:
At this stage we'll have some fields that have been duplicated from the join so we can remove these and our final table for the first output should look like this:
Step 2 - Output 2
For the second output we want to go right back to the start and create a different workflow using the Flights and Ticket Sales tables. This will allow us to get the information about what flights have not got any bookings so far.
This time we want to use a slightly different join type as we don't want to return the values that match, but instead we only want to return that values from the Flights table that do not match the Ticket Sales - therefore meaning that these flights haven't had any bookings yet.
For this we can click on the venn diagram so that only the outside section on the flights table is selected. This may differ from being on the left or right depending on which table you dragged to create the join. In our case we are creating an outer left join using the Flight Number and Date as the conditions:
We can again remove any duplicated or unneeded fields and then add a timestamp using a calculated field with a makedate function:
Flights unbooked as of
MAKEDATE(2024,01,31)
The final output should look like this:
Step 3 - Output 3
The final output is looking at information around which customers are yet to book a flight, so for this we need to combine the Ticket Sales and Customer Tables.
This is going to be a similar join as to the unbooked flights as we are wanting to only return the Customers who are not within the Ticket Sales table, so we can utilise the outer join.
This time we want an outer right (or left depending on your setup) join on Customer ID where we return only customers that don't appear in the ticket sales table.
We can then remove any duplicated or unneeded fields but make sure we keep the customer ID field from the Customer table and not from the Ticket Sales. The Ticket Sales will return null.
From here we can then calculate how many days it has been since they had last flown
Days Since Last Flight
DATEDIFF('day',[Last Date Flown],date("2024-01-31"))
Then we can create a customer category based on the number of days since they last took a flight with us:
Customer Category
IF [Days Since Last Flown]>270
THEN 'Lapsed (over 9 months since last flight)'
ELSEIF [Days Since Last Flown]>180
THEN 'Been away a while (6-9 months since last flight)'
ELSEIF [Days Since Last Flown]>90
THEN 'Taking a break (3-6 months since last flight)'
ELSE 'Recent Fliers (less than 3 months since last flight)'
END
The final output should look like this:
After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja, @JennyMartinDS14 & @TomProwse1
You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help!