2020: Week 48 - Solution

 

Solution by Tom Prowse and you can download the full workflow here


This week's challenge is a tough one and a bit of a monster workflow! We were looking at allocating gates to different planes in the most efficient way so that we could improve efficiency and customer service. This involved an iterative process which is a little tricky in Tableau Prep but let's take a look at how to solve it!

Step 1 - Split Gates

First we want to use the Stands table, and use the split functionality to split each of the gates into a separate column. We'll need to use a custom split, on all values using a ',' as a separator and this will split the Accessed by Gates field into 7 different columns: 


We then want to reshape these fields using a pivot, so that they are in a single column. We can use a Columns to Rows and a wildcard on 'Split' within the pivot tool: 


Finally, we can clean up the fields after the pivot by removing any empty fields from the Accessed by Gates Split field, and then removing the Pivot1 Names and Accessed by Gates field. Our data should now look like this: 



Step 2 - Time to Reach Stands 10-12

Before bringing in the Remote Stands table, we want to filter our workflow so that we are only focusing on gate/stands that require a bus. So we use the filter to keep only values that equal 'Y' in the 'Requires Bus?' field. We can then join this to our Remote Stands table using the following join condition:



We now want to include all of the gate/stands, so on a separate workflow branch we now want to filter for values where a bus isn't required (Bus Required = 'N') and then union this onto our newly joined table. 


We can also replace any nulls within the Time to Reach Stand field with 0, as these stands can all be accessed without a bus.

Finally, to clean the Stand field, we can use the Remove Letters function to leave us with just the stand number. Our table should now look like this:



Step 3 - Stand Allocation Times

For this step we want to focus on the Stand Allocations 01.02.2020 AM table. You'll notice that there is a data on the end of the table name, which we can use in our date calculations. To bring the field name into our data table, we can use a union field (with only one connection) and this will create a field called 'Table Names'. 

We can then parse out the date from the table name, and this will also help to make the process repeatable in the future. We can parse the date using the following calculations: 

Date
REGEXP_EXTRACT([Table Names],'s (.*) AM')
This extracts all characters between the 's ' and AM.

Date
[Date]+" "+[Time]
Combines the Date and Time field.

Date
DATEPARSE('dd#MM#yyyy hhmm',[Date])
Creates a date field from the string.

As the requirements tell us that each gate in needed for 45 mins, then we can also calculate what time each gate is required until: 

Needed Until
DATEADD('minute',30,[Date])
We only add 30mins as it isn't required up to :45 mins. 

We can now remove the Table Names and Time field, and our table should look like this:


Now we have a start and end time for each gate, we can join this onto our workflow using the Stand field: 


Next we can clean the Gate field in a similar way to the Stand, by removing all letters and converting to a number. Once we have done that, we are ready to use the Gate Availability table as a scaffold to fill in the times in-between. To do this we use the following join condition: 



Finally, we want to clean some fields by removing Gate & Needed Until, then renaming Date to Boarding Begins, and Date-1 to Date. 

Step 4 - Iteration Process

Now we have our data cleaned it's now ready for the iteration process where we assign different gates depending on the instructions from the requirements. 

First up we want to assign all flights that are only accessible from 1 gate. Therefore, first we need to calculate how many flights are from each gate: 

Number of Gates

After calculating this, we then want to split the workflow into two separate branches - Number of Gates >1 and Number of Gates =1.

Gates = 1
For this there is only one possibility, therefore we can join this onto our Gates Availability table using an Inner Right join on the following conditions: 


We can then remove Date, Number of Gates, and Accessed by Gates, then rename Date-1 to Date. 

Gates > 1
All of these flights have more than 1 gate, therefore aren't assigned in the first iteration, therefore we can remove the Number of Gates field and join this to our Gates =1 workflow. The join conditions look like this: 


Then the final step in the first iteration is to identify whether the gate is available for the full period. First we want to keep only nulls from the Flight-1 field and rename Date-1 to Date. We can also remove the following fields: 
  • Time to Reach Stand-1
  • Stand-1
  • Date
  • Boarding Begins-1
  • Gate
  • Flight-1
  • Require Bus-1
After cleaning our table up a bit, we can now calculate whether the gate is available for the whole period. We use a Fixed LOD for this: 

Gate Available for Entire Period?


From this calculation we are CountD on the Date field for each Flight and Accessed by Gates combination. As we are working on 45 mins intervals, we would expect there to be three 15 min periods in the entire period. Therefore, we want to filter the Gate Available for entire period? field to keep only values that equal 3. 

Our workflow for this first iteration looks like this:

Our table looks like this:


Step 5 - Iteration 2: Remote Gates

The next part of this iteration logic is to assign all flights to remote gates via the bus transport. Therefore the first step is to identify whether the flight requires a bus or not, so we filter 'Requires Bus' and keep only 'Y'. 

Next we want to find out what the minimum time to the remote stand for each flight. We calculate this using this LOD: 

Min Time to Stand

We can then use this field in a filter to keep only the fields where the Min Time to Stand = Time to Reach Stand. As per the requirements we are looking to minimise the bus transport times.

We now want to self-join this using the following join conditions: 



After the join we need to merge some of the fields which have come from both sides so that we can eliminate any of the null values. We are going to merge the following fields with their -1 equivalent: 
  • Boarding Begins
  • Time to Reach Stand
  • Stand
  • Flight
  • Requires Bus?
We can then remove Date & Accessed by Gates, then rename Date-1 to Date. Our table should now look like this: 


Notice there are still some nulls, but these will get filled in later. 

The next part of this iteration is to find filter the flights that we have just assigned. To do this, we need a list of just the flight numbers, therefore we use an aggregation tool to return one row for each flight number: 


Then we can use an outer join to our original flight data. Using the outer join will act as a filter and remove any rows where the flight has been assigned. 


We then want to join the newly filtered flight list to all of the remote flights that we merge earlier in the workflow: 


Then the final step in this iteration is to keep only null values from the flight-1 field. 

Our workflow looks like this for the 2nd iteration: 


The data table should look like this:


Step 6 - Iteration 3

The next iteration is similar to the first iteration that we completed where there was only 1 possibility. However, we are going to run this again following the merging of the data from the Remote Gates. 

Again, we are going to start by filtering by number of gates into separate branches.

Gates = 1
The first branch is where the number of gates equal 1, and we are then going to self-join with the following conditions: 


We then want to merge the fields like we did in the previous iteration. We are merging: 
  • Flight
  • Stand
  • Requires Bus?
  • Time to Reach Stand
  • Boarding Begins
Gates > 1
Within this branch we want to filter where the Number of Gates is greater than 1 and then remove the Number of Gates field. 

Once filter, we want to join this back onto the Gates = 1 branch using the following join conditions: 


Then from here we want to keep only null values from the Flight-1 field, and calculate whether the gate is available for the entire period using this LOD: 

Gate Available for Entire Period


Again because we are looking at 45 min period, we only want to keep values that are equal to 3 from the Gate Available for Entire Period field. 

This is the end of this iteration so our workflow looks like this:


The data table looks like this: 


Step 7 - Leftovers

Now we have completed the different scenarios to assigning gates, we have some leftovers that still need to be assigned. 

For this we want to go right back to the start of the workflow to our Remote Stands table. From here we have created a new step then followed these steps: 
  1. Remove Letters from Gate
  2. Change Gate to Number (Whole)
  3. Rank Time to Reach Remote Stands using this calculation: 
    Order to Assign Leftovers


  4. Remove Time to Reach Remote Stands
We now want to join this to our workflow after the third iteration using the following join conditions: 


After the join we want to only keep values where the lowest value in the order is assigned to the leftovers. 

Therefore we use the following calculations: 
  1. Find the minimum order per flight

    Min Order


  2. Filter where Min Order = Order to Assign Leftover
We then want to calculate where each gate is only assigned to 1 flight: 

Each Gate only Assigned to 1 flight?


And then filter where Each Gate only Assigned to 1 flight? = Flight

Our table should now look like this: 


We now want to join the leftovers to where we merged the fields in the final iteration using these join conditions: 


We then want to follow the same process as we have done previously, where we merge the fields, and filter the flights. 

First, aggregate the flight numbers:


Then join this to our workflow:



Then join back to the merged fields: 


Filter to keep only the null values from Flight-1 and then join back to the merged fields: 


Then re-merge the fields as we have done in previous steps. 

After the final merge and removing any additional fields, we are now ready to output our data!


Our full output can be downloaded here

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14@JonathanAllenby & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text