2023: Week 11 - Solution

Solution by Tom Prowse and you can download the workflow here


This week we are looking at calculating the distance for our closest customers by utilising the Haversine formula as our spatial calculations. We are looking at latitude and longitude and then calculating the distance with some trigonometry.

Step 1 - Combine Data

First we want to input both of our data sources and then combine them together so that all of the customer locations can be compared to all of the branch locations. To append both of these table we want to create a cartesian join using a dummy field from both tables. For the dummy field we have created a new calculated field with the number 1 but you can use anything within this field as long as it matches on both tables. 

From here we can use an inner join on the dummy from each side:


After this our number of rows will massively increase as we now have a row for each combination:


Step 2 - Transform Lat & Long

The next step is to transform each of the lat & long into radians. To make this easier we want to pivot the data so we have all the customer and branch lat longs in a single column. For this we can use a columns to rows pivot with each of the lat long values in the pivot values: 


Then we can calculate the values as radians: 

Transformed Values
[Pivot1 Values]
/
(180/PI())

Then make sure the labels are updated to show we are calculating radians instead:

Radians
[Pivot1 Names]+' - Radians'

At this stage our table should look like this: 


We can then pivot the table back to the original structure that we had before by using a rows to columns pivot where we are grouping by Radians, and Sum Transformed values:


At this stage our table should look like this: 


Then we can calculate the distance between the customer location and branch locations. 

Distance 
ROUND(
3963*
ACOS(
(SIN([Address Lat - Radians])*
SIN([Branch Lat - Radians]))
+
COS([Address Lat - Radians])*
COS([Branch Lat - Radians])*
COS([Branch Long - Radians]-[Address Long - Radians])
),2)

We no longer require the lat and long fields so can remove those and our table should look like this: 


Step 3 - Customer Priority Rating

Finally we want to assign a priority rating to each of our customers based on how close they are to a branch location. 

First we want to rank the distances of the branches from each customer


Then we can use this field to filter to the closest branch for each customer (Keep Only = 1). This then provides us with a list of customers and their closest branch. 

We then want to bring back the detail about the locations, so can join the table from before the first pivot step. By joining on Customer and Branch we can then return all of the details about each of the customer locations: 

 


From here we can again rank the customers based on their distance and give them a priority score


This the final calculation that we need and our table is now ready to output:


You can download all the output from here

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text