2023: Week 19 - Solution



Solution by Tom Prowse and you can download the workflow here


This week was a Tableau Conference special! This was the first time that Jenny & Carl were able to get lots of other preppers into a room at TC and work through a Preppin' Data challenge all together.

The challenge has three parts, each with increasing levels of difficulty so let's look at how we can solve them. 

Challenge 1 - Beginner Level

Step 1 - Split Description

First we want to input the Session Description table and from here we can split out the Speaker Name and Presentation Description from the Description field.

To do this we want to split the Description field using a ':' as a separator. This will split the field into two fields - before the ':' is the presenter and after the ':' is the description. 


We can then rename the fields and remove the Description field so the table looks like this: 



Step 2 - Create Initials

We can now extract the first letter of each name by splitting the name field using a ' ' as a separator. This will give us a field for each of the first name and last name where we can use a calculated field to bring both of the initials together: 

Speaker
LEFT([Name - Split 1],1)+LEFT([Name - Split 2],1)

After tidying the fields, the table should look like this: 


Step 3 - Categorise Sessions

Next we want to categorise each of the sessions based on key words within the Description. First we want to make sure that the Description is all lower case so that we don't have any case problems. 

We can then use the following calculation to help create the session Subject: 

Subject 
IF CONTAINS([Description],'prep') 
THEN 'Prep'
ELSEIF CONTAINS([Description],'server') 
THEN 'Server'
ELSEIF CONTAINS([Description],'desktop') 
THEN 'Desktop'
ELSEIF CONTAINS([Description],'community') 
THEN 'Community' 
END

This gives us a key word for each of the subjects and we can then create a Boolean field that will identify whether or not the sessions is about deduplication: 

Deduplication Flag
CONTAINS([Description],'deduplication')

We can then filter to keep only the True values and we are ready to output our first table: 



Challenge 2 - Intermediate Level

Step 1 - Pivot Rooms

Next we want to include a 2nd data set which contains information about the rooms. After inputting we want to pivot the table so that we have the Room and Floor on the same row.

To do this we can use a Columns to Rows pivot where we use a Wildcard Pivot on 'Floor' so that if includes all of the fields with the word Floor:


From here we want to remove any nulls from the Floor field and rename Floor to Session and Pivot1 Names to Floor so the table looks like this: 



Step 2 - Room & Session Details

First we want to create a room number for each session. We can do this by removing all letters and trimming spaces from the Floor field:



Then we can create a room number with the floor number with this calculation: 

Room 
[Floor]+'0'+[Room]

Then we can split the Session field into Session Name, Speaker and Subject by using a custom split of all values using a '-' as a separator.


We can rename the fields so our table looks like this: 


Step 2 - Combine Output 1 

We can now combine our current workflow with the output from the first task. To combine the tables we can join these using the Subject and Speaker fields from each table. 


We only want to focus on floor 2, so we can filter the results to keep only 2 from the Floor field. Then remove the fields that are no longer required and the output should look like this: 



Challenge 3 - Advanced Level

Step 1 - Reshape Data

Next up we have a distance matrix between each of the rooms. For this to be useful within our analysis we first need to change the shape of the table so we have a column for room A, room B and the distance. 

To reshape the data we can use a columns to rows pivot where we can use the wildcard so that all fields that contain a 0 are included:


After the pivot we can rename the fields and then remove any null values from the Meters field. Finally we want to remove any rooms that are the same in both A & B so we can use a filter to where [Room A]!=[Room B]

As a result our table should now look like this: 


Step 2 - Travel Times

We can now calculate how long it will take to get between each of the rooms:

Mins to Next Room
([Meters]/1.2)
/
60

Then we can make sure that these are rounded up to the nearest minute by using the Ceiling function:

Mins to Next Room 
CEILING([Minutes to next room])

Now we have calculated the time we can combine this with our 2nd output from earlier in the challenge. 

To combine these outputs we need to use two different branches, one for Room A and another for Room B.

Room A 

We can join both tables where Room = Room A using an Inner join


Then we can tidy the table to rename Room B to Room A, and Room to Room B.

Room B 

We can do the same join for Room B but this time make sure we are choosing the correct field so Room = Room B: 


This time we don't need to rename any fields.

Both branches are now ready to be brought back together by using a union step. As the tables have the same field structure they 'stack' on top of each other nicely to create a longer table. 

Finally we just want to filter to the room that we are currently in (302) so we can Keep Only 302 in the Room A field. 

Our final table should look like this: 



After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text