2021: Week 52 - Solution

 


Solution by Tom Prowse and you can download the workflow here


This week was the final challenge of 2021 and also the last instalment of our Departmental December! We focused on the Operations team this week and how we can help the Prep Air team improve by understanding their customers' complaints. 

Step 1 - Complaints per Customer

First up we want to calculate the number of complaints each customer has made. We can do this by inputting the Complaints table and then using a Fixed LOD to calculate this whilst maintaining the same number of rows. 

For each Name we want to do a distinct count of their complaint:

Complaints per Person


As a result of this we should have numbers ranging from 1 complaint to 3 complaints.

Step 2 - Join Department Responsible Table

The next step is to combine our Department Responsible table with the Complaints table in our workflow. The problem here is that we want to join on a keyword, but that keyword is within the sentence of the complaint.

To overcome this, we need to perform a cross join, so that all of the departments and joined to all of the complaints, and from there we can see if the complaint contains one of the key words.

First we want to do the cross join between the tables. For this we need to use a calculation to create a field within both tables that can be used as this join. We have decided to use a 1, but you can use anything as long as it's the same in both tables.


From here we can remove all the created fields containing 1 as we only required these for the cross join.

Next, we can focus on whether the field contains the keyword from each department. Before we filter our results we first need to make sure the Keyword and Complaint fields are in Lowercase. We can use the Clean functionality in Prep to do this easily. 


We can now start to filter the extra rows that were created as a result of the cross join by using a calculation within a filter to determine whether or not the complaint contains the keyword: 

CONTAINS([Complaint],[Keyword])

At this stage our data should look like this with only the complaints that contain a keyword a the corresponding Department:


Step 3 - Comma Separated Field

Now we have given each of the complaints a department, we now need to combine any keywords that appear for each complaint and department combination. By combining these keywords with a comma then we will eliminate any duplicates or multiple rows of the same complaint.

First, we need to pivot our data so that each keyword is it's own field with the keyword in the rows as well. To do this we need to duplicate the Keyword field, and then use this within a Rows to Columns pivot where the Keyword is in the Pivoted Fields and a min of Keyword-1 is in the Aggregated section. 


As a result of the pivot we should have each of the keywords as a field with a null or the keyword in each of the rows: 


The next step is to create the concatenated, comma separated field for each of the keywords. There isn't a nice feature within Tableau Prep to do this but we can use the following calculation to create the concatenated list:

Complaint Causes
iif(isnull([broken]),'',[broken])
+iif(isnull([food]),'',', '+[food])
+iif(isnull([room]),'',', '+[room])
+iif(isnull([toilet]),'',', '+[toilet])
+iif(isnull([luggage]),'',', '+[luggage])
+iif(isnull([price]),'',', '+[price])
+iif(isnull([advertise]),'',', '+[advertise])
+iif(isnull([baby changing]),'',', '+[baby changing])

After this calculation it will create a comma before each of the keywords (even if they don't need a comma), therefore we need to remove these additional commas.

Complaint Causes
IF LEFT([Complaint causes],2) = ', ' 
THEN MID([Complaint causes],3)
ELSE [Complaint causes]
END

As a result of this calculation we have removed the extra commas, and then only returned the keywords as required. After removing all of the keyword fields our table should look like this: 


Step 4 - Unknown or Other Complaints

The final task of this challenge is to categorise any complaint that doesn't have a keyword. To do this we need to include join back to our original complaints table (where we calculated complaints per customer in step 1) so that we can filter the complaints list so that we only have the ones without a keyword/department. 

We can use the join conditions to create this filter by using an outer right (or left depending on your setup) join where Name = Name and Complaint = Complaint: 


After removing and renaming some of the fields our table should look like this: 


Notice how the Department and Complaint Causes fields are null, therefore we can replace the nulls with Unknown and Other in the corresponding fields. 

Now there aren't any nulls within these fields we can combine them back with our other complaints so that we have the full list that we started with. Using a union tool we can bring back the two tables together by 'stacking' them on top of one another:


The final task is to remove the Table Names field and we are ready to output our data.


You can download the full outputs here

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help!

Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text