2022: Week 1 - Solution

Solution by Tom Prowse and you can download the workflow here


Welcome to #PreppinData 2022! This is the first solution post of the year so if you are new to the challenges then welcome 👋. Each week we will provide a solution is two ways; 1, a step by step written solution with screenshots and explainers. 2, a video solution where we will walk you you through the solution and you can watch us use Tableau Prep. Please note that these solutions might be different so it might be worth looking at both for some different perspectives.

Let's get into this week's solution!

Step 1 - Input Data

The first step of pretty much any challenge is to input the data. Luckily, this week we only have one table to input therefore we can go ahead and connect to the provided CSV file. 

When Tableau Prep opens for the you are provided with a screen where you can select previous flows that you have been working on, or the option to 'Connect to Data' by pressing the blue button. 


From here the blue menu on the left will open and you can select the 'Text File' as an input because we are connecting to a CSV file.


Next we want to select the CSV file from our list of files, and then Tableau Prep will connect to this file and automatically create the input step for us so we are ready to start preparing data!


Step 2 - Pupil's Name

Now we have access to our data, we can start to follow the requirements to prepare our data for the output. The first step is to create the full name for each of the pupils. 

Currently, we have two fields, pupil first name and pupil last name. We need to bring these together in the format Last Name, First Name. 

To do this we can use a string calculation where we use a '+' symbol to create the full string. The calculation looks like this: 

Pupil's Name

[pupil last name]+', '+[pupil first name]

Notice that we can also include other pieces of text, and these are shown by using inverted commas / speech marks.

As a result of this calculation our table now looks like this (the fields that we have used are in the red box:


Step 3 - Parent Contact Name

This step is similar to the previous one where we need to create a string which contains the first and last name for each parent. However, this is slightly different because there are two parental contacts for each pupil and a number indicating which parent's name should be selected.

To make the selection between the two we need to use an IF statement within the string calculation so the decision can be made on a row by row basis. The calculation looks like this: 

Parent Contact Full Name 
[pupil last name]
+', '+ 
  (IF [Parental Contact] = 1 
  THEN [Parental Contact Name_1]
  ELSEIF [Parental Contact] = 2 
  THEN [Parental Contact Name_2]
  END)

The first part of the calculation just returns the last name for each pupil (for this challenge we are assuming that the children and parents have the same name), then we separate with a comma just like in the previous pupil name calculation.

The second part of the calculation is the IF statement where we make a decision on what parent to take depending on the Parental Contact field (Parent 1 or Parent 2). 

As a result of this calculation our data should now look like this: 


Step 4 - Contact Email

Step 4 builds on the previous step where we used an IF statement within our string calculation by adding a few extra pieces of text that are the same throughout all of the rows. For the email address we want to create this in a FirstName.LastName@ParentsCompany.com and we can do this using the following calculation: 

Parental Contact Email Address
(IF [Parental Contact] = 1 
THEN [Parental Contact Name_1] 
ELSEIF [Parental Contact] = 2 
THEN [Parental Contact Name_2] 
END)
+
'.'+[pupil last name]
+
'@'+[Preferred Contact Employer]+'.com'

This uses the same IF statement as before, but then adds in a '.', the last name, an '@' symbol, the parents company, and finally a '.com'. Each of the pieces of text that are the same throughout the rows are added between inverted commas to ensure that Tableau Prep recognises them as text.

Our table should now look like this: 


Step 5 - Academic Year

The next step takes us away from the String calculation theme that has been running through the previous steps. For this step we need to focus on Date calculations instead as we want to calculate what academic year each of our pupils is in based on their date of birth. 

Within the requirements we are told that each academic year starts on 1st September and that pupils born after 1st Sept 2014 are in academic year 1. Therefore we can use the following calculation to categorise each public into a year: 

Academic Year 

IF [Date of Birth] >= #2014-09-01# THEN 1
ELSEIF [Date of Birth] >= #2013-09-01# THEN 2
ELSEIF [Date of Birth] >= #2012-09-01# THEN 3
ELSEIF [Date of Birth] >= #2011-09-01# THEN 4
END

For each row we have determined whether the Date of Birth field is greater than or equal to (>=) the given date (this is shown between the #'s). We have then done this for each of the different years up to year 4. 

This probably isn't the best way forwards as you would need to update this calculation every year for the new students, but at this stage we are happy with this calculation.

The table of data should now look like this: 


Step 6 - Tidy for Output

We have now done most of the work and the final step is to clean up our table so that it matches the desired output. This can be done in a couple of ways:
  1. Three Dots then Remove 

  2. Select multiple fields (using CTRL/CMD key) and then pressing Keep Only or Remove Fields

We have decided the go for the Keep Only option, therefore we need to highlight the four fields that we want to keep (Academic Year, Pupil's Name, Parental Contact Full Name, and Parental Contact Email Address) then press the Keep Only button. 

Congratulations 🎉 We have finished the first #PreppinData of the year!!

Our output should look like this: 


You can download the full outputs here

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2024: Week 1 - Prep Air's Flow Card

2023: Week 1 The Data Source Bank

How to...Handle Free Text