2023: Week 22 - Solution


Solution by Tom Prowse and you can download the workflow here


This week we are looking at a challenge created by Tanbir Jalil to analyse and see if there is any correlation between test scores and class attendance.

Step 1 - Combine Data Sources

First we want to create a single table that contains that information from both the Test Scores and Attendance Figures data sets. To do this we can join them using an inner join on Student_Name: 




Step 2 - Clean Data

Next we want to correct some spelling mistakes in the Subject field. To do this we can select each individual option in the profile pane and ensure that English and Science are spelt correctly. 



Step 3 - Split Student Name

We can now split the student name field so that we have a column for each of the first and last name. To do this we want to use a custom split on the '_' and then rename the suitable columns to First Name and Surname. 



Step 4 - Attendance Flags

Before creating the attendance flags, we need to make sure the test scores are rounded to a whole number. 

TestScoreInteger
ROUND([test_score])

We can then create a flag based on the score:

Attendance Flag 
IF [attendance_percentage] >= 0.90 THEN 'High Attendance' 
ELSEIF [attendance_percentage] >= 0.70 THEN 'Medium Attendance' 
ELSE 'Low Attendance' 
END


Now we are ready to output the data that looks like this: 



You can download all the output from here.

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @tanbir_jalil@Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text