Posts

Showing posts from October, 2023

2023: Week 43 - Solution

Image
Solution by Tom Prowse and you can download the workflow here . Step 1 - Combine Inputs The first step is to create a single table that contains the information from each of our input tables. We have an input for each of the 3 terms and another input with information around the year that the student is in. First we want to combine the 3 term tables by using an union to 'stack' them on top of each other. We then want to find the total number of days that a student was present and absent from class. We can use an aggregation step to group by first & last name then sum days present and days absent:  Finally we want to include the data around which year each student is in. This time we are going to use an inner join on both of the name fields:  After removing any duplicated fields the table should look like this: Step 2 - Attendance Rate Next we want to combine the first and last names so we have a full name field:  Full Name [First Name]+" "+[Last Name] Then we can c

2023: Week 43 - Best Attendance Awards

Image
 Challenge by: Trea McElhone This challenge is part of the ongoing series related to “Prep” School provided by DS38. The Prep School is facing an epidemic - too many students are being taken out of class for term time holidays! Thus the staff have come up with their best idea to discourage swapping out Maths class for a ski holiday to Val Thorens, awards for good attendance, because there is no better way to incentivise children than with a laminated sheet of paper and a gold star. These awards will be presented at the end of the school year in the annual Prize Giving Ceremony. As such, the Prep School needs to figure out which pupils are eligible for these awards. First, they want to give year group level awards, with the top 5% of students in each year group receiving this ‘Great Attendance!’ award. Secondly, they want to give a ‘Super Star Attendance’ award to the pupil(s) with the best attendance out of the entire school. Input Inputs for each Term 1-3 Year Group Data Requiremen

2023: Week 42 - Solution

Image
Solution by Tom Prowse and you can download the workflow here . Step 1 - Combine Inputs First we want to combine all 4 of our different tables given within the input. This involves a sheet for each year (5 & 6) and also a sheet for each year's contact details.  We want to first use joins to combine each of the contact details with the song choices for each year. This will need two joins Year 5 Year 6  From here we can then combine the results of these joins using a union step and both years data should be 'stacked' on top of each other:  Step 2 - Total Votes Next up we want to remove any song choices that don't have more than 5 votes. First, we need to total the votes for each song by using an aggregation tool to group by Year Group and Song Recommendation, and then sum the number of rows:  After we have the totals we can then filter the Number of Rows field so that it only contains data which is greater than 5  After renaming some fields we are ready to output our

2023: Week 42 - Prep School "Disco Fever"

Image
Created by Zoe Reed This challenge is part of the ongoing series related to “Prep” School provided by DS38. The Prep School are holding a disco for Year 5 and 6. Each student has chosen a song they would like played – but the DJ will only play songs with 5 or more votes! Your challenge is to find out which songs are going to be played at the disco. Input Four different inputs: 1. & 2. are details for the students (one file for year 5 and one file for year 6) 3. & 4. are the details for song choices by the students (again one file for year 5 and one for year 6) Requirements Input the data The organiser would like to join the student details to the students’ song choices – join Year 5 Contact Details to Year 5 Song Choices, and the same for the Year 6 datasets. All the students need to be in the same table in order to count votes for each song. Use a Union step to combine the Year 5 data and the Year 6 data. There are now two columns that describe which song each pupil has voted

2023: Week 41 - Solution

Image
Solution by Tom Prowse and you can download the workflow here . Step 1 - Input Tables First we want to input both of our data tables into the workflow. The next instruction is to join both of these tables together using an inner join on Student ID, however, this isn't actually required as this will only bring in the name of each of the students and this isn't something that we require to complete the task.  We can join these tables but for efficiency we are only going to work from the Nationality table. Step 2 - Clean Nationality Next we want to clean the spelling errors in the Nationality field. To do this we can use Tableau Prep's in-built functionality which will group values based on similar spellings. We can adjust the sensitivity until we have the correct grouping that we require:  Step 3 - Count Students We can now create a count of students based on the classroom and nationality. For this we want to use an aggregation step to group by classroom and nationality and a

2023: Week 41 - An International School

Image
 Created by: Adrien Sourdille This fundamental tableau prepping challenge comes from DS38 - Adrien Sourdille . Over to Adrien to explain the challenge: “Because I studied in a bilingual school, I grew up in an international environment where my classmates had very varied origins. The number of nationalities represented in my classroom always baffled me and is the inspiration for this challenge.” Your task is to group students by nationality and figure out what nationality is the most represented within each classroom. Input Two csv files: 1. Student Name 2. Student Nationality Requirements Input Data 1 and Input Data 2 Join both datasets on the Student ID (p.s. you shouldn't always listen to the requirements :) ) Group Values by Spelling to get rid of spelling mistakes in the Nationality field Aggregate the dataset to get a count of students within each Nationality and classroom Create a calculated field to output the Rank of each Nationality by classroom in descending order. Foll

2023: Week 40 - Solution

Image
Solution by Tom Prowse and you can download the workflow here . Step 1 - Pivot Class First we want to bring all of the classes into a single column so that we have a field for the Subject, Class or Dropped, and Student Name.  To do this we can use a Columns to Rows pivot where we use a wildcard pivot to bring in all fields with the word 'Class': Step 2 - Group & Active Flag Next we want to correct the spelling mistakes within the Subject field by using the in-built grouping features within Tableau Prep. We want to group by 'Spelling' which will group together the subjects that are spelt similarly into the correct spellings.  We can then remove any nulls from this list and this will give us all of our relevant subjects in the correct format. After this we can create a flag to determine whether the class is active or dropped. To determine this we can remove the number from the Class or Dropped Class field, again using the in-built functionality within Tableau Prep. We

2023: Week 40 - Prep School Subject Dropouts

Image
Created by: Ed Hayter This challenge is part of the ongoing series related to “Prep” School provided by DS38 that we started to share earlier in 2023. This Preppin’ Data Challenge requires you to process the school’s log of student class choices that were recorded in a structure that made sense for timetabling and reshape it so that the school has a breakdown of drop-out rates for each subject. Because the data was inputted manually there are typos in subjects that need to be corrected. Input One file which you can download here . Requirements Input the student class choice data   Make sure Prep reads the first line of the file as headers. Pivot the data so that we have a row for each class choice. Group the Subject Names on Spelling to standardise the subject names and exclude nulls. Create a flag to mark the groups that are dropped or active. Aggregate the Data by Subject and Active Flag and count the students in each group (note you'll need to use Count rather than Count Distinc