Posts

Showing posts from January, 2022

2022: Week 4 The Prep School - Travel Plans

Image
 Created by: Carl Allchin The final introductory challenge for 2022 looks at how are students are getting to and from the school. Are the students travelling in a sustainable manner? What's the most popular type of sustainable travel? Input There are two inputs this week. 1. The same input as week 1: 2. Travel choices where each student has filled in how they got to school in the previous week. The students entered these themselves so there are some spelling mistakes to watch out for.  Requirements Input the data sets Join the data sets together based on their common field Remove any fields you don't need for the challenge Change the weekdays from separate columns to one column of weekdays and one of the pupil's travel choice Group the travel choices together to remove spelling mistakes Create a Sustainable (non-motorised) vs Non-Sustainable (motorised) data field  Scooters are the child type rather than the motorised type Total up the number of pupil's travelling by ea

2022: Week 3 - Solution

Image
  Solution by Tom Prowse and you can download the workflow here. It's week 3 of  our introductory month and this week we are introducing some aggregation and joins! These are pretty fundamental data prep techniques so should be useful and very familiar as you continue on your data prep journey. Step 1 - Input Data The first step is to add in our data table, however this week we have two of these instead of just the single table that we had in the previous two weeks.  Therefore, we need to repeat the process for both tables by connecting to both tables and then dragging each of them onto the canvas. At this stage we should have two input steps, 1 for Grades and 1 for Students. Step 2 - Join Tables Now we have both of the tables in the workflow, we need to combine these so that we can use the data from both tables in our workflow. To combine these we are going to use an inner join. The inner join will match the key values that appear in each of the tables and then combine the rows o

2022: Week 3 The Prep School - Passing Grades

Image
 Created by: Carl Allchin Congratulations, you've made it to Week 3 of our introductory month. Hopefully by now you will know when preparing data there are many ways to create the same output. We love the variety of solutions we see and how to approach the challenges in different tools. Please keep sharing your solutions with us on Twitter, Github, writing blog posts or forums.  This week's challenge will introduce a few new concepts that will become very familiar as you increasing prepare your own data sets:  Aggregation - you've probably used the SUM() and AVG() function in Excel and many different tools but when using tools like Tableau you need to think about data a little differently. You need to think about what each row of data represents. The categorical fields like Student ID and Subject are what set the granularity  of the row of data. The more categorical fields your data set has, the more granular  your data set is likely to be. When aggregating data when prepa

2022: Week 2 - Solution

Image
Solution by Tom Prowse and you can download the workflow here . This week continued with our Prep School theme by turning our focus to dates. For this challenge we want to buy a birthday cake for each of our students, therefore we need to use the same data set from last week to work out when to buy each one. Step 1 - Input Data The first step is to input the data. This is the same as last week where we are going to input the CSV file (text file input) into our workflow.  From the input step we can also remove any fields that aren't required for this week. All of the parental fields can be removed, therefore we can use the check boxes to easily remove these from the start of the workflow:  Step 2 - Pupil's Name After inputting the data, we are going to touch on something that we covered last week; String Calculations. For this we want to do a simple calculation to bring the First and Last Name fields together:  Pupil's Name   [pupil first name]+' '+[pupil last name]

2022: Week 2 The Prep School - Birthday Cakes

Image
 Created by: Carl Allchin Welcome back for week two. The first four challenges this year are aimed at people getting used to the fundamental skills of data prep (or maybe using the challenges to learn a new tool / language).  Last week, we introduced you to The Prep School by having you help us create a nice data set with the contact details of the parents of our pupils. This week we are working on a very different problem that you might not find a piece of cake! The Prep School loves any excuse to buy a cake to celebrate and what is better than celebrating one of our student's birthdays? This sounds easy until you realise we have 1,000 students and what do we do about those who's birthday are on a weekend day? We can't have them miss out.  Input This week's input is the same as Week One's. You can download  a copy here if you don't already have one.  Requirements Input the data set Removing any unnecessary fields (parental fields) will make this challenge easi

2022: Week 1 - Solution

Image
Solution by Tom Prowse and you can download the workflow here .  Welcome to #PreppinData 2022! This is the first solution post of the year so if you are new to the challenges then welcome 👋. Each week we will provide a solution is two ways; 1, a step by step written solution with screenshots and explainers. 2, a video solution where we will walk you you through the solution and you can watch us use Tableau Prep. Please note that these solutions might be different so it might be worth looking at both for some different perspectives. Let's get into this week's solution! Step 1 - Input Data The first step of pretty much any challenge is to input the data. Luckily, this week we only have one table to input therefore we can go ahead and connect to the provided CSV file.  When Tableau Prep opens for the you are provided with a screen where you can select previous flows that you have been working on, or the option to 'Connect to Data' by pressing the blue button.  From here t

2022: Week 1 The Prep School - Parental Contact Details

Image
Challenge by: Carl Allchin Welcome to Preppin' Data 2022. When Jonathan and I started Preppin' back in 2019 I never thought we'd be still thinking up challenges after this long. But we are seeing lots of people still wanting to practice their data prep skills so Jenny , Tom and myself will keep creating fun challenges for you to be ready for data prep challenges.  Data preparation is a skill that is increasingly becoming necessary in the workplace. We are all surrounded with data sets that can help us answer questions that can improve how we work. Rarely is that data ready for analysis. Either the values are messy or the data just simply isn't structured in a way that makes analysis easy. If you want to know more about what data structures are good for analysis: read this post . I've found with most organisations I have worked with, it's rare someone will get the chance to practice data preparation skills. After all, which company wants to let their employees

2021: Week 52 - Solution

Image
  Solution by Tom Prowse and you can download the workflow here .  This week was the final challenge of 2021 and also the last instalment of our Departmental December! We focused on the Operations team this week and how we can help the Prep Air team improve by understanding their customers' complaints.  Step 1 - Complaints per Customer First up we want to calculate the number of complaints each customer has made. We can do this by inputting the Complaints table and then using a Fixed LOD to calculate this whilst maintaining the same number of rows.  For each Name we want to do a distinct count of their complaint: Complaints per Person As a result of this we should have numbers ranging from 1 complaint to 3 complaints. Step 2 - Join Department Responsible Table The next step is to combine our Department Responsible table with the Complaints table in our workflow. The problem here is that we want to join on a keyword, but that keyword is within the sentence of the complaint. To overc