Posts

Showing posts from April, 2021

Week 17: Timesheet checks

Image
Challenge by: Pat Lucas ( @PatForData ) This challenge came about from a challenge Pat received as his friends and family have begun to learn about his super data preparation skills. Pat has built out a detailed set of requirements so this might be an easier challenge for some. The Challenge My employees log their hours daily and are contracted to 8 hours per week so I want to check their average number of hours worked over the last 2 weeks. Also, I allow for 20% of their time (not including Chats) to work on their own special projects, meaning they should be spending at least 80% of their time on Client items of work, so I also want to check that they are sticking to instructions by calculating the % of total hours spent on Client work. The task has three sets of requirements as the stakeholder is quite specific. Input One file but three people's data: Requirements Remove the ‘Totals’ Rows Pivot Dates to rows and rename fields 'Date' and 'Hours' Split the ‘Name, A

2021: Week 16 - Solution

Image
  Solution by Tom Prowse and you can download our workflow here .  This week we looked at what would happen to the current English Premier League football table if the 'Big 6' teams were removed and didn't play any games. We wanted to create the current table and also the updated table, so there was a bit of repetition where we could leverage some of Tableau Prep's handy features. Step 1 - Calculate Home & Away Points The first step is to clean the Result field so that we see how many goals each team had scored in each match. There are some games that haven't happened yet so we can remove these by excluding the null values from the Result field.  Next, we split the Result field so that we have a field for home goals and a separate field for away goals. An automatic split should work here, but you can create a custom split using '-' as the separator. After renaming the fields our table should now look like this:  Now the home and away goals are separated,

2021: Week 16 - The Super League

Image
Challenge by Tom Prowse If you are a fan of football or have been following any of the latest news then you would have probably heard about the proposed new 'Super League' that was planned with a selection of European clubs. This would have involved a group of 12 teams playing in a competition each year without having to qualify or be relegated. The lack of  fair competition between other clubs has caused an uproar among fans, players, media outlets, football associations, and even the UK government!  The 'big 6' English teams to propose the Super League were Arsenal, Chelsea, Liverpool, Manchester United, Manchester City, and Tottenham Hotspur. One of the ideas to try and discourage the clubs from proceeding with the new league, was to threaten the English teams with being expelled from the English Premier League. The challenge this week is to try and understand how the current league table would change if these clubs were to be 'kicked out'. Inputs  The input

2021: Week 15 - Solution

Image
Solution by Tom Prowse and you can download our workflow here .  The challenge this week was all about restaurant orders and was provided to us by  Amalia García-Vellido Santías . There were lots of good fundamental data prep techniques in this one so it was good to pull a few of these together.  Step 1 - Menu Table Structure First we want to bring in the Menu table and we want to change the structure so that we can have a single column for each Type, Plate Name, ID, and Price. To do this we want to use a Columns to Rows pivot with the following setup:  As a result our data should now look like this:  From here we need to do some more tidying to get the individual columns that we require. First we want to make sure we have all of the IDs in a single column and mapped correctly for the right item:  ID   CASE [Type] WHEN 'Pizza' THEN [Pizza ID] WHEN 'Pasta' THEN [Pasta ID] WHEN 'House Plates' THEN [House Plates ID] END Then we want to do the same for the Price:  P

2021: Week 15 - Restaurant Menu & Orders

Image
Challenge by Amalia García-Vellido Santías We have another guest challenge creator with this week's challenge coming from Amalia.  This week we want to analyse the orders that customers have made over a period of time in our restaurant Serendipia. In order to identify how much money we earn each day of the week and also to discover who our top customer is. We are going to be using calculations, pivots and aggregations so lots of the fundamental techniques that are used within data prep! Inputs Menu - contains the menu of the restaurant (notice that the structure is not ideal) 9 fields 10 rows (11 + header) Orders  - each row represents the order a single customer have made at a certain date 3 fields 40 rows (41 + headers) Requirements Input the data Modify the structure of the Menu table so we can have one column for the Type (pizza, pasta, house plate), the name of the plate, ID, and Price ( hint ) Modify the structure of the Orders table to have each item ID in a different row (

2021: Week 14 - Solution

Image
  Solution by Tom Prowse and you can download the full workflow here . Prep Air was back this week and we were taking a look at the in-flight purchases within a few of the selected flights. This included combining multiple data sets , parsing strings, and identifying what seat purchases the most on our flights.  Hopefully this challenge allowed you to combine various core data prep techniques and put this in a real world scenario to help make everything easier to put together. Let's have a look at how we solved it!  Step 1 - Where is each seat located?  The first step this week was to identify where each seat was located on the plane (Window, Middle, or Aisle). Using the Seat List table we want to first pivot each seat letter so that we have all of the Seat Letters within a single column with the corresponding Passenger Number and Row:  We can now add a label for each of the seats positions:  Seat Position   CASE [Row Letter]       WHEN 'A' THEN 'Window'      WHEN &