Posts

Showing posts from October, 2022

2022: Week 43 - Missing Training Data 2.0

Image
 Challenge by: Jenny Martin Good news everyone! Joe was thrilled with the solution we came up with to solve his missing data problem. However, he realised that when he replicated it with the actual data, there was still one part that wasn't quite right.  You see, if the missing training sessions occur on the very first date in the dataset, the workflow we built last week will not fill in the scores since we are only filling down from previous sessions. So in the below example, Agility session would only be filled in from the 4th August onwards. Joe would like the values for the 2nd & 3rd August to be 0.  Input You'll need the workflow you built last week (go back and do the challenge if you haven't yet) An updated Player Training file:  Requirements Replace the input data from last week's workflow with the new datasource In a new branch, keep only the first training session for each player and each session  Work out the minimum date in the entire dataset Ad...

2022: Week 42 - Solution

Image
  Solution by Tom Prowse and you can download the workflow here . This week we look at sports training data and filling in the missing gaps where the players don't train. This helps to fill in the gaps and allows us to create some more realistic averages when there is missing data. Step 1 - Date of Next Session The first task is to calculate when the next training session is for each player - this will help us identify what days are missing a session. To fill in the missing dates we first need to create an ID field for each Player and Session combination. We can do this using a Rank calculation:  ID   This ID allows us to give an identifier to each of the sessions and therefore we can use this to identify the next or previous one by using a self join. Before the join we need to calculate the next ID by subtracting 1 from the ID: Next ID   [ID] - 1  This will allow us to bring the current and next session onto the same row and therefore use this within row-level ...

2022: Week 42 - Missing Training Data

Image
Challenge by: Jenny Martin Recently, my colleague Joe Stokes  brought me an interesting challenge that I knew needed to be turned into a Preppin' Data!  He was working with sports player training data. Since the athletes wouldn't take part in the same training sessions every single day, this was leading to a lot of gaps in the data. As there was a metric which calculated the average across all sessions for each day, this could lead to some misleading conclusions, with a lot of variation over time. Instead, what we want is to take the value from the previous session, even if the player didn't partake in the session on that day. This helps to keep the average a bit more stable. It also means we have data for every day, even if the player were off sick that day. Input Our input is very simple, with a row for each training session the player participated in and their resulting score. Requirements Input the data For each player and each session, we want to know what the date is ...

2022 Week 41 - Solution

Image
Solution Video Solution by Tom Prowse and you can download the workflow here . This week's challenge is a continuation of last week, where we created a multiplication table using a static input. This week we want to take that a step further and make it dynamic based on a user input. If you haven't already completed 2022 Week 40 then I would recommend doing that one first as we are going to use that workflow to start this solution. Step 1 - Create Parameter First we need to create the parameter that will allow the user to input a number to create the table.  This is a simple parameter that uses a Whole Number and is required to have an entry at run time:  By creating this parameter we can now input any number and our table will create a row and column up to that number. Step 2 - Filter Table Next we need to use the parameter within our workflow so that it creates the table based on the correct numbers.  We want to go to the start of the workflow where we have our current i...

2022: Week 41 - Dynamic Times Tables

Image
 Created by: Carl Allchin Carrying on from last week, can you now make the solution dynamic so you can create a multiplication grid of any size?  Like last week, this is more of a puzzle than a prescribed challenge so see how you get on. Input Same file as last week Requirements Input the data set Create a parameter that allows the user to set the multiplication grid they want Output the results Output An example output where 4 is selected You can download the  full output here .  After you finish the challenge make sure to fill in the  participation tracker , then share your solution on Twitter using  #PreppinData  and tagging  @Datajedininja ,  @JennyMartinDS14  &  @TomProwse1 You can also post your solution on the  Tableau Forum  where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

2022: Week 40 - Solution

Image
Solution by Tom Prowse and you can download the workflow here . This week we revisit the classic task from our school years - multiplication tables! Our task is to create a multiplication grid using an input that contains two rows - 1 & 9. Step 1 - Create Rows First up this week we need to pad the rows out so that we fill in the missing numbers between 1 & 9. We can do this using the Add Rows functionality within Tableau Prep with the following setup:  This has allowed us to create a list of numbers from 1 to 9  Step 2 - Different Combinations Next we need to again pad out the rows, but this time we want to ensure that we have a row for each of the different combinations within our grid. For this we need to use a couple of different joins. First, we can create a new clean step - we can use this as a self-join back to our New Rows step. The first join is where all of the numbers are equal from both tables: For the second join we want to do the opposite task... this time...

2022: Week 40 Times Tables

Image
 Challenge by: Carl Allchin  I loved multiplication tests as a child... yes, I know that's a little weird but for anyone who knows me they might just be nodding right now.  This week's challenge is to replicate a 9x9 multiplication grid (or times table as I called them as a child). Sounds easy, right? The input file is just the number 1 and 9 this week. Also I'm not going to offer instructions as to what you need to do as I'm intrigued to see what you all come up with.   Input One file : Requirements Input data  Create a 9x9 multiplication table Output the data Output 10 fields: Number 1 2 3 4 5 6 7 8 9 9 rows (10 incl. headers) You can download the  full output here .  After you finish the challenge make sure to fill in the  participation tracker , then share your solution on Twitter using  #PreppinData  and tagging  @Datajedininja ,  @JennyMartinDS14  &  @TomProwse1 You can also post your solution on the  T...

2022: Week 39 - Solution

Image
Solution by Tom Prowse and you can download the workflow here . This week we revisit a favourite challenge of ours... a fill down! We are looking to help the HR department fill in the missing information around key employee dates. Step 1 - Employee Field First we want to focus on the employee field. This will allow us to 'copy' the employee name down until there is another employee.  The first step is to create a table with each of our employees and their ID. To do this we can remove the Null values from the Employee field, then remove the other fields so we are left with Employee ID (renamed from Record ID) and Employee: From here we want to join this table back to our original workflow using an inner join where the Record ID >= Employee ID: This allows us to join all of the rows where the record id is less than the employee id. This creates an expanded table, but we can then use some filters to ensure that the names are correctly labelled. To ensure we are filtering correc...