Posts

Showing posts from April, 2024

2024: Week 18 - San Diego Zoo TC24 Special

Image
Challenge by: Jenny Martin Welcome to the Tableau Conference Special for Preppin' Data. If it's your first time here, then a very warm welcome to you! This week we have something pretty special planned for you, with a 2 stage challenge! Seeing as Tableau Conference is in San Diego this year, we thought we'd take a trip to the world famous San Diego Zoo! They've got lots of data on the animals and plants that live at the zoo, but it's spread across multiple data sources. That's where they need our data prep skills to help them out! Stage One For the first stage, we need to bring together information about the animals and plants at the zoo, with their endangered status and wildlife class. This will help prioritize care for the wildlife in Stage Two of the challenge. Inputs All the data required for this challenge is contained within a single Excel file with multiple sheets. For Stage One, you will need 3 tables: Wildlife - a table listing all the animals and plan

2024: Week 17 - Solution

Image
Solution by Tom Prowse and you can download the workflow here . Step 1 - Month Spending Input First we need to input the Monthly Spending table so that we have all of the months in a single table. To do this we can utilise the 'Union multiple tables' option within the input step. We want to include all tables that have the word 'Month' in the title and include the wildcard symbol afterwards so it picks up multiple months. Once we have all the tables in a single input, we can then clean this table by removing all of the fields with a month-year title (eg, Apr-23, Aug-23 etc) and rename TransactionFees to Transaction Fees. Next, we want to utilise the Table Names field to extract the month name for each of the rows. To do this we can split off the last occurrence after the 'Month-' text:  After removing and renaming some fields our table should look like this:  Step 2 - Budget Input Next we can input the Budget table. For this table we want to identify the Budget

2024: Week 17 - Budget vs Actuals Part 2

Image
Challenge by: Michał Mioduchowski We're continuing with DS43's challenges so over to Michał to explain the her next challenge.  _____________________________________ Superbytes has tasked us with examining their historical budgets once more, this time focusing on 2023, when the company began tracking actual spending on a monthly basis. The objective is to determine primary areas contributing to over-expenditure over the course of the year. CEO Phil suspects that the company may have invested too much into inventory without witnessing a corresponding uptick in sales performance last year. If suspicion turns out to be true, it will provide actionable insights for optimising resource allocation, and enhancing overall financial performance in line with Superbytes' strategic objectives. Inputs There are 13 inputs this week: Forecasted Spending  Actual Monthly Spending (1 table per month)  Requirements Input the Excel file : Combine the Monthly sheets into a single table Since w

2024: Week 16 - Solution

Image
Solution by Tom Prowse and you can download the workflow here . Step 1 - Clean Actuals & Budgets The first step is to input the Actual and the Budget tables so that they are separate branches in our workflow. We want to combine these branches but first we need to clean each table so that they join nicely.  Actual   For this branch we just need to make a change to the 'TransactionFees' value in the 2022 field and make sure there is a space between the two words - 'Transaction Fees'. Budget For this branch we need to make a few changes to the Budget field so that the numbers are all formatted correctly as a number instead of a string. To do this we can duplicate the Budget field, then remove all letters from this field using the Clean functionalities in Tableau Prep  From here we can change this to be a whole number and then use the following calculation to format the budget numbers correctly:  Budget   IF ENDSWITH([Budget],'K') THEN [Budget-1]*1000 ELSE [Budg

2024: Week 16 - Budget vs Actuals

Image
Challenge by: Michał Mioduchowski We're continuing with DS43's challenges so over to Michał to explain the next challenge.  _____________________________________ For this week, Superbytes has requested us to look at their budget sheet for 2022. The company believes that their rough budget estimates were very close to the final budget allocations for that year. Examining historical budget data allows the company to identify trends in spending patterns over time. This can provide valuable insights into how expenses have evolved and help predict future expenditure trends. CEO Phil Down would like us to find the exact areas of mismatch between predicted and actual spending. Inputs There are 2 inputs for this challenge: Forecasted Spending   Actual Spending Requirements Input the Excel file Match Sheet 1 and Sheet 2 in formatting. Both should include: Category field [String] Budget/Actual field [Number (Decimal)] Join both sheets based on Category field to create a single table wit

2024: Week 15 - Solution

Image
Solution by Tom Prowse and you can download the workflow here . Step 1 - Days to Removal First we want to combine both data sets by using an inner join on Product ID:  This then allows us to calculate the number of days that it took to remove the products. We can calculate this using a datediff calculation:  Days to Removal   DATEDIFF('day', #2025-05-13#, [Date]) As a result our table should look like this:  Step 2 - Recall Targets Next we want to categories the products to determine whether or not they had met the target. Using an IF calculation we can give each one a category:  Category IF [Days to Removal]<=7 THEN 'On Target' ELSEIF ISNULL([Days to Removal]) THEN 'Incomplete' ELSE 'Overdue' END This allows us to give each of the products a category based on whether they have been removed on time or not. Step 3 - Output 1 - Store Rank We can now start to create our outputs. The first one is where we rank our stores based on how long it has taken the

2024: Week 15 - Store Data Part 2

Image
Challenge by: Hewan Zewdu We're continuing with DS43's challenges so over to Hewan to explain the her next challenge.  _____________________________________ For this week’s Preppin Data challenge, we’ll continue looking at stock/ store data where some items had to be recalled from the shelves. This time we will be determining how long it has taken to remove those items, and whether or not they have been removed at all. All stores were told the items in the ‘Recalled Items’ table needed to be removed from shelves on 13/05/2025. They were given 1 week to recall all the items (i.e. this must have been completed by 20/05/2025). Inputs There is 1 input file with two sheets which is identical to the previous challenge, but now containing a date field, which is the date the items were removed from the shelves. You can download the output from here . Requirements Input the data Calculate how long it has taken for each item to be removed, based off the day the recall was announced (13/

2024: Week 14 - Solution

Image
Solution by Tom Prowse and you can download the workflow here . Step 1 - Recalled Items The first step is to identify which items have been recalled, these can be found in the Recalled Items table but we need to identify these at a store level.  To do this we can use an inner join on product ID between the two tables:  This acts like a filter and only returns the items that have been recalled. After removing the duplicate fields the table should look like this:  Step 2 - Total Losses Next we can calculate the total losses as a result of the recalled items.  For this we need to calculate the total price of the items recalled. We calculate this by multiplying the Unit Price by the Quantity.  Total Price   [Unit Price] * [Quantity] We can then use this to calculate the total losses split by Category and also by Store. For this we'll need two separate aggregation tools on different branches.  Per Category Per Store   We can then make sure the Total Price field is rounded to 2 decimal p