Posts

2024: Week 47 - Salesforce Agent takes over

Image
 Created by: Carl Allchin & Jenny Martin  Salesforce Agent & JP King In the age of AI, we knew that it would only be so long before the machines takeover. Well Preppin' Data has finally succumbed. At the DataFam conference in London last week, Jenny and I took the opportunity to try to break  build an Agent as we were asked to see what we think. JP (a solutions engineer at Salesforce) helped guide us on a bit of a unique journey for him. We'll let you be the judge as to how well the Agent performed!  We prompted the agent with some simple instructions to look at the Preppin' Data site and build us a challenge about deduplication. Here's how it got on.  [Yes this challenge is simple but we thought it was worth sharing] Input One file that you can download here : Requirement Step 1: Connect to the Data Step 2: Add a Cleaning Step Step 3: Identify Duplicates Step 4: Remove Duplicates Step 5: Output the Cleaned Data Output One output file. 5 data fields: Customer ID

2024: Week 46 - Solution

Image
Solution by Tom Prowse and you can download the workflow here .  Step 1 - London Tube Stations The first branch that we can focus on is the input with information about the London Tube stations. We want to rename a couple of the fields to remove the 'Right_' prefix so we are just left with the Station Latitude or Longitude. We can use the Rename Fields functionality to do this:  After renaming the fields we then want to identify any duplicate rows and then remove these from the workflow. Again, we can use Tableau Prep's inbuild functionality to identify the duplicate rows and then filter to keep only the unique values. After removing the duplicates then we are ready to output our first table:  Step 2 - Attractions  Next we want to focus on our input that includes the information about the London Attractions and their footfall.  The first thing that we need to do is to ensure that the headers are identified within our table. There are a couple of options to do this including

2024: Week 46 - DataFam Europe Special

Image
Challenge By: Carl Allchin, Jenny Martin & Lorna Brown This challenge comes to you live from the first ever DataFam Europe! We've teamed up with the Workout Wednesday team to bring you a special challenge.  Since we're in London and very close to one of the most popular attractions (St Paul's Cathedral), we thought we could centre the challenge around popular London Attractions and their nearest Tube Stations.  You'll need to clean up the inputs to create outputs ready for Tableau Desktop.  Inputs The inputs for this challenge can be downloaded from either Google Drive (as normal) or Data.World . We have 3 inputs, which can be found in one single Excel File: London Tube Stations  Attraction Footfall  Location Lat Longs  Requirements Input the data For the London Tube Stations table: There are a lot of unnecessary fields, only keep information about the station name and location Clean up the field names There are a lot of duplicate rows. Make sure each row is uniqu

2024: Week 45 - Solution

Image
Video Solution Solution by Tom Prowse and you can download the workflow here .  Step 1 - Combine Tables First we want to combine all of our input tables together into a single table. We can do this using two inner joins, first joining the Available Stock and the Product tables using the Product ID:  Then join this table to the Supplier table using Supplier = Supplier ID:  After the joins we can remove any additional fields and our table should look like this: Step 2 - Orders & Quantity Now we have the inputs in a single table we can calculate the Stock and Quantity levels. First we can calculate whether stock has been ordered or not:  Stock Ordered [Quantity Available]<=10 Then we can calculate the quantity ordered:  Quantity Ordered   IF [Stock Ordered] THEN 30-[Quantity Available] END From here we can now calculate the quantity per week using the following calculations:  Previous Week Quantity Available   { PARTITION [Store],[Product] : { ORDERBY [Inventory Date] asc : LOOKUP(

2024: Week 45 - SuperBytes Stock

Image
Challenge by: Holly Jones This is the final challenge that DS43 prepared for us earlier this year - over to Holly:

2024: Week 44 - Solution

Image
Solution by Tom Prowse and you can download the workflow here . Step 1 - Sold & Returned First we want to tidy the dates a item was sold or returned so that the return date isn't before a sale date. We can identify any issues by creating an issue flag:  Issue? [Date Sold]>=[Date Returned] Then using this to recalculate the Order or Return date:  Order Date IF [Issue?] THEN [Date Returned] ELSE [Date Sold] END Return Date   IF [Issue?] THEN [Date Sold] ELSE [Date Returned] END We can then remove the Issue?, Date Sold, & Date Returned fields. Finally, we can calculate how many days it took to return an item by using a DateDiff function:  Days to Return   DATEDIFF('day',[Order Date],[Return Date]) And also calculate what the value of the returned items are:  Return Value   IF [Days to Return]<=60 THEN [Price ] ELSEIF [Days to Return]<=100 THEN round([Price ]*0.5,2) ELSEIF [Days to Return]>100 THEN 0 END At this stage our table looks like this:  Step 2 - Cal

2024: Week 44 - MiniBytes Returns Analysis

Image
 Challenge by:  Jessica Kirk MiniBytes were really happy with the analysis last week on worst performing products. This week they'd like us to look into the returns data so that they can better prepare for how much needs to be set aside for each month for returned items.  They have warned that there has been an issue with the returns systems that means some of the return and sale dates for particular items are in the wrong order (i.e. the sales date has been input as the return date). Input We've been provided with a small subset of data to build the logic from again: Requirements Input the data For all cases where the Date Sold is after the Date Returned, swap the values in these fields Calculate the number of days between the Date Sold and Date Returned For returns within 60 days, customers get a full refund. For returns between 60 and 100 days, customers will be refunded half (rounded to the nearest penny). Returns after 100 days are not refunded MiniBytes want one output to