Posts

Showing posts from June, 2024

2024: Week 26 - SuperBytes Customer Complaints Part 2

Image
 Challenge by: Tobias Colmer We're continuing with DS43's challenges so over to Tobias  t o explain his next challenge.  _____________________________________ After recovering from last week's data breach that targeted our Superbytes complaint data, more discrepancies within the data were discovered, such as complaints that took over 100 days to resolve being considered as timely. We need to fix this. Input The input for this challenge is the output from last week. Requirements Input the data For Customer Complaints that are still In Progress, set the Date Resolved to today's date (for the output below this is set to 26th June 2024) Create a Days to Resolve field for Complaints which are closed Create a Days Open field for Complaints which are In Progress Update the Timely Response field to correctly say whether it was Timely or Not, based on if the Date Resolved was less than 14 days Create a parameter so a user can choose whether to output In Progress complaints or Cl

2024: Week 25 - Solution

Image
Solution by Tom Prowse and you can  download the workflow here .  Step 1 - Split Complaint Description First we want to work on the Complaints data source and then parse the Complaint Description into the Product ID, Issue Type, and the Description. Product ID   The product ID is the first set of numbers and letters before the '-' symbol. For this we can use a split to split only the first occurrence before the '-' and leave the remaining text in a separate field:  This leaves us with the Product ID and the remaining text in separate columns, and we can then remove the original Complaint Description field:  Issue Type   Next from the remain text we can extract the issue type. For this we want to use the ':' as the separator:  Then we can rename this field and remove the original split field:  Complaint Description   Finally we can extract the Description in a nice format by removing the ' " ' from either end. For this we can use two splits, and make

2024: Week 25 - SuperBytes Customer Complaints

Image
Challenge by: Tobias Colmer We're continuing with DS43's challenges so over to Tobias  t o explain his first challenge.  _____________________________________ This week Superbytes needs you to take a look at their complaints data, as an error was spotted in a recent data leak posted by rival supermarket WeakFloats: we need to check if their claims are true! They claim that: we have no category field in our data clearly stating the category of products being complained about & that we were resolving complaints before we’d even received them! Inputs There are 2 inputs for this challenge: The Complaints Data Table  The Product Category Lookup Table  Requirements Input the data Input the data Split the Complaint Description field into: Product ID Issue Type Complaint Description (the text the customer wrote to describe the complaint) Extract the first 2 letters of the Product ID as this represents the Category Code Use this Category Code to join on the Product Category from the

2024: Week 24 - Solution

Image
Solution by Tom Prowse and you can  download the workflow here .  Step 1 - Weekly Salary First we want to input the Tenures & Salaries table and then calculate a single row for each week when a salary will be paid. Currently, the salaries are at a yearly total so therefore we first need to calculate this on a weekly basis:  Salary  [Salary]/52 Now we have the weekly total, we can start to create a row for each of the different weeks. For this we can use the 'New Rows' step but first we need to create an end date for our range.  For this we can make sure that the End date is a Date field, and then replace the last date '17/06/2024' with the end date in the requirements '12/06/2024'. We can then create the new rows step where we create a row between the start and end date for each week:  Step 2 - Output 1 We are now ready to output our first table. For this we need to create a total salary per week so we can create an aggregation step and group by Week then Su

2024: Week 23 - Solution

Image
Solution by Tom Prowse and you can download the workflow here .  Step 1 - Combine Salaries & Tenures In order to combine the Salaries and Tenures tables, we need to ensure that the Names in the Salaries table are in Title case. We can use the Proper() function to do this and make sure that they match the Tenures table.  Name   PROPER([Name]) We can now join these tables together using the Name field:  Step 2 - Tenure  Now we have both tables together we can make sure the End Dates are formatted correctly by changing the field to a date and making sure that the '28/05/2024' is changing to '05/06/2024'. We can then use this end date to calculate the tenure:  Tenure DATEDIFF('day',[Start Date],[End Date]) Then use the Tenure to calculate the Expected Total Salary:  Expected Total Salary   ROUND(([Tenure]/365)*[Salary]) After these calculations the table should look like this:  Step 3 - Amount Paid Finally we can combine the Amount Paid table by using an inner j

2024: Week 24 - SuperBytes Salaries Part 2

Image
Challenge By: Andrew Tobin We're continuing with DS43's challenges so over to Andrew  t o explain his next challenge.  _____________________________________ Manager salaries at SuperBytes are paid weekly. Salaries are consistent throughout a given manager’s employment, but may change when a manager is replaced. SuperBytes wants to know two things: How much have they been spending on manager salaries each week since the company’s founding? What percentage of annual expenses is going to manager salaries? Input This week’s data has two sheets: One is similar data to last week’s, but the cleaning and joining is already done for you, giving a single sheet containing manager salaries and tenure dates The other contains the total expenses by year for the company Requirements Input the data Set the end date for current employees as 12th June 2024 Calculate the weekly salary of each manager (assuming 52 weeks in the year) For each manager, create a row for every week they receive their

2024: Week 23 - SuperBytes Salaries

Image
Challenge By: Andrew Tobin We're continuing with DS43's challenges so over to Andrew  t o explain his first challenge.  _____________________________________ Recently, SuperBytes realised they have underpaid one of their managers, Max Date, by around £5,000 over the course of his career. They want to know if they’ve done the same for any other managers. Assume all managers have received paychecks (correct or incorrect) up to the present date (5th June 2024) Input This week’s input contains three sheets:  salaries , containing annual contracted salaries  tenure , containing manager start/end dates  amount paid , containing the actual total amount each manager has received over their career  Requirements Input the data Input the data Clean the manager names in the Salaries and Tenures sheets so they match Bring together the two datasets Calculate the tenure (number of days) of each manager’s employment Hint: For current employees, see what happens when you change the datatype of

2024: Week 22 - Solution

Image
Solution by Tom Prowse and you can download the workflow here .  Step 1 - Loyalty Points & Email After inputting the table relating to Loyalty Points, we can then make sure that the DateTime_Out field is correctly formatted as a date. For this we can change the data type to a Date then rename the field to just Date.  We can then focus on the loyalty points. We want to extract the numeric part from the Loyalty Points field. We can do this by using the Automatic Split functionality which will nicely bring out just the numeric part of the string. Then we can tidy the field by changing the data type to a decimal, removing the original Loyalty Points field, then renaming the Split field to Loyalty Points. Next we want to extract the First and Last name from the Email Address field. For this we can use a custom split on the '@' and extract everything before the symbol. We can then split this field again using an automatic split which will extract the first name and last name init