Posts

Showing posts from December, 2021

2021: Week 52 Departmental December Operations

Image
 Created by: Carl Allchin One of the constant tasks I have had in my career is dealing with complaints data to ensure the customers / clients of the organisations get a better experience than previously. The final challenge for 2021 is helping our Prep Air team improve by understanding their customers' complaints.  Input One Excel workbook . Two worksheets: 1. Complaints 2. Department Responsible Requirements Input the spreadsheet Count the number of complaints per customer Join the 'Department Responsible' data set to allocate the complaints to the correct department Create a comma-separated field for all the keywords found in the complaint for each department ( help ) For any complaint that isn't classified, class the department as 'unknown' and the complaint cause as 'other' Output the file Output 5 data fields Complaint Department Complaint causes Name  Complaints per person 13 rows (14 incl. headers) You can download the  full outputs here .  After

2021: Week 51 - Solution

Image
  Solution by Tom Prowse and you can download the workflow here .  This week we visited the IT Department to give them some help with how to optimally store their data by using fact and dimension tables. We received a desired schema from the IT department, so it was our job to split apart our table so that it meets their requirements. Step 1 - Store Name The first step is to input our table, and then we can focus on splitting out the Store name from the OrderID field. The OrderID contains the Store, then a '-', and finally a number (the OrderID). We want to separate each side of the '-' into separate fields, therefore we can use the automatic split function to do this. Once we rename the newly split fields we should have separate columns for Store & OrderID: Step 2 - Returned Field Next we want to determine whether or not the item was returned, and then turn this into a binary field. Currently, if the item has been returned then it has the string 'Return Process

2021: Week 51 - Departmental December - IT

Image
 Challenge by: Jenny Martin It's the IT department's turn to receive our help this week. Rather than optimising the data for feeding into a data visualisation tool, we're going to think about how to optimally store data. In particular, we're going to be thinking about breaking a dataset down into fact and dimension tables.  In many challenges, we focus on bringing datasets together that have a common ID field. Databases will often follow a star schema, where each dimension is a separate table that can join onto the main fact table. We have received the following desired schema from IT: Input We have one large table to break down into the desired fact and dimension tables: Requirements Input the Data Split out the store name from the OrderID Turn the Return State field into a binary Returned field Create a Sales field Create 3 dimension tables for Store, Customer and Product When assigning IDs, these should be created using the dimension and minimum order date fields so

2021 Week 50 - Solution

Image
  Solution by Tom Prowse and you can download the workflow here .  This week we continued our Departmental December challenges with a focus on the Sales department and a data set that doesn't have much consistency in it's format. Step 1 - Input Data The first step is to input the data from both sheets in the input (October & November). As they have a similar structure then we can use the wildcard union within the input tool to bring both in at the same time.  Our table should now look like this:  Step 2 - Fill In Salesperson Name The next task is probably the most tricky throughout the whole challenge. This is to fill in the missing values within the Salesperson Name field, by 'filling' upwards because the name is at the bottom of each monthly group. We provided a hint within the requirements, so if you are stuck then make sure you take a look at that first! In order to fill in the missing names, we first need to create a unique row ID for each of the rows. Although

2021: Week 50 - Departmental December - Sales

Image
Challenge By: Jenny Martin Uh-oh, looks like the Sales Team at AllChains have gone rouge with their trackers again. They can't even maintain consistency month to month! And what's the idea with having the Salesperson names at the end of each month?  This challenge was inspired by a dataset my colleague Ollie  encountered, except that dataset didn't have a RowID. If you think of a way to solve this challenge without a RowID we'd love to hear about it! Inputs October Tracker: November Tracker: Requirements Input the data Fill in the Salesperson names for each row (the name appears at the bottom of each monthly grouping) hint Bring out the YTD information from the October tracker and use it to create YTD totals for November too Reshape the data so all the bike types are in a single column Output the data Output 5 fields Salesperson Date Bike Type Sales YTD Total 915 rows (916 including headers) You can download the  full output here .  After you finish the challenge make s

2021: Week 49 - Solution

Image
  Solution by Tom Prowse and you can download the workflow here . The next instalment of our Departmental December challenges is all focused on Human Resources! We look at data related to our sales team and how long they have been with the company and how much that has cost us. Step 1 - Employment Range After we have input our table of data from the input file, we want to create an employment range field that captures how long the employee has been with the company. Within the requirements we need this in a MMM yyyy format for both the dates. This process involves us finding the Min and Max dates for each employee and to make the formatting of dates a little easier, we are going to update the locale within the input step. We want it to be English (United Kingdom) as this will give us the correct date formatting for our requirements.  From here we can then calculate the Min and Max employment dates for each of the employees. We want to maintain the row-level detail, therefore we can use