Posts

Showing posts from October, 2019

2019: Week 38

Image
This week's challenge comes from a Data School consultant - Collin Smith . He used Tableau Prep Builder as described below before he joined The Data School and has been enjoying the Preppin' Data challenges so wanted to contribute his own challenge. I hope you enjoy it and let him know how you get on ! _____________________________________ Last spring, I worked with Health in Harmony , a planetary health organization in Borneo, to help get them set up using Tableau Prep and Tableau Desktop. One of my big jobs was to re-create a medical data workflow that was originally in SPSS with Tableau Prep. The challenge this week comes from one of the steps of that workflow that involves heavy use of the aggregation tool and some clever join logic. I’m heavily indebted to the Tableau Service Corps and Community Forums for guiding me through this the first time. Requirements Use the join and aggregation tools to find metrics o

2019: Week 36 Solution

Image
You can view our full solution workflow below and download it here ! Our full solution workflow. 1.   Convert the height for the Spurs. Firstly, we’re going to create the height conversion step. There’s a number of different ways to convert the feet & inches into meters here. An easy method is as follows: Remove all punctuation from [HT] using the relevant Clean option. Use an automatic split to split [HT] into [Feet] (split 1) & [Inches] (split 2). Calculate the [Height (in)] (height in inches) using: [Feet] * 12 + [Inches]. Convert this to [Height (m)] (height in meters) using: [Height (in)] * 2.54 / 100. We divide by 100 as without it we actually have the height in centimetres. Round this to 2dp using the ROUND() function. After this, make sure to remove the original [HT] fields as well as all the unnecessary fields created along the way. 2.   Convert the weight for the Spurs. We’re now going to create the weight conversion step. There’s less work to be

2019: Week 35 Solution

Image
You can view our full solution workflow below and download it here ! Our full solution workflow. 1. Use a Running Total to calculate the total stock in the warehouse to date. Starting with out output from week 34, we first want to calculate the total stock in our warehouse for each [Product] and [Scent] on any given date. To do this, we first duplicate the data and then join it to itself on: [Product] = [Product] [Scent] = [Scent] [Date] >= [Date] This means each row of data gets every previous row joined to it if it has the same product and scent. From this point, we can use an aggregation step to complete our running total and find our total quantity requested to date: GROUP on: [Supplier] [Date] [Product] [Scent] [Quantity] SUM up [Quantity-1] and rename to [Total Quantity to Date]. Total stock in warehouse aggregation settings. 2. Use a Running Total to calculate the total stock requested to date. We now need to perform another running total on t

2019: Week 37

Image
Aggregating Data Sources to a good level for analysis is always a "fun" decision to take. Leave the data at too detailed level of granularity and you may face slow performance but have high flexibility in what analysis you can do. If you aggregate the data to a too higher level, you might get great performance from the software you are using but suffer with the flexibility of questions you can answer. Well, what if we just want to form a table of those answers? This week's challenge is looking to do exactly that. At Chin & Beard Suds Co. we have had a higher level than normal of returns. We want to take some of our sales data from the first half of the year to understand what might have gone wrong so we can make sure we don't make the same mistakes in the future. To start the analysis, the company's leadership team have asked for '% returns' and then to compare different % returns for different factors in the business: The full data set Month of

How to...Union

Image
Most software that works with data demands the user form a single table of data to work from. However, the world is often not that simple and often the user will find themselves with many tables of data to pull together to form that single table. On technique will frequently be found in data preparation for this task, its called Unioning. What is a Union? A Union can be thought of as stacking one dataset above another. Columns will be stacked on top of each other when they contain the same content. This requires the data structures to be very similar (we'll handle those differences in the next section).  Let's take sales from our York and Leeds Store. Both are separate tables. But when Unioned, the extra set of column headers are removed to leave the rows of data stacked on top of each other. The Union is determined by whether the data fields are: Named the same in each data set - Any column with the same name, will be stacked above the column in the

2019: Week 36

Image
With the release of Prep Builder 2019.3.2, Tableau introduced a new feature called 'Reusable Steps'. This week, your challenge is to create two reusable steps to convert files Imperial Measurements in to Metric Measurements for Height and Weight. If you have read about Reusable Steps, read of this Help Post .  ***If you haven't updated to 2019.3.2 yet, try the challenge by copying and pasting your steps when using the alternative file.*** Requirements Form two reusable steps to automatically convert Height and Weight. Use this file to build the reusable steps. Convert the HT (Height) field from Feet and Inches to Meters and Centimeters  Conversion rate 1 inch = 2.54 cm Convert WT (Weight) from pounds (lbs) to kilograms (kgs) Conversion 1 lb = 0.453592 kg Remove the unrequired fields within the Reusable step Clean up Name field and create Jersey Number field (not in the reusable steps) Test your reusable steps on another team's data - use

How to...Clean by Grouping Data

Image
Data preparation would not be necessary if we were to always have someone else curating a perfect data set for us. Sadly (and more empoweringly) we can do this ourselves and often have to. One of the common challenges we face in data prep is cleaning up string data (so much so  we wrote a post about that here ). One of the challenges with string data is making them consistent to be able to count how many of the same entities we have or be able to create implied relationships between the entities in the form of hierarchies. One technique is required to meet this challenge than any other... Grouping. What do we mean by 'Grouping'? Grouping is the technique where we apply logic to string data fields to recognise that those strings have a common characteristic. For example, we might expect the following data items to be grouped together: Edinburgh Edenburgh Edinborough 3d!nburgh As humans, we can see that the spellings of the different names should probably all read Ed

2019: Week 34 Solution

Image
You can view our full solution workflow below and download it here ! Our full solution workflow. 1. Prepare the Delivery Schedule data for joining. We need to get two bits of data out of the [Delivery Schedule] field. Due to the consistent format of this field, we used the following two calculations to extract the week number in the month and the weekday: [Week Number in Month] [Weekday] INT(     LEFT( [Delivery Schedule], 1) ) SPLIT( [Delivery Schedule], “ “, 2) For the week number we simply pull out the first character, i.e. the week number, using LEFT() and then convert this straight to a number using the INT() function. For weekday we instead use the SPLIT() function to extract the string of characters between the 2nd and 3rd spaces, i.e. the second word in the string. 2. Get the dateparts from the Date Scaffold data. The first step in preparing the Date Scaffold data is to extract the month, day of month, and w