A weekly challenge to help you learn to prepare data and use Tableau Prep
2022: Week 21 - Solution
Get link
Facebook
X
Pinterest
Email
Other Apps
Solution by Tom Prowse and you can download the workflow here.
The challenge this week focussed on a real world data structure that needed some cleaning in Tableau Prep so that we could easily visualise the data within Tableau Desktop. There are lots of sheets and some key metrics that we want to focus on... let's see how we completed it!
Step 1 - Input Sheets
The first task is to input all of the different sheets which are based on the different stores. After connecting to the Input file, we then need to use the Data Interpreter to clean the sheets so the table is easier for us to work with. The Data Interpreter will split the sheets into two tables, Key Metrics and Additional Metrics - we want to just focus on the Key Metrics.
As we are using the Data Interpreter, we can't use the Wildcard union, therefore we need to input all of the tables (with A4:T15) in the title, until we have all of the stores in a single table.
The workflow will look something like this:
As a result, we should now have a table that looks like this:
Now we have all the stores data in a single table, we can extract the Store name from the Table Names field. The Table Names is automatically created from the Union step.
The extract the store name we can use the Regex_Extract function:
Shop
REGEXP_EXTRACT([Table Names],'.*?/(.*?)\$.*')
As a result we should now have each of the store names extracted into a single field:
Finally to clean our table and have a nice input, we can remove the fields that contain quarter totals. We can remove the fields named 'FY22 Q1', 'FY22 Q2', and 'FY22 Q3'.
Step 2 - Reshape
Next up we want to reshape the data so that we have all of our date fields in a single column. To do this want to pivot the data using a Columns to Rows pivot that includes all of the date fields.
We can use the Wildcard Pivot at this stage to bring in all fields that contain a '-'. This will also allow us to future-proof out workflow by bringing in new fields that contain a '-' when they are added.
After the pivot we can rename the Pivot Names to Date, so our table now will look like this:
Step 3 - Orders, Returns, or Complaints
We now need to filter our data so that we are focussing on the correct values. Within the Breakdown field we want to filter so that we keep only:
# Received
% Processed in 3 Days
% Processed in 5 Days
% Shipped in 3 Days
% Shipped in 5 Days
We then want to extract the '#' or '%' from each, so we can use a custom split on ' ' (space) to do this:
Then we want to use that field to remove the symbol from the original Breakdown field. To do this we can use a Replace calculation (REPLACE([Breakdown],[Breakdown - Split 1],'')) or alternatively we can use the Remove Punctuation from the Clean options.
We can now use these fields to recreate a Measure Name field, using this calculation:
After these changes our table should now look like this:
Step 4 - Actual or Target
At this stage we need to split the workflow into separate branches, an Actual and a Target branch.
Actual
First we can remove the 'Target' field, and then we want to pivot the data using a Rows to Columns pivot. This time we going to include all of the Measure Name fields and then sum the pivot values in the aggregation field.
As a result we should now have a table that looks like this with our measure in separate fields:
Target Next we can create a new branch and focus on the target values.
First we need to include Target in each of the Measure Names using this calculation:
Measure Name 'Target - '+[Measure Name]
Then as we are only interested in each of the Target values, we can use an aggregate step group by Measure Name & Target and we don't need an aggregated field here.
Next, we want to retain just the numbers from the target field. To do this we can use the remove punctuation from the Target field, then use this calculation to remove '>' as well:
Target
FLOAT(REPLACE([Target],'>','')) / 100
At this stage our Target branch table should look like this:
Step 5 - Combine Branches
The final step this week is to combine both branches so we have a single table with all of our Key Measures included.
First we need to create a dummy field to join our branches on, this can be anything but we have create a calculation on each branch with a 1 in.
Before we join the branches, we need to ensure that the Targets branch is pivoted so each measure is in a separate field. We do this after the Dummy field have been created or else an error will be created due to not having any fields remaining in the field list.
The pivot is a Rows to Columns with the following setup:
Now that both branches have the same shape, we can join these together using the Dummy field and an inner join. This will allow us to append the row of Targets onto each of the different Actual values so that we can compare them.
After the join we should be ready to output our data:
You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help!
Created by: Carl Allchin Welcome to a New Year of Preppin' Data challenges. For anyone new to the challenges then let us give you an overview how the weekly challenge works. Each Wednesday the Preppin' crew (Jenny, myself or a guest contributor) drop a data set(s) that requires some reshaping and/or cleaning to get it ready for analysis. You can use any tool or language you want to do the reshaping (we build the challenges in Tableau Prep but love seeing different tools being learnt / tried). Share your solution on LinkedIn, Twitter/X, GitHub or the Tableau Forums Fill out our tracker so you can monitor your progress and involvement The following Tuesday we will post a written solution in Tableau Prep (thanks Tom) and a video walkthrough too (thanks Jenny) As with each January for the last few years, we'll set a number of challenges aimed at beginners. This is a great way to learn a number of fundamental data preparation skills or a chance to learn a new tool — New Year...
Free isn't always a good thing. In data, Free text is the example to state when proving that statements correct. However, lots of benefit can be gained from understanding data that has been entered in Free Text fields. What do we mean by Free Text? Free Text is the string based data that comes from allowing people to type answers in to systems and forms. The resulting data is normally stored within one column, with one answer per cell. As Free Text means the answer could be anything, this is what you get - absolutely anything. From expletives to slang, the words you will find in the data may be a challenge to interpret but the text is the closest way to collect the voice of your customer / employee. The Free Text field is likely to contain long, rambling sentences that can simply be analysed. If you count these fields, you are likely to have one of each entry each. Therefore, simply counting the entries will not provide anything meaningful to your analysis. The value is in ...
Created by: Carl Allchin Welcome to a New Year of Preppin' Data. These are weekly exercises to help you learn and develop data preparation skills. We publish the challenges on a Wednesday and share a solution the following Tuesday. You can take the challenges whenever you want and we love to see your solutions. With data preparation, there is never just one way to complete the tasks so sharing your solutions will help others learn too. Share on Twitter, LinkedIn, the Tableau Forums or wherever you want to too. Tag Jenny Martin, Tom Prowse or myself or just use the #PreppinData to share your solutions. The challenges are designed for learning Tableau Prep but we have a broad community who complete the challenges in R, Python, SQL, DBT, EasyMorph and many other tools. We love seeing people learn new tools so feel free to use whatever tools you want to complete the challenges. A New Year means we start afresh so January's challenges will be focused on beginners. We will u...