This week was the first of our basic & intermediate challenges throughout July & August. The challenge is aimed at anyone just starting out and covers some of the fundamentals that you'll need to get started. Let's see how we can solve the challenge.
Step 1 - Separate Product Name
First we need to connect to our data source and bring the single table into our workflow. Once we have the data, we can go about splitting the Product Name field into two separate fields (Product Type & Quantity).
The Product Name field follows a common structure of Product Type - Quantity, so we can use a Custom Split on the '-' to separate each side into its own field:
As a result of the split we now have 2 additional fields that we can rename to be Product Type and Quantity. The table should look like this:
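For anyone following along outside Tableau Prep, the same split can be sketched in Python. This is only an illustration: the sample rows and the helper name `split_product_name` are hypothetical, and the real input values may be formatted differently.

```python
# Hypothetical sample rows; the real challenge input is not reproduced here.
rows = [
    {"Product Name": "Liquid - 500ml"},
    {"Product Name": "Bar - 100g"},
]


def split_product_name(name):
    """Split 'Product Type - Quantity' on the first '-' and trim whitespace."""
    product_type, quantity = name.split("-", 1)
    return product_type.strip(), quantity.strip()


# Add the two new fields to every row, mirroring the renamed split fields.
for row in rows:
    row["Product Type"], row["Quantity"] = split_product_name(row["Product Name"])

for row in rows:
    print(row["Product Type"], "|", row["Quantity"])
```

Splitting on the first '-' only (the `1` argument) keeps any later hyphens inside the Quantity part rather than raising an error.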
Step 2 - Liquid & Bar Path
We can now split our workflow into two separate paths (branches). To do this, we create two new clean steps (one for Liquid, one for Bars), so we now have two separate branches that we can rename to Bar and Liquid:
Step 3 - Bar Path
As there are slightly different steps required for the different paths, let's focus on just the Bars first.
In the Bars clean step, we can use the filter to 'Keep Only' Bars from the Product Type field. Once we have filtered for Bars, we can remove any letters from the Quantity field so we are left with just the numeric values.
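As a rough Python equivalent of the filter and clean operations described above, the sketch below keeps only the Bar rows and strips letters from Quantity with a regular expression. The sample rows and the helper name `remove_letters` are assumptions made for illustration.

```python
import re

# Hypothetical rows as they might look after the split step.
rows = [
    {"Product Type": "Bar", "Quantity": "100g"},
    {"Product Type": "Liquid", "Quantity": "500ml"},
    {"Product Type": "Bar", "Quantity": "50g"},
]


def remove_letters(text):
    """Drop every ASCII letter, leaving digits and other characters."""
    return re.sub(r"[A-Za-z]", "", text).strip()


# 'Keep Only' Bar, then clean the Quantity field on the rows that remain.
bars = [
    {**row, "Quantity": remove_letters(row["Quantity"])}
    for row in rows
    if row["Product Type"] == "Bar"
]

print(bars)
```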
Now that we've filtered and cleaned the table, we need to calculate the total sales and number of orders for each Store, Region, and Quantity combination. To do this we can use an aggregate step with the following setup:
Group by: Quantity, Store Name, and Region
Aggregate: Sum Sale Value, and CountD Order ID
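The aggregation settings above can be sketched in plain Python as a grouped sum plus a distinct count. The sample rows, store names, and the `aggregate` function are hypothetical; only the field names come from the walkthrough.

```python
from collections import defaultdict

# Hypothetical cleaned Bar rows; values are invented for illustration.
bars = [
    {"Quantity": "100", "Store Name": "Store A", "Region": "North", "Sale Value": 10.0, "Order ID": 1},
    {"Quantity": "100", "Store Name": "Store A", "Region": "North", "Sale Value": 5.0, "Order ID": 1},
    {"Quantity": "100", "Store Name": "Store A", "Region": "North", "Sale Value": 2.5, "Order ID": 2},
    {"Quantity": "50", "Store Name": "Store B", "Region": "South", "Sale Value": 4.0, "Order ID": 3},
]


def aggregate(rows):
    """Group by Quantity, Store Name, Region; sum Sale Value; count distinct Order ID."""
    totals = defaultdict(float)
    order_ids = defaultdict(set)
    for row in rows:
        key = (row["Quantity"], row["Store Name"], row["Region"])
        totals[key] += row["Sale Value"]
        order_ids[key].add(row["Order ID"])  # a set keeps each order once (CountD)
    return [
        {
            "Quantity": quantity,
            "Store Name": store,
            "Region": region,
            "Sale Value": totals[(quantity, store, region)],
            "Order ID": len(order_ids[(quantity, store, region)]),
        }
        for (quantity, store, region) in totals
    ]


for result in aggregate(bars):
    print(result)
```

The distinct count is still stored under Order ID here, matching what the aggregate step produces before the field is renamed.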
After the aggregation we can rename the Order ID field to Present in N Orders and we are ready to output our first table:
Step 4 - Liquid Path
Now we can go back to our Liquid branch and make some changes to clean this table.
First we need to filter so that we only have the Liquid values in Product Type, and again we need to remove the letters from the Quantity field.
In addition, we also need to convert the quantity into millilitres using the following calculation:
Quantity
IF INT([Quantity]) = 1 THEN INT([Quantity])*1000
ELSE INT([Quantity])
END
Make sure you are using the INT function to convert the Quantity field into a number.
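The same conversion can be written as a small Python function. This is a direct translation of the calculation above, not a general unit converter: it assumes, as the calculation does, that a cleaned Quantity of 1 is the only value measured in litres. The function name `to_millilitres` is hypothetical.

```python
def to_millilitres(quantity):
    """Mirror the Prep calculation: a value of 1 is treated as 1 litre."""
    value = int(quantity)  # same role as INT([Quantity]) on the text field
    if value == 1:
        return value * 1000
    return value


print(to_millilitres("1"))
print(to_millilitres("500"))
```

If the data contained other litre quantities, this rule would miss them; one alternative would be to read the unit from the field before the letters are removed.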
Once we have made these changes, we again need to aggregate to calculate the total sales and orders. The aggregation should look like this:
Group by: Quantity, Store Name, and Region
Aggregate: Sum Sale Value, and CountD Order ID
After renaming the Order ID field to Present in N Orders we are now ready to output our second table:
You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help!