Posts

2019: Week 38 Solution

You can view our full solution workflow below and download it here!

There are 3 aggregations required for this challenge. We’ll tackle the easier two first.
Aggregate to get the total patient visits and first visit dates. After importing the data, simply create a new aggregation step with the following settings:

- GROUP on [Patient ID].
- COUNT DISTINCT the [VisitID] field and rename it to [Total Patient Visits]. You could also SUM the [Number of Rows] field for this.
- Take the MIN of [Date of Service] and rename it to [First Visit Date].

This gives us the [Total Patient Visits] and [First Visit Date] for each patient. We can now set this to one side whilst we go back and perform the last aggregation required.

Use a self-join & aggregation to number the visits. We now need to number each patient's visits so that their first visit is number 1, their second visit is number 2, and so on. To achieve this we can use the running sum trick.

First, duplicate the data in a new clean step. Then join the data to itsel…
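The excerpt cuts off here, but the pattern is clear enough to sketch. Below is a rough pandas equivalent of the aggregation and the self-join numbering trick; it is illustrative only, not the flow from the post: the sample rows are invented, and the column names follow the fields named above.

```python
import pandas as pd

# Illustrative only: invented rows, column names taken from the post.
visits = pd.DataFrame({
    "Patient ID": [1, 1, 1, 2, 2],
    "VisitID": [101, 102, 103, 201, 202],
    "Date of Service": pd.to_datetime(
        ["2019-01-05", "2019-02-10", "2019-03-15", "2019-01-20", "2019-04-01"]),
})

# Aggregation 1: total visits and first visit date per patient.
summary = visits.groupby("Patient ID").agg(**{
    "Total Patient Visits": ("VisitID", "nunique"),
    "First Visit Date": ("Date of Service", "min"),
}).reset_index()

# The running sum trick: join the table to itself on Patient ID, keep only
# rows where the joined visit is on or before the current one, then count
# rows per visit -- the first visit gets 1, the second gets 2, and so on.
# (Assumes at most one visit per patient per date.)
self_join = visits.merge(visits, on="Patient ID", suffixes=("", " (prev)"))
self_join = self_join[
    self_join["Date of Service (prev)"] <= self_join["Date of Service"]]
visit_numbers = (self_join
                 .groupby(["Patient ID", "VisitID"])
                 .size()
                 .rename("Visit Number")
                 .reset_index())
```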

2019: Week 39

The Head Honchos of Chin & Beard Suds Co want to understand how their website is performing. Are the web pages being tested on the right devices and Operating Systems? Are we considering language and cultural references?

To conduct this analysis we have to work with website stats that are held in a frustrating way. This analysis won't be done just once, so we want to set up the flows in a reusable way, ready to run whenever they're needed.

Requirements
- Input the Excel workbook
- Create a single table for each of Origin, Operating System and Browser
- Clean up values and percentages
- Convert any < 1% values to 0.5
- If percent of total fields don't exist in any files, create these and make them an integer
- Create a consistent output for all three files

Outputs
3 files with the following 6 columns:
- Type (either Origin, Browser or Operating System)
- Change in % page views
- This Month Pageviews Value
- This Month Pageviews %
- All Time Views Value
- All Time Views %

The number of rows (excludin…
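Since the flows are meant to be reusable, it may help to see the cleaning rules as code. A minimal pandas sketch of the "< 1% becomes 0.5" rule and the missing percent-of-total requirement; all sample values are invented:

```python
import pandas as pd

# Invented sample of the messy values the requirements describe.
browsers = pd.DataFrame({
    "Browser": ["Chrome", "Safari", "Opera"],
    "This Month Pageviews Value": [6200, 3100, 40],
    "This Month Pageviews %": ["62%", "31%", "< 1%"],
})

def clean_percent(value: str) -> float:
    """Turn strings like '62%' into numbers, mapping any '< 1%' to 0.5."""
    value = value.strip()
    return 0.5 if value.startswith("<") else float(value.rstrip("%"))

browsers["This Month Pageviews %"] = (
    browsers["This Month Pageviews %"].map(clean_percent))

# If a percent of total field doesn't exist in a file, create it from the
# value column and make it an integer, per the requirements.
if "All Time Views %" not in browsers.columns:
    browsers["All Time Views Value"] = [70000, 35000, 500]  # stand-in values
    browsers["All Time Views %"] = (
        browsers["All Time Views Value"]
        / browsers["All Time Views Value"].sum() * 100).astype(int)
```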

2019: Week 37 Solution

You can view our full solution workflow below and download it here!
Use a wildcard union to import all the data. When importing the six different files, we could individually add each file to the canvas and then use five union steps to combine them. A much more efficient and dynamic solution is to add one of the files to the canvas and then use a Wildcard Union.

- Drag one file onto the canvas.
- Select the Wildcard Union option.
- If the folder containing the files has no other files in it, then click Apply.
- If the folder has other files in it, or may have other files in it in the future, then set a Matching Pattern to ensure only the appropriate files are imported. A matching pattern for this import could be: MOCK_DATA-*.csv. Click Apply after setting the matching pattern.
After clicking Apply, a new field called File Paths should appear in the table of fields. Make sure this is ticked, as we need it for the next stage.
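For anyone following along outside Prep, here is a rough script equivalent of the wildcard union, using the matching pattern suggested above (the pandas approach is illustrative, not part of the original flow):

```python
import glob
import pandas as pd

# Union every file matching the pattern; record where each row came from,
# just like ticking the File Paths field in Prep.
frames = []
for path in glob.glob("MOCK_DATA-*.csv"):
    frame = pd.read_csv(path)
    frame["File Paths"] = path
    frames.append(frame)

data = pd.concat(frames, ignore_index=True)
```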
Creating the date field. To create the date field, we need t…

2019: Week 38

This week's challenge comes from a Data School consultant, Collin Smith. He used Tableau Prep Builder as described below before he joined The Data School, and he has been enjoying the Preppin' Data challenges, so he wanted to contribute his own. I hope you enjoy it, and let him know how you get on! _____________________________________
Last spring, I worked with Health in Harmony, a planetary health organization in Borneo, to help get them set up using Tableau Prep and Tableau Desktop. One of my big jobs was to re-create, in Tableau Prep, a medical data workflow that was originally built in SPSS. The challenge this week comes from one of the steps of that workflow, which involves heavy use of the aggregation tool and some clever join logic. I'm heavily indebted to the Tableau Service Corps and Community Forums for guiding me through this the first time.
Requirements
- Use the join and aggregation tools to find metrics on each patient based…

2019: Week 36 Solution

You can view our full solution workflow below and download it here!
1. Convert the height for the Spurs. Firstly, we're going to create the height conversion step. There are a number of different ways to convert the feet & inches into meters here. An easy method is as follows:

- Remove all punctuation from [HT] using the relevant Clean option.
- Use an automatic split to split [HT] into [Feet] (split 1) & [Inches] (split 2).
- Calculate [Height (in)] (height in inches) using: [Feet] * 12 + [Inches].
- Convert this to [Height (m)] (height in meters) using: [Height (in)] * 2.54 / 100. We divide by 100 as, without it, we would actually have the height in centimetres.
- Round this to 2dp using the ROUND() function.

After this, make sure to remove the original [HT] field as well as all the unnecessary fields created along the way.
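A quick pandas sketch of the same conversion, with invented sample heights; the string parsing is an assumption standing in for Prep's Clean and Split steps, which do this through the UI:

```python
import pandas as pd

# Invented sample heights in the feet & inches format the post describes.
spurs = pd.DataFrame({"Player": ["A", "B"], "HT": ["6' 10\"", "7' 0\""]})

cleaned = spurs["HT"].str.replace(r"[^\d\s]", "", regex=True)  # strip punctuation
parts = cleaned.str.split(expand=True).astype(int)             # [Feet], [Inches]
spurs["Height (in)"] = parts[0] * 12 + parts[1]
spurs["Height (m)"] = (spurs["Height (in)"] * 2.54 / 100).round(2)
spurs = spurs.drop(columns=["HT"])  # remove the original field afterwards
```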

2. Convert the weight for the Spurs. We're now going to create the weight conversion step. There's less work to be done in order to convert the weight.

Remove all letters a…
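The excerpt cuts off here, but assuming the weights arrive as strings such as "230 lbs" and the target unit is kilograms (both assumptions, along with the [WT] field name, as none of this is stated in the visible text), the equivalent conversion might look like:

```python
import pandas as pd

# Assumed input format: strings such as "230 lbs"; assumed target: kilograms.
weights = pd.DataFrame({"Player": ["A", "B"], "WT": ["230 lbs", "250 lbs"]})
weights["Weight (kg)"] = (
    weights["WT"].str.replace(r"[^\d]", "", regex=True)  # remove all letters
    .astype(int) * 0.453592).round(2)
```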

2019: Week 35 Solution

You can view our full solution workflow below and download it here!
1. Use a Running Total to calculate the total stock in the warehouse to date. Starting with our output from Week 34, we first want to calculate the total stock in our warehouse for each [Product] and [Scent] on any given date. To do this, we first duplicate the data and then join it to itself on:
- [Product] = [Product]
- [Scent] = [Scent]
- [Date] >= [Date]

This means each row of data gets every previous row joined to it if it has the same product and scent. From this point, we can use an aggregation step to complete our running total and find our total quantity to date:

- GROUP on: [Supplier], [Date], [Product], [Scent], [Quantity]
- SUM up [Quantity-1] and rename it to [Total Quantity to Date].

2. Use a Running Total to calculate the total stock requested to date. We now need to perform another running total on the store orders for each [Product] and [Scent]. We duplicate the data again and then this time join on:
[Product] = [Prod…
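The second join condition list is truncated above, but the self-join running-total pattern from step 1 translates directly to code. A toy pandas sketch with invented rows and the column names from the post:

```python
import pandas as pd

# Invented rows; column names follow the fields used in the post.
stock = pd.DataFrame({
    "Supplier": ["S1", "S1", "S1"],
    "Product": ["Soap", "Soap", "Soap"],
    "Scent": ["Mint", "Mint", "Mint"],
    "Date": pd.to_datetime(["2019-01-01", "2019-02-01", "2019-03-01"]),
    "Quantity": [100, 50, 75],
})

# Self-join on Product and Scent, keep joined rows dated on or before the
# current row, then sum the joined quantity to get the running total.
joined = stock.merge(stock, on=["Product", "Scent"], suffixes=("", "-1"))
joined = joined[joined["Date-1"] <= joined["Date"]]
running = (joined
           .groupby(["Supplier", "Date", "Product", "Scent", "Quantity"])
           ["Quantity-1"].sum()
           .rename("Total Quantity to Date")
           .reset_index())
```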

2019: Week 37

Aggregating data sources to a good level for analysis is always a "fun" decision to take. Leave the data at too detailed a level of granularity and you may face slow performance but have high flexibility in the analysis you can do. Aggregate the data to too high a level and you might get great performance from the software you are using but sacrifice flexibility in the questions you can answer.

Well, what if we just want to form a table of those answers? This week's challenge is looking to do exactly that.

At Chin & Beard Suds Co. we have had a higher than normal level of returns. We want to take some of our sales data from the first half of the year to understand what might have gone wrong, so we can make sure we don't make the same mistakes in the future. To start the analysis, the company's leadership team have asked for '% returns', and then for a comparison of % returns across different factors in the business:
- The full data set
- Month of trading…
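The factor list is truncated, but the '% returns' calculation itself is simple to sketch. A hedged pandas example with hypothetical field names ("Returned" and "Month" are stand-ins, not fields from the challenge):

```python
import pandas as pd

# Hypothetical sales table: "Returned" and "Month" are stand-in field names.
sales = pd.DataFrame({
    "Month": ["Jan", "Jan", "Feb", "Feb"],
    "Returned": [True, False, False, True],
})

pct_returns_overall = sales["Returned"].mean() * 100            # full data set
pct_returns_by_month = sales.groupby("Month")["Returned"].mean() * 100
```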