2019: Week 37 Solution

November 02, 2019

You can view our full solution workflow below and download it here!

Our full solution workflow.

Use a wildcard union to import all the data.

When importing the six different files, we could individually add each file to the canvas and then use five union steps to combine them. A much more efficient and dynamic solution is to add one of the files to the canvas and then use a Wildcard Union.

Wildcard Union input settings.

Drag one file onto the canvas.
Select the Wildcard Union option.

If the folder with the files in has no other files in it then click Apply.
If the folder with the files in has other files in it, or may have other files in it in the future, then set a Matching Pattern to ensure only the appropriate files are imported. A matching pattern for this import could be: MOCK_DATA-*.csv. Click Apply after setting the matching pattern.

After clicking apply, a new field called File Paths should appear in the table of fields. Make sure this is ticked as we need it for the next stage.

Creating the date field.

To create the date field, we need to extract the number from the file names and use it in one of the date creation functions. There are a few different options for both parts of this, however our reasonably simple and effort-free solution is as follows:

Use the Clean options to remove both letters and punctuation from the [File Paths] field.
Change the data type for the [File Paths] field to a whole number.
Use the MAKEDATE() function to create our new [Date] field:

[Date]

MAKEDATE( 2019, [File Paths], 01)

This means we’re using the number from the file path as the month in our new date field.
This creates our new date field. The data type will actually be a DateTime instead of Date, so either wrap the above calculation in the DATE() function, or just convert the field to a Date type afterwards.

Group the scents into fruit & non-fruit.

We need to create a field which lets us know whether each scent is fruit-based or not. Unfortunately “is a fruit” is not one of Tableau Prep’s grouping options so we need to do it ourselves.

Duplicate the [Scent] field. The duplicate will become our [Scent Category] field.
Hold CTRL and select all the fruits in [Scent Category].
Click Group from along the top of the data pane.
Rename the group as “Fruit”.
Repeat steps 2-4, except with the remaining non-fruits and rename the group “Non-Fruit”.

Grouping the scents.

Final pre-aggregation preparation

There’s now two last things to do to help make the following aggregations as smooth as possible.

The first is to create a field for our ‘All’ aggregation. This simply takes the form of a field called [All] that just contains the text “All”. When we aggregate across our whole data set, we don’t need a field to group-by, however by creating and grouping on this [All] field, it gives us the value entire-data set aggregation in the [Type] field in the final output.

Secondly, we can also create a copy of [Return] that gives us a 1 or a 0 instead of a TRUE or a FALSE:

[Returned Orders]

IF [Return]

THEN 1

ELSE 0

END

This gives us a value we can sum up when aggregating to calculate how many orders were returned.

Aggregate on the various categories.

We now need to perform four similar aggregations. Each aggregation will aggregate the same fields but group on different ones.

The four aggregations.

GROUP on:

a) [All]
b) [Date]
c) [Product Type]
d) [Scent Category]

SUM up:

[Number of Rows]
[Returned Orders]

For each aggregation a) to d), this will give us the total orders and the number of returned orders for each category.

Union all the aggregations together.

Unfortunately, Prep currently does not:

a) let you union more than two steps together at once, nor
b) let you union fields with different names together.

To get around b) we could rename all the fields we grouped on to [Type] before unioning them together, but a quicker method is described below.

We will need three unions to combine all our data. Union two of the aggregations together by dragging one over the other. Then union in the next aggregation by dragging it over the first union. Then repeat this again for the last aggregation.

Unioning the aggregations together.

After this, you should have 9 fields and 13 rows. These fields include three [Table Names-*] fields which we can remove right away, however we’ll also have four other fields for each of our different groupings. We can quickly combine these by merging them together!

Hold shift and select all four of the fields we grouped on.
Click Merge Fields.
Rename the new combined field as [Type].

Merging the fields together.

Because each row in our data will be NULL for three of the four fields as each row came from a different grouping, merging them together like this is quicker than pre-emptively renaming all four fields as [Type] and significantly quicker than using a calculation to find out with of the four fields isn’t null and then deleting the original fields.

Calculate the return percentage.

Finally, we need to calculate how percentage of orders were returned for each [Type]. After renaming [Number of Rows] to [Total Orders] we can get the percentage like so:

[% Returned]

ROUND(

( [Returned Orders] / [Total Orders] ) * 100

)

We divide [Returned Orders] by [Total Orders] to get a decimal percentage. We then multiple this by 100 to give us a percentage out of 100. Finally, we ROUND() it all to 1 decimal place to give us a nice clean value.

After outputting this data, we’re finished!

Search This Blog

Preppin' Data