2020: Week 11 Solution




You can view the full workflow described in the post below or download it here.
You can also view a video guide above for a slightly alternate method.



This workflow can be broken down into two halves, one for each output:

1. Calculating Boxes Per Order

In order to calculate how many boxes of each size are required for order, we're going to apply the following steps:
  1. Append all sizes of box onto each order.
  2. Pivot the box sizes so we have one order per row with a field for each box size.
  3. Calculate the number of each box size required for each order.

1.1 Appending the box sizes.

Add the 'Orders' and 'Box Sizes' inputs to the canvas.

Add a Clean Step after each input and create a new [Join] field for us to join the two inputs together such that every [Box Size] gets appended to every row in the 'Box Sizes' table.

[Join]

1    //A value to join on.


Then simply join the two inputs together on this field.

1.2 Pivoting the box sizes

We want a field per [Box Size] for each [Order Number] so we will need to use a Rows to Columns' pivot to achieve this.

Before pivoting, we can create ourselves some nice new headers in a Clean Step for the fields that we will be pivoting.

[Box Size Header]

“Boxes of “ + STR( [Box Size] )


Now we can actually pivot our box sizes. Use [Box Size Header] as your field to pivot and use [Box Size] as the field to aggregate for new columns. You can leave the aggregation as SUM as there is only one value for each field.


The pivot settings and results.

1.3 Calculating boxes per order


Finally, we can now calculate the number of boxes of each size required for each order.
As Tableau does not currently have any looping or iterative functionality, we need to do a little bit of calculative work. By using 3 calculations in turn, we can get to our final numbers.

Firstly, we can use FLOOR() to figure out how many boxes of 120 soaps we need:

[Boxes of 120]

FLOOR( [Order Size] / [Boxes of 120 )


This divides the order size by 120, but rounds it down so that every box of 120 is completely full.

Secondly, we can use FLOOR() again to help calculate how many boxes of 24 we need:

[Boxes of 24]

FLOOR(
  ( [Order Size] - 120*[Boxes of 120] )  //How many soaps are still unboxed.
  /
  [Boxes of 24]
)


This is similar, but instead of just dividing the order size, we instead calculate how many soaps are still unboxed after packaging up our [Boxes of 120].

Finally, we can use CEILING() to help put all the remaining soaps in boxes of size 6:

[Boxes of 24]

CEILING(
    ( [Order Size] - 120*[Boxes of 120] - 24*[Boxes of 24] ) //How many soaps are still unboxed.
    /
    [Boxes of 6]
)

The first output.

This again calculates how many soaps are left after boxing up the soaps in box sizes of 120 & 24. However, CEILING() is used to round of boxes of size 6 up as any remaining soaps can go into a partially filled box of size 6.

At this point we have completed our first output! Add an output step to the end so you can save your data.



2. Calculating Soaps per Box

Continuing on from our last step before our first output, we can now progress to calculating our table of soaps per box for each order. To do this, we'll apply the following method:
  1. Pivot the box count fields back into rows so we have a single box size per row.
  2. Scaffold the data so we have a row for every individual box.
  3. Use calculations to figure out whether each box is full and how many soaps are in partially filled boxes.
  4. Assigning each box a number within each order.

2.1. Unpivot the box count fields

Now we have our counts of how many boxes of each size we need for each order, we want to return to having each size of box on it's own row for each order number. We can do this using a 'Columns to Rows' pivot and selecting all our [Boxes of *] fields to pivot.

Pivoting the box counts into rows.
We then need to clean this up a little bit. Firstly, we want to turn [Pivot1 Names] into numerical [Box Sizes]. Using the Clean > Remove Letters option gets rid of all the text and means we can simply convert the field from a string to a whole number.


Finally, we should also rename [Boxes of] to [Number of Boxes], as this is what that field actually contains.
A GIF showing the cleaning process.

2.2 Scaffold the data

As there isn't currently a way to generate rows in Prep, in order to get a row for every box of every size in every order we're going to scaffold our data using the Scaffold input.

By joining on [Number of Boxes] > = [Scaffold] we get the desired result. Because the scaffold starts at 1, any rows where the [Number of Boxes] = 0 gets filtered out as it doesn't join to anything in the scaffold.

2.3 Calculating the number of soaps in each box

We know that every box of size 120 or size 24 is going to be completely full. However, we need to find a way to figure out which boxes of size 6 aren't completely full and how many soaps are in these partially empty boxes.

We can use an LOD to get the largest scaffold number for each box size in each order.

[Last Box Per Box Size]

{FIXED [Box Sizes], [Order Number] :
    MAX([Scaffold])
}


This isn't useful for boxes of size 120 or 24 but helps us single out the box of size 6 that might be unfilled in the next IF-ELSE calculation:

[Last Box Per Box Size]

IF (
    [Box Sizes] = 6
    AND [Scaffold] = [Last Box Per Box Size]
    AND [Order Size]%[Box Sizes] != 0
)
    THEN [Order Size]%[Box Sizes]
    ELSE [Box Sizes]
END


Essentially, this calculation is saying:
  • If the box is a 6 soap box,
  • AND the box is the last box for that box size,
  • AND the order size doesn't divide by 6,
  • Then return the remainder after dividing the order size by 6,
  • Otherwise, just return the size of the box as the box must be full.

2.4 Assign a box number

Finally, we can assign a [Box Number] for each box in each order by using the new PARTITION functionality and the ROW_NUMBER() calculation:

[Box Number]

{ PARTITION [Order Number] :
    { ORDERBY [Soaps in Box] DESC :
        ROW_NUMBER()
    }
}


  • The PARTITION defines that our ROW_NUMBER() calculation should restart for every [Order Number].

  • The ORDERBY [Soaps in Box] DESC defines that our ROW_NUMBER() should be applied after ordering all our rows within our partition in descending order of [Soaps in Box]. This means the larger and more full each box is, the small the row number.

  • Finally, ROW_NUMBER() simply applies a unique row number to each row of data within our partition.
After removing any left over fields, we can now add the output tool for our second output and complete the challenge!


Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text