2020: Week 1 Solution




Preppin Data is back for the first challenge of the new year! We have noticed that there are plenty of different solutions on Twitter, so take a look at these if you want some more inspiration. We have a special mention to Jenny, who has created a video solution post for this week, which is slightly different to ours so make sure you check it out here


The first Preppin Data challenge of the new year was all focused around financial statements and specifically formatting a Profit statement for Chin & Beard Suds' board meeting.


This task is broken down into various different steps which we need to follow to get to our final output.


Step 1 - Extract ‘Levels’ from Item Fields


Our first task is to isolate each of the ‘levels’ within the financial statement. These are indicated by the ‘X.X.X item’ with each ‘X’ being a different number/level. There are various different ways of doing this, however I have decided to use Tableau Prep’s in-built Split functionality to extract the numbers from the Item field. 


Split 1
The first split is a custom split to split off the first field before a space (‘ ‘) separator. This creates a new column which includes just the ‘numbers’ from the Item field.





Split 2
The second split is going to be on the newly created field. This time we are going to use an automatic split, which breaks each number into a separate column. (We can also split all values by a ‘.’). 


This has now broken each of the ‘levels’ into a separate column, so therefore we can rename the columns to Level 1, Level 2, Level 3 as below: 





Our next step is to tidy up some of the null/empty rows within the newly split fields. To do this we can right click on the null value, then select ‘Edit Value’. 



We have replaced the Null/Empty fields with the following:
Level 2 - Replace Null with ‘0’
Level 3 - Replace ‘Empty’ with ‘Null’


The final step is to create a string field which will tell us what level each product is part of (we will use this later!). We can calculate this by using an IF statement, with the newly split fields. The calculation that we have used is as follows: 


Level Name
IF [Level 2] = '0'
AND
ISNULL([Level 3])
THEN 'Level 1'

ELSEIF NOT ISNULL([Level 2])
AND
ISNULL([Level 3])
THEN 'Level 2' 

ELSE 'Level 3' 
END

This is step 1 complete!


Step 2 - Calculate Totals

Now we have separated each of the levels, we can now use the aggregate tool to calculate the totals and therefore replace the 'Null' values that are currently there.

Level 1
First we will calculate the total for each of the level 1 categories, as mentioned previously we will use the aggregation tool to do this. The set up of the aggregation tool is to Group By Level 1, then Sum the Profit, as seen below:


As you can see we now have a total profit for each number within the Level 1 field, and these match the 'X items' from the challenge post.

Next we need to calculate the totals for Level 2. Again we use an aggregation tool, but this time we want to Group By Level 1 and Level 2, then Sum Profit. This will now calculate the total profit for each of the X.X items, but also we have carried through some Null values. Before we move to the next step we will want to remove the Nulls.


We now need to bring all of the data back together into one table. To do this we have decided to use the union tool, which essentially 'stacks' both the total tables on top of one another, then again we made sure that we replaced the Null in Level 2 with a 0.

To bring the data back into our original flow, we can use the join tool, where we join on Level 1 & Level 2. This needs to be a right join as we want to include all of the rows from the original flow (note, this could be a left join for you depending on what table dragged for the join). The join is set up like the below, and you should have 31 records:


Step 3 - Format Output

The final step of this weeks challenge is to format each of the levels by adding the correct amount of spaces before the item. We need to add the follow amount of spaces:

Level 1 - 0
Level 2 - 5
Level 3 - 10

However, before we do this we first need to merge the two profit fields that we have. This will give us one column, and will remove any Null values that we have in the profit field. From here we can remove any unwanted fields, until we have Level Name, Item, & Profit remaining.

These three fields will be used to create the spacing and the formatting required for the output. The calculation needed for the spacing is:

Item
( IF [Level Name] = 'Level 2' 
THEN SPACE(5)

ELSEIF [Level Name] = 'Level 3' 
THEN SPACE(10)
ELSE

SPACE(0)
END )
+ [Item] 

The first part of the IF statement is documenting how many spaces to add to the item field. We have used the SPACE() function to do this, as it returns a string composed of the specific number of repeated spaces. At the end of the string we then need to add the original item, which then creates the item field in the formatting that is required for the output.

All that is needed now is an Output tool and we are complete!

If you have any questions about the solution or #PreppinData in general, please do not hesitate to reach out to us and we will be more than happy to help!

Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text