2023: Week 8 - Solution

Solution by Tom Prowse and you can download the workflow here


It's the final week of the Intermediate month challenges and this week we are visiting the trading floor of our fictional bank - The Data Source Bank.

We want to clean up some of the trading data and then identify the top trades across a series of different ways that we look at companies.

Step 1 - Input Data

First we want to input all of our input files for each month. To do this we want to utilise the 'Union Multiple Tables' feature within the Tableau Prep input step. 

After inputting one of the files, we can then choose 'Union Multiple Tables' and then use the term 'MOCK_DATA*' as our matching pattern. This will bring all of the fields that match this pattern into our data set, whilst also adding any others that will come into that folder in the future.


Note: depending on your version of Tableau Prep, this might look slightly different.

After using the wildcard input we should now have all of our tables into a single workflow and the table should look like this: 


Step 2 - Create Month

Next, we want to create a file date using the month from the file name field. When using the union input this automatically creates the File Paths field and we can utilise this within our analysis.

First, we want to remove any punctuation and letters from the field using the 'Clean' functionality:


We then need to make sure that the field is a number and replace the null with a 1 (the first file didn't have a month number). Once we have all of the month numbers, we can create a date using the following calculation: 

File Date
MAKEDATE(2023,[File Paths],1)

Once we have created the date our table should look like this: 


Step 3 - Clean Market Cap & Purchase Price

Next, we want to clean the Market Cap field so that the numbers are correct for Billions vs Millions. Within the Market Cap field we want to remove any n/a values, and then remove the $ symbol using a replace calculation: 

Market Capitalisation
REPLACE([Market Cap],'$','')

We can then replace the B or M with the correct number of 0s using this calculation: 

Market Capitalisation 

IF CONTAINS([Market Capitalisation],'B') 
THEN FLOAT(LEFT([Market Capitalisation],LEN([Market Capitalisation])-1))*1000000000

ELSEIF CONTAINS([Market Capitalisation],'M') 
THEN FLOAT(LEFT([Market Capitalisation],LEN([Market Capitalisation])-1))*1000000

ELSE FLOAT([Market Capitalisation])
END

This will allow us to replace the B or M and transform the field into a number.


We can then categorise the Purchase Price and Market Cap as per the requirements.

Purchase Price Categorisation

IF [Purchase Price] <25000 THEN 'Small'
ELSEIF [Purchase Price] <50000 THEN 'Medium' 
ELSEIF [Purchase Price] <75000 THEN 'Large'
ELSE 'Very Large'
END

Market Capitalisation Categorisation
IF [Market Capitalisation] < 100000000 THEN 'Small'
ELSEIF [Market Capitalisation] <1000000000 THEN 'Medium'
ELSEIF [Market Capitalisation] <100000000000 THEN 'Large'
ELSE 'Huge'
END

After these calculations our table should look like this: 


Step 4 - Rank

The final step that we need for this week is to rank the highest purchases per combination of file date, Purchase Price Categorisation, and Market Capitalisation Categorisation. 

To do this we can use the analytical rank calculation to group by File Date, Purchase Price Categorisation, and Market Capitalisation Categorisation, then Rank the Purchase Price from high to low:  


Then we want to filter the Rank field to Keep Only 1-5 and remove the ID, first_name, and last_name fields.

After filtering the table we are ready to output: 


You can download all the outputs from here

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2024: Week 1 - Prep Air's Flow Card

2023: Week 1 The Data Source Bank

How to...Handle Free Text