2020: Week 12

Would you believe that Chin & Beard Suds Co have encountered yet more messy data? It seems someone was trying to be helpful by creating an aggregated view of sales per week for each scent of soap. However, in doing so we've lost the lower level of detail of the product sizes that make up these sales for each scent. We really need this for other analysis we've been carrying out!

Fortunately, we know what percentage of sales each product size makes up for each product in each week. Unfortunately, the data isn't stored in a way that will make it easy to join all the necessary information together.


There is 1 input file with 3 sheets:
Total Sales by Week for each Scent

Percentage of Sales for different product sizes per product per week.

A lookup table for linking Scent, Product ID and Size


  • Input data
  • Our final output requires the Date to be in in the Year Week Number format. 
    • (Check out 2020.1.4 for an easier way to clean dates)
  • We don't care about any product sizes that make up 0% of sales.
  • In the Lookup Table, it seems the Product ID and Size have been erroneously concatenated. These need to be separated.  
  • You'll need to do some cleaning of the Scent fields to join together the Total Sales and the Lookup Table.
  • Calculate the sales per week for each scent and product size.
  • Output the data


One file:
  • 5 Data Fields:
    • Year Week Number
    • Scent
    • Product Type
    • Size
    • Sales
  • 307 rows (308 including headers)
The outputs can be found here for comparison. Don’t forget to fill the participation tracker and share your solutions using #PreppinData on Twitter. 

Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text