Posts

Showing posts from June, 2020

2020: Week 26 - Solution

Image
You can find our full solution workflow below and download it here .  This week the challenge was all about working with unique IDs and how to deal with things when they don't quite all match up! Step 1 - Join Inputs The first step this week is to bring in both the 'Internal Data' and '3rd Party Data' tables and then identify which IDs match up perfectly, and the ones that don't. To do this we can use the following joins:  Perfect Match This is an inner join on ID = 3rd Party ID We can then add a step after the join where we remove the [Scent-1] field and add a string calculation called 'Status' that labels all of the rows as 'Matched'. Unmatched Internal Right Outer Join to include all values from the Internal table, joining on 3rd Party ID = ID Similar to before, we are then going to add a clean step, then remove any 3rd Party fields (ID & Sales) and also rename [Scent-1] to [Scent] Unmatched 3rd Party Left Outer Join to include all values

2020: Week 26

Image
This week's challenge is inspired by some real life consulting work, replacing a very manual task with an automated process! Unique IDs are a great way to label your data, but issues can arise when trying to match internal IDs to 3rd Party IDs, as there could have been changes made to the naming convention by any number of people. So how do we go about making some educated guesses and getting together a list of IDs that we can take back to the 3rd Party to complain about? Input 1 input this week, with 2 sheets: Requirements Input the data . Find the IDs that match perfectly and label them as such. For the remaining unmatched Internal and 3rd Party IDs, create all the possible matching IDs for each Scent. For each 3rd Party ID, find the Internal ID with the lowest sales difference.  For each Internal ID, find the 3rd Party ID with the lowest sales difference. You should now have no duplicated IDs. Classify these IDs as "Matched on Scent". Join t

2020: Week 25 - Solution

Image
You can find our full solution workflow below and download it from the Preppin' Data Community page ! This week we looked at a similar technique that we covered a couple of weeks ago in Week 20 , where we looked at cipher solving and some techniques that aren't necessarily native to Prep. However, this week we are going to be converting roman numerals to normal numbers.  Step 1 - Input Numerals The first step this week is to input the number data set, this is a single string that we need to convert from numerals to numbers. After inputting the data, we want to find out how many characters are in the string, so we use the Len function in the following calculation:  Length  Len([Number]) The length of our string is 9 characters long, so now we can add in the Scaffold data set.  After inputting the Scaffold, this is ready to join straight to our original workflow. We can do this with the following join condition: As a result of this join, we have made our data set increase in size

Tableau Prep Up and Running links

Image
If you have purchased Tableau Prep Up and Running (thank you), I have a number of data sources and resources linked within the book. Here is a single page to support all of those links: Chapter 3 - Planning Your Prep  Link to data sets for follow along 2019 Challenge 33 Chapter 8 - Dealing with Dates 2019 Challenge 1 2019 Challenge 11 2019 Challenge 23 2019 Challenge 28 Chapter 9 - Dealing with String Data 2019 Challenge 19 2019 Challenge 32 Chapter 15 - Aggregating in Prep Builder 2019 Challenge 1 2019 Challenge 13 Chapter 16 - Joining Data Sets Together 2019 Challenge 29 2019 Challenge 33 Chapter 17 - Unioning 2019 Challenge 2 2019 Challenge 6 2019 Challenge 12 2019 Challenge 15 Chapter 23 - Filtering 2019 Challenge 18 2019 Challenge 29 2020 Challenge 4 Chapter 26 - Clean by Grouping Data 2019 Challenge 2 Chapter 27 - Dealing with Nulls 2020 Challenge 8 Chapter 32 - Completing Advanced Joins 2020 Challenge 8 Chapter 33 - Creating Level of Detail Calculations 2019 Challenge 15 Chapte

2020: Week 25

Image
This week's challenge is a bit of a companion to Week 20's Cipher Solving Challenge , so if you liked that then hopefully you'll be a fan of this too! We're working with Roman Numerals and using the power of Prep to translate them into numbers that we can more easily manipulate in Tableau. In theory, once you've completed this challenge, you could use your resulting workflow to translate any Roman Numeral! Pretty cool, huh? Remember that ordering is very important when reading Roman Numerals... Inputs Requirements Input the data . Find a way to split the Roman Numeral into individual characters (remember, order is important!) Join the information relating to Roman Numeral values. Depending on the ordering of characters in the Roman Numeral, some of these values will need a negative multiplier.  Aggregate the values to complete the translation from Roman Numeral to number. Output the data. Output 2 fields Number Numeric Equiva

2020: Week 24 - Solution

Image
You can find our full solution workflow below and download it from the Preppin' Data Community page ! This week we went back in time and looked at some historical battles that have happened across the UK throughout history. It included cleaning some webscraped data and also parsing some unusual dates.  Step 1 - Split Strings The first step is to bring in the data, this consists a string containing details about each battle that has taken place. Our aim is to split this string so that we can isolate some of the key details like Battle Name, War, Date etc.  To do this we want to identify a repeating pattern that is similar across all rows, which will help to break apart the string into different sections. In this case, there is a repeating '<br />', so we can use this to split the string:  We use the custom split to break apart all fields when the '<br />' occurs:  After the split we now have our different categories in separate columns which can then be i

2020: Week 24

Image
There has been many a battle fought on British soil across the years and this website provides a lot of information about many of the battlefields. It's a great website to practice our webscraping skills, but webscraping doesn't always lead to the cleanest of datasets. So this week's focus will be on cleaning up a subset of our webscraped data. Input Requirements Input the data. Find a natural way to split the data into different fields. Remove rows which are incomplete. i.e. if they do not have information in each field. Clean battle names. Ensure each row has a unique battle name. Clean the dates. For those dates with a date range, just use the start date. The dateparse function may be useful here. Clean the Victors, War and Description fields. Output the data. Output 5 fields Date Battle War Victors Description 63 rows (64 including headers) The full output can be  found here  for comparison. Make sure to fill in the 

2020: Week 23 - Solution

Image
You can find our full solution workflow below and download it from the Preppin' Data Community page ! For week 23 we took a look at Quiz results and how we can use Tableau Prep Builder to help us calculate how many correct answers each team answered correctly. Step 1 - Pivot! The first step this week is to change the shape of our data so we can have a row for each of the questions.  Participant Answers First, we need to bring in the Participant Answers table and use a Columns to Rows pivot with each of the 'Rounds'. We can use a wildcard pivot on 'Round':  After this pivot we are left with a row for each Name pair for each round, so 6 rows for each round.  We can now use a Split function to split each of the answers into a separate column. Luckily, all answers are split with a ',' therefore we can just use the automatic split to break these apart:  Now we have a column for each answer, we can use another Pivot tool to bring these back together. This time it