Posts

Showing posts from February, 2021

2021: Week 8 - Karaoke Data

Image
Challenge by: Jenny Martin Recently I was helping a colleague prep some karaoke data and I thought it was too fun a subject to resist turning into a Preppin' Data challenge! I had a lot of fun creating the dataset and imagining the type of person who may sing one song and then not bother with the rest of the session.  We will need to make some assumptions as part of our data prep: Customers often don't sing the entire song Sessions last 60 minutes Customers arrive a maximum of 10 minutes before their sessions begin I will warn you that this challenge may be a little on the trickier end of the spectrum! Inputs Karaoke song choices and what time they began  Customer entry times  Requirements Input the data Calculate the time between songs ( help ) If the time between songs is greater than (or equal to) 59 minutes, flag this as being a new session ( help ) Create a session number field ( help ) Number the songs in order for each session ( help ) Match the customers to the co...

2021: Week 7 Solution

Image
Solution by Jonathan Allenby and you can download the  full workflow here .  This week in Preppin' Data we were highlighting the slightly confusing and inconsistent world of vegan shopping. A world where Sweet Chilli crisps, Tesco jam doughnuts, and Kellogg's Crunch Nut Clusters aren't vegan but somehow Bacon Rasher crisps, Co-op jam doughnuts, and Cookie Crisp cereal are somehow. Perhaps most enlightening of all however is that Jammie Dodgers use raspberry flavoured apple jam instead of just using raspberry jam. What a world. Step 1 - Prepare the keywords In order to check whether each product contains any of our non-vegan keywords within their ingredients we're going to take every keyword and E number and place them into a single field, one keyword per row, and append them to the list of products. Prepare the E Numbers There's a few different ways and points where you could attach an 'E' to the start of every number. One great way that we used was to use t...

2021: Week 7 - Vegan Shopping List

Image
Challenge by: Jonathan Allenby Now that Veganuary has come and gone we thought it would be interesting to take a look at some common supermarket products and use Prep to figure out whether or not they are vegan. Some results may surprise you! For the sake of this analysis we're taking bee by-products as non-vegan (beeswax, honey, etc). Inputs A shopping list of products and their ingredients (or allergens when ingredients were not available). I have a child-like palate so its mostly full of sweet treats, some of which you'd expect to be vegan and some of which you'd expect not to be, however everything is commonly found in UK supermarkets so no specialist shops required. Two lists of common non-vegan ingredients and E numbers ( source ) Requirements Input the data Prepare the keyword data Add an 'E' in front of every E number. Stack Animal Ingredients and E Numbers on top of each other. Get every ingredient and E number onto separate rows. Append the keywords onto t...

2021: Week 6 - Solution

Image
  Solution by Tom Prowse and you can download our  full workflow here .  This week's challenge was all about exploring the data within Tableau Prep before bringing it into Desktop. We took the data set from last month's #SportsVizSunday and looked at comparing the prize money for Women and Men's golf. As always there are multiple ways of completing the challenge, I've given you a couple of options here. Both use similar techniques but are just broken out in a couple of ways that might help others understand or when handing over a workflow.  For alternative solutions make sure you check out the submissions on Twitter and the Tableau Community forum. Solution 1 Step 1 - Total Prize Money The first question that we need to answer is 'What's the Total Prize Money earned by players for each tour?'. We can answer this with a simple aggregate where we group by Tour and Sum Money : Step 2 - Number of Players Question number two was - 'How many players are in th...

2021: Week 6 - Comparing Prize Money for Professional Golfers

Image
Challenge By: Jenny Martin What's one of the benefits of preparing your own data? Being able to start your analysis sooner! Sometimes I can find opening Tableau Desktop to explore my data gets a little distracting by trying to visualise it before I've decided on the story. Starting my analysis of the dataset in Tableau Prep helps me, personally, to stay more focused! It's clear where the outliers are, what the distribution of the dataset is and therefore what the story should be. For this week's challenge we're looking at a dataset that was used in December 2020 for Sports Viz Sunday  (thanks to Kate Brown for sharing!) This dataset comes from the PGA and LPGA  2019 Golf tours and lists the total prize money for the top 100 players. For those of us who aren't too familiar with golf, the PGA is the men's tour, whilst the LPGA is the women's tour. Input We have one input this week: Official Money Requirements Input the data Answer these questions: What...

2021: Week 5 - Solution

Image
Solutions by Jenny Martin & Jonathan Allenby and you can download the  full workflows here .  This week we were looking at de-duplicating and updating our training attendee data. In this challenge we mentioned that you can use Prep to avoid having to use LODs in Tableau Desktop, especially as Prep is so great at de-duplication using Aggregate steps. However, if you're comfortable with LODs then you can use them within Prep where it is more manageable to track and modify your changes. In the first solution, Jenny demonstrates how to avoid these LODs completely and in the second Jonathan shows how you can use LODs to achieve the same outcome. Both solutions are valid and which one you gravitate towards will likely just be a matter of personal preference and instinct. Solution A : No LODs Step 1 - Getting the current Account Managers and IDs In order to avoid LODs and table calculations we're going to make use of the Aggregate and Join steps in order to summarize and update o...

2021: Week 5 - Dealing with Duplication

Image
Challenge by: Jenny Martin Have you ever been working with a dataset in Tableau Desktop and noticed some duplication occurring? Of course, this is something you can fix with some potentially tricky LODs or Table Calc filters, but wouldn't it be nicer for your dataset to be viz ready before heading into Desktop?  If you attended the Tableau Fringe Festival last year, this concept may feel familiar, as I did a quick demo explaining why I, personally, would prefer to use Prep to solve my duplication issues. You can find the video here  if you like. Input The dataset we'll be working with for this challenge follows the same theme as the Fringe Festival. We have information relating to which of our Clients are attending our training sessions. Also included in our dataset is which Account Managers look after which Clients. However, we have historical information about Account Ownership which is leading to duplication. So how can we fix it? Requirements If you're new to the techni...

2021: Week 4 - Solution

Image
Solution by Tom Prowse and you can download our full workflow here.   This week was our final fundamentals challenge in January, so hopefully you have come a long way and are starting to understand what Tableau Prep is all about! For the challenge this week we focused on combining a few of the different techniques that we have covered in the previous week. This included joins, pivots and calculations, whilst we also added some new analytical calculations in the form of a Rank.  Let's take a look how we solved it! Step 1 - Wildcard Union Data The first step is to bring all of our sales data in for each store by using the wildcard union within the input tool. This is the same technique that we used last week, however this time we want to exclude the Targets sheet. So our input looks like this:  Notice how the targets table is no longer in the 'Included Sheets' column.  After the input we can clean up a couple fields by removing 'File Paths' and then renaming 'Tab...