Posts

Showing posts from December, 2019

2019: Week 46 Solution

This week we got all festive for a Christmas edition of Preppin' Data! In this task we had to help Santa out with understanding how much of the time he has allocated to his Elves to build presents has been wasted due to people being on the naughty list! The first task is to bring the Naughty and Nice lists together, and make sure that we can identify who was on each list. To do this we bring both tables into the view, and then use the union tool to 'stack' both tables on top of one another. As a result, we have a FilePath field that contains information about which list the data came from, so we can parse this out by using a custom split on the last word after the '/'. The next step is to match the address on the Naughty/Nice list with the address on the Present list. The problem we have here is that on the Present list there is only part of the address, but not the same part on each row. Sometimes it is the
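As a rough illustration of the union and file-path parsing steps, here is a minimal pandas sketch; the file names and paths are assumptions, not the challenge's real schema:

    import pandas as pd

    # Hypothetical stand-ins for the Naughty and Nice sheets
    paths = ["lists/Naughty.csv", "lists/Nice.csv"]

    frames = []
    for path in paths:
        df = pd.read_csv(path)
        df["FilePath"] = path          # mimic Prep's file path field
        frames.append(df)

    # 'Stack' both tables on top of one another (the union step)
    combined = pd.concat(frames, ignore_index=True)

    # Parse which list each row came from: last part of the path after '/'
    combined["List"] = (
        combined["FilePath"]
        .str.split("/").str[-1]
        .str.replace(".csv", "", regex=False)
    )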

2019: Week 44 Solution

In this week’s Preppin’ Data we continued with some end-of-year tasks, and this week we looked at how the team performed over the year. Our first task is to find out how many staff worked in each store on each day. To do this we input the Team Member table, then use an aggregate tool to group by Store & Date and sum the No. of Rows within the table. As there is one row for each member of staff who worked on that given day, this aggregation allows us to total how many staff worked in each store on each individual day. We have then renamed the No. of Rows field to something more familiar, such as Staff Numbers. Next up we need to join the Team Member and Store Sales tables together. To do this we need to prepare the Store Sales table so that the store field is in the same format as the Team Member table. Firstly, we use the pivot tool to pivot the two store fields so the values are in a single column, so we bring the ‘Lewisham’ & ‘Wimbledon’
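Outside of Prep, the same aggregate-then-pivot logic can be sketched in pandas; the tables and column names below are illustrative assumptions:

    import pandas as pd

    # Hypothetical Team Member table: one row per staff member per store per day
    team = pd.DataFrame({
        "Store": ["Lewisham", "Lewisham", "Wimbledon"],
        "Date":  ["2019-12-01", "2019-12-01", "2019-12-01"],
        "Staff": ["Anna", "Ben", "Carla"],
    })

    # Group by Store & Date and count rows, then rename to Staff Numbers
    staff_numbers = (
        team.groupby(["Store", "Date"])
            .size()
            .reset_index(name="Staff Numbers")
    )

    # Hypothetical Store Sales table with one column per store
    sales = pd.DataFrame({
        "Date": ["2019-12-01"],
        "Lewisham": [1200],
        "Wimbledon": [950],
    })

    # Pivot the two store columns into a single Store column
    sales_long = sales.melt(
        id_vars="Date",
        value_vars=["Lewisham", "Wimbledon"],
        var_name="Store",
        value_name="Sales",
    )

    # Join on the now-matching Store and Date fields
    result = staff_numbers.merge(sales_long, on=["Store", "Date"])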

How to... Split Data Fields

One of the most common actions you will take when preparing data for analysis is to split a column into its sub-parts. This happens when data is picked up from operational systems, output as a unique ID for that record of data, or squeezed together to fit into a poorly designed database table. The human brain is fantastic at spotting patterns in data (that's why we create visual analytics), so you will often spot the need to split data fields (columns) just by looking at the original data. A field like this probably needs to be split apart to form the three separate columns that would help with the analysis of this data set. Basic Splits Splitting data in most data tools can be very easy to achieve; Tableau Prep is no different. Simply choose the data field you want to split, find the '...' icon, select 'Split Values' and then 'Automatic Split', and Prep will decide the best logic to split the field. In this case, the automati
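For comparison, the same basic split can be written in pandas; the 'Order ID' field and its 'XX-YYYY-NNNN' shape are hypothetical examples, not taken from the post:

    import pandas as pd

    df = pd.DataFrame({"Order ID": ["UK-2019-1234", "FR-2019-5678"]})

    # Split on the '-' delimiter into three separate columns
    df[["Country", "Year", "Order Number"]] = df["Order ID"].str.split("-", expand=True)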

How to... Not Need Data Prep at all

Sometimes we all overthink problems, and this happens a lot in Data Preparation. Once you have some strong skills, it can be easy to overthink the issue and take additional steps you simply don't need. History of Data Preparation in Tableau If you tried Tableau Public for the first time in version 5.2 and became a heavy user of Desktop from version 7 onwards, completing your Data Prep in Tableau was tough unless you were doing very simple use cases. Tableau Prep was just a glint in the eye of the Tableau Developers at that point. Common data prep tasks had to happen outside of Tableau Desktop for a number of reasons, like performance (slower processing speeds), but it was the lack of functionality that caused the most significant issues. There was often the need to use external tools to complete tasks. The Tableau Excel Add-In Extension was the principal way to deal with pivoting data fields to change columns into rows of data. This was very useful for dealing with survey data

How to... Connect to Data in Files

One of the first steps you will take in any Data Preparation, Visualisation or analytics project is to Input Data. These inputs can come in many different forms, but the most common remains files. Files, upon files, upon files For many of us, our first experience of working with data probably involved one piece of ubiquitous software: Microsoft Excel. The spreadsheet has become the jack-of-all-trades for data storage, manipulation and basic reporting. Every organisation across the world has tens, if not hundreds, of spreadsheets saved on every computer. All of these will contain useful information, raw data or reference tables that will help answer the questions experts are posing. The issues spreadsheets create There are a number of potential issues this creates that, whilst not unique to spreadsheets, are considerations to make: Accuracy - how has the data been changed since being added to the spreadsheet? Timeliness - how old is the data? Optimisation -
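As a point of comparison, reading a file like this into a data frame is a one-liner in pandas; the file and sheet names here are purely illustrative:

    import pandas as pd

    # Hypothetical workbook and sheet (pandas uses openpyxl to read .xlsx files)
    sales = pd.read_excel("monthly_sales.xlsx", sheet_name="Sales")

    # CSV files are just as simple
    orders = pd.read_csv("orders.csv")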

How to... explain 'Why Self Service Data Prep'?

With every organisation swimming in Data Lakes, Repositories and Warehouses, never before have people in organisations had such an enormous opportunity to answer their questions with information rather than just using their experience and gut instinct. This isn't that different from where organisations stood a decade ago, or even longer. What has changed is who wants access to that data to answer their questions. No longer is the expectation that a separate function of the business will be responsible for getting that data; now everyone feels they should have access to it. So what has changed? Self Service Data Visualisation. What is about to change to take this to the next level? Self Service Data Preparation. A Short History of Self Service Data Visualisation More than a decade ago, all things data related were the domain of specialist teams. If you were in an organisation and had a question, you either settled for trusting your gut instinct for the answer or set up a p

2019: Week 43 Solution

We’re getting closer to the end of the year, so it’s about time to do some end-of-year appraisals and encourage our stores to set some targets for next year. Automating this would be ideal for us, so that’s the challenge for us this week. First up is cleaning the Monthly Sales file so we can remove any fields that don’t contain store information, in this case removing the ‘Total’ record from the Store field. The next step is to pivot the data so that all of the monthly totals are in one column. The pivot should look like this: After pivoting the data, we can then rename the newly created fields so that they are more recognisable; we’ve called the new fields Month Year and Sales Value. After the pivot, we have multiple rows of data for each month, and we can use Tableau Prep’s grouping functionality to group the specific months into quarters. For example, Month 1, Month 2 and Month 3 would be grouped together and renamed as Q1. This technique works for this solution, howeve
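The pivot and quarter-grouping steps can also be sketched in pandas; the table and month column names below are assumptions for illustration:

    import pandas as pd

    # Hypothetical Monthly Sales table after removing non-store fields
    monthly = pd.DataFrame({
        "Store":   ["Lewisham", "Wimbledon"],
        "Month 1": [100, 90],
        "Month 2": [110, 95],
        "Month 3": [120, 105],
    })

    # Pivot so all monthly totals sit in one column, then name the new fields
    long = monthly.melt(id_vars="Store", var_name="Month Year", value_name="Sales Value")

    # Group the months into quarters, e.g. Months 1-3 become Q1
    month_num = long["Month Year"].str.extract(r"(\d+)")[0].astype(int)
    long["Quarter"] = "Q" + ((month_num - 1) // 3 + 1).astype(str)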

2019: Week 46

Ho ho ho, Merry Christmas everyone. Well, not for some people! Some people have ended up on Santa's Naughty list, but Santa needs to understand how much of his Elves' time he has allocated to building presents that will ultimately not be delivered. Can you help Santa match who is on the Nice and Naughty lists so he can deliver the correct presents to those who deserve them?

Requirements
- Input all three tables in the Spreadsheet
- Pull together the Naughty and Nice lists, but make sure you understand who should be on each
- Parse the address to match Santa's list to the Naughty/Nice list
- Only return one row per recipient in Santa's list (both Naughty and Nice)
- Who's missing off both lists?
- Aggregate the Nice and Naughty lists to understand how many hours have been allocated to building both types of presents
- Santa only does his time planning in whole hours, so make sure you round up (a sketch of this step follows below)
- Output the data

Output 2 Files: Summary: 2 Columns: Total Hours B
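For the rounding requirement, here is a minimal pandas/numpy sketch, with a hypothetical Hours column standing in for the real present-building data:

    import numpy as np
    import pandas as pd

    # Hypothetical present-building hours per recipient and list
    presents = pd.DataFrame({
        "List":  ["Naughty", "Nice", "Nice"],
        "Hours": [2.3, 4.1, 0.5],
    })

    # Aggregate the hours per list, then round up to whole hours
    summary = presents.groupby("List", as_index=False)["Hours"].sum()
    summary["Total Hours"] = np.ceil(summary["Hours"]).astype(int)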