How to... explain 'Why Self Service Data Prep'?

With every organisation swimming in Data Lakes, Repositories and Warehouses, never before have people in organisations had such an enormous opportunity to answer their questions with information rather than just their experience and gut instinct.

This isn't that different from where organisations stood a decade ago, or even longer. What has changed is who wants access to that data to answer their questions. No longer is the expectation that a separate function of the business will be responsible for getting that data; now everyone feels they should have access to it. So what has changed? Self Service Data Visualisation. What is about to change to take this to the next level? Self Service Data Preparation.

A Short History of Self Service Data Visualisation 

More than a decade ago, all things data related were the domain of specialist teams. If you were in an organisation and had a question, you either settled for trusting your gut instinct for the answer or set up a project to get the information you required. Data projects were either reporting requests sent to specialist Business Intelligence (BI) teams, or requests to IT teams to set up data infrastructure from which reports could be produced. This was expensive, time-consuming and often resulted in products that were less than ideal for all concerned.

The reason this methodology doesn't work is the iterative nature of BI. Humans are fundamentally intelligent creatures who like to explore, learn and then ask more questions because they are intrigued. With the traditional IT or BI projects, once the first piece of analysis was delivered, the project was over. However, answering one question triggered others, and because the skills sat in different hands from those who had the questions, those follow-up questions simply went unanswered. Business users still tried to cobble together the answers, but only from disparate reports or different levels of aggregation.

This all changed with the rise of Self Service Data Visualisation tools like Tableau Desktop. Suddenly, with a focus on the user, individuals were able to drag and drop data fields around the screen to form their own analysis, answer their own questions and ask their next questions straight away. The past decade has seen data visualisation and analysis move closer to everyone's role, and become a significant part of many roles that are not considered Information Technology roles or part of the data team. The analytical capacity has come to the business, rather than the business having to go and ask specialists to get the data. This represents a big transformation in how we work and poses a challenge as to what skills people now require.

Accessing the 'Right Data'

The rise, and entrenchment, of self service data visualisation into individuals' roles has raised other needs and tensions in the analytical cycle. Access to data sources has become the next pain point in enabling self service. With the right data, optimised for use in the tools that empower visual analysis, answers can be found at the speed at which the business expert can form the questions. But accessing the 'right data' is not that easy. The data assets that organisations have built up have been optimised for storage, optimised for tools that now seem to work against the user rather than with them, and are held behind strict security layers to handle greater regulation.

Many Data Projects are now focused on extracting data from their storage locations. The specialists are focused on using their data skills to:
  • Find data in existing repositories - is it a file lurking on someone's laptop or a nicely curated database?
  • Find data in public or third party repositories - publicly available data sources are increasingly becoming key to deeper analysis.
  • Create feeds of data from previously inaccessible sources / systems - for example, getting data out of operational systems in businesses to be able to answer questions on what's going on.
The gap in the process now sits between taking these sources and making them ready for visual analytics. 

The Self Service Data Preparation Opportunity

This gap is being closed by new tools that allow the business experts, government workers and academics who use self service visual analytics to answer their questions to access this data for themselves. Tableau Prep Builder has brought the same logic that empowered visual analytics to this data preparation step. By using a user interface similar to the one that data visualisers are already accustomed to, Prep Builder has made the transition to self service data preparation a simple one.

As someone who went through the pain of waiting for reports to be built for me, then learned how to build the analysis myself, wrangled data sets in databases and eventually became a trainer of both data visualisation and data preparation skills, I can see there is still a significant gap between all potential data preparers and those who have the requisite skills, an awareness of what to do with Self Service Data Preparation tools, and an understanding of why they are needed.

Preppin' Data 

Preppin' Data is designed for exactly this gap: learning how to use the tools needed to tackle the tasks that are the current road blocks to delivering answers to our questions. The challenges will introduce commonly used techniques to solve these problems away from the pressure of the workplace. Over time, you can form strategies to make the data exactly how you want it, whether you are working with files or databases, surveys or pivot tables, messy data or tangled text fields. There isn't a straightforward recipe to follow, but through practising you'll soon be able to handle these challenges. Dig into the challenges, read the 'How to...' posts, and soon the Self Service Data Preparation road blocks will be gone and your super data powers will shine through.
