How to... Choose an Output
Tableau Prep is built primarily for preparing data for visual analysis in Tableau Desktop. As a result, Tableau has designed the tool to make it very easy to send the data to Desktop once it is ready. That simplicity might mean you miss the optimal output type for your purpose, though.
What output types are there?
Within Prep there are four main output types to consider. Each has its own main use, so let's explore each in turn:
File Types - Hyper
Tableau's new form of extract made lots of data work faster; in some cases, a lot faster! Hyper files came in to Tableau Desktop and Server in version 10.5. Opening any data extract in Tableau automatically updated the extract to a Hyper format.
If you use the Tableau tools in version 10.5 or newer (10.5 was the last version before the naming changed to 'Year.Version', e.g. 2018.1), then outputting to a Hyper file is a safe bet that the data is in the best file format for use in Desktop. Hyper files are optimised for use in Desktop because of their fast ingestion and analytical query speeds.
File Types - TDE
Before Hyper files, Tableau Data Extracts (or .tde files, as they are more commonly known) were the file type to use in Desktop. If you are using Tableau 10.4 or earlier, you will need to output to a tde instead of a Hyper file. These cases are becoming rarer but still arise, because most users have to upgrade Desktop and Server at the same time and may not have the IT resource to do so.
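The version rule from the Hyper and TDE sections can be written down as a small sketch. This is only an illustration in Python, not something Prep runs; the function names `parse_version` and `choose_extract_type` are made up for this example, and it assumes version strings of the form 'major.minor' such as '10.4' or '2018.1'.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int]:
    """Turn a version string such as '10.4' or '2018.1' into (major, minor)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def choose_extract_type(desktop_version: str) -> str:
    """Return 'hyper' for Tableau 10.5 or later, otherwise 'tde'."""
    # Hypothetical helper: it encodes only the 10.5 cut-off described above.
    return "hyper" if parse_version(desktop_version) >= (10, 5) else "tde"


print(choose_extract_type("10.4"))    # tde
print(choose_extract_type("10.5"))    # hyper
print(choose_extract_type("2018.1"))  # hyper
```

Because 'Year.Version' numbers such as 2018.1 compare as larger than 10.5, the same check covers both naming schemes.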
File Types - CSV
Although Prep is primarily developed for preparing data for use in Tableau, the output to csv file enables you to share the output with users of other data software products. csv files are also useful when someone wants a table of data to answer a bespoke question.
Comma Separated Values (csv for short) files are a basic file format. csv files are not optimised for use in data tools, so performance will be slower than with Tableau-optimised extracts like Hyper or tde.
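One reason a csv is 'basic' is that it stores every value as plain text, so whoever reads it has to restore the data types. Here is a minimal sketch of that using Python's standard csv module; the column names and rows are invented purely for illustration.

```python
import csv
import io

# Hypothetical two-row output; the column names are made up for illustration.
csv_text = "order_id,order_date,sales\n1,2020-01-15,250.50\n2,2020-01-16,99.00\n"

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Every value arrives as a string, so the reader must convert types itself.
print(type(rows[0]["sales"]).__name__)  # str

total_sales = sum(float(row["sales"]) for row in rows)
print(total_sales)  # 349.5
```

That flexibility is what makes csv easy to share with other tools, at the cost of the extra work on every read.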
Publish to Server
This option is primarily for when you are sharing your data and/or analysis with others. Making your data set available for others to use is a key part of Self Service Data Preparation, as it lets your analysts and experts explore the data for themselves. With Tableau Prep Conductor (a server add-on), you can publish the flow to Tableau Server and refresh the data on a schedule of your choosing.
When can you output data in Prep?
Output Step
The Output Step gives the user the majority of the options Prep provides for outputting data for further analysis. Here are the main considerations when using the Output Step, depending on whether you are outputting the data to a File or a Server.
Save to File
1. File Name
Version Control and Naming Convention within large teams and organisations are key factors in ensuring the right data is used for the right purpose. By Naming Convention, we are referring to the Name of the file. This is set when 'Browsing' the file structure in either a Windows File Explorer window or the Mac OS 'Finder' browser.
2. Location
The location can be set as you browse through the file structure of your computer. The default location will be your 'My Tableau Prep Repository' Data Folder. Prep creates this file structure when it installs on your computer. If the output is purely for your own use then this location will probably be fine. If you are producing the output for others to use then you will need to change the location's file path to somewhere others will be able to access it.
3. Output Type
Ironically, this is actually the first thing to set after deciding to 'Save to File'. This is because as soon as you Browse the file structure, it will set a file extension type. Therefore, picking whether you want to output to a csv, tde or hyper file is a good first step to take, as it saves 're-browsing' to reset the file type.
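To make the File Name, Location and Output Type choices concrete, here is a small sketch of one possible naming convention. The pattern (dataset, version number, run date) and the helper `build_output_path` are hypothetical, chosen only for illustration; Prep does not run this code, and the resulting name would still be entered when Browsing in the Output Step.

```python
from datetime import date
from pathlib import Path

# The three file types the Output Step can save to, as described above.
ALLOWED_TYPES = {"csv", "tde", "hyper"}


def build_output_path(folder, dataset, version, output_type, run_date):
    """Build a path like <folder>/<dataset>_v<version>_<YYYYMMDD>.<type>."""
    if output_type not in ALLOWED_TYPES:
        raise ValueError(f"Unsupported output type: {output_type}")
    # Hypothetical convention: dataset name, version number, then run date.
    file_name = f"{dataset}_v{version}_{run_date:%Y%m%d}.{output_type}"
    return Path(folder) / file_name


path = build_output_path(
    "shared/prep_outputs", "sales_orders", 2, "hyper", date(2020, 3, 1)
)
print(path.name)  # sales_orders_v2_20200301.hyper
```

Whatever convention a team picks, the point is that the name alone should tell a colleague what the data is and which version they are looking at.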
Publish to Server
1. Select the Server and Site
As most users have just one server, entering that server's URL will be simple. Tableau Server and Online instances are divided into Sites that ring-fence data sources and workbooks.
2. Project
Sites can be sub-divided into Projects. These are often thought of as an individual team's or department's space, where the Project Owner can control the content and access. Choosing the correct Project also helps set the permissions for the Data Source.
3. Name
Like the Name for the files, Version Control and Naming Convention are key to ensuring the correct data is used when server users build their analysis.
4. Description
Adding a description to your data source can provide more clarification than the Data Source Name alone.
Preview in Desktop
The other option in Prep is to output a temporary Hyper file that will form and then open automatically in Desktop. This functionality allows the user to try the dataset they have built so far to determine if other changes are required, or whether a final output can be formed. Data Preparation is often an iterative, learning process so this allows fast prototyping.
One other benefit of the Preview in Desktop is that if the data is being prepared solely for a fast, one-off solution then this temporary file will more than suffice, and it doesn't require any of the steps and decisions above.
What scenarios should you consider?
Publishing the same Data Source to Multiple Sites / Locations
If you want to publish the same data source to more than one location, you will probably want to add multiple Output Steps. Publishing to multiple server sites is not possible at the time of writing, but multiple projects can be written to.
Updating the dataset will clear all previous data
At the time of writing (version 2020.1), there is no way to partially update a dataset within Tableau Prep. Therefore, take care that the input data sources remain available for future reruns of the flow, so that all historic data is retained if required.
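The effect of a full refresh can be pictured with a toy model. This is not how Prep works internally; `full_refresh` and the file names are hypothetical, and the sketch only shows why a missing input means missing rows in the rebuilt output.

```python
def full_refresh(input_files):
    """Rebuild the output from scratch using only the inputs available now."""
    output = []
    for rows in input_files.values():
        output.extend(rows)
    return output


# Hypothetical inputs: one historic file and one current file.
inputs = {
    "sales_2019.csv": [("2019-12-31", 120)],
    "sales_2020.csv": [("2020-01-01", 95)],
}
first_run = full_refresh(inputs)

# The 2019 file is later archived and no longer reachable by the flow.
del inputs["sales_2019.csv"]
second_run = full_refresh(inputs)

print(len(first_run), len(second_run))  # 2 1
```

The second run silently drops the 2019 row, which is why the historic inputs need to stay in place for as long as the history matters.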