How to... deal with Dates

Sorry to anyone looking for dating advice - this isn't for you. For those who are battling dates in their data sets, then this is for you! 

Why are dates important?

A basic date field can start to tell us so much:
  • How many sales were made on Saturday?
  • How many students joined the programme this term?
  • How many games does my team play this month?
All of this can be formed by counting the number of rows and breaking this down by the different parts of the date.

What do we mean by parts of dates?

When most of us think of dates we think of days, months and years. Depending on which side of the Atlantic Ocean we are on depends on whether the months or days come first in that order. We are Brits so will unapologetically put days before months for the rest of this post. 

The date format we commonly use are dd/MM/yyyy or, for example, 25/09/2019. The 'd' represents the day part and as we have two digits to represent the day then we use 'dd'. 'M' stands for month or again 'MM' as there are potentially two digits in the month. 'y' is for Year and you'll often either have two digits for the last two year digits, ie 19 for the date above, or four digits represented by 'yyyy'.

The date given can tell us more than just what day it is, in what month, in what year. Weekday, Week and Quarter can all be determined from this basic date. 25/09/2019 is a Wednesday, in the 39th week and the third quarter of 2019. The newest Business Intelligence tools let you dissect the date fields in all of these ways but that wasn't always the case.

Date Lookup Tables

Traditionally, within most database set-ups, you will often have access to a reference table that will give different dissections of the same date. For example, you will often find columns in the date reference table including:
  • Date 
  • Year
  • Quarter
  • Month
  • Week
  • Day
  • Day of Year
  • Weekday
  • Financial Year
  • Financial Quarter 
  • Financial Month
  • Public Holiday
  • Weekend
This matters as whenever you use a tool that doesn't dissect the date more natively, then you are having to join and work with all the relevant columns to pick out the parts of the date for your analysis (as well as your Database Administrator keeping the table up to date).

Epoch Date

Another format of dates that you may commonly find in data sets but make little sense to anyone at first glance is Unix Epoch dates. As Integers (whole numbers) are easy to store in databases (unlike long strings), the Epoch date is recorded as the number of seconds from 1st January 1970 00:00:00. For example, let's use midnight of our example date 25/09/2019 would result in a value of 1569369600.

The challenge for many data preppers is to convert this for use in modern business intelligence tools. In Tableau, you can convert this number back to a date as:

As an aside, if you want to check your Epoch conversion is correct, I recommend this tool: https://www.unixtimestamp.com/index.php

Excel Serial Number

If you have worked with data, you may have come across a value that is similar to the Unix Epoch Date but is a much smaller number. Excel's Serial Number is the number of days since 01/01/1900. Using our example date, Excel's Serial Number equivalent will be 43731. Therefore, this number can be turned in to a date for analysis in a very similar manner to the Epoch solution above but this time we will use increments of days and start date of #1900-01-01#. 

Entering Dates

You may have noticed in the image above, we have set the constant date of 01/01/1970 or 01/01/1900 with the hash marks surrounding them. This is a common syntax you will find in many calculations and tools you use. There are more common ways to convert dates too. In Tableau, there are few date calculations that will help for dates that will make your analysis very easy.

Makedate()
If you have each part of the date as numbers you might want to explore using the MAKEDATE() function.

Makedate() gives the user the ability to take the individual components of the date and piece it together in a way that it can then be analysed however you want to.

Strings to Date
Strings are very common in working with dates. After all here are some of the ways to right September:
  • S
  • Sep
  • Sept
  • September
Coupled with the different ways to seperate the different parts of the dates (' ' , . / -) string data that represents dates can come in many different forms:
  • 25 Sep 19
  • 25th.Sept.2019
  • 25-9-19
We could go on... but let's instead show you how to turn these string fields in to dates. Using the function DATEPARSE() can take any string and let the user pick out the parts of the date as long as they are consistently positioned throughout the date field.

For the date format '25.September.2019' can be turned in to a date with:

                         ____________________________________________________

Dates are fundamental to your analysis so getting comfortable using them is a key skill in your data preparation kitbag.

Here's some challenges to practice on:


Popular posts from this blog

2024: Week 1 - Prep Air's Flow Card

How to...Handle Free Text

2023: Week 1 The Data Source Bank