How to... deal with Nulls
Nulls, or the absence of data, are a fickle challenge within data preparation. Experienced data preppers will almost instinctively know how to deal with them, or at least how to manage the challenges that come with a dataset that contains nulls. Newer data preppers often do not have the same set of use cases to draw on when deciding how to handle null fields. This post therefore shares the considerations you should make when working with a dataset that contains nulls.

What is a null?

A null is the absence of a value in a data field within a dataset. The absence of data is very different to a zero, a new row or a space. Although these all look similar to an absence of data, each one is actually a value of some kind.

Nulls appear in datasets for many reasons, including:
- The result of mismatched fields in a Union
- Mismatched fields in a Left, Right or Full Outer Join
- No original data entry for that record but other data points for that record existing (ie other f