2021: Week 5 - Dealing with Duplication

Challenge by: Jenny Martin

Have you ever been working with a dataset in Tableau Desktop and noticed some duplication occurring? Of course, this is something you can fix with some potentially tricky LODs or Table Calc filters, but wouldn't it be nicer for your dataset to be viz-ready before heading into Desktop?

If you attended the Tableau Fringe Festival last year, this concept may feel familiar, as I did a quick demo explaining why I, personally, would prefer to use Prep to solve my duplication issues. You can find the video here if you like.

Input

The dataset we'll be working with for this challenge follows the same theme as the Fringe Festival. We have information relating to which of our Clients are attending our training sessions. Also included in our dataset is which Account Managers look after which Clients. However, we have historical information about Account Ownership which is leading to duplication. So how can we fix it?


Requirements

If you're new to the technique of deduplicating data, then check out this blog post for some helpful thoughts about how to approach this challenge.
  • Input the data 
  • For each Client, work out who the most recent Account Manager is (help)
  • Filter the data so that only the most recent Account Manager remains (help)
    • Be careful not to lose any attendees from the training sessions!
  • In some instances, the Client ID has changed along with the Account Manager. Ensure only the most recent Client ID remains
  • Output the data
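
The challenge is meant to be solved in Tableau Prep, but the logic behind the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rows are made up, and it assumes the Client name stays stable even when the Client ID changes, so the Client name is used as the grouping key. The function name keep_most_recent is hypothetical.

```python
from datetime import date

# Hypothetical sample rows; they only mimic the seven fields of the real input.
rows = [
    {"Training": "Prep Basics", "Contact Email": "ann@acme.test", "Contact Name": "Ann",
     "Client": "Acme", "Client ID": 1, "Account Manager": "Old Manager", "From Date": date(2019, 1, 1)},
    {"Training": "Prep Basics", "Contact Email": "ann@acme.test", "Contact Name": "Ann",
     "Client": "Acme", "Client ID": 7, "Account Manager": "New Manager", "From Date": date(2020, 6, 1)},
    {"Training": "Prep Basics", "Contact Email": "bob@acme.test", "Contact Name": "Bob",
     "Client": "Acme", "Client ID": 1, "Account Manager": "Old Manager", "From Date": date(2019, 1, 1)},
    {"Training": "Prep Basics", "Contact Email": "bob@acme.test", "Contact Name": "Bob",
     "Client": "Acme", "Client ID": 7, "Account Manager": "New Manager", "From Date": date(2020, 6, 1)},
    {"Training": "Desktop 101", "Contact Email": "cy@globex.test", "Contact Name": "Cy",
     "Client": "Globex", "Client ID": 3, "Account Manager": "Only Manager", "From Date": date(2018, 3, 1)},
]


def keep_most_recent(rows):
    """Keep only rows whose From Date is the latest one seen for their Client."""
    # Step 1: work out the most recent From Date per Client.
    latest = {}
    for row in rows:
        client = row["Client"]
        if client not in latest or row["From Date"] > latest[client]:
            latest[client] = row["From Date"]
    # Step 2: filter to rows matching that date. Every attendee survives,
    # because each attendee has one row per period of account ownership.
    return [row for row in rows if row["From Date"] == latest[row["Client"]]]


deduplicated = keep_most_recent(rows)
```

Grouping by Client rather than Client ID is what removes the outdated Client ID together with the outdated Account Manager. In Prep, the same idea would typically be an aggregation to find the latest From Date per Client, joined back to the original rows.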

Output


  • 7 fields
    • Training
    • Contact Email
    • Contact Name
    • Client
    • Client ID
    • Account Manager
    • From Date
  • 13,528 rows (13,529 including headers)

The full output can be downloaded here.

After you finish the challenge, make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja, @JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 





