2022: Week 12 - Gender Pay Gap

 Challenge by: Jenny Martin

I spent my International Women's Day being really fascinated by the Gender Pay Gap Bot on Twitter. I decided to look at the publicly available data that it was based on and found it a little confusing. It made me realise that the thing I enjoyed most about the Pay Gap Bot was that it made the insight from the data clear in a succinct manner. So let's do that for the historical data too!

Inputs

We're using the data currently available on the Gender Pay Gap Service from 2017 to 2022:

 There are 5 input files.

Requirements

  • Input the data
  • Combine the files
  • Keep only relevant fields
  • Extract the Report years from the file paths
  • Create a Year field based on the first year in the Report name
  • Some companies have changed names over the years. For each EmployerId, find the most recent report they submitted and apply this EmployerName across all reports they've submitted
  • Create a Pay Gap field to explain the pay gap in plain English
    • You may encounter floating point inaccuracies. Find out more about how to resolve them here
    • In this dataset, a positive DiffMedianHourlyPercent means the women's pay is lower than the men's pay, whilst a negative value indicates the other way around
    • The phrasing should be as follows:
      • In this organisation, women's median hourly pay is X% higher/lower than men's.
      • In this organisation, men's and women's median hourly pay is equal.
  • Output the data
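
For anyone working outside Tableau Prep, the trickier requirements above could be sketched in Python roughly as follows. This is only an illustration, not the official solution. It assumes each file path contains the report label in the form "2017 to 2018" (the real file names may differ), that rows are plain dictionaries keyed by the field names in the brief, and that rounding to two decimal places is enough to clear any floating point noise. The function names and the `places` parameter are made up for this example.

```python
import re
from decimal import Decimal


def report_from_path(path):
    """Pull a 'YYYY to YYYY' label out of a file path (assumed naming pattern)."""
    match = re.search(r"\d{4} to \d{4}", path)
    if match is None:
        raise ValueError(f"no report years found in {path!r}")
    return match.group(0)


def year_from_report(report):
    """Year is the first year in the Report name."""
    return int(report[:4])


def latest_names(rows):
    """Map each EmployerId to the EmployerName used on its most recent report."""
    latest = {}
    for row in rows:
        employer = row["EmployerId"]
        if employer not in latest or row["Year"] > latest[employer]["Year"]:
            latest[employer] = row
    return {employer: row["EmployerName"] for employer, row in latest.items()}


def pay_gap_sentence(diff, places=2):
    """Describe DiffMedianHourlyPercent in plain English.

    A positive value means women's pay is lower than men's.
    `places` is an assumed rounding precision, not part of the brief.
    """
    # Going through str() and Decimal avoids printing float artefacts
    # such as 0.30000000000000004.
    gap = Decimal(str(diff)).quantize(Decimal(10) ** -places)
    if gap == 0:
        return "In this organisation, men's and women's median hourly pay is equal."
    direction = "lower" if gap > 0 else "higher"
    size = format(abs(gap).normalize(), "f")  # drop trailing zeros
    return (
        f"In this organisation, women's median hourly pay is "
        f"{size}% {direction} than men's."
    )
```

Calling `pay_gap_sentence(-3.0)` with this sketch produces a sentence saying women's median hourly pay is 3% higher than men's, and the dictionary returned by `latest_names` can then be used to overwrite EmployerName on every report for that EmployerId.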

Output


  • 7 fields
    • Year
    • Report
    • EmployerName
    • EmployerId
    • EmployerSize
    • DiffMedianHourlyPercent
    • Pay Gap
  • 41,288 rows (41,289 including headers)

You can download the full output here

After you finish the challenge, make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja, @JennyMartinDS14 & @TomProwse1

You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help! 
