2019: Week 24

Previously on Preppin' Data... (I'm still a 24 fan.) A lot of Data Schoolers have been working through the challenges, and this week they wanted to post their own. So a big thank you goes out from Jonathan and me to Kamilla and Bona, of our fourteenth cohort at the Data School, for this challenge. They don't know whether it's too hard or not, so please let them know!

Both of the challengers this week love to use regex, so they wanted to give those who haven't used the language before the chance to explore it. If you want to stick with string calculations you can, but it might be a little tougher!

If you haven't used regex before then the team recommend you use https://regexr.com/ to help you get to grips with what is going on.

With all of that in mind, what is the challenge? Analysing messages from the Data School WhatsApp group. Don't worry, we are not sharing any insights into what the Data Schoolers think of their coaches (phew!), as they have replaced their actual messages with The Bards of Wales by Janos Arany, translated by Neville Masterman. The real data is who sent each message and when, plus an additional data source that lets you know whether the day was a weekday, a weekend day or a public holiday (what us Brits call a Bank Holiday).

Kamilla and Bona want the following questions answered:

  1. Who sent the most messages overall?
  2. Who sent the highest percentage of messages whilst at work?
    1. Working hours are 9am to 5pm, with a lunch break from 12pm to 1pm
  3. Who sent the longest message?
  4. Who has the highest number of words per message?
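
The working-hours rule in question 2 could be sketched like this in Python. This is only an illustration, not the challengers' solution: the function name `sent_during_work`, the `"Weekday"` label for the day type, and the choice to treat 12:00 and 17:00 exactly as outside working hours are all assumptions, since the post does not spell out the input values or the boundary behaviour.

```python
from datetime import datetime, time


def sent_during_work(sent_at: datetime, day_type: str) -> bool:
    """Return True if a message was sent in working hours on a weekday.

    Assumptions (not stated in the post): the day-type lookup uses the
    label "Weekday", and each working period includes its start time
    but excludes its end time.
    """
    if day_type != "Weekday":
        # Weekends and bank holidays never count as time at work.
        return False
    t = sent_at.time()
    morning = time(9, 0) <= t < time(12, 0)     # 9am up to the lunch break
    afternoon = time(13, 0) <= t < time(17, 0)  # after lunch until 5pm
    return morning or afternoon
```

A message sent at 12:30 on a weekday would therefore not count as sent at work, and neither would anything sent on a weekend or bank holiday.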


  • Use both input files found here
  • You're not allowed to use the split function (they're mean aren't they!)
  • Answer the questions above from one single output
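
If you are new to regex, here is a rough Python sketch of pulling the name, timestamp and message out of one chat line without the split function. The line layout (`12/06/2019, 09:15 - Name: message`) is an assumption for illustration only, so check it against the real input file; `LINE_RE`, `WORD_RE` and `parse_line` are hypothetical names.

```python
import re
from datetime import datetime

# Assumed layout of one exported line (check the real file):
#   "12/06/2019, 09:15 - Name: message text"
LINE_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4}), (?P<time>\d{2}:\d{2})"
    r" - (?P<name>[^:]+): (?P<text>.*)$"
)
# A "word" here is any run of non-whitespace characters.
WORD_RE = re.compile(r"\S+")


def parse_line(line):
    """Return a dict describing one message, or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    sent_at = datetime.strptime(f"{m['date']} {m['time']}", "%d/%m/%Y %H:%M")
    text = m["text"]
    return {
        "name": m["name"],
        "sent_at": sent_at,
        "words": len(WORD_RE.findall(text)),  # word count without split
        "chars": len(text),                   # message length in characters
    }
```

Counting matches of `\S+` stands in for splitting on spaces, which keeps to the spirit of the no-split rule. Whether "longest message" means characters or words is left open in the post, so the sketch returns both.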


  • 6 Columns
    • Name
    • Total Messages
    • Work Hours Messages
    • % sent whilst at work
    • Total words sent
    • Average words per message
  • 16 rows of data (17 rows with headers)
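
As a rough guide to how the six columns relate to each other, here is a small Python sketch of the final aggregation. It assumes each message has already been reduced to a (name, sent-in-work-hours flag, word count) tuple; the `summarise` function and that tuple shape are hypothetical, and the percentage is expressed on a 0 to 100 scale, which the post does not specify.

```python
from collections import defaultdict


def summarise(messages):
    """Aggregate (name, in_work_hours, word_count) tuples into one row per person."""
    totals = defaultdict(lambda: {"total": 0, "work": 0, "words": 0})
    for name, in_work_hours, word_count in messages:
        row = totals[name]
        row["total"] += 1
        row["work"] += int(in_work_hours)  # True counts as 1, False as 0
        row["words"] += word_count
    return [
        {
            "Name": name,
            "Total Messages": r["total"],
            "Work Hours Messages": r["work"],
            "% sent whilst at work": 100 * r["work"] / r["total"],
            "Total words sent": r["words"],
            "Average words per message": r["words"] / r["total"],
        }
        for name, r in sorted(totals.items())
    ]
```

Each person ends up with a single row, so the real data should give 16 rows if every sender appears once.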
For comparison, here are our output files. Don't forget to fill in our participation tracker!
