2020: Week 30

This challenge is from the mind of Carl.

I'm British. As a nation we are obsessed about the weather (almost as much as I'm obsessed about Prep). I was really hoping I was going to be able to create a fun incremental refreshing challenge about the weather but I failed to find a data source that has either a number or date in the initial load that Prep can reference for refreshing. Instead, I have come across a data source you can fully refresh and make relevant for yourself. 

The data source I have used for this challenge is from weather.com. We'd love to see you use your own city's data for this week challenge so the values will differ but the structure should stay the same!

Here's how I got the data source this week:
1. Go to weather.com 
2. Search for your city (or use London if you want to see some Precipitation percentages!)
3. Take the URL of the page the search returns. it should look something like this (with a lot of advertising my screenshots are trying to remove)
4. In a Google Sheet, enter the following formula in cell A2 (in cell A1 just put the word 'Data')
The formula '=importhtml' allows you to pick up the data from either tables of lists. weather.com has 3 different lists we are using for this week's data sets: 
  • List 1 - Next 24 Hours
  • List 2 - Next 5 Hours
  • List 3 - Next 5 Days
Create a tab of data with these names for the different list types.

Input

There are three Inputs as described above:
1. Next 24 Hours

2. Next 5 Hours

3. Next 5 Days

Remember you might have your own values but these are the three lists you should be using. 

Requirements

We want a data set that can allow us to analyse each data set but there is quite a lot of variance in the data sets therefore, we are building a data set that can be filtered to a single input source in Desktop rather than using all three at once:
  • Input the data (you can use my google sheet to check your own or my Excel)
  • Find a way to break up each cell of data into a separate value (this doesn't seem as easy as it should be)
  • Clean symbols and punctuation out of the data (again not so easy!)
  • Create consistent titles for each data set (these are manually entered) 
  • Union the data together
  • Output the data as a file

Output


One File:
6 Data Fields:
  • Forecast Type
  • Date or Time
  • Temperature
  • Max Temp
  • Min Temp
  • Precipitation Chance
14 Rows (15 incl. field headers)

Here is the Output file to let you check your structure.

After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter using #PreppinData and tagging @Datajedininja@JennyMartinDS14@JonathanAllenby & @TomProwse1

You can also post your solution on the brand new Tableau Forum where we have a Preppin Data community page. Post your solutions and ask questions if you need any help! 

Popular posts from this blog

2023: Week 1 The Data Source Bank

2023: Week 2 - International Bank Account Numbers

How to...Handle Free Text