2024: Week 35 - Solution
Solution by Tom Prowse and you can download the workflow here.
Step 1 - Matchday Number
The first task is to create a row number for each row of our data. This is nice and easy in Tableau Prep as it has been automatically created for us within the Input step. However, by default, this is a hidden field so we need to go into the input, right click and unhide that field.
We can then create a Matchday number for each of the different weeks across the season. First we want to extract each of the matchday numbers from the Matchday 1 of 38 field using this calculation:
Matchday
TRIM(
IF CONTAINS([Matchday 1 of 38],'Matchday')
THEN RIGHT([Matchday 1 of 38],8)
END
)
This will extract the 'n of 38' piece of the string, then we can split on a ' ' (space) to extract the first number from the string to give us the matchday:
Matchday
SPLIT([Matchday],' ',1)
As a result the table should now look like this:
Because there is only 1 row per matchday with this information a lot of nulls have been created as a result. We can use the Fill Down functionality to fill in these null values with the correct Matchday number:
This fills in all of the matchdays apart for number 1, and we can use another calculated field to complete the field:
Matchday
IFNULL([Calculation1],1)
At this stage we should have a Matchday field which is complete with each of the matchday numbers for the whole season:
Step 2 - Matchday Information
Now we have a matchday for each row, we can start to tidy the information up so that we can use it for analysis.
First, we can filter to remove all of the rows from the Matchday 1 of 38 field that contains the information about the matchday number:
Then we can pivot the table so that all of the match information is within a single column. For this we use a columns to rows pivot where we include the two fields that contain the match information:
From here we can exclude any null values from Pivot Values and remove the Pivot Names field so the table looks like this:
Step 3 - Extract Matchday Information
We now want to extract the information about each of the matches from the string. For this we first need to replace the \n with a '||' so that we can easily split the info field:
Details
REPLACE([Pivot1 Values],char(10),'||')
We can then extract the following details:
Date
DATE(SPLIT([Details],'|',3))
This gives us the date of the match.
For the other fields we can use the split functionality to extract the information.
To extract the information about the Home and Away teams we can split on ' ' (double space). This will give us a column for Home and another one for Away:
We can remove the Split 1, then rename Split 2 to Home and Split 3 to Away.
From here we want to split the Home field using '||' for all values to extract the required information:
Split 1 = Home Score
Split 2 - Remove
Split 3 = Home Team
Split 4 - Remove
Split 5 - Remove
You can then repeat the same process but for the Away field and we should end up with a table that looks like this:
You can view the output here.
After you finish the challenge make sure to fill in the participation tracker, then share your solution on Twitter or LinkedIn using #PreppinData and tagging @Datajedininja, @JennyMartinDS14 & @TomProwse1
You can also post your solution on the Tableau Forum where we have a Preppin' Data community page. Post your solutions and ask questions if you need any help!