2022: Week 22 - Solution
Solution by Tom Prowse and you can download the workflow here.
This week we joined forces with a couple of other community initiatives (#GamesNightViz and #DataFamCon) to bring you a challenge all about Dungeons & Dragons. We want to look at how long each character speaks during each episode of the podcast 'Critical Role' so that we can create a visualisation from this.
Step 1 - Union Data
Step 2 - Timestamp Rank
Next we need to rank the timestamps in each episode so that we can identify the earliest to latest ones. We can use the Rank analytical calculation here, and we want to use Dense Rank to ensure that each line has an equal number if the time_in_secs is the same:
Step 3 - Compare Previous Timestamp
At this stage we want to remove all of the additional fields as well. We should have 3 fields remaining - Episode, Time_in_secs, and Next Line.
Now we have identified the next line, we can join these together using an inner join on Episode = Episode and Order = Next Line.
Now that we have the current and next line on the same line, we can calculate the difference between the two timestamps:
Duration
ZN([time_in_secs-1])-[time_in_secs]
At this stage our data should look like this:
Step 4 - Parse Names
Some of the character names have been separated with a comma, so we want to split these out and then reshape the data so that we have a row per character.
First we split the names at each comma using the custom split:
This creates 4 new fields that contain different character names.
To bring these into a single column and a row for each character, we can use a Columns to Rows pivot. In the setup of the pivot, we want to use the Wildcard Pivot to bring in all of the fields that contain the word 'Split'.
Then from here we can remove the Pivot Names field, then exclude any blank ('') values from the Name Split field. At this stage our table should look like this:
Step 5 - Filter & Deduplicate
After the aggregation we are ready to output our data ready to create the visualisation.