Videogrep is a command line tool that searches through dialog in video files and makes supercuts based on what it finds. It will recognize .srt or .vtt subtitle tracks, or transcriptions that can be generated with vosk, pocketsphinx, and other tools.
What if I analyzed "GitHub Infocus" with videogrep?
This short post walks you through my first trial of videogrep: what I was able to produce and discover, and the fun I had along the way.
Note that I followed an excellent tutorial to run this experiment.
Get the video with yt-dlp
First, I want a local copy of the YouTube video https://youtu.be/awQ7LFxfXWE. yt-dlp offers many encoding options, so you can list them with the -F option and pick the one that best fits your needs; in our case, we'll simply take the default one.
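Here is a minimal sketch of that download step, written in Python so it can be scripted. The flags `-F`, `--write-auto-subs` and `-o` are existing yt-dlp options, but the exact command used for this post is not shown, so the flag selection and the output name `propelling_your_devops.mp4` are assumptions. The helper `build_ytdlp_command` is hypothetical and only builds the command; it does not run it.

```python
import shlex


def build_ytdlp_command(url, output_name, list_formats=False):
    """Return the yt-dlp argument list (hypothetical helper, nothing is executed)."""
    if list_formats:
        # -F only lists the available formats; nothing is downloaded.
        return ["yt-dlp", "-F", url]
    return [
        "yt-dlp",
        "--write-auto-subs",  # also fetch the auto-generated subtitles
        "-o", output_name,    # assumed output name, not confirmed by the post
        url,
    ]


cmd = build_ytdlp_command(
    "https://youtu.be/awQ7LFxfXWE", "propelling_your_devops.mp4"
)
# Print the command so it can be reviewed before running it,
# for instance with subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string avoids quoting problems if the URL or file name ever contains spaces.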
You are then ready for the next step: using videogrep.
Text analysis with ngrams
videogrep makes it possible (and super easy) to analyze the text of the downloaded .vtt subtitle files.
So, what are the trendiest groups of words (called n-grams) in the video? Let's find out!
The single-word analysis is not really interesting:
❯ videogrep --input propelling_your_devops.mp4.webm --ngrams 1 | head -10
to 449
and 352
that 347
you 323
the 322
we 306
a 255
of 251
so 167
is 157
2-grams reveal much more about the underlying intent of the video:
❯ videogrep --input propelling_your_devops.mp4.webm --ngrams 2 | head -7
want to 97
that we 61
you can 55
you know 54
going to 51
we have 45
we can 45
... which the 3-grams soon confirm:
❯ videogrep --input propelling_your_devops.mp4.webm --ngrams 3 | head -9
we want to 30
you want to 20
a lot of 19
want to make 19
make sure that 18
i'm going to 17
to make sure 17
we have a 16
i want to 13
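To make the idea behind these counts concrete, here is a minimal sketch of n-gram counting in plain Python. It is not videogrep's implementation: the function `count_ngrams`, the tokenizing regex and the sample sentence are all assumptions chosen for illustration, and a real run would first extract the text from the .vtt file.

```python
import re
from collections import Counter


def count_ngrams(text, n):
    """Count every run of n consecutive words in text (illustrative helper)."""
    # Lowercase and keep letters, digits and apostrophes, so "i'm" stays one word.
    words = re.findall(r"[a-z0-9']+", text.lower())
    # Zipping the word list with shifted copies of itself yields the n-word windows.
    windows = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(window) for window in windows)


# Made-up sample sentence, not taken from the video.
sample = "we want to ship and we want to make sure that you want to ship"
for gram, count in count_ngrams(sample, 3).most_common(3):
    print(gram, count)
```

With `n` set to 1 this reduces to a plain word count, which is why the first listing is dominated by words such as "to" and "and".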
Short analysis
With the help of n-grams, in less than a second we discover, just by grepping the text of the video, that:
"GitHub focuses its attention on what they want... and also on what you want to achieve... and make"
That first fact already tells us a lot.
It also highlights:
the inclusive approach, with a lot of "I" and "we"
... which is also a pretty engaging way to onboard us onto the product they are showcasing.
Cut & get shorts
Now, the fun part.
You have done a text analysis, but... wouldn't it be fun to watch a video of these grepped terms?
Spoiler alert: yes, it is (and it's easy).
These are called fragments. Let's get some of them.
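As a rough sketch of what finding a fragment involves, the snippet below looks for a phrase in a list of words that each carry a start and end time, and returns the time span of every match. It assumes word-level timestamps are available, and it is not videogrep's actual code: the function `find_fragments` and the timings in `timed_words` are invented for illustration.

```python
def find_fragments(timed_words, phrase):
    """Return (start, end) times for each occurrence of phrase.

    timed_words is a list of (word, start_seconds, end_seconds) tuples.
    """
    target = phrase.lower().split()
    if not target:
        return []
    words = [word.lower() for word, _, _ in timed_words]
    spans = []
    for i in range(len(words) - len(target) + 1):
        if words[i:i + len(target)] == target:
            start = timed_words[i][1]                  # start of the first word
            end = timed_words[i + len(target) - 1][2]  # end of the last word
            spans.append((start, end))
    return spans


# Invented timings, for illustration only.
timed_words = [
    ("we", 0.0, 0.2), ("want", 0.2, 0.5), ("to", 0.5, 0.6), ("ship", 0.6, 1.0),
    ("you", 1.0, 1.2), ("want", 1.2, 1.5), ("to", 1.5, 1.7),
]
print(find_fragments(timed_words, "want to"))
```

Each returned span is a candidate clip, and cutting those spans out of the video and joining them is what produces the supercut.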