A (WIP) back-link tracking / Union reach project.
May 17, 2018 22:28:34 GMT
Post by John Becket on May 17, 2018 22:28:34 GMT
Hi all,
So I've written a set of scripts (still a work in progress) that I believe will help us calculate our impact on the platform. They are all written in PHP and use the YouTube Data API directly as their source. I think I'm done with the main testing and it seems ready to go live, so if anyone has ideas for improvements, now would be a great time!
How it works is this: every 6 hours, a script triggers that runs multiple searches through the YouTube Data API.
Each search requests a maximum of 50 results, i.e. the first page only. This avoids the bloat YouTube tends to serve on the second and third pages, usually once it has run out of genuine keyword matches.
In my testing, when the script requests only the first page, the API returns only results that match the criteria exactly. If fewer than 50 matches are found, it does not pad the page with semi-related results to make up the count. This was key to honing the accuracy of the script's results.
The scripts search for the strings below, with each search ordered by one of the filters listed further down, asking the API for videos that match these criteria exactly.
This has worked with some degree of accuracy and success, as YouTube is, so far at least, returning results that have one of the strings in either their title or their description. In effect, the script is directly asking YouTube for any videos that have back-linked to the movement, though through a method that is a little unorthodox, I'll admit!
Each video returned by the search must contain one of the following back-links somewhere in its text:
"youtubersunion.freeforums.net"
"facebook.com/groups/youtuberunion"
"youtubersunion.org"
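As a rough illustration of the "must contain one of the following" rule, here is a minimal sketch in Python (the actual scripts are PHP, and the post does not say they re-check results locally, so treat this as an optional extra check). The function name `find_backlinks` is made up for the example.

```python
# Hypothetical helper: confirm that a video's title or description
# really contains one of the three back-link strings from the post.
BACKLINKS = (
    "youtubersunion.freeforums.net",
    "facebook.com/groups/youtuberunion",
    "youtubersunion.org",
)


def find_backlinks(title, description):
    """Return the back-link strings found in the title or description."""
    # Compare case-insensitively, since URLs are often typed in mixed case.
    text = f"{title}\n{description}".lower()
    return [link for link in BACKLINKS if link in text]


if __name__ == "__main__":
    print(find_backlinks("My channel update", "Join us: https://YouTubersUnion.org"))
```

A video for which this returns an empty list could be skipped before any channel lookup is made.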
Each string is searched multiple times, with a different filter applied each time, in an attempt to cover all the bases no matter how obscure the video, and to ensure any newly uploaded videos are caught as well. These filters are:
Date - Ordered by the 50 most recently created videos that match the search string.
Rating - Ordered by the 50 highest rated videos that match the search string.
Relevance - 50 videos sorted by YouTube, based on relevance to the search string.
Title - 50 videos sorted alphabetically by title, which is how the API defines this order (it does not rank by how closely the title matches the search string).
View Count - Ordered by the 50 highest viewed videos that match the search string.
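To make the 3 strings x 5 filters concrete, here is a minimal Python sketch (not the PHP scripts themselves) that builds the 15 request URLs for the API's `search.list` endpoint. The `part`, `type`, `q`, `order`, `maxResults` and `key` parameters are real API parameters; passing the back-link as a plain `q` value, the function name `build_search_urls` and the placeholder key are assumptions for illustration.

```python
from itertools import product
from urllib.parse import urlencode

SEARCH_URL = "https://www.googleapis.com/youtube/v3/search"

BACKLINKS = (
    "youtubersunion.freeforums.net",
    "facebook.com/groups/youtuberunion",
    "youtubersunion.org",
)

# API values for the Date, Rating, Relevance, Title and View Count filters.
ORDERS = ("date", "rating", "relevance", "title", "viewCount")


def build_search_urls(api_key):
    """Build one first-page search URL per (back-link, order) pair."""
    urls = []
    for query, order in product(BACKLINKS, ORDERS):
        params = {
            "part": "snippet",
            "type": "video",
            "q": query,          # assumption: the back-link is sent as-is
            "order": order,
            "maxResults": 50,    # first page only; no pageToken is ever sent
            "key": api_key,
        }
        urls.append(f"{SEARCH_URL}?{urlencode(params)}")
    return urls


if __name__ == "__main__":
    for url in build_search_urls("YOUR_API_KEY"):  # placeholder key
        print(url)
```

Because no `pageToken` is ever added, each of the 15 requests stays on the first page of results, as described above.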
For each of these 15 searches (3 strings x 5 filters), run every 6 hours, every video matched is then passed to a separate API request, which provides the ID of the YouTube channel that published it and the statistics for that channel!
So at this point the script has asked YouTube for any videos that have back-linked to the movement, the Channel ID that published each one, and the statistics for that channel.
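The video-to-channel step could look roughly like this in Python (a sketch, not the PHP scripts). It assumes the channel ID is read from `snippet.channelId` in each search result, which the API includes, and that statistics come from `channels.list` with `part=statistics`; the scripts may well fetch the ID differently. The function names and the cap of 50 IDs per request are assumptions.

```python
from urllib.parse import urlencode

CHANNELS_URL = "https://www.googleapis.com/youtube/v3/channels"


def channel_ids_from_search(search_response):
    """Collect unique channel IDs from a decoded search.list response."""
    seen = []
    for item in search_response.get("items", []):
        channel_id = item.get("snippet", {}).get("channelId")
        if channel_id and channel_id not in seen:
            seen.append(channel_id)
    return seen


def build_channels_url(channel_ids, api_key):
    """Build a channels.list URL asking for the statistics of the given IDs."""
    params = {
        "part": "statistics",
        # Assumption: keep each request to at most 50 comma-separated IDs.
        "id": ",".join(channel_ids[:50]),
        "key": api_key,
    }
    return f"{CHANNELS_URL}?{urlencode(params)}"


if __name__ == "__main__":
    sample = {"items": [{"snippet": {"channelId": "UC_example"}}]}  # made-up data
    print(build_channels_url(channel_ids_from_search(sample), "YOUR_API_KEY"))
```

De-duplicating IDs before the lookup means a channel with several matching videos is only requested once.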
A profile / database entry is then automatically generated, and the channel is added to the tally list, which can be found here - tubefrog.com/u/ytu.php
All the channels on this list are then included in the totals calculation. This gives us an automated system that tallies the actual reach of the movement: it demonstrates actual back-linking via these channels, and bases the calculation on these channels' subscriber counts as a semi-reliable measure of exposure to the back-links, and thus to the movement.
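The totals calculation itself is simple; a minimal Python sketch (again, not the PHP code) might be as follows. It assumes the tally list is held as a mapping from channel ID to that channel's `statistics` object, where the API returns `subscriberCount` as a string and sets `hiddenSubscriberCount` when a channel hides its count. The function name `total_reach` and the sample figures are made up.

```python
def total_reach(channels):
    """Sum subscriber counts across the tallied channels.

    `channels` maps channel ID -> statistics dict. Keying by channel ID
    means each channel is counted once, however many videos matched.
    """
    total = 0
    for stats in channels.values():
        if stats.get("hiddenSubscriberCount"):
            continue  # count is hidden, so it cannot be added
        total += int(stats.get("subscriberCount", 0))
    return total


if __name__ == "__main__":
    tally = {  # made-up figures, purely for illustration
        "UC_a": {"subscriberCount": "1200"},
        "UC_b": {"subscriberCount": "300", "hiddenSubscriberCount": False},
        "UC_c": {"hiddenSubscriberCount": True},
    }
    print(total_reach(tally))
```

Note that someone subscribed to two tallied channels is counted twice, which is one reason the figure is only a semi-reliable measure.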
Thoughts, ideas, suggestions?