VTechWorks staff will be away for the Independence Day holiday from July 4-7. We will respond to email inquiries on Monday, July 8. Thank you for your patience.
 

YouTube Video Analysis

dc.contributor.authorBachubhay, Akhilen
dc.contributor.authorChhour, Dannyen
dc.contributor.authorDeng, Hejien
dc.contributor.authorTran, Trungen
dc.date.accessioned2021-05-12T14:50:34Zen
dc.date.available2021-05-12T14:50:34Zen
dc.date.issued2021-05-07en
dc.description.abstractYouTube (youtube.com) is an online video-sharing platform that allows users to upload, view, rate, share, add to playlists, report, comment on videos, and subscribe to other users. Over 2 billion logged-in users visit YouTube each month, and every day people watch over a billion hours of video and generate billions of views. UGC (User-Generated Content) makes up a good portion of the content available on YouTube, and more and more people post videos on YouTube, many of which become well-known YouTubers. A notable trend to look at for these YouTubers is how their channel grows over time. We were tasked with analyzing how certain YouTubers become successful over time, how their early videos differ from later ones in terms of scripts, and how comments change with fame. Such analysis requires us to look into two sets of data. The first set is numerical data of the channels, which consists of view counts of videos, likes and dislikes on videos, published dates of the video, the interactions between the video creator and the audience, etc. The second set is textual data, which consists of the auto-generated scripts from videos as well as comments from the users. With the help of YouTube APIs and other available helper tools, we are able to scrape the metadata from data of videos and output them as CSV files for future studies. For the analysis, we generate some scatter graphs where each dot stands for one instance of the video, where the x-axis represents the published date while the y-axis represents the views it gets, and then the color of the dot represents some other metrics for evaluation (for instance, the duration of videos). With the Python NLTK package, we are able to conduct analyses over the transcripts from the videos and comments, to see what words are spoken the most, what words appear frequently in the comments and if they are positive or negative, how many words the creator says in a minute, etc. Combining these data we can generate a more thorough scatter graph for discovering if there is a pattern on how certain YouTubers become more and more successful.en
dc.description.notesThis project was developed using data solely from one channel called Biffa Plays Indie Games as the basis, but it is expected to function correctly when used on other channels as well. The two versions of the final report are in YouTubeVideoAnalysisReport.docx (Word) and YouTubeVideoAnalysisReport.pdf (PDF). The two versions of the final presentation are in YouTubeVideoAnalysisPresentation.pptx (PowerPoint) and YouTubeVideoAnalysisPresentation.pdf (PDF).en
dc.identifier.urihttp://hdl.handle.net/10919/103255en
dc.language.isoen_USen
dc.publisherVirginia Techen
dc.rightsCC0 1.0 Universalen
dc.rights.urihttp://creativecommons.org/publicdomain/zero/1.0/en
dc.subjectCommentsen
dc.subjectData Analysisen
dc.subjectFrequency Counten
dc.subjectYouTubeen
dc.subjectSocial Mediaen
dc.subjectTranscriptsen
dc.subjectPythonen
dc.subjectPlotlyen
dc.subjectNLTKen
dc.subjectJupyter Notebooken
dc.subjectData Collectionen
dc.subjectWeb Scraperen
dc.titleYouTube Video Analysisen
dc.typePresentationen
dc.typeReporten

Files

Original bundle
Now showing 1 - 4 of 4
Loading...
Thumbnail Image
Name:
YouTubeVideoAnalysisPresentation.pdf
Size:
923.31 KB
Format:
Adobe Portable Document Format
Name:
YouTubeVideoAnalysisPresentation.pptx
Size:
3.31 MB
Format:
Microsoft Powerpoint XML
Name:
YouTubeVideoAnalysisReport.docx
Size:
4.01 MB
Format:
Microsoft Word XML
Loading...
Thumbnail Image
Name:
YouTubeVideoAnalysisReport.pdf
Size:
3.17 MB
Format:
Adobe Portable Document Format
License bundle
Now showing 1 - 1 of 1
Name:
license.txt
Size:
1.5 KB
Format:
Item-specific license agreed upon to submission
Description: