ScrapingGenAI

Date

2024-05-10

Abstract

AI has been in wide use for many years and is a constant front-page news topic. The recent, rapid development of generative AI has prompted many conversations, ranging from concerns to aspirations. Understanding how the topic develops and when people become more supportive of generative AI helps social scientists pinpoint which developments drive public discussion. Because the use of generative AI is relatively new, the data and insights gathered could also be used to judge whether commercial use (for example, in Travel/Hospitality) is viable and what public feedback might look like. We developed two specialized web scrapers. The first searches Reddit subreddits for specific keywords to gauge public opinion, and the second extracts discussions from corporate earnings calls to capture the business perspective. The collected data were then processed and analyzed using Python libraries, with visualizations created using Matplotlib, Pandas, and Tkinter that depict trends as line charts, pie charts, and bar charts. We limited our analysis to the period from August 2022 to March 2024. This window is significant because ChatGPT was released in November 2022, so it lets us observe notable changes around its release. These tools show changes in public interest and sentiment and provide a graphical representation of how perceptions of AI technologies shifted over time. The final product is designed for anyone interested in company transcripts and in comparing them with the public perspective. It gives users access to detailed data representations, including numerical trends and visual summaries, to help them understand the correlation between company and public views. This overview helps show how public and corporate sentiment towards AI shifted during a recent 20-month period. A significant hurdle was using PRAW (the Python Reddit API Wrapper) for Reddit data scraping. By reviewing documentation and tutorials, and with additional support from a teaching assistant, we implemented the functionality needed to extract and process subreddit data effectively. To make our findings more accessible and engaging, future work could turn this product into a fully functional website. Such a platform would make the insights more readily available to a wider audience, including the general public and industry stakeholders, and could enhance the impact and usefulness of our project.
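As a rough illustration of the keyword-trend step described in the abstract, the sketch below counts, per month, how many already-collected posts mention any keyword within the August 2022 to March 2024 window. It is a minimal sketch under stated assumptions, not the project's actual code: the keyword list, the `created_utc`/`text` field names, and the sample posts are all hypothetical, and no Reddit API calls are made.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical keyword list; the project's actual keywords are not stated here.
KEYWORDS = ("generative ai", "chatgpt")

# Analysis window from the abstract: August 2022 through March 2024.
WINDOW_START = datetime(2022, 8, 1, tzinfo=timezone.utc)
WINDOW_END = datetime(2024, 4, 1, tzinfo=timezone.utc)  # exclusive upper bound


def monthly_keyword_counts(posts, keywords=KEYWORDS):
    """Count posts per month (YYYY-MM) that mention any keyword.

    `posts` is an iterable of dicts with assumed fields `created_utc`
    (a Unix timestamp) and `text` (the post body or title).
    """
    counts = Counter()
    for post in posts:
        created = datetime.fromtimestamp(post["created_utc"], tz=timezone.utc)
        if not (WINDOW_START <= created < WINDOW_END):
            continue  # skip posts outside the analysis window
        text = post["text"].lower()
        if any(keyword in text for keyword in keywords):
            counts[created.strftime("%Y-%m")] += 1
    return dict(sorted(counts.items()))


def _ts(year, month, day):
    """Helper: Unix timestamp for a UTC date (used only for sample data)."""
    return datetime(year, month, day, tzinfo=timezone.utc).timestamp()


# Made-up sample posts for demonstration only.
sample_posts = [
    {"created_utc": _ts(2022, 7, 15), "text": "ChatGPT rumors"},  # before window
    {"created_utc": _ts(2022, 12, 1), "text": "Trying ChatGPT for trip planning"},
    {"created_utc": _ts(2022, 12, 20), "text": "Hotel review, nothing about AI"},
    {"created_utc": _ts(2023, 3, 5), "text": "Generative AI in hospitality"},
]

result = monthly_keyword_counts(sample_posts)
print(result)
```

The resulting month-to-count mapping is the kind of series that could feed a line or bar chart; substring matching is used here for simplicity and would need refinement for real sentiment work.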

Description

ScrapingGenAIpresentation.pdf: PDF file for the ScrapingGenAI presentation.
ScrapingGenAIpresentation.pptx: PowerPoint file for the ScrapingGenAI presentation.
ScrapingGenAIreport.docx: Word document for the ScrapingGenAI report.
ScrapingGenAIreport.pdf: PDF file for the ScrapingGenAI report.

Keywords

Travel, Web-Scraping, Reddit, MarketBeat
