Airbnb Scraping
Abstract
Inside Airbnb is a project by Murray Cox, a digital storyteller, who visualized Airbnb data scraped by author and coder Tom Slee. The website offers scraped Airbnb data for select cities around the world; historical data is also available. We were tasked with creating visualizations of listing data for Virginia and Austria to see what impact Airbnb was having on communities in each region. Virginia and Austria were chosen because some members of our team were familiar with Virginia and others with Austria. The eventual goal is to expand the analysis beyond these two regions, for example to the rest of the United States. Tom Slee has not maintained the data collection script since July 2019, so we needed to update it to collect recent data on Virginia and Austria. On inspecting the script, we found that it was no longer collecting as much data as it once had. This was almost certainly because the layout of Airbnb's website had changed over time, as websites commonly do. After working out how the script operated, we identified its problems and updated it for the new Airbnb website design. In doing so, we were able to collect more data than we had expected, such as calendar and review data. From there, we began our data collection. While the script was being fixed, our team built mock visualizations to be displayed on a website for easy viewing. Once data collection was complete, the data was fed into these visualizations. We visualized, among other things, how many listings a single host had and how many listings were in a given county. The main visualization was a map showing the location of every Airbnb listing. We also made maps to visualize availability, prices, and the number of reviews.
Further, we created pie charts and histograms to represent Superhosts, instantly bookable listings, and price distributions. We expect that the script and the collected and visualized data will be used both by future CS students working on subsequent iterations of the project and by our client, Dr. Zach.
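
As an illustration only, the per-host and per-county listing counts mentioned in the abstract could be computed along the following lines. This is a minimal sketch and not the project's actual code: the record layout, the field names (`listing_id`, `host_id`, `county`), the helper `count_by`, and the sample rows are all assumptions made up for the example.

```python
from collections import Counter

# Hypothetical listing records. The field names and values are assumptions
# for illustration, not the schema produced by the real scraping script.
listings = [
    {"listing_id": 1, "host_id": "h1", "county": "Montgomery"},
    {"listing_id": 2, "host_id": "h1", "county": "Montgomery"},
    {"listing_id": 3, "host_id": "h2", "county": "Fairfax"},
    {"listing_id": 4, "host_id": "h3", "county": "Montgomery"},
]


def count_by(records, key):
    """Count how many records share each value of the given field."""
    return Counter(record[key] for record in records)


# Listings per host and per county, the two aggregates named in the abstract.
listings_per_host = count_by(listings, "host_id")
listings_per_county = count_by(listings, "county")

# Hosts with more than one listing, sorted for a stable result.
multi_listing_hosts = sorted(h for h, n in listings_per_host.items() if n > 1)
```

Counts of this kind could then feed a bar chart or a county-level map, whichever plotting tool is used.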