Browsing by Author "Liu, Han"
- Airbnb Scraping
  Yu, Wang; Huang, Baokun; Liu, Han; Pham, Vinh; Nikolov, Alexander (Virginia Tech, 2020-05-13)
  Inside Airbnb is a project by Murray Cox, a digital storyteller, who visualized Airbnb data that was scraped by author and coder Tom Slee. The website offers scraped Airbnb data for select cities around the world; historical data is also available. We were tasked with creating visualizations of listing data for Virginia and Austria to see what impact Airbnb was having on the communities in each region. We chose Virginia and Austria because some members of our team were familiar with Virginia and others with Austria. The eventual goal is to expand the analysis beyond these two regions, for example to the rest of the United States. Since July 2019, Tom Slee has abandoned the script used to collect the data. To collect data on Virginia and Austria, we needed to update the script so that it could gather more recent data. We began inspecting the script and found that it was not collecting as much data as it once had. This was almost certainly due to the layout of Airbnb's website changing over time, a common occurrence for websites. After working out how the script operated, we identified its various problems and updated it for the new Airbnb website design. In doing so, we were able to collect even more data than we thought possible, such as calendar and review data. From there, we began our data collection process. While the script was being fixed, our team built mock visualizations to be displayed on a website for easy viewing. Once data collection was complete, the data was transferred for use in these visualizations. We visualized many things, such as how many listings a single host had and how many listings were in a given county. The main visualization shows the locations of all Airbnb listings on a map. We also made maps to visualize availability, prices, and the number of reviews. Further, we created pie charts and histograms to represent Superhosts, instantly bookable listings, and price distributions. We expect that the script and the data collected and visualized will be used both by future CS students working on subsequent iterations of the project and by Dr. Zach himself, our client.
- Comparison of Computational Notebook Platforms for Interactive Visual Analytics: Case Study of Andromeda Implementations
  Liu, Han (Virginia Tech, 2022-09-22)
  Existing notebook platforms have different capabilities for supporting visual analytics use. It is not clear which platform to choose for implementing visual analytics notebooks. In this work, we investigated the problem using Andromeda, an interactive dimension reduction algorithm, and implemented it using three different notebook platforms: 1) Python-based Jupyter Notebook, 2) JavaScript-based Observable Notebook, and 3) Jupyter Notebook embedding both Python (data science use) and JavaScript (visual analytics use). We also made comparisons for all the notebook platforms via a case study based on metrics such as programming difficulty, notebook organization, interactive performance, and UI design choice. Furthermore, guidelines are provided for data scientists to choose one notebook platform for implementing their visual analytics notebooks in various situations. Laying the groundwork for future developers, advice is also given on architecting better notebook platforms.
- Front-End Kibana (FEK) CS5604 Fall 2019
  Powell, Edward; Liu, Han; Huang, Rong; Sun, Yanshen; Xu, Chao (Virginia Tech, 2020-01-13)
  During the last two decades, web search engines have been driven to new quality levels due to the continuous efforts made to optimize the effectiveness of information retrieval. More and more people are becoming satisfied during their information retrieval processes, and web search has gradually replaced older methods, where people obtained information from each other or from libraries. Information retrieval systems are in constant interaction with users and help users interpret and analyze data. Currently, we are building the front end of a search engine, where users can explore information related to Tobacco Settlement documents from the University of California, San Francisco, as well as the Electronic Theses and Dissertations (ETDs) of Virginia Tech (and possibly other sites). This submission introduces the current work of the front-end team to build a functional user interface, which is one of the key components of a larger project to build a state-of-the-art search engine for two large datasets. We also seek to understand how users search for data, and accordingly provide the users with more insight and utilities from the two datasets with the help of the visualization tool Kibana. Already, a search website, where users can explore the two datasets, Tobacco Settlement dataset and ETDs dataset, has been created. A series of functionalities of the searching page have been realized, for instance, the login system, searching, filter functions, a Q&A page, and a visualization page.