US State Tourism Websites

dc.contributor.author: Shere, Danya (en)
dc.contributor.author: Ayub, Ahmad (en)
dc.contributor.author: Mueller, Rebecca (en)
dc.contributor.author: Fabian, Lexi (en)
dc.contributor.author: Shah, Akshat (en)
dc.date.accessioned: 2020-05-14T03:32:52Z (en)
dc.date.available: 2020-05-14T03:32:52Z (en)
dc.date.issued: 2020-05-11 (en)
dc.description.abstract: In the United States, every state has a tourism website. These sites highlight the state's main attractions and offer travel tips, blog posts, and other relevant information. Funding for these websites often comes from occupancy taxes, a form of tax paid by tourists who stay in hotels and visit attractions. Current and past tourists therefore fund the efforts to draw future tourists to the state. Because state tourism is funded by the success of past tourism efforts, it is important for researchers to determine which efforts were successful and which were not. This in turn makes it important to identify trends in past tourism endeavors. By examining past versions of tourism websites, researchers can identify patterns in the information that changed from season to season and from year to year. These patterns can show which tourism efforts were deemed successful and can help guide future state tourism decisions. Our client, Dr. Florian Zach of the Howard Feiertag Department of Hospitality and Tourism Management, wants to use this historical analysis of state tourism information in his research on trends in state tourism website content. Iterations of the California state tourism website, among other sites, are stored as snapshots on the Internet Archive and can be accessed to see how websites changed over time. Our team was given Parquet files of these snapshots dating back to 2008. The goal of the project was to assist Dr. Zach by using the California state tourism website, visitcalifornia.com, and these snapshots as an avenue for exploring data extraction and visualization techniques for tourism patterns, which can later be extended to other states' tourism websites. Python's Pandas library was used to examine and extract relevant pieces of data from the given Parquet files.
Once the data was extracted, we used Python's Natural Language Toolkit (NLTK) to remove non-English words, punctuation, and a set of unimportant "stop words". With this refined data, we produced visualizations of word frequency in the headers and body of the website snapshots. The data was examined in its entirety as well as grouped by season and by year. Microsoft Excel functions were used to examine and visualize the data in these groupings. The data extraction and visualization techniques we became familiar with will be passed on to a future team. The research on state tourism site information can be extended to different metadata sets and to other states. (en)
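The word-frequency step described in the abstract can be illustrated with a minimal sketch. This is not the team's code: the snapshot records, the `STOP_WORDS` set, and the helper names `season_of`, `tokenize`, and `frequencies_by` are all hypothetical. The project read its data from Parquet files with Pandas and used NLTK's stop word list; the sketch substitutes a few in-memory records and a small hand-written stop word set so that it runs with the standard library alone. It also omits the removal of non-English words.

```python
import re
from collections import Counter, defaultdict
from datetime import date

# Hypothetical stand-in for a stop word list (the project used NLTK's).
STOP_WORDS = {"the", "a", "an", "and", "of", "in", "to", "for", "your", "is"}

# Hypothetical snapshot records: (capture date, page text).
# In the project these would be extracted from the Parquet files.
SNAPSHOTS = [
    (date(2008, 7, 1), "Plan your summer trip to the beaches of California!"),
    (date(2008, 12, 15), "Ski the Sierra: winter deals and snow reports."),
    (date(2009, 7, 4), "Summer road trips, beaches, and wine country."),
]


def season_of(d):
    """Map a date to a Northern Hemisphere meteorological season."""
    if d.month in (12, 1, 2):
        return "winter"
    if d.month in (3, 4, 5):
        return "spring"
    if d.month in (6, 7, 8):
        return "summer"
    return "fall"


def tokenize(text):
    """Lowercase the text, keep alphabetic runs only, and drop stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def frequencies_by(snapshots, key):
    """Count word frequencies per group, where key maps a date to a group."""
    counts = defaultdict(Counter)
    for captured, text in snapshots:
        counts[key(captured)].update(tokenize(text))
    return counts


by_year = frequencies_by(SNAPSHOTS, lambda d: d.year)
by_season = frequencies_by(SNAPSHOTS, season_of)
```

Each resulting `Counter` can be queried with `most_common(n)` to list the top words for a given year or season, which is the kind of table that could then be charted in a spreadsheet.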
dc.description.notes: USStateTourismReport.docx - Word document of the US State Tourism Report; USStateTourismReport.pdf - PDF of the US State Tourism Report; USStateTourismWebsitesPresentation.pdf - PDF of the US State Tourism Presentation; USStateTourismWebsitesPresentation.pptx - PPTX of the US State Tourism Presentation (en)
dc.description.sponsorship: NSF CMMI-1638207, CRISP Type 2/Collaborative Research: Coordinated, Behaviorally-Aware Recovery for Transportation and Power Disruptions (CBAR-tpd) (en)
dc.description.sponsorship: NSF IIS-1619028, Global Event and Trend Archive Research (GETAR) (en)
dc.description.sponsorship: Internet Archive (en)
dc.identifier.uri: http://hdl.handle.net/10919/98257 (en)
dc.publisher: Virginia Tech (en)
dc.rights: In Copyright (en)
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ (en)
dc.subject: Parquet (en)
dc.subject: Tourism (en)
dc.subject: State Tourism (en)
dc.subject: Data Extraction (en)
dc.subject: Visualization (en)
dc.subject: Natural Language Processing (en)
dc.subject: Pandas (en)
dc.title: US State Tourism Websites (en)
dc.type: Presentation (en)
dc.type: Report (en)

Files

Original bundle
Name: USStateTourismReport.docx
Size: 1.97 MB
Format: Microsoft Word XML

Name: USStateTourismReport.pdf
Size: 1.51 MB
Format: Adobe Portable Document Format

Name: USStateTourismWebsitesPresentation.pdf
Size: 600.65 KB
Format: Adobe Portable Document Format

Name: USStateTourismWebsitesPresentation.pptx
Size: 2.46 MB
Format: Microsoft Powerpoint XML

License bundle
Name: license.txt
Size: 1.5 KB
Format: Item-specific license agreed upon to submission