oai:vtechworks.lib.vt.edu:10919/77613 (datestamp: 2023-11-29T16:40:18Z)
Bologna-Jill, Stephen
author
Duong, Kevin
author
Ha, Jason Yongjoo
author
Zurita, Jazmine
author
Sume, Tinsaye
author
Smith, Ryan
author
2017-04-28
This project provides users with a means to organize, graph, and analyze specific data recorded from Stroubles Creek. The website will be utilized by an Undergraduate Biological Systems Engineering class to help with their labs that deal with the health of Stroubles Creek.
Our team was tasked with improving a website created by a past Computer Science capstone team. The website we started with was barely functional and could not yet be used by the Undergraduate Biological Systems Engineering class; it required many modifications to both the front-end interface and the back end. Our team split into three two-person groups based on skill and desired learning objectives: a back-end team, a front-end user-interface team, and a data-graphing team.
The main front-end improvements include a complete overhaul of the user interface and the addition of a navigation bar that gives users easy access to every feature of the website.
Back-end improvements include major changes to the tables in the MySQL database, as well as PHP functions that make the database easy for the data-graphing team to use. The changes to the database tables allow a more straightforward representation of the data and enable saving graphs for a specific experiment.
Most of the improvements were to the data-graphing side of the website. Users can now analyze six years of data collected from Stroubles Creek by creating line graphs or scatter plots of whichever creek measurements they choose. The graphs let users see trends in creek health over the course of many years.
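The site itself is built on PHP and MySQL; as a rough Python illustration of the kind of line or scatter plot described, with an invented file name and column names:

    import matplotlib.pyplot as plt
    import pandas as pd

    # Hypothetical CSV export of the creek database: a date column plus
    # one measurement column (both names are made up for this sketch).
    df = pd.read_csv("stroubles_creek.csv", parse_dates=["date"])

    fig, ax = plt.subplots()
    ax.plot(df["date"], df["turbidity"], label="Turbidity")   # line graph
    # ax.scatter(df["date"], df["turbidity"])                 # or a scatter plot
    ax.set_xlabel("Date")
    ax.set_ylabel("Turbidity (NTU)")
    ax.legend()
    plt.show()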
Currently, the website is ready to be used by Undergraduate Biological Systems Engineering classes. It provides all the functionality that our client required and does so in a clean, easy-to-use manner.
Even though the website is ready for use, there are still areas that can be improved, including more graph options, easier ways to upload new datasets, graphing large numbers of data points, and the aesthetics of the graphs.
http://hdl.handle.net/10919/77613
Stroubles Creek
Experiments
website
graph
dataset
Fusality for Stream and Field
oai:vtechworks.lib.vt.edu:10919/70937 (datestamp: 2023-11-29T16:40:19Z)
DeYoung, Tyler
author
Kahn, Amanda
author
Darivemula, Deepika
author
Russell, John
author
2016-05-04
In MUS 3065, Computer Music and Multimedia, students learn to use the Max programming environment to compose interactive digital music. This project aims to assist these students with video tutorials on the most useful aspects of Max, reinforcing core concepts and methods that Dr. Nichols teaches in class. The selected topics are the coll object, additive synthesis, audio modulation, quad-speaker spatialization, timing in Max, and Max basics. Each video pairs a screen recording of the Max environment with a voice-over explaining what the narrator is doing and why.
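One of those topics, additive synthesis, builds a tone by summing sine partials; the numpy sketch below illustrates only the concept, since the tutorials do everything inside Max:

    import numpy as np

    SR = 44100                                    # sample rate in Hz
    t = np.linspace(0.0, 1.0, SR, endpoint=False) # one second of samples

    # Sum sine partials at multiples of a 220 Hz fundamental, each with a
    # 1/k amplitude rolloff: the textbook additive-synthesis recipe.
    partials = [(220.0 * k, 1.0 / k) for k in range(1, 6)]
    tone = sum(a * np.sin(2 * np.pi * f * t) for f, a in partials)
    tone /= np.abs(tone).max()                    # normalize to [-1, 1]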
http://hdl.handle.net/10919/70937
Max
computer music
music
MIDI
audio
signal processing
video tutorial
Max Video Tutorials
oai:vtechworks.lib.vt.edu:10919/107069 (datestamp: 2023-11-29T16:40:19Z)
Bansal, Sparsh
author
Agarwal, Aditya
author
2021-12-15
The project is about analyzing and visualizing metadata of tourism websites of three states (Virginia, Colorado, and California) from 1998 to 2018.
Each state in the United States has its own tourism website that serves as a resource to attract new visitors. These sites usually feature the state's great attractions, travel tips and facts, blog posts, and reviews from people who have been there. Examining past tourism websites for patterns can suggest what attracted potential customers and what worked and what didn't; those patterns, in turn, support better-informed decisions about the future of state tourism. We use historical analysis of past government tourism websites to further support research on content and traffic trends on these sites.
The various iterations of each state's tourism website are preserved as snapshots in the Internet Archive. Our team was given Parquet files holding snapshots of the California, Colorado, and Virginia tourism websites dating back to 1998. We used a combination of Python's Pandas library and Beautiful Soup to examine the Parquet files and to extract, from each snapshot, the meta tags the website used as of that date. With this data we plotted the presence of all the variations of a state's tourism website in chronological order, which made it possible to analyze the addition and removal of keywords and to see other changes, such as the use of phrases, capitalization, keywords in languages other than English, and the updating of keywords to follow internet trends. This led us to conclude that meta tags play a very important role in a website's search-engine ranking, and that such analysis should keep the website's primary user base in mind.
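A minimal sketch of that extraction step, assuming each Parquet row holds one snapshot with its capture date and raw HTML in columns named crawl_date and content (the real schema may differ):

    import pandas as pd
    from bs4 import BeautifulSoup

    df = pd.read_parquet("virginia_snapshots.parquet")   # assumed filename

    records = []
    for _, row in df.iterrows():
        soup = BeautifulSoup(row["content"], "html.parser")
        tag = soup.find("meta", attrs={"name": "keywords"})
        if tag and tag.get("content"):
            records.append({"date": row["crawl_date"], "keywords": tag["content"]})

    # Chronological history of the keywords meta tag across snapshots.
    history = pd.DataFrame(records).sort_values("date")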
http://hdl.handle.net/10919/107069
tourism
tourism websites
Virginia tourism
Colorado tourism
California tourism
keyword analysis
meta tags
plotly.dash
Tourism Websites
oai:vtechworks.lib.vt.edu:10919/103307 (datestamp: 2023-11-29T16:40:20Z)
Imondo, Daniel
author
2021-05-03
This project aims to create an easily browsable interface with a collection of NDJSON formatted tweets from Twitter.
In the Fall of 2020, four teams in CS5604 built a system to manage three of the University's archive collections: ETDs, webpages, and tweets. This system included a webpage front end to serve these collections to users, as well as a feature for researchers and curators to manage data using a KnowledgeGraph and Apache Airflow.
The single front-end team developing this website had a very large task and, as such, was unable to fully flesh out all of its features. In particular, the tweets portion of the website lacked advanced search functionality and a clear, interactive user interface. My project focused on extending the tweets functionality of this website, and it accomplished that goal.
My project features a GUI where users can search the tweet collection and page through the results. In my implementation I used React, CSS, and ElasticSearch; the website that contains it also uses Docker, Flask, Kubernetes, and Python 3.6. The search fields are text, location, and a range search between two dates. When a query is run, results are displayed to the user five at a time. Each tweet result shows both the information contained within the tweet itself (i.e., username, display name, tweet text, date, favorites, replies, and retweets) and data on the user who published it (i.e., total favorites, total posts, total followers, and a link to the source tweet). If a tweet contains hashtags, each one is linked to a search on Twitter for that hashtag.
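A sketch of such a query with the official elasticsearch Python client (8.x style); the index name, field names, and page size are assumptions rather than the project's actual schema:

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")   # assumed cluster address

    # Combine the three search fields: free text, location, and a date range.
    query = {
        "bool": {
            "must": [{"match": {"text": "vaccine"}}],
            "filter": [
                {"match": {"location": "Virginia"}},
                {"range": {"date": {"gte": "2020-01-01", "lte": "2020-12-31"}}},
            ],
        }
    }

    page = 0   # results shown five at a time, as on the site
    resp = es.search(index="tweets", query=query, from_=page * 5, size=5)
    for hit in resp["hits"]["hits"]:
        print(hit["_source"]["username"], hit["_source"]["text"])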
This project can be used to browse an archive of tweets. It will be useful in querying tweets for research, such as searching for all tweets made about a subject that were posted from a certain location at a certain time.
http://hdl.handle.net/10919/103307
Twitter
Tweets
Library
ElasticSearch
React
Indexing
Library Tweet Support
oai:vtechworks.lib.vt.edu:10919/117118 (datestamp: 2023-12-08T22:00:29Z)
Padath, Mathew
author
Wang, Wenmiao
author
Jiang, Westin
author
McGovern, Ryan
author
Wan, Yifei
author
2023-11-26
The objective of this innovative project was to create an automated web application for the assessment and scoring of computer science-related short answers. This solution directly addresses the often labor-intensive and time-consuming process of manually grading written responses, a challenge that educators across various academic disciplines frequently encounter. The developed web application stands out not just for its efficiency but also for its versatility, being applicable to a wide range of subjects beyond computer science, provided that appropriate teacher answer files are supplied.
At the heart of the application lies a user-friendly interface created using ReactJS. This frontend allows educators to seamlessly upload 'teacher' and 'student' files in .tsv format. Following the upload, the application's backend, developed using Flask, takes over. It processes these submissions by comparing student responses against predefined model answers. The scoring mechanism of the application is particularly noteworthy. It employs an advanced semantic analysis approach, utilizing a pre-existing deep learning model, RoBERTa Large. This model is integral to the AutoGrader class, which is responsible for the semantic evaluation of the text.
The grading logic embedded within the AutoGrader class is both innovative and sophisticated. It assesses student responses by breaking them down into phrases and then computing the semantic similarity between each phrase and the concepts outlined in the model answers. The process employs SentenceTransformer to generate text embeddings, allowing for a nuanced evaluation based on cosine similarity between vector representations. This method ensures a grading system that transcends simple keyword matching, delving into the semantic content and understanding of the student answers.
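As a stripped-down sketch of that phrase-versus-concept scoring with the sentence-transformers library: the checkpoint named below is one available RoBERTa-large SentenceTransformer, and the example texts, threshold, and scoring rule are invented for illustration rather than taken from the AutoGrader:

    from sentence_transformers import SentenceTransformer, util

    # "all-roberta-large-v1" is one RoBERTa-large SentenceTransformer
    # checkpoint; the report does not name the exact model file used.
    model = SentenceTransformer("all-roberta-large-v1")

    concepts = ["a stack is last-in first-out", "push adds to the top"]  # teacher concepts
    phrases = ["elements leave a stack in reverse order of arrival"]     # student answer, split

    emb_c = model.encode(concepts, convert_to_tensor=True)
    emb_p = model.encode(phrases, convert_to_tensor=True)

    # Cosine similarity of every student phrase against every concept;
    # credit each concept whose best-matching phrase clears a threshold.
    sims = util.cos_sim(emb_p, emb_c)                 # shape: (phrases, concepts)
    credited = (sims.max(dim=0).values > 0.6).float()
    print(f"fraction of concepts matched: {credited.mean().item():.2f}")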
The application offers several features that enhance the user experience and give educators comprehensive insight into student performance. Scores and grades are displayed directly in the web application, and detailed Grade Reports can be downloaded that list each question, the student's response, the grade awarded, and the model answer. The application also lets users view previous submissions and download historical documents such as past versions of the 'teacher file', 'student file', and grade reports.
In terms of future development, the project team has outlined several ambitious goals. These include implementing a dataset-driven strategy for enhancing the training of deep learning models, thereby significantly advancing the current framework. Another focus will be on allowing for a variety of file types to be uploaded for both teacher and student files, thereby increasing the accessibility and usability of the system. Lastly, there are plans to update the functionality and appearance of the web application, incorporating features such as scrolling, standardized formatting, and improved design elements to enhance the overall user experience.
The project was developed with the invaluable guidance and support of Dr. Mohamed Farag, a research associate at the Center for Sustainable Mobility at Virginia Tech. Dr. Farag's expertise in computer science and his commitment to educational innovation have been instrumental in steering the project towards success.
In conclusion, this project marks a significant advancement in the field of educational technology, particularly in the realm of academic grading. By leveraging the power of artificial intelligence and modern web technologies, it provides an efficient, reliable, and versatile tool for educators, streamlining the grading process and offering a scalable solution adaptable to various academic contexts. The future developments outlined promise to further enhance the capabilities of this already impressive tool, pointing towards a new era in academic assessment.
https://hdl.handle.net/10919/117118
Automated Students' short answers assessment
oai:vtechworks.lib.vt.edu:10919/23285 (datestamp: 2023-11-29T16:40:21Z)
Katz, Ben
author
Hotinger, Eric
author
2013-06-30
Virginia Tech has many groups engaged in work related to the environment. In an effort to alleviate server strain for the Virginia Water Resources Research Center (VWRRC), we have begun to archive over 300 PDF documents into VTechWorks. This will make more than five decades of Virginia Tech’s water research more searchable and accessible than ever before. This permanent archive supports searching and browsing by issue date, author, title, subject, series, and more. It may lead to other efforts in support of the College of Natural Resources and Environment.
http://hdl.handle.net/10919/23285
Water
links
vwrrc
pdf conversion
jsoup
opencloud
tag cloud
html parsing
resources
CS4624: Environment - Virginia Water Resources Research Center (VWRRC) PDF Documents to VTechWorks
oai:vtechworks.lib.vt.edu:10919/18696 (datestamp: 2023-11-29T16:40:23Z)
Fesseha, ZeleAlem
author
2012-05-10
The goal of this project was to collect YouTube videos for carefully selected events. The videos were manually collected and verified to be relevant to the specific events. The collection, together with the short description included with each video, can later be used to automate the process of collecting videos pertaining to past disasters. We hope that the sample video collection presented here can help build a successful model relating metadata to video relevance.
http://hdl.handle.net/10919/18696
Disaster Video Gallery Project
oai:vtechworks.lib.vt.edu:10919/77620 (datestamp: 2023-11-29T16:40:23Z)
Manchester, Emma
author
Srinivasan, Ravi
author
Crenshaw, Sean
author
Masterson, Alec
author
Grinnan, Harrison
author
2017-04-28
Global Event and Trend Archive Research (GETAR) is a research project at Virginia Tech, studying the years from 1997 to 2020, which seeks to investigate and catalog events as they happen in support of future research. It will devise interactive and integrated digital library and archive systems coupled with linked and expert-curated web page and tweet collections. This historical record enables research on trends as history develops and captures valuable primary sources that would otherwise not be archived. An important capability of this project is the ability to predict which sources and stories will be most important in the future in order to prioritize those stories for archiving. It is in that space that our project will be most important.
In support of GETAR, this project will build a powerful tool to scrape the news to identify important global events. It will generate seeds that contain relevant information like a link, the topic, person, organization, source, etc. The seeds can then be used by others working on GETAR to collect webpages and tweets using tools like the Event Focused Crawler and Twitter Search. To achieve this goal, the Global Event Detector (GED) will crawl Reddit to determine possible important news stories. These stories will be grouped, and the top groupings will be displayed on a website as well as a display in Torgersen Hall.
This project will serve future research for the GETAR project, as well as those seeking real time updates on events currently trending.
The final deliverables discussed in this report include the code that scrapes Reddit and processes the data, and the webpage that visualizes the data.
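As a rough sketch of that crawl-and-extract step, the snippet below pulls headlines with the praw Reddit client and tags named entities with NLTK (the report also lists Stanford NER); the credentials and subreddit are placeholders:

    import praw
    import nltk

    # One-time model downloads for tokenizing, tagging, and NE chunking.
    for pkg in ("punkt", "averaged_perceptron_tagger", "maxent_ne_chunker", "words"):
        nltk.download(pkg)

    reddit = praw.Reddit(client_id="ID", client_secret="SECRET",
                         user_agent="ged-demo")   # placeholder credentials

    # Tag named entities in top headlines: raw material for seeds
    # (link, topic, person, organization, source).
    for post in reddit.subreddit("worldnews").hot(limit=10):
        tree = nltk.ne_chunk(nltk.pos_tag(nltk.word_tokenize(post.title)))
        entities = [" ".join(word for word, _ in subtree.leaves())
                    for subtree in tree.subtrees() if subtree.label() != "S"]
        print(post.url, entities)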
http://hdl.handle.net/10919/77620
GETAR
CS 4624
Global Event Detector
D3.js
SNER
NLTK
Cluster
News
Reddit
digital library
webpage
tweet
Global Event Crawler and Seed Generator for GETAR
oai:vtechworks.lib.vt.edu:10919/70936 (datestamp: 2023-11-29T16:40:24Z)
Kafley, Somn
author
Steele, Derek
author
Singh, Samyak
author
2016-05-04
The IDEAL Climate Change submission to VTechWorks contains five types of files: the Final Project Report, the presentation, one Python script for URL extraction, and two Python scripts for indexing to Solr.
The Final Project Report is provided in two formats, a Word document and a PDF. It covers all work done for the project in seven chapters: User's Manual, Design & Requirements, Developer's Manual, Prototype & Refinement, Testing, Future Work, and Lessons Learned. In addition to these chapters, the well-organized report includes tables, figures, acknowledgements, a bibliography, and tables of contents, tables, and figures.
The presentation is provided in two formats, PowerPoint and PDF. It covers the objective of the project, a high-level system map, functionality descriptions with screenshots of features, and a description of testing and future work. Its purpose is to present the overall progress and accomplishments of the project to interested parties, so it favors visual aids and figures over text to keep the audience engaged.
The first Python script extracts URLs from the raw tweet data. The second and third Python scripts index tweets and webpages to Solr. All three are well commented, for readability and to help future developers and other parties interested in extending the project.
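A condensed sketch of those two steps, assuming a plain-text file of raw tweets and a local Solr core named climate (both assumptions):

    import re
    import pysolr

    URL_RE = re.compile(r"https?://\S+")
    solr = pysolr.Solr("http://localhost:8983/solr/climate", always_commit=True)

    # Extract URLs from each raw tweet line, then index id/text/urls to Solr.
    docs = []
    with open("tweets.txt", encoding="utf-8") as fh:
        for i, line in enumerate(fh):
            docs.append({"id": str(i),
                         "text": line.strip(),
                         "urls": URL_RE.findall(line)})
    solr.add(docs)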
http://hdl.handle.net/10919/70936
IDEAL
Climate Change
SOLR
Searchable IDEAL Climate Change Collections
oai:vtechworks.lib.vt.edu:10919/70946 (datestamp: 2023-11-29T16:40:25Z)
Downs, Jonathan
author
Pant, Yash
author
2016-04-26
In 2014, the Ebola virus, a deadly disease with a fatality rate of about 50 percent, spread throughout several countries. This marked the largest outbreak of the Ebola virus ever recorded. Our client is gathering data on the Ebola virus to create the largest database of information ever made for this disease. The purpose of our project is to verify our client’s data, to ensure the integrity and accuracy of the database.
The main requirements for this project are to locate multiple sources of data to be used for verification, to parse and standardize multiple types of sources to verify the data in our client’s database, and to deliver the results to our client in an easy-to-interpret manner. Additionally, a key requirement is to provide the client with a generic script that can be run to validate any data in the database, given a CSV file. This will allow our client to continue validating data in the future.
The design for this project revolves around two major elements: the structure of the existing database of Ebola information and the structure of the incoming validation files. The existing database is stored as RDF in Turtle syntax, which links the various data values as subject-predicate-object triples. The incoming data is in CSV format, the form in which most Ebola data is published. Our design revolves around normalizing the incoming validation source data against the database content, so that the two datasets can be properly compared.
After the datasets are standardized, the data can be compared directly. The project encountered several challenges here, ranging from data incompatibility to inconsistent formatting within the database. Data incompatibility appears, for example, when the validation data matches the database's date range but the exact days of data collection vary slightly. Inconsistent formatting shows up in naming conventions and in how dates are stored (e.g., 9/8/2014 vs. 2014-09-08). These issues were the main hindrance in our project, and each was addressed before the project could be considered complete.
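Date normalization of that kind can be as simple as parsing both observed formats into one canonical form; a minimal sketch covering exactly the two formats quoted above:

    from datetime import datetime

    def normalize_date(value):
        """Map the two formats seen in the data (9/8/2014 and 2014-09-08)
        onto a single ISO representation for comparison."""
        for fmt in ("%m/%d/%Y", "%Y-%m-%d"):
            try:
                return datetime.strptime(value, fmt).date().isoformat()
            except ValueError:
                continue
        raise ValueError(f"unrecognized date: {value}")

    assert normalize_date("9/8/2014") == normalize_date("2014-09-08")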
After all data was converted, standardized, and compared, the results were produced and formatted in a CSV file to be given to our client. The results are given individually, for each time the script is run, so if the user runs the script for 4 different datasets over 4 different sessions, there will be 4 different result files.
The second main goal of our project, a generic script that lets the user validate data on their own, uses all of the previously mentioned design elements: parsing RDF and CSV files, standardizing data, and printing results to a CSV file. The script builds a GUI on top of these elements, providing a validation tool that users can employ independently.
http://hdl.handle.net/10919/70946
database
validation
GUI
dataset
RDF
Ebola
Python
Ebola RDF Database Validator
oai:vtechworks.lib.vt.edu:10919/115025 (datestamp: 2023-11-29T16:40:26Z)
DiPerna, Vincent
author
Choi, Sungjeon
author
Blair, Michael
author
Kamran, Safa
author
2023-04-27
Covid-19 is a respiratory illness that drove people around the world into isolation. People wanted real-time updates and information about the virus: which regions were affected and to what degree, how to avoid catching it, and what cures existed. Social media became a popular platform for sharing information, news, and opinions about the virus, but for all the accurate information spread there, just as much misinformation, if not more, can spread on the same platforms.
Misinformation is harmful because it can directly affect the health of the individuals who fall victim to it. For example, suppose a Twitter user tweets medical advice about Covid-19 and people who see the tweet choose to follow it. Now suppose the author was intentionally spreading false information, advice that is the opposite of what one should actually do. Everyone who followed the troll's medical advice has put their own health at risk, along with that of anyone in their sphere of influence.
Our aim is to understand the types of misinformation spread on social media and to help people identify misinformation about Covid-19 on Twitter. We do this by extracting relevant information, such as the content of a tweet (the tweet itself) and its author, and labeling whether the tweet carries true or fabricated information. We then train and test an AI model to make that distinction automatically, and categorize each tweet by the topic it was trying to spread information about. Our end goal is to integrate the preprocessing script and the AI model into a website that shows the analysis of the tweets: a user submits a Covid-19-related tweet and receives its classification, and can also download a Web ARChive (WARC) file of the archived tweet. Overall, we believe the combination of these tasks will help users identify misinformation related to Covid-19.
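As a toy baseline for the classification step (not the team's actual model; the example tweets and labels are invented), a TF-IDF plus logistic-regression pipeline in scikit-learn looks like this:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Toy stand-in for the labeled tweet data (1 = misinformation, 0 = real).
    tweets = ["masks do nothing, drink bleach instead",
              "CDC updates guidance on booster shots",
              "5G towers spread the virus",
              "county reports 120 new confirmed cases"]
    labels = [1, 0, 1, 0]

    clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
    clf.fit(tweets, labels)
    print(clf.predict(["new study shows vaccine reduces hospitalization"]))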
http://hdl.handle.net/10919/115025
COVID 19
Fake News
Fake News Detection
Python
Machine Learning
TWARC
MySQL
Data Processing
Text Classifier
Tweets
Twitter
Covid-19 Fake News Detection
oai:vtechworks.lib.vt.edu:10919/52869 (datestamp: 2023-11-29T16:40:28Z)
Cobb, Jack
author
Ahuja, Harjas
author
Shapiro, David
author
2015-05-14
For the CS4624 class, our clients wanted to record and document the atmosphere and events of THATCamp VA 2015, held at Virginia Tech. We gathered extra footage of the camp events, as well as interviews with camp attendees about their views of the camp and of digital humanities in general.
http://hdl.handle.net/10919/52869
THATCamp
THATCampVA
Unconference
Documentary
Digital Humanities
Humanities
THATCamp Documentary
oai:vtechworks.lib.vt.edu:10919/115009 (datestamp: 2023-11-29T16:40:28Z)
Toms, Devin
author
Sabanov, Daniel
author
Grzybowski, Kurt
author
Lao, Jason
author
2023-05-10
For this project, our team was tasked with creating an AI for an immersive role-playing game that contains a modular ability system. The AI must be able to interact with the ability system effectively. It must also be fair, and as such it is bound by the same constraints as the players: it has the same abilities, follows the same rules, and operates only on information that is also available to the players. Additionally, the AI must be capable of acting as an adversary or an ally to the player.
We implemented a basic routine-based approach to solve the problem. A "Black Box" function deduces a general course of action, such as using an attack, defensive, or healing ability. A routine then decides on the best way to achieve that goal and adds specific actions to an action queue. We also added ability tags that the AI uses to understand what purpose each ability serves. With this approach, we hoped to create an AI that is both challenging and immersive to play with or against.
We performed manual testing by inserting our AI into various pre-defined scenarios, and we documented our code with comments that explain its functionality. Our deliverable is mainly contained within the files "BlackBox.cs" and "State.cs", with small additions to other files. Because the game is still in development, much of its functionality is incomplete; we worked around missing pieces, such as unfinished character classes and abilities, by making assumptions about the client's final vision. The Black Box and the routines will therefore need to be modified as new functionality is introduced into the game, in order to make better use of it.
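The deliverable itself is C# for Unity ("BlackBox.cs", "State.cs"); purely to illustrate the tag-plus-action-queue idea, here is a toy Python sketch in which the ability names and decision rule are invented:

    from collections import deque

    # Abilities carry tags so the AI can tell what purpose each one serves.
    ABILITIES = [
        {"name": "Fireball", "tags": {"attack"}},
        {"name": "Barrier",  "tags": {"defense"}},
        {"name": "Mend",     "tags": {"healing"}},
    ]

    def black_box(own_hp, enemy_hp):
        # Decide a general course of action using only information
        # a player could also see (fairness constraint).
        if own_hp < 30:
            return "healing"
        return "attack" if enemy_hp > 0 else "defense"

    def plan(own_hp, enemy_hp):
        # The routine turns the black-box goal into queued concrete actions.
        goal = black_box(own_hp, enemy_hp)
        return deque(a["name"] for a in ABILITIES if goal in a["tags"])

    print(plan(own_hp=20, enemy_hp=80))   # deque(['Mend'])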
http://hdl.handle.net/10919/115009
video game
AI
NPC
game design
role playing game
decision optimization
Unity
Role Playing Game AI System
oai:vtechworks.lib.vt.edu:10919/110055 (datestamp: 2023-11-29T16:40:29Z)
Lyman, Matthew
author
Hudson, Matthew
author
Bishop, Cory
author
2022-05-11
Our project assists the SeaQL Lab of Virginia Tech's Department of Fisheries and Wildlife Conservation. Working with the Marine Management Organisation of the UK, the Lab is developing an autonomous drone swarm that can fly predetermined routes around the Chagos Archipelago and send alerts about potential poaching boats, based on machine-learning image analysis running on the drones' attached computing modules. The main goal is to protect the sharks and the ecosystem of those waters while decreasing the time, money, and effort the local Coast Guard spends on regular monitoring: instead of patrols, the drones send detection alerts to a remote server monitored by a ranger whenever they spot a potential poaching boat. Our report details our contributions to the overall project.
Our team took responsibility for several smaller tasks integral to the overall project. First, we familiarized ourselves with the Robot Operating System (ROS) in order to connect, calibrate, test, and record video using the provided cameras. ROS will control much of the drones' added functionality, such as running the poaching-boat detection algorithm, sending flight commands to the drones, and streaming video over a cellular connection. Next, we aided the larger project team in repairing an off-the-shelf drone for potential flight testing; after unsuccessful troubleshooting, we moved on to help finish construction of the primary hexacopter. Finally, we wrote a script to start the 4G cellular connection automatically when a drone powers on.
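For a flavor of the ROS side, here is a minimal Python (rospy, ROS 1) node that subscribes to a camera stream; the topic name is an assumption, and the real system runs detection rather than just logging:

    import rospy
    from sensor_msgs.msg import Image

    def on_frame(msg):
        # In the real pipeline, each frame would be handed to the
        # boat-detection model here.
        rospy.loginfo("frame %dx%d received", msg.width, msg.height)

    # Minimal node in the spirit of the camera test-and-record step above.
    rospy.init_node("camera_listener")
    rospy.Subscriber("/camera/image_raw", Image, on_frame)  # assumed topic
    rospy.spin()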
The AntiPoachingDroneControlReport details this work amidst the larger project goals of the SeaQL Lab. The AntiPoachingDroneControlPresentation gives a brief summary of our project work and the lessons learned. This was presented to our CS4624: Multimedia, Hypertext, and Information Access class to summarize our project work and experiences.
http://hdl.handle.net/10919/110055
poaching
drone
autonomous
shark
Chagos
conservation
image analysis
AI
Jetson Nano
ROS
Robotic Operating System
hexacopter
Anti-Poaching Drone Control
oai:vtechworks.lib.vt.edu:10919/117110 (datestamp: 2023-12-07T22:00:55Z)
Nguyen, Anthony
author
DiGiovanna, Luke
author
Siegel, John
author
Lin, Alex
author
2023-11-30
SEM (Scanning Electron Microscopy) is a powerful imaging technique used in many scientific domains, including materials science, biology, and nanotechnology. Researchers use SEM to obtain high-resolution images of specimen surface morphology and topography, a precise glimpse of structures at the nanoscale. SEM images reveal intricate surface details, allowing scientists to investigate the texture, shape, and size of particles, cells, or materials with great accuracy. Today, manual segmentation of SEM images is an important stage in the analysis process: researchers painstakingly outline and label regions of interest within images, such as specific structures or particles, tracing object boundaries in software tools built for image processing and analysis.
Because of the time and effort manual segmentation requires, we built a gamified multiplayer online application that lets individual contributors segment an SEM image together in real time. One important goal was to engage the next generation of scientists and researchers with a demonstration at the Virginia Tech Science Festival in November 2023.
To give participants fast feedback, we designed a technique that scores a given segmentation against a reference segmentation of the same SEM image. This let individuals and groups measure their performance while also adding a gaming element.
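The report does not spell out the scoring formula; one plausible form of such a comparison score is intersection-over-union between a player's mask and the reference mask, sketched here with numpy:

    import numpy as np

    def overlap_score(player_mask, reference_mask):
        """Intersection-over-union between a player's segmentation and the
        reference segmentation; one plausible comparison score."""
        player = player_mask.astype(bool)
        ref = reference_mask.astype(bool)
        inter = np.logical_and(player, ref).sum()
        union = np.logical_or(player, ref).sum()
        return inter / union if union else 1.0

    # Two offset square masks as a quick demonstration.
    a = np.zeros((64, 64), int); a[10:40, 10:40] = 1
    b = np.zeros((64, 64), int); b[15:45, 15:45] = 1
    print(f"score: {overlap_score(a, b):.2f}")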
Thanks to this project, we now have a comprehensive understanding of how to create a full-stack application. We learned how to leverage Amazon Web Services, such as EC2, to scale our website's backend infrastructure, and through JavaScript frameworks and packages such as NextJS, Socket.io, and ThreeJS we created an intuitive user interface for group manual segmentation.
https://hdl.handle.net/10919/117110
SEM
Scanning Electron Microscopy
SEM Segmenting
Image Segmentation
Web Application
Web Development
Virginia Tech Science Festival
Painter Canvas App
Behind Density Lines: An Interface to Manually Segment Scanning Electron Microscopy Images
oai:vtechworks.lib.vt.edu:10919/83216 (datestamp: 2023-11-29T16:40:30Z)
Woodson, Tianna
author
Simmons, Gabriel
author
Park, Peter
author
Doan, Tomy
author
Keys, Evan
author
2018-05-02
To understand and track emerging trends in school violence, there is no better resource than the current population. Sixty-eight million Americans have a Twitter account, and with the help of the GETAR (Global Event and Trend Archive Research) project we created datasets of tweets related to ten school shooting events, along with the URLs of news headlines about the same shootings. Our job is to use both datasets to develop visualizations that may reveal emerging trends.
Based on the data available, we came up with a few ideas, such as word clouds, maps, and timelines. The goal was to choose representations that would provide insight into the changing conversation America was having about gun violence. After successfully creating these visuals, we shifted our focus to cleaning our data.
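One of those visuals, a word cloud, takes only a few lines with the wordcloud Python package; the input file of concatenated tweet text is an assumption:

    from wordcloud import WordCloud
    import matplotlib.pyplot as plt

    # Hypothetical concatenation of tweet text for one shooting event.
    text = open("event_tweets.txt", encoding="utf-8").read()

    cloud = WordCloud(width=800, height=400, background_color="white").generate(text)
    plt.imshow(cloud, interpolation="bilinear")
    plt.axis("off")
    plt.show()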
http://hdl.handle.net/10919/83216
Tweets
Twitter
Visualizations
Data Collection
Data Mining
Multimedia
Hypertext
Information Access
URL
Sentiment
Visual Displays of School Shooting Data
oai:vtechworks.lib.vt.edu:10919/115094 (datestamp: 2023-11-29T16:40:31Z)
Shaffer, Zachary
author
Macht, Henry
author
Shirazi, Adrian
author
Campbell, Mitchell
author
2023-05-17
A model of a realistic marine environment is needed to train a rugged onboard optical sensor designed by Cell Matrix Corporation, a VTCRC COgro member (i.e., a small company in Virginia Tech's Corporate Research Center), in a project led by Dr. Peter Athanas, an ECE professor at Virginia Tech. The model is built in Blender, a free and open 3D modeling and rendering tool. The chosen environment is the intracoastal waters of the Palm Beach Inlet in Florida, between the Port of Palm Beach and the Inlet, approaching the Inlet from the south side of Peanut Island; this active inlet and port area provides the scene for the Blender model.
To build an accurate representation of the specified area, we construct a terrain model of the Palm Beach Inlet water area from the Port of Palm Beach to the Inlet, including where the Intracoastal Waterway channel meets the Inlet channel south of Peanut Island. This covers the surrounding islands and land masses, bridges, and large structures. Several classes of boats are modeled (yachts, sailboats, mega-yachts, cargo ships, fishing boats, and other boats commonly found in the area) to represent different situations. Distinct-looking classes of boats are needed to train the marine sensor to recognize them, so for each class we create, or find and customize, a model. The team was provided with the trajectories of individual boats traveling this area, from AIS ship-tracking data published by the US Coast Guard. To simulate these realistic situations, we wrote a Blender script that moves boats along these AIS tracks.
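The core of such a script is keyframing each boat's location along its track; a minimal bpy sketch, meant to run inside Blender, with an invented object name and made-up coordinates:

    import bpy

    # AIS track as (frame, x, y) waypoints; real tracks come from the
    # Coast Guard data mentioned above (these coordinates are invented).
    track = [(1, 0.0, 0.0), (60, 25.0, 4.0), (120, 50.0, 12.0)]

    boat = bpy.data.objects["Sailboat"]   # assumed object name in the .blend file
    for frame, x, y in track:
        boat.location = (x, y, 0.0)       # keep the hull at water level
        boat.keyframe_insert(data_path="location", frame=frame)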
The renders we created from our Blender project represent the Palm Beach Inlet water area and will hopefully serve as a useful resource for AI model training.
http://hdl.handle.net/10919/115094
Blender
Animation
Marine Environment
Model
Palm Beach Inlet
Marine Blender
oai:vtechworks.lib.vt.edu:10919/22043 (datestamp: 2023-11-29T16:40:32Z)
Stevens, Kyle
author
Zhou, Yin
author
Komarov, Val
author
2013-05-15
Contemplative practices are a collection of techniques focused on the betterment of the individual through self-improvement. They take many forms, such as Tai Chi, Qigong, Yoga, Mindfulness, and Meditation.
The goal of the project was to develop a documentary on the Contemplative Practices Conference held at the Virginia Tech Conference Center on April 11-13. The extent of our work was to generate the content for this documentary, edit and publish the interviews, and hand off the content for further development at a later date. So far, the videos have been edited and published on the Contemplative Video YouTube channel, linked below, and our professor, Dr. Fox, is in possession of all the content generated by our work, including the raw footage.
The YouTube channel featuring our work can be found here: http://www.youtube.com/user/ContempVideo
http://hdl.handle.net/10919/22043
contemplative
conference
interviews
video
Contemplative Video
oai:vtechworks.lib.vt.edu:10919/47936 (datestamp: 2023-11-29T16:40:33Z)
Pinsirikul, Johanna
author
Sims, Taylor
author
Horn, Joshua
author
Fisher, Samantha
author
2014-05-09
This report includes a detailed description of Dr. Ewing's Russian Flu project, which integrated two groups: a Translation team and a CS team. The Translation team found and translated historical articles documenting the Russian Flu pandemic of 1889-1890, in French, German, Spanish, English, and Russian. The CS team indexed the article metadata into a searchable website that lets a user search for articles, view a list of matching results, and see the specifics of a given article.
The website was implemented using Solr, an open-source search platform, and Blacklight, a Ruby on Rails gem. Solr's search features include facets, relevance definitions, spell checking, and synonyms; Blacklight integrates readily with Solr, displaying the search features and results in a user-friendly format. Blacklight was configured with faceted search, a drop-down menu search, a full-text search box, and a spell checker. The facets defined for the Russian Flu project are Newspaper Title, Infection Location, Reporting Location, Language, Date, and Keywords. A user can specify facets, execute a search, or combine the two; a list of corresponding results is then displayed, and the metadata for each article can be viewed as a single document by clicking its Newspaper Title link.
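For illustration, a faceted Solr query like the site's can be issued from Python with pysolr; the core URL and field names below are assumptions:

    import pysolr

    solr = pysolr.Solr("http://localhost:8983/solr/russianflu")   # assumed core

    # Full-text search for "influenza", faceted by language as on the site.
    results = solr.search("influenza", **{
        "facet": "true",
        "facet.field": "language",
    })
    for doc in results:
        print(doc.get("newspaper_title"), doc.get("date"))
    # Flat [value, count, ...] list of facet counts returned by Solr.
    print(results.facets["facet_fields"]["language"])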
The website also includes multimedia resources tracking the Russian Flu pandemic. An interactive timeline depicts a world map with the flu outbreak (and spread) at different time intervals. Another multimedia resource is the Google Earth 1889 Russian Flu map overlay. This interactive Google Map allows a user to see all continents and the flu’s impact through an 1889 world map.
The resulting website is: http://russianflu.lib.vt.edu
http://hdl.handle.net/10919/47936
russian flu
influenza
translation
pandemic
solr
blacklight
final report
midterm presentation
final presentation
Russian Flu Capstone Project
oai:vtechworks.lib.vt.edu:10919/112911 (datestamp: 2023-11-29T16:40:34Z)
Kweon, Seungchan
author
2022-12-14
Antibiotic resistance is a critical aspect of human health that must be tracked constantly so that patients receive effective medication. However, the sheer number of papers published makes it very difficult for a human to keep up. This project therefore aims to build an automated way to collect and parse antibiotic resistance papers and to extract meaningful information from them.
The project was divided into four main parts. First, antibiotic resistance genes were gathered to be used as search queries. Second, PubMed papers were gathered using those genes. Third, DeepEventMine was used to extract information from the papers. Last, the extracted information was analyzed. The data-gathering process worked well and retrieved many relevant papers. DeepEventMine ran successfully, but most of its output was not very useful. The analysis, mostly statistical, gave some useful results.
As for the final deliverable, the code used for data collection works well and can be reused in other projects that work with medical papers. Detailed information on how to work with DeepEventMine can be found in this report. More work remains in the analysis section to produce further information.
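The PubMed-gathering step can be done with Biopython's Entrez module, roughly as below; the email address is a placeholder (NCBI requires one), and mecA is just one well-known resistance gene used as an example:

    from Bio import Entrez

    Entrez.email = "you@example.com"   # NCBI requires a contact address

    def pubmed_ids(gene, retmax=20):
        """Search PubMed for papers mentioning an antibiotic resistance gene."""
        handle = Entrez.esearch(db="pubmed",
                                term=f"{gene} antibiotic resistance",
                                retmax=retmax)
        record = Entrez.read(handle)
        handle.close()
        return record["IdList"]

    print(pubmed_ids("mecA"))   # mecA: a methicillin-resistance gene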
http://hdl.handle.net/10919/112911
antibiotic resistance
data mining
event extraction
Literature Mining of Antibiotic Resistance
oai:vtechworks.lib.vt.edu:10919/83201 (datestamp: 2023-11-29T16:40:35Z)
D'Avella, Michael T.
author
Imperial, Ian A.
author
Saad, Nabil
author
Park, Jay Y.
author
Mussie, Nathan T.
author
2018-05-01
This semester, Spring 2018, in CS 4624, Multimedia, Hypertext, and Information Access, our team pursued a project with the Virginia Tech Psychology Department and its Diversity Committee. We were tasked with enhancing the Diversity and Inclusion webpages offered on the Psychology Department's website. The clients wanted us to create new webpages that would enhance the recruitment of potential students and faculty, raise awareness of the Department's Diversity and Inclusion endeavors among current students and faculty, and allow users to easily and conveniently use the Department's Diversity and Inclusion resources.
We developed a new website for the Department that included eleven distinct pages with content related to Diversity and Inclusion. The information for the pages was gathered from the InclusiveVT Inclusion and Diversity Strategic Planning Guide [4], provided to us by our clients. We gathered a library of multimedia content for use on the website from the Department. We performed user walkthroughs and distributed a survey to members of the Department in order to gauge user satisfaction and areas requiring improvement. This survey also allowed users to submit additional multimedia content for our use.
The Department had also struggled with how difficult its current webpages were to edit. The clients wanted to be able to edit their webpages more easily, uploading news articles and new Department events, and asked us to keep this in mind as we developed the new webpages.
We utilized WordPress to allow for a user-friendly environment for editing the webpages in the future. We made all of the clients administrators for the website so that they can make future changes and add other individuals to become administrators. We also created video tutorials that detail the step-by-step instructions to edit each portion of the website. These seven videos will allow for the clients and anyone with administrative access in the Department to be able to easily edit content without having to spend time learning the technology on their own.
They were also interested in modernizing the entire Psychology Department website. If they find our design preferable to the current one, they will proceed to convert the whole site. We kept this in mind as we worked, considering how the entire Department website could be converted and the effort that would involve for them or for anyone who undertakes that task.
http://hdl.handle.net/10919/83201
https://www.diversity.psyc.vt.edu/
Diversity
Inclusion
Website
WordPress
Psychology
InclusiveVT
Website Psychology Diversity
oai:vtechworks.lib.vt.edu:10919/90655 (datestamp: 2023-11-29T16:40:43Z)
Hopkins, James
author
Sherman, Brendan
author
Smith, Zachery
author
Wynn, Eric
author
2019-05-12
Our project yielded a modular, open-source course on machine learning in Python. It was built under the advisement of our client, Amirsina Torfi. It is designed to introduce users to machine learning topics in an engaging and approachable way. The initial release version of the project includes a section for core machine learning concepts, supervised learning, unsupervised learning, and deep learning. Within each section, there are 2-5 modules focused on specific topics in machine learning, including accompanying example code for users to practice with.
Users are expected to move through the course section by section, completing all of the modules in a section, reading the documentation, and executing the supplied sample code. We chose this modular approach to better guide users on where to start, on the assumption that those who begin with a machine-learning overview and the basics will likely be more satisfied with what they learn than if they jumped into a deep topic immediately. Alternatively, users can start at their own level by skipping topics they already feel comfortable with.
The two main components of the project are the course website and the GitHub repository. The course uses reStructuredText for all of its documentation, so we can employ Sphinx to generate a fully functioning website from our repository. Both the website and the repository are publicly available for viewing and for suggesting changes; this design facilitates collaboration in the open-source environment, keeping the course up to date and accurate.
http://hdl.handle.net/10919/90655
Machine learning
Python
open source
education
Python4ML: An open-source course for everyone
oai:vtechworks.lib.vt.edu:10919/117115 (datestamp: 2023-12-08T22:01:14Z)
Georgiev, Alexander
author
McKelway, Bailey
author
Aneja, Rahul
author
Holmes, Claire
author
Zemui, Mahlet
author
2023-11-30
In the world of music education, inspiring students to maintain consistent and effective practice routines has long been a challenge. Recognizing this dilemma, our clients embarked on a journey to leverage technology and reimagine the practice experience for budding musicians. The result of their efforts was a music application that aims to revolutionize the way musicians approach their practice routines by addressing both convenience and motivation.
The application offers users a number of features that enhance their practice sessions, monitor their progress, and make the entire experience more engaging. Users can plan practice sessions and start them with a built-in timer that tracks practice duration. The application also presents visual representations of progress on a daily, weekly, monthly, and overall basis through diverse graphs, each highlighting a distinct aspect of the user's practice habits. Additionally, a journal-like feature lets users explore and reflect on their musical journey, drawing insights from past practice sessions.
To support these core functionalities, our project integrates Firebase for user authentication and backend data storage with React Native for cross-platform compatibility in the frontend. This pairing supports communication between the backend and the frontend, exchanging user data to meet the clients' requirements. Notably, we organize user information in the backend into specific collections for swift reading and writing as users engage with the application. Reflecting on this semester-long project, overcoming challenges and seizing opportunities gave us invaluable experience in client collaboration and in implementing diverse solutions; given the iterative nature of application development, the existing features will still need refinement, and new ones will need to be added, in future development.
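The report does not give the exact schema; as one illustration of "specific collections" for per-user practice data, here is a sketch with Firebase's Python Admin SDK (the app itself uses the React Native client, and the collection names and fields are invented):

    import firebase_admin
    from firebase_admin import credentials, firestore

    # Server-side illustration only; the key file path is a placeholder.
    cred = credentials.Certificate("service-account.json")
    firebase_admin.initialize_app(cred)
    db = firestore.client()

    # One document per user, with a subcollection of practice sessions.
    session = {"date": "2023-11-30", "minutes": 45, "piece": "Clair de Lune"}
    db.collection("users").document("uid123") \
      .collection("sessions").add(session)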
https://hdl.handle.net/10919/117115
Expo
React Native
Firebase
Music
Practice
Mobile Application
Practice 10k: Music App
oai:vtechworks.lib.vt.edu:10919/47953 (datestamp: 2023-11-29T16:40:54Z)
Henslee, Devin
author
Falatko, Jackie
author
2014-05-11
Our client, Christine Link-Owens, is the president of the non-profit Giles Animal Rescue, which helps homeless, neglected, and abused pets in Giles County. The organization has a Drupal website, www.GilesAnimalRescue.org, that several groups of students have worked on, but it still needed bug fixes and feature expansion. Our goal was to make all the changes she asked for and to create a website content-management environment that is easy to maintain, along with thorough instructions for common tasks.
Many of our changes to the site addressed text-formatting issues and added images to pages, but they also include:
• Updated the version of Drupal to resolve security issues
• Added pictures to many of the pages to showcase work of Giles Animal Rescue and its volunteers
• Added the ability to enlarge images by clicking on them
• Enabled gallery formatting to create image slide shows
• Changed settings in Drupal to make adding and editing content easy, by creating tailored content types with specific fields
• Resolved text formatting issues by making Filtered HTML default on all text content boxes
• Accessed newsletter subscribers
• Created a separate tab for Giles County Animal Shelter for information about the shelter
Additionally, we migrated hosting from GoDaddy to Bluehost. While updating the Drupal core on GoDaddy, we experienced many technical difficulties, including limited control over our files and difficult menus for tasks such as backups. We moved to Bluehost after finding an organization, GrassRoots, that offers free web hosting through Bluehost. Bluehost has a very intuitive cPanel that displays all the options on one page, making the site easy to navigate and manage.
Overall, our client has been very pleased with our changes and is excited about the future of her website. We feel that we have put forth our best effort over the course of this semester. We are proud to have assisted such a great organization that helps save unfortunate animals. We hope that our changes will aid them with their mission.
http://hdl.handle.net/10919/47953
Website
Drupal
Content Management
Intuitive
CS 4624
Capstone
Hypertext
Multimedia
BlueHost
GoDaddy
Giles Animal Rescue
Giles County Animal Rescue
CS4624
Website Redesign Project: Creating Intuitive Content Management
oai:vtechworks.lib.vt.edu:10919/18658 (datestamp: 2020-09-29T19:47:14Z)
Dockery, Brandon
author
Battersby, Kevin
author
Steele, Christopher
author
2012-05-02
This project is a prototype and proposal for how to present media records (particularly video) in VTechWorks. It must be placed on a web server to run, with its file structure preserved. It provides an AJAX prototype of VTechWorks that demonstrates a number of suggested interface changes, including an HTML5 video player, lightbox display of images, and asynchronous loading of data to prevent unnecessary page loads.
http://hdl.handle.net/10919/18658
vtechworks
video
html5
javascript
prototype
VTechWorks Interface Proposal and Prototype for Media Records
oai:vtechworks.lib.vt.edu:10919/18675 (datestamp: 2022-03-29T18:37:36Z)
White, Aleksi
author
Dancy, Zac
author
2012-05-07
The website for the Catawba Sustainability Center (CSC) was in its infancy, and it needed to be expanded with descriptions of onsite land demonstrations, showcases for student and faculty projects, and spotlights of the businesses on site. The lead content director for the site is Christy Gabbard, and the head of website development is Joe Gabbard.
http://hdl.handle.net/10919/18675
catawba
Sustainability
center
csc
website
Catawba Multimedia Website
oai:vtechworks.lib.vt.edu:10919/70931 (datestamp: 2023-11-29T16:40:55Z)
Biondi, Luciano
author
Walker, Omavi
author
Yeshiwas, Dagmawi
author
2016-05-01
This report (including code) explains the implementation of the IDEALvr application and analyzes the methodology used to create the system. The implementation brings 3D visualization to an Android application using the Unity game engine and the Google Cardboard 3D viewer. The analysis includes gathering metadata from the IDEAL database and creating new methods of analysis for pre-collected datasets, as well as weighing the limitations and advantages of using Google Cardboard and Android to develop the user experience. The results show that the data can be interpreted more interactively using virtual reality.
Our clients Mohammed Magdy and Dr. Edward A. Fox served as helpful resources in creating the requirements for this project and providing feedback on a weekly basis to change or continue on development paths. The requirements were outlined as (1) visualize the IDEAL collections in a meaningful way, (2) create a working interface for exploring the collections, and (3) create an application that allows users to draw conclusions from IDEAL data.
The goal of this project was to provide an easy-to-use interface that lets researchers visualize collections in a way that increases understanding and supports conclusions without reading through massive collections of tweets. We created a hierarchical structure that lets users filter collections by category, and we developed a frequent-word analyzer that parses the collections and generates a list of the most frequent words in each collection, which is also visualized in the project. The virtual-reality user interface allows full immersion and interaction with as many IDEAL collections as the user desires.
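At its core, a frequent-word analyzer reduces to tokenizing and counting; a minimal sketch, with the collection filename assumed:

    import re
    from collections import Counter

    # Tokenize a tweet-collection file and count word frequencies; the
    # top words would feed the in-VR word-cloud visualization.
    text = open("collection.txt", encoding="utf-8").read().lower()
    words = re.findall(r"[a-z']+", text)
    print(Counter(words).most_common(25))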
The IDEALvr project was created with Unity and developed to work on Android devices that are supported by Google Cardboard. Work on a web application was attempted and issues are documented in the included files. Future work and plans are also included in the attached reports.
Overall, the IDEALvr project was very successful. It provides a way to visualize IDEAL collections in a way that allows for meaningful analysis and conclusions to be made.
http://hdl.handle.net/10919/70931
IDEAL
Twitter
Virtual Reality
VR
Data
Word Cloud
Android
Unity
Cardboard
Google Cardboard
Google
IDEALvr Word Cloud: IDEAL Data Visualization using Virtual Reality
oai:vtechworks.lib.vt.edu:10919/52356 (datestamp: 2023-11-29T16:40:56Z)
Bhatia, Karan
author
Edwards, Zak
author
Layser, Matthew
author
Moore, Austin
author
2015-05-14
This report describes the results of the Seventeen Moments in Soviet History capstone project in the CS4624 Multimedia, Hypertext, and Information Access course. The majority of the work focused on shoring up the website's security flaws. We first identified all elements of the website that interact with the database and located the corresponding code. We then studied how CakePHP handles SQL queries and its recommended ways to sanitize them, and changed the flow of control for database queries so the recommended fixes could be implemented. Once these changes were made, the website was tested to ensure that functionality was not damaged in any way, and then tested again to confirm that the SQL issues were fixed: we attempted an SQL injection against the website and checked that the database remained in the same state as before the attack.
In addition to fixing the security issues, we made general database changes as well. First, user registration was changed so that new users are not listed as moderators, and all moderators were dropped except the clients. The website was then modified to stop storing passwords in plain text and to store only hashed passwords; we confirmed this by creating new users and checking that their plaintext passwords were not stored, and we removed all plaintext passwords from the database.
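The site runs on CakePHP, but both fixes generalize; here is a minimal Python illustration of parameterized queries and salted password hashing, with sqlite and PBKDF2 standing in for the real MySQL database and CakePHP's hasher:

    import hashlib
    import os
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, salt BLOB, pw_hash BLOB)")

    def store_user(name, password):
        salt = os.urandom(16)
        pw_hash = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
        # Placeholders keep user input out of the SQL string (no injection),
        # and only the salted hash is stored, never the plaintext password.
        conn.execute("INSERT INTO users VALUES (?, ?, ?)", (name, salt, pw_hash))

    store_user("moderator", "correct horse battery staple")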
We also researched how to notify users of the Soviet History website that it was operational again. A script for emailing users was considered but rejected: addresses are limited in how many emails they can send, and such scripts depend on the running computer's configuration. A mass-email service was also considered but found undesirable, since such services charge a monthly subscription fee and would be used only once. We determined that the best course of action was to identify which users should be emailed and contact only them, so as not to broadcast to the original hackers that the website was back up.
Finally, work is being done to fix the subtitles that fail to appear on the audio sections of the website. They are still not working, but test code is being run to see whether it resolves the issue.
http://hdl.handle.net/10919/52356
Soviet History
Website security
database
Seventeen Moments in Soviet History
oai:vtechworks.lib.vt.edu:10919/982572023-11-29T16:40:57Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Shere, Danya
author
Ayub, Ahmad
author
Mueller, Rebecca
author
Fabian, Lexi
author
Shah, Akshat
author
2020-05-11
In the United States, every state has a tourism website. These sites highlight the main attractions of the state, travel tips, and blog posts among other relevant information. The funding for these websites often comes from occupancy taxes, a form of taxes that comes from tourists who stay in hotels and visit attractions. Therefore, current and past tourists fund the efforts to draw future tourists into the state.
Since state tourism is funded by the success of past tourism efforts, it is important for researchers to spend their time and resources finding out which efforts were successful and which were not. With this comes the importance of seeing trends in past tourism endeavors. By examining past tourism websites, patterns can be drawn about information that changed from season to season and year to year. These patterns can be used to see what researchers deemed successful tourism efforts, and can help guide future state tourism decisions.
Our client, Dr. Florian Zach of the Howard Feiertag Department of Hospitality and Tourism Management, wants to use this historical analysis on state tourism information to help with his research on trends in state tourism website content. Iterations of the California state tourism website, among other sites, are stored as snapshots on the Internet Archive and can be accessed to see changes in websites over time. Our team was given Parquet files of these snapshots dating back to 2008. The goal of the project was to assist Dr. Zach by using the California state tourism website, visitcalifornia.com, and these snapshots as an avenue to explore data extraction and visualization techniques on tourism patterns to later be expanded to other states’ tourism websites.
Python’s Pandas library was utilized to examine and extract relevant pieces of data from the given Parquet files. Once the data was extracted, we used Python’s Natural Language Processing Toolkit to remove non-English words, punctuation, and a set of unimportant “stop words”. With this refined data, we were able to make visualizations regarding the frequency of words in the headers and body of the website snapshots. The data was examined in its entirety as well as in groups of seasons and years. Microsoft Excel functions were utilized to examine and visualize the data in these formats.
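The report names the tools but does not include code; a minimal sketch of such a pipeline, assuming a hypothetical Parquet file with a "body_text" column, might be:

    from collections import Counter
    import pandas as pd
    import nltk
    from nltk.corpus import stopwords, words
    from nltk.tokenize import word_tokenize

    nltk.download("punkt")
    nltk.download("stopwords")
    nltk.download("words")

    df = pd.read_parquet("snapshots.parquet")          # hypothetical input file
    english_vocab = set(w.lower() for w in words.words())
    stop_words = set(stopwords.words("english"))

    tokens = []
    for text in df["body_text"].dropna():              # hypothetical column name
        for tok in word_tokenize(text.lower()):
            # Keep alphabetic English words that are not stop words.
            if tok.isalpha() and tok in english_vocab and tok not in stop_words:
                tokens.append(tok)

    print(Counter(tokens).most_common(25))             # top words to visualize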
These data extraction and visualization techniques that we became familiar with will be passed down to a future team. The research on state tourism site information can be expanded to different metadata sets and to other states.
http://hdl.handle.net/10919/98257
Parquet
Tourism
State Tourism
Data Extraction
Visualization
Natural Language Processing
Pandas
US State Tourism Websites
oai:vtechworks.lib.vt.edu:10919/776042023-11-29T16:40:58Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Watts, Joseph
author
Anderson, Nick
author
Asbill, Connor
author
Mehr, Joseph
author
2017-05-10
The goal of this project is to leverage microblogging data about the stock market to predict price trends and execute trades based on these predictions. Predicting stock price trends from microblogging data involves a complex opinion aggregation model. For this, we built upon previous research, specifically a paper called "CrowdIQ" written by a team that includes Virginia Tech faculty. The paper details a method of aggregating an accurate overall opinion by modeling judge reliability and interdependence. Once the overall sentiment of the judges was deduced, we built trading strategies that take this information into account to execute trades.
The first step of the project was a sentiment analysis of posts on a microblogging site named StockTwits. These messages can contain a label indicating a bullish or bearish sentiment, which helps indicate a specific position to take on a given stock. However, most users choose not to label their posts, so a classification of these unlabeled posts is required to autonomously utilize StockTwits to drive the proposed trading strategies.
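The record does not show the classifier used; one common approach, offered here only as an assumption and not necessarily the team's model, is to train a bag-of-words classifier on the labeled posts and apply it to the unlabeled ones:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Labeled StockTwits messages (hypothetical sample data).
    texts = ["$AAPL to the moon", "selling everything, crash incoming"]
    labels = ["bullish", "bearish"]

    # TF-IDF features feeding a linear classifier.
    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    model.fit(texts, labels)

    # Predict sentiment for unlabeled posts.
    print(model.predict(["earnings look strong this quarter"]))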
With a working sentiment analysis model, we implemented the opinion aggregation model described by CrowdIQ. This can gauge an accurate market sentiment for a particular stock based on the collection of sentiments that are received from users on StockTwits.
The next step was the creation of a trading simulation platform, including a complete virtual portfolio management system and an API for retrieving historical and current stock data. These tools allow us to run quick and repeatable tests of our trading strategies on historical data. We can easily compare the performance of strategies by running them with the same historical data.
After we had a viable testing environment setup, we implemented trading strategies. This required research and analysis of other attempts at similar uses of microblogging data on predicting stock returns. The testing environment was focused on a set of stocks that is consistent with those used in CrowdIQ. The implementation of the CrowdIQ strategy served as a baseline against which we compared our results.
Development of new trading strategies is an open-ended task that involved a process of trial and error. A strategy may find success in 2014 but not perform as well in other years, because market climates can be fickle. To assess how much our strategy's success depended on the market climate, we also tested against data for 2015 and compared the performance.
The final deliverable is a viable trading simulation environment coupled with various trading strategies and an analysis of their performance in the years of 2014 and 2015. The analysis of each strategy's performance indicated that our sentiment-based strategies perform better than the index in bullish markets like that of 2014, but, when they encounter a bear market, they typically make poor trading decisions which result in a loss of value.
http://hdl.handle.net/10919/77604
crowd-sourcing
sentiment analysis
stock trading
stock market
microblog
scala
spark
hbase
opinion aggregation
Analyzing Microblog Feeds to Trade Stocks
oai:vtechworks.lib.vt.edu:10919/1149642023-11-29T16:40:58Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Khamis, Merna
author
Chen, Jeff
author
Lee, Justin
author
Averyt, Matthew
author
Mitta, Nihal
author
2023
Webpages disappear online rapidly. When something like a crisis event occurs, it is very important to retrieve and preserve all web pages related to that event before they disappear, in an effort to record their digital history. Web archiving is a technology that enables storing webpages in a format called WARC (Web ARChive). WARC records save all the pertinent information required to replay the webpage as it was online, such as the HTML data, files, and ads.
The goal of this project is to implement a web archiving system that can archive a large number of web pages depending on user input. To solve this task, we have implemented two main scripts using Python, due to its known scripting capabilities, and a user interface that provides recording and replaying functionality for our system.
The first script’s purpose is to go through user-given websites and archive them in the Web ARChive (WARC) format, with one WARC file per website. It accepts a URL and a collection name to direct the archived URL to, using various Python libraries and packages such as pywb and subprocess. It also waits between URLs so as not to overload a server with requests, avoiding being blocked by the target website(s). It operates by reading URLs from a given text file and utilizes multithreading for faster overall performance during archival.
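The script itself is not reproduced here; a minimal sketch of the archiving loop, using wget's WARC output as a stand-in for the team's exact pywb invocation, could look like:

    import subprocess
    import time
    from concurrent.futures import ThreadPoolExecutor

    def archive(url: str, collection: str) -> None:
        # One WARC file per website; wget writes <prefix>.warc.gz directly.
        prefix = f"{collection}-{url.split('//')[-1].replace('/', '_')}"
        subprocess.run(["wget", f"--warc-file={prefix}", "--delete-after", url])
        time.sleep(2)  # wait between URLs so the target server is not overloaded

    with open("urls.txt") as f:          # user-given, newline-delimited URLs
        urls = [line.strip() for line in f if line.strip()]

    # Multithreading for faster overall archival, as the report describes.
    with ThreadPoolExecutor(max_workers=4) as pool:
        for url in urls:
            pool.submit(archive, url, "crisis")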
The second script’s purpose is to replay WARC files, showing the archived webpage(s) as they were before archival. It accepts a WARC file (.warc or .warc.gz) from the user and displays it, using the webbrowser library, in the user’s own web browser. The script’s implementation uses Python libraries similar to those of the first.
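Again as a sketch rather than the team's code: with pywb, a WARC can be registered via the wb-manager CLI and the capture opened in the user's own browser (the collection name, URL, and port here are assumptions):

    import subprocess
    import time
    import webbrowser

    warc_path = "example.warc.gz"        # user-supplied WARC (.warc or .warc.gz)

    # Register the WARC with a pywb collection (creates ./collections/replay).
    subprocess.run(["wb-manager", "init", "replay"])
    subprocess.run(["wb-manager", "add", "replay", warc_path])

    # Start pywb's replay server (default port 8080), then open the browser.
    server = subprocess.Popen(["wayback"])
    time.sleep(2)                        # give the server a moment to start
    webbrowser.open("http://localhost:8080/replay/*/https://example.com/")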
The user interface was created using Node.js and React and is based on pywb’s WebUI, serving to provide the user with an easier way to use the aforementioned scripts. A Flask script links the UI and the scripts together, allowing for greater usability by exposing the scripts’ functions for easier use. Using the UI, a user can search for an archived page using its URL and is presented with a pywb calendar to view the website capture(s) in their local browser.
The final deliverables of this project include completed scripts, a user interface, a set of presentation slides, and a final report submitted to our professor and client, Mohamed Farag. The report and presentations show the progress our team made throughout the stages of our project’s development.
http://hdl.handle.net/10919/114964
webpages
archive
WARC
URL
retrieve
web archive
Crisis Events Webpages Archiving
oai:vtechworks.lib.vt.edu:10919/832122023-11-29T16:40:59Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Barnett, Matthew
author
Evans, Thomas
author
Islam, Fuadul
author
Zhang, Yibing
author
2018-05-02
The report describes the EmoViz project for the Multimedia, Hypertext, and Information Access Capstone at Virginia Tech during the Spring 2018 semester. The goal of the EmoViz project is to develop a tool that generates and displays visualizations made from Facial Action Coding System (FACS) emotion data.
The client, Dr. Steven D. Sheetz, is a Professor of Accounting and Information Systems at Virginia Tech. Dr. Sheetz conducted a research project in 2009 to determine how human emotions are affected when a subject is confronted with analyzing a business audit. In the study, an actor was hired to record a five-minute video of a simulated business audit in which they read a pre-written script containing specific visual cues at highlighted points throughout the audit. Participants were divided into two groups, each of which was given a distinct set of accounting data to review prior to watching the simulation video. The first group received accounting data that had purposely been altered in a way that would indicate the actor was committing fraud by lying to the auditor. The second group received accounting data that correctly corresponded to the actor’s script so that it would appear no fraud was committed. All participants watched the simulation video while their face movements were tracked using the Noldus FaceReader software to catalog emotional states. FaceReader samples data points on the face every 33 milliseconds and uses a proprietary algorithm to quantify the following emotions at each sampling: neutral, happy, sad, angry, surprise, and disgust.
After cataloging roughly 9,000 data rows per participant, Dr. Sheetz adjusted the data and exported each set into .csv files. From there, the EmoViz team uploaded these files into the newly developed system, where the data was processed using Apache Spark. Using Spark’s virtual cluster computing, the .csv data was transformed into DataFrames, mapping each emotion to a named column. These named columns were then queried to generate visualizations and display certain emotions over time. Additionally, the queries helped compare and contrast different data sets so the client could analyze the visualizations. After the analysis, the client could draw conclusions about how human emotions are affected when confronted with a business audit.
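The Spark queries are not included in this record; a minimal PySpark sketch, with hypothetical "time_ms" and emotion column names mirroring the FaceReader export, might be:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("emoviz").getOrCreate()

    # Each participant's FaceReader export: one row per 33 ms sample.
    df = spark.read.csv("participant_01.csv", header=True, inferSchema=True)

    # Average each emotion over one-second windows for plotting.
    emotions = ["neutral", "happy", "sad", "angry", "surprise", "disgust"]
    summary = (
        df.withColumn("second", (F.col("time_ms") / 1000).cast("int"))
          .groupBy("second")
          .agg(*[F.avg(c).alias(c) for c in emotions])
          .orderBy("second")
    )
    summary.show(5)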
http://hdl.handle.net/10919/83212
Visualization
Business Audit
Emotion
Facial Recognition
Python
Spark
EmoViz - Facial Expression Analysis & Emotion Data Visualization
oai:vtechworks.lib.vt.edu:10919/1150082023-11-29T16:41:00Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Deshmukh, Anoushka, Abilash, Nick, Nischal, Yashwanth Pradeep, Grifasi, Dinesh, Narem
author
2023-04-27
This project aims to implement a web interface to visualize interactions generated from students’ study sessions using OpenDSA content. Our goal is to help instructors conveniently understand how their students learn and interact with OpenDSA and identify areas of improvement in the system to aid the learning process. There is currently a lack of tools available for professors to visualize OpenDSA users' learning process. OpenDSA is used by many classes at Virginia Tech as a supplement to students’ learning material. Having these visualization tools would help instructors see whether students are engaged with the material and how they are utilizing the eTextbook.
We will be taking the interaction data from a MySQL database. Then, we will be utilizing a modified version of Samnyeong Heo’s Python script to convert the raw interaction data to abstract data for visualizations [3]. Next, we will create visualizations using Python libraries such as NumPy, pandas, and more. Finally, we will create a web interface on Power BI to show our visualizations.
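As a hedged sketch of those first steps (the connection string, table, and column names are all assumptions, not the actual OpenDSA schema):

    import pandas as pd
    import sqlalchemy
    import matplotlib.pyplot as plt

    # Pull raw interaction events from MySQL (hypothetical credentials/table).
    engine = sqlalchemy.create_engine("mysql+pymysql://user:pass@localhost/opendsa")
    events = pd.read_sql("SELECT user_id, exercise_id, ts FROM interactions", engine)

    # Aggregate raw events into abstract data suitable for visualization.
    per_exercise = events.groupby("exercise_id")["user_id"].nunique()

    per_exercise.sort_values(ascending=False).head(20).plot(kind="bar")
    plt.ylabel("distinct students")
    plt.title("Most-used OpenDSA exercises")
    plt.tight_layout()
    plt.savefig("exercise_usage.png")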
The final deliverables of this project include a fully functional web interface and a visualization tool for the interaction data on OpenDSA. We must also create a final report and prepare a presentation to give in front of our class, CS 4624: Multimedia, Hypertext, and Information Access. We will submit the above to our client, Mohammed Farghally, and our professor, Mohamed Farag. We will be iterating upon our report and presentation as we continue to design and develop our interface and visualizations, including more testing information and our evaluations and analysis of our envisioned system.
http://hdl.handle.net/10919/115008
OpenDSA
Python Visualizations
Interactive Dashboard
eTextbook
Multimedia
Hypertext
Information Access
Visualizing eTextbook Study Sessions Final Report
oai:vtechworks.lib.vt.edu:10919/1129092023-11-29T16:41:00Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Arbaiza, Camila
author
Smith, Jake
author
Josi, Spencer
author
Narayanan, Shashank
author
2022-12-14
The Virginia Tech Center of Drug Discovery (VTCDD) is an interdisciplinary organization consisting of around 50 members across 17 STEM-related departments. To showcase the organization’s members, they maintain a website to highlight their members’ accomplishments. Specifically, there is a subset of pages dedicated to listing the publications and patents created by VTCDD members relevant to the organization. These pages have not been updated since 2017 and were created by a site contributor that had minimal experience utilizing Virginia Tech’s content management system. Our goal as a team was to create a solution to better showcase the patent and publication accomplishments of the VTCDD.
Starting the project, we had complete creative liberty when it came to presenting the VTCDD publications and patents. Initially, the Publications and Patents pages were endlessly scrolling static pages, making the site poorly navigable. The publications listed only the citation, which made it difficult to quickly parse information, especially since there were a variety of different citation formats. The site coordinator had to manually add new publications and patents to the site, which was a tedious process. With these areas of improvement in mind, our goal was to make the pages more dynamic and to standardize the information through the use of filterable, paginated tables.
The deliverables consist of a final report, a web-scraper for Publication data, and web pages for the VTCDD site created on Ensemble: one for the Publications page, one for the Patents page, and another for the home page. The report shows our progress throughout our project lifecycle, ranging from ideation to testing. There are manuals for users and developers to ensure the web pages are properly maintained. Lastly, we discuss lessons learned and ideas for future development.
http://hdl.handle.net/10919/112909
VTCDD
Center of Drug Discovery
Patents
Publications
VTCDD Showcase
oai:vtechworks.lib.vt.edu:10919/1150262023-11-29T16:41:02Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Fichera, Katherine
author
Domnick, Dillon
author
2023-05-11
Schizophrenia is a chronic mental disorder characterized by frequent psychosis and audiovisual hallucinations affecting 24 million people globally. Despite its prevalence, the varying degrees by which the symptoms present combined with misguided media portrayal make schizophrenia and related psychotic disorders one of the more poorly understood and stigmatized mental illness diagnoses. Because it is so misunderstood, it is hard for those with the illness to get help and be treated properly. The objective of this project was to develop a virtual reality simulation using Unity, aimed at replicating the symptoms and impacts of schizophrenia.
The end result of this project is a VR simulation of the symptoms of schizophrenia. It can be used with the Oculus Quest and is available through Dr. James Ivory. Future students can build on this project; possible work includes further testing, additional simulation features such as more interactions or hallucinations, and keeping the simulation up to date with future Quest and Unity versions.
http://hdl.handle.net/10919/115026
VR
Virtual Reality
Schizophrenia
Simulation
Hallucination
Mental Illness
Schizophrenia Simulator
oai:vtechworks.lib.vt.edu:10919/186732023-11-29T16:41:02Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Ishairzay, Rishi
author
De, Puloma
author
Hwang, Andrew
author
2012-05-06
This module aims to introduce FFmpeg to students in a Linux environment (IBM Cloud).
http://hdl.handle.net/10919/18673
ffmpeg
ibm cloud
FFMPEG on the IBM Cloud
oai:vtechworks.lib.vt.edu:10919/238592023-11-29T16:41:04Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Adams, Julian
author
McElmurray, John
author
2013-10-02
America’s entertainment software industry creates a wide array of computer and video games to meet the demands and tastes of audiences as diverse as our nation’s population. Today’s gamers include millions of Americans of all ages and backgrounds. In fact, more than two-thirds of all American households play games. This vast audience is fueling the growth of this multi-billion dollar industry (Essential Facts About the Computer and Video Game Industry, 2006).
The Computer Science Department at Virginia Tech has offered a course to facilitate the future of art and game development. CS 4644: Creative Computing Studio Capstone is an intensive immersion into different approaches to game design and 3D modeling. The course allows students to develop an understanding of the scientific and technological principles associated with the design and development of computer and console games for both entertainment and serious applications. Students are encouraged to use a wide range of game engines as they work in teams to conduct an end-to-end integrative design project, the most popular being Unity.
Unity is a game development ecosystem: a powerful rendering engine fully integrated with a complete set of intuitive tools and rapid workflows to create interactive 3D content; easy multiplatform publishing; thousands of quality, ready-made assets in the Asset Store; and a knowledge-sharing Community.
Unity is free to a large proportion of developers and affordable for the rest. For independent developers and studios, Unity’s democratizing ecosystem smashes the time and cost barriers to creating uniquely beautiful games. They are using Unity to build a livelihood doing what they love: creating games that hook and delight players on any platform. It is for this reason that our group decided to work with the professors of the Creative Computing Studio Capstone to deliver a module that will quickly get students up and running with Unity game development.
Videos are publicly available through the YouTube playlist:
http://m.youtube.com/playlist?list=PLKFvhfT4QOqlEReJ2lSZJk_APVq5sxZ-x
All of the code is maintained in the public GitHub repository:
https://github.com/jm991/UnityThirdPersonTutorial?files=1
http://hdl.handle.net/10919/23859
Unity
computer game
video game
CS4644
Virginia Tech
Creative Computing Studio Capstone
game engine
tutorial
Online VT CS Module: Unity Crash Course for CS 4624
oai:vtechworks.lib.vt.edu:10919/220542023-11-29T16:41:05Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Parsons, Amber
author
Myrick, Gregory
author
2013-05-18
Giles County Animal Rescue is a volunteer organization located in Giles County, Virginia. This group of volunteers assists the Giles County Animal Shelter in placing animals in homes. They also campaign for awareness of the importance of spaying and neutering. Since most of their information is accessed on the web, our client, Christine Link-Ownes, believes that it is important to have a website that is easy to use and update. For this project, we worked with our client and Giles County Animal Rescue to redesign their website, fix bugs, and add new functionality using Drupal. This included recreating the Giles County Animal Rescue website and adding features such as newsletters and animal statuses.
http://hdl.handle.net/10919/22054
Drupal
Shelter
Web Design
Giles County
Animals
Video
Giles County Animal Rescue
oai:vtechworks.lib.vt.edu:10919/831972023-11-29T16:41:06Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Cheng, Junjie
author
2018-05-07
This is the Neural Network Document Summarization project for the Multimedia, Hypertext, and Information Access (CS 4624) course at Virginia Tech in the 2018 Spring semester. The purpose of this project is to generate a summary from a long document through deep learning. As a result, the outcome of the project is expected to replace part of a human’s work.
The implementation of this project consists of four phases: data preprocessing, building models, training, and testing.
In the data preprocessing phase, the data set is separated into training, validation, and testing sets in a 3:1:1 ratio. In each set, articles and abstracts are tokenized and then transformed into indexed documents.
In the model building phase, a sequence-to-sequence model is implemented in PyTorch to transform articles into abstracts. The sequence-to-sequence model contains an encoder and a decoder, both implemented as recurrent neural networks with long short-term memory (LSTM) units. Additionally, an MLP attention model is applied to the decoder to improve its performance.
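The model code is not reproduced here; a skeletal PyTorch sketch of such an LSTM encoder-decoder (dimensions and names are assumptions, and the attention component is omitted for brevity) might look like:

    import torch
    import torch.nn as nn

    class Encoder(nn.Module):
        def __init__(self, vocab_size: int, emb_dim: int = 128, hid_dim: int = 256):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)

        def forward(self, src):                  # src: (batch, src_len)
            outputs, state = self.lstm(self.embed(src))
            return outputs, state                # hidden state seeds the decoder

    class Decoder(nn.Module):
        def __init__(self, vocab_size: int, emb_dim: int = 128, hid_dim: int = 256):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)
            self.out = nn.Linear(hid_dim, vocab_size)

        def forward(self, tgt, state):           # tgt: (batch, tgt_len)
            outputs, state = self.lstm(self.embed(tgt), state)
            return self.out(outputs), state      # logits over the vocabulary

    # One training step: encode the article, decode the abstract, backpropagate.
    enc, dec = Encoder(50_000), Decoder(50_000)
    loss_fn = nn.CrossEntropyLoss()
    article = torch.randint(0, 50_000, (2, 40))  # toy indexed documents
    abstract = torch.randint(0, 50_000, (2, 12))
    _, state = enc(article)
    logits, _ = dec(abstract[:, :-1], state)
    loss = loss_fn(logits.reshape(-1, 50_000), abstract[:, 1:].reshape(-1))
    loss.backward()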
In the training phase, the model iteratively loads data from the training set and learns from it. In each iteration, the model generates a summary for the input document and compares the generated summary with the reference summary. The difference between them is represented by a loss value, and according to this loss value the model performs backpropagation to improve its accuracy.
In the testing phase, the validation and testing datasets are used to assess the accuracy of the trained model. The model generates a summary for each input document, and the similarity between the generated summary and the human-produced reference summary is evaluated with PyRouge.
Throughout the semester, all of the above tasks were completed. With the trained model, users can generate CNN/Daily Mail style highlights according to an input article.
http://hdl.handle.net/10919/83197
Deep learning (Machine learning)
Natural Language Processing
Text Summarization
Recurrent Neural Network
Sequence to sequence
Neural Network Doc Summarization
oai:vtechworks.lib.vt.edu:10919/1171172023-12-08T22:01:55Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Brian Hays
author
Alex Zhang
author
Mitchel Rifae
author
Trevor Kappauf
author
Parsa Nikpour
author
2023-11-30
In the contemporary digital landscape, access to timely and relevant information during crisis events is crucial for effective decision-making and response coordination. This project addresses the need for a specialized web application equipped with a sophisticated crawler system to streamline the process of collecting pertinent information related to a user-specified crisis event.
The inherent challenge lies in the vast and dynamic nature of online content, where identifying and extracting valuable data from a multitude of sources can be overwhelming. This project aims to empower users by allowing them to input a list of newline-delimited URLs associated with the crisis at hand. The embedded crawler software then systematically traverses these URLs, extracting additional outgoing links for further exploration. Afterwards, the contents of each outgoing URL are run through a predict function, which evaluates the relevance of each URL on a scale from 0 to 1. This scoring mechanism serves as a critical filter, ensuring that the collected web pages are not only related to the specified crisis event but also highly pertinent. We allow the user to set these thresholds, which enhances the efficiency of information retrieval by prioritizing the content most likely to be valuable to the user's needs.
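The crawler code is not included in this record; a minimal sketch of the traverse-and-score loop, with predict() as a stand-in for the project's relevance model and a user-set threshold, might be:

    import requests
    from bs4 import BeautifulSoup
    from urllib.parse import urljoin

    def predict(text: str) -> float:
        # Stand-in for the project's relevance model (returns a 0-1 score).
        return 1.0 if "flood" in text.lower() else 0.0

    def crawl(seed_urls, threshold=0.5):
        kept = []
        for url in seed_urls:
            page = requests.get(url, timeout=10)
            soup = BeautifulSoup(page.text, "html.parser")
            # Extract outgoing links for further exploration.
            for a in soup.find_all("a", href=True):
                child = urljoin(url, a["href"])
                resp = requests.get(child, timeout=10)
                score = predict(resp.text)
                if score >= threshold:        # user-set relevance threshold
                    kept.append((child, score))
        return kept

    print(crawl(["https://example.com/crisis-news"]))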
Throughout the crawling process, our system tracks a range of statistics, including individual website domains, the origin of each child URL, and the average score assigned to each domain. To provide users with a comprehensive and visually intuitive experience, our user interface leverages React and D3 to display these statistics effectively.
Moreover, to enhance user engagement and customization, our platform allows users to create individual accounts. This feature not only provides a personalized experience but also grants users access to a historical record of every crawl they have executed. Users are further empowered with the ability to effortlessly export or delete any of their previous crawls based on their preferences.
In terms of deliverables, our project commits to providing fully developed code encompassing both frontend and backend components. Complementing this, we will furnish comprehensive user and developer manuals, facilitating seamless continuity for future students or developers who may build upon our work. Additionally, our final deliverables include a detailed report and a compelling presentation, serving the dual purpose of showcasing our team's progress across various project stages and providing insights into the functionalities and outcomes achieved.
https://hdl.handle.net/10919/117117
Fullstack Application
Flask
React
Javascript
Dockerfile
Capstone
Python
Final Report
Final Presentation
CS4624
Multimedia
Hypertext
Information Access
Automated Crisis Collection Builder - Final Project Report
oai:vtechworks.lib.vt.edu:10919/709342023-11-29T16:41:13Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Barnes, Brittany
author
Elsi, Godolja
author
Kiseleva, Marina
author
2016-04-28
Cinemacraft is an interactive system built off of a Minecraft modification developed at Virginia Tech, OPERAcraft. The adapted system allows users to view their mirror image, as captured by Kinect sensors, in the form of a Minecraft avatar. OPERAcraft, the foundation of the project, was designed to engage K-12 students by allowing users to create and perform virtual operas in Minecraft. With the advanced functionality of Cinemacraft, the reinvented system aims to alter the perspective of how real-time productions will be produced, filmed, and viewed.
The system uses Kinect motion-sensing devices that track user movement and extract the data associated with it. The processed data is then sent through middleware, Pd-L2Ork, to Cinemacraft, where it is translated into avatar movement to be displayed on the screen, resulting in a realistic reflection of the user in the form of an avatar in the Minecraft world.
Within the display limitations presented by Minecraft, the avatar can replicate the user’s skeletal and facial movements; movements involving minor extremities like hands or feet cannot be recreated because Minecraft avatars do not have elbows, knees, ankles, or wrists. For the skeletal movements, three dimensional points are retrieved from the Kinect device that relate to specific joints of the user and are converted into three dimensional vectors. Using geometry, the angles of movement around each axis (X, Y, and Z) for each body region (arms, legs, etc.) are determined. The facial expressions are computed by mapping eyebrow and mouth movements within certain thresholds to specific facial expressions (mouth smiling, mouth frowning, eyebrows furrowed, etc.).
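The report describes this computation without code; a small sketch of the idea, with hypothetical Kinect joint coordinates and NumPy doing the vector math, could be:

    import numpy as np

    def limb_angles(joint_a: np.ndarray, joint_b: np.ndarray) -> tuple:
        """Angles (degrees) of the joint_a-to-joint_b vector about each axis."""
        v = joint_b - joint_a                   # 3D vector between two joints
        # Project the vector onto each coordinate plane and take the angle.
        about_z = np.degrees(np.arctan2(v[1], v[0]))   # X-Y plane
        about_y = np.degrees(np.arctan2(v[2], v[0]))   # X-Z plane
        about_x = np.degrees(np.arctan2(v[2], v[1]))   # Y-Z plane
        return about_x, about_y, about_z

    # Hypothetical Kinect points (meters) for a shoulder and elbow.
    shoulder = np.array([0.0, 1.4, 2.0])
    elbow = np.array([0.2, 1.2, 2.0])
    print(limb_angles(shoulder, elbow))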
http://hdl.handle.net/10919/70934
Institute for Creativity Arts and Technology
OPERAcraft
Cinemacraft
Minecraft
Motion Sensing
Kinect
ICAT
Computer Science
Cinemacraft: Virtual Minecraft Presence Using OPERAcraft
oai:vtechworks.lib.vt.edu:10919/1149972023-11-29T16:41:13Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Hutcherson, Zachary S.
author
2023-05-09
This paper will review the types and use of unmanned aerial vehicles (UAVs) in conservation. Drones are being used as the preferred method of monitoring terrestrial and aquatic wildlife in difficult areas, thanks to the low price and efficiency that these tools offer. The paper discusses the three main types of drones: fixed-wing, rotary-wing, and hybrid. Fixed-wing drones are best suited for general surveillance of large areas and long-distance flights, while rotary-wing drones are small, light, and maneuverable, making them ideal for tasks such as photography, filmography, inspection, and surveillance. Hybrid drones are more complex, combining fixed and rotary wings or rotors. The paper also explores the potential benefits of adding solar panels to drones to improve their energy efficiency.
Multiple instances of the successful use of drones in the field were documented, including drones being used to identify objects in water, on land, or in the air. Advanced machine learning algorithms were shown to be highly effective in identifying targets for military, conservation, and other purposes. The paper also discusses the optimal placement of docking stations for aerial drones and how it can be found using a new algorithm, back-and-forth-k-opt simulated annealing (BFKSA).
Overall, drones provide a cost-effective and efficient way to monitor and protect wildlife, making them an important tool for conservationists.
http://hdl.handle.net/10919/114997
Drones
Conservation
Machine Learning
UAV
AUV
Vision-based-tracking
A Literary Review on the Current State of Drone Technology in Regard to Conservation
oai:vtechworks.lib.vt.edu:10919/1099762023-11-29T16:41:14Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Kalbouneh, Omar
author
Marshburn, Madison
author
Carroll, David
author
2022-05-08
The team was provided with a platform to update and implement new design changes to the original website for a project led by Dr. Francesco Ferretti, an Assistant Professor in the Department of Fish and Wildlife Conservation. His research interests include studying the impact of humanity on the Earth’s oceans and conservation efforts. The website is built using WordPress with frontend CSS and HTML. RShiny provides the backend support to implement the Validation Monitor. We were tasked with converting the static framework of the website to a more dynamic and responsive framework and with improving on the current gamification scheme through a refined point rewards system and incentives.
The team was able to refine the current point reward system by integrating functionality that motivates users to validate sharks and support SharkPulse’s research on the ecology and taxonomy of shark populations. This includes awarding points to users who recognize rare shark species, or species labeled “endangered,” “vulnerable,” or “critically endangered” according to the IUCN Red Lists of Ecosystems and of Threatened Species. Most of the completed changes were on the backend for the validation monitor and on the frontend for the identification guide. The team also improved the identification guide to be a more suitable web page for educating users about sharks: more questions were added, along with an option letting users select “I can’t see” if they are unable to see shark characteristics in an image. Overall, the backend changes for assigning points based on whether the user recognizes rare or threatened species are deployed. However, the project is still not complete, as the website still needs to be converted from a static website to a dynamic one. The rare-species functionality could also be updated to improve the program’s performance.
http://hdl.handle.net/10919/109976
sharks
rare sharks
identification guide
SharkPulse
validation
conservation status
SharkValidatorGame
oai:vtechworks.lib.vt.edu:10919/926912023-11-29T16:41:15Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Duncan, Courtney
author
Garcia-Neal, Christian
author
Mehdi, Wasay
author
Urcia, Andre
author
2019-05-12
Our client, Dr. Mims, and a team of researchers collected trait data on lesser-known vertebrate species in the northwestern United States. The goal of this research was to find links from traits to climate-change vulnerability. She then published her data in a report made available through VertNet. Since the research comes from publicly available museum records, it is only fitting to create a publicly accessible website to both access the research and engage the public on this important issue.
The goal of our project was to make a multi-page website with quick links, resources, and research, all attached to their respective vertebrates/species. We also made sortable lists of the species based on their trait data. Also included with our website is a manual on how to extend or maintain the website for future use and extensibility once we are no longer working on it.
Another focus of the website is an informative visualization/infographic map that allows users to investigate the data on the species and their populations in different regions. Different parts of the map are linked from each species’ individual page for easy association of information. The map accepts input to clarify or maintain interest in the relevant data, with easy-to-understand controls that allow for detailing or generalizing parts of the map to meet criteria for different areas of interest or research.
Included are the files of trait data given to us by Dr. Mims and our final presentation. This trait data is for the species represented by our website.
http://hdl.handle.net/10919/92691
Vertebrate Map Visualization
oai:vtechworks.lib.vt.edu:10919/1129132023-11-29T16:41:16Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Dewan, Suha
author
Zhou, Daodao
author
Huynh, Long
author
Guo, Zipeng
author
2022-12-15
In digital image processing and computer vision, image segmentation is the process of partitioning a digital image into multiple image segments. More precisely, it is the process of assigning a label to every pixel in an image so that pixels with the same label share certain characteristics. Image segmentation is an important step in almost any medical image study. Segments are used in images from microscopes that show different types of cells, each containing hundreds of organelles and macromolecular assemblies. Cell segmentation is the task of splitting a microscopic image domain into many different segments that represent individual instances of cells; however, labeling these manually requires enormous time from domain experts, hence the need for AI-assisted annotation of medical images. Our project aids annotators by letting them submit images quickly and easily through our web application and by performing predictions on those images.
http://hdl.handle.net/10919/112913
Website
Segmentation
AWS
AI
React
Cells
Medical
Microscopic
AI-Assisted Annotation of Medical Images
oai:vtechworks.lib.vt.edu:10919/832202023-11-29T16:41:17Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Fowler, Tom A.
author
Howe, Christian J.
author
2018-05-02
The goal of the project is to scale down the CloudFormation templates for deploying the Hyku digital repository application. We have attempted to reduce the cost of running the Hyku application while maintaining a base level of performance, essentially reducing it to the minimum viable scale. We accomplished this by changing these templates and their configuration parameters to use fewer, smaller instances. After evaluating a number of different options for reducing the base cost, including using other AWS offerings, we settled on a number of parameters that work well at the base level of performance. In testing these changes, we used a qualitative method: testing the functionality of the existing feature set on the original deployment and comparing that to the functionality of the new deployment. We have seen no changes in functionality from the original deployment.
With these reduced instance sizes, costs drop to about one third of the original: the application originally cost about $800-900 a month to run, while our modified templates with the tested parameters cost about $300 a month. Given that the original feature set still functions as before, we believe we have achieved a satisfactory cost reduction from the original deployment, and therefore have accomplished the goal we set out to complete.
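The exact parameter values are documented in the report rather than here; as an illustrative sketch (the parameter keys below are assumptions, not the actual Hyku template keys), instance sizes can be overridden at deploy time with boto3:

    import boto3

    cf = boto3.client("cloudformation")

    # Redeploy the stack with smaller instance sizes (hypothetical keys).
    cf.update_stack(
        StackName="hyku",
        UsePreviousTemplate=True,
        Parameters=[
            {"ParameterKey": "WebInstanceType", "ParameterValue": "t2.medium"},
            {"ParameterKey": "WorkerInstanceType", "ParameterValue": "t2.small"},
            {"ParameterKey": "InstanceCount", "ParameterValue": "1"},
        ],
        Capabilities=["CAPABILITY_IAM"],
    )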
We provide documentation on our process and the changes we made, including on how to reproduce in the future the changes we have made. Since the templates require some level of maintenance, this documentation is vital for deploying them in the future. The documentation provided by the report gives future maintainers the ability to quickly get up and running with the potential problems encountered when working with the templates, and gives future groups the insight to predict the kinds of challenges they will face when working on the Hyku CloudFormation templates.
http://hdl.handle.net/10919/83220
CloudFormation
Hyku
Hydra-in-a-Box
Samvera Labs
AWS
Amazon Web Services
Cloud Digital Repo Optimization
oai:vtechworks.lib.vt.edu:10919/776162023-11-29T16:41:17Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Renugopal, Jishnu
author
Zargarpur, Mattin
author
Zhao, Haiyu
author
Richardson, Christian
author
Zhang, Kevin
author
Schmidt, Will
author
2017-04-28
Global Event and Trend Archive Research (GETAR) is supported by NSF (IIS-1619028 and 1619371) through 2019. It will devise interactive, integrated, digital library/archive systems coupled with linked and expert-curated webpage/tweet collections. This project will act as a supplement to GETAR by providing a Virtual Reality (VR) interface to visualize geospatial data and extrapolate meaningful information from it. It will primarily focus on visualizing tweets and images obtained from the GETAR data archive on a globe in a VR world. This will be accomplished using tools like Unity, HTC Vive, C# and Git. In order to ensure that the product meets the end user’s specification, this project will use an iterative workflow with a very short feedback loop. The feedback obtained from Dr. Fox and our team members will be used to make subsequent prototypes and the final product.
Our project is intended to be used as a demo by school children interested in data analytics and data sciences. Additionally, this project can also be extended to add features to our end product.
Our final product can display images and tweets on a globe in a VR world provided that they have location information. As part of our final deliverable, we delivered a report, a presentation, a video demo and a GitHub repository containing the source code for our project. During this project, our team learned that building a 3D application is very different from building a 2D desktop application. We also learned that it is crucial to meticulously document all parts of product development to assist future development.
http://hdl.handle.net/10919/77616
VR
GETAR
Data exploration
Tweet
Video
Geolocation
Unity
HTC Vive
VR4GETAR
oai:vtechworks.lib.vt.edu:10919/1033022023-11-29T16:41:18Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
McLaren, Reilly
author
2021-05-02
The MusArt web application was created in the hopes of combining available audio and visual technology to be used for good. The origins of this project were inspired by Professor Ico Bukvic, my client, who conducts a significant amount of work at the intersection of music and technology. Many people, who will be recognized throughout this paper, helped in the ideation phase of this project, providing insights from their respective experience. As a result, this project rests on a very broad foundation, pulling together many different ideas into one cohesive final product.
The goal of MusArt is to provide users with an application that allows them to express themselves, regardless of self-perceived creative abilities. A parallel goal is to provide an activity to actively relax by using engaging audio and visual stimulation. This benefits the user due to the powerful effects that music has on our biological functions, combined with the benefits of tuning into our senses of vision and hearing. This idea of tuning in is common in meditation practices.
This product works on the idea of limiting the choices available to a user: giving them enough freedom to feel that what they are creating is theirs, without so much freedom that they feel overwhelmed. Regardless of experience, anybody should be able to use this tool as long as they have access to a computer.
The MusArt interface consists of a control center, a workspace, and a visual display.
The control center allows the user to choose from a set of given music/visual templates, which cover a wide range of styles, then play, pause, and reset this music-visual piece.
The workspace is where the user is able to manipulate various aspects of the music/visual piece, using different input devices.
The visual display is where the responsive visual appears.
Included in this submission are two versions of the final report (PDF and editable Word document), and two versions of the final presentation. These together cover my progress on this project up to 04/29/2021, along with my current goals for the near future.
http://hdl.handle.net/10919/103302
Music
Visualization
Art
Drawing
Generative
Creative
Color
FFT
Howler.js
p5.js
Musart Web Application
oai:vtechworks.lib.vt.edu:10919/523542023-11-29T16:41:19Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Garner, Elliot
author
Dean, Brandon
author
Mason, Brannon
author
2015-05-14
Currently, Network Infrastructure & Services (NI&S) takes inventory of equipment assigned to employees (computers, laptops, tablets, tools) and sends reports of higher-value items to the Controller’s Office. All items have a VT tag number and a CNS number, which can currently only be matched up via an Oracle Forms interface. An inventory clerk must personally verify the existence and location of each piece of equipment. An improvement would be an app that scans an inventory number or barcode and records the GPS location where it is scanned and the custodian of that equipment. This data could then be uploaded to a more accessible Google spreadsheet or similar web-based searchable table.
The 21st Century Inventory app aims to solve this problem with barcode scanning technology integrated into a mobile app that sends the scanned asset ID to a CSV-formatted output file. By directly tying a product’s asset ID to the user and their information, letting the user scan a product’s barcode to simplify inventory lookup, saving product information to a CSV file, and letting the user edit a product’s current information in the application, we provide a significant upgrade to a system that currently relies solely on an Oracle Forms interface.
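The report does not specify the CSV layout; a minimal sketch of writing one scan per row (the field set here is an assumption) might be:

    import csv
    from datetime import datetime, timezone

    def record_scan(asset_id: str, lat: float, lon: float, custodian: str) -> None:
        # Append one row per scanned barcode: asset ID, GPS fix, custodian, time.
        with open("inventory.csv", "a", newline="") as f:
            writer = csv.writer(f)
            writer.writerow([asset_id, lat, lon, custodian,
                             datetime.now(timezone.utc).isoformat()])

    record_scan("CNS-004211", 37.2296, -80.4139, "jdoe")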
http://hdl.handle.net/10919/52354
Scanner
Barcode
Barcode Scanner
NI&S
21st Inventory
21st Inventory
oai:vtechworks.lib.vt.edu:10919/1128742023-11-29T16:41:20Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Ahn, Andrew
author
Yang, Josh
author
Lim, Heechan
author
Holt, Sasha
author
Kelley, Timothy
author
Homer, Jack
author
2022-12-13
Data helps us to understand the world around us. Not only is interpreting data important, but understanding how to communicate data is an essential skill in the modern world. Teaching children how to make and understand data visualizations lays a solid foundation for their critical thinking and understanding of large-scale problems. This project aims to teach elementary school students how to visualize data engagingly and effectively.
Building on this goal, our project was to develop an accessible website with the ability to host seven or more different game implementations designed around data visualization concepts. These games target three groups of school levels: 1st- and 2nd-grade, 3rd- and 4th-grade, and 5th-grade and up. The goal of the games is to break down complex data visualization concepts for various levels of understanding. Consequently, each game is designed to be fun and replayable, so children engage with the website for longer periods and learn more.
The website has been fully implemented and is accessible to the public. This implementation allows new developers to add games easily, assuming they are familiar with web development. Additionally, we have implemented seven games in either JavaScript or Unity, each of which is playable from the website. We have conducted testing for the application, via a digital feedback form provided to testers. The feedback given on these forms is used to improve the website and the games that are hosted on it.
The website features a mobile dropdown menu, an introductory page, a page for feedback, and seven games that teach core concepts of data visualization. The website can be used in classroom settings as an easy way for teachers to introduce data visualization to students.
http://hdl.handle.net/10919/112874
data visualization
data analytics
children
student
elementary school
react
unity
KidDataViz
oai:vtechworks.lib.vt.edu:10919/911932023-11-29T16:41:21Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Onofrio, Nick
author
Sorkin, Nick
author
Venetsanos, Devin
author
DiFrancisco, Michael
author
Johnson, Campbell
author
Fox, Edward A.
author
2019-05-10
Tobacco companies have had some of the best marketing strategies over the past century. It is well documented and well known that tobacco produces both mental and physical health issues, and yet these companies have found ways to remain as one of the largest businesses. The goal of our project is to assist Dr. Townsend in his research to understand Big Tobacco’s strategies.
This is done by taking some of the fourteen million documents released online by tobacco companies and presenting the data in a meaningful way so they can be analyzed. The project is hosted on a virtual machine provided to the team by Dr. Fox and the VT Computer Science department. The idea is to gather the documents from online, turn them into a usable text format, and feed them to a Doc2Vec-based machine learning tool created with Gensim. Using a pre-trained model, we then cluster the resulting data so that it is presentable in a usable manner, so Dr. Townsend and many others can use this system to further their research.
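As a minimal Gensim sketch of this approach (the input file and parameters are assumptions, not the team's configuration):

    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    # Each tobacco document becomes a TaggedDocument of tokens plus an ID.
    corpus = [
        TaggedDocument(words=doc.lower().split(), tags=[i])
        for i, doc in enumerate(open("documents.txt"))   # hypothetical input
    ]

    model = Doc2Vec(vector_size=100, min_count=2, epochs=40)
    model.build_vocab(corpus)
    model.train(corpus, total_examples=model.corpus_count, epochs=model.epochs)

    # Infer a vector for a new document and find its nearest neighbors,
    # which is the basis for clustering similar documents together.
    vec = model.infer_vector("cigarette advertising campaign".split())
    print(model.dv.most_similar([vec], topn=5))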
This submission includes a report on how to use and maintain the system, so that Dr. Townsend can do what he wants with it and any future developers can understand how it works. The system comprises components such as a Gensim Doc2Vec model and Gensim's fast approximate nearest-neighbor similarity package, used to cluster the data. This has all been stored and set up on the virtual machine provided by the CS department, so it should be accessible as long as the user is connected to the campus Wi-Fi. Through this project our team learned many things about working with a client, working with new technologies, and tracking and presenting progress to others.
http://hdl.handle.net/10919/91193
tobacco settlement documents
Doc2Vec
clustering
Tobacco Settlement Documents
oai:vtechworks.lib.vt.edu:10919/982542023-11-29T16:41:22Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Yu, Wang
author
Huang, Baokun
author
Liu, Han
author
Pham, Vinh
author
Nikolov, Alexander
author
2020-05-13
Inside Airbnb is a project by Murray Cox, a digital storyteller, who visualized Airbnb data that was scraped by author and coder Tom Slee. The website offers scraped Airbnb data for select cities around the world; historical data is also available.
We were tasked with creating visualizations with listing data over Virginia and Austria to see what impact Airbnb was having on the communities in each respective region. We chose Virginia and Austria because parts of our team were familiar with each region. The eventual goal is to expand past these two regions to, say, the rest of the United States. Since July 2019, Tom Slee has abandoned the data-collection script, so to collect data on Virginia and Austria we needed to update it to collect more recent data.
We began inspecting the script and found it was not collecting as much data as it once did. This was almost certainly due to Airbnb’s website layout changing over time, as is common for websites. After finding out how the script worked, we eventually identified the various problems with it and updated it for the new Airbnb website design. In doing so, we were able to get even more data than we thought possible, such as calendar and review data. From there, we were able to begin our data collection process.
While fixing the script, our team was making mock visualizations to be displayed on a website for easy viewability. Once data collection was complete, the data was transferred over to be used for these mock visualizations. We visualized many things, such as how many listings a single host had and how many listings were in a given county. The main visualization was a map showing where all the Airbnb listings were. We also made maps to visualize availability, prices, and the number of reviews, and created pie charts and histograms to represent Superhosts, instantly bookable listings, and price distributions.
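The mapping code is not shown in this record; one way to build such a listing map in Python (folium and the column names are assumptions, not necessarily what the team used) is:

    import pandas as pd
    import folium

    listings = pd.read_csv("listings_virginia.csv")    # hypothetical scraped data

    # Center the map roughly on Virginia and drop one marker per listing.
    m = folium.Map(location=[37.5, -78.7], zoom_start=7)
    for _, row in listings.iterrows():
        folium.CircleMarker(
            location=[row["latitude"], row["longitude"]],
            radius=2,
            popup=f"${row['price']}/night",
        ).add_to(m)

    m.save("virginia_listings.html")                   # open in any browser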
We expect that in the future the script and the data collected and visualized will be used by both future CS Students working on subsequent iterations of the project as well as Dr. Zach himself, our client.
http://hdl.handle.net/10919/98254
Data Collection
Virginia
Austria
Airbnb
Visualization
Airbnb Scraping
oai:vtechworks.lib.vt.edu:10919/926222023-11-29T16:41:23Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Doan, Viet
author
Crawford, Matt
author
Nicholakos, Aki
author
Rizzo, Robert
author
Salopek, Jackson
author
2019-05-08
This submission resulted from the semester-long team project focused on obtaining the data of 50 DMO websites, parsing the data, storing it in a database, and then visualizing it on a website. We have worked on this project for our client, Dr. Florian Zach, as a part of the Multimedia / Hypertext / Information Access course taught by Dr. Edward A. Fox. We have created a rudimentary website with much of the infrastructure necessary to visualize the data once we have entered it into the database.
We have experimented extensively with web scraping technology like Heritrix 3 and Scrapy, but then we learned that members of the Internet Archive could give us the data we wanted. We initially tabled our work on web scraping and instead focused on the website and visualizations.
We constructed an API in GraphQL to query the database and relay the fetched data to the front-end visualizations. The website with the visualizations was hosted on Microsoft Azure using a serverless model. The website has a homepage, a page for visualizations, and a page for information about the project, with a functional navigation bar to change between the three pages. Currently the homepage has a basic USA map visual with the ability to change a state’s color on a mouse hover.
After complications with funding and learning that the Internet Archive would not be able to give us the data in time for us to complete the project, we pivoted away from the website and visualizations. We instead focused back on data collection and parsing. Using Scrapy we gathered the homepages of 98 tourism destination websites for each month they were available from April 2019 back to January 1996. We then used a series of Python scripts to parse this data into a dictionary of general information about the scraped sites as well as a set of CSV files recording the external links of the websites on the given months.
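The Scrapy spiders are not reproduced here; a small sketch of pulling one homepage capture per month via the Wayback Machine's CDX API, an approach consistent with but not necessarily identical to the team's setup, could be:

    import requests

    def monthly_captures(url: str, start: str = "199601", end: str = "201904"):
        """List one Wayback capture per month for a site's homepage."""
        resp = requests.get(
            "http://web.archive.org/cdx/search/cdx",
            params={"url": url, "from": start, "to": end,
                    "output": "json", "collapse": "timestamp:6"},  # YYYYMM
        )
        rows = resp.json()
        # The first row is the header; each capture row includes a timestamp.
        return [f"https://web.archive.org/web/{r[1]}/{url}" for r in rows[1:]]

    for snapshot in monthly_captures("visitflorida.com")[:5]:  # hypothetical DMO
        print(snapshot)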
http://hdl.handle.net/10919/92622
Tourism
WaybackMachine
Python
Scrapy
Parsing
Web Scraping
Tourism Destination Websites
oai:vtechworks.lib.vt.edu:10919/1099882023-11-29T16:41:24Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Gonley, Matt
author
Nicholas, Ryan
author
Fitz, Nicole
author
Knock, Griffin
author
Bruce, Derek
author
2022-05-10
TwitterCollections is a continuation of work from a previous semester team called Library6Btweets. The prior team, which worked during Fall 2021, was composed of Yash Bhargava, Daniel Burdisso, Pranav Dhakal, Anna Herms, and Kenneth Powell. The current team that took this over, and worked on this during Spring 2022, is composed of Matt Gonley, Ryan Nicholas, Nicole Fitz, Griffin Knock, and Derek Bruce.
Billions of tweets have been collected by the Digital Library Research Laboratory (DLRL). The tweets were collected in three formats: DMI-TCAT, YTK, and SFM. The tweets collected should be converted into a standard data format to allow for ease of access and data research.
The goal is to convert the collected tweets into a unified JSON format. A secondary goal is to create a machine learning model to categorize uncategorized tweets. The standardized format is in two styles: an individual level, and a collection level. Conversion varies for these levels, requiring, respectively, conversion of each tweet and its attributes to a JSON object, and conversion of a whole collection of tweets to a separate JSON object.
Our work involved familiarizing ourselves with the previous semester’s work and its schema. The three formats for the tweets were as follows: Social Feed Manager (SFM), yourTwapperKeeper (YTK), and Digital Methods Initiative Twitter Capture and Analysis Toolset (DMI-TCAT). The previous team designed this schema with these tweet types in mind as well as the Twitter version 2 schema. The previous team also created a collection level schema that listed all of the tweet IDs in a given collection, to allow for determining which tweets belong in which collection. They designed this in accordance with the events archive website.
We were also given the previous team's conversion scripts for each tweet format. Each format needed a different script, as the attributes and metadata collected from the tweets differed between formats, as did the way they were stored: DMI had the data split into six SQL tables for any given topic, YTK had separate tables for a topic, and SFM was already in JSON.
The original scripts were written in Python, and for simplicity we continued using Python as well. Our focus was on optimizing the scripts, as some of them were unusably slow. The scripts also needed to be modified to accommodate scale, since all of the data could not be loaded into memory at once. We were provided six scripts, two for each tweet format: one for the individual schema and one for the collection-level schema.
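The scripts themselves are not shown in this record; a minimal sketch of the memory-safe pattern (streaming rows and writing one JSON object per line, with hypothetical table and field names) might be:

    import json
    import sqlite3  # stand-in for the MySQL connection the scripts used

    def convert_ytk(db_path: str, out_path: str) -> None:
        conn = sqlite3.connect(db_path)
        cursor = conn.execute("SELECT id, user, text, created_at FROM tweets")
        with open(out_path, "w") as out:
            # Stream row by row so the full collection never sits in memory.
            for tweet_id, user, text, created_at in cursor:
                unified = {                      # hypothetical unified schema
                    "id": str(tweet_id),
                    "author": user,
                    "text": text,
                    "created_at": created_at,
                }
                out.write(json.dumps(unified) + "\n")

    convert_ytk("ytk_collection.db", "collection.jsonl")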
In addition to the optimizations and modifications, a machine learning model was created to accurately classify unlabeled tweet collections by event. The model can classify the tweets when fed data from any of the formats. We experimented with a Naive Bayes model and a BERT-based neural network model, and found the latter superior.
The new scripts, optimized versions of prior scripts, best machine learning model, and converted Twitter collection JSON files are our deliverables for this semester. We hope that a standardized set of data can allow for fast and effective research for those who want to incorporate tweets into their study.
http://hdl.handle.net/10919/109988
Twitter
SFM
DMI-TCAT
YTK
JSON
Python
Twitter Collections
oai:vtechworks.lib.vt.edu:10919/479452023-11-29T16:41:27Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Lech, Adam
author
Pontani, Joseph
author
Bollinger, Matthew
author
2014-05-09
The Digital Library Research Laboratory is a group focused on researching and implementing a full stack Hadoop cluster for data storage and analysis. The DLRL Cluster project is focused on learning and teaching the technologies behind the cluster itself. To accomplish this, we were given three primary goals.
First, we were to create tutorials to teach new users how to use Mahout, HBase, Hive, and Impala. The idea was to have basic tutorials that would provide users with an introductory coverage of these modern technologies, including what they are, what they’re used for, and a fundamental level of how they’re used. The first goal was met by creating an in-depth tutorial for each technology. Each tutorial contains step-by-step instructions on how to get started with each technology, along with pictures that allow users to follow along and compare their progress to ensure that they are successful.
Second, we would use these tools to demonstrate their capabilities on real data from the IDEAL project. Rather than have to show a demo to each new user of the system firsthand, we created a short (5 to 10 minute) demo video for each technology. This way users could see for themselves how to go about utilizing the software to accomplish tasks. With a video, users are able to pause and go back at their leisure to better familiarize themselves with the commands and interfaces involved.
Finally, we would utilize the knowledge gained from researching these technologies and apply it to the actual cluster. We took a real, large dataset from the DLRL cluster and ran it through each respective technology. Reports focusing on efficiency and performance were generated, and an actual result dataset was produced for some data analysis.
http://hdl.handle.net/10919/47945
Mahout
Impala
HBase
Hive
Hadoop
IDEAL
DLRL Cluster
oai:vtechworks.lib.vt.edu:10919/1149632023-11-29T16:41:28Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Jang, James
author
Tran, Nam
author
Loomis, Drew
author
Mehta, Avi
author
Lin, Bowen
author
2023-04-27
Organized parking is an important aspect of living together in a society. Most people find parking cumbersome because interpreting vague parking rules is typically stressful. However, parking is a necessary evil, as it keeps matters fair and civilized, and the need for parking generates a huge amount of revenue for the economy.
The parking industry consists of a multitude of different aspects such as parking lot management, parking garage construction and management, and valet parking services. The market size for parking, measured in revenue, is estimated to be around $8.2 billion. Parking is an essential aspect of transportation, as regulations and policies are necessary to improve transportation efficiency. As more people move into an area, the parking industry is expected to grow accordingly.
With universities always striving to recruit new students, an enormous parking infrastructure is needed to maintain the peace and stability on campus. Unfortunately, this often comes with painfully rigid administration that often inconveniences the daily lives of students.
Parking becomes a headache for many college students due to the increasing number of students but stagnant number of parking spaces. This imbalance drives down the availability of parking spaces and takes away from students' learning experience, as they often have to plan hours ahead to make it to class.
The smart parking application aims to provide its users with a more convenient parking experience than nonusers have. The application only supports parking at James Madison University at the moment, but the goal is to expand to all universities. The application provides this functionality by analyzing trends in parking spot occupancy over time. James Madison University installed sensors in its parking spaces that track whether each space is occupied at all times of the day, and the parking status is updated live on the official university website. The smart parking application utilizes the API that tracks these updates and looks for patterns in order to suggest the most optimal place for the user to park.
The tech stack for the smart parking application will be the MERN stack with MongoDB Atlas for real-time data utilization. MongoDB is chosen because of its malleable document structure. The backend is an Express/Node.js server with Python for the machine learning program. The frontend is a React Native iOS/Android application that fetches data from the API, and uses several component libraries for UI design, including Lottie, RNUI, and React Native Map. The app allows any person to use it without requiring them to create an account.
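As a rough illustration of the recommendation idea, the sketch below polls a hypothetical occupancy feed and picks the lot with the most free spaces; the endpoint, field names, and simple heuristic are all placeholders rather than the app's actual API or trained model:

```python
import requests
from collections import defaultdict

# Hypothetical feed URL and response shape: [{"lot": ..., "occupied": ...}, ...].
FEED = "https://example.edu/api/parking"

def recommend_lot():
    spots = requests.get(FEED, timeout=10).json()
    free = defaultdict(int)
    for spot in spots:
        if not spot["occupied"]:
            free[spot["lot"]] += 1
    # Naive heuristic: recommend the lot with the most free spaces right now.
    return max(free, key=free.get)
```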
The purpose of the application is to give users an advantage in parking over nonusers. The reasoning is that if everyone used the application to get the best parking spaces, there would be a paradoxical effect and no one could get the best parking spaces. The main goal here is to give the application's users a more convenient parking solution; resolving the lack of parking spaces is a potential issue to tackle in the future.
http://hdl.handle.net/10919/114963
smart parking
parking
mobile application
recommender
James Madison University
Smart Parking Recommender Mobile Application
oai:vtechworks.lib.vt.edu:10919/523382023-11-29T16:41:29Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Thompson, Dustin
author
Henke, Zach
author
Cox, Kevin
author
Fenton, Kevin
author
2015-05-14
The purpose of this project is to assist the VTTI in converting a large citation file into a CSV file for ease of access. It required us to develop an application which can parse through a text file of citations and determine how to properly put the data into CSV format. We designed the program in Java and developed a user interface using JavaFX, which is included in the latest edition of Java.
We came up with two main tools: the developer tool and the parsing program itself. The developer tool is used to build a tree made up of regular expressions which would be used in parsing the citations. The top nodes of the tree would be very general regexes, and the leaf nodes of the tree would become much more specific. This program can export the regex tree as a binary file which will be used by the main parsing program.
The main parsing program takes three inputs: a binary regex tree file, a citation text file, and an output location. Once run, it parses the citations based off of the tree it was given. It outputs the parsed citations into a CSV file with the citations separated by field. Any citations that the program is unable to process are dumped into a failed-output text file so that they can be handled separately.
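A hedged Python sketch of the general-to-specific regex tree descent (the production programs are Java, and this toy tree and its field names are hypothetical):

```python
import re

# Hypothetical regex tree: a general pattern at the root, more specific
# patterns at the leaves, mirroring the developer tool's exported tree.
TREE = {
    "pattern": r"(?P<authors>.+?)\.\s*(?P<rest>.+)",
    "children": [
        {"pattern": r"(?P<title>.+?)\.\s*(?P<year>\d{4})\.", "children": []},
    ],
}

def parse_citation(text, node=TREE, fields=None):
    """Descend from general to specific regexes, accumulating named fields."""
    fields = dict(fields or {})
    match = re.match(node["pattern"], text)
    if not match:
        return None            # caller would write this line to the failed file
    fields.update(match.groupdict())
    rest = fields.pop("rest", None)
    for child in node["children"]:
        deeper = parse_citation(rest or text, child, fields)
        if deeper:
            return deeper
    return fields

print(parse_citation("Smith, J. A study of parsing. 2015."))
# -> {'authors': 'Smith, J', 'title': 'A study of parsing', 'year': '2015'}
```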
We also created an additional program as an alternative solution to ours. It uses Brown University’s FreeCite parsing program, and then outputs parsed citations to a CSV file.
http://hdl.handle.net/10919/52338
Citation
Parse
Regex
Java
Text Transformation
oai:vtechworks.lib.vt.edu:10919/709402023-11-29T16:41:30Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Goodrich, Sean
author
Stefan, Zack
author
Kodres-O'Brien, Benjamin
author
2016-04-28
For eight days in mid-March, Virginia Tech's Institute for Creativity, Arts, and Technology (ICAT) held a camp for two hundred children from age 3 to the 5th grade at North Cross School in Roanoke, Virginia. The purpose of the camp was to engage children in a creative process through informal learning. Our group documented the camp, providing an impression of what happened over those eight days. On May 2, 2016, at the annual ICAT Day, we presented an exhibit of our documentation. We contributed to a portion of a much larger exhibit, containing other student projects and ICAT staff demonstrations of the informal learning methods they have used and have taught community teachers to use.
We documented certain parts of the informal learning process using two methods. First, we filmed 360 degree video of both indoor and outdoor workshops. We have displayed the video using an Oculus Rift so that during ICAT Day a visitor to the exhibit is able to interactively view the children in action. Our second method of documentation is a set of audio reflections from students who participated in the workshops. Towards the end of the camp we had children reflect and then speak about their experience, creating art based on the Rudyard Kipling poem “If.” During ICAT Day, visitors connected with the children’s camp experience by browsing these experiences digitally using their smartphone or similar device.
http://hdl.handle.net/10919/70940
ICAT
North Cross
360 Degree Video
CS4624
Multimedia
ICAT North Cross Exhibit
oai:vtechworks.lib.vt.edu:10919/1032652023-11-29T16:41:30Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Isele, Aaron
author
Kenyon, Mike
author
Warren, Steven
author
Helm, Seth
author
2021-05-12
This submission describes the process and implementation of the work undertaken to create a collaborative Chronic Wasting Disease (CWD) database to document the spread and testing history in the United States. Primarily, the data was from around 1999 to the present, as documentation of tests before then becomes much more difficult to obtain. The data used for this project was obtained by attempting to contact all 50 states' Department of Natural Resources (DNR) and requesting their current CWD testing data. This was met with varied success, as only about four states provided well-defined data that could be placed into a national database. After communicating with the client and analyzing the data collected, six points of data were selected to be the focus of the project: state, county, year, total tests, positive tests, and negative tests. Utilizing R Shiny as the platform for deploying the database website, and Google Sheets as the persistent database, our team was able to create a private database website that will allow researchers to share and better understand their data using the tools provided. The data must be kept in a private database, as many of the states expressed that they do not want their data publicly shared and must ensure it is being used responsibly. The database website features the data in a raw, searchable format as well as graphs and maps that allow whitelisted users to view the spread of CWD throughout the country and over time. The goal for this project moving forward is to have CWD researchers join the private database by agreeing to share their data now and in the future, which will enable better tracking and predicting of CWD in the United States.
http://hdl.handle.net/10919/103265
prion
chronic wasting disease
database
cervid
deer
R Shiny
CWD
Prion Database
oai:vtechworks.lib.vt.edu:10919/1156472023-11-29T16:41:31Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Shah, Vedant
author
Ramesh, Vaishali
author
Daniel, Reema
author
Gathani, Mihir D.
author
2023-05-17
Electronic Theses and Dissertations (ETDs) are academic documents that provide an in-depth account of a graduate student's research and are designed to be stored in machine archives and retrieved globally. These documents contain abundant information that may be utilized by various machine learning tasks such as classification, summarization, and question-answering. However, these documents often have incomplete, incorrect, or inconsistent metadata, which makes it challenging to accurately categorize these documents without manual intervention, since there is no one uniform format for developing the metadata. Therefore, through the Classifying ETDs capstone project, we aim to create a gold standard classification dataset, leverage machine learning and deep learning algorithms to automatically classify ETDs with missing metadata, and develop a website to allow a user to classify an ETD with missing metadata and view already classified ETDs. The expected impact of this project is to advance information availability from long documents and eventually aid in improving long document information accessibility through regular search engines.
http://hdl.handle.net/10919/115647
Gold Standard ETD Classification Dataset
Deep Learning
Text Classification Models
Interactive User Interface
Data Cleaning
Classifying ETDs
oai:vtechworks.lib.vt.edu:10919/523372023-11-29T16:41:32Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Bissell, Cate
author
Foster, Grant
author
Langlotz, Bryce
author
K. K.
author
2015-05-14
Commemorating the 150th anniversary of Abraham Lincoln's assassination, the Virginia Center for Civil War Studies presented a series of programs, including a 2-day symposium of internationally renowned scholars and a six-week-long exhibition of Lincoln-related artifacts in Newman Library. The exhibition included both objects and video presentations by students in Hist 2984, Abraham Lincoln: The Man, the Myth, the Legend.
Our client, Kim Kutz, wanted a website to accompany the exhibit. It would be a WordPress site that exists within the current Virginia Tech Lincoln WordPress site. Our original plan was to include a 3D virtual tour of the Lincoln exhibit in the Newman Library. However, once the exhibit was set up and we visited it, we decided it was too small to be suitable for a virtual tour. So instead, we took pictures of it from various angles to include in an image slider on the homepage of our website.
We decided our website would contain two pages, since it was just an addition to the already existing Virginia Tech Lincoln website. These two pages would be a homepage and an artifacts page documenting all the artifacts relating to Lincoln that the Newman Library owns.
The homepage included the image slider with the various pictures we took of the exhibit as well as a short description of the purpose of the exhibit and what it contained. In this description, we included a link to the artifacts page. The website has two tabs, one for each page, as well as a search bar so a user can search within the site.
The artifacts page contains a photo grid that uses the Mural Theme, currently available as one of the themes WordPress provides. With the Mural Theme, each square photo, when hovered over, displays text giving the title of the photo. Additional text can be added to describe the photo; we included just the title of each photo, and our client can add text as she wishes in the future.
Each of the photos in our photo grid is of one of the Lincoln artifacts that was displayed in the Lincoln exhibit. Therefore, the titles were the names of these artifacts. When one of these photos is clicked, the user is transported to a separate page where he or she can view the student presentations explaining these artifacts. The presentations were done by students in Dr. Kim Kutz’s history class. She provided us with these presentations and specifically requested that they be included in our site. The presentations consisted of YouTube videos and Prezi presentations. We allowed both the YouTube and the Prezi presentations to be shown in WordPress’s lightbox feature. In addition, clicking the title of each video takes the user to the YouTube page where the video exists.
This submission contains a zip file of the website files, Word and PDF versions of the final report, and PowerPoint and PDF versions of the final presentation.
http://hdl.handle.net/10919/52337
Lincoln
Online Museum
Online Exhibit
Civil War
Website
Lincoln In Our Time: Online Museum Exhibition Website
oai:vtechworks.lib.vt.edu:10919/1171192023-12-08T22:03:07Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Bhavya Patel
author
Tai Nunez
author
Arjit Singh
author
Dong Xiao
author
Harris Naseh
author
In light of the escalating prevalence of mass shootings in the U.S., there is an urgent need for a structured digital repository to centralize, categorize, and offer detailed analyses of these events. This project aims to develop a comprehensive website functioning as a digital library. This library will house mass shooting objects, where each object represents a specific mass shooting event, elaborating on who, what, when, where, why, and how.
The website's central features will include the ability to visualize and compare various mass shooting incidents, facilitating a broader understanding of trends, patterns, and anomalies. Users will be able to explore the data via geographic visualizations, timelines, and more, providing an immersive and informative experience.
Underpinning the platform, our backend system will utilize Python, Flask, and MongoDB, ensuring robust data collection and management. This data includes information fields, URL sources associated with each event, and more. On the front end, technologies like NextJS, React, and JavaScript will drive the user interface, supported by essential libraries such as React Chrono and Leaflet.js for advanced visualization. Deployment will be executed via Firebase or AWS for the frontend and Heroku for the backend. Two primary user categories have been identified: general users, who can view the data, and administrators, who can modify the contents. Ensuring the integrity of the data input, admin access will be safeguarded by authentication processes.
In summary, this digital library emerges as a timely and crucial initiative in response to the rising tide of mass shootings in the U.S. This project aims to provide comprehensive insights into the tragic events that have marked the nation. Beyond its functional capabilities, the digital library strives to improve understanding, awareness, and ultimately, change in the narrative surrounding mass shootings.
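A minimal sketch of the Flask-plus-MongoDB backend shape described above; the connection string, database and collection names, and query fields are placeholders, not the project's actual configuration:

```python
from flask import Flask, jsonify, request
from pymongo import MongoClient

app = Flask(__name__)
# Hypothetical local database; the deployed backend would point at a managed instance.
events = MongoClient("mongodb://localhost:27017")["shootings"]["events"]

@app.route("/events")
def list_events():
    # Optional ?state=VA style filter; field names are assumed for illustration.
    state = request.args.get("state")
    query = {"state": state} if state else {}
    docs = events.find(query, {"_id": 0}).limit(100)
    return jsonify(list(docs))

if __name__ == "__main__":
    app.run(debug=True)
```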
https://hdl.handle.net/10919/117119
Mass Shootings
Website
Library
Mass Shooting Digital Library
oai:vtechworks.lib.vt.edu:10919/479512023-11-29T16:41:34Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Kindel, David
author
Nalls, Brandon
author
Thompson, Amanda
author
Katz, Andrew
author
Anderson, Ryan
author
2014-05-10
A website and a video have been created for the Contemplative Campus group at Virginia Tech. This is a hub for students and community members to learn what these practices are about and how they can get involved. The website is located at http://www.contemplativecampus.dlib.vt.edu/ and the latest video(s) can be found at http://www.contemplativecampus.dlib.vt.edu/videos/. Inside this submission are the presentation materials along with a manual on how to use and maintain the website and the video.
Thanks to Dr. Douglas Lindner for the opportunity to work with him and his contacts in this area.
The website was created for ease of use and management, using WordPress, and the video has been created using professional grade software provided by Innovation Space.
http://hdl.handle.net/10919/47951
Contemplative Campus
Virginia Tech
Contemplative Practices
Contemplative Campus at Virginia Tech
oai:vtechworks.lib.vt.edu:10919/186592023-12-05T02:01:05Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
May, Daniel
author
Gates, Greg
author
Zhang, Jeff
author
2012-05-02
This video was created to help recruit graduate students for the Mathematics Education program at Virginia Tech.
http://hdl.handle.net/10919/18659
Math Education
Mathematics Education Recruitment Video
oai:vtechworks.lib.vt.edu:10919/982442023-11-29T16:41:35Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Robinson, Matthew
author
Whelan, Andrew
author
Burton, Samuel
author
Bolon, Brendan
author
Ellis, Harrison
author
2020-05-12
Deep Learning Course is an open-source course on deep learning topics, hosted on GitHub in the Machine Learning Mindset repository and built with the guidance of our client, Amirsina Torfi. We have designed and created four modules -- introduction, basic, neural network, and deep neural network concepts -- with each module containing subtopics. This course will introduce users to some key concepts used in developing and using deep learning and neural network models.
The approach to constructing this course was to split our time between researching, developing in-depth documentation on topics, and developing source code to go along with some of the topics. Users may navigate through the course, module by module and subtopic by subtopic in a linear fashion within each module, and execute the supplied sample code. In addition to providing documentation on the topics within deep learning, we supply information on various PyTorch and Python libraries used in the source code. This is to provide supplementary information on the specifics of the code. The goal is to have the user gain a further understanding of deep learning and its application in PyTorch and Python.
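As a taste of the material, a minimal PyTorch training loop of the sort the neural-network modules cover; the layer sizes and random data are arbitrary illustrations, not code from the course itself:

```python
import torch
import torch.nn as nn

# A tiny feed-forward classifier: 4 input features, one hidden layer, 2 classes.
model = nn.Sequential(
    nn.Linear(4, 16),
    nn.ReLU(),
    nn.Linear(16, 2),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(8, 4)            # 8 samples, 4 features
y = torch.randint(0, 2, (8,))    # 8 class labels in {0, 1}

for _ in range(100):             # a short training loop
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
print(loss.item())
```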
Our course addresses the lack of resources and the limited availability of open-source courses on deep learning. Our solution includes contextual materials in addition to source code. The main component of our project is the GitHub repository, with reStructuredText documentation. The repository is publicly available for viewing and suggestions. Thus our group provided the desired open-source course deliverable. To use our course, visit https://github.com/machinelearningmindset/deep-learning-course
http://hdl.handle.net/10919/98244
Deep learning (Machine learning)
Python
PyTorch
reStructuredText
Machine learning
Neural Networks
Course
GitHub
Open Source
Deep Learning Course
oai:vtechworks.lib.vt.edu:10919/1033142023-11-29T16:41:36Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Gurrapu, Sai
author
Ghadia, Jhanavi
author
Singh, Tarun
author
Yeshala, Sai Krishna
author
2021-05-14
The risk sentiment analysis tool for workers and the workplace is an effort to analyze and determine the safety culture and risk levels of a workplace. Our program will take the narrative reports of safety and risk conditions from the employees and pass them on to our sentiment analysis software, which will return the sentiment values (positive/negative/neutral/mixed) to the users. These values can be referenced by workplace owners or other employees to get an estimate of the safety conditions at a particular place. Our goal is to accumulate all the information provided by the concerned or satisfied employees, undertake proper sentiment analysis on it, and produce a reliable output for examination.
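A minimal sketch of the idea using NLTK's VADER analyzer as a stand-in, since this abstract does not name the actual sentiment software; note that VADER reports positive/negative/neutral/compound scores rather than a "mixed" label:

```python
# Stand-in sentiment scoring for a workplace narrative report.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download
analyzer = SentimentIntensityAnalyzer()

report = "The walkways are well lit, but the loading dock railing feels unsafe."
scores = analyzer.polarity_scores(report)
print(scores)  # e.g. {'neg': ..., 'neu': ..., 'pos': ..., 'compound': ...}
```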
http://hdl.handle.net/10919/103314
risk
risk sentiment
sentiment analysis
workplace risk
Risk Prediction Sentiment Analysis
oai:vtechworks.lib.vt.edu:10919/220412023-11-29T16:41:37Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Singh, Avneet
author
Tatarka, Evan
author
2013-05-15
The Virginia Tech Food Safety Game allows users, the new dining employees, to learn about food safety in an interactive and more engaging way. The employees will be able to play this game on a personal or a public computer through a browser and familiarize themselves with the food safety material necessary to take a food safety exam. The food safety game allows the training coordinators to make the learning more enjoyable and to help guarantee food safety in their dining centers. When the game is complete, it will replace the PowerPoint presentation that is currently used for new employees to learn about food safety at Virginia Tech.
http://hdl.handle.net/10919/22041
food safety
food safety 100
game
handwashing
serving
cooking
Food Safety 100 Game
oai:vtechworks.lib.vt.edu:10919/1032692023-11-29T16:41:38Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Verelly, Abhinav
author
David, Gruhn
author
Bhattarai, Ashutosh
author
Grishaw, Shane
author
2021-05-13
Each state in the United States has its own state-run website, which is used as a means to attract new tourists to that location. Each of these sites typically highlights the big attractions in that state, along with travel tips, facts about the location, blog posts, ratings from individuals who have traveled there, and any other useful information that may attract potential tourists. These websites are maintained and funded directly by occupancy taxes, a form of state tax that an individual pays when staying in a hotel or visiting attractions in that state. In effect, the websites are paid for by past tourists, and their main goal is to attract new ones.
Funding for future state tourism is determined by how many previous tourists have visited the state and paid the occupancy tax. Researchers need to be able to determine which elements of the website are most beneficial in attracting tourists. This can be determined by examining past tourism websites and looking for any patterns that would determine what worked well and what didn’t. These patterns can then be used to determine what was successful and use that information to make better-informed decisions.
Our client, Dr. Florian Zach of the Howard Feiertag Department of Hospitality and Tourism Management, plans to use the historical analysis done by our team to further his research on trends in state tourism website content. Different iterations of each state tourism website are stored as snapshots on the Internet Archive and can be accessed to see changes that took place in that website. Our team was given Parquet files of these snapshots for the states of California, Colorado, and Virginia dating back to 1998. The goal of the project was to assist Dr. Zach by using these Parquet files to perform data extraction and visualization on tourism patterns. This work can then be expanded to other states' tourism websites in the future.
We used a combination of Python's Pandas library, Jupyter Notebook, and BeautifulSoup to examine and extract relevant pieces of data from the given Parquet files. This data was extracted into several categories, each with its own designated folder: raw text, images, background colors and background images, internal and external links, and meta tags. With the data sorted into the appropriate folders, we are able to identify specific patterns, such as which background color was used most. With our data extraction and visualization work completed, we hope to pass this on to future teams so that they can expand the project to the rest of the states.
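A hedged sketch of the extraction step; the file name and the "content" column are assumptions about the provided Parquet files rather than their documented layout:

```python
import pandas as pd
from bs4 import BeautifulSoup

# Read one snapshot Parquet file and pull links and meta tags from each
# stored HTML payload; downstream code would write these to category folders.
df = pd.read_parquet("virginia_snapshots.parquet")

for html in df["content"]:
    soup = BeautifulSoup(html, "html.parser")
    links = [a.get("href") for a in soup.find_all("a", href=True)]
    metas = {m.get("name"): m.get("content") for m in soup.find_all("meta")}
    # ...sort links, meta tags, colors, and raw text into their folders
```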
http://hdl.handle.net/10919/103269
Python
Data Analytics
Visualization
BeautifulSoup
pyarrow
Jupyter Notebook
Matplotlib
Tourism
Web scraping
US State Tourism
oai:vtechworks.lib.vt.edu:10919/1100562023-11-29T16:41:39Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Smith, Javan
author
2022-05-10
WebsiteSocialNetworksBritain is a project designed to create an HTML website from an input of forty thousand XML records. These XML records describe historical British and American figures and their connections to each other through family, associates, and organizations. The client has built these records and wishes them to be viewable on a website. Per specifications, the website must be built using no third-party services. The final Java program consists of two parts: reading the XML records into a data structure and providing the ability to search the records according to appropriate criteria. The program uses a binary search tree to store the forty thousand records and creates a pop-up menu to select the records the user wishes to view. The pop-up menu allows the user to search through the records and find previously unseen connections between the historical figures.
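For illustration only, a Python analogue of the program's two parts (the original is Java, and the &lt;record&gt;/&lt;name&gt; layout is an assumed XML shape, not the client's actual schema):

```python
import bisect
import xml.etree.ElementTree as ET

# Load the records into a sorted, binary-searchable structure, standing in
# for the Java program's binary search tree.
tree = ET.parse("figures.xml")                     # hypothetical input file
records = sorted(((r.findtext("name"), r) for r in tree.iter("record")),
                 key=lambda pair: pair[0])
names = [name for name, _ in records]

def find(name):
    """Binary search for an exact name, as the tree-backed menu would."""
    i = bisect.bisect_left(names, name)
    if i < len(names) and names[i] == name:
        return records[i][1]
    return None
```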
http://hdl.handle.net/10919/110056
Georgian
Great Britain
Social Network
XML
Java
Historical Figures
WebsiteSocialNetworksGreatBritain
oai:vtechworks.lib.vt.edu:10919/1150292023-11-29T16:41:40Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Shivaraman, Thrilok
author
Ouzhinski, Theo
author
Denman, William
author
Geibel, Katie
author
2023-05-14
This submission describes the CS 3604: Professionalism in Computing Case Study Library, coordinated by our client Dr. Dan Dunlap, which contains recent case studies written by students in the class. The Case Study Library website provides a platform through which these case studies can be viewed. We were the third group to work on the Library; the current Library allows for student case study upload, searching, and filtering by course topic. However, uploads went through a single admin account, provided by the teacher and shared by all students. This meant that once a student uploaded, they could not go back to edit their submission, as there was no way to link users to uploads. Additionally, the interactivity of the website was limited.
The first goal of this iteration was to implement login functionality so that students can log in using their Virginia Tech accounts. This enables us to link users with their uploads and thereby allows them to edit. To improve the interactivity of the site, metadata fields were added for tags and liking. When uploading, students can select various tags from a bank of options that pertain to the subject of their case study, which can later be used for sorting. When viewing case studies, website users can like a submission, and the number of likes on each submission is stored, which can later be used for a recommended page. Our work increases the opportunity for interaction with users of the website, allowing students to better search for case studies by topic and to like the studies that others upload.
Currently, all of the features the group attempted to create are present and working, except that upload is still broken because the storage bucket points to the wrong location. The group worked together to build these features as requested by the client, and had to refactor the goals a few times in order to reach reasonable milestones over the course of the semester.
http://hdl.handle.net/10919/115029
Case Study
CS 3604
Dr. Dan Dunlap
Library
AWS
DynamoDB
Login
Authentication
Metadata and Tagging
Website
File Storage
Computer Science
CS 3604 Case Study Library III
oai:vtechworks.lib.vt.edu:10919/1171092023-12-07T22:03:09Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Nguyen, Van Ha
author
Burnett, Sarah
author
Freedman, Bradley
author
Ganesan, Bharathi
author
Ravindran, Roshan
author
2023-11-30
Health researchers are looking for all possible relationships between two health conditions: obesity and diabetes. To help them investigate the issue robustly, design detailed experiments, and develop lasting solutions, the Chorobesity project presents a visual tool of the geographical relationship between obesity and diabetes for our clients to utilize in their studies. Making use of different levels of maps, as well as different color “keys”, the user can study different regions' health condition statuses.
The Chorobesity project aims to be a visual and dynamic tool that researchers can use to further their understanding of the geographical correlation between obesity and diabetes. It provides relevant data and tools for users to easily interpret and tweak this data for their best understanding. By providing a snapshot of the current health profile of the United States, this interactive map seeks to be an indispensable tool that policymakers, health professionals, and the general public can apply as they see fit to understand how obesity and diabetes correlate.
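A hedged sketch of a county-level choropleth in the spirit of the tool, using Plotly; the two data rows are toys, and real inputs would be obesity and diabetes rates keyed by county FIPS codes:

```python
import json
from urllib.request import urlopen

import plotly.express as px

# County boundary GeoJSON from Plotly's public sample datasets.
URL = "https://raw.githubusercontent.com/plotly/datasets/master/geojson-counties-fips.json"
with urlopen(URL) as resp:
    counties = json.load(resp)

# Toy rates for two Virginia counties; real data would cover the country.
data = {"fips": ["51003", "51059"], "obesity_rate": [31.2, 24.8]}

fig = px.choropleth(data, geojson=counties, locations="fips",
                    color="obesity_rate", scope="usa")
fig.show()
```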
https://hdl.handle.net/10919/117109
health map
obesity
diabetes
visualization tool
interactive map
mouse hover
choropleth map
geographic correlation
health issues
Chorobesity: Modern Insight Into An Enduring Epidemic
oai:vtechworks.lib.vt.edu:10919/220572023-11-29T16:41:41Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Bailey, Bradley
author
Dotson, Sarah
author
Feng, Susan
author
Han, Lance
author
Kelly, Sean
author
Shepherd, Hunter
author
2013-05-18
As a result of our course work we have made a fully functional website that is ready to be moved over to FASWVA’s host and made live. Additionally, we have completed a Virtual Food Drive. A Virtual Food Drive is a way for the user to experience the act of shopping for food for others rather than donating a flat amount. FASWVA personnel will be able to update content as well as encourage companies and individuals to hold food drives using the Virtual Food Drive.
The report discusses how Drupal leads to easier editing and how the Virtual Food Drive will improve the user's experience while donating. The next step is to perform analytics after the website goes live. Recommendations discussed include:
- Go live
- Train personnel
- Perform regular audits on the website
- User-test the Virtual Food Drive
Future groups should consider taking up the following multimedia aspects of this project:
- Hunger Quiz
- Hunger Simulation
- Peer to Peer Food Drives and Fundraisers
- Testimonies on Website
- TV Ready Promo Video
http://hdl.handle.net/10919/22057
Drupal
FASWVA
Feeding America
website
Feeding America Southwest Virginia
oai:vtechworks.lib.vt.edu:10919/220622023-11-29T16:41:42Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Hakimov, Kurban
author
Hartwell, Andrew
author
Ulmet, Robert
author
2013-05-18
CTRnet (Crisis, Tragedy, and Recovery network) is an NSF-funded project that focuses on crawling/scanning the Internet regarding tragic events and creating digital libraries of information on those crises. CTRnet downloads webpages related to these events to ensure that this information is saved. As an example, CTRnet has over 440 gigabytes of webpages saved just for the Hurricane Sandy event.
Our group was tasked with creating a script that walks through the downloaded webpages, finds relevant images, and downloads them. We also researched gallery modules to create a Drupal gallery for our downloaded images.
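A minimal sketch of such an image-harvesting walk; the directory layout and file filters are illustrative, not the project's actual script:

```python
import os
from bs4 import BeautifulSoup

# Walk a crawl directory, parse each saved page, and collect <img> sources;
# a later step would download and filter the collected URLs.
def find_image_urls(root_dir):
    urls = set()
    for dirpath, _dirs, files in os.walk(root_dir):
        for name in files:
            if not name.endswith((".html", ".htm")):
                continue
            with open(os.path.join(dirpath, name), errors="ignore") as f:
                soup = BeautifulSoup(f.read(), "html.parser")
            for img in soup.find_all("img", src=True):
                urls.add(img["src"])
    return urls
```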
http://hdl.handle.net/10919/22062
python script
image parsing
image filtering
CTR
Drupal gallery
Crisis, Tragedy, and Recovery Network Project
CTRimages
oai:vtechworks.lib.vt.edu:10919/926212023-11-29T16:41:43Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Beemsterboer, Christopher
author
Zebina, Tyler
author
2019-05-19
In our Multimedia/Hypertext/Information Access capstone course, we worked with Adult Day Services to create a training video system to teach new instructors in their organization how to conduct recurring interviews with the adult clients. Adult Day Services is an organization at Virginia Tech that provides person-centered care to older adults who need assistance.
Adult Day Services also aims to promote the physical, social, emotional, mental, and cognitive health of its participants, and they use a variety of assessments to measure overall well-being and participant progress. These assessments are conducted in the form of interviews, and the body language, tone, and speech of the interviewer are key to performing them successfully.
The training video system we created covers five different types of assessments and is designed to efficiently train new instructors to conduct these interviews. We filmed an Adult Day Services instructor conducting interviews with five different participants, each completing the five assessments. We edited the footage and compiled all of the clips of each type of assessment together, including transitions and titles. We later created a menu system which allows a user to play all of the training videos at once, or to play just the training video for a specific type of assessment. We have also included sub-categories within each type of assessment so the user can decide to view a specific participant as opposed to all. We delivered this project in the form of a Blu-ray .iso file on a USB drive which contains the menu system and the associated videos. We have also included instructions on how to download the VLC media player, which is the optimal software for viewing the contents of the .iso file.
Finally, we have included our final presentation from our capstone course that goes over the final product as well as the lessons learned and our future plans.
http://hdl.handle.net/10919/92621
Adult Day Services
ADS
Multimedia
Video
Assessment
ADS Assessment Video
oai:vtechworks.lib.vt.edu:10919/776292023-11-29T16:41:44Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Dean, Tommy
author
Pasha, Ali
author
Clarke, Brian
author
Butenhoff, Casey J.
author
2017-05-10
The main goal behind the Common Crawl Mining system is to improve Eastman Chemical Company’s ability to use timely knowledge of public concerns to inform key business decisions. It provides information to Eastman Chemical Company that is valuable for consumer chemical product marketing and strategy development. Eastman desired a system that provides insight into the current chemical landscape. Information about trends and sentiment towards chemicals over time is beneficial to their marketing and strategy departments. They wanted to be able to drill down to a particular time period and look at what people were writing about certain keywords.
This project provides such information through a search interface. The interface accepts chemical names and search term keywords as input and responds with a list of web page records that match those keywords. Included within each record returned are the probable publication date of the page, a score relating the page to the given keywords, and the body text extracted from the page. Though it was one of the stretch goals of the project, the current iteration of the Common Crawl Mining system does not provide sentiment analysis. It would be relatively straightforward to extend the system to perform it, given the appropriate training data.
The final Common Crawl Mining system is a search engine implemented using Elasticsearch. Relevant records are identified by first analyzing Common Crawl for Web Archive (WARC) files that have a high frequency of records from interesting domains. Records with publication dates are then ingested into the search engine. Once the records have been indexed by Elasticsearch, users are able to execute searches which return a list of relevant records. Each record contains the URL, text, and publication date of the associated webpage.
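A hedged sketch of the query side using the elasticsearch Python client; the index name and field names ("text", "published") are assumptions about the system's mapping:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Keyword match plus a publication-date filter, echoing the "drill down to a
# particular time period" requirement described above.
resp = es.search(index="common-crawl", query={
    "bool": {
        "must": {"match": {"text": "cellulose acetate"}},
        "filter": {"range": {"published": {"gte": "2016-01-01"}}},
    }
})
for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("published"))
```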
Included in this submission are Microsoft Office and PDF versions of the Common Crawl Mining project's final presentation and final report. The final presentation outlines the project's history. The final report outlines the progress made on the project and includes a developer's and user's manual for the system. This submission also includes a compressed folder which contains all of the source code associated with the Common Crawl Mining project.
http://hdl.handle.net/10919/77629
Common Crawl
Elasticsearch
Content Mining
Eastman Chemical Company
Common Crawl Mining
oai:vtechworks.lib.vt.edu:10919/186602020-09-29T19:47:14Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Elliott, Patrick
author
Herzer, Brian
author
2012-05-02
The following problem was addressed by our project: how can we easily visualize the content of a body of text without manually analyzing it? The initial goal was to visualize captioned college lectures, but the scope grew to cover any section of text.
Our client, Mr. James Barker of Aptigent, supplied us with a large collection of captioned news reports from which to create visualizations. These television news reports were good examples for us, since they can usually be summarized with just a few key words and the relationships between words. This makes them well suited to visualization.
There were a few specifics about our task for this project. We had the ability to use a clustering program which would take a given body of text and generate, among other things, a list of keywords, which we called 'concepts,' and a list of tags. The concepts were words that the clustering program believed to have more importance, while the tags were generally words or phrases that were tied directly to one or more concepts.
Our solution needed to be web-based. In order to best accomplish this task, we chose to design our solution using HTML and JavaScript. We chose Raphael, a JavaScript library, to draw the visualizations.
Our solution puts a heavy emphasis on the proximity between each concept and tag. Whether or not the two appear in the same sentence is also taken into consideration.
http://hdl.handle.net/10919/18660
Lecture Capture Cluster Analyze Visualization
LectureCapture
oai:vtechworks.lib.vt.edu:10919/523532023-11-29T16:41:45Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Chittum, Matthew
author
He, Zinai
author
Li, Jiarui
author
Rahman, Tanvir
author
2015-05-14
SearchKat is a searchable database built on Apache Solr. The database was created with the purpose of being able to search Hurricane Katrina narratives that were gathered from survivors. The project was carried out in Spring 2015 in CS4624 (Multimedia, Hypertext, and Information Access) with Dr. Edward Fox as the professor and Dr. Katie Carmichael as the project client.
SearchKat has user-specified word groupings, which were determined by Dr. Carmichael in order to make the database thematically searchable, with each of these word groupings being considered a theme. The system can search both by these word groupings and by generic word associations such as synonyms.
The database displays customized fields when a query is performed. We created a CSV file which stores the filename, line number, and line content for every line across all of the files, and we use this information to display the relevant data whenever a search is performed.
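A minimal sketch of building that line-level CSV index; the folder layout is an assumption:

```python
import csv
import glob

# One CSV row per line of every narrative file: filename, line number, content.
with open("narrative_lines.csv", "w", newline="", encoding="utf-8") as out:
    writer = csv.writer(out)
    writer.writerow(["filename", "line_number", "line"])
    for path in glob.glob("narratives/*.txt"):   # hypothetical source folder
        with open(path, encoding="utf-8") as f:
            for number, line in enumerate(f, start=1):
                writer.writerow([path, number, line.rstrip("\n")])
```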
http://hdl.handle.net/10919/52353
SearchKat
Hurricane Katrina
Database
Multimedia, Hypertext, and Information Access
SearchKat
oai:vtechworks.lib.vt.edu:10919/982512023-11-29T16:41:50Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Thackaberry, Taylor
author
Bogemann, Kayley
author
Burchard, Shane
author
Butler, Jessie
author
Spencer, Austin
author
2020-05-05
The purpose of the Twitter Disaster Behavior project is to identify patterns in online behavior during natural disasters by analyzing Twitter data. The main goal is to better understand the needs of a community during and after a disaster, to aid in recovery.
The datasets analyzed were collections of tweets about Hurricane Maria and about recent earthquake events in Puerto Rico. All tweets pertaining to Hurricane Maria are from the timeframe of September 15 through October 14, 2017. Similarly, tweets pertaining to the Puerto Rico earthquake were collected from January 7 through February 6, 2020. These tweets were then analyzed for their content, number of retweets, and the geotag associated with the author of the tweet. We counted the occurrence of key words in topics relating to preparation, response, impact, and recovery, and graphed this data using Python and Matplotlib. Additionally, using a Twitter crawler, we extracted a large dataset of tweets by users that used geotags. These geotags are used to examine location changes among the users before, during, and after each natural disaster. Finally, after performing these analyses, we developed easy-to-understand visuals and compiled these figures into a poster.
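A minimal sketch of the keyword-counting and plotting step; the topic lexicon below is a stand-in for the project's actual preparation/response/impact/recovery word lists:

```python
import matplotlib.pyplot as plt

# Toy topic lexicon and tweets; the real lists and datasets are far larger.
topics = {"preparation": ["supplies", "evacuate"],
          "recovery": ["rebuild", "power restored"]}
tweets = ["gathering supplies before the storm", "crews rebuild the bridge"]

# Count how many tweets mention at least one word from each topic.
counts = {topic: sum(any(word in tweet for word in words) for tweet in tweets)
          for topic, words in topics.items()}

plt.bar(counts.keys(), counts.values())
plt.ylabel("tweets mentioning topic")
plt.show()
```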
Using these figures and graphs, we compared the two datasets in order to identify any significant differences in behavior and response. The main differences we noticed stemmed from two key reasons: hurricanes can be predicted whereas earthquakes cannot, and hurricanes are usually an isolated event whereas earthquakes are followed by aftershocks. Thus, the Hurricane Maria dataset experienced the highest amount of tweet activity at the beginning of the event and the Puerto Rico earthquake dataset experienced peaks in tweet activity throughout the entire period, usually corresponding to aftershock occurrences. We studied these differences, as well as other important trends we identified.
http://hdl.handle.net/10919/98251
Puerto Rico
earthquake
Hurricane Maria
Topic Analysis
geotag
geolocation
social media
twitter
disaster
behavior
Twitter Disaster Behavior
oai:vtechworks.lib.vt.edu:10919/479212023-11-29T16:42:02Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Lopez-Gomez, Austin
author
Singh, Divit P.
author
2014-05-08
Concept:
KindaRight is a collaborative, open network for artists of all disciplines. We emphasize collaboration between artists of all disciplines because we firmly believe that art comes from inspiration, and inspiration comes from people. The more artists know and see, the better and more beautiful the art they produce. We believe that the people best qualified to critique art are artists themselves, which is why KindaRight also revolves around a merit system showing your status as an artist, weighted both by how many people have liked you and by the respective merit of those people. Finally, we are first and foremost a network: for connecting those creating art to those buying art; for discovering new art and new talent; and for enabling the entire art community to work together.
Current Status:
As of the end of this semester, we have implemented a full user experience for uploading and sharing photographs. We plan to continue this project and implement a design that is more closely related to our vision. We have included various milestone markers, including our midterm and final presentations, which detail our status at those points in time. We also included our poster from VTURCS, which gives a good overall description of our project and where our future work will be focused. Lastly, we have included our final report, a comprehensive documentation of everything that we have built this semester.
Eventually, our website will be open: kindaright.com
http://hdl.handle.net/10919/47921
Collaborative network for artists and patrons
website
KindaRight
oai:vtechworks.lib.vt.edu:10919/709552023-11-29T16:42:03Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Sharma, Divya
author
Patha, Laxmi Harshitha
author
Sethi, Gurkiran
author
Kotagiri, Pranavi
author
2016-05-10
The goal of our project is to construct video vignettes of scenarios illustrating different types of micro-aggressions. Micro-aggressions are the everyday verbal, nonverbal, and environmental slights, snubs, or insults, whether intentional or unintentional, that communicate hostile, derogatory, or negative messages to target persons based solely upon their marginalized group membership (from Diversity in the Classroom, UCLA Diversity & Faculty Development, 2014).
Interactions and conversations between peers and faculty are never-ending. The biggest concern related to micro-aggression is that individuals may not even know that they are committing a micro-aggression, which is why we want to inform as many individuals as we can about this topic. In fact, a micro-aggression can even occur when someone is giving a compliment. By raising micro-aggression awareness we can have safer, more alert and more intelligent interactions.
We created videos displaying different types of micro-aggression events. We have completed shooting and editing of three videos, each with length in the 1-3 minute range. The editing was done in the Innovation Space Center in Torgersen Hall, and the resulting videos are available through YouTube. We have also completed preliminary work on the fourth video. The raw files for the videos are located in the Innovation Space in the "Save Work Here" folder under the names "Sharma" and "Patha". With these videos the overall goal is to garner attention and awareness regarding micro-aggressions that take place on a day-to-day basis. We hope that our videos can be a stepping-stone to finding a solution to an everyday problem, possibly inspiring others to produce additional videos on this important topic.
http://hdl.handle.net/10919/70955
Micro-Aggression
Video Vignettes
Micro-Aggression Video Vignettes
oai:vtechworks.lib.vt.edu:10919/1099792023-11-29T16:42:04Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Zhu, Kecheng
author
Gager, Zachary
author
Neal, Shelby
author
Li, Jiangyue
author
Peng, you
author
2022-05-09
Electronic theses and dissertations (ETDs) contain valuable knowledge that can be useful in a wide range of research areas. To effectively utilize the knowledge contained in ETDs, the data first needs to be parsed and stored in an XML document. However, since most of the ETDs available on the web are presented as PDF, parsing them to make their data useful for downstream tasks -- including question-answering, figure search, table search, and summarizing -- is a challenge. Information search and extraction need contextual information, but such semantic information is hidden in PDF documents. In contrast, XML can explicitly share semantic information: the structure within XML documents can enforce semantic continuity within the tag elements. Accordingly, knowledge graphs can be more easily built from XML, rather than PDF, representations. The goal of this project was to extract different elements of scholarly documents such as metadata (title, authors, year), chapter headings and subheadings, equations, figures (and captions), tables (and captions), and paragraphs, and then package them into an XML document. Subsequently, a pipeline responsible for the conversion and a dataset to support the object detection step were developed. Over the semester, 200 ETDs, both born-digital and scanned, were annotated using an online tool called RoboFlow. A model based on Facebook's open-source object detection framework, Detectron2, was trained with the created dataset. In addition, a pipeline utilizing the model was built that converts an ETD in PDF into an XML document for future downstream tasks, and into HTML for visualization. A dataset consisting of 200 annotated ETDs and a working pipeline were delivered to the client. From the project, the Object Detection Team learned a number of libraries related to the task, built a sense of the importance of version control, and learned how to split a large task into smaller, more approachable pieces.
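An inference-time sketch with Detectron2, the framework named above; the config file, weights path, and class interpretation are placeholders for the team's fine-tuned page-element model:

```python
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

# Load a trained page-element detector; both paths are hypothetical.
cfg = get_cfg()
cfg.merge_from_file("configs/etd_page_elements.yaml")
cfg.MODEL.WEIGHTS = "output/model_final.pth"
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5

predictor = DefaultPredictor(cfg)
page = cv2.imread("etd_page_001.png")      # one rendered PDF page
outputs = predictor(page)
# Predicted class ids would map to figure / table / paragraph / heading, etc.
print(outputs["instances"].pred_classes)
```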
http://hdl.handle.net/10919/109979
Object Bounding Box Detection
OCR
Computer Vision
R-CNN Model
Content Classification
RoboFlow
XML
HTML
Python
Object Detection
oai:vtechworks.lib.vt.edu:10919/709432023-11-29T16:42:04Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Nimitz, Sarah
author
Forsyth, Duke
author
Knittle, Andrew
author
2016-05-04
Project deliverables are provided, including a detailed description of the creation of a polling and parsing system for keeping track of severe weather warnings, as delivered by the National Weather Service, and an interface to allow the user to view a representation of Doppler radar data in three dimensions. The report describes the roles of the team members, the work accomplished over the Spring 2016 semester, and the methods by which the team accomplished this work.
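As a present-day illustration of the polling-and-parsing idea, the sketch below watches the public api.weather.gov alerts endpoint; the 2016 system itself parsed raw National Weather Service text products with regular expressions and fed a C#/Unity visualization:

```python
import time
import requests

# Poll the NWS alerts API and print headlines for newly seen warnings.
def poll_warnings(interval=60):
    seen = set()
    while True:
        resp = requests.get("https://api.weather.gov/alerts/active",
                            params={"event": "Severe Thunderstorm Warning"},
                            headers={"User-Agent": "weather-viz-demo"},
                            timeout=10)
        for feature in resp.json().get("features", []):
            alert_id = feature["id"]
            if alert_id not in seen:
                seen.add(alert_id)
                print(feature["properties"]["headline"])
        time.sleep(interval)
```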
http://hdl.handle.net/10919/70943
Weather
Doppler
ICAT
The Cube
C#
Unity
Regular Expressions
Regular Expression
Regex
Virtual Reality
Augmented Reality
Severe Weather Statement
Weather Warning
Warning
3-Dimensional Weather Visualisation
oai:vtechworks.lib.vt.edu:10919/982402023-11-29T16:42:05Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Bui, Dat
author
2020-05-11
DeepSqueak is a deep-learning based system for detection and analysis of ultrasonic vocalizations.
The original DeepSqueak model was created by Kevin R. Coffey, Russel G. Marx, and John F. Neumaier.
Rodents engage in social communication through ultrasonic vocalizations, and Dr. Bowers is utilizing DeepSqueak's technology to study rats in his lab.
AviSoft is another software package that has been used by Dr. Bowers, to record and manually analyze sound files gathered from the rats.
Dr. Bowers would like to use all available data to train DeepSqueak's classification model, to further improve its accuracy, and to reduce manual analysis and labeling work.
The purpose of the Vocalization Detection project is to assist with that effort, leveraging the available data, the two software packages, and our processing.
Initial efforts involved studying DeepSqueak, AviSoft, and the available data files.
Further exploration considered automating use of the tools, and helping with the training of DeepSqueak models.
Then the work pivoted to developing matching methods that take data processed with AviSoft and transform it into labeled data to improve the training of DeepSqueak models.
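A minimal sketch of that matching idea: align AviSoft-labeled call intervals with DeepSqueak detections by temporal overlap, so the AviSoft labels can seed DeepSqueak training; the interval tuples are an assumed format, not either tool's file layout:

```python
# Tuples are (start_sec, end_sec) for detections and
# (start_sec, end_sec, label) for AviSoft-annotated calls.
def overlaps(a_start, a_end, b_start, b_end):
    return min(a_end, b_end) - max(a_start, b_start) > 0

def label_detections(detections, avisoft_calls):
    labeled = []
    for d_start, d_end in detections:
        for a_start, a_end, label in avisoft_calls:
            if overlaps(d_start, d_end, a_start, a_end):
                labeled.append((d_start, d_end, label))
                break
    return labeled

print(label_detections([(0.10, 0.25)], [(0.12, 0.22, "22kHz_call")]))
```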
http://hdl.handle.net/10919/98240
Deep learning (Machine learning)
Classification
Rat
Vocalization
DeepSqueak
AviSoft
MATLAB
Vocalization Detection
oai:vtechworks.lib.vt.edu:10919/1070152023-11-29T16:42:06Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Kadam, Archita
author
McKnight, Evan
author
Chu, Yifan
author
Patel, Pranav
author
2021-12-14
The purpose of our project is to display a model of the spread of an infectious disease throughout the Commonwealth of Virginia. The user interface should encompass the ability to display what-if scenarios, adjust relevant parameters, and visualize resulting output. This report will further explain the background of the project, our implementation, and what our group has learned throughout this experience.
We created a product to meet the need of visualizing the given data and formula. It is provided in the form of a webpage, or dashboard, that displays a graphical model of Chronic Wasting Disease (CWD). The results and predictions drawn from this information are shown on the dashboard in order to track and see the spread of CWD. The user or client is able to manipulate different sliders to select the data required; a statistical model then transforms the data to output the information that is needed.
The dashboard is hosted using R Shiny. The client, Professor Luis Escobar, requested we use R Shiny as it was the most familiar option for both parties. We created the dashboard using the model provided to us and data collected over a period of time. The R Shiny dashboard presents the change in the data as the sliders move, making the effects of the statistical model easy to see.
http://hdl.handle.net/10919/107015
R-Shiny
dashboard
statistics
Chronic Wasting Disease
sliders
Virginia
deer
visualization
Disease Spread Simulator II
oai:vtechworks.lib.vt.edu:10919/1032712023-11-29T16:42:07Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Weedon, Daniel B.
author
Olsen, Daniel
author
2021-05-03
The knowledge graph project is a two-component project: the first component is concerned with the back end Grakn.AI while the second component deals with the front end service registration. The goal of the project is to build a knowledge graph that represents how user information goals are connected to one another. The knowledge graph is connected to a workflow management system that allows developers to register their services and add them to the knowledge graph.
A knowledge graph is a directed graph data model that stores interlinked entities. Storing data in the knowledge graph allows you to see which other entities the data is connected to and how they are connected. Through this we see the power of a fully fleshed-out knowledge graph. A user may wish to complete a task but have no knowledge about how to complete it or the tools needed to do so. They can query the knowledge graph for this task and thus retrieve the workflow necessary to perform it, including the input files, output files, libraries, functions, and environments.
During this project, research was conducted on both the back end and the front end. On the back end, our team researched how to search through the knowledge graph with Grakn. On the front end, we searched for a suitable method to visualize the knowledge graph. As a result, the Grakn database is able to query the knowledge graph, and a Python API is connected to Grakn to allow the front end to display an up-to-date version of the knowledge graph.
http://hdl.handle.net/10919/103271
Knowledge graph
Visualization
Grakn.AI
Grakn
Workflow
Knowledge Graph
oai:vtechworks.lib.vt.edu:10919/1070682023-11-29T16:42:08Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Nguyen, Bryant
author
Muddam, Maanas
author
Kim, James
author
Moratti, Kristopher
author
2021-12-08
Professor Escobar requested a redesign of the current Wildlife Disease Research Website to meet modern web design standards. The report contains a description of the requirements for designing the new website, including both aesthetic and functional requests. These requests were realized using the MERN stack, and a thorough description of how this stack was used to redesign the website is included in the report, along with descriptions of the testing process and the deployment strategy. Also included are a User's Manual and a Developer's Manual. The User's Manual describes the user flow through the website, with descriptions of the front-end functionality on each page. The Developer's Manual provides a technical description of how the front end and back end were coded, including a thorough description of how to modify the code to change website functionality. The presentation contains a high-level description of the contents of the report.
There are two versions of the report, identical in content but different in format; the same is true of the presentation files.
http://hdl.handle.net/10919/107068
MERN
Quality Analysis Testing
Website
Wildlife Diseases
Website Wildlife Diseases Redesign
oai:vtechworks.lib.vt.edu:10919/709642023-11-29T16:42:08Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Won, Stephen
author
2016-04-28
The major goal of the Event Based Categorization of Tweet Collections project is to help the IDEAL team to access tweet collections easily. Categorizing over 1,000 collections will aid organization, browsing, searching, and other activities. The focus of this project is to categorize each collection. The original method for categorization was to use a taxonomy scheme, but that was refined to use a tag system. This way the users will be able to see all of the collections in an organized way.
In the original planning, in addition to the categorizing, we planned to implement a user interface, as an extension of the current table of collections, to make it more interactive and easier to browse. The design for the interface is also described.
http://hdl.handle.net/10919/70964
IDEAL
tweets
tagging
categorization
collection
IDEAL Tweet Collection Categorization
oai:vtechworks.lib.vt.edu:10919/479162023-11-29T16:42:10Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Schiefer, Jeb
author
Sharma, Paul
author
2014-05-07
DSpace [1] is an open-source repository application used by many organizations and institutions. It provides a way to access and manage all kinds of digital documents. The 4624S14DSpaceEmbargo project was intended to extend the functionality of the ItemImport command-line tool. Specifically, the goal was to add the ability to embargo uploaded items until a specified date. This functionality was already implemented for the two web interfaces (XMLUI and JSPUI). DSpace is used by the Virginia Tech library in the form of VTechWorks [2].
The project was overseen initially by Keith Gilbertson and Zhiwu Xie, who work for the Virginia Tech library. Near the end of the semester we were introduced to another software developer for the library, Jay Chen. We helped Jay set up the DSpace environment on his local computer and demonstrated to him how to use the ItemImport command line tool.
Embargoes are used to limit access until a specified date. An embargo can be applied as a resource policy at the item, group, or bitstream level. An item-level embargo restricts access to all of the files uploaded for a particular item. A group-level embargo restricts submissions from anyone who is a member of the specified group; by default, the Anonymous group is used. A bitstream-level embargo restricts access only to a specific uploaded file. The date given when setting an embargo must adhere to the ISO 8601 date format [3], specifically the YYYY-MM-DD, YYYY-MM, and YYYY variations.
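The three accepted variations can be checked with a handful of date patterns. A sketch of that validation logic in Python (illustrative only; the actual implementation lives in DSpace's Java code):

```python
from datetime import datetime

# Accepted embargo formats, per ISO 8601: full date, year-month, or year only.
FORMATS = ("%Y-%m-%d", "%Y-%m", "%Y")

def parse_embargo_date(text):
    """Return a datetime for any accepted ISO 8601 variation, else raise."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"not an accepted ISO 8601 embargo date: {text!r}")

print(parse_embargo_date("2015-06-01"))  # full date
print(parse_embargo_date("2015-06"))     # month precision
print(parse_embargo_date("2015"))        # year precision
```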
The deliverables for this project were the source and this documentation. Source code will be available on VTechWorks as well as GitHub. The GitHub repository [4] will be more up to date than the VTechWorks copy because we will continue some work on the project after the due date for this project based on feedback from the DSpace developers. The JIRA ticket for this feature to be implemented in DSpace 5.0 is DS-1996 [5].
[1] DuraSpace, “DSpace”, 2014, http://dspace.org/
[2] Virginia Tech, “VTechWorks”, 2014, http://vtechworks.lib.vt.edu/
[3] ISO, “Date and time format - ISO 8601,” 2014, http://www.iso.org/iso/home/standards/iso8601.htm
[4] GitHub, “jebschiefer/DSpace,” 2014, https://github.com/jebschiefer/DSpace/
[5] DuraSpace JIRA, “[DS-1996] Embargo Support in ItemImport,” 2014, https://jira.duraspace.org/browse/DS-1996
http://hdl.handle.net/10919/47916
dspace
CS4624
4624S14DSpaceEmbargo
oai:vtechworks.lib.vt.edu:10919/832142023-11-29T16:42:11Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Eason, Andrew D.
author
Cianfarini, Kevin M.
author
Hansen, Marshall C.
author
Davies, Shane J.
author
2018-05-07
This project is focused on the culture and trends of the Triple Crown Trails (Appalachian Trail, Pacific Crest Trail, and Continental Divide Trail). The goal of this project is to create a large collection of forum and blog posts related to these trails through web crawling and internet searching. One reason for this project is to assist our client, Abigail Bartolome, with her Master's Thesis, which focuses on the trends and ways of life on the Triple Crown Trails; our tool will help her with that work. The impact of our project is that it will allow our client to sift through information much faster to find what she does and does not need for her thesis, instead of wasting time searching through countless entries with irrelevant information. Abigail will also be able to narrow in on the kind of information she wants through our tagging system. We have provided the date, title, and author of each post so she can immediately see whether an article has relevant information and was posted in an applicable time frame.
The project has two main focuses, the frontend and the backend. The frontend is an easy-to-use interface for Abigail. It allows her to search for specific tags, which filter the blog posts based on the information she seeks. The tags are generated automatically from the content of all of the forums and blogs together, making them specific enough to surface the kind of content our client desires. When she finishes adding tags, she can search for blogs or forums that relate to the tagged topics. The page displays them in a neat format, with each article's title hyperlinked so she can click through to the article itself, along with the author, date, and source of the post.
The backend is where all the heavy lifting is done, though it is invisible to the client. This is where we go through each of the blog or forum websites fed into the web crawler and store all of the relevant information in our database. The backend is also where the tagging system is implemented and where tags are generated and applied to blog posts. WordPress and BlogSpot (for the most part) paginate blogs in a uniform way, so our web crawler acts accordingly based on which site it is crawling and can continue until there are no more blogs on that site. All of the blog posts, contents, pictures, tags, URLs, etc. are stored in the backend database and then linked to our frontend so we can display everything neatly and organized to Abigail's liking. From 31 sources we have collected 3,423 blog posts, to which 87,618 tags have been assigned.
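A condensed sketch of that pagination walk, using requests and BeautifulSoup; the /page/N/ URL pattern matches typical WordPress sites, but the selectors and the example URL are generic assumptions rather than the project's exact code:

```python
import requests
from bs4 import BeautifulSoup

def crawl_wordpress_blog(base_url):
    """Walk /page/N/ until the site runs out of pages, yielding post metadata."""
    page = 1
    while True:
        resp = requests.get(f"{base_url}/page/{page}/", timeout=10)
        if resp.status_code == 404:  # WordPress returns 404 past the last page
            break
        soup = BeautifulSoup(resp.text, "html.parser")
        for post in soup.select("article"):          # assumed post container
            heading = post.find("h2")
            link = heading.find("a") if heading else None
            yield {
                "title": heading.get_text(strip=True) if heading else None,
                "url": link["href"] if link else None,
            }
        page += 1

for post in crawl_wordpress_blog("https://example.wordpress.com"):
    print(post)
```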
Together, the frontend and the backend provide Abigail with a method to both search and view blog post content in an efficient manner.
http://hdl.handle.net/10919/83214
Trail
Blogs
Web Scraping
Django
Python
Presentation
Report
Application
Trail Culture
Blog and Forum Collection for Trail Study
oai:vtechworks.lib.vt.edu:10919/220602023-11-29T16:42:11Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
2013-05-18
This project was in support of the educational activities of the Computing Research Association (CRA-E). The main point of the project was to collect data associated with electronic theses and dissertations (ETDs) to help determine why graduate students in computing go into computing research. The deliverables include a database of the data extracted from the ETDs analyzed and a framework for machine learning and manual approaches to this data extraction.
To accomplish these objectives, ETDs from North Carolina State University (NCSU), Florida State University (FSU), Auburn University (AU), Wake Forest University (WFU), and Virginia Tech (VT) were analyzed and results were inserted into the database. The Extensible Markup Language (XML) was decided upon as the structuring format for the data extracted from ETDs, and a tag structure was created utilizing biographical, educational, and institutional data from each ETD. Some of the tags included: author name, title of the paper, year published, undergraduate institution of the author, etc. XML was chosen because of its prevalence in the ETD field, its structural properties, and its ease of use. These tags were used to create the attributes for each entry in the database in Microsoft Access. Access was chosen mostly because of convenience and easy porting of tags into the system. However, the database could be moved into another system quite easily. Challenges that arose included missing data or insufficient information in various areas.
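For illustration, a record following the tag structure described above can be built with Python's standard library; the tag names below mirror the listed attributes, and the values are invented:

```python
import xml.etree.ElementTree as ET

# Illustrative per-ETD record using tags like those described above.
etd = ET.Element("etd")
ET.SubElement(etd, "author").text = "Jane Doe"
ET.SubElement(etd, "title").text = "A Study of Information Retrieval"
ET.SubElement(etd, "year").text = "2012"
ET.SubElement(etd, "undergrad_institution").text = "Virginia Tech"
ET.SubElement(etd, "institution").text = "NCSU"

print(ET.tostring(etd, encoding="unicode"))
```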
The second deliverable took the form of instructions (pg. 4 in the report) that could be given to an Amazon Mechanical Turk user on how to extract information. These instructions were created to increase speed and decrease errors in manual data extraction. It was found that the basic structure of most ETDs is similar and normally follows this approximate order (depending on institution of origin): title page, table of contents, abstract, actual content, biography, acknowledgements, and resume (not normally present). Of these, all but the table of contents and the paper itself contain information required for the database. The instructions provide the most common locations for each tag/attribute and alternate locations (if any were found). They also instruct the Mechanical Turk user what to do in case of missing data for each attribute.
http://hdl.handle.net/10919/22060
Amazon Mechanical Turk
CRA
Microsoft Access
Machine learning
XML
ETD
Thesis
Dissertation
Database
Computing
Research
Joseph Luke
Lamont Banks
Database Creation and Information Extraction from ETDs for CRA-E
oai:vtechworks.lib.vt.edu:10919/982392023-11-29T16:42:12Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Meno, Emma
author
Vincent, Kyle
author
2020-05-12
The Twitter-Based Knowledge Graph for Researchers project is an effort to construct a knowledge graph of computation-based tasks and corresponding outputs. It will be utilized by subject matter experts, statisticians, and developers. A knowledge graph is a directed graph of knowledge accumulated from a variety of sources. For our application, Subject Matter Experts (SMEs) are experts in their respective non-computer science fields, but are not necessarily experienced with running heavy computation on datasets. As a result, they find it difficult to generate workflows for their projects involving Twitter data and advanced analysis. Workflow management systems and libraries that facilitate computation are only practical when the users of these systems understand what analysis they need to perform. Our goal is to bridge this gap in understanding. Our queryable knowledge graph will generate a visual workflow for these experts and researchers to achieve their project goals.
After meeting with our client, we established two primary deliverables. First, we needed to create an ontology covering the Twitter-related questions an SME might want to answer. Second, we needed to build a knowledge graph based on this ontology and produce a set of APIs to trigger network algorithms based on the information queried from the graph. An ontology is simply the class structure/schema for the graph. Throughout future meetings, we established more specific additional requirements. Most importantly, the client stressed that users should be able to bring their own data and add it to our knowledge graph. As more research is completed and new technologies are released, it will be important to be able to edit and add to the knowledge graph. Next, we must be able to provide metrics about the data itself. These metrics will be useful for both our own work and future research surrounding graph search problems and search optimization. Additionally, our system should provide users with information regarding the original domain that the algorithms and workflows were run against, so they can choose the best workflow for their data.
The project team first conducted a literature review, reading reports from the CS5604 Information Retrieval courses in 2016 and 2017 to extract information related to Twitter data and algorithms. This information was used to construct our raw ontology in Google Sheets, which contained a set of dataset-algorithm-dataset tuples. The raw ontology was then converted into nodes and edges CSV files for building the knowledge graph.
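A sketch of that tuple-to-CSV conversion; the column names and example rows are assumptions for illustration:

```python
import csv

# Raw ontology rows: (input dataset, algorithm, output dataset).
tuples = [
    ("raw_tweets", "remove_retweets", "deduped_tweets"),
    ("deduped_tweets", "build_mention_network", "mention_graph"),
]

# Every dataset and algorithm becomes a node; each tuple yields two edges.
nodes = sorted({name for row in tuples for name in row})

with open("nodes.csv", "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["id"])
    w.writerows([n] for n in nodes)

with open("edges.csv", "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["source", "target"])
    for in_ds, algo, out_ds in tuples:
        w.writerow([in_ds, algo])   # dataset feeds algorithm
        w.writerow([algo, out_ds])  # algorithm produces dataset
```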
After implementing our original solution on a CentOS virtual machine hosted by the Virginia Tech Department of Computer Science, we transitioned our solution to Grakn, an open-source knowledge graph database that supports hypergraph functionality. When finalizing our workflow paths, we noted some nodes depended on completion of two or more inputs, representing an "AND" edge. This phenomenon is modeled as a hyperedge with Grakn, initiating our transition from Neo4J to Grakn. Currently, our system supports queries through the console, where a user can type a Graql statement to retrieve information about data in the graph, from relationships to entities to derived rules. The user can also interact with the data via Grakn's data visualizer, Workbase. The user can enter Graql queries to visualize connections within the knowledge graph.
http://hdl.handle.net/10919/98239
Knowledge Graph
Ontology
Subject Matter Experts
Twitter
Twitter-Based Knowledge Graph for Researchers
oai:vtechworks.lib.vt.edu:10919/709422023-11-29T16:42:13Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Paradiso, Matthew
author
Morrison, Matthew
author
Suriano Siu, Julio
author
2016-05-04
Described is a project on the development of the SafeRoad software application; the report provides a reference for future work. This project was completed as a capstone requirement for CS 4624 (Multimedia, Hypertext, and Information Access) at Virginia Tech, guided by the client.
SafeRoad is a software application designed to analyze the NHTSA vehicle complaint database, determine the most common complaints, and predict recalls based on these complaints. The goal of the application is to make our vehicles and roads safer and prevent the loss of life.
The software has been developed to be used by data analysts for automotive manufacturers and governing agencies such as the NHTSA. The software can be run either preemptively or in response to a complaint or series of complaints regarding an automobile or one of its components. The results of the program can lead to the issuing of a recall before more serious consequences occur.
The project was developed using Java, database connectivity, and machine learning algorithms. A classifier training set was created and included with the source code. The final product has proven able to predict recalls with an accuracy significantly higher than what was required.
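The keywords below name Naive Bayes as the classifier. A minimal sketch of such a complaint classifier, written in Python with scikit-learn for brevity (the project itself is in Java, and the toy complaints and labels here are invented for illustration):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training set: complaint text -> whether a recall followed (hypothetical).
complaints = [
    "engine stalls at highway speed",
    "airbag failed to deploy in crash",
    "radio display flickers occasionally",
    "brake pedal goes to the floor",
]
recalled = [1, 1, 0, 1]

# Vectorize the free-text complaints, then fit a Naive Bayes classifier.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(complaints, recalled)

print(model.predict(["sudden loss of braking power"]))  # -> [1]
```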
http://hdl.handle.net/10919/70942
Machine learning
Natural Language Processing
Database
NHTSA
vehicle complaint database
automobile recalls
automotive manufacturers
classifier training set
Naive Bayes
SafeRoad
oai:vtechworks.lib.vt.edu:10919/1099802023-11-29T16:42:14Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Egnor, Rachel
author
Cochran, Tress
author
2022-05-08
The rise of the Covid-19 disease has brought challenges and opportunities that we have not faced before. With masks now part of daily life, reading lips to communicate when the surrounding environment has increased noise levels is a simple aid we can no longer fully use. Removing masks, however, does not eliminate the underlying difficulty of communicating in a noisy environment. To address this, we have created and implemented an Android-based phone application, PhonEtech, intended to facilitate and streamline communication between a variety of users, specifically when verbal communication is not ideal. Over the course of a semester, our team designed, developed, and tested this application, producing a usable and functional app. The finished product is a phone app that can be developed further to offer a more customizable experience with new features. PhonEtech is not a replacement for verbal communication but an aid for users facing various communication issues and roadblocks.
http://hdl.handle.net/10919/109980
Mobile Application
Android
Speech Assistance
PhonEtech: Recording Audio and Displaying Accompanying Text
oai:vtechworks.lib.vt.edu:10919/1150152023-11-29T16:42:14Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Todd, Jackson
author
2023-05-10
The Robotics and Sensorimotor Control Laboratory (RoSenCo), led by Dr. Netta Gurari, in the Department of Biomedical Engineering and Mechanics, is dedicated to conducting neurological research with stroke survivors. Stroke is a leading cause of disability worldwide, and RoSenCo aims to contribute to the development of treatments by studying the effects of stroke on tactile perception. RoSenCo researchers have designed a series of experiments to advance their goals, and this project focuses on developing software to facilitate these experiments in a flexible and expandable manner. The software, known as RoSenCo Experiment Manager (RoSenCoExMan), has been implemented to control actuators and collect data from sensors at a rate of 1600 Hz. It also provides real-time graphing of selected data streams, displays text-to-speech-powered audio-visual instructions for participants, and saves collected data in the required format for subsequent analysis. RoSenCoExMan is written in Python, utilizing various libraries for implementing features such as graphing, hardware access, and text-to-speech capabilities. The software employs multi-processing techniques to achieve the required performance. Opportunities for future work include extending the audio-visual participant feedback functionality to enable experiments utilizing more dynamic visuals, and the addition of a graphical user interface.
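A minimal sketch of the producer/consumer split that multi-processing enables here, with a stubbed sensor standing in for the real hardware interface (the 1600 Hz rate comes from the abstract; the queue-based handoff and everything else is an illustrative assumption):

```python
import multiprocessing as mp
import random
import time

RATE_HZ = 1600  # acquisition rate stated in the abstract

def acquire(queue):
    """Producer process: sample a (stubbed) sensor at a fixed rate.

    A real system would rely on hardware-timed acquisition; time.sleep()
    granularity is too coarse to guarantee 1600 Hz on its own.
    """
    period = 1.0 / RATE_HZ
    while True:
        queue.put((time.monotonic(), random.random()))  # stub sensor reading
        time.sleep(period)

def consume(queue, n):
    """Consumer process stand-in for the graphing/saving side."""
    for _ in range(n):
        t, value = queue.get()
        print(f"{t:.4f}s -> {value:.3f}")

if __name__ == "__main__":
    q = mp.Queue()
    producer = mp.Process(target=acquire, args=(q,), daemon=True)
    producer.start()
    consume(q, n=5)
```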
http://hdl.handle.net/10919/115015
Neuroscience
Python
Stroke
Magnetic Resonance Imaging
Text to speech
Pneumatics
Real-time plotting
HCIInterfaceForStroke
oai:vtechworks.lib.vt.edu:10919/1156432023-11-29T16:42:15Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Blakemore, Talia
author
Phan, Long
author
2023-05-11
Our project involves expanding upon a previous recommendation system built by CS 5604 students. Previous CS 5604 teams created a chapter summarization model to generate summaries for over 5000 Electronic Theses and Dissertations (ETDs). We used these summaries to fuel our recommendation system. Using chapter summaries improved our ability to predict resources that a user may be interested in, because we narrowed our focus to individual chapters rather than the abstract of the whole paper. Authors will benefit from this recommendation system because their work will be more accessible. We provide a web page for users to explore how different clustering algorithms impact the search results, giving the user the ability to modify parameters such as the number of clusters and minimum cluster size. This web page will appeal to niche users interested in experimenting with recommendation systems, allowing them to fine-tune the recommendation results. For future work, we recommend continuing to explore different clustering algorithms, as well as using our chapter recommendations to fuel a recommendation list based on each chapter. During this project, we learned about clustering algorithms, working as a team, and starting a project from the ground up. A previous CS 5604 team built a stand-alone website that supports search, a recommendation system, and the ability to experiment with different search methods. During this semester, we expanded upon the existing website, using clustering algorithms to experiment with the recommendation system. Users may specify different parameters to understand how different clustering algorithms may change the recommendations.
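The keywords below name KMeans and DBScan. A minimal sketch of the clustering-based recommendation idea with scikit-learn, where the number of clusters stands in for one of the user-adjustable parameters (the summaries are placeholders, not real ETD chapters):

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Placeholder chapter summaries standing in for the real ETD corpus.
summaries = [
    "neural networks for image classification",
    "convolutional models for object detection",
    "soil moisture sensing in precision agriculture",
    "crop yield prediction from satellite data",
]

# Vectorize summaries, then cluster; n_clusters is the adjustable parameter.
X = TfidfVectorizer().fit_transform(summaries)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Recommend chapters from the same cluster as the one being read.
query = 0
recs = [i for i, lab in enumerate(labels) if lab == labels[query] and i != query]
print(recs)  # likely [1]
```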
http://hdl.handle.net/10919/115643
ETDs
Recommendation System
Machine Learning
clustering
KMeans
DBScan
ETD Recommendation System
oai:vtechworks.lib.vt.edu:10919/1099872023-11-29T16:42:16Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Amados, Zach
author
Gantulga, Tengis
author
Ramos, Renzo
author
Duong, Brian
author
2022-05-10
Online cyber threats have been an increasing concern since the dawn of the Internet. To combat this problem, USCyberRange, a cybersecurity education team, provides online courses and exercises to teach students about cybersecurity issues and solutions. Our team partnered with FourDesign, a graphic design team, to give the USCyberRange website a responsive design across devices with different screen sizes. Our team was provided with the initial website built by last semester's group, as well as new Figma designs crafted by FourDesign. Throughout the course of the semester, our team used the WordPress interface and the Elementor plugin to implement many of the new Figma designs and make the corresponding pages responsive on desktop, mobile, and tablet devices. Our team then received client feedback on our implementation and made adjustments accordingly. The finalized product is a WordPress website that could be further developed to make all front-end pages responsive and to include their corresponding back-end functionality.
Our team has delivered the WordPress website, this report, and the final slide presentation.
http://hdl.handle.net/10919/109987
WordPress
Website
Computer Science
Capstone
PHP
Software Engineering
WebsiteCyberRangeUS2
oai:vtechworks.lib.vt.edu:10919/1070022023-11-29T16:42:17Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Ali, Syed A.
author
Gillies, Liam W.
author
Mulvaney, Elizabeth M.
author
2021-12-08
The goal of the Blockchain Etextbook group was to develop new content related to Ethereum under the Blockchain section of the OpenDSA textbook. OpenDSA is designed to inform students and researchers about key topics in computer science. This team specifically covered content related to Blockchain, a new area of study in computer science. The expected audience is students and researchers who are working in or studying Blockchain topics. This textbook aims to provide a single place for referencing material related to the topic.
Under the supervision of Dr. Cliff Shaffer at Virginia Tech, the team developed Blockchain content for the textbook. This included creating interactive exercises for users to learn with and writing prose based on research into resources about Blockchain and Ethereum. The original project description covered topics broadly within Blockchain, but Dr. Shaffer narrowed the scope with the team to Ethereum and related topics.
The team wrote textbook content related to the concepts of Ethereum, including proof of stake, hard forks, crypto hacking, the Ethereum Virtual Machine (EVM), and Gas. Our deliverables were reStructuredText files and HTML exercises related to these topics. In addition, the report gives users a tutorial on how to use the chapters within the textbook, as well as giving future developers details on how to modify and improve chapters within the books. The team learned about the difficulties of writing a textbook on new material, since information on such topics is often limited or conflicting.
http://hdl.handle.net/10919/107002
Blockchain
OpenDSA
Ethereum
Consensus Algorithms
HTML
Javascript
Ethereum Virtual Machines
Hard Forks
Textbook
Crypto Hacking
Blockchain Etextbook
oai:vtechworks.lib.vt.edu:10919/709322023-11-29T16:42:18Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Stulga, Steven
author
2016-05-04
Developing and testing big data software requires a big dataset. The full English Wikipedia dataset would serve well for testing and benchmarking purposes. Loading this dataset onto a system, such as an Apache Hadoop cluster, and indexing it into Apache Solr, would allow researchers and developers at Virginia Tech to benchmark configurations and big data analytics software. This project is about importing the full English Wikipedia into an Apache Hadoop cluster and indexing it with Apache Solr so that it can be searched.
A prototype was designed and implemented. A small subset of the Wikipedia data was unpacked and imported into Apache Hadoop's HDFS. The entire Wikipedia dataset was also downloaded onto a Hadoop cluster at Virginia Tech. A portion of the dataset was converted from XML to Avro and imported into HDFS on the cluster.
Future work would be to finish unpacking the full dataset and repeat the steps carried out with the prototype system for all of Wikipedia. Unpacking the remaining data, converting it to Avro, and importing it into HDFS can be done with minimal adjustments to the script written for this job. Run continuously, this job would take an estimated 30 hours to complete.
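A sketch of the XML-to-Avro conversion step, streaming pages with the standard library's iterparse and writing with fastavro; the element names follow the MediaWiki dump format, but the schema and file names are simplified assumptions, not the project's script:

```python
import xml.etree.ElementTree as ET
from fastavro import parse_schema, writer

# Simplified Avro schema: one record per Wikipedia page.
SCHEMA = parse_schema({
    "name": "Page", "type": "record",
    "fields": [{"name": "title", "type": "string"},
               {"name": "text", "type": "string"}],
})

def pages(dump_path):
    """Stream <page> elements so the full dump never sits in memory."""
    for _, elem in ET.iterparse(dump_path):
        if elem.tag.endswith("page"):  # tolerate the dump's XML namespace
            yield {"title": elem.findtext(".//{*}title", default=""),
                   "text": elem.findtext(".//{*}text", default="")}
            elem.clear()               # free parsed elements as we go

with open("wikipedia.avro", "wb") as out:
    writer(out, SCHEMA, pages("enwiki-pages-articles.xml"))
```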
http://hdl.handle.net/10919/70932
Wikipedia
Hadoop Cluster
Solr
XML
Avro
Apache
English Wikipedia on Hadoop Cluster
oai:vtechworks.lib.vt.edu:10919/1129152023-11-29T16:42:19Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Bagbey, Trevor
author
Betsill, Matthew
author
Elgeoushy, Omar
author
Setareh, Daniel
author
2022-12-16
Examples of applied knowledge are vital to any university student looking to develop a deep and abiding understanding of their major. Such examples lay the foundation for changing the world through novel, thought-provoking innovations and advancements by offering insight into how an idea can become a reality. This Case Study Library was developed to provide students and others with a broad collection of cases in which specific Computer Science topics were relevant in industry.
Case studies, in this context, are the multimedia presentations given by students of Virginia Tech's CS3604: Professionalism in Computing at the end of the semester. Students are instructed to pick an example from industry pertaining to the Internet, Artificial Intelligence, Intellectual Property, Commerce, or Privacy. Through thorough research and what they learn in the class itself, the students construct presentations on their topic of choice.
This project is the second iteration of the Case Study Library. After the previous project group's work, the library held more than 500 case studies by students of Professionalism in Computing, but several major aspects clearly could be improved: case studies were titled only by file name, were not classified by course topic, and had no thumbnail images. Along with this, a significant percentage of files could not be displayed due to file format issues. The burden of uploading case studies was on the professor, who needed to run a Python script to batch-upload site items. This iteration of the project began with a solid understanding of what needed to be done, including the addition of a student upload page, stylistic corrections, search parameter specification, and thumbnail images for files. Search filtering, student authentication, collections by course topic, and functionality to upload more than one case study file were also added. Changes to the site were made via a frontend administrative page as well as through additions and modifications to the code base.
With the Case Study Library having been improved, it now stands as an effective tool that current students of Professionalism in Computing can reference while they work on their own case studies.
http://hdl.handle.net/10919/112915
CS3604
Case Study
VTDLP
Computer Science
Artificial Intelligence
Privacy
ICT
Commerce
Intellectual Property
Computing
Library
CS3604 Case Study Library II
oai:vtechworks.lib.vt.edu:10919/832052023-11-29T16:42:20Zcom_10919_10194col_10919_18655
00925njm 22002777a 4500
dc
Ward, Ryan
author
Lee, Jun
author
Beard, Stuart
author
Edwards, Skylar
author
Su, Spencer
author
2018-05-07
The Global Event and Trend Archive Research (GETAR) project is supported by NSF (IIS-1619028 and 1619371) through 2019. It will devise interactive, integrated, digital library/archive systems coupled with linked and expert-curated webpage/tweet collections. In support of GETAR, the 2017 project built a tool to scrape the news to identify important global events. It generates seeds (URLs of relevant webpages, as well as Twitter-related hashtags and keywords and mentions). A display of the results can be seen from the hall outside 2030 Torgersen Hall.
This project extends that work in multiple ways. First, the quality of the earlier work has been improved; this is evident in changes to the clustering algorithm and in user interface changes to the clustering display of global events. Second, in addition to events reported in the news, trends have been identified, and a database of trends and related events was built with a corresponding user interface matching the client's preferences. Third, the results of the detection are connected to software for collecting tweets and crawling webpages, so automated daily runs find and archive webpages related to each trend and event.
The final deliverables include development of a trend detection feature using Reddit news, integration of Google Trends into trend detection, an improved k-means clustering algorithm that produces more accurate clusters, an improved UI for important global events matching what the client wanted, and an aesthetically pleasing UI to display the trend information. Work accomplished included setting up a table of tagged entities for trend detection, configuring the database for clustering and trends to work with our personal machines, and completing the deliverables. Many lessons were learned regarding the importance of using existing tools, starting early, doing research, having regular meetings, and having good documentation.
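As a toy illustration of the general idea behind trend detection (a simple frequency-spike heuristic with invented headlines, not the project's actual algorithm):

```python
from collections import Counter

def trending_terms(today_docs, baseline_docs, top_n=3):
    """Score terms by how much more often they appear today than in the baseline."""
    today = Counter(w for doc in today_docs for w in doc.lower().split())
    base = Counter(w for doc in baseline_docs for w in doc.lower().split())
    scores = {w: c / (1 + base[w]) for w, c in today.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

baseline = ["markets steady as earnings arrive", "weather mild across region"]
today = ["eclipse draws crowds", "solar eclipse visible nationwide",
         "eclipse glasses sell out"]
print(trending_terms(today, baseline))  # 'eclipse' ranks first
```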
http://hdl.handle.net/10919/83205
Trend Detection
Trends
Python
GETAR
Reddit
Google
News trends
Event Trend Detector