CS4624: Multimedia, Hypertext, and Information Access

This collection contains the final projects of the students in the course Computer Science 4624: Multimedia, Hypertext, and Information Access, at Virginia Tech. This course, taught by Professor Ed Fox, is part of the Human-Computer Interaction track; the Knowledge, Information, and Data track; and the Media/Creative Computing track. The curriculum introduces the architectures, concepts, data, hardware, methods, models, software, standards, structures, technologies, and issues involved with networked multimedia (e.g., image, audio, video) information, access, and systems; hypertext and hypermedia; electronic publishing; and virtual reality. Coverage includes text processing, search, retrieval, browsing, time-based performance, synchronization, quality of service, video conferencing, and authoring.


Recent Submissions

  • Parking Spaces Occupancy Prediction
    Across Virginia Tech’s campus, finding parking is a consistent source of frustration for students and faculty. During peak hours, locating free parking spots becomes a challenging task, leading to significant delays and increased traffic around campus. By leveraging modern data-driven technologies such as Smart City infrastructure and Intelligent Transportation, we can alleviate some of the school’s congestion and enhance the parking experience for Virginia Tech residents. The proposed solution is a web app that users can integrate into their daily commute. With the help of live data, the app will give real-time parking recommendations as well as various other helpful insights. It will analyze the live data at each of the garages to predict their occupancy at a given time of arrival. Machine learning will allow us to estimate the occupancy of each garage at a given time in the future, factor in the distance to each garage, and provide a recommendation for which garage to target. The application will also allow for more effective collection of data for parking services and could eventually take into account more factors such as schedules and live traffic.
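The recommendation idea described above can be sketched simply: predict each garage's occupancy at the arrival hour from historical observations, then trade that prediction off against distance. Everything below — the garage names, the hourly-average predictor, and the distance weight — is an illustrative assumption, not the team's actual model or data.

```python
# Hypothetical sketch: predict each garage's occupancy at the arrival
# hour from historical hourly averages, then recommend the garage with
# the best combined occupancy/distance score.

def predict_occupancy(hourly_history, hour):
    """Average of past occupancy fractions observed at this hour of day."""
    samples = hourly_history.get(hour, [])
    return sum(samples) / len(samples) if samples else 0.5  # no data -> unknown

def recommend(garages, arrival_hour, distance_weight=0.3):
    """Rank garages by predicted fullness plus a distance penalty."""
    scored = []
    for name, info in garages.items():
        occ = predict_occupancy(info["history"], arrival_hour)
        score = occ + distance_weight * info["distance_km"]
        scored.append((score, name))
    scored.sort()
    return scored[0][1]  # garage with the lowest combined score

garages = {
    "Perry": {"history": {9: [0.95, 0.90]}, "distance_km": 1.2},
    "North": {"history": {9: [0.55, 0.60]}, "distance_km": 0.8},
}
print(recommend(garages, arrival_hour=9))  # North: emptier and closer
```

A real predictor would replace the hourly averages with a trained regression model fed by the live garage counts.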
  • Visualizing eTextbook Study Sessions
    Lily Chiang, Arjun Vellanki, Egor Lukiyanov, Tuan Chau, Kavya Polina (2023-11-30)
    OpenDSA is an online platform that allows professors to create e-Textbooks for fundamental CS courses. Our project seeks to enhance OpenDSA by providing instructors with a user-friendly web interface and visualization tool. This tool allows them to understand student interactions during study sessions in three areas: Reading, Visualizations, and Exercises. The tool could lead to improvements in a student’s learning process. OpenDSA is heavily used at Virginia Tech and other universities for CS courses. It records student interactions with learning materials but doesn’t have an efficient way for instructors to understand these interactions. Our project tackles this issue by developing a web interface that visualizes student interactions. We expand upon past research by sorting interactions into Reading, Visualizations, and Exercises, displaying detailed study session data. These visualizations will give insight into whether students are active learners or credit-seekers.
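The sorting step — mapping raw interaction events into the Reading, Visualizations, and Exercises categories before charting them — can be sketched as below. The event-type names are invented for illustration; OpenDSA's real log schema will differ.

```python
# Illustrative sketch of bucketing logged interaction events into the
# three categories the visualization tool displays.

CATEGORY_OF = {
    "page-view": "Reading",
    "scroll": "Reading",
    "av-step": "Visualizations",
    "av-play": "Visualizations",
    "exercise-attempt": "Exercises",
    "exercise-correct": "Exercises",
}

def summarize_session(events):
    """Count how much of a study session fell into each category."""
    counts = {"Reading": 0, "Visualizations": 0, "Exercises": 0}
    for ev in events:
        cat = CATEGORY_OF.get(ev["type"])
        if cat:
            counts[cat] += 1
    return counts

session = [{"type": "page-view"}, {"type": "av-play"},
           {"type": "exercise-attempt"}, {"type": "exercise-correct"}]
print(summarize_session(session))
```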
  • Mass Shooting Digital Library
    In light of the escalating prevalence of mass shootings in the U.S., there is an urgent need for a structured digital repository to centralize, categorize, and offer detailed analyses of these events. This project aims to develop a comprehensive website functioning as a digital library. This library will house mass shooting objects where each object represents a specific mass shooting event, elaborating on who, what, when, where, why, and how. The website's central features will include the ability to visualize and compare various mass shooting incidents, facilitating a broader understanding of trends, patterns, and anomalies. Users will be able to explore the data via geographic visualizations, timelines, and more, providing an immersive and informative experience. Underpinning the platform, our backend system will utilize Python, Flask, and MongoDB, ensuring robust data collection and management. This data includes information fields, URL sources associated with each event, and more. On the front end, technologies like NextJS, React, and JavaScript will drive the user interface, supported by essential libraries such as React Chrono and Leaflet.js for advanced visualization. Deployment will be executed via Firebase or AWS for the frontend and Heroku for the backend. Two primary user categories have been identified: general users, who can view the data, and administrators, who can modify the contents. Ensuring the integrity of the data input, admin access will be safeguarded by authentication processes. In summary, this digital library emerges as a timely and crucial initiative in response to the rising tide of mass shootings in the U.S. This project aims to provide comprehensive insights into the tragic events that have marked the nation. Beyond its functional capabilities, the digital library strives to improve understanding, awareness, and ultimately, change in the narrative surrounding mass shootings.
  • Automated Students' short answers assessment
    Padath, Mathew; Wang, Wenmiao; Jiang, Westin; McGovern, Ryan; Wan, Yifei (2023-11-26)
    The objective of this innovative project was to create an automated web application for the assessment and scoring of computer science-related short answers. This solution directly addresses the often labor-intensive and time-consuming process of manually grading written responses, a challenge that educators across various academic disciplines frequently encounter. The developed web application stands out not just for its efficiency but also for its versatility, being applicable to a wide range of subjects beyond computer science, provided that appropriate teacher answer files are supplied. At the heart of the application lies a user-friendly interface created using ReactJS. This frontend allows educators to seamlessly upload 'teacher' and 'student' files in .tsv format. Following the upload, the application's backend, developed using Flask, takes over. It processes these submissions by comparing student responses against predefined model answers. The scoring mechanism of the application is particularly noteworthy. It employs an advanced semantic analysis approach, utilizing a pre-existing deep learning model, RoBERTa Large. This model is integral to the AutoGrader class, which is responsible for the semantic evaluation of the text. The grading logic embedded within the AutoGrader class is both innovative and sophisticated. It assesses student responses by breaking them down into phrases and then computing the semantic similarity between each phrase and the concepts outlined in the model answers. The process employs SentenceTransformer to generate text embeddings, allowing for a nuanced evaluation based on cosine similarity between vector representations. This method ensures a grading system that transcends simple keyword matching, delving into the semantic content and understanding of the student answers. The application boasts several key features that enhance user experience and provide educators with comprehensive insights into student performance. 
These include the ability to display scores and grades directly on the web application and to download detailed Grade Reports that include each question, the student's response, the grade awarded, and the model answer. Additionally, the application allows for the viewing of previous submissions and the downloading of historical documents such as past versions of the 'teacher file', the 'student file', and grade reports. In terms of future development, the project team has outlined several ambitious goals. These include implementing a dataset-driven strategy for enhancing the training of deep learning models, thereby significantly advancing the current framework. Another focus will be on allowing a variety of file types to be uploaded for both teacher and student files, thereby increasing the accessibility and usability of the system. Lastly, there are plans to update the functionality and appearance of the web application, incorporating features such as scrolling, standardized formatting, and improved design elements to enhance the overall user experience. The project was developed with the invaluable guidance and support of Dr. Mohamed Farag, a research associate at the Center for Sustainable Mobility at Virginia Tech. Dr. Farag's expertise in computer science and his commitment to educational innovation have been instrumental in steering the project towards success. In conclusion, this project marks a significant advancement in the field of educational technology, particularly in the realm of academic grading. By leveraging the power of artificial intelligence and modern web technologies, it provides an efficient, reliable, and versatile tool for educators, streamlining the grading process and offering a scalable solution adaptable to various academic contexts. The future developments outlined promise to further enhance the capabilities of this already impressive tool, pointing towards a new era in academic assessment.
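The phrase-level grading logic described above — split the student answer into phrases, embed each, and credit a model concept if some phrase clears a cosine-similarity threshold — can be sketched as follows. A toy bag-of-words embedding stands in for the SentenceTransformer/RoBERTa encodings the report describes, and the function names and threshold are illustrative assumptions.

```python
# Sketch of the AutoGrader-style scoring loop. embed() is a toy
# stand-in: the actual system encodes text with SentenceTransformer
# (RoBERTa Large) before computing cosine similarity.
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; real system uses deep sentence embeddings."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def grade(student_answer, model_concepts, threshold=0.35):
    """Credit each model concept matched by at least one student phrase."""
    phrases = [p.strip() for p in student_answer.split(".") if p.strip()]
    matched = sum(
        1 for concept in model_concepts
        if any(cosine(embed(p), embed(concept)) >= threshold for p in phrases)
    )
    return matched / len(model_concepts)

score = grade("A stack is last in first out. Push adds an element.",
              ["stack is last in first out", "push adds an element"])
print(score)  # both concepts matched -> 1.0
```

With real embeddings, paraphrases that share no keywords would still score highly — the property that lets the approach transcend keyword matching.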
  • Automated Crisis Collection Builder - Final Project Report
    Brian Hays; Alex Zhang; Mitchel Rifae; Trevor Kappauf; Parsa Nikpour (2023-11-30)
    In the contemporary digital landscape, access to timely and relevant information during crisis events is crucial for effective decision-making and response coordination. This project addresses the need for a specialized web application equipped with a sophisticated crawler system to streamline the process of collecting pertinent information related to a user-specified crisis event. The inherent challenge lies in the vast and dynamic nature of online content, where identifying and extracting valuable data from a multitude of sources can be overwhelming. This project aims to empower users by allowing them to input a list of newline-delimited URLs associated with the crisis at hand. The embedded crawler software then systematically traverses these URLs, extracting additional outgoing links for further exploration. The contents of each outgoing URL are then run through a predict function, which evaluates the relevance of each URL based on a scoring system ranging from 0 to 1. This scoring mechanism serves as a critical filter, ensuring that the collected web pages are not only related to the specified crisis event but also possess a significant degree of pertinence. We allow the user to set these thresholds, which enhances the efficiency of information retrieval by prioritizing content most likely to be valuable to the user's needs. Throughout the crawling process, our system tracks a range of statistics, including individual website domains, the origin of each child URL, and the average score assigned to each domain. To provide users with a comprehensive and visually intuitive experience, our user interface leverages React and D3 to display these statistics effectively. Moreover, to enhance user engagement and customization, our platform allows users to create individual accounts. This feature not only provides a personalized experience but also grants users access to a historical record of every crawl they have executed. 
Users are further empowered with the ability to effortlessly export or delete any of their previous crawls based on their preferences. In terms of deliverables, our project commits to providing fully developed code encompassing both frontend and backend components. Complementing this, we will furnish comprehensive user and developer manuals, facilitating seamless continuity for future students or developers who may build upon our work. Additionally, our final deliverables include a detailed report and a compelling presentation, serving the dual purpose of showcasing our team's progress across various project stages and providing insights into the functionalities and outcomes achieved.
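The crawl-score-filter loop described above can be sketched as a breadth-first traversal that only expands pages whose relevance score clears the user-set threshold. Here `fetch_links` and `predict` are stubs standing in for the real page fetcher and trained relevance model, and the URLs are invented.

```python
# Sketch of a threshold-filtered crawl with per-domain score tracking.
from collections import deque
from urllib.parse import urlparse

def crawl(seed_urls, fetch_links, predict, threshold=0.5, limit=100):
    """BFS over outgoing links, keeping pages scored above threshold."""
    seen, kept, domain_scores = set(seed_urls), [], {}
    queue = deque(seed_urls)
    while queue and len(seen) <= limit:
        url = queue.popleft()
        score = predict(url)                       # relevance in [0, 1]
        domain = urlparse(url).netloc
        domain_scores.setdefault(domain, []).append(score)
        if score >= threshold:                     # expand only relevant pages
            kept.append(url)
            for child in fetch_links(url):
                if child not in seen:
                    seen.add(child)
                    queue.append(child)
    return kept, domain_scores

# Stubbed web: page 1 links to page 2; only page 1 scores above 0.5.
links = {"http://a.example/1": ["http://b.example/2"], "http://b.example/2": []}
scores = {"http://a.example/1": 0.9, "http://b.example/2": 0.3}
kept, stats = crawl(["http://a.example/1"], links.__getitem__, scores.__getitem__)
print(kept)  # only the page above the threshold is collected
```

Averaging each list in `domain_scores` yields the per-domain statistic the interface visualizes.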
  • Crisis Events One-Class Text Classification
    Caleb McIrvin (2023-11-30)
    Analyzing web articles related to crisis events can help social scientists gauge public sentiment and form public policy around how to react to such disasters. However, data collection for such tasks is difficult. Manual dataset curation is time-consuming and costly, as a user needs to use a search engine to iterate through multiple web pages, painstakingly analyzing each document to determine the crisis events it may be related to. Automated processes such as web crawlers, however, operate primarily via rule-based methods, which may not accurately classify individual documents as being related to the crisis event of interest. In our work, we use machine learning and natural language processing techniques to determine whether individual documents are related to a specific crisis event. To accomplish this, we treat the area of interest as a single class and consider all other topics as not being of interest. We hypothesize that natural language processing techniques can be used to classify a particular webpage as being relevant to a certain crisis. A potential motivation for this approach is to guide efficient web crawling using techniques from semantic analysis.
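The one-class setup can be illustrated in miniature: represent the crisis class by the centroid of known-relevant documents, and flag a new document as relevant when its similarity to that centroid clears a threshold. A toy term-frequency vectorizer and invented example texts stand in for the NLP representations used in the actual work.

```python
# One-class relevance sketch: similarity to the centroid of the
# positive class, with everything else implicitly "not of interest".
import math
from collections import Counter

def tf(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(docs):
    c = Counter()
    for d in docs:
        c.update(tf(d))
    return c

relevant = ["hurricane damage flooding evacuation",
            "hurricane evacuation orders coastal flooding"]
crisis = centroid(relevant)

def is_relevant(doc, threshold=0.3):
    return cosine(tf(doc), crisis) >= threshold

print(is_relevant("flooding after the hurricane forced evacuation"))  # True
print(is_relevant("recipe for chocolate chip cookies"))               # False
```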
  • Practice 10k: Music App
    Georgiev, Alexander; McKelway, Bailey; Aneja, Rahul; Holmes, Claire; Zemui, Mahlet (Virginia Tech, 2023-11-30)
    In the world of music education, inspiring students to maintain consistent and effective practice routines has long been a challenge. Recognizing this dilemma, our clients embarked on a journey to leverage technology and reimagine the practice experience for budding musicians. The result of their efforts was a music application that aims to revolutionize the way musicians approach their practice routines by addressing both convenience and motivation. The innovative concept offers users a number of features that enhance their practice sessions, monitor their progress, and make the entire experience more engaging. Among its many functionalities, the application allows users to plan practice sessions and initiate them with the aid of a built-in timer to track their practice duration. Moreover, the application presents users with visual representations of their progress on a daily, weekly, monthly, and overall basis through diverse graphs, each highlighting distinct aspects of their practice habits. Additionally, users can delve into a journal-like feature in the application, allowing them to explore and reflect on their musical journey, drawing insights from past practice sessions. To address the core functionalities mentioned above, our project relies on the integration of Firebase for user authentication and backend data storage, coupled with React Native to ensure cross-platform compatibility in the frontend. This framework facilitates effective communication between the backend and frontend, enabling the exchange of user-related data in order to meet the clients’ requirements within the application. This is notably exemplified by our organization of user information in the backend, utilizing specific collections for swift reading and writing of data as users engage with the application. 
That being said, as we reflect on the culmination of this semester-long project, it is evident that overcoming challenges and seizing opportunities has been instrumental in our gaining invaluable experience in both client collaboration and implementing diverse solutions. However, acknowledging the iterative nature of application development, we understand the ongoing need for refining the existing features and incorporating new ones in future development.
  • Traffic Simulator Input/Output GUI
    This project aims to address weaknesses in the configuration of the INTEGRATION 2.40 microscopic traffic simulation software. The software is very powerful, capable of simulating hundreds of thousands of vehicles travelling across thousands of roads while recording a wide variety of metrics. However, it is configured by a variety of plaintext input files. These files contain newline-delimited fields, with some fields being multipart and whitespace-delimited. The fields are documented in English, with the expected type of each field, as well as other constraints such as field length or numeric range, expressed in tables in the documentation. Additionally, some field constraints span multiple files. For example, links in the simulation, defined in the Link File, reference start and end nodes that must be defined in a separate Node File. There is no interactive, easily runnable validation program to ensure that these input files, which can be tens to hundreds of lines long with many hundreds of fields, contain correct and sane values, so producing and validating them is a tedious process. The project solves these problems by implementing a web-based interface to create, edit, manage, and validate the input files for the simulation tool. Users can upload a set of six input files, which together are defined as an input package, to the interface, which keeps track of the configured values. The fields in the input files can be edited through intuitive controls, such as text fields for text content and drop-down menus for selections. Through the interface, users can perform automatic validation of the files. Any constraint-validation errors are then surfaced to the user through the web interface, directing them to the appropriate field in the appropriate file for correction. 
Cross-file validation is also performed to ensure the input files are in a suitable form to be run by the simulation software. After editing and validating fields in the managed Package, users can opt to save the package to the server hosting the web interface, as well as download all input files as a zip package. Additionally, the web interface allows users to template traffic demand values, which is a key component of the Demand File. Users are able to parameterize and create multiple Demand Files (as well as multiple Master Files, which reference all other input files) for combinations of vehicle classes provided by the user. For example, a user might want to create separate input files where Vehicle Class 1 has a traffic demand between 0 and 0.5 (to a maximum of 1), and Vehicle Class 2 has a traffic demand between 0 and 0.2, both in increments of 0.1. This results in 6 * 3 = 18 demand files in total for each combination of the Vehicle Class 1 demand and Vehicle Class 2 demand. The web interface saves time needed to create these files manually, which can be extreme in cases with many combinations of vehicle demand classes. Our project is intended to help users of INTEGRATION 2.40 in saving their time and effort when creating input files for the simulation tool. Apart from the project implementation details, we also ensured that project infrastructure reduces user effort and maintenance. The project can be deployed on multiple platforms and can also be packaged as an easy to deploy Docker image for maximum flexibility.
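The demand-templating arithmetic above (6 values for Vehicle Class 1 × 3 values for Vehicle Class 2 = 18 Demand Files) is a Cartesian product over per-class demand ranges. A sketch of that enumeration, with a helper for inclusive stepped ranges; the function names are illustrative, not the project's:

```python
# Enumerate every combination of per-class demand levels; each
# combination corresponds to one generated Demand File.
from itertools import product

def demand_values(lo, hi, step):
    """Inclusive range of demand levels, e.g. 0.0, 0.1, ..., 0.5."""
    n = round((hi - lo) / step)
    return [round(lo + i * step, 10) for i in range(n + 1)]

class1 = demand_values(0.0, 0.5, 0.1)   # 6 values
class2 = demand_values(0.0, 0.2, 0.1)   # 3 values
combos = list(product(class1, class2))  # 6 * 3 = 18 combinations
print(len(combos))
```

The rounding inside `demand_values` avoids floating-point drift (0.30000000000000004-style values) that would otherwise leak into the generated files.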
  • Crisis Event LLM
    Kunal Nakka, Srikaran Bachu, Bhargava Elavarthi (2023-11-30)
    Navigating through the intricate landscape of understanding and classifying crisis events, the "Crisis Events Language Model" project embarked on a comprehensive exploration leveraging Natural Language Processing (NLP) and machine learning. With a primary focus on utilizing BERT, a powerful pre-trained language model, our objective was to create an adept classification system for textual data related to crisis events sourced from the web. Our methodology used the BeautifulSoup library in Python for web scraping, enabling the extraction of textual data from URLs associated with crisis events. This rich dataset served as the backbone for training and evaluating our models. Post-data acquisition, we fine-tuned BERT to align with our specific use case, adapting its output layer to meet our unique classification goals. This strategic modification enhanced BERT's capabilities in recognizing, interpreting, and categorizing crisis event data with precision. Simultaneously, on the front end, we constructed an intuitive interface using HTML and CSS. This user-friendly interface not only facilitates the visualization of the model's outputs but also simplifies user interaction and data input. The result is a practical tool poised for deployment in real-time crisis management situations. Anticipating multiple impacts, our project positions itself to simplify the comprehension and categorization of crisis events. This functionality, tailored for decision-makers and crisis management teams, promises to be a valuable asset in the face of urgent situations. Moreover, for the participating students, the project provides a dynamic learning experience, bridging theoretical knowledge with practical applications in NLP, text classification, and transfer learning. Throughout the project's duration, team members assumed diverse roles, from web scraping and model implementation to front-end development and meticulous documentation. 
This collaborative effort blended skills in programming, software engineering, Python, and machine learning, ensuring a holistic approach to project development. In conclusion, our project not only serves as a testament to the technical prowess and collaboration within our team but also makes substantive contributions to the realms of crisis management and NLP. It underscores the potential of integrating machine learning and language models in crisis management, offering valuable insights and avenues for future exploration and development in this critical area.
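The scraping stage feeding the classifier — pull visible text out of a page, skipping script and style content — was done with BeautifulSoup in the project; the dependency-free sketch below illustrates the same extraction idea using only the standard library, with an invented HTML snippet as input.

```python
# Stand-in for the BeautifulSoup extraction step: collect visible text
# from an HTML document before it is fed to the classifier.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks, self._skip = [], 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())

html = ("<html><head><script>x=1</script></head>"
        "<body><h1>Flood warning</h1><p>River levels rising.</p></body></html>")
p = TextExtractor()
p.feed(html)
print(" ".join(p.chunks))
```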
  • Behind Density Lines: An Interface to Manually Segment Scanning Electron Microscopy Images
    Nguyen, Anthony; DiGiovanna, Luke; Siegel, John; Lin, Alex (Virginia Tech, 2023-11-30)
    SEM (Scanning Electron Microscopy) is a powerful imaging technique used in many scientific domains, including materials science, biology, and nanotechnology. Researchers can use SEM to obtain high-resolution images of specimen surface morphology and topography, providing a precise glimpse of structures at the nanoscale. SEM images reveal intricate surface details, allowing scientists to investigate the texture, shape, and size of particles, cells, or materials with incredible accuracy. Currently, manual segmentation of SEM images is an important stage in the analysis process for researchers. Manual segmentation entails painstakingly drawing and labeling regions of interest within images, such as specific structures or particles. Researchers often trace object boundaries using sophisticated software tools built for image processing and analysis. Because of the time and effort manual segmentation requires, we built a gamified multiplayer online application that allows individual contributors to segment an SEM image collaboratively in real time. One important goal was to involve the next generation of scientists and researchers with a demonstration at the Virginia Tech Science Festival in November 2023. We designed a technique that scores a given segmentation against a reference segmentation for a specific SEM image, providing participants with fast feedback. This enabled individuals and groups to measure their performance while also incorporating a gaming element. Thanks to this initiative, we now have a comprehensive understanding of how to create a full-stack project. We discovered how to leverage Amazon Web Services, such as EC2, to scale the infrastructure of our website from the backend. Through the use of JavaScript frameworks and packages such as NextJS, Socket.io, and ThreeJS, we have created an intuitive user interface for group manual segmentation.
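One plausible form for the comparison score mentioned above is the Dice coefficient between a participant's segmentation mask and the reference mask; the report does not specify the exact formula, so this is an assumption, and the pixel sets below are toy data.

```python
# Dice similarity between two binary segmentation masks, each given as
# a set of (row, col) pixel coordinates marked as foreground.

def dice(mask_a, mask_b):
    """2*|A∩B| / (|A|+|B|): 1.0 for identical masks, 0.0 for disjoint."""
    if not mask_a and not mask_b:
        return 1.0
    overlap = len(mask_a & mask_b)
    return 2 * overlap / (len(mask_a) + len(mask_b))

reference = {(0, 0), (0, 1), (1, 0), (1, 1)}   # ground-truth region
attempt   = {(0, 0), (0, 1), (1, 0), (2, 2)}   # a participant's guess
print(dice(attempt, reference))  # 2*3 / (4+4) = 0.75
```

Such a score updates cheaply as pixels are added or removed, which suits real-time feedback during collaborative segmentation.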
  • Chorobesity: Modern Insight Into An Enduring Epidemic
    Nguyen, Van Ha; Burnett, Sarah; Freedman, Bradley; Ganesan, Bharathi; Ravindran, Roshan (Virginia Tech, 2023-11-30)
    Health researchers are investigating all possible relationships between two health conditions: obesity and diabetes. To help them investigate the issue robustly, design detailed experiments, and develop lasting solutions, the Chorobesity project presents a visual tool of the geographical relationship between obesity and diabetes for our clients to utilize in their studies. Making use of different levels of maps, as well as different color “keys”, the user can study different regions’ health condition statuses. The Chorobesity project aims to be a visual and dynamic tool that researchers can use to further their understanding of the geographical correlation between obesity and diabetes. It provides relevant data and tools for the user to easily interpret and tweak this data for their best understanding. This interactive map, in providing a snapshot of the current health profile of the United States, seeks to be an indispensable tool for policymakers, health professionals, and the general public to understand how obesity and diabetes correlate as the clients see fit.
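Underneath a choropleth comparison like this, the region-level relationship can be quantified with a Pearson correlation between the two rates. A minimal sketch; the county rates below are invented for illustration, not real health data.

```python
# Pearson correlation between per-region obesity and diabetes rates:
# values near +1 indicate the two conditions rise and fall together
# across regions.
import math

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

obesity  = [28.0, 31.5, 35.2, 26.1, 33.0]   # % by county (illustrative)
diabetes = [ 8.1,  9.4, 11.0,  7.2, 10.1]   # % by county (illustrative)
print(round(pearson(obesity, diabetes), 3))  # strongly positive here
```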
  • Topic Modeling Toolkit
    Lin, Jiayue; Pang, Mingkai; Liu, Yulong (Virginia Tech, 2023-05-08)
    The Topic Modeling Toolkit project began with an existing text mining toolkit and aimed to enhance its functionality by incorporating cutting-edge topic modeling techniques. Specifically, BERTopic, CTM, and LDA were used to extract pertinent topics from a corpus of text documents. The resulting web-based platform provides users with a search engine, a recommendation system, and a usable interface for browsing and exploring these topics. In addition to these enhancements, our team also implemented a text-filtering framework and redesigned the user interface using Tailwind CSS. The final deliverables of the project include a fully functional website, user documentation, and an open-source toolkit that can be used to train machine learning models and support browsing and searching for various text datasets. While the current version of the toolkit includes BERTopic, CTM, and LDA, there is potential for future work to incorporate additional topic modeling methods. It is important to note that while the project originally focused on electronic theses and dissertations (ETDs), the resulting platform can be used to explore and comprehend complex subjects within any corpus of text documents. The topic modeling toolkit is available as an open-source package that users can install and use on their own computers to support browsing and searching for various text datasets. The intended user group for the platform includes researchers, students, and other users interested in exploring and understanding complex topics within a given corpus of text documents. The resulting topic modeling toolkit offers features that facilitate the exploration and comprehension of intricate topics within text document collections. This tool has the potential to aid researchers, students, and other users in their respective fields.
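Supporting three topic models (BERTopic, CTM, LDA) and leaving room for more suggests a common interface that each method plugs into. A registry sketch of that design idea — the class and method names are assumptions, not the toolkit's actual API, and the toy frequency "model" merely stands in for a real topic model.

```python
# Pluggable topic-model registry: each method implements the same
# fit/topics interface, so new models can be added without touching
# the browsing and search code.
from collections import Counter

class TopicModel:
    name = "base"
    def fit(self, docs): ...
    def topics(self, k=5): return []

class ToyFrequencyModel(TopicModel):
    """Stand-in 'model': top terms by frequency, ignoring stopwords."""
    name = "toy-frequency"
    STOP = {"the", "a", "of", "and", "in"}

    def fit(self, docs):
        self.counts = Counter(
            w for d in docs for w in d.lower().split() if w not in self.STOP
        )
        return self

    def topics(self, k=5):
        return [w for w, _ in self.counts.most_common(k)]

REGISTRY = {}
def register(model_cls):
    REGISTRY[model_cls.name] = model_cls
    return model_cls

register(ToyFrequencyModel)  # BERTopic, CTM, LDA would register likewise

docs = ["the theory of topic models", "topic models in the wild"]
model = REGISTRY["toy-frequency"]().fit(docs)
print(model.topics(2))
```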
  • Classifying ETDs
    Shah, Vedant; Ramesh, Vaishali; Daniel, Reema; Gathani, Mihir D. (Virginia Tech, 2023-05-17)
    Electronic Theses and Dissertations (ETDs) are academic documents that provide an in-depth account of a graduate student's research and are designed to be stored in machine archives and retrieved globally. These documents contain abundant information that may be utilized by various machine learning tasks such as classification, summarization, and question-answering. However, these documents often have incomplete, incorrect, or inconsistent metadata, which makes it challenging to accurately categorize them without manual intervention since there is no one uniform format for developing the metadata. Therefore, through the Classifying ETDs capstone project, we aim to create a gold standard classification dataset, leverage machine learning and deep learning algorithms to automatically classify ETDs with missing metadata, and develop a website that allows a user to classify an ETD with missing metadata and view already classified ETDs. The expected impact of this project is to advance information availability from long documents and eventually aid in improving long document information accessibility through regular search engines.
  • Chapter Summarization
    Peta, Manasi; Simms, Aidan; Grilli, Joe; Chokkan, Nandha (Virginia Tech, 2023-05-12)
    A thesis is the amalgamation of research that serves as the final product of a graduate student’s knowledge about the information they learned throughout their graduate research. A dissertation is a graduate student’s opportunity to present their original research that they have worked on during a doctorate program to contribute new theories, practices, or knowledge to their field. Theses and dissertations represent the culmination of research of students and therefore can be extremely long. Electronic theses and dissertations (ETDs) are the digital versions of theses and dissertations so that the research and knowledge explored can be more accessible to the world. ETDs typically contain an abstract describing the work done in the document. However, these abstracts are simply too general, which means they often don’t help readers. There is no happy medium between getting essentially no information from generic abstracts and reading through a dense paper. This is an issue on a global scale. We created chapter summaries of ETDs which aim to help readers decompose and understand the documents faster. We make use of existing machine learning summarization models, specifically Python packages and language models, to help with the summarization. Part of this project is to create a dataset we can work with to create and test our summarization model on. This summarization dataset has been created by annotating the chapters from 100 ETDs (after chapter segmentation). We want to be as diverse as possible, while also being able to pick up on patterns, which is why our ETDs are from a plethora of fields. We have implemented a data extraction pipeline that builds on work done by the Object Retrieval Code from Aman Ahuja et al. Based on this we have created a summarization framework that accepts the chapter text as input and generates chapter summaries that are integrated into the given base front-end website application. 
We have completed four summarization scripts that utilize pre-trained models from Hugging Face; each takes the data extracted from a chapter as input and outputs a summary. The four models we used were BART, BigBirdPegasus, T5, and Longformer Encoder Decoder (LED). We were able to run these scripts on all the chapters that we manually segmented to get summaries of all the chapters. We organized these summaries based on which model we used to obtain them in our GitLab repository. We used these summaries to populate a database which was intended to be used for the search functionality of our frontend application. There is more about the specifics of the backend and frontend in section 6.0 Implementation. We gained a holistic understanding of working on a full-stack project. On the backend portion, we learned how to use existing libraries and resources like pandas, PDFPlumber, and WordNinja to extract and format data from an input source. We also learned how to use resources like Hugging Face to understand natural language processing models and the pros and cons of various types of models. By creating scripts to utilize such models for text summarization, we were able to learn the nuances of working with pre-trained models and understand how that can affect our product. For example, if a model was pre-trained on a massive text repository, then it had better chances of recognizing more uncommon words in ETDs. On the frontend portion, we gained experience using React and JavaScript to create a functioning website. We also learned the process of understanding, dissecting, and updating a codebase we inherited from another team. We learned how to create and populate a database in PostgreSQL (commonly referred to as Postgres).
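Because pre-trained summarizers accept only a bounded input length, long chapter text must be split before it reaches the model. A simple word-budget chunker sketches that preprocessing step; the 1024 budget mirrors BART's token limit, though word counts only approximate model tokens, and the helper name is illustrative.

```python
# Split chapter text into consecutive chunks no longer than max_words,
# so each chunk fits within a summarization model's input window.

def chunk_text(text, max_words=1024):
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

chapter = "word " * 2500                       # stand-in chapter text
chunks = chunk_text(chapter, max_words=1024)
print([len(c.split()) for c in chunks])        # [1024, 1024, 452]
```

A production pipeline would split on sentence boundaries and measure length with the model's own tokenizer rather than whitespace words.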
  • Summarization Evaluation
    Goel, Harshil; Choudhary, Varun; Dhondi, Anish; Desai, Parth (2023-04-13)
    Electronic Theses and Dissertations (ETDs) are digital versions of the academic papers of graduate students. ETDs are long, complicated texts that include multimedia elements and other forms of information. A digital library with summaries for each ETD would let more people explore these texts and learn about various domains. However, most chapters of ETDs don’t have summaries, so the solution was to create AI-generated summaries for users across disciplines to read. Before these summaries become publicly accessible, Bipasha will recruit researchers in various disciplines to evaluate them. Incorporating human feedback into AI-generated summaries improves accuracy, relevance, originality, engagement, and satisfaction. Quantitative measures for evaluating AI-generated content are useful, but qualitative feedback matters too: subject matter experts can detect errors and inconsistencies in the summaries, and this feedback provides guidance. Our team developed a website that lets users view AI-generated summaries of chapters and rank them against the provided ground-truth chapter summaries, without knowing beforehand which is which. Scholars (those with an education level of graduate school and beyond) should be able to accurately evaluate the summaries for ETDs in their field. The purpose of this project is to enable human evaluation of these texts: the ranking feature serves as feedback for improving the AI-generated summaries, and domain experts will use the website to determine which model should serve as the gold standard for summarization. For now, the intended users of this application are subject matter experts at Virginia Tech; their evaluation will indicate which model performs best and should serve as the gold standard for AI-generated summaries.
Eventually, the final model will serve users outside Virginia Tech who want to learn more about the domain an ETD belongs to. These summaries let readers outside the domain grasp the basic concepts of an ETD and its chapters without reading the entire paper.
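The abstract does not specify how the collected rankings are aggregated; as one hypothetical approach (names and scoring rule are our own, not from the project), a simple Borda count over the reviewers' blind rankings could identify the gold-standard model:

```python
from collections import defaultdict

def borda_scores(rankings):
    """Aggregate per-reviewer rankings (best model first) into Borda
    scores: a model ranked i-th out of n earns n - 1 - i points."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, model in enumerate(ranking):
            scores[model] += n - 1 - position
    return dict(scores)
```

The model with the highest total score across reviewers would then be chosen as the gold standard.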
  • Object Detection and Document Accessibility
    Devera, Alan; Nader, Michael; Zhang, Zehua; Keegan, Elizabeth; Gunn, Theodore; Nguyen, Gabrielle; Wevley, Luke (Virginia Tech, 2023-05-10)
    Electronic Theses and Dissertations (ETDs) are the primary way that students and professors document and report their degree research. They let new minds see where a field of study left off and how to continue the work. However, since most ETDs uploaded to the internet are presented as PDFs, it is difficult for users to view them effectively, especially students with disabilities such as visual impairments. The goal of this project was to extend previous work on a Flask-based web application that transforms these long documents into something much more readable, user-friendly, and accessible, serving HTML rather than PDF. We also aimed to apply an algorithm to the bounding boxes returned by the object detection model so that separate paragraphs and references are placed into their own boxes for correct XML generation on the website. To make the application's UI usable, we applied several improvements: users can download the paper as PDF or XML; a sidebar on the left contains a dynamic table of contents for jumping to any part of the paper; and a sidebar on the right shows the original PDF, so any errors in our application don't ruin the user's understanding. We plan for future contributors to add a dark mode and a dyslexic-friendly font. Many accessibility features will be added via HTML/CSS/React by improving the UI, but the application also supports using a screen reader. Our project focuses on NVDA, a popular screen reader, so that users with visual impairments can listen to the ETD instead; this was studied thoroughly throughout the course of the project.
Finally, on the algorithms side of the project, the focus was to post-process the bounding boxes returned by the object detection models so that each paragraph or reference box contains exactly one paragraph or one reference. The object detection models do the best they can given their training, but errors are still possible; this side of the project fixed those errors so that XML generation works well and the text is readable in the final application. The algorithms team produced a post-processing algorithm that works for around 90% of the paragraphs in the ETDs tested, but was unable to reach the references part of the deliverable. This is left for future collaborators.
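The post-processing algorithm itself is not detailed in the abstract; one common approach, sketched here under the assumption that individual text-line boxes are available (the gap_factor parameter is hypothetical), splits a merged paragraph box wherever the vertical gap between consecutive lines exceeds a multiple of the median line height:

```python
import statistics

def split_into_paragraphs(line_boxes, gap_factor=1.5):
    """Group text-line boxes (y_top, y_bottom), sorted top to bottom,
    into paragraph boxes: start a new paragraph whenever the gap to
    the previous line exceeds gap_factor * the median line height."""
    heights = [bottom - top for top, bottom in line_boxes]
    threshold = gap_factor * statistics.median(heights)
    paragraphs = [[line_boxes[0]]]
    for prev, cur in zip(line_boxes, line_boxes[1:]):
        if cur[0] - prev[1] > threshold:   # large inter-line gap
            paragraphs.append([])
        paragraphs[-1].append(cur)
    # Collapse each group of lines into a single bounding box.
    return [(group[0][0], group[-1][1]) for group in paragraphs]
```

A relative threshold adapts to different font sizes, which matters when the same rule must hold across ETDs from many disciplines.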
  • ETD Recommendation System
    Blakemore, Talia; Phan, Long (Virginia Tech, 2023-05-11)
    Our project expands upon a recommendation system built by previous CS 5604 students. Previous CS 5604 teams created a chapter summarization model to generate summaries for over 5,000 Electronic Theses and Dissertations (ETDs), and we used these summaries to drive our recommendation system. Using chapter summaries improved our ability to predict resources a user may be interested in, because we narrowed our focus to individual chapters rather than the abstract of the whole paper. Authors benefit from this recommendation system because their work becomes more accessible. We provide a web page where users can explore how different clustering algorithms affect the search results, with the ability to modify parameters such as the number of clusters and minimum cluster size. This page will appeal to niche users interested in experimenting with recommendation systems, allowing them to fine-tune the recommendation results. We recommend that future work continue exploring different clustering algorithms, and use our chapter recommendations to build a recommendation list for each chapter. During this project, we learned about clustering algorithms, working as a team, and starting a project from the ground up. A previous CS 5604 team built a stand-alone website that supports search, a recommendation system, and experimentation with different search methods. During this semester, we expanded the existing website, using clustering algorithms to experiment with the recommendation system; users may specify different parameters to see how different clustering algorithms change the recommendations.
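The specific clustering algorithms are not named in the abstract; as a minimal, library-free illustration of the underlying idea of recommending by summary similarity, the sketch below ranks chapters by bag-of-words cosine similarity to a query summary (the function and parameter names are our own, not the project's):

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def recommend(query_summary, summaries, k=3):
    """Return the k chapter ids whose summaries are most similar
    to the query, by bag-of-words cosine similarity."""
    q = Counter(query_summary.lower().split())
    scored = [(cosine(q, Counter(text.lower().split())), cid)
              for cid, text in summaries.items()]
    scored.sort(reverse=True)
    return [cid for _, cid in scored[:k]]
```

A production system would cluster the summary vectors first so that recommendations come from the query's cluster, which is where the tunable number-of-clusters and minimum-cluster-size parameters come in.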
  • Marine Blender
    Shaffer, Zachary; Macht, Henry; Shirazi, Adrian; Campbell, Mitchell (Virginia Tech, 2023-05-17)
    A model of a realistic marine environment is needed to train a rugged onboard optical sensor designed by Cell Matrix Corporation, a VTCRC COgro member (i.e., a small company in Virginia Tech's Corporate Research Center), in a project led by Dr. Peter Athanas, an ECE professor at Virginia Tech. The modeling is done in Blender, a free and open 3D modeling and rendering tool. The chosen environment is the intracoastal waters of the Palm Beach Inlet in Florida, between the Port of Palm Beach and the Inlet, approaching the Inlet from the south side of Peanut Island; this active inlet and port area forms the scene of the Blender model. To build an accurate representation of the specified area, we constructed a terrain model of the Palm Beach Inlet water area from the Port of Palm Beach to the Inlet, including where the Intracoastal Waterway channel meets the Inlet channel south of Peanut Island; this covers the surrounding islands and land masses, bridges, and large structures. There are also roughly five classes of boats to model (e.g., yachts, sailboats, mega-yachts, cargo ships, fishing boats, and other boats commonly found in the area) to represent different situations. Different-looking classes of boats are needed to train the marine sensor to recognize them, so we chose the classes and created, or found and customized, a model for a boat from each class. The team was provided with the trajectories of individual boats traveling this area, from AIS ship-tracking data published by the US Coast Guard. To simulate these realistic situations, we wrote a Blender script that moves boats along these AIS tracks. The renders we created from our Blender project are representations of the Palm Beach Inlet water area and will hopefully serve as a useful resource for AI model training.
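A core step in such a script is turning AIS latitude/longitude fixes into flat scene coordinates and interpolating a boat's position between fixes for keyframing. The sketch below is our own illustration of that step (an equirectangular approximation, adequate for an inlet-sized scene), not code from the project:

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def to_scene_xy(lat, lon, origin_lat, origin_lon):
    """Project a lat/lon fix to flat X/Y metres around a scene origin
    (equirectangular approximation; fine for an inlet-sized area)."""
    x = EARTH_RADIUS_M * math.radians(lon - origin_lon) \
        * math.cos(math.radians(origin_lat))
    y = EARTH_RADIUS_M * math.radians(lat - origin_lat)
    return x, y

def position_at(track, t):
    """Linearly interpolate an (x, y) position at time t from a track
    of (timestamp, x, y) fixes sorted by timestamp -- the values one
    would feed to Blender keyframes for a boat object."""
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        if t0 <= t <= t1:
            f = (t - t0) / (t1 - t0)
            return x0 + f * (x1 - x0), y0 + f * (y1 - y0)
    raise ValueError("t outside track time range")
```

In Blender itself, the interpolated positions would be written to each boat object's location and recorded with keyframe inserts, one per sampled timestamp.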
  • SharkPulseApp
    Pham, Khanh; Lemus, Catalina; Ansari, Mohammed Al (2023-05-15)
    The team was tasked with redesigning, updating, and implementing changes to an existing website called SharkPulse, part of a project focused on monitoring global wild shark populations. The project is led by Dr. Francesco Feretti, an Assistant Professor in the Department of Fish and Wildlife Conservation. The website was built using WordPress with frontend CSS and HTML; backend support is provided by PHP, JavaScript, and R Shiny, which implements the Validation Monitor page. Our team's primary objective was to convert the website's static framework into a dynamic, responsive one. Additionally, we aimed to convert the Instagram monitor from a PHP script to an R Shiny app and to merge the Instagram and Flickr monitors into a single page that allows toggling between the two pipelines. The Validation Monitor is a user interface for validating records collected from Flickr and Instagram (from the data_mining and Instagram tables). Since shark photos are primarily collected from social media platforms like Instagram and Flickr, the accuracy of the shark information may not be completely reliable. To address this, each shark picture comes with a form for users to fill in, validating the originality, type, species, and location of the shark. However, the Monitor's user interface was not responsive, dynamic, or easily navigable. By the end of the semester, we successfully combined the two pipelines on one page and redesigned the Validation Form for both the Instagram and Flickr pipelines. We also added common-name and scientific-species autocompletion, helping users fill out the Validation Form more easily. Furthermore, the map in the Instagram validation form now functions properly, showing the current location of the shark marker and allowing users to search for another location by name or coordinates, which helps validate the shark's accurate location.
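The autocompletion feature can be illustrated with a small, self-contained sketch (not the project's actual R Shiny implementation): a binary search over a sorted species list returns every name matching a typed prefix:

```python
import bisect

def autocomplete(names, prefix, limit=5):
    """Case-insensitive prefix autocompletion over a species list,
    like the common/scientific name fields in the Validation Form."""
    pool = sorted(names, key=str.lower)
    keys = [n.lower() for n in pool]
    p = prefix.lower()
    lo = bisect.bisect_left(keys, p)
    hi = bisect.bisect_left(keys, p + "\uffff")  # end of the prefix range
    return pool[lo:hi][:limit]
```

Sorting once and bisecting per keystroke keeps each lookup logarithmic, which is fast enough even for a full species catalog.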
  • CS 3604 Case Study Library III
    Shivaraman, Thrilok; Ouzhinski, Theo; Denman, William; Geibel, Katie (Virginia Tech, 2023-05-14)
    This submission describes the CS 3604: Professionalism in Computing Case Study Library, coordinated by our client Dr. Dan Dunlap, which contains recent case studies written by students in the class. The Case Study Library website provides a platform through which these case studies can be viewed. We are the third group to work on the Library, which currently supports student case study upload, searching, and filtering by course topic. However, uploads went through a single admin account shared by all students, so once a student uploaded, they could not go back and edit their submission: there was no way to link users to uploads. The interactivity of the website was also limited. The first goal of this iteration was to implement login functionality so that students can log in with their Virginia Tech accounts, which lets us link users to their uploads and thereby allows editing. To improve the interactivity of the site, metadata fields will be added for tags and likes. When uploading, students will be able to select tags from a bank of options that pertain to the subject of their case study, which can later be used for sorting. When viewing case studies, website users will be able to like a submission; the number of likes on each submission will be stored and can later drive a recommended page. Our work increases the opportunity for interaction, allowing students to better search for case studies by topic and to like the studies that others upload. Currently, all of the features the group attempted to create are working and present, except that upload is still broken because the storage bucket points to the wrong location. The group worked together to build these features as requested by the client, refactoring the goals a few times to reach reasonable milestones over the course of the semester.