Browsing by Author "Kusuma, Manisha"
- Civil War Twin: Exploring Ethical Challenges in Designing an Educational Face Recognition Application
  Kusuma, Manisha (Virginia Tech, 2022-01-06)
  Facial recognition systems pose numerous ethical challenges around privacy, racial and gender bias, and accuracy, yet little guidance is available for designers and developers. We explore solutions to these challenges in a four-phase design process to create Civil War Twin (CWT), an educational web-based application where users can discover their lookalikes from the American Civil War era (1861-65) while learning more about facial recognition and history. Through this design process, we synthesize industry guidelines, consult with scholars of history, gender, and race, evaluate CWT in feedback sessions with diverse prospective users, and conduct a usability study with crowd workers. We iteratively formulate design goals to incorporate transparency, inclusivity, speculative design, and empathy into our application. We found that users' perceived learning about the strengths and limitations of facial recognition and Civil War history improved after using CWT, and that our design successfully met users' ethical standards. We also discuss how our ethical design process can be applied to future facial recognition applications.
- CS 5604 2020: Information Storage and Retrieval TWT - Tweet Collection Management Team
  Baadkar, Hitesh; Chimote, Pranav; Hicks, Megan; Juneja, Ikjot; Kusuma, Manisha; Mehta, Ujjval; Patil, Akash; Sharma, Irith (Virginia Tech, 2020-12-16)
  The Tweet Collection Management (TWT) Team aims to ingest 5 billion tweets, clean this data, analyze the metadata present, extract key information, classify tweets into categories, and finally index these tweets into Elasticsearch for browsing and querying. The main deliverable of this project is a running software application for searching tweets and for viewing Twitter collections from Digital Library Research Laboratory (DLRL) event archive projects. As a starting point, we focused on two development goals: (1) hashtag-based and (2) username-based search for tweets. For IR1, we completed extraction of two fields within our sample collection: hashtags and username. Sample code for TwiRole, a user-classification program, was investigated for use in our project. We were able to sample from multiple collections of tweets, spanning topics like COVID-19 and hurricanes. Initial work used a sample collection provided via Google Drive. NFS-based persistent storage was added later to allow access to larger collections. In total, we have developed 9 services to extract key information like username, hashtags, geo-location, and keywords from tweets. We have also developed services for parsing and cleaning raw API data and for backing up data in an Apache Parquet filestore. All services are Dockerized and added to the GitLab Container Registry. The services are deployed in the CS cloud cluster so that they can be integrated into the full search engine workflow. A service was created to convert WARC files to JSON so that archive files can be read into the application. Unit testing of services is complete, and end-to-end tests have been conducted to improve system robustness and avoid failure during deployment.
  The TWT team has indexed 3,200 tweets into the Elasticsearch index. Future work could involve parallelization of the extraction of metadata, an alternative feature-flag approach, advanced geo-location inference, and adoption of the DMI-TCAT format. Key deliverables include a data body that allows for search, sort, filter, and visualization of raw tweet collections and metadata analysis; a running software application for searching tweets and for viewing Twitter collections from Digital Library Research Laboratory (DLRL) event archive projects; and a user guide to assist those using the system.
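
The hashtag and username extraction mentioned in the TWT abstract can be illustrated with a minimal Python sketch. This is not the team's code. It assumes each tweet is a JSON object in the Twitter API v1.1 layout (`user.screen_name`, `entities.hashtags[].text`), the function name `extract_fields` is hypothetical, and lowercasing the hashtags is an illustrative choice for case-insensitive search rather than something the abstract states.

```python
from typing import Any, Dict, List


def extract_fields(tweet: Dict[str, Any]) -> Dict[str, Any]:
    """Pull the tweet id, username, and hashtags out of one tweet object.

    Assumes the Twitter API v1.1 JSON layout; missing fields yield
    None (id, username) or an empty list (hashtags).
    """
    user = tweet.get("user") or {}
    entities = tweet.get("entities") or {}
    hashtags: List[str] = [
        tag["text"].lower()  # assumption: normalize case for search
        for tag in entities.get("hashtags", [])
        if isinstance(tag, dict) and tag.get("text")
    ]
    return {
        "id": tweet.get("id_str"),
        "username": user.get("screen_name"),
        "hashtags": hashtags,
    }


# Made-up sample tweet, for illustration only.
sample = {
    "id_str": "1",
    "user": {"screen_name": "example_user"},
    "entities": {"hashtags": [{"text": "COVID19"}, {"text": "Hurricane"}]},
}
record = extract_fields(sample)
```

A flat record of this shape could then serve as the document body for a search index keyed on hashtag or username, which is the kind of search the abstract describes.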