Behind the Scenes: Evaluating Computer Vision Embedding Techniques for Discovering Similar Photo Backgrounds

dc.contributor.author: Dodson, Terryl Dwayne (en)
dc.contributor.committeechair: Luther, Kurt (en)
dc.contributor.committeemember: Quigley, Paul (en)
dc.contributor.committeemember: Huang, Lifu (en)
dc.contributor.department: Computer Science and Applications (en)
dc.date.accessioned: 2023-07-12T08:00:35Z (en)
dc.date.available: 2023-07-12T08:00:35Z (en)
dc.date.issued: 2023-07-11 (en)
dc.description.abstract: Historical photographs can generate significant cultural and economic value, but often their subjects go unidentified. However, if analyzed correctly, visual clues in these photographs can open up new directions in identifying unknown subjects. For example, many 19th century photographs contain painted backdrops that can be mapped to a specific photographer or location, but this research process is often manual, time-consuming, and unsuccessful. AI-based computer vision algorithms could be used to automatically identify painted backdrops or photographers or cluster photos with similar backdrops in order to aid researchers. However, it is unknown which computer vision algorithms are feasible for painted backdrop identification or which techniques work better than others. We present three studies evaluating four different types of image embeddings – Inception, CLIP, MAE, and pHash – across a variety of metrics and techniques. We find that a workflow using CLIP embeddings combined with a background classifier and simulated user feedback performs best. We also discuss implications for human-AI collaboration in visual analysis and new possibilities for digital humanities scholarship. (en)
dc.description.abstractgeneral: Historical photographs can generate significant cultural and economic value, but often their subjects go unidentified. However, if these photographs are analyzed correctly, clues in these photographs can open up new directions in identifying unknown subjects. For example, many 19th century photographs contain painted backdrops that can be mapped to a specific photographer or location, but this research process is often manual, time-consuming, and unsuccessful. Artificial Intelligence-based computer vision techniques could be used to automatically identify painted backdrops or photographers or group together photos with similar backdrops in order to aid researchers. However, it is unknown which computer vision techniques are feasible for painted backdrop identification or which techniques work better than others. We present three studies comparing four different types of computer vision techniques – Inception, CLIP, MAE, and pHash – across a variety of metrics. We find that a workflow that combines the CLIP computer vision technique, software that automatically classifies photo backgrounds, and simulated human feedback performs best. We also discuss implications for collaboration between humans and AI for analyzing images and new possibilities for academic research combining technology and history. (en)
dc.description.degree: Master of Science (en)
dc.format.medium: ETD (en)
dc.identifier.other: vt_gsexam:38050 (en)
dc.identifier.uri: http://hdl.handle.net/10919/115739 (en)
dc.language.iso: en (en)
dc.publisher: Virginia Tech (en)
dc.rights: In Copyright (en)
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ (en)
dc.subject: Convolutional Neural Networks (CNN) (en)
dc.subject: Computer Vision (CV) (en)
dc.subject: Photography (en)
dc.subject: History (en)
dc.subject: Cultural Heritage (en)
dc.subject: American Civil War (en)
dc.title: Behind the Scenes: Evaluating Computer Vision Embedding Techniques for Discovering Similar Photo Backgrounds (en)
dc.type: Thesis (en)
thesis.degree.discipline: Computer Science and Applications (en)
thesis.degree.grantor: Virginia Polytechnic Institute and State University (en)
thesis.degree.level: masters (en)
thesis.degree.name: Master of Science (en)

Files

Original bundle
Name: Dodson_TD_T_2023.pdf
Size: 1.4 MB
Format: Adobe Portable Document Format
