Generative AI Framework for 3D Object Generation in Augmented Reality

Behravan, Majid

dc.contributor.author	Behravan, Majid	en
dc.contributor.committeechair	Gracanin, Denis	en
dc.contributor.committeemember	Wang, Xuan	en
dc.contributor.committeemember	North, Christopher L.	en
dc.contributor.department	Computer Science and Applications	en
dc.date.accessioned	2025-01-04T09:00:50Z	en
dc.date.available	2025-01-04T09:00:50Z	en
dc.date.issued	2025-01-03	en
dc.description.abstract	This thesis presents a framework that integrates state-of-the-art generative AI models for real-time creation of three-dimensional (3D) objects in augmented reality (AR) environments. The primary goal is to convert diverse inputs, such as images and speech, into accurate 3D models, enhancing user interaction and immersion. Key components include advanced object detection algorithms, user-friendly interaction techniques, and robust AI models like Shap-E for 3D generation. Leveraging Vision Language Models (VLMs) and Large Language Models (LLMs), the system captures spatial details from images and processes textual information to generate comprehensive 3D objects, seamlessly integrating virtual objects into real-world environments. The framework demonstrates applications across industries such as gaming, education, retail, and interior design. It allows players to create personalized in-game assets, customers to see products in their environments before purchase, and designers to convert real-world objects into 3D models for real-time visualization. A significant contribution is democratizing 3D model creation, making advanced AI tools accessible to a broader audience and fostering creativity and innovation. The framework addresses challenges such as handling multilingual inputs, diverse visual data, and complex environments; improving object detection and model generation accuracy; and loading 3D models into AR space in real time. In conclusion, this thesis integrates generative AI and AR for efficient 3D model generation, enhancing accessibility and paving the way for innovative applications and improved user interactions in AR environments.	en
dc.description.abstractgeneral	This thesis explores how advanced artificial intelligence (AI) can create realistic three-dimensional (3D) objects in real time within augmented reality (AR) environments. AR is a technology that overlays digital content onto the real world through AR glasses. Our primary goal is to transform different types of inputs, such as pictures and speech, into precise 3D models, enhancing the user's experience and interaction with their surroundings. The framework includes advanced techniques to detect and process objects from images and text, using powerful AI models. These models, called Vision Language Models (VLMs) and Large Language Models (LLMs), help the system analyze the inputs accurately and provide suggestions for creating objects that fit the environment and the user's needs. The integration of these technologies with text-to-3D and image-to-3D models allows virtual objects to blend seamlessly into the real world, creating an immersive experience. This technology has practical uses in various fields. In gaming, it allows players to design and interact with custom game items. In retail, it enables customers to see how products would look in their own space before buying them. For interior design, it allows designers to create 3D models of real-world objects for planning and visualization. A key achievement of this research is making 3D model creation more accessible to everyone, not just experts. This democratization fosters creativity and innovation, allowing more people to benefit from AR technology. The framework also addresses technical challenges, such as understanding multiple languages and complex environments, and improves the accuracy and quality of the generated 3D models.	en
dc.description.degree	Master of Science	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:42292	en
dc.identifier.uri	https://hdl.handle.net/10919/123898	en
dc.language.iso	en	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	Generative AI	en
dc.subject	Augmented reality	en
dc.subject	Natural language processing	en
dc.subject	Image processing	en
dc.title	Generative AI Framework for 3D Object Generation in Augmented Reality	en
dc.type	Thesis	en
thesis.degree.discipline	Computer Science & Applications	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	masters	en
thesis.degree.name	Master of Science	en

Files

Original bundle

Name: Behravan_M_T_2025.pdf
Size: 49.34 MB
Format: Adobe Portable Document Format

Name: Behravan_M_T_2025_support_1.pdf
Size: 33.77 KB
Format: Adobe Portable Document Format
Description: Supporting documents

Collections

Masters Theses