Artificial Intelligence (AI)-based Semantic Communications with Multimodal Data: Framework and Implementation
dc.contributor.author | DeRieux, Jean-Luc Aristide | en |
dc.contributor.committeechair | Saad, Walid | en |
dc.contributor.committeemember | Dhillon, Harpreet Singh | en |
dc.contributor.committeemember | Ramakrishnan, Narendran | en |
dc.contributor.department | Electrical Engineering | en |
dc.date.accessioned | 2025-09-23T08:00:19Z | en |
dc.date.available | 2025-09-23T08:00:19Z | en |
dc.date.issued | 2025-09-22 | en |
dc.description.abstract | Semantic communication (SC) has emerged as an effective paradigm for reducing the bandwidth needs of wireless services by exploiting the so-called "semantics," or meaning, behind the data. To date, existing works in this area either focus only on multimodal approaches and omit context-aware recovery, or embed such recovery in cross-modal settings, such as audio-to-video, rather than providing a unified, modality-agnostic method. These works also require substantial architectural redesign to support additional modalities and are not easily extensible. In contrast to prior work, this thesis proposes a novel semantic framework called the semantic context-aware framework for adaptive multimodal reasoning (SCE-FOAM). SCE-FOAM is a multimodal semantic framework that enables compact transmission, efficient reconstruction, and contextually aware predictions using a microservice-based architecture. This design offers an extensible and modular platform for incorporating new modalities and enabling scalable deployment strategies. Experimental results show that SCE-FOAM can achieve data reductions of up to 50% for text, 94.56% for audio, and 98.70% for images. The proposed contextual-prediction model achieves an average accuracy of 90% across all modalities. In addition, a heuristic-based extension of the deferred acceptance (DA) matching algorithm is proposed. The extension enables network node matches that incorporate exploration, coverage, and diversification heuristics. In summary, this thesis presents a unified, extensible, context-aware, multimodal SC framework and a heuristic extension to the DA matching algorithm. | en |
dc.description.abstractgeneral | Semantic communication is a method by which a device talks with other devices on a wireless network using its own unique and dynamic language, rather than traditional methods that rely on information represented as binary 1s and 0s. Representing information in traditional binary form prevents devices from understanding the meaning of a message, which can be both inefficient and error-prone when information is sent or received. With semantic communication, devices can understand the underlying meaning, or semantics, of the information. This means devices can represent information in their own language, which can be more efficient than traditional methods and can allow devices to infer context when information is missing. In this thesis, a novel semantic framework called the semantic context-aware framework for adaptive multimodal reasoning (SCE-FOAM) is proposed. This framework supports common media types such as text, audio, and images, and can efficiently represent information in compact forms. Additionally, the framework allows for context recovery, which enables devices to predict missing information when their connection is lost. Lastly, it is designed to be easy for software developers and researchers to use, which is intended to encourage widespread adoption of SCE-FOAM. In summary, this thesis contributes a semantic framework that supports multimedia (text, audio, image) messages, provides context recovery, and is designed for ease of use to encourage widespread adoption in the semantic communication field. | en |
dc.description.degree | Master of Science | en |
dc.format.medium | ETD | en |
dc.identifier.other | vt_gsexam:44664 | en |
dc.identifier.uri | https://hdl.handle.net/10919/137817 | en |
dc.language.iso | en | en |
dc.publisher | Virginia Tech | en |
dc.rights | Creative Commons Attribution 4.0 International | en |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | en |
dc.subject | Semantics | en |
dc.subject | Contextual Learning | en |
dc.subject | Multimodal | en |
dc.subject | Microservice | en |
dc.subject | Matching | en |
dc.title | Artificial Intelligence (AI)-based Semantic Communications with Multimodal Data: Framework and Implementation | en |
dc.type | Thesis | en |
thesis.degree.discipline | Electrical Engineering | en |
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
thesis.degree.level | masters | en |
thesis.degree.name | Master of Science | en |