Thapa, Surendrabikram; Naseem, Usman; Zhou, Luping; Kim, Jinman

Vision-Language Models for Biomedical Applications
Article - Refereed, 2024-10-28
https://doi.org/10.1145/3689096.3690770
https://hdl.handle.net/10919/121530
License: Creative Commons Attribution 4.0 International. Rights holder: The author(s).

Abstract: Vision-language models (VLMs) are transforming the landscape of biomedical research and healthcare by enabling the seamless integration and interpretation of complex multimodal data, including medical images and clinical texts. Recognizing the growing impact of these models, the first international workshop on Vision-Language Models for Biomedicine (VLM4Bio) was held in conjunction with ACM Multimedia 2024. The workshop aimed to address the critical need for advanced techniques that can leverage VLMs in applications such as medical imaging, diagnostics, and personalized treatment. As healthcare data increasingly involves both visual and textual information, VLM4Bio provided a platform for interdisciplinary collaboration between experts in natural language processing, computer vision, biomedical engineering, and AI ethics. This paper provides an overview of the inaugural edition of the VLM4Bio workshop, summarizing the key discussions, contributions, and future directions for expanding the workshop's scope and influence in subsequent editions.