✨✨Latest Advances on Multimodal Large Language Models
Updated Nov 27, 2024
- 【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
- InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
- Mixture-of-Experts for Large Vision-Language Models
- [NeurIPS 2024] Evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models?"
- [NeurIPS'24 & ICMLW'24] CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models
- [ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models
- [NeurIPS 2024] Hawk: Learning to Understand Open-World Video Anomalies
- Awesome Large Vision-Language Model: A Curated List of Large Vision-Language Models
- Official code for the paper "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation"
- Leverages multimodal large vision-language models for quantitative analysis