Holistic Evaluation of Multimodal LLMs on Spatial Intelligence
[ACL 2025] FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation
[CVPR'24 Highlight] The official code and data for paper "EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Language Models"
General AI evaluation and Gauge Engine. A unified evaluation engine for LLMs, MLLMs, audio, and diffusion models.
[CVPR'25] 🌟🌟 EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering
Why We Feel: Breaking Boundaries in Emotional Reasoning with Multimodal Large Language Models
On Path to Multimodal Generalist: General-Level and General-Bench
The official code and data for paper "VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI"
The code repository for "OmniEvalKit: A Modular, Lightweight Toolbox for Evaluating Large Language Model and its Omni-Extensions"
[ICASSP'26] PictOBI-20k: A dataset designed to evaluate large multimodal models on the visual decipherment tasks of pictographic OBCs
A Multimodal Benchmark for Evaluating Scientific Reasoning Capabilities of VLMs
Official implementation of "Investigating Redundancy in Multimodal Large Language Models with Multiple Vision Encoders" (ICLR 2026)
Official implementation of EgoPrivacy (ICML 2025)
Official codebase for the ICML 2025 paper "Core Knowledge Deficits in Multi-Modal Language Models"
A high-performance highly-customizable reverse OCR tool that renders text or huggingface-compatible datasets to images. Dimension, DPI, CSS configurable!
Multi-modal and Vision Language Model Spatial Reasoning Benchmark
Official repository of the paper "FPBench: A Comprehensive Benchmark of Multimodal Large Language Models for Fingerprint Analysis"
Benchmark for evaluating MLLMs as judges of vision-task outputs across intrinsic and tool-mediated settings
(WACV 2026) The Perceptual Observatory: Characterizing robustness and grounding in multimodal LLMs (MLLMs)