This project uses Vision Mamba, a vision backbone built on state space models (SSMs) whose compute cost grows linearly with sequence length, to perform document layout analysis, information extraction, and question answering directly on scanned or otherwise complex document images (e.g., PDF pages, reports, tables, forms).
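
To give a rough sense of why document images count as a long-sequence workload, the sketch below counts the patch tokens produced by a full page compared with a small image crop. The 16-pixel patch size, the 224x224 crop, and the 1024x1408 page resolution are illustrative assumptions, not settings taken from this project, and `patch_sequence_length` is a hypothetical helper.

```python
def patch_sequence_length(height_px: int, width_px: int, patch_px: int = 16) -> int:
    """Count non-overlapping square patches (tokens) in an image.

    Assumes ViT-style patching; patch_px=16 is an illustrative default.
    """
    if height_px % patch_px or width_px % patch_px:
        raise ValueError("image sides must be multiples of the patch size")
    return (height_px // patch_px) * (width_px // patch_px)


# Hypothetical input sizes (assumptions, not project settings).
crop_tokens = patch_sequence_length(224, 224)    # small natural-image crop
page_tokens = patch_sequence_length(1024, 1408)  # full document page

# Relative growth in work when moving from the crop to the page:
# self-attention compares every token pair, so it grows with n^2,
# while a linear-time sequence scan grows with n.
scan_growth = page_tokens / crop_tokens
attention_growth = scan_growth ** 2

print(f"crop tokens: {crop_tokens}, page tokens: {page_tokens}")
print(f"linear scan grows {scan_growth:.1f}x, pairwise attention grows {attention_growth:.1f}x")
```

Under these assumed sizes the page yields thousands of tokens where the crop yields a few hundred, which is the regime where a linear-time sequence model is attractive.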