GitHub - JidongLi-hub/JudgeBeforeAnswer

Judge Before Answer:
Can MLLM Discern the False Premise in Question?

Jidong Li, Lingyong Fang, Haodong Zhao, Sufeng Duan, Gongshen Liu

📋 Overview

This repository contains the original code for constructing the JBA dataset, an evaluation set of false-premise problems for MLLMs.

🤗 Find our dataset on Hugging Face

🚀 Quick Start

Find our JBA dataset in dataset/Judge_Before_Answer.json or on Hugging Face. Image ids are from Visual Genome. You can also run main.py to construct your own JBA dataset.
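As a first step after cloning, it can help to check what the dataset file actually contains. The snippet below is a minimal sketch and not part of the repository: load_jba and summarize are hypothetical helper names, and the JSON schema is not documented here, so the code only reports the container type, the number of records, and the keys of the first record instead of assuming any field names.

```python
import json
from pathlib import Path


def load_jba(path="dataset/Judge_Before_Answer.json"):
    """Load the JBA dataset file and return the parsed JSON object.

    The default path is the one named in the Quick Start; run this from
    the repository root, or pass a different path.
    """
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def summarize(data):
    """Describe the parsed JSON without assuming a schema.

    Works for a top-level list of records or a top-level dict of records.
    """
    records = list(data.values()) if isinstance(data, dict) else list(data)
    first = records[0] if records else None
    # Only report keys when the first record is a JSON object.
    keys = sorted(first.keys()) if isinstance(first, dict) else []
    return {
        "container": type(data).__name__,
        "count": len(records),
        "first_record_keys": keys,
    }


if __name__ == "__main__":
    print(summarize(load_jba()))
```

Inspecting the keys first avoids hard-coding field names that may differ between the local JSON file and the Hugging Face copy.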

Run test.py to generate test results for an MLLM.

Run evaluate.py to evaluate the results and get metrics.
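The metrics that evaluate.py reports are not listed here. Purely as an illustration of how false-premise detection is commonly scored, the sketch below assumes each result can be reduced to a pair of booleans: whether the question really contains a false premise, and whether the model flagged one. The function name false_premise_metrics and this pair format are hypothetical and are not taken from the repository.

```python
def false_premise_metrics(records):
    """Compute binary detection metrics.

    records: iterable of (has_false_premise, model_flagged) boolean pairs.
    This input format is an assumption made for illustration only.
    """
    tp = fp = fn = tn = 0
    for truth, pred in records:
        if truth and pred:
            tp += 1  # false premise present and flagged
        elif not truth and pred:
            fp += 1  # sound question wrongly flagged
        elif truth and not pred:
            fn += 1  # false premise missed
        else:
            tn += 1  # sound question answered normally

    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    demo = [(True, True), (True, False), (False, True), (False, False)]
    print(false_premise_metrics(demo))
```

Reporting precision and recall separately shows whether a model misses false premises or over-flags sound questions, which a single accuracy number can hide.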

Folders and files (commit history: 24 commits)

__pycache__
dataset
images
scripts
.gitignore
README.md
evaluate.py
grpo.py
main.py
model_chat.py
prompts.py
sft.py
test.py
utils.py
visualization.py
