CVPR-NTIRE Video Saliency Prediction Challenge 2026

Dataset

We provide a novel audio-visual mouse saliency dataset with the following key features:

Diverse content: movie, sports, live, vertical videos, etc.;
Large scale: 2000 videos with a mean duration of 18 s;
High resolution: all streams are FullHD;
Audio track saved and played to observers;
Mouse fixations from >5000 observers (>70 per video);
License: CC-BY;

File structure:

Videos.zip — 2000 (1200 Train + 800 Test) .mp4 videos (a kind reminder: many videos contain an audio stream, and observers watched them with the sound turned ON!)
TrainTestSplit.json — the Train / Public Test / Private Test split of all videos
SaliencyTrain.zip/SaliencyTest.zip — continuous saliency map videos for the Train/Test subsets, compressed almost losslessly (crf 0, 10-bit, min-max normalized)
FixationsTrain.zip/FixationsTest.zip — contains the following files for Train/Test subset:

.../video_name/fixations.json — per-frame fixation coordinates from which the saliency maps were obtained; this JSON is used for metric calculation
.../video_name/fixations_maps/ — binary fixation maps in '.png' format. Because several fixations can share the same pixel, this representation is lossy and is NOT used for calculating metrics or generating Gaussians; we provide the maps only for visualization and frame-count checks

VideoInfo.json — meta information about each video (e.g. license)
SampleSubmission.zip — example submission for the challenge, obtained from a Center Prior Gaussian fitted to the mean of the training saliency maps.
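To make the Center Prior idea concrete, here is a minimal sketch of one way to fit a Gaussian to a mean saliency map. It assumes the mean map is already available as a 2D array and uses plain moment matching with an axis-aligned Gaussian. The function name fit_center_prior and the fitting method are illustrative assumptions, not necessarily how SampleSubmission.zip was produced.

```python
import numpy as np


def fit_center_prior(mean_map: np.ndarray) -> np.ndarray:
    """Fit an axis-aligned 2D Gaussian to a mean saliency map (moment matching).

    Hypothetical helper: the fitting procedure is an assumption for illustration.
    """
    weights = mean_map.astype(np.float64)
    total = weights.sum()
    if total <= 0:
        raise ValueError("mean_map must contain positive values")
    weights = weights / total  # treat the map as a probability mass

    h, w = weights.shape
    ys, xs = np.mgrid[0:h, 0:w]

    # First moments give the Gaussian center, second moments its spread.
    mu_x = (weights * xs).sum()
    mu_y = (weights * ys).sum()
    var_x = max((weights * (xs - mu_x) ** 2).sum(), 1e-12)
    var_y = max((weights * (ys - mu_y) ** 2).sum(), 1e-12)

    gauss = np.exp(-0.5 * ((xs - mu_x) ** 2 / var_x + (ys - mu_y) ** 2 / var_y))
    return gauss / gauss.max()  # peak-normalize to [0, 1]


if __name__ == "__main__":
    # Synthetic stand-in for a mean training saliency map.
    ys, xs = np.mgrid[0:60, 0:100]
    demo = np.exp(-0.5 * (((xs - 50) / 8.0) ** 2 + ((ys - 30) / 5.0) ** 2))
    prior = fit_center_prior(demo)
    print(prior.shape, np.unravel_index(prior.argmax(), prior.shape))
```

The same static map would then be repeated for every frame of every video, which is what makes a center prior a content-independent baseline.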

Evaluation

Environment setup

conda create -n saliency python=3.10.19
conda activate saliency
pip install numpy==2.2.6 opencv-python-headless==4.12.0.88 tqdm==4.67.1
conda install ffmpeg=4.4.2 -c conda-forge
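
After running the commands above, the installed versions can be compared with the pinned ones. The following is a small sketch that uses only the standard library; the helper name check_pins is hypothetical, and the pins are copied from the pip command above.

```python
from __future__ import annotations

from importlib import metadata

# Version pins copied from the pip install command above.
PINS = {
    "numpy": "2.2.6",
    "opencv-python-headless": "4.12.0.88",
    "tqdm": "4.67.1",
}


def check_pins(pins: dict[str, str]) -> dict[str, tuple[str, str | None]]:
    """Return {package: (expected, installed)}; installed is None if missing."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (expected, installed)
    return report


if __name__ == "__main__":
    for name, (expected, installed) in check_pins(PINS).items():
        status = "ok" if installed == expected else "MISMATCH"
        print(f"{name}: expected {expected}, found {installed} [{status}]")
```

This checks only the Python packages; the ffmpeg version installed through conda is not covered.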

Run evaluation

Archives with videos were accepted from challenge participants as submissions and scored using the same pipeline as in bench.py.

Usage example:

Check that your predictions match the structure and names of the baseline SampleSubmission
Install the dependencies: pip install -r requirements.txt, conda install ffmpeg
Download and extract SaliencyTest.zip, FixationsTest.zip, and TrainTestSplit.json files from the dataset page
Run python bench.py with flags:

--model_video_predictions ./SampleSubmission-CenterPrior — folder with predicted saliency videos
--model_extracted_frames ./SampleSubmission-CenterPrior-Frames — folder to store prediction frames (should not exist at launch time), requires ~170 GB of free space
--gt_video_predictions ./SaliencyTest/Test — folder from dataset page with gt saliency videos
--gt_extracted_frames ./SaliencyTest-Frames — folder to store ground-truth frames (should not exist at launch time), requires ~170 GB of free space
--gt_fixations_path ./FixationsTest/Test — folder from dataset page with gt saliency fixations
--split_json ./TrainTestSplit.json — JSON from dataset page with the split of video names
--results_json ./results.json — path to the output results json
--mode public_test — subset to evaluate: public_test or private_test
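
The flags above can be combined into a single call. The sketch below assembles that call from the example values in the list and refuses to continue if either frames folder already exists, since both should not exist at launch time. The helper name build_bench_command is hypothetical, and the disk check assumes that both frames folders live on the same disk as the current directory.

```python
import shlex
import shutil
import sys
from pathlib import Path

# Each extracted-frames folder needs ~170 GB according to the flag descriptions.
REQUIRED_FREE_GB_PER_FOLDER = 170


def build_bench_command(
    model_frames: str = "./SampleSubmission-CenterPrior-Frames",
    gt_frames: str = "./SaliencyTest-Frames",
    mode: str = "public_test",
) -> list[str]:
    """Assemble the bench.py call from the example flag values listed above."""
    if mode not in ("public_test", "private_test"):
        raise ValueError(f"unknown mode: {mode}")
    # Both frames folders should not exist at launch time.
    for folder in (model_frames, gt_frames):
        if Path(folder).exists():
            raise FileExistsError(f"{folder} should not exist at launch time")
    return [
        sys.executable, "bench.py",
        "--model_video_predictions", "./SampleSubmission-CenterPrior",
        "--model_extracted_frames", model_frames,
        "--gt_video_predictions", "./SaliencyTest/Test",
        "--gt_extracted_frames", gt_frames,
        "--gt_fixations_path", "./FixationsTest/Test",
        "--split_json", "./TrainTestSplit.json",
        "--results_json", "./results.json",
        "--mode", mode,
    ]


if __name__ == "__main__":
    free_gb = shutil.disk_usage(".").free / 1e9
    if free_gb < 2 * REQUIRED_FREE_GB_PER_FOLDER:
        print(f"warning: only {free_gb:.0f} GB free in the current directory")
    # Print the command for review; pass the list to subprocess.run to execute it.
    print(shlex.join(build_bench_command()))
```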

The results are written to the path given by --results_json (./results.json in the example above).
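
The first step of the usage list, checking that predictions match the structure and names of SampleSubmission, could be automated roughly as follows. This sketch compares only relative file paths between two extracted folders; the function names are hypothetical, and it does not verify frame counts, resolution, or encoding, which bench.py may also depend on.

```python
from __future__ import annotations

from pathlib import Path


def relative_files(root: Path) -> set[str]:
    """Collect all file paths under root, relative to root, with '/' separators."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def compare_structure(reference: Path, submission: Path) -> tuple[set[str], set[str]]:
    """Return (missing, unexpected) file paths of submission w.r.t. reference."""
    ref = relative_files(reference)
    sub = relative_files(submission)
    return ref - sub, sub - ref


if __name__ == "__main__":
    # Folder names are placeholders; point them at the extracted archives.
    reference_dir = Path("./SampleSubmission-CenterPrior")
    submission_dir = Path("./MySubmission")
    if reference_dir.is_dir() and submission_dir.is_dir():
        missing, unexpected = compare_structure(reference_dir, submission_dir)
        print(f"missing: {sorted(missing)}")
        print(f"unexpected: {sorted(unexpected)}")
    else:
        print("extract both folders before running the comparison")
```

Two empty sets mean that every file in the sample submission has a counterpart with the same name and location in the checked folder.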
