We provide a novel audio-visual mouse saliency dataset with the following key features:
- Diverse content: movie, sports, live, vertical videos, etc.;
- Large scale: 2000 videos with a mean duration of 18 s;
- High resolution: all streams are FullHD;
- Audio track saved and played to observers;
- Mouse fixations from >5000 observers (>70 per video);
- License: CC-BY;
File structure:
- Videos.zip: 2000 (1200 Train + 800 Test) .mp4 videos (kind reminder: many videos contain an audio stream, and observers watched the videos with the sound turned ON!)
- TrainTestSplit.json: the Train/Public Test/Private Test split of all videos
- SaliencyTrain.zip/SaliencyTest.zip: continuous saliency map videos for the Train/Test subsets, compressed almost losslessly (crf 0, 10-bit, min-max normalized)
- FixationsTrain.zip/FixationsTest.zip: contain the following files for the Train/Test subsets:
  - .../video_name/fixations.json: per-frame fixation coordinates from which the saliency maps were obtained; this JSON is used for metric calculation
  - .../video_name/fixations_maps/: binary fixation maps in '.png' format (since several fixations can share the same pixel, this is a lossy representation and is NOT used for calculating metrics or generating Gaussians; we provide the maps only for visualization and frame-count checks)
- VideoInfo.json: meta information about each video (e.g. license)
- SampleSubmission.zip: example submission for the challenge, obtained from a Center Prior Gaussian fitted to the mean of the training saliency maps.
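To illustrate why the binary fixation maps are a lossy representation, here is a minimal sketch that counts fixations per pixel and compares the result with a binary map. The input format (a list of `(x, y)` pixel pairs) and the function name are assumptions made for illustration; the actual layout of fixations.json is not described here and may differ.

```python
from collections import Counter


def fixation_count_map(fixations, width, height):
    """Count fixations per pixel.

    `fixations` is assumed to be an iterable of (x, y) pairs in pixel units;
    this is a hypothetical format, not the documented fixations.json layout.
    Returns a Counter keyed by (row, col).
    """
    counts = Counter()
    for x, y in fixations:
        col, row = int(round(x)), int(round(y))
        # Ignore fixations that fall outside the frame.
        if 0 <= col < width and 0 <= row < height:
            counts[(row, col)] += 1
    return counts


# Three fixations land on the same pixel, one lands elsewhere.
fixations = [(10, 5), (10, 5), (10.2, 5.1), (40, 20)]
counts = fixation_count_map(fixations, width=64, height=36)

# A binary map only records WHICH pixels were hit, not how often.
binary_pixels = set(counts)

print("fixations kept by the count map:", sum(counts.values()))
print("pixels set in the binary map:   ", len(binary_pixels))
```

In this toy case the binary map stores two pixels for four fixations, which is why metrics should be computed from the coordinates in fixations.json.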
conda create -n saliency python=3.10.19
conda activate saliency
pip install numpy==2.2.6 opencv-python-headless==4.12.0.88 tqdm==4.67.1
conda install ffmpeg=4.4.2 -c conda-forge
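After installation, a quick sanity check of the environment can save a failed run later. The sketch below is not part of the benchmark; the helper names are hypothetical. It compares the installed package versions with the pins above and reports whether `ffmpeg` is on `PATH`.

```python
import shutil
import subprocess
from importlib import metadata

# Version pins copied from the install commands above.
EXPECTED = {
    "numpy": "2.2.6",
    "opencv-python-headless": "4.12.0.88",
    "tqdm": "4.67.1",
}


def version_report(expected=EXPECTED):
    """Map each package name to (wanted, found, matches); found is None if missing."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        report[name] = (wanted, found, found == wanted)
    return report


def ffmpeg_version_line(executable="ffmpeg"):
    """Return the first line of `ffmpeg -version`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


for name, (wanted, found, ok) in version_report().items():
    print(f"{name}: wanted {wanted}, found {found}, {'OK' if ok else 'MISMATCH'}")
print(ffmpeg_version_line() or "ffmpeg not found on PATH")
```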
Archives with videos were accepted from challenge participants as submissions and scored using the same pipeline as in bench.py.
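Because submissions are scored by folder and file name, it can help to compare a prediction folder against the extracted SampleSubmission before running the benchmark. The following is a minimal sketch under that assumption; `compare_structure` is a hypothetical helper, not part of bench.py, and the file names in the demo are made up.

```python
import tempfile
from pathlib import Path


def relative_files(root):
    """Return the set of file paths under `root`, relative to it, in POSIX form."""
    root = Path(root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def compare_structure(predictions_dir, sample_dir):
    """Return (missing, unexpected) file lists relative to the sample submission."""
    predicted = relative_files(predictions_dir)
    expected = relative_files(sample_dir)
    return sorted(expected - predicted), sorted(predicted - expected)


# Demo on temporary folders with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    sample, preds = Path(tmp, "sample"), Path(tmp, "preds")
    sample.mkdir()
    preds.mkdir()
    for name in ("a.mp4", "b.mp4"):
        (sample / name).touch()
    for name in ("a.mp4", "c.mp4"):
        (preds / name).touch()
    missing, unexpected = compare_structure(preds, sample)

print("missing from predictions:", missing)
print("not in sample submission:", unexpected)
```

Both lists should be empty for a submission that mirrors the baseline layout.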
Usage example:
- Check that your predictions match the structure and names of the baseline SampleSubmission
- Install dependencies: pip install -r requirments.txt, conda install ffmpeg
- Download and extract the SaliencyTest.zip, FixationsTest.zip, and TrainTestSplit.json files from the dataset page
- Run python bench.py with flags:
  - --model_video_predictions ./SampleSubmission-CenterPrior: folder with predicted saliency videos
  - --model_extracted_frames ./SampleSubmission-CenterPrior-Frames: folder to store prediction frames (should not exist at launch time), requires ~170 GB of free space
  - --gt_video_predictions ./SaliencyTest/Test: folder from the dataset page with ground-truth saliency videos
  - --gt_extracted_frames ./SaliencyTest-Frames: folder to store ground-truth frames (should not exist at launch time), requires ~170 GB of free space
  - --gt_fixations_path ./FixationsTest/Test: folder from the dataset page with ground-truth fixations
  - --split_json ./TrainTestSplit.json: JSON from the dataset page with the split of video names
  - --results_json ./results.json: path to the output results JSON
  - --mode public_test: subset to evaluate (public_test or private_test)
- The results are written to the path given by --results_json (here, results.json)
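The run described above can be prepared from Python as in the sketch below. The flags and paths are the ones listed in the usage example; the `preflight` and `build_bench_command` helpers are hypothetical conveniences, not part of bench.py. The preflight check covers the two stated constraints: the frame folders must not exist yet, and each needs roughly 170 GB of free space.

```python
import shutil
import sys
from pathlib import Path

REQUIRED_FREE_BYTES = 170 * 10**9  # ~170 GB per frames folder, as stated above

FRAME_DIRS = ["./SampleSubmission-CenterPrior-Frames", "./SaliencyTest-Frames"]


def preflight(frame_dirs, required_free=REQUIRED_FREE_BYTES):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    for frame_dir in map(Path, frame_dirs):
        if frame_dir.exists():
            problems.append(f"{frame_dir} already exists")
        # Measure free space on the nearest existing ancestor directory.
        parent = frame_dir.resolve().parent
        while not parent.exists():
            parent = parent.parent
        free = shutil.disk_usage(parent).free
        # Note: if both folders share one volume, the requirements add up.
        if free < required_free:
            problems.append(f"{frame_dir}: only {free / 1e9:.0f} GB free")
    return problems


def build_bench_command(mode="public_test"):
    """Assemble the bench.py command line using the flags from the usage example."""
    return [
        sys.executable, "bench.py",
        "--model_video_predictions", "./SampleSubmission-CenterPrior",
        "--model_extracted_frames", FRAME_DIRS[0],
        "--gt_video_predictions", "./SaliencyTest/Test",
        "--gt_extracted_frames", FRAME_DIRS[1],
        "--gt_fixations_path", "./FixationsTest/Test",
        "--split_json", "./TrainTestSplit.json",
        "--results_json", "./results.json",
        "--mode", mode,
    ]


cmd = build_bench_command()
print("problems:", preflight(FRAME_DIRS))
print(" ".join(cmd))
# To launch the benchmark: import subprocess; subprocess.run(cmd, check=True)
```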