Short Phrase Reduction Stimuli

A curated collection of 8 short polyphonic phrases.

Overview

This dataset contains musical excerpts originally used as stimuli in behavioral experiments investigating note-level structural salience in polyphonic music. Each phrase is selected to be:

Short: 8-16 bars, suitable for iterative reduction tasks
Tonally coherent: Begins and ends in the same key (or returns to tonic)
Structurally complete: Contains clear phrase boundaries
Polyphonically rich: Features multiple voices with melodic and harmonic content

Dataset Structure

short_phrase_collections/
├── README.md
├── LICENSE
├── metadata/
│   └── phrases.csv          # Main metadata (single source of truth)
├── stimuli/
│   ├── phrase_01/           # Purcell - Rondeau from Abdelazer
│   │   ├── score.musicxml
│   │   └── score.pdf
│   ├── phrase_02/           # Handel - Suite HWV 450 Menuet
│   └── ...
└── docs/
    └── selection_criteria.md

Phrases

ID	Composer	Work	Key	Bars
phrase_01	Purcell	Rondeau from Abdelazer	D minor	1-8
phrase_02	Handel	Suite HWV 450 (Menuet)	G major	1-8
phrase_03	Nardini	Adagio cantabile	A major	1-8
phrase_04	Telemann	Gigue à l'Angloise	—	—
phrase_05	Mozart	Piano Sonata K.331	A major	1-8
phrase_06	Corelli	Sarabande Op.5 No.7	D minor	1-16
phrase_07	Tartini	Violin Sonata (Devil's Trill)	G minor	—
phrase_08	Haydn	Symphony No.88, 2nd mvt	D major	1-8

Note ID Convention

Each note in a stimulus file is assigned a unique ID following this traversal order:

Sort all notes by offset (onset time) ascending
For notes at the same offset, sort by midi pitch descending (higher pitch first)
Assign sequential IDs: phrase_01_1, phrase_01_2, ...

This ensures deterministic, reproducible ID assignment across processing pipelines.

Deduplication

Some stimuli required preprocessing to remove duplicate notes (same onset + same pitch):

Phrase	Notes Removed	Reason
phrase_03 (Nardini)	4	Notation artifacts
phrase_08 (Haydn)	19	Orchestral unison doublings in piano reduction

Policy: When duplicates exist, the note with the longest duration is kept. Full details are recorded in metadata/phrases.csv.

Usage

Loading with music21 (Python)

from music21 import converter

score = converter.parse('stimuli/phrase_01/score.musicxml')
for note in score.flatten().notes:
    print(note.pitch, note.offset, note.quarterLength)

Metadata Access

import pandas as pd

phrases = pd.read_csv('metadata/phrases.csv')
print(phrases[['phrase_id', 'composer', 'work_title', 'key']])

Selection Criteria

See docs/selection_criteria.md for detailed inclusion/exclusion criteria.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Short Phrase Reduction Stimuli

Overview

Dataset Structure

Phrases

Note ID Convention

Deduplication

Usage

Loading with music21 (Python)

Metadata Access

Selection Criteria

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
docs		docs
metadata		metadata
stimuli		stimuli
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation

Short Phrase Reduction Stimuli

Overview

Dataset Structure

Phrases

Note ID Convention

Deduplication

Usage

Loading with music21 (Python)

Metadata Access

Selection Criteria

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages