Implement per-key padding configuration in PaddedBatch #3008
Merged
TParcollet merged 10 commits into develop on Dec 1, 2025
Added support for per-key padding configuration in PaddedBatch.
Added tests for PaddedBatch functionality including per-key padding, mixed configuration, numpy array support, and backward compatibility.
Initialize per_key_padding_kwargs to an empty dictionary by default.
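One common way to get an empty-dictionary default without sharing a mutable default between instances is to accept `None` and replace it inside the constructor. The sketch below illustrates only that pattern; `BatchSketch` is a hypothetical stand-in and is not the actual `PaddedBatch` code.

```python
class BatchSketch:
    """Hypothetical stand-in used only to illustrate the default handling."""

    def __init__(self, examples, padding_kwargs=None, per_key_padding_kwargs=None):
        self.examples = examples
        # Replace None with a fresh dict so instances never share state.
        self.padding_kwargs = {} if padding_kwargs is None else padding_kwargs
        self.per_key_padding_kwargs = (
            {} if per_key_padding_kwargs is None else per_key_padding_kwargs
        )


first = BatchSketch([])
second = BatchSketch([])
first.per_key_padding_kwargs["labels"] = {"value": -100}
```

With this pattern, mutating the configuration of `first` leaves `second` untouched, and callers that never pass `per_key_padding_kwargs` see the same behavior as before.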
Pull request overview
This PR adds per-key padding configuration support to the PaddedBatch class, allowing different padding values (and other padding parameters) to be specified for different data keys in a batch. This is useful for scenarios where different types of data require different padding strategies, such as padding audio with 0 and labels with -100 (a common ignore index in loss calculations).
- Adds a new `per_key_padding_kwargs` parameter to `PaddedBatch.__init__()` that accepts a dictionary mapping keys to padding configuration dictionaries
- The per-key configuration takes precedence over the global `padding_kwargs` for specified keys
- Maintains full backward compatibility with existing code
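The precedence rule described above can be sketched in plain Python. This is a minimal illustration and not the SpeechBrain implementation: `pad_sequences` and `pad_batch` are hypothetical helpers, lists stand in for tensors, and it assumes that a per-key entry replaces the global `padding_kwargs` entirely for that key rather than being merged with it.

```python
def pad_sequences(seqs, value=0):
    """Right-pad every sequence to the longest length using `value`."""
    max_len = max(len(s) for s in seqs)
    return [list(s) + [value] * (max_len - len(s)) for s in seqs]


def pad_batch(examples, padding_kwargs=None, per_key_padding_kwargs=None):
    """Pad each key of a list of example dicts (hypothetical helper)."""
    padding_kwargs = padding_kwargs or {}
    per_key_padding_kwargs = per_key_padding_kwargs or {}
    batch = {}
    for key in examples[0]:
        # A per-key configuration, when present, wins over the global one.
        kwargs = per_key_padding_kwargs.get(key, padding_kwargs)
        batch[key] = pad_sequences([ex[key] for ex in examples], **kwargs)
    return batch


examples = [
    {"wav": [0.1, 0.2, 0.3], "labels": [1, 2, 3]},
    {"wav": [0.4], "labels": [4]},
]
batch = pad_batch(
    examples,
    padding_kwargs={"value": 0},
    per_key_padding_kwargs={"labels": {"value": -100}},
)
```

Here `batch["wav"]` is padded with `0` from the global configuration, while `batch["labels"]` is padded with `-100`, the ignore index mentioned in the overview. Omitting `per_key_padding_kwargs` falls back to the global configuration for every key, which is the backward-compatible path.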
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| speechbrain/dataio/batch.py | Adds per_key_padding_kwargs parameter and implements logic to apply per-key or global padding configuration based on key presence |
| tests/unittests/test_batching.py | Adds comprehensive test coverage including basic per-key padding, mixed configurations, numpy array handling, and backward compatibility |
Co-authored-by: Copilot <[email protected]>
Clarify the optional nature of padding_kwargs and per_key_padding_kwargs in the docstring.
What does this PR do?
Allows individual keys in a batch to have their own padding values.