[ML] Add aten::split and aten::stack for question-answering models#3012
Merged
edsavage merged 1 commit into elastic:main (Mar 26, 2026)
Conversation
The deepset/tinyroberta-squad2 model uses aten::split (and aten::stack, per ES node logs) in its answer span extraction logic. These ops only appear when the model is traced with AutoModelForQuestionAnswering rather than AutoModel, so the extraction configs are updated to use the correct auto_class.

Also verified that LaBSE, BAAI/bge-reranker-base, and castorini/bpr-nq-ctx-encoder (from the supported models docs) are all covered by the existing allowlist.

Made-with: Cursor
✅ Snyk checks have passed. No issues have been found so far.
Pull request overview
This PR updates the TorchScript op allowlist and model-tracing configs so question-answering (QA) model graphs (specifically deepset/tinyroberta-squad2 traced via AutoModelForQuestionAnswering) are fully captured and validated by the PyTorch graph validator.
Changes:
- Add aten::split (and aten::stack) to the C++ allowed-operations allowlist.
- Update extraction/validation model configs to trace qa-tinyroberta-squad2 using AutoModelForQuestionAnswering.
- Refresh the golden per-model op reference for qa-tinyroberta-squad2 to reflect the QA-head trace (adds ops like aten::split, prim::ListUnpack, etc.).
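The PR doesn't show the config file contents, but based on the files-changed table, a validation_models.json entry for the QA model might look roughly like the following. All field names other than auto_class are assumptions for illustration; only the auto_class value and model identifiers come from this PR:

```json
{
  "qa-tinyroberta-squad2": {
    "hub_id": "deepset/tinyroberta-squad2",
    "auto_class": "AutoModelForQuestionAnswering"
  }
}
```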
Reviewed changes
Copilot reviewed 2 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| dev-tools/extract_model_ops/validation_models.json | Switch qa-tinyroberta-squad2 to an explicit spec using auto_class=AutoModelForQuestionAnswering for validation tracing. |
| dev-tools/extract_model_ops/reference_models.json | Same auto_class switch so golden extraction traces the QA head rather than the base model. |
| bin/pytorch_inference/unittest/testfiles/reference_model_ops.json | Updates the golden ops list for qa-tinyroberta-squad2 and adds an auto_class field. |
| bin/pytorch_inference/CSupportedOperations.cc | Adds aten::split and aten::stack to the runtime allowlist. |
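The allowlist check these files feed into can be sketched as follows. This is a minimal pure-Python illustration of the concept, not the actual implementation, which lives in C++ (CSupportedOperations.cc) and walks real TorchScript graph nodes; the op names are ones mentioned in this PR:

```python
# Concept sketch: reject a traced graph if any of its op kinds is not on
# the runtime allowlist. The real check is C++; this mirrors the idea only.
ALLOWED_OPS = {
    "aten::split",       # added in this PR
    "aten::stack",       # added in this PR
    "aten::contiguous",
    "aten::squeeze",
    "prim::ListUnpack",
}

def disallowed_ops(op_kinds):
    """Return the sorted list of op kinds missing from the allowlist."""
    return sorted(set(op_kinds) - ALLOWED_OPS)

# A QA-head trace fragment: with aten::split allowlisted, nothing is rejected.
qa_trace = ["aten::split", "prim::ListUnpack", "aten::squeeze", "aten::contiguous"]
print(disallowed_ops(qa_trace))  # []
```

Before this PR, the same check would have returned ["aten::split"] for the QA-head trace, which is exactly the graph-validator rejection the test plan verifies is gone.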
Summary
- Add aten::split and aten::stack to the allowlist: needed by deepset/tinyroberta-squad2 for answer span extraction in its question-answering head.
- These ops only appear when the model is traced with AutoModelForQuestionAnswering rather than AutoModel. Updates the extraction configs to use the correct auto_class.
- The refreshed golden op reference (reference_model_ops.json) shows additional ops (aten::contiguous, aten::squeeze, prim::ListUnpack) in the updated tinyroberta entry; these are already in the allowlist but weren't observed in the previous base-model trace. The switch to AutoModelForQuestionAnswering captures the full QA-head graph, which uses these ops.
- Verified that sentence-transformers/LaBSE, BAAI/bge-reranker-base, and castorini/bpr-nq-ctx-encoder (from the supported models docs) are all covered by the existing allowlist.

Test plan
- deepset/tinyroberta-squad2 no longer rejected by graph validator
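For context on why the QA head emits these ops at all: a question-answering head produces one (start, end) logit pair per token and splits that last dimension into separate start/end vectors, which typically traces to aten::split followed by prim::ListUnpack and aten::squeeze. A pure-Python stand-in for that step, with made-up toy values (no torch dependency; the real model operates on tensors):

```python
# Sketch of the answer-span extraction step that emits aten::split in a
# QA trace: split per-token (start, end) logit pairs into two flat lists,
# then pick the highest-scoring start and end positions.
def split_qa_logits(logits):
    """Mimics splitting the QA head's last dimension into start/end logits."""
    start_logits = [pair[0] for pair in logits]
    end_logits = [pair[1] for pair in logits]
    return start_logits, end_logits

logits = [(0.1, 0.0), (2.3, 0.2), (0.5, 3.1)]  # 3 tokens, toy values
start, end = split_qa_logits(logits)
best_span = (start.index(max(start)), end.index(max(end)))
print(best_span)  # (1, 2): answer spans tokens 1..2
```

Because the base AutoModel trace stops at the encoder output, this head logic (and hence aten::split) never appeared in the old golden trace.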