Pull requests: vllm-project/llm-compressor

Pull requests list

[do not land] GPTQ actorder regression test suite
Labels: awq, fp8, gptq, llama, qwen, w4a16
#2643 opened Apr 22, 2026 by HDCharles Collaborator Draft
3 tasks
Add SmoothQuant layer mappings for Cohere, DeepSeek V3, and Phi3
Labels: enhancement, smoothquant, transforms, two-reviews
#2639 opened Apr 21, 2026 by jayakumarpujar
4 tasks done
[AWQ] Per-output-slice grid search for fused q_proj (Qwen3.5 attn_output_gate)
Labels: awq, enhancement, quality-failed, qwen, Refactor, transforms
#2636 opened Apr 21, 2026 by juju812 Draft
[AWQ] Seed grid search with identity baseline + fail fast on non-finite loss
Labels: awq, enhancement, needs-rebase, Refactor, two-reviews
#2635 opened Apr 21, 2026 by juju812
2 of 3 tasks
[Deprecation] [Offload] [Tracing] Remove legacy offloading logic in tracing
Labels: Refactor, tracing
#2633 opened Apr 20, 2026 by kylesayrs Collaborator
[Deprecation] Replace deprecated function usage
Labels: autoround, quality-failed, Refactor
#2632 opened Apr 20, 2026 by kylesayrs Collaborator
add example of w8a8fp8 for qwen3.5
Labels: documentation, enhancement, fp8, qwen, two-reviews
#2631 opened Apr 20, 2026 by zhangxin81
Adding test_group to lm-eval configs
Labels: enhancement, fp8, nvfp4, two-reviews, w4a16
#2623 opened Apr 16, 2026 by debroy-rh
Defer weight qparams to epoch end, unify calibration lifecycle
#2621 opened Apr 15, 2026 by HDCharles Collaborator
2 of 5 tasks
test gptq issue [not for land]
Labels: enhancement, gptq, nvfp4, quality-failed
#2617 opened Apr 14, 2026 by HDCharles Collaborator
Add actorder support for GPTQ block quantization
Labels: enhancement, fp8, gptq, ready, Refactor, two-reviews
#2616 opened Apr 14, 2026 by rk119
[Tests] Add transformers v5 modeling tests and clean up import guards
Labels: qwen, Refactor
#2614 opened Apr 13, 2026 by dsikka Collaborator
[not for land] DDP regression tests
Labels: awq, documentation, enhancement, llama, quality-failed, qwen
#2613 opened Apr 13, 2026 by HDCharles Collaborator
4 tasks done
fix: support transformers >= 5.0 (TORCH_INIT_FUNCTIONS fallback)
Labels: bug, qwen, two-reviews, w4a16
#2608 opened Apr 12, 2026 by quivent
[oneshot] clean offload_dir during post-processing
#2605 opened Apr 10, 2026 by brian-dellabetta Collaborator Draft
3 tasks
[docs] deepseek v3.2 docs
Labels: documentation, ready
#2602 opened Apr 10, 2026 by brian-dellabetta Collaborator
fix: correct TOKENIZERS_PARALLELISM_ENV constant value
Labels: needs-rebase, ready, two-reviews
#2596 opened Apr 10, 2026 by kuishou68
[Refactor] Refactor splits to only use the "calibration" split (#2551)
Labels: needs-rebase, ready, Refactor, two-reviews
#2589 opened Apr 8, 2026 by arpitkh101
Observers refactor
Labels: needs-rebase
#2585 opened Apr 8, 2026 by HDCharles Collaborator
[Refactor] Consolidate Intermediate Offloading
Labels: needs-rebase, two-reviews
#2583 opened Apr 8, 2026 by menogrey Contributor
[AWQ] [gemma3] remove input layernorm mapping
#2571 opened Apr 6, 2026 by brian-dellabetta Collaborator
1 task
feat: add ActivationOrdering support for per-channel GPTQ quantization
Labels: needs-rebase, ready, two-reviews
#2525 opened Mar 26, 2026 by matdou
[Examples] Reorganize examples by model/scheme/algo hierarchy
Labels: documentation, needs-rebase
#2510 opened Mar 24, 2026 by dsikka Collaborator Draft