[CELEBORN-2218] Bump lz4-java version from 1.8.0 to 1.10.4 to resolve CVE-2025-12183 and CVE-2025-66566 #3555

Closed
SteNicholas wants to merge 5 commits into apache:main from SteNicholas:CELEBORN-2218

@SteNicholas
Member

@SteNicholas SteNicholas commented Dec 3, 2025

What changes were proposed in this pull request?

  • Bump lz4-java version from 1.8.0 to 1.10.4 to resolve CVE-2025-12183 and CVE-2025-66566.
  • Lz4Decompressor follows the suggestion to move from fastDecompressor to safeDecompressor to mitigate the performance impact of the fix.

Backport:

Why are the changes needed?

  • CVE-2025-12183: Various lz4-java compression and decompression implementations do not guard against out-of-bounds memory access. Untrusted input may lead to denial of service and information disclosure. Vulnerable Maven coordinates: org.lz4:lz4-java up to and including 1.8.0.

  • CVE-2025-66566: Insufficient clearing of the output buffer in Java-based decompressor implementations in lz4-java 1.10.0 and earlier allows remote attackers to read previous buffer contents via crafted compressed input. In applications where the output buffer is reused without being cleared, this may lead to disclosure of sensitive data. JNI-based implementations are not affected.

Therefore, lz4-java should be upgraded to 1.10.4.
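
The affected ranges quoted above can be encoded as a small check. This is a minimal sketch, assuming plain dotted numeric versions. `Lz4CveCheck` and its methods are hypothetical names for illustration, not part of lz4-java or Celeborn, and the thresholds are simply the ones stated in this description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: maps an lz4-java version to the CVEs this PR says affect it.
final class Lz4CveCheck {

    // Compares dotted numeric versions component by component ("1.10.4" vs "1.8.0").
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // Thresholds taken from the PR description:
    // CVE-2025-12183 affects versions up to and including 1.8.0,
    // CVE-2025-66566 affects versions up to and including 1.10.0.
    static List<String> affectedBy(String version) {
        List<String> cves = new ArrayList<>();
        if (compareVersions(version, "1.8.0") <= 0) {
            cves.add("CVE-2025-12183");
        }
        if (compareVersions(version, "1.10.0") <= 0) {
            cves.add("CVE-2025-66566");
        }
        return cves;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"1.8.0", "1.8.1", "1.10.0", "1.10.4"}) {
            System.out.println(v + " -> " + affectedBy(v));
        }
    }
}
```

Comparing components numerically matters here: a plain string comparison would rank 1.10.4 below 1.8.0.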

Does this PR resolve a correctness bug?

No.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

CI.

@yawkat

yawkat commented Dec 3, 2025

I recommend you stick with fastestInstance. It is secure as long as you are on 1.8.1+. It will be much slower than in previous versions, but that can be mitigated by moving to fastestInstance.safeDecompressor like you did in this PR, which is much faster.

@SteNicholas
Member Author

@yawkat, thanks for the review. I have stuck with fastestInstance: with the 1.8.1 patch applied, these workarounds are not necessary. It is still recommended to move from fastDecompressor to safeDecompressor to mitigate the performance impact of the fix, however. PTAL.

@SteNicholas SteNicholas force-pushed the CELEBORN-2218 branch 2 times, most recently from c113f4f to 202ba06 Compare December 4, 2025 13:49
@SteNicholas SteNicholas force-pushed the CELEBORN-2218 branch 2 times, most recently from 95d3ed2 to 7ec6523 Compare December 11, 2025 05:56
@SteNicholas SteNicholas force-pushed the CELEBORN-2218 branch 2 times, most recently from da9e0bc to 8043b4a Compare December 11, 2025 06:24
@yawkat

yawkat commented Dec 11, 2025

Also fyi there was another cve (CVE-2025-66566) that needs a newer version

@SteNicholas SteNicholas changed the title from "[CELEBORN-2218] Bump lz4-java version from 1.8.0 to 1.8.1 to resolve CVE-2025-12183" to "[CELEBORN-2218] Bump lz4-java version from 1.8.0 to 1.10.0 to resolve CVE-2025-12183 and CVE-2025-66566" Dec 12, 2025
@Marcono1234

CVE-2025-66566 affects versions less than or equal to 1.10.0. You should upgrade to 1.10.1.

@SteNicholas SteNicholas changed the title from "[CELEBORN-2218] Bump lz4-java version from 1.8.0 to 1.10.0 to resolve CVE-2025-12183 and CVE-2025-66566" to "[CELEBORN-2218] Bump lz4-java version from 1.8.0 to 1.10.1 to resolve CVE-2025-12183 and CVE-2025-66566" Dec 15, 2025
@SteNicholas
Member Author

Ping @pan3793, @yawkat, @Marcono1234.

@pan3793
Member

pan3793 commented Dec 16, 2025

lz4 is famous for its ultra-fast speed, and the upgrade is not free: my test shows it has a perf impact - apache/spark#53453

I understand that security takes precedence over performance, so I'm fine with this change.

As for the suggestion of 'moving to fastestInstance.safeDecompressor', I think we can NOT do that blindly. Celeborn Spark/Flink clients use the lz4-java libs provided by the engine. Since we support a wide range of Spark/Flink versions, it's possible that the engine still ships an old lz4-java jar, so we may need to dynamically check and bind fastDecompressor or safeDecompressor based on the runtime version of lz4-java.
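
A minimal sketch of such a runtime check, assuming the library jar's manifest carries an `Implementation-Version` entry (it may not, in which case the lookup returns null). `DecompressorChoice`, `Kind`, and the policy of keeping fastDecompressor below 1.8.1 are hypothetical, chosen only to illustrate the idea:

```java
// Hypothetical helper: picks a decompressor kind from the library version found at runtime.
final class DecompressorChoice {

    enum Kind { FAST, SAFE }

    // Reads Implementation-Version from the manifest of the jar that provides
    // the given class. Returns null when the manifest does not declare it.
    static String runtimeVersion(Class<?> libraryClass) {
        Package p = libraryClass.getPackage();
        return p == null ? null : p.getImplementationVersion();
    }

    // True when `version` (e.g. "1.10.4" or "1.8.1-SNAPSHOT") is >= major.minor.patch.
    static boolean atLeast(String version, int major, int minor, int patch) {
        String[] parts = version.split("-")[0].split("\\.");
        int[] want = {major, minor, patch};
        for (int i = 0; i < want.length; i++) {
            int have = i < parts.length ? Integer.parseInt(parts[i]) : 0;
            if (have != want[i]) {
                return have > want[i];
            }
        }
        return true;
    }

    // Assumed policy for illustration: unknown or patched versions (1.8.1+)
    // bind the safe decompressor, older versions keep the fast one.
    static Kind choose(String version) {
        if (version == null) {
            return Kind.SAFE;
        }
        return atLeast(version, 1, 8, 1) ? Kind.SAFE : Kind.FAST;
    }

    public static void main(String[] args) {
        System.out.println(choose("1.8.0"));
        System.out.println(choose("1.10.4"));
    }
}
```

In a client, `runtimeVersion` would be given a class from the engine-provided jar, for example `net.jpountz.lz4.LZ4Factory.class`, and the returned `Kind` would decide which decompressor to bind.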

@yawkat

yawkat commented Dec 16, 2025

@pan3793 safeDecompressor should work just fine on old versions, and even on those old versions, it should be slightly faster than fastDecompressor. In fact, using safeDecompressor gets rid of most (but not all) of the security impact of the CVEs on old versions.

@pan3793
Member

pan3793 commented Mar 2, 2026

@SteNicholas Code change looks fine, but let's wait a while to collect feedback from other reviewers about the performance drop.

@SteNicholas
Member Author

SteNicholas commented Mar 2, 2026

@yawkat, could you please take a look at the performance drop of safeDecompressor shown in the benchmark report LZ4TPCDSDataBenchmark-jdk17-results.txt?

@yawkat

yawkat commented Mar 2, 2026

@SteNicholas I don't have time for deep benchmarking, but I just threw an AI agent at it, and it figured out that some build flag changes that distros do can improve performance by 10%. Could you test with the 1.10.4 I just released?

@pan3793
Member

pan3793 commented Mar 3, 2026

@yawkat thanks! it indeed solves the performance regression. our benchmark shows 1.10.4 is much faster than 1.10.3, and even faster than 1.8.0! and I got a similar result in Spark apache/spark#54585

@RexXiong
Contributor

RexXiong commented Mar 3, 2026

> @yawkat thanks! it indeed solves the performance regression. our benchmark shows 1.10.4 is much faster than 1.10.3, and even faster than 1.8.0! and I got a similar result in Spark apache/spark#54585

Sounds great. I think we can keep LZ4 as the default.

@SteNicholas SteNicholas changed the title from "[CELEBORN-2218] Bump lz4-java version from 1.8.0 to 1.10.3 to resolve CVE-2025-12183 and CVE-2025-66566" to "[CELEBORN-2218] Bump lz4-java version from 1.8.0 to 1.10.4 to resolve CVE-2025-12183 and CVE-2025-66566" Mar 3, 2026
SteNicholas added a commit that referenced this pull request Mar 3, 2026
… CVE‐2025‐12183 and CVE-2025-66566

- Bump lz4-java version from 1.8.0 to 1.10.4 to resolve CVE-2025-12183 and CVE-2025-66566.
- `Lz4Decompressor` follows the [suggestion](apache/spark#53290 (comment)) to move from `fastDecompressor` to `safeDecompressor` to mitigate the performance impact of the fix.

Backport:

- apache/spark#53327
- apache/spark#53347
- apache/spark#53971
- apache/spark#53454
- apache/spark#54585

- [CVE‐2025‐12183](https://sites.google.com/sonatype.com/vulnerabilities/cve-2025-12183): Various lz4-java compression and decompression implementations do not guard against out-of-bounds memory access. Untrusted input may lead to denial of service and information disclosure. Vulnerable Maven coordinates: org.lz4:lz4-java up to and including 1.8.0.

- [CVE-2025-66566](GHSA-cmp6-m4wj-q63q): Insufficient clearing of the output buffer in Java-based decompressor implementations in lz4-java 1.10.0 and earlier allows remote attackers to read previous buffer contents via crafted compressed input. In applications where the output buffer is reused without being cleared, this may lead to disclosure of sensitive data. JNI-based implementations are not affected.

Therefore, lz4-java should be upgraded to 1.10.4.

No.

No.

CI.

Closes #3555 from SteNicholas/CELEBORN-2218.

Lead-authored-by: SteNicholas <[email protected]>
Co-authored-by: Cheng Pan <[email protected]>
Signed-off-by: SteNicholas <[email protected]>
(cherry picked from commit dca3749)
Signed-off-by: SteNicholas <[email protected]>
@SteNicholas
Member Author

Thanks, all. Merged to main (v0.7.0) and branch-0.6 (v0.6.3).

@yawkat

yawkat commented Mar 3, 2026

Given that zstd is from the same authors as lz4 but newer, it may still be a good idea to move to zstd as the default long-term.

pan3793 added a commit to pan3793/iceberg that referenced this pull request Mar 5, 2026
Iceberg switched to the `at.yawk.lz4:lz4-java` group for security reasons, but this unintentionally introduced a performance regression.

https://github.com/yawkat/lz4-java/releases/tag/v1.10.4

> These changes attempt to fix the native performance regression in 1.9+. They should have no functional or security impact.

See the benchmark reports in the Celeborn and Spark projects:

- CELEBORN-2218 / apache/celeborn#3555
- SPARK-55803 / apache/spark#54585
pan3793 added a commit to pan3793/trino that referenced this pull request Mar 5, 2026
Trino switched to the `at.yawk.lz4:lz4-java` group for security reasons, but this unintentionally introduced a performance regression.

https://github.com/yawkat/lz4-java/releases/tag/v1.10.4

> These changes attempt to fix the native performance regression in 1.9+. They should have no functional or security impact.

See the benchmark reports in the Celeborn and Spark projects:

- CELEBORN-2218 / apache/celeborn#3555
- SPARK-55803 / apache/spark#54585
electrum pushed a commit to trinodb/trino that referenced this pull request Mar 5, 2026
pan3793 added a commit to pan3793/clickhouse-java that referenced this pull request Mar 5, 2026
ClickHouse Java Client switched to `at.yawk.lz4:lz4-java` for security reasons, but this unintentionally introduced a performance regression.

https://github.com/yawkat/lz4-java/releases/tag/v1.10.4

> These changes attempt to fix the native performance regression in 1.9+. They should have no functional or security impact.

See the benchmark reports in the Apache Celeborn and Apache Spark projects:

- CELEBORN-2218 / apache/celeborn#3555
- SPARK-55803 / apache/spark#54585
huaxingao pushed a commit to apache/iceberg that referenced this pull request Mar 6, 2026
RjLi13 pushed a commit to RjLi13/iceberg that referenced this pull request Mar 12, 2026