[CELEBORN-2218] Bump lz4-java version from 1.8.0 to 1.10.4 to resolve CVE-2025-12183 and CVE-2025-66566 (#3555)
SteNicholas wants to merge 5 commits into apache:main from SteNicholas/CELEBORN-2218
Conversation
I recommend you stick with `fastestInstance`. It is secure as long as you are on 1.8.1+. It will be much slower than in previous versions, but that can be mitigated by moving to `fastestInstance.safeDecompressor` as you did in this PR, which is much faster.
@yawkat, thanks for the review. I have stuck with `fastestInstance`.
Also FYI, there was another CVE (CVE-2025-66566) that needs a newer version.
CVE-2025-66566 affects versions less than or equal to 1.10.0. You should upgrade to 1.10.1. |
Ping @pan3793, @yawkat, @Marcono1234. |
lz4 is famous for its ultra-fast speed, so the upgrade is not free; my test shows it has a perf impact (apache/spark#53453). I understand that security takes precedence over performance, so I'm fine with this change. As for the suggestion of moving to `fastestInstance.safeDecompressor`, I think we can NOT do that blindly: Celeborn Spark/Flink clients use the lz4-java libs provided by the engine, and since we support a wide range of Spark/Flink versions, it's possible that the engine still ships an old lz4-java jar, so we may need to dynamically check whether `safeDecompressor` is available and bind it at runtime.
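The dynamic-binding idea above can be sketched with plain reflection. This is illustrative only: `Lz4DecompressorBinding` and `bindDecompressor` are hypothetical names, while `net.jpountz.lz4.LZ4Factory`, `fastestInstance`, `safeDecompressor`, and `fastDecompressor` are the real lz4-java API the lookup probes for. A minimal sketch, assuming the engine may or may not ship lz4-java on the classpath:

```java
import java.lang.reflect.Method;

public class Lz4DecompressorBinding {

    /**
     * Probes the classpath for lz4-java and binds the best available
     * decompressor. Returns null when lz4-java is absent entirely.
     */
    public static Object bindDecompressor() {
        try {
            // Resolve the factory class provided by the engine's lz4-java jar.
            Class<?> factoryClass = Class.forName("net.jpountz.lz4.LZ4Factory");
            Object factory = factoryClass.getMethod("fastestInstance").invoke(null);
            Method accessor;
            try {
                // Prefer safeDecompressor; a defensive lookup tolerates
                // engine-provided jars that might not expose it.
                accessor = factoryClass.getMethod("safeDecompressor");
            } catch (NoSuchMethodException e) {
                accessor = factoryClass.getMethod("fastDecompressor");
            }
            return accessor.invoke(factory);
        } catch (ReflectiveOperationException e) {
            return null; // lz4-java not on the classpath
        }
    }

    public static void main(String[] args) {
        Object decompressor = bindDecompressor();
        System.out.println(decompressor == null
            ? "lz4-java not available"
            : "bound " + decompressor.getClass().getName());
    }
}
```

The point of binding at runtime rather than compile time is that the client degrades gracefully instead of failing at class-load time when an old engine-provided jar is on the classpath.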
@pan3793 `safeDecompressor` should work just fine on old versions, and even on those old versions, it should be slightly faster than `fastDecompressor`. In fact, using `safeDecompressor` gets rid of most (but not all) of the security impact of the CVEs on old versions.
@SteNicholas The code change looks fine, but let's wait a while to collect feedback from other reviewers about the performance drop.
@yawkat, could you please take a look at the performance drop of …?
@SteNicholas I don't have time for deep benchmarking, but I just threw an AI agent at it, and it figured out that some build-flag changes that distros make can improve performance by 10%. Could you test with the 1.10.4 I just released?
@yawkat thanks! It indeed solves the performance regression. Our benchmark shows 1.10.4 is much faster than 1.10.3, and even faster than 1.8.0! I got a similar result in Spark: apache/spark#54585
Sounds great, I think we can keep LZ4 as the default.
[CELEBORN-2218] Bump lz4-java version from 1.8.0 to 1.10.4 to resolve CVE-2025-12183 and CVE-2025-66566

- Bump lz4-java version from 1.8.0 to 1.10.4 to resolve CVE-2025-12183 and CVE-2025-66566.
- `Lz4Decompressor` follows the [suggestion](apache/spark#53290 (comment)) to move from `fastDecompressor` to `safeDecompressor` to mitigate the performance impact.

Backport:
- apache/spark#53327
- apache/spark#53347
- apache/spark#53971
- apache/spark#53454
- apache/spark#54585

- [CVE-2025-12183](https://sites.google.com/sonatype.com/vulnerabilities/cve-2025-12183): Various lz4-java compression and decompression implementations do not guard against out-of-bounds memory access. Untrusted input may lead to denial of service and information disclosure. Vulnerable Maven coordinates: org.lz4:lz4-java up to and including 1.8.0.
- [CVE-2025-66566](GHSA-cmp6-m4wj-q63q): Insufficient clearing of the output buffer in Java-based decompressor implementations in lz4-java 1.10.0 and earlier allows remote attackers to read previous buffer contents via crafted compressed input. In applications where the output buffer is reused without being cleared, this may lead to disclosure of sensitive data. JNI-based implementations are not affected.

Therefore, the lz4-java version should be upgraded to 1.10.4.

No.
No.
CI.

Closes #3555 from SteNicholas/CELEBORN-2218.

Lead-authored-by: SteNicholas <[email protected]>
Co-authored-by: Cheng Pan <[email protected]>
Signed-off-by: SteNicholas <[email protected]>
(cherry picked from commit dca3749)
Signed-off-by: SteNicholas <[email protected]>
Thanks, all. Merged to main (v0.7.0) and branch-0.6 (v0.6.3).
Given that zstd is from the same authors as lz4 but newer, it may still be a good idea to move to zstd as the default long-term. |
Iceberg switched to the `at.yawk.lz4:lz4-java` group for security reasons, but it unintentionally introduced a performance regression. https://github.com/yawkat/lz4-java/releases/tag/v1.10.4

> These changes attempt to fix the native performance regression in 1.9+. They should have no functional or security impact.

See the benchmark reports in the Celeborn and Spark projects:
- CELEBORN-2218 / apache/celeborn#3555
- SPARK-55803 / apache/spark#54585
Trino switched to the `at.yawk.lz4:lz4-java` group for security reasons, but it unintentionally introduced a performance regression. https://github.com/yawkat/lz4-java/releases/tag/v1.10.4

> These changes attempt to fix the native performance regression in 1.9+. They should have no functional or security impact.

See the benchmark reports in the Celeborn and Spark projects:
- CELEBORN-2218 / apache/celeborn#3555
- SPARK-55803 / apache/spark#54585
ClickHouse Java Client switched to `at.yawk.lz4:lz4-java` for security reasons, but it unintentionally introduced a performance regression. https://github.com/yawkat/lz4-java/releases/tag/v1.10.4

> These changes attempt to fix the native performance regression in 1.9+. They should have no functional or security impact.

See the benchmark reports in the Apache Celeborn and Apache Spark projects:
- CELEBORN-2218 / apache/celeborn#3555
- SPARK-55803 / apache/spark#54585
What changes were proposed in this pull request?
`Lz4Decompressor` follows the suggestion to move from `fastDecompressor` to `safeDecompressor` to mitigate the performance impact.

Backport:
- Bump `lz4-java` to 1.10.0 (spark#53327)
- Bump `lz4-java` to 1.10.1 (spark#53347)

Why are the changes needed?
CVE‐2025‐12183: Various lz4-java compression and decompression implementations do not guard against out-of-bounds memory access. Untrusted input may lead to denial of service and information disclosure. Vulnerable Maven coordinates: org.lz4:lz4-java up to and including 1.8.0.
CVE-2025-66566: Insufficient clearing of the output buffer in Java-based decompressor implementations in lz4-java 1.10.0 and earlier allows remote attackers to read previous buffer contents via crafted compressed input. In applications where the output buffer is reused without being cleared, this may lead to disclosure of sensitive data. JNI-based implementations are not affected.
Therefore, the lz4-java version should be upgraded to 1.10.4.
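The buffer-reuse risk described for CVE-2025-66566 can be illustrated without lz4-java at all: when a decompressor writes only `n` bytes into a reused output buffer and the caller reads beyond `n` (or fails to clear the buffer between uses), bytes from the previous payload leak through. The `decompress` stand-in below is hypothetical, not lz4-java code; it exists only to demonstrate the stale-tail mechanism:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class StaleBufferDemo {

    /** Stand-in for a decompressor: copies src into dest, returns bytes written. */
    static int decompress(byte[] src, byte[] dest) {
        System.arraycopy(src, 0, dest, 0, src.length);
        return src.length;
    }

    public static void main(String[] args) {
        byte[] out = new byte[16];

        // First request fills the reused buffer with sensitive data.
        decompress("hunter2-password".getBytes(StandardCharsets.UTF_8), out);

        // Second request is shorter; only the first 5 bytes are overwritten.
        int n = decompress("hello".getBytes(StandardCharsets.UTF_8), out);

        // Reading the whole buffer instead of the first n bytes leaks stale data.
        String leaked = new String(out, StandardCharsets.UTF_8);
        String safe = new String(out, 0, n, StandardCharsets.UTF_8);
        System.out.println(leaked); // prints "hellor2-password" - stale tail exposed
        System.out.println(safe);   // prints "hello"

        // Mitigation when the buffer must be reused: clear it between uses.
        Arrays.fill(out, (byte) 0);
    }
}
```

Respecting the returned length, or clearing the reused buffer, is what the JNI-based paths already do implicitly, which is why only the Java-based decompressor implementations were affected.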
Does this PR resolve a correctness bug?
No.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
CI.