Skip to content

Disallow Gorilla codec on non-float columns#45252

Merged
rschu1ze merged 7 commits intomasterfrom
block-nonfloat-gorilla
Jan 16, 2023
Merged

Disallow Gorilla codec on non-float columns#45252
rschu1ze merged 7 commits intomasterfrom
block-nonfloat-gorilla

Conversation

@rschu1ze
Copy link
Member

@rschu1ze rschu1ze commented Jan 13, 2023

Cf. #45195

Changelog category (leave one):

  • Backward Incompatible Change

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Disallow Gorilla compression on columns of non-Float32 or non-Float64 type.

Documentation entry for user-facing changes

  • [ X ] Documentation is written (mandatory for new features)

@robot-clickhouse robot-clickhouse added the pr-bugfix Pull request with bugfix, not backported by default label Jan 13, 2023
@Avogar Avogar self-assigned this Jan 13, 2023
@rschu1ze rschu1ze changed the title Disallow Gorilla codec non-float columns Disallow Gorilla codec on non-float columns Jan 13, 2023
@davenger

This comment was marked as outdated.

@rschu1ze

This comment was marked as outdated.

@rschu1ze rschu1ze force-pushed the block-nonfloat-gorilla branch from c886123 to 2cb6371 Compare January 13, 2023 16:52
@rschu1ze rschu1ze force-pushed the block-nonfloat-gorilla branch from 2cb6371 to 5d3f0ec Compare January 13, 2023 16:53
@den-crane

This comment was marked as outdated.

@den-crane
Copy link
Contributor

den-crane commented Jan 13, 2023

BTW, the issue is because of the combination of codecs codec (Delta, Gorilla, LZ4). Not because Decimal type.

Delta+Gorilla -- should be forbidden as suspicious

create table bug_gor_lz ( value_bug Float64 codec (Delta, Gorilla, LZ4)
) engine = MergeTree order by tuple() as select 0 from numbers(1e7);

select * from bug_gor_lz where 0.0 <> value_bug limit 10;

┌────────────────value_bug─┐
│    4.927549738046179e180 │
│   -1.154350463386742e-65 │
│   8.526606562949742e-194 │
│ -1.2810555172165677e-249 │
│  -1.0439411844753269e307 │
│   1.5110172119519233e241 │
│  -2.5236826111584395e175 │
│     6.628092400109508e37 │
│     6.628556554427993e37 │
│     6.628556558479204e37 │
└──────────────────────────┘

@rschu1ze
Copy link
Member Author

rschu1ze commented Jan 14, 2023

BTW, the issue is because of the combination of codecs codec (Delta, Gorilla, LZ4). Not because Decimal type.
Delta+Gorilla -- should be forbidden as suspicious

True, thanks. Will check/block this combination too but separately (--> #45277). This PR makes sense in itself.

@rschu1ze rschu1ze added the pr-backward-incompatible Pull request with backwards incompatible changes label Jan 14, 2023
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 removed the pr-bugfix Pull request with bugfix, not backported by default label Jan 15, 2023
@rschu1ze
Copy link
Member Author

rschu1ze commented Jan 16, 2023

ClickHouse Integration Tests (tsan) [6/6]

  • test_merge_tree_s3/test.py::test_attach_detach_partition[node]: looks like an infra issue, test is a bit unstable

Various ClickHouse Stress Test failures, e.g.

2023.01.15 19:55:09.399051 [ 718520 ] {} <Error> Application: Caught exception while loading metadata: Code: 36. DB::Exception: Gorilla codec is not applicable for UInt32 because the data type is not float: Cannot attach table `test_33`.`codecs` from metadata file /var/lib/clickhouse/store/a76/a76b7007-9be4-417e-8a9e-ecc4e1837516/codecs.sql from query ATTACH TABLE test_33.codecs UUID 'cfdda7c2-a32d-4220-9214-68e571bab248' (`a` UInt8 CODEC(LZ4), `b` UInt16 CODEC(ZSTD(1)), `c` Float32 CODEC(Gorilla), `d` UInt8 CODEC(Delta(1), LZ4), `e` Float64 CODEC(Gorilla, ZSTD(1)), `f` UInt32 CODEC(Delta(4), Delta(4), Gorilla), `g` DateTime CODEC(DoubleDelta), `h` DateTime64(3) CODEC(DoubleDelta, LZ4), `i` String CODEC(NONE)) ENGINE = MergeTree ORDER BY tuple() SETTINGS index_granularity = 8192. (BAD_ARGUMENTS), Stack trace (when copying this message, always include the lines below

I think the system tries to start the old server from a persistence created by an new server (specifically, the stress test). So a failure is expected here (--> backward incompatible change).

@tavplubix
Copy link
Member

It should be possible to suppress BC check for intended backward incompatible changes. For example, we can use a allow_suspicious_codecs to allow incorrect usages of Gorilla codec and enable the setting before starting a new server in BC check here:

# Start new server
mv package_folder/clickhouse /usr/bin/
mv package_folder/clickhouse.debug /usr/lib/debug/usr/bin/clickhouse.debug
# Disable fault injections on start (we don't test them here, and it can lead to tons of requests in case of huge number of tables).
export ZOOKEEPER_FAULT_INJECTION=0
configure
start 500

@rschu1ze
Copy link
Member Author

Okay, thanks. Didn't think about suppressing the change. Will pick this this up again hopefully soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-backward-incompatible Pull request with backwards incompatible changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants