Fix INSERT into Distributed() table with MATERIALIZED column #5429
Merged
vitlibar merged 3 commits into ClickHouse:master on Jun 1, 2019
By just skipping MATERIALIZED columns during processing.

P.S. You cannot use `insert_allow_materialized_columns`, since it works only for the Buffer() engine.

Fixes: ClickHouse#4015
Fixes: ClickHouse#3673
Fixes: 01501fa ("correct column list for rewritten INSERT query into Distributed [#CLICKHOUSE-4161]")
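To illustrate what "skipping MATERIALIZED columns" means for the rewritten INSERT, here is a minimal sketch. It is not the ClickHouse implementation: `SketchColumnKind`, `SketchColumnDescription`, and `insertableColumnNames` are hypothetical names invented for this example, and the column names `date` and `value` are borrowed from the error messages quoted later in this thread.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for column metadata.
// These are NOT the real ClickHouse types.
enum class SketchColumnKind { Ordinary, Materialized };

struct SketchColumnDescription
{
    std::string name;
    SketchColumnKind kind;
};

// Build the column list for the INSERT forwarded to the underlying table.
// MATERIALIZED columns are skipped because the destination table computes
// them itself and does not accept values for them.
std::vector<std::string> insertableColumnNames(
    const std::vector<SketchColumnDescription> & columns)
{
    std::vector<std::string> names;
    for (const auto & column : columns)
        if (column.kind != SketchColumnKind::Materialized)
            names.push_back(column.name);
    return names;
}
```

With a table that has an ordinary `date` column and a materialized `value` column, the forwarded column list under this sketch would contain only `date`.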
Member
Author
Any thoughts on this?
Member
Looks good.
Member
Yes, we want to run as many tests as possible in parallel.
azat added a commit to azat/ClickHouse that referenced this pull request Oct 23, 2019
Previous patch e527def ("Fix INSERT into Distributed() table with MATERIALIZED column") fixes it only for cases when the node is local, i.e. direct insert.

This patch addresses the problem when the node is not local (`is_local == false`), by erasing materialized columns on INSERT into Distributed.

And this patch fixes two cases, depending on the `insert_distributed_sync` setting:

- `insert_distributed_sync=0`

  ```
  Not found column value in block. There are only columns: date.

  Stack trace:
  2. 0x7ffff7be92e0 DB::Exception::Exception() dbms/src/Common/Exception.h:27
  3. 0x7fffec5d6cf6 DB::Block::getByName(...) dbms/src/Core/Block.cpp:187
  4. 0x7fffec2fe067 DB::NativeBlockInputStream::readImpl() dbms/src/DataStreams/NativeBlockInputStream.cpp:159
  5. 0x7fffec2d223f DB::IBlockInputStream::read() dbms/src/DataStreams/IBlockInputStream.cpp:61
  6. 0x7ffff7c6d40d DB::TCPHandler::receiveData() dbms/programs/server/TCPHandler.cpp:971
  7. 0x7ffff7c6cc1d DB::TCPHandler::receivePacket() dbms/programs/server/TCPHandler.cpp:855
  8. 0x7ffff7c6a1ef DB::TCPHandler::readDataNext(unsigned long const&, int const&) dbms/programs/server/TCPHandler.cpp:406
  9. 0x7ffff7c6a41b DB::TCPHandler::readData(DB::Settings const&) dbms/programs/server/TCPHandler.cpp:437
  10. 0x7ffff7c6a5d9 DB::TCPHandler::processInsertQuery(DB::Settings const&) dbms/programs/server/TCPHandler.cpp:464
  11. 0x7ffff7c687b5 DB::TCPHandler::runImpl() dbms/programs/server/TCPHandler.cpp:257
  ```

- `insert_distributed_sync=1`

  ```
  2019.10.18 13:23:22.114578 [ 44 ] {a78f669f-0b08-4337-abf8-d31e958f6d12} <Error> executeQuery: Code: 171, e.displayText() = DB::Exception: Block structure mismatch in RemoteBlockOutputStream stream: different number of columns:
  date Date UInt16(size = 1), value Date UInt16(size = 1)
  date Date UInt16(size = 0): Insertion status:
  Wrote 1 blocks and 0 rows on shard 0 replica 0, 127.0.0.1:59000 (average 0 ms per block)
  Wrote 0 blocks and 0 rows on shard 1 replica 0, 127.0.0.2:59000 (average 2 ms per block)
  (version 19.16.1.1) (from [::1]:3624) (in query: INSERT INTO distributed_00952 VALUES ), Stack trace:

  2. 0x7ffff7be92e0 DB::Exception::Exception() dbms/src/Common/Exception.h:27
  3. 0x7fffec5da4e9 DB::checkBlockStructure<void>(...)::{...}::operator()(...) const dbms/src/Core/Block.cpp:460
  4. 0x7fffec5da671 void DB::checkBlockStructure<void>(...) dbms/src/Core/Block.cpp:467
  5. 0x7fffec5d8d58 DB::assertBlocksHaveEqualStructure(...) dbms/src/Core/Block.cpp:515
  6. 0x7fffec326630 DB::RemoteBlockOutputStream::write(DB::Block const&) dbms/src/DataStreams/RemoteBlockOutputStream.cpp:68
  7. 0x7fffe98bd154 DB::DistributedBlockOutputStream::runWritingJob(DB::DistributedBlockOutputStream::JobReplica&, DB::Block const&)::{lambda()#1}::operator()() const dbms/src/Storages/Distributed/DistributedBlockOutputStream.cpp:280
  <snip>
  ```

Fixes: ClickHouse#7365
Fixes: ClickHouse#5429
Refs: ClickHouse#6891
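Both errors above come from a block that still carries the materialized `value` column when it reaches a node expecting only `date`. The idea of "erasing materialized columns" can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the actual `DistributedBlockOutputStream` code: `SketchColumn`, `SketchBlock`, and `eraseMaterializedColumns` are hypothetical names, and a block is modeled as a plain vector of named columns.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified model of a block: a list of named columns.
// The real ClickHouse Block type is considerably richer.
struct SketchColumn
{
    std::string name;
    std::vector<int> values;
};

using SketchBlock = std::vector<SketchColumn>;

// Return a copy of the block without the columns listed in `materialized`,
// so that its structure matches what the remote node expects to receive.
SketchBlock eraseMaterializedColumns(
    SketchBlock block, const std::set<std::string> & materialized)
{
    block.erase(
        std::remove_if(
            block.begin(), block.end(),
            [&](const SketchColumn & column)
            { return materialized.count(column.name) > 0; }),
        block.end());
    return block;
}
```

In this sketch, a block with columns `date` and `value` and a materialized set containing `value` is reduced to a single `date` column before being sent, which is the structure the receiving side reports in the errors above.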
akuzm added a commit that referenced this pull request Oct 24, 2019
* Fix INSERT into Distributed non local node with MATERIALIZED columns
* Cover INSERT into Distributed with MATERIALIZED columns and !is_local node
  I guess that adding new cluster into server-test.xml is not required, but it won't harm.
* Update DistributedBlockOutputStream.cpp
akuzm added a commit that referenced this pull request Oct 29, 2019
* Fix INSERT into Distributed non local node with MATERIALIZED columns
* Cover INSERT into Distributed with MATERIALIZED columns and !is_local node
  I guess that adding new cluster into server-test.xml is not required, but it won't harm.
* Update DistributedBlockOutputStream.cpp

(cherry picked from commit 29052b6)
CurtizJ pushed a commit that referenced this pull request Nov 15, 2019
* Fix INSERT into Distributed non local node with MATERIALIZED columns
* Cover INSERT into Distributed with MATERIALIZED columns and !is_local node
  I guess that adding new cluster into server-test.xml is not required, but it won't harm.
* Update DistributedBlockOutputStream.cpp

(cherry picked from commit 29052b6)
Author
@alexey-milovidov The list of non-materialized columns is obtained from the Distributed table, while in #16897 the Distributed table has all columns non-materialized, which is why it wasn't filtered out and rejected.