Skip to content

Fix INSERT into Distributed non local node with MATERIALIZED columns#7377

Merged
akuzm merged 4 commits intoClickHouse:masterfrom
azat:INSERT-Distributed-MATERIALIZED-cols
Oct 24, 2019
Merged

Fix INSERT into Distributed non local node with MATERIALIZED columns#7377
akuzm merged 4 commits intoClickHouse:masterfrom
azat:INSERT-Distributed-MATERIALIZED-cols

Conversation

@azat
Copy link
Member

@azat azat commented Oct 18, 2019

Category (leave one):

  • Bug Fix

Short description (up to few sentences):
Fix INSERT into Distributed with MATERIALIZED columns

Previous patch e527def ("Fix INSERT
into Distributed() table with MATERIALIZED column") fixes it only for
cases when the node is local, i.e. direct insert.

This patch address the problem when the node is not local
(is_local == false), by erasing materialized columns on INSERT into
Distributed.

And this patch fixes two cases, depends on insert_distributed_sync
setting:

  • insert_distributed_sync=0

    Not found column value in block. There are only columns: date. Stack trace:
    
    2. 0x7ffff7be92e0 DB::Exception::Exception() dbms/src/Common/Exception.h:27
    3. 0x7fffec5d6cf6 DB::Block::getByName(...) dbms/src/Core/Block.cpp:187
    4. 0x7fffec2fe067 DB::NativeBlockInputStream::readImpl() dbms/src/DataStreams/NativeBlockInputStream.cpp:159
    5. 0x7fffec2d223f DB::IBlockInputStream::read() dbms/src/DataStreams/IBlockInputStream.cpp:61
    6. 0x7ffff7c6d40d DB::TCPHandler::receiveData() dbms/programs/server/TCPHandler.cpp:971
    7. 0x7ffff7c6cc1d DB::TCPHandler::receivePacket() dbms/programs/server/TCPHandler.cpp:855
    8. 0x7ffff7c6a1ef DB::TCPHandler::readDataNext(unsigned long const&, int const&) dbms/programs/server/TCPHandler.cpp:406
    9. 0x7ffff7c6a41b DB::TCPHandler::readData(DB::Settings const&) dbms/programs/server/TCPHandler.cpp:437
    10. 0x7ffff7c6a5d9 DB::TCPHandler::processInsertQuery(DB::Settings const&) dbms/programs/server/TCPHandler.cpp:464
    11. 0x7ffff7c687b5 DB::TCPHandler::runImpl() dbms/programs/server/TCPHandler.cpp:257
    
  • insert_distributed_sync=1

    2019.10.18 13:23:22.114578 [ 44 ] {a78f669f-0b08-4337-abf8-d31e958f6d12} <Error> executeQuery: Code: 171, e.displayText() = DB::Exception: Block structure mismatch in RemoteBlockOutputStream stream: different number of columns:
    date Date UInt16(size = 1), value Date UInt16(size = 1)
    date Date UInt16(size = 0): Insertion status:
    Wrote 1 blocks and 0 rows on shard 0 replica 0, 127.0.0.1:59000 (average 0 ms per block)
    Wrote 0 blocks and 0 rows on shard 1 replica 0, 127.0.0.2:59000 (average 2 ms per block)
     (version 19.16.1.1) (from [::1]:3624) (in query: INSERT INTO distributed_00952 VALUES ), Stack trace:
    
    2. 0x7ffff7be92e0 DB::Exception::Exception() dbms/src/Common/Exception.h:27
    3. 0x7fffec5da4e9 DB::checkBlockStructure<void>(...)::{...}::operator()(...) const dbms/src/Core/Block.cpp:460
    4. 0x7fffec5da671 void DB::checkBlockStructure<void>(...) dbms/src/Core/Block.cpp:467
    5. 0x7fffec5d8d58 DB::assertBlocksHaveEqualStructure(...) dbms/src/Core/Block.cpp:515
    6. 0x7fffec326630 DB::RemoteBlockOutputStream::write(DB::Block const&) dbms/src/DataStreams/RemoteBlockOutputStream.cpp:68
    7. 0x7fffe98bd154 DB::DistributedBlockOutputStream::runWritingJob(DB::DistributedBlockOutputStream::JobReplica&, DB::Block const&)::{lambda()#1}::operator()() const dbms/src/Storages/Distributed/DistributedBlockOutputStream.cpp:280
    <snip>
    

The test covers both cases.

Fixes: #7365
Fixes: #5429
Refs: #6891

@azat azat force-pushed the INSERT-Distributed-MATERIALIZED-cols branch 2 times, most recently from 11acebc to 59c36cc Compare October 18, 2019 07:31
@azat azat changed the title [WIP] Fix INSERT into Distributed with MATERIALIZED columns Fix INSERT into Distributed non local node with MATERIALIZED columns Oct 18, 2019
@azat
Copy link
Member Author

azat commented Oct 18, 2019

I'm going to expand the test, to cover both cases (insert_distributed_sync=0/1), so please don't merge until this will be done

@azat azat force-pushed the INSERT-Distributed-MATERIALIZED-cols branch from 59c36cc to 928852b Compare October 18, 2019 20:03
@azat
Copy link
Member Author

azat commented Oct 18, 2019

I'm going to expand the test, to cover both cases (insert_distributed_sync=0/1), so please don't merge until this will be done

Done

@azat azat force-pushed the INSERT-Distributed-MATERIALIZED-cols branch 2 times, most recently from dcbe865 to 9b584ea Compare October 19, 2019 08:33
azat added 2 commits October 23, 2019 21:54
Previous patch e527def ("Fix INSERT
into Distributed() table with MATERIALIZED column") fixes it only for
cases when the node is local, i.e. direct insert.

This patch address the problem when the node is not local
(`is_local == false`), by erasing materialized columns on INSERT into
Distributed.

And this patch fixes two cases, depends on `insert_distributed_sync`
setting:

- `insert_distributed_sync=0`

    ```
    Not found column value in block. There are only columns: date. Stack trace:

    2. 0x7ffff7be92e0 DB::Exception::Exception() dbms/src/Common/Exception.h:27
    3. 0x7fffec5d6cf6 DB::Block::getByName(...) dbms/src/Core/Block.cpp:187
    4. 0x7fffec2fe067 DB::NativeBlockInputStream::readImpl() dbms/src/DataStreams/NativeBlockInputStream.cpp:159
    5. 0x7fffec2d223f DB::IBlockInputStream::read() dbms/src/DataStreams/IBlockInputStream.cpp:61
    6. 0x7ffff7c6d40d DB::TCPHandler::receiveData() dbms/programs/server/TCPHandler.cpp:971
    7. 0x7ffff7c6cc1d DB::TCPHandler::receivePacket() dbms/programs/server/TCPHandler.cpp:855
    8. 0x7ffff7c6a1ef DB::TCPHandler::readDataNext(unsigned long const&, int const&) dbms/programs/server/TCPHandler.cpp:406
    9. 0x7ffff7c6a41b DB::TCPHandler::readData(DB::Settings const&) dbms/programs/server/TCPHandler.cpp:437
    10. 0x7ffff7c6a5d9 DB::TCPHandler::processInsertQuery(DB::Settings const&) dbms/programs/server/TCPHandler.cpp:464
    11. 0x7ffff7c687b5 DB::TCPHandler::runImpl() dbms/programs/server/TCPHandler.cpp:257
    ```

- `insert_distributed_sync=1`

    ```
    2019.10.18 13:23:22.114578 [ 44 ] {a78f669f-0b08-4337-abf8-d31e958f6d12} <Error> executeQuery: Code: 171, e.displayText() = DB::Exception: Block structure mismatch in RemoteBlockOutputStream stream: different number of columns:
    date Date UInt16(size = 1), value Date UInt16(size = 1)
    date Date UInt16(size = 0): Insertion status:
    Wrote 1 blocks and 0 rows on shard 0 replica 0, 127.0.0.1:59000 (average 0 ms per block)
    Wrote 0 blocks and 0 rows on shard 1 replica 0, 127.0.0.2:59000 (average 2 ms per block)
     (version 19.16.1.1) (from [::1]:3624) (in query: INSERT INTO distributed_00952 VALUES ), Stack trace:

    2. 0x7ffff7be92e0 DB::Exception::Exception() dbms/src/Common/Exception.h:27
    3. 0x7fffec5da4e9 DB::checkBlockStructure<void>(...)::{...}::operator()(...) const dbms/src/Core/Block.cpp:460
    4. 0x7fffec5da671 void DB::checkBlockStructure<void>(...) dbms/src/Core/Block.cpp:467
    5. 0x7fffec5d8d58 DB::assertBlocksHaveEqualStructure(...) dbms/src/Core/Block.cpp:515
    6. 0x7fffec326630 DB::RemoteBlockOutputStream::write(DB::Block const&) dbms/src/DataStreams/RemoteBlockOutputStream.cpp:68
    7. 0x7fffe98bd154 DB::DistributedBlockOutputStream::runWritingJob(DB::DistributedBlockOutputStream::JobReplica&, DB::Block const&)::{lambda()ClickHouse#1}::operator()() const dbms/src/Storages/Distributed/DistributedBlockOutputStream.cpp:280
    <snip>
    ````

Fixes: ClickHouse#7365
Fixes: ClickHouse#5429
Refs: ClickHouse#6891
… node

I guess that adding new cluster into server-test.xml is not required,
but it won't harm.
@azat azat force-pushed the INSERT-Distributed-MATERIALIZED-cols branch from 9b584ea to 80cf86f Compare October 23, 2019 18:55
@akuzm akuzm merged commit 29052b6 into ClickHouse:master Oct 24, 2019
@azat azat deleted the INSERT-Distributed-MATERIALIZED-cols branch October 24, 2019 09:52
@akuzm akuzm added the pr-bugfix Pull request with bugfix, not backported by default label Oct 29, 2019
akuzm added a commit that referenced this pull request Oct 29, 2019
* Fix INSERT into Distributed non local node with MATERIALIZED columns

Previous patch e527def ("Fix INSERT
into Distributed() table with MATERIALIZED column") fixes it only for
cases when the node is local, i.e. direct insert.

This patch address the problem when the node is not local
(`is_local == false`), by erasing materialized columns on INSERT into
Distributed.

And this patch fixes two cases, depends on `insert_distributed_sync`
setting:

- `insert_distributed_sync=0`

    ```
    Not found column value in block. There are only columns: date. Stack trace:

    2. 0x7ffff7be92e0 DB::Exception::Exception() dbms/src/Common/Exception.h:27
    3. 0x7fffec5d6cf6 DB::Block::getByName(...) dbms/src/Core/Block.cpp:187
    4. 0x7fffec2fe067 DB::NativeBlockInputStream::readImpl() dbms/src/DataStreams/NativeBlockInputStream.cpp:159
    5. 0x7fffec2d223f DB::IBlockInputStream::read() dbms/src/DataStreams/IBlockInputStream.cpp:61
    6. 0x7ffff7c6d40d DB::TCPHandler::receiveData() dbms/programs/server/TCPHandler.cpp:971
    7. 0x7ffff7c6cc1d DB::TCPHandler::receivePacket() dbms/programs/server/TCPHandler.cpp:855
    8. 0x7ffff7c6a1ef DB::TCPHandler::readDataNext(unsigned long const&, int const&) dbms/programs/server/TCPHandler.cpp:406
    9. 0x7ffff7c6a41b DB::TCPHandler::readData(DB::Settings const&) dbms/programs/server/TCPHandler.cpp:437
    10. 0x7ffff7c6a5d9 DB::TCPHandler::processInsertQuery(DB::Settings const&) dbms/programs/server/TCPHandler.cpp:464
    11. 0x7ffff7c687b5 DB::TCPHandler::runImpl() dbms/programs/server/TCPHandler.cpp:257
    ```

- `insert_distributed_sync=1`

    ```
    2019.10.18 13:23:22.114578 [ 44 ] {a78f669f-0b08-4337-abf8-d31e958f6d12} <Error> executeQuery: Code: 171, e.displayText() = DB::Exception: Block structure mismatch in RemoteBlockOutputStream stream: different number of columns:
    date Date UInt16(size = 1), value Date UInt16(size = 1)
    date Date UInt16(size = 0): Insertion status:
    Wrote 1 blocks and 0 rows on shard 0 replica 0, 127.0.0.1:59000 (average 0 ms per block)
    Wrote 0 blocks and 0 rows on shard 1 replica 0, 127.0.0.2:59000 (average 2 ms per block)
     (version 19.16.1.1) (from [::1]:3624) (in query: INSERT INTO distributed_00952 VALUES ), Stack trace:

    2. 0x7ffff7be92e0 DB::Exception::Exception() dbms/src/Common/Exception.h:27
    3. 0x7fffec5da4e9 DB::checkBlockStructure<void>(...)::{...}::operator()(...) const dbms/src/Core/Block.cpp:460
    4. 0x7fffec5da671 void DB::checkBlockStructure<void>(...) dbms/src/Core/Block.cpp:467
    5. 0x7fffec5d8d58 DB::assertBlocksHaveEqualStructure(...) dbms/src/Core/Block.cpp:515
    6. 0x7fffec326630 DB::RemoteBlockOutputStream::write(DB::Block const&) dbms/src/DataStreams/RemoteBlockOutputStream.cpp:68
    7. 0x7fffe98bd154 DB::DistributedBlockOutputStream::runWritingJob(DB::DistributedBlockOutputStream::JobReplica&, DB::Block const&)::{lambda()#1}::operator()() const dbms/src/Storages/Distributed/DistributedBlockOutputStream.cpp:280
    <snip>
    ````

Fixes: #7365
Fixes: #5429
Refs: #6891

* Cover INSERT into Distributed with MATERIALIZED columns and !is_local node

I guess that adding new cluster into server-test.xml is not required,
but it won't harm.

* Update DistributedBlockOutputStream.cpp

(cherry picked from commit 29052b6)
akuzm added a commit that referenced this pull request Oct 29, 2019
* Fix INSERT into Distributed non local node with MATERIALIZED columns

Previous patch e527def ("Fix INSERT
into Distributed() table with MATERIALIZED column") fixes it only for
cases when the node is local, i.e. direct insert.

This patch address the problem when the node is not local
(`is_local == false`), by erasing materialized columns on INSERT into
Distributed.

And this patch fixes two cases, depends on `insert_distributed_sync`
setting:

- `insert_distributed_sync=0`

    ```
    Not found column value in block. There are only columns: date. Stack trace:

    2. 0x7ffff7be92e0 DB::Exception::Exception() dbms/src/Common/Exception.h:27
    3. 0x7fffec5d6cf6 DB::Block::getByName(...) dbms/src/Core/Block.cpp:187
    4. 0x7fffec2fe067 DB::NativeBlockInputStream::readImpl() dbms/src/DataStreams/NativeBlockInputStream.cpp:159
    5. 0x7fffec2d223f DB::IBlockInputStream::read() dbms/src/DataStreams/IBlockInputStream.cpp:61
    6. 0x7ffff7c6d40d DB::TCPHandler::receiveData() dbms/programs/server/TCPHandler.cpp:971
    7. 0x7ffff7c6cc1d DB::TCPHandler::receivePacket() dbms/programs/server/TCPHandler.cpp:855
    8. 0x7ffff7c6a1ef DB::TCPHandler::readDataNext(unsigned long const&, int const&) dbms/programs/server/TCPHandler.cpp:406
    9. 0x7ffff7c6a41b DB::TCPHandler::readData(DB::Settings const&) dbms/programs/server/TCPHandler.cpp:437
    10. 0x7ffff7c6a5d9 DB::TCPHandler::processInsertQuery(DB::Settings const&) dbms/programs/server/TCPHandler.cpp:464
    11. 0x7ffff7c687b5 DB::TCPHandler::runImpl() dbms/programs/server/TCPHandler.cpp:257
    ```

- `insert_distributed_sync=1`

    ```
    2019.10.18 13:23:22.114578 [ 44 ] {a78f669f-0b08-4337-abf8-d31e958f6d12} <Error> executeQuery: Code: 171, e.displayText() = DB::Exception: Block structure mismatch in RemoteBlockOutputStream stream: different number of columns:
    date Date UInt16(size = 1), value Date UInt16(size = 1)
    date Date UInt16(size = 0): Insertion status:
    Wrote 1 blocks and 0 rows on shard 0 replica 0, 127.0.0.1:59000 (average 0 ms per block)
    Wrote 0 blocks and 0 rows on shard 1 replica 0, 127.0.0.2:59000 (average 2 ms per block)
     (version 19.16.1.1) (from [::1]:3624) (in query: INSERT INTO distributed_00952 VALUES ), Stack trace:

    2. 0x7ffff7be92e0 DB::Exception::Exception() dbms/src/Common/Exception.h:27
    3. 0x7fffec5da4e9 DB::checkBlockStructure<void>(...)::{...}::operator()(...) const dbms/src/Core/Block.cpp:460
    4. 0x7fffec5da671 void DB::checkBlockStructure<void>(...) dbms/src/Core/Block.cpp:467
    5. 0x7fffec5d8d58 DB::assertBlocksHaveEqualStructure(...) dbms/src/Core/Block.cpp:515
    6. 0x7fffec326630 DB::RemoteBlockOutputStream::write(DB::Block const&) dbms/src/DataStreams/RemoteBlockOutputStream.cpp:68
    7. 0x7fffe98bd154 DB::DistributedBlockOutputStream::runWritingJob(DB::DistributedBlockOutputStream::JobReplica&, DB::Block const&)::{lambda()#1}::operator()() const dbms/src/Storages/Distributed/DistributedBlockOutputStream.cpp:280
    <snip>
    ````

Fixes: #7365
Fixes: #5429
Refs: #6891

* Cover INSERT into Distributed with MATERIALIZED columns and !is_local node

I guess that adding new cluster into server-test.xml is not required,
but it won't harm.

* Update DistributedBlockOutputStream.cpp

(cherry picked from commit 29052b6)
@akuzm akuzm added the v19.16 label Oct 30, 2019
CurtizJ pushed a commit that referenced this pull request Nov 15, 2019
* Fix INSERT into Distributed non local node with MATERIALIZED columns

Previous patch e527def ("Fix INSERT
into Distributed() table with MATERIALIZED column") fixes it only for
cases when the node is local, i.e. direct insert.

This patch address the problem when the node is not local
(`is_local == false`), by erasing materialized columns on INSERT into
Distributed.

And this patch fixes two cases, depends on `insert_distributed_sync`
setting:

- `insert_distributed_sync=0`

    ```
    Not found column value in block. There are only columns: date. Stack trace:

    2. 0x7ffff7be92e0 DB::Exception::Exception() dbms/src/Common/Exception.h:27
    3. 0x7fffec5d6cf6 DB::Block::getByName(...) dbms/src/Core/Block.cpp:187
    4. 0x7fffec2fe067 DB::NativeBlockInputStream::readImpl() dbms/src/DataStreams/NativeBlockInputStream.cpp:159
    5. 0x7fffec2d223f DB::IBlockInputStream::read() dbms/src/DataStreams/IBlockInputStream.cpp:61
    6. 0x7ffff7c6d40d DB::TCPHandler::receiveData() dbms/programs/server/TCPHandler.cpp:971
    7. 0x7ffff7c6cc1d DB::TCPHandler::receivePacket() dbms/programs/server/TCPHandler.cpp:855
    8. 0x7ffff7c6a1ef DB::TCPHandler::readDataNext(unsigned long const&, int const&) dbms/programs/server/TCPHandler.cpp:406
    9. 0x7ffff7c6a41b DB::TCPHandler::readData(DB::Settings const&) dbms/programs/server/TCPHandler.cpp:437
    10. 0x7ffff7c6a5d9 DB::TCPHandler::processInsertQuery(DB::Settings const&) dbms/programs/server/TCPHandler.cpp:464
    11. 0x7ffff7c687b5 DB::TCPHandler::runImpl() dbms/programs/server/TCPHandler.cpp:257
    ```

- `insert_distributed_sync=1`

    ```
    2019.10.18 13:23:22.114578 [ 44 ] {a78f669f-0b08-4337-abf8-d31e958f6d12} <Error> executeQuery: Code: 171, e.displayText() = DB::Exception: Block structure mismatch in RemoteBlockOutputStream stream: different number of columns:
    date Date UInt16(size = 1), value Date UInt16(size = 1)
    date Date UInt16(size = 0): Insertion status:
    Wrote 1 blocks and 0 rows on shard 0 replica 0, 127.0.0.1:59000 (average 0 ms per block)
    Wrote 0 blocks and 0 rows on shard 1 replica 0, 127.0.0.2:59000 (average 2 ms per block)
     (version 19.16.1.1) (from [::1]:3624) (in query: INSERT INTO distributed_00952 VALUES ), Stack trace:

    2. 0x7ffff7be92e0 DB::Exception::Exception() dbms/src/Common/Exception.h:27
    3. 0x7fffec5da4e9 DB::checkBlockStructure<void>(...)::{...}::operator()(...) const dbms/src/Core/Block.cpp:460
    4. 0x7fffec5da671 void DB::checkBlockStructure<void>(...) dbms/src/Core/Block.cpp:467
    5. 0x7fffec5d8d58 DB::assertBlocksHaveEqualStructure(...) dbms/src/Core/Block.cpp:515
    6. 0x7fffec326630 DB::RemoteBlockOutputStream::write(DB::Block const&) dbms/src/DataStreams/RemoteBlockOutputStream.cpp:68
    7. 0x7fffe98bd154 DB::DistributedBlockOutputStream::runWritingJob(DB::DistributedBlockOutputStream::JobReplica&, DB::Block const&)::{lambda()#1}::operator()() const dbms/src/Storages/Distributed/DistributedBlockOutputStream.cpp:280
    <snip>
    ````

Fixes: #7365
Fixes: #5429
Refs: #6891

* Cover INSERT into Distributed with MATERIALIZED columns and !is_local node

I guess that adding new cluster into server-test.xml is not required,
but it won't harm.

* Update DistributedBlockOutputStream.cpp

(cherry picked from commit 29052b6)
@CurtizJ CurtizJ added the v19.14 label Nov 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-bugfix Pull request with bugfix, not backported by default

Projects

None yet

Development

Successfully merging this pull request may close these issues.

19.15.3.6 distributed+ReplicatedMergeTree table can not write to materialized columns

5 participants