Skip to content

bloomfilter: use libdivide to compute the bit location#79800

Merged
alexey-milovidov merged 4 commits intoClickHouse:masterfrom
dkratunov:faster-blooms
May 16, 2025
Merged

bloomfilter: use libdivide to compute the bit location#79800
alexey-milovidov merged 4 commits intoClickHouse:masterfrom
dkratunov:faster-blooms

Conversation

@dkratunov
Copy link
Contributor

@dkratunov dkratunov commented May 2, 2025

This PR improves bloom filter performance.

The existing code computes the word and bit index using a straight-forward div. This is a problem because 1. the word location is a necessary dependency for the rest of the loop body, so this stalls the pipeline and 2. divs are very slow, obviously.

Thankfully, ClickHouse already has libdivide, so converting this to a mul using the reciprocal value is easy. While doing this, also removed one mul in the body and turned it into an add.

Lastly, ensured that word_size is marked as a constant, so the compiler can always emit shifts and masks for the word_size divisions and modulo ops.

All of this together makes the primary bottleneck in the BloomFilter functions CityHash, as it should be.
We have inspected this as carefully as possible to ensure mathematical equivalence and have tested it on existing data with no issues.

In our very large production cluster, we saw index time per byte in merges drop by ~45%:

Screenshot 2025-05-02 at 12 16 53 PM

Query:

SELECT
  toStartOfInterval (event_time, interval 5 minute) time,
  sum(
    ProfileEvents[
      'MergeTreeDataWriterSkipIndicesCalculationMicroseconds'
    ]
  ) / sum(size_in_bytes) index_us_per_byte
FROM
  clusterAllReplicas('{cluster}', system.part_log)
WHERE
  table = 'main'
  and event_date > now() - interval 14 day
  and $__timeFilter(event_time)
  and event_type = 'MergeParts'
  and size_in_bytes > 0
group by
  1
order by
  1 asc

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

  • Performance improvements to all bloom filter types.

@CLAassistant
Copy link

CLAassistant commented May 2, 2025

CLA assistant check
All committers have signed the CLA.

@alexey-milovidov alexey-milovidov added the can be tested Allows running workflows for external contributors label May 3, 2025
@clickhouse-gh
Copy link
Contributor

clickhouse-gh bot commented May 3, 2025

Workflow [PR], commit [2b1cd06]

@clickhouse-gh clickhouse-gh bot added the pr-performance Pull request with some performance improvements label May 3, 2025
@alexey-milovidov
Copy link
Member

Thanks, looks good! Can we provide an isolated performance test? See tests/performance.

@alexey-milovidov alexey-milovidov self-assigned this May 3, 2025
@dkratunov
Copy link
Contributor Author

@alexey-milovidov what would you like to see in a new test? There's already tests/performance/bloom_filter_{insert,select}.xml which I think should show the effect.

@alexey-milovidov
Copy link
Member

@alexey-milovidov alexey-milovidov added this pull request to the merge queue May 16, 2025
Merged via the queue into ClickHouse:master with commit 7b47a42 May 16, 2025
118 of 120 checks passed
@robot-ch-test-poll3 robot-ch-test-poll3 added the pr-synced-to-cloud The PR is synced to the cloud repo label May 16, 2025
@robot-ch-test-poll robot-ch-test-poll added the pr-backports-created-cloud deprecated label, NOOP label Jun 4, 2025
@robot-clickhouse robot-clickhouse added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label Jul 2, 2025
zvonand pushed a commit to Altinity/ClickHouse that referenced this pull request Oct 3, 2025
bloomfilter: use libdivide to compute the bit location
zvonand pushed a commit to Altinity/ClickHouse that referenced this pull request Oct 6, 2025
bloomfilter: use libdivide to compute the bit location
joelynch added a commit to aiven/ClickHouse that referenced this pull request Oct 23, 2025
joelynch added a commit to aiven/ClickHouse that referenced this pull request Oct 24, 2025
Khatskevich pushed a commit to aiven/ClickHouse that referenced this pull request Nov 12, 2025
Khatskevich pushed a commit to aiven/ClickHouse that referenced this pull request Nov 30, 2025
Khatskevich pushed a commit to aiven/ClickHouse that referenced this pull request Mar 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-backports-created-cloud deprecated label, NOOP pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR pr-performance Pull request with some performance improvements pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants