PromQL: rework converter to SQL #95673

Merged: vitlibar merged 14 commits into ClickHouse:master from vitlibar:promql-replace-converter-to-sql on Feb 11, 2026

@vitlibar vitlibar commented Jan 30, 2026

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

PromQL: rework converter to SQL

Part of #89356

@vitlibar vitlibar force-pushed the promql-replace-converter-to-sql branch from 0fe69a4 to 2aec388 January 30, 2026 20:08

clickhouse-gh bot commented Jan 30, 2026

Workflow [PR], commit [04fd5eb]

Summary:

@clickhouse-gh clickhouse-gh bot added the pr-not-for-changelog label (This PR should not be mentioned in the changelog) Jan 30, 2026
@vitlibar vitlibar added the comp-promql label (Issues related to the PromQL support and TimeSeries table engine.) Jan 30, 2026
@vitlibar vitlibar requested a review from Copilot January 30, 2026 20:10

Copilot AI left a comment

Pull request overview

This PR refactors the PromQL to SQL converter implementation by replacing the monolithic PrometheusQueryToSQLConverter class with a modular architecture. The changes introduce a new PrometheusQueryToSQL namespace containing separate components for different conversion tasks, making the code more maintainable and testable.

Changes:

  • Replaced PrometheusQueryToSQLConverter with modular PrometheusQueryToSQL::Converter
  • Split converter logic into focused components (selectors, functions, evaluation time, etc.)
  • Introduced PrometheusQueryEvaluationSettings to replace scattered evaluation parameters
  • Added helper functions for converting time series types to AST

Reviewed changes

Copilot reviewed 57 out of 57 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/Storages/TimeSeries/PrometheusQueryToSQL/Converter.{h,cpp} | New main converter class with cleaner interface |
| src/Storages/TimeSeries/PrometheusQueryToSQL/*.{h,cpp} | Modular components for specific conversion tasks |
| src/Storages/TimeSeries/PrometheusQueryEvaluationSettings.h | Centralized evaluation settings structure |
| src/Storages/TimeSeries/timeSeriesTypesToAST.{h,cpp} | Helper functions for AST conversion |
| src/Storages/StoragePrometheusQuery.{h,cpp} | Updated to use new converter and configuration |
| src/Storages/StorageTimeSeriesSelector.{h,cpp} | Updated to use new configuration structure |
| src/TableFunctions/TableFunctionPrometheusQuery.{h,cpp} | Simplified using new converter |
| src/Parsers/Prometheus/parseTimeSeriesTypes.{h,cpp} | New parsing utilities for timestamps and durations |
| src/DataTypes/DataTypesDecimal.h | Added tryGetDecimalScale helper function |
| src/Core/DecimalFunctions.{h,cpp} | Added getCurrentDateTime64 utility |

@vitlibar vitlibar force-pushed the promql-replace-converter-to-sql branch 2 times, most recently from 06efb6a to 97a2869 January 30, 2026 20:34
@vitlibar vitlibar marked this pull request as ready for review January 30, 2026 20:53
namespace DB::PrometheusQueryToSQL
{

/// Converts a prometheus query to SQL.

@vitlibar vitlibar Jan 30, 2026

The converter has been significantly reworked in this PR. Instead of one big file, PrometheusQueryToSQL.cpp, there is now a folder named PrometheusQueryToSQL that contains the components of the conversion. The conversion itself has also changed a lot: it now generates its internal steps in a form better suited to implementing operators.

@nikitamikhaylov nikitamikhaylov self-assigned this Jan 30, 2026
@vitlibar vitlibar force-pushed the promql-replace-converter-to-sql branch 3 times, most recently from 9268b6c to f78996a January 30, 2026 23:59
@vitlibar vitlibar changed the title from "PromQL: replace converter to SQL" to "PromQL: rework converter to SQL" Jan 31, 2026
@vitlibar vitlibar force-pushed the promql-replace-converter-to-sql branch from f78996a to f601fec January 31, 2026 13:31
@vitlibar vitlibar force-pushed the promql-replace-converter-to-sql branch from f601fec to 00fe6fe January 31, 2026 18:32
/// If `store_method` is VECTOR_GRID then the SELECT query outputs two columns `group` (UInt64), `values` (Array(Nullable(scalar_data_type))).
/// If `store_method` is RAW_DATA then the SELECT query outputs three columns `group` (UInt64), `timestamp` (timestamp_data_type), `value` (scalar_data_type).
/// If `store_method` is CONST_SCALAR or CONST_STRING then the SELECT query is not used.
ASTPtr select_query;

@vitlibar vitlibar Feb 2, 2026

The other main change in this PR is that instant vectors are now represented by this select_query, which returns two columns: group (UInt64, identifying the set of tags) and values (an array of nullable floats). Timestamps are not part of the select_query output because they are fixed in the context of evaluation; they are stored instead in the fields SQLQueryPiece::start_time, SQLQueryPiece::end_time and step.

The values column is convenient for applying operators and functions: we apply them to the corresponding elements of these arrays of nullable floats, keep nulls as they are, and get the result.

Before this PR, the result of an instant vector was represented as two columns: group (UInt64, identifying the set of tags) and time_series (an array of (timestamp, value) tuples). That was inconvenient for operators, and binary operators in particular were tricky to implement.
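
To illustrate why the array-of-nullables layout helps, here is a minimal C++ sketch, not code from this PR. `GridValues` and `applyBinary` are hypothetical names, `std::nullopt` stands in for SQL NULL, and the rule that a missing sample on either side produces an empty slot is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for the `values` column of one group:
// one slot per evaluation timestamp; std::nullopt plays the role of SQL NULL.
using GridValues = std::vector<std::optional<double>>;

// Applies a binary operator slot by slot. Both inputs are assumed to be aligned
// to the same grid (same start_time, end_time and step), so timestamps are not needed.
template <typename Op>
GridValues applyBinary(const GridValues & lhs, const GridValues & rhs, Op op)
{
    assert(lhs.size() == rhs.size());
    GridValues result(lhs.size());
    for (std::size_t i = 0; i < lhs.size(); ++i)
    {
        // Assumption: a missing sample on either side leaves the slot empty.
        if (lhs[i] && rhs[i])
            result[i] = op(*lhs[i], *rhs[i]);
    }
    return result;
}
```

With arrays of (timestamp, value) tuples, the same operation would first have to match samples by timestamp; with a shared grid, the slot index already does that.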

struct ConverterContext;

/// Represents how data is stored in a SQLQueryPiece.
enum class StoreMethod

@vitlibar vitlibar Feb 2, 2026

StoreMethod was also introduced to record how data is represented in SQL at each step of converting a Prometheus query to SQL. Before this PR, the code often had to check whether the result contained specific columns, which was error-prone and unclear. Now we check StoreMethod instead, so it's easier to see when a case is missed.
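
A minimal sketch of the idea, not the PR's code: the enumerator names come from the comments quoted in this thread (the real enum may contain other values), and `expectedColumns` is a hypothetical helper. A `switch` without a `default:` branch lets the compiler point out an unhandled enumerator.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Enumerator names are taken from the comments quoted in this thread;
// the real enum may contain other values.
enum class StoreMethod
{
    CONST_SCALAR,
    CONST_STRING,
    VECTOR_GRID,
    RAW_DATA,
};

// Hypothetical helper: names of the columns the SELECT query is expected to output.
std::vector<std::string> expectedColumns(StoreMethod method)
{
    // No `default:` branch on purpose: with -Wswitch (GCC/Clang) the compiler
    // reports any enumerator that is not handled here.
    switch (method)
    {
        case StoreMethod::VECTOR_GRID:
            return {"group", "values"};
        case StoreMethod::RAW_DATA:
            return {"group", "timestamp", "value"};
        case StoreMethod::CONST_SCALAR:
        case StoreMethod::CONST_STRING:
            return {}; // the SELECT query is not used
    }
    assert(false && "unhandled StoreMethod"); // unreachable for valid enumerators
    return {};
}
```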

}

chassert(argument_index == args.size());
config = StoragePrometheusQuery::getConfiguration(args, context, over_range);
Member Author

Here I moved the parsing of the table function's arguments to the storage class, as is done in some of our other table functions, and added the structure StoragePrometheusQuery::Configuration to pass these arguments between the table function and the storage.

vitlibar commented Feb 2, 2026

Ready for review

@nikitamikhaylov nikitamikhaylov left a comment

Almost everything is clear, but I really got confused with all the different naming regarding time: start, end, step, evaluation_time, evaluation_range, default_resolution, window, staleness. Can you please help me find out what is what?

Comment on lines +108 to +110
auto id_data_type = data_table_metadata->columns.get(TimeSeriesColumnNames::ID).type;
auto timestamp_data_type = data_table_metadata->columns.get(TimeSeriesColumnNames::Timestamp).type;
auto scalar_data_type = data_table_metadata->columns.get(TimeSeriesColumnNames::Value).type;
Member

For the future: we need to consider whether this variability is worth it. I think we could enforce a single structure for all tables.

Member Author

I think it's useful to allow a limited set of data types here: scalar_data_type can be either Float64 or Float32, and timestamp_data_type can be DateTime64 with any scale, or just a number of seconds.
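
As a rough illustration of what supporting several timestamp scales involves, here is a sketch that is not taken from the PR. `rescaleTimestamp` is a hypothetical helper that treats a timestamp as an integer tick count with 10^-scale seconds per tick (scale 0 is whole seconds, scale 3 is milliseconds). Overflow is not handled, and scaling down truncates toward zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts a tick count between decimal scales.
// Scale 0 means whole seconds, scale 3 milliseconds, and so on.
// Assumptions: no overflow handling; scaling down truncates toward zero.
int64_t rescaleTimestamp(int64_t ticks, uint32_t from_scale, uint32_t to_scale)
{
    // DateTime64 precision is at most 9 (nanoseconds).
    assert(from_scale <= 9 && to_scale <= 9);
    while (from_scale < to_scale)
    {
        ticks *= 10;
        ++from_scale;
    }
    while (from_scale > to_scale)
    {
        ticks /= 10;
        --from_scale;
    }
    return ticks;
}
```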

@vitlibar vitlibar force-pushed the promql-replace-converter-to-sql branch from 6db069e to 1c28451 February 11, 2026 14:12
@vitlibar vitlibar force-pushed the promql-replace-converter-to-sql branch from 1c28451 to d6fa815 February 11, 2026 14:31

vitlibar commented Feb 11, 2026

Almost everything is clear, but I really got confused with all the different naming regarding time: start, end, step, evaluation_time, evaluation_range, default_resolution, window, staleness. Can you please help me find out what is what?

  1. evaluation_range is {start_time, end_time, step}: multiple timestamps starting from start_time and advancing by the specified step.

  2. evaluation_time is just one timestamp. I changed the code to use start_time, end_time and step directly in PrometheusQueryEvaluationSettings, without packing them into a separate structure, to simplify the code.

  3. I renamed default_resolution to default_subquery_step. PromQL syntax allows subqueries both with an explicit step (e.g. http_requests_total[1h:10m]) and without one (e.g. http_requests_total[1h:] is valid syntax). Subqueries without an explicit step need a default step assigned to them.

  4. window is the same as staleness, but the term window fits all cases better. For example,

  • the query last_over_time(http_requests_total[5m]) returns the last value in the range (now - 5m, now], so these 5m can be called either staleness or window and both are fine; but
  • the query avg_over_time(http_requests_total[5m]) returns the average value in the range (now - 5m, now], where staleness doesn't fit well, so window is better.
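
The terms above can be sketched in C++ as follows. This is an illustration, not the PR's implementation: `Sample`, `evaluationGrid`, `lastInWindow` and `avgInWindow` are hypothetical names, timestamps are plain integer seconds, and treating end_time as inclusive is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical sample type; timestamps are plain integer seconds here.
struct Sample
{
    int64_t timestamp;
    double value;
};

// Timestamps of an evaluation range: start_time, start_time + step, ...
// Assumption: end_time is inclusive.
std::vector<int64_t> evaluationGrid(int64_t start_time, int64_t end_time, int64_t step)
{
    assert(step > 0);
    std::vector<int64_t> grid;
    for (int64_t t = start_time; t <= end_time; t += step)
        grid.push_back(t);
    return grid;
}

// last_over_time-like: the last sample value in (t - window, t], if any.
// `samples` must be sorted by timestamp.
std::optional<double> lastInWindow(const std::vector<Sample> & samples, int64_t t, int64_t window)
{
    std::optional<double> result;
    for (const auto & s : samples)
        if (s.timestamp > t - window && s.timestamp <= t)
            result = s.value;
    return result;
}

// avg_over_time-like: the average sample value in (t - window, t], if any.
std::optional<double> avgInWindow(const std::vector<Sample> & samples, int64_t t, int64_t window)
{
    double sum = 0;
    std::size_t count = 0;
    for (const auto & s : samples)
    {
        if (s.timestamp > t - window && s.timestamp <= t)
        {
            sum += s.value;
            ++count;
        }
    }
    if (count == 0)
        return std::nullopt;
    return sum / static_cast<double>(count);
}
```

Both functions take the same window argument; only for the first one would "staleness" also be a natural name.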

@vitlibar vitlibar enabled auto-merge February 11, 2026 20:47
@vitlibar vitlibar added this pull request to the merge queue Feb 11, 2026
Merged via the queue into ClickHouse:master with commit cfb29a3 Feb 11, 2026
263 of 264 checks passed
@vitlibar vitlibar deleted the promql-replace-converter-to-sql branch February 11, 2026 21:02
@robot-clickhouse robot-clickhouse added the pr-synced-to-cloud label (The PR is synced to the cloud repo) Feb 11, 2026