Aggregate function running out of memory

Hi,

I'm creating a table with 3 columns (uid, ev and timestamp) and running the sequenceMatch function grouped by uid to search for a pattern in the events. The table has 10 million unique uids and 100 evs per uid, so 1 billion rows in total. The query below runs out of memory (8 GB on the test machine). I understand from the docs that MergeTree data is sorted by the primary key within each part.

Is a uid guaranteed to be present in only a single part, since it is part of the primary key? If so, could merge and insert be called for each part, rather than once at the end, so that the data for that part can be deallocated?

If a uid is not guaranteed to be in a single part, can you suggest a better way to achieve the result below?

Table:
CREATE TABLE ev ( uid String,  ev String,  t DateTime,  d Date) ENGINE = MergeTree(d, (uid, t, d), 8192)

Query:
SELECT count()  
FROM  
(  
    SELECT uid  
    FROM ev  
    GROUP BY uid  
    HAVING sequenceMatch('{"pattern":["ev1","ev2"]}')(t, ev)  
)   
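
A possible workaround, as a rough sketch only: split the uids into hash buckets and run the query once per bucket, so that each run holds aggregation state for only a fraction of the uids. The bucket count of 16 and the use of cityHash64 for bucketing are assumptions made for illustration. The sequenceMatch call here uses the pattern-plus-conditions form from the ClickHouse docs, not the JSON pattern above, so it may not match exactly what the original query intends.

```sql
-- Sketch: run once per bucket (0..15) and add up the counts on the client.
-- The bucket count 16 is an arbitrary assumption; raise it if one slice
-- still does not fit in memory.
SELECT count()
FROM
(
    SELECT uid
    FROM ev
    WHERE cityHash64(uid) % 16 = 0   -- change 0 to 1, 2, ... 15 on later runs
    GROUP BY uid
    HAVING sequenceMatch('(?1).*(?2)')(t, ev = 'ev1', ev = 'ev2')
)
```

Each run still scans the whole table, because the hash filter cannot use the primary key index, so this trades query time for lower peak memory.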

Thanks


Aggregate function running out of memory #38
