Subindex find is broken by da['@c'] = new_da  #829

@AnneYang720

After da['@c'] = new_da is used with the Redis or ElasticSearch storage backends, find at the subindex level no longer works. A minimal reproducible example (MRE) is below.

from docarray import Document, DocumentArray
import numpy as np

with DocumentArray(
    storage='elasticsearch', # or redis
    config={
        'n_dim': 128,
    },
    subindex_configs={'@c': {'n_dim': 3}},
) as da:
    da.extend(
        [
            Document(
                id=f'{i}',
                chunks=[
                    Document(id=f'sub{i}_0', embedding=np.random.random(3)),
                    Document(id=f'sub{i}_1', embedding=np.random.random(3)),
                ],
            )
            for i in range(1)
        ]
    )
    res = da.find(np.random.random(3), on='@c') # this works
    
    da['@c'] = [
        Document(id='sub0_0', embedding=np.random.random(3)),
        Document(id='sub0_1', embedding=np.random.random(3)),
    ]
    res = da.find(np.random.random(3), on='@c') # this fails

The reason is that da['@c'] = new_da calls subindex_da.clear() in _update_subindices_set, which then calls _clear_storage. The _clear_storage implementations for Redis and ElasticSearch delete the index in the database, so the subsequent find has no index to query.
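
To illustrate the mechanism without a running Redis or ElasticSearch instance, here is a minimal toy model. ToyStore, assign, and the keep_index flag are hypothetical names invented for this sketch; they are not docarray code, and keeping the index on clear is only one possible remedy, not necessarily the fix the maintainers would choose.

```python
class ToyStore:
    """Toy stand-in for an indexed backend; hypothetical, not docarray code."""

    def __init__(self):
        self.docs = {}         # doc id -> embedding
        self.has_index = True  # the search index is created with the store

    def clear(self, keep_index=False):
        # Always remove the stored documents.
        self.docs.clear()
        # Behaviour described in this issue: clearing also deletes the index.
        if not keep_index:
            self.has_index = False

    def extend(self, docs):
        self.docs.update(docs)

    def find(self, query):
        if not self.has_index:
            raise RuntimeError('search index no longer exists')
        # Return the id of the nearest document by squared Euclidean distance.
        return min(
            self.docs,
            key=lambda i: sum((a - b) ** 2 for a, b in zip(self.docs[i], query)),
        )


def assign(store, new_docs, keep_index=False):
    """Mimic da['@c'] = new_da: clear the subindex, then add the new docs."""
    store.clear(keep_index=keep_index)
    store.extend(new_docs)
```

With keep_index=False, a find after assign raises because the index is gone even though the new documents were written. With keep_index=True, only the documents are replaced and find keeps working.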
