Skip to content

fix: add support independent len calculation#840

Merged
JoanFM merged 1 commit intomainfrom
add_support_for_len
Nov 28, 2022
Merged

fix: add support independent len calculation#840
JoanFM merged 1 commit intomainfrom
add_support_for_len

Conversation

@dongxiang123
Copy link
Copy Markdown
Contributor

Signed-off-by: dong xiang [email protected]

Goals:

  • ...
    As reported by users, after disabling the list-like flag in Docarray backends, the len should still work, because otherwise the redis backend will fail. Therefore it is necessary to add this functionality.

  • ...

  • check and update documentation, if required. See guide

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Nov 24, 2022

Codecov Report

Base: 81.08% // Head: 79.73% // Decreases project coverage by -1.34% ⚠️

Coverage data is based on head (52809ec) compared to base (5fce6b6).
Patch coverage: 83.33% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #840      +/-   ##
==========================================
- Coverage   81.08%   79.73%   -1.35%     
==========================================
  Files         138      138              
  Lines        7067     7072       +5     
==========================================
- Hits         5730     5639      -91     
- Misses       1337     1433      +96     
Flag Coverage Δ
docarray 79.73% <83.33%> (-1.35%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
docarray/array/storage/milvus/getsetdel.py 97.10% <0.00%> (+0.04%) ⬆️
docarray/array/storage/redis/seqlike.py 88.57% <100.00%> (+4.70%) ⬆️
docarray/array/mixins/io/csv.py 23.68% <0.00%> (-65.79%) ⬇️
docarray/array/mixins/sample.py 45.45% <0.00%> (-54.55%) ⬇️
docarray/array/mixins/io/pushpull.py 23.27% <0.00%> (-52.59%) ⬇️
docarray/array/mixins/plot.py 28.13% <0.00%> (-39.83%) ⬇️
docarray/array/mixins/io/binary.py 92.15% <0.00%> (-4.58%) ⬇️
docarray/array/storage/redis/backend.py 89.91% <0.00%> (-3.37%) ⬇️
docarray/array/mixins/getattr.py 78.78% <0.00%> (-3.04%) ⬇️
docarray/array/storage/weaviate/backend.py 85.25% <0.00%> (+0.64%) ⬆️
... and 21 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@dongxiang123 dongxiang123 marked this pull request as ready for review November 24, 2022 05:11
Copy link
Copy Markdown
Contributor

@AnneYang720 AnneYang720 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can change the previous test to

@pytest.fixture(scope='function')
def docs():
    d1 = Document(id='0', embedding=np.array([10, 0]))
    d2 = Document(id='1', embedding=np.array([0, 10]))
    d3 = Document(id='2', embedding=np.array([-10, -10]))
    yield d1, d2, d3


@pytest.mark.parametrize(
    'storage,config',
    [
        ('weaviate', {'n_dim': 2, 'list_like': False}),
        ('annlite', {'n_dim': 2, 'list_like': False}),
        ('qdrant', {'n_dim': 2, 'list_like': False}),
        ('elasticsearch', {'n_dim': 2, 'list_like': False}),
        ('redis', {'n_dim': 2, 'list_like': False}),
        ('milvus', {'n_dim': 2, 'list_like': False}),
    ],
)
def test_disable_offset2id(docs, storage, config, start_storage):
    if config:
        da = DocumentArray(storage=storage, config=config)
    else:
        da = DocumentArray(storage=storage)
    
    assert len(da) == 0
    
    da.extend(docs)
    assert len(da) == 3
    
    del da['0']
    assert len(da) == 2

    with pytest.raises(ValueError):
        da[0]

Copy link
Copy Markdown
Contributor

@AnneYang720 AnneYang720 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are three more things:

  • Since you have changed the extend function of Offset2ID, the _extend function for storage backends should also be changed, like for Redis, if self._list_like: should be removed.
  • The _del_doc and _del_docs function will call self._offset2ids.delete_by_id(_id) and self._del_doc_by_id(_id). self._del_doc_by_id(_id) won't delete the document in storage if the key doesn't exist, but the self.ids_count -= 1 of offset2id is executed anyway. You can notice this issue if you change the del da['0'] to del da['any_id'] in the script above.
  • The test for milvus fails with AttributeError: 'DocumentArrayMilvus' object has no attribute '_offset2ids'

@dongxiang123 dongxiang123 force-pushed the add_support_for_len branch 4 times, most recently from 357b4e6 to a40e17d Compare November 25, 2022 03:29
@JoanFM JoanFM linked an issue Nov 25, 2022 that may be closed by this pull request
@dongxiang123 dongxiang123 force-pushed the add_support_for_len branch 9 times, most recently from be3ddf3 to 579154a Compare November 28, 2022 10:40
@dongxiang123 dongxiang123 force-pushed the add_support_for_len branch 2 times, most recently from 00ed22a to c31e05c Compare November 28, 2022 13:33
@JoanFM JoanFM merged commit 4d9a678 into main Nov 28, 2022
@JoanFM JoanFM deleted the add_support_for_len branch November 28, 2022 16:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add support for __len__ in offset2id, making it indepdent from offset2id

4 participants