fix: milvus _get_docs_by_ids by AnneYang720 · Pull Request #859 · docarray/docarray

AnneYang720 · 2022-11-28T15:20:17Z

Goals:

This PR is related to issue #857

- fix the code
- add a related test

Signed-off-by: AnneY <[email protected]>

codecov-commenter · 2022-11-28T15:33:47Z

Codecov Report

Base: 84.78% // Head: 86.36% // Increases project coverage by +1.58% 🎉

Coverage data is based on head (57f0a4c) compared to base (86c4cd4).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #859      +/-   ##
==========================================
+ Coverage   84.78%   86.36%   +1.58%     
==========================================
  Files         138      138              
  Lines        7117     7116       -1     
==========================================
+ Hits         6034     6146     +112     
+ Misses       1083      970     -113

| Flag | Coverage Δ | |
|---|---|---|
| docarray | `86.36% <100.00%> (+1.58%)` | ⬆️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ | |
|---|---|---|
| docarray/array/storage/milvus/getsetdel.py | `97.05% <100.00%> (-0.05%)` | ⬇️ |
| docarray/array/storage/qdrant/getsetdel.py | `80.51% <0.00%> (+1.29%)` | ⬆️ |
| docarray/array/storage/redis/getsetdel.py | `97.14% <0.00%> (+1.42%)` | ⬆️ |
| docarray/array/storage/base/getsetdel.py | `91.39% <0.00%> (+1.98%)` | ⬆️ |
| docarray/array/storage/sqlite/getsetdel.py | `97.77% <0.00%> (+2.22%)` | ⬆️ |
| docarray/document/mixins/porting.py | `94.36% <0.00%> (+2.81%)` | ⬆️ |
| docarray/array/storage/base/helper.py | `90.56% <0.00%> (+3.77%)` | ⬆️ |
| docarray/array/storage/elastic/getsetdel.py | `100.00% <0.00%> (+4.76%)` | ⬆️ |
| docarray/array/storage/annlite/getsetdel.py | `100.00% <0.00%> (+4.87%)` | ⬆️ |
| docarray/array/storage/weaviate/getsetdel.py | `100.00% <0.00%> (+7.50%)` | ⬆️ |

... and 8 more

JohannesMessner

Good catch - just one small performance concern.

JohannesMessner · 2022-11-28T17:42:45Z

docarray/array/storage/milvus/getsetdel.py

        # sort output docs according to input id sorting
-        id_to_index = {id_: i for i, id_ in enumerate(ids)}
-        return DocumentArray(sorted(docs, key=lambda d: id_to_index[d.id]))
+        return DocumentArray([docs[d] for d in ids])


Can we keep the dict-based approach, but have every id_ point to a list of positions, and leverage that in key=... somehow?
The reason it was done this way is performance: we want to gather the id-to-position mapping only once, and then delegate everything else to sorted(), which leverages a fast implementation in C.

AnneYang720 · 2022-11-29

I can't think of a Python built-in function that does this. Sorting doesn't make the list longer than the original one.

JohannesMessner · 2022-11-29

OK, I see. Then let's keep it.
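To make the trade-off in this thread concrete, here is a minimal sketch in plain Python with no Milvus or DocArray dependency. `Doc`, `reorder_with_sorted`, and `reorder_with_lookup` are hypothetical names used only for illustration, and a plain dict keyed by id stands in for `docs` from the diff. It also assumes the backend returns each matching document once, even when an id is requested twice; the diff itself does not show this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """Hypothetical stand-in for a DocArray Document; only the id matters here."""
    id: str


def reorder_with_sorted(fetched, ids):
    """Removed approach: sort fetched docs by the position of their id in `ids`."""
    # With a duplicated id, the later position overwrites the earlier one.
    id_to_index = {id_: i for i, id_ in enumerate(ids)}
    return sorted(fetched, key=lambda d: id_to_index[d.id])


def reorder_with_lookup(fetched, ids):
    """Merged approach: index docs by id once, then emit one doc per requested id."""
    by_id = {d.id: d for d in fetched}
    return [by_id[id_] for id_ in ids]


# Assumption: each matching doc comes back once, in arbitrary order.
fetched = [Doc("b"), Doc("a")]
ids = ["a", "b", "a"]  # "a" is requested twice

old_result = [d.id for d in reorder_with_sorted(fetched, ids)]
new_result = [d.id for d in reorder_with_lookup(fetched, ids)]

print(old_result)  # ['b', 'a']
print(new_result)  # ['a', 'b', 'a']
```

A sort only permutes its input, so it can never repeat a document. The lookup version returns exactly one entry per requested id, at the cost of doing the per-id work in a Python-level loop rather than inside `sorted()`.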

AnneYang720 added 3 commits November 28, 2022 22:43

fix: fix milvus _get_docs_by_ids

9ff5390

Signed-off-by: AnneY <[email protected]>

test: test _get_docs_by_ids with duplicated id

f9f6785

Signed-off-by: AnneY <[email protected]>

Merge branch 'main' into fix-milvus-getbyids

c325629

AnneYang720 marked this pull request as draft November 28, 2022 15:20

github-actions bot added size/xs area/core area/testing component/array labels Nov 28, 2022

JohannesMessner requested changes Nov 28, 2022

Merge branch 'main' into fix-milvus-getbyids

57f0a4c

AnneYang720 marked this pull request as ready for review November 29, 2022 06:56

JohannesMessner approved these changes Nov 29, 2022

JohannesMessner merged commit 67209b8 into main Nov 29, 2022

JohannesMessner deleted the fix-milvus-getbyids branch November 29, 2022 09:15

alexcg1 mentioned this pull request Dec 6, 2022

chore: draft release note v0.20 #894

Closed

JohannesMessner merged 4 commits into main from fix-milvus-getbyids
