Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 1 addition & 2 deletions docarray/array/storage/milvus/getsetdel.py
Original file line number Diff line number Diff line change
Expand Up @@ -74,8 +74,7 @@ def _get_docs_by_ids(self, ids: 'Iterable[str]', **kwargs) -> 'DocumentArray':
raise KeyError(f'No documents found for ids {ids}')
docs.extend(self._docs_from_query_response(res))
# sort output docs according to input id sorting
id_to_index = {id_: i for i, id_ in enumerate(ids)}
return DocumentArray(sorted(docs, key=lambda d: id_to_index[d.id]))
return DocumentArray([docs[d] for d in ids])
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we keep the dict-based approach, but have every id_ point to a list of positions, and leverage that in key=... somehow?
The reason it was done this way is performance, we want to gather the id-to-position mapping only once, and then delegate everything else to sorted(), which leverages a fist implementation in C.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of a python built-in function. sort doesn't make the list longer than the original one.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I see, then let's keep it.


def _del_docs_by_ids(self, ids: 'Iterable[str]', **kwargs) -> 'DocumentArray':
kwargs = self._update_kwargs_from_config('consistency_level', **kwargs)
Expand Down
3 changes: 2 additions & 1 deletion tests/unit/array/test_advance_indexing.py
Original file line number Diff line number Diff line change
Expand Up @@ -273,12 +273,13 @@ def test_sequence_str(docs, storage, config, start_storage):
else:
docs = DocumentArray(docs, storage=storage)
# getter
idx = [d.id for d in docs[1, 3, 5, 7, -1, -2]]
idx = [d.id for d in docs[1, 3, 5, 7, -1, -2, 1]]

assert len(docs[idx]) == len(idx)
assert len(docs[tuple(idx)]) == len(idx)

# setter
idx = [d.id for d in docs[1, 3, 5, 7, -1, -2]]
docs[idx] = [Document(text='repl') for _ in range(len(idx))]
idx = [d.id for d in docs[1, 3, 5, 7, -1, -2]]
for _id in idx:
Expand Down