feat: add compression and protocol in file name #138

Merged
hanxiao merged 16 commits into main from feat-add-serialization-info
Feb 25, 2022

Conversation

@davidbp davidbp commented Feb 23, 2022

Currently, if a DocumentArray is stored as binary, users need to remember which compression and protocol were used in order to load it back.

This PR adds this information as extensions in the file name.

Example

import numpy as np
from docarray import Document, DocumentArray

da = DocumentArray([Document(tensor=np.array([1, 2, 3])), Document(tensor=np.array([7, 8, 9]))])
da.save_binary('my_docarray.protobuf.lz4')

This writes the file

my_docarray.protobuf.lz4

Therefore, users no longer need to recall which protocol and compression were used.
Since the information is in the file name, this PR parses it so that users don't need to pass it as keyword arguments to load_binary. They can simply do:

da = DocumentArray.load_binary('my_docarray.protobuf.lz4')
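
The parsing step described above could look roughly like the following. This is only a minimal sketch, not the code from this PR: the helper name `parse_protocol_and_compression` and the two sets of recognised extensions are assumptions made for illustration, and the library's real lists may differ.

```python
from pathlib import Path
from typing import Optional, Tuple

# Assumed sets of recognised extensions (illustrative, not taken from the PR).
KNOWN_PROTOCOLS = {'pickle', 'protobuf', 'pickle-array', 'protobuf-array'}
KNOWN_COMPRESSIONS = {'lz4', 'bz2', 'lzma', 'zlib', 'gzip'}


def parse_protocol_and_compression(
    file_name: str,
    default_protocol: Optional[str] = None,
    default_compression: Optional[str] = None,
) -> Tuple[Optional[str], Optional[str]]:
    """Hypothetical helper: read protocol and compression from file extensions."""
    protocol, compression = default_protocol, default_compression
    # Path.suffixes returns every extension, e.g. ['.protobuf', '.lz4'].
    for suffix in Path(file_name).suffixes:
        extension = suffix.lstrip('.')
        if extension in KNOWN_PROTOCOLS:
            protocol = extension
        elif extension in KNOWN_COMPRESSIONS:
            compression = extension
    # Unrecognised extensions are ignored, so the defaults are kept for them.
    return protocol, compression


print(parse_protocol_and_compression('my_docarray.protobuf.lz4'))
# → ('protobuf', 'lz4')
```

Falling back to the defaults when no known extension is found means a file name without this information can still be loaded by passing the protocol and compression explicitly.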

This PR solves #128

codecov bot commented Feb 23, 2022

Codecov Report

Merging #138 (2545ed7) into main (c8fc4b8) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #138      +/-   ##
==========================================
- Coverage   83.89%   83.87%   -0.02%     
==========================================
  Files         106      108       +2     
  Lines        4551     4651     +100     
==========================================
+ Hits         3818     3901      +83     
- Misses        733      750      +17     

| Flag | Coverage Δ |
|---|---|
| docarray | 83.87% <100.00%> (-0.02%) ⬇️ |

| Impacted Files | Coverage Δ |
|---|---|
| docarray/__init__.py | 100.00% <100.00%> (ø) |
| docarray/array/mixins/io/binary.py | 94.65% <100.00%> (+0.30%) ⬆️ |
| docarray/helper.py | 70.87% <100.00%> (+3.59%) ⬆️ |
| docarray/array/mixins/match.py | 75.00% <0.00%> (-17.95%) ⬇️ |
| docarray/array/storage/pqlite/find.py | 83.33% <0.00%> (-2.39%) ⬇️ |
| docarray/document/mixins/image.py | 56.57% <0.00%> (-0.67%) ⬇️ |
| docarray/array/storage/memory/backend.py | 96.15% <0.00%> (-0.15%) ⬇️ |
| docarray/array/mixins/__init__.py | 100.00% <0.00%> (ø) |
| docarray/array/storage/memory/__init__.py | 100.00% <0.00%> (ø) |
| docarray/array/storage/sqlite/__init__.py | 100.00% <0.00%> (ø) |
| ... and 4 more | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 66e533f...2545ed7.

@davidbp davidbp requested a review from JoanFM February 23, 2022 13:03
@davidbp davidbp marked this pull request as ready for review February 23, 2022 13:03
@davidbp davidbp marked this pull request as draft February 23, 2022 13:11
@davidbp davidbp requested a review from numb3r3 February 24, 2022 11:16

@numb3r3 numb3r3 left a comment

LGTM 👍

@davidbp davidbp self-assigned this Feb 25, 2022
@davidbp davidbp marked this pull request as ready for review February 25, 2022 12:05
@github-actions
📝 Docs are deployed on https://ft-feat-add-serialization-info--jina-docs.netlify.app 🎉

@hanxiao hanxiao merged commit fac597e into main Feb 25, 2022
@hanxiao hanxiao deleted the feat-add-serialization-info branch February 25, 2022 18:03
@hanxiao hanxiao linked an issue Feb 25, 2022 that may be closed by this pull request

Linked issue: Add protocol and compression info DocumentArray filename