Skip to content

fix: fix save and load json behaviour#205

Merged
hanxiao merged 14 commits intomainfrom
fix-json-save-load
Mar 16, 2022
Merged

fix: fix save and load json behaviour#205
hanxiao merged 14 commits intomainfrom
fix-json-save-load

Conversation

@alaeddine-13
Copy link
Copy Markdown
Member

No description provided.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 15, 2022

Codecov Report

Merging #205 (b034960) into main (1b5f536) will increase coverage by 3.87%.
The diff coverage is 90.00%.

@@            Coverage Diff             @@
##             main     #205      +/-   ##
==========================================
+ Coverage   80.71%   84.59%   +3.87%     
==========================================
  Files         119      119              
  Lines        5295     5302       +7     
==========================================
+ Hits         4274     4485     +211     
+ Misses       1021      817     -204     
Flag Coverage Δ
docarray 84.59% <90.00%> (+3.87%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
docarray/document/mixins/porting.py 93.33% <75.00%> (+4.00%) ⬆️
docarray/array/mixins/io/json.py 91.42% <100.00%> (-1.08%) ⬇️
docarray/array/mixins/post.py 84.37% <0.00%> (-3.63%) ⬇️
docarray/array/queryset/lookup.py 78.26% <0.00%> (-1.44%) ⬇️
docarray/array/queryset/parser.py 87.09% <0.00%> (ø)
docarray/base.py 97.18% <0.00%> (+1.40%) ⬆️
docarray/array/storage/weaviate/find.py 89.74% <0.00%> (+2.56%) ⬆️
docarray/array/mixins/find.py 89.04% <0.00%> (+2.73%) ⬆️
docarray/array/mixins/getattr.py 81.81% <0.00%> (+3.03%) ⬆️
docarray/array/mixins/match.py 75.00% <0.00%> (+6.25%) ⬆️
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1b5f536...b034960. Read the comment docs.

return cls.load_json(file, protocol=protocol, encoding=encoding, **kwargs)
from .... import Document

json_docs = json.loads(file)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are your sure this maps to the type hint?

        file: Union[str, bytes, bytearray, TextIO],

either this line or type hint, one of them is wrong

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed. Only str, bytes and bytearray are kept while TextIO and str are supported for load_json


return MessageToJson(self.to_protobuf(), **kwargs)
elif protocol == 'dynamic':
return json.dumps(self.to_dict(protocol=protocol))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pass the kwargs for json.dumps(.., **kwargs)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dumping is not supported anymore

Comment on lines +86 to +92
elif protocol == 'dynamic':
warnings.warn(
f'protocol=`{protocol}` is not supported, '
f'the result dict is a Python dynamic typing dict without any promise on the schema.'
'The result dict is a Python dynamic typing dict without any promise on the schema.'
)
return dataclasses.asdict(self._data)
res = dataclasses.asdict(self._data)
del res['_reference_doc']
return res
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need, you only need it in load/from

we do not support arbitrary/unschema-ed json dumping

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

Comment on lines +49 to +47
:param protocol: `jsonschema` or `protobuf`
:param protocol: `jsonschema`, `protobuf` or `dynamic`
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert to previous

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

pb_msg = DocumentProto()
json_format.ParseDict(obj, pb_msg, **kwargs)
return cls.from_protobuf(pb_msg)
elif protocol == 'dynamic':
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
elif protocol == 'dynamic':
else:

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 67 to 70
elif protocol == 'dynamic':
return cls.from_dict(json.loads(obj), protocol=protocol)
else:
raise ValueError(f'protocol=`{protocol}` is not supported')
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
elif protocol == 'dynamic':
return cls.from_dict(json.loads(obj), protocol=protocol)
else:
raise ValueError(f'protocol=`{protocol}` is not supported')
else:
return cls(json.loads(obj))

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done but using return cls.from_dict(json.loads(obj), protocol=protocol)

```

By default, it uses {ref}`JSON Schema and pydantic model<schema-gen>` for serialization, i.e. `protocol='jsonschema'`. You can switch the method to `protocol='protobuf'`, which leverages Protobuf as the JSON serialization backend.
By default, it uses {ref}`JSON Schema and pydantic model<schema-gen>` for serialization, i.e. `protocol='jsonschema'`. You can switch the method to `protocol='protobuf'`, which leverages Protobuf as the JSON serialization backend or `protocol='dynamic'` which accepts schemaless Documents.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
By default, it uses {ref}`JSON Schema and pydantic model<schema-gen>` for serialization, i.e. `protocol='jsonschema'`. You can switch the method to `protocol='protobuf'`, which leverages Protobuf as the JSON serialization backend or `protocol='dynamic'` which accepts schemaless Documents.
By default, it uses {ref}`JSON Schema and pydantic model<schema-gen>` for serialization, i.e. `protocol='jsonschema'`. You can switch the method to `protocol='protobuf'`, which leverages Protobuf as the JSON serialization backend.
To load an arbitrary JSON file, please set `protocol=None`. But as it is "arbitrary", you should not expect it can be succesfully loaded. DocArray tries its best reasonable effort by first load this JSON into `dict` and then load it via `Document(dict)`.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@github-actions
Copy link
Copy Markdown

📝 Docs are deployed on https://ft-fix-json-save-load--jina-docs.netlify.app 🎉

@hanxiao hanxiao merged commit a57d977 into main Mar 16, 2022
@hanxiao hanxiao deleted the fix-json-save-load branch March 16, 2022 16:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants