Skip to content

fix: initialize doc with dataclass obj and kwargs#694

Merged
JoanFM merged 10 commits intomainfrom
fix-init-doc-with-dataclass-and-kwargs-447
Oct 28, 2022
Merged

fix: initialize doc with dataclass obj and kwargs#694
JoanFM merged 10 commits intomainfrom
fix-init-doc-with-dataclass-and-kwargs-447

Conversation

@anna-charlotte
Copy link
Copy Markdown
Contributor

Goals:

  • Allow initialization of a Document instance with a dataclass obj as well as additional kwargs.
  • Currently, when a Document is initialized with those two, the attributes passed with the dataclass object are not being saved in the Documentinstance, due to the fact of overwriting self._data, as described here: Bug: initializing Document with dataclass and additional kwargs #447

Example

from docarray import DocumentArray, Document, dataclass
from docarray.typing import Text

@dataclass
class MyDoc:
    chunk_text: Text

d = Document(MyDoc(chunk_text='chunk level text'), text='top level text')
print(d.is_multimodal)  # should be True, but is False
print(d.text)  # should print 'top level text'
print(d.chunk_text.text)  # should print 'chunk level text', but fails
  • check and update documentation, if required. See guide

@codecov
Copy link
Copy Markdown

codecov bot commented Oct 26, 2022

Codecov Report

Merging #694 (0b6461c) into main (3693c17) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main     #694   +/-   ##
=======================================
  Coverage   88.50%   88.50%           
=======================================
  Files         133      133           
  Lines        6542     6542           
=======================================
  Hits         5790     5790           
  Misses        752      752           
Flag Coverage Δ
docarray 88.50% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@anna-charlotte anna-charlotte requested a review from JoanFM October 26, 2022 13:25
@anna-charlotte anna-charlotte marked this pull request as ready for review October 26, 2022 13:25
Copy link
Copy Markdown
Member

@JohannesMessner JohannesMessner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good!

But can we have a test where we see how conflicting fields are handled?
I think if you use a basic type like str in the data class it will be put in the tags of the top level Document. What happens if I then also set tags as **kwargs? What if the keys in these two different tag dictionaries clash, what if they don't?

@anna-charlotte
Copy link
Copy Markdown
Contributor Author

Ok, makes sense, I will check that out and add a test accordingly

@anna-charlotte anna-charlotte marked this pull request as draft October 27, 2022 06:28
@anna-charlotte
Copy link
Copy Markdown
Contributor Author

anna-charlotte commented Oct 27, 2022

How do we expect the Document to behave incase it is initialized as follows:

def test_doc_with_dataclass_with_str_attr_and_additional_unknown_attr_with_same_name():
    @dataclass
    class MyDoc:
        name: str

    d = Document(MyDoc(name='mydoc'), name='doc')

    assert d.tags['name'] == 'doc' 
    assert d.tags['name'] == 'mydoc'  # fails

In this case, tags['name'] is being set to 'mydoc' first and then overwritten with 'doc' afterwards.
(This test has not been added yet).

@anna-charlotte anna-charlotte marked this pull request as ready for review October 27, 2022 09:08
@JoanFM JoanFM merged commit 030f5b3 into main Oct 28, 2022
@JoanFM JoanFM deleted the fix-init-doc-with-dataclass-and-kwargs-447 branch October 28, 2022 11:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: initializing Document with dataclass and additional kwargs

3 participants