Skip to content

feat: index or collection name will default to doc-type name#1486

Merged
JoanFM merged 11 commits intomainfrom
feat-default-name-index
May 6, 2023
Merged

feat: index or collection name will default to doc-type name#1486
JoanFM merged 11 commits intomainfrom
feat-default-name-index

Conversation

@JoanFM
Copy link
Copy Markdown
Member

@JoanFM JoanFM commented May 4, 2023

No description provided.

@JoanFM JoanFM force-pushed the feat-default-name-index branch from 96469c2 to 20d648a Compare May 4, 2023 14:04
Copy link
Copy Markdown
Member

@JohannesMessner JohannesMessner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a breaking change I think we need to be careful to guide people to un-break their code.

Concretely what will happen is that collections/tables created with a docarray version before this will be called Document, and when a user tries to access their database with a new version of docarray it will assume a table called MyDoc or something like that, and it will fail.

I think for this case we should catch the error and add a message like:

You are trying to connect to table {whatever}, but no such table exists. If you are trying to access a table that was created with DocArray version {one smaller than this one} this could be because the default table name has changed between versions. In that case, try adding the following QdrantDocIndex(..., table_name='Document'), or equivalently, adding table_name='Document' to your DBConfig

The details of this messag will depend on the particular backend i suppose

Signed-off-by: Joan Fontanals Martinez <[email protected]>
@JoanFM JoanFM force-pushed the feat-default-name-index branch from 7ef4fa5 to 18ea080 Compare May 4, 2023 14:28
@JoanFM JoanFM force-pushed the feat-default-name-index branch from 32ef595 to 8d3990c Compare May 4, 2023 15:05
@JoanFM
Copy link
Copy Markdown
Member Author

JoanFM commented May 4, 2023

Since this is a breaking change I think we need to be careful to guide people to un-break their code.

Concretely what will happen is that collections/tables created with a docarray version before this will be called Document, and when a user tries to access their database with a new version of docarray it will assume a table called MyDoc or something like that, and it will fail.

I think for this case we should catch the error and add a message like:

You are trying to connect to table {whatever}, but no such table exists. If you are trying to access a table that was created with DocArray version {one smaller than this one} this could be because the default table name has changed between versions. In that case, try adding the following QdrantDocIndex(..., table_name='Document'), or equivalently, adding table_name='Document' to your DBConfig

The details of this messag will depend on the particular backend i suppose

This would be really nice, but it does not seem so feasible, nowhere in the code I see this check being done, plus I am not sure what each of the Backends throw when this happens, or they may simply create the table/etc ...

I would simply add a breaking change note in the release notes once this is released

@JoanFM JoanFM force-pushed the feat-default-name-index branch 4 times, most recently from 421df67 to 590b23f Compare May 4, 2023 15:49
Signed-off-by: Joan Fontanals Martinez <[email protected]>
@JoanFM JoanFM force-pushed the feat-default-name-index branch from 590b23f to 44fcd80 Compare May 5, 2023 07:03
@github-actions github-actions bot added size/l and removed size/m labels May 5, 2023
@AnneYang720 AnneYang720 force-pushed the feat-default-name-index branch from 5f10ffd to 7c1acf9 Compare May 5, 2023 13:39
Signed-off-by: AnneY <[email protected]>
@AnneYang720 AnneYang720 force-pushed the feat-default-name-index branch from 7c1acf9 to 7358f5d Compare May 5, 2023 13:57
@JoanFM
Copy link
Copy Markdown
Member Author

JoanFM commented May 5, 2023

@AnneYang720 why is it needed to change the names? why the tests do not pass with default naming?

@AnneYang720
Copy link
Copy Markdown
Contributor

@AnneYang720 why is it needed to change the names? why the tests do not pass with default naming?

The schema name is used as default index_name, and sometimes this index_name is used in some tests before. For example:

index = ElasticDocIndex[SimpleDoc]()

The index_name is simpledoc and is used in previous tests(SimpleDoc is used as schema), the index.num_docs() is not zero at the moment.

@JoanFM
Copy link
Copy Markdown
Member Author

JoanFM commented May 5, 2023

@AnneYang720 why is it needed to change the names? why the tests do not pass with default naming?

The schema name is used as default index_name, and sometimes this index_name is used in some tests before. For example:

index = ElasticDocIndex[SimpleDoc]()

The index_name is simpledoc and is used in previous tests(SimpleDoc is used as schema), the index.num_docs() is not zero at the moment.

Can we have some tests were the default name is used for indexing and retrieving in one shot?

@AnneYang720
Copy link
Copy Markdown
Contributor

AnneYang720 commented May 5, 2023

@AnneYang720 why is it needed to change the names? why the tests do not pass with default naming?

The schema name is used as default index_name, and sometimes this index_name is used in some tests before. For example:

index = ElasticDocIndex[SimpleDoc]()

The index_name is simpledoc and is used in previous tests(SimpleDoc is used as schema), the index.num_docs() is not zero at the moment.

Can we have some tests were the default name is used for indexing and retrieving in one shot?

Some tests with schema that is only used once(like FlatSchema) use default index name. And I just changed test_get_single, test_get_multiple in elastic and test_persist_and_restore in qdrant so that they also use default index name.

@AnneYang720 AnneYang720 force-pushed the feat-default-name-index branch from e201503 to 4f4c31a Compare May 5, 2023 15:07
@AnneYang720 AnneYang720 force-pushed the feat-default-name-index branch from 4f4c31a to ff172e9 Compare May 5, 2023 15:19
@github-actions
Copy link
Copy Markdown

github-actions bot commented May 6, 2023

📝 Docs are deployed on https://ft-feat-default-name-index--jina-docs.netlify.app 🎉

@JoanFM JoanFM merged commit eb69318 into main May 6, 2023
@JoanFM JoanFM deleted the feat-default-name-index branch May 6, 2023 08:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Storing different documents in weaviate does not work because the index_name is not inferred from the BaseDoc subclass' name

3 participants