docs: index predefined documents #1434
Signed-off-by: Johannes Messner <[email protected]>
📝 Docs are deployed on https://ft-docs-predefined-index--jina-docs.netlify.app 🎉
| ### Using a predefined Document as schema | ||
| DocArray offers a number of predefined Documents, like [ImageDoce][docarray.documents.ImageDoc] and [TextDoc][docarray.documents.TextDoc]. |
| DocArray offers a number of predefined Documents, like [ImageDoce][docarray.documents.ImageDoc] and [TextDoc][docarray.documents.TextDoc]. | |
| DocArray offers a number of predefined Documents, like [ImageDoc][docarray.documents.ImageDoc] and [TextDoc][docarray.documents.TextDoc]. |
| DocArray offers a number of predefined Documents, like [ImageDoce][docarray.documents.ImageDoc] and [TextDoc][docarray.documents.TextDoc]. | ||
| If you try to use these directly as a schema for a Document Index, you will get unexpected behavior: | ||
| Depending on the backend, and exception will be raised, or no vector index for ANN lookup will be built. |
| Depending on the backend, and exception will be raised, or no vector index for ANN lookup will be built. | |
| Depending on the backend, an exception will be raised, or no vector index for ANN lookup will be built. |
| ``` | ||
| Once the schema of your Document Index is defined in this way, the data that you are indexing can be either of the | ||
| predefined Document type, or of your custom Document type. |
| predefined Document type, or of your custom Document type. | |
| predefined Document types, or your custom Document type. |
| - A and B have the same field names and field types | ||
| - A and B have the same field names, and, for every field, the type of B is a subclass of the type of A | ||
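The two compatibility rules above can be sketched in plain Python. This is only an illustration that uses standard-library dataclasses as stand-ins for Documents; `is_compatible_schema`, `Vector`, `Vector128`, and the example schemas are hypothetical names, and this is not claimed to be how DocArray itself performs the check.

```python
from dataclasses import dataclass, fields


def is_compatible_schema(schema_a: type, schema_b: type) -> bool:
    """Check whether data of type B fits an index whose schema is A.

    Hypothetical helper: B fits A if both have the same field names and,
    for every field, B's type is identical to or a subclass of A's type.
    """
    types_a = {f.name: f.type for f in fields(schema_a)}
    types_b = {f.name: f.type for f in fields(schema_b)}

    # Both rules require identical field names.
    if types_a.keys() != types_b.keys():
        return False

    for name, type_a in types_a.items():
        type_b = types_b[name]
        if type_a is type_b:
            continue  # rule 1: same field type
        # issubclass only accepts real classes; anything else is rejected here.
        if not (isinstance(type_a, type) and isinstance(type_b, type)):
            return False
        if not issubclass(type_b, type_a):
            return False  # rule 2 violated: B's type is not a subclass of A's
    return True


# Stand-in field types (assumptions for the example only).
class Vector:
    pass


class Vector128(Vector):
    pass


@dataclass
class SchemaA:
    text: str
    embedding: Vector


@dataclass
class SameFields:
    text: str
    embedding: Vector


@dataclass
class SubclassedField:
    text: str
    embedding: Vector128


@dataclass
class ExtraField:
    text: str
    embedding: Vector
    url: str


print(is_compatible_schema(SchemaA, SameFields))       # same names and types
print(is_compatible_schema(SchemaA, SubclassedField))  # field type is a subclass
print(is_compatible_schema(SubclassedField, SchemaA))  # reversed direction
print(is_compatible_schema(SchemaA, ExtraField))       # field names differ
```

Note that the check is directional: `SubclassedField` data fits a `SchemaA` index, but not the other way around, because `Vector` is not a subclass of `Vector128`.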
| In particular this means that you can easily [index predefined Documents](#using-a-predefined-document-as-schema) into a Document Index. |
| In particular this means that you can easily [index predefined Documents](#using-a-predefined-document-as-schema) into a Document Index. | |
| In particular, this means that you can easily [index predefined Documents](#using-a-predefined-document-as-schema) into a Document Index. |
What's the policy on capitalizing Document now that we don't use that class name? I think @samsja mentioned on Discord we don't do that any more.
I think we should still capitalize it, since it is a concept in our library. Lowercased, it looks a bit weird and "unofficial" to me. Plus, I think the rule of thumb was always that "concepts" are capitalized, whereas classes go between backticks.
No strong feeling here. But technically speaking, Document is not a concept in terms of code in the library.
I think it is a concept, just not a class; otherwise "concept" and "class" would be synonyms. But I just checked the pydantic documentation, and they don't capitalize "model". So no strong feeling either.
Explains how to index predefined documents into a document index.