Conversation
939b627 to
5e6b9b0
Compare
|
Hi @jay-bhambhani, thanks for your contribution, we are rally excited about additional storage options for DocArray! I approved you for CI runs, but to get it going you will have to apply I will start reviewing this PR soon, but if you have any questions in the meantime, please just ping me here! |
Hi @JohannesMessner - excited to contribute in some way! Definitely will take a look - it might take a little bit because my laptop unfortunately crapped out on me. But - just wanted to let you know so you don't think I've disappeared! |
0495583 to
35c9038
Compare
Codecov ReportBase: 88.13% // Head: 70.51% // Decreases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #853 +/- ##
===========================================
- Coverage 88.13% 70.51% -17.62%
===========================================
Files 138 143 +5
Lines 7137 7462 +325
===========================================
- Hits 6290 5262 -1028
- Misses 847 2200 +1353
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
There was a problem hiding this comment.
I had a look, but I am not really familiar with OpenSearch (or even ES, for that matter), so could someone else chime in? @alaeddine-13 @violenil?
Edit: Looks like some tests are failing, maybe we let you do your thing @jay-bhambhani before we jump in. Please reach out if anything is unclear!
| #'ef_search': opensearch_config.ef_search, | ||
| 'ef_construction': opensearch_config.ef_construction, | ||
| 'm': opensearch_config.m, | ||
| #'encoder': opensearch_config.encoder, |
There was a problem hiding this comment.
what about these comments?
There was a problem hiding this comment.
added to do's for these - as a second step i think we should maybe allow for these options to be implemented as well?
|
thanks @JohannesMessner! will apply the feedback and continue chugging through these. one quick question - seeing this failure: hubble.excepts.AuthenticationRequiredError: -1: Jina auth token is not provided or has expired. <function PushPullMixin.push at 0x7f1321a538c0> requires login to Jina AI, please do |
I think you can ignore all the tests that fail due to this. We will probably end up moving these tests out of DocArray anyways |
b65946e to
61f655d
Compare
Awesome! Thanks. Sorry one more - I don't know what I'm breaking with commitlint. Can I please ask for a bit of assistance? I'll update the docs accordingly as well. Thank you so much for your help! |
Signed-off-by: Jayant Bhambhani <[email protected]> Signed-off-by: Jay Bhambhani <[email protected]> tests Signed-off-by: Jayant Bhambhani <[email protected]> Signed-off-by: Jay Bhambhani <[email protected]> progress including more testing Signed-off-by: Jayant Bhambhani <[email protected]> Signed-off-by: Jay Bhambhani <[email protected]> tests done Signed-off-by: Jayant Bhambhani <[email protected]> Signed-off-by: Jay Bhambhani <[email protected]> review Signed-off-by: Jayant Bhambhani <[email protected]> Signed-off-by: Jay Bhambhani <[email protected]> clean up elastic references Signed-off-by: Jayant Bhambhani <[email protected]> Signed-off-by: Jay Bhambhani <[email protected]> docs Signed-off-by: Jayant Bhambhani <[email protected]> Signed-off-by: Jay Bhambhani <[email protected]> setup Signed-off-by: Jayant Bhambhani <[email protected]> Signed-off-by: Jay Bhambhani <[email protected]> review and fixes Signed-off-by: Jay Bhambhani <[email protected]> more testing and hopefully fixing some old tests Signed-off-by: Jay Bhambhani <[email protected]> comment out unnecessary test Signed-off-by: Jay Bhambhani <[email protected]> deciding whether to make these variables available to config or not Signed-off-by: Jay Bhambhani <[email protected]> formatting Signed-off-by: Jay Bhambhani <[email protected]>
61f655d to
c9be60d
Compare
|
OK! Looks like the only tests that aren't passing are related to this JINA_AUTH_TOKEN issue. If somebody could please take a look! |
Co-authored-by: Johannes Messner <[email protected]> Signed-off-by: Jay Bhambhani <[email protected]>
alexcg1
left a comment
There was a problem hiding this comment.
Looks good from a preliminary docs check. Just a few small tweaks
|
|
||
| # Opensearch | ||
|
|
||
| One can use [Opensearch](https://opensearch.org/) as the document store for DocumentArray. It is useful when one wants to have faster Document retrieval on embeddings, i.e. `.match()`, `.find()`. |
There was a problem hiding this comment.
Could you change from "one" to "you" throughout the doc please? To follow our updated style guidelines (updated like yesterday - they're not public yet)
|
|
||
| ### Bulk request customization | ||
|
|
||
| You can customize how bulk requests is being sent to Opensearch when adding documents by adding additional `kwargs` on `extend` method call. See [the official Documentation](https://opensearch.org/docs/1.2/opensearch/rest-api/document-apis/bulk/) for more details. See the following code for example: |
There was a problem hiding this comment.
| You can customize how bulk requests is being sent to Opensearch when adding documents by adding additional `kwargs` on `extend` method call. See [the official Documentation](https://opensearch.org/docs/1.2/opensearch/rest-api/document-apis/bulk/) for more details. See the following code for example: | |
| You can customize how bulk requests are sent to Opensearch when adding Documents by adding additional `kwargs` on `extend` method call. See [the official documentation](https://opensearch.org/docs/1.2/opensearch/rest-api/document-apis/bulk/) for more details. See the following code for example: |
|
|
||
| One can perform Approximate Nearest Neighbor Search and pre-filter results using a filter query . | ||
|
|
||
| Consider Documents with embeddings `[0,0,0]` up to `[9,9,9]` where the document with embedding `[i,i,i]` |
There was a problem hiding this comment.
| Consider Documents with embeddings `[0,0,0]` up to `[9,9,9]` where the document with embedding `[i,i,i]` | |
| Consider Documents with embeddings `[0,0,0]` up to `[9,9,9]` where the Document with embedding `[i,i,i]` |
| ``` | ||
|
|
||
| Consider we want the nearest vectors to the embedding `[8. 8. 8.]`, with the restriction that | ||
| prices must follow a filter. As an example, let's consider that retrieved documents must have `price` value lower |
There was a problem hiding this comment.
Please ensure all occurrences of Document or DocumentArray are capitalized (since they are a class in DocArray)
| ``` | ||
| ```` | ||
|
|
||
| ## Config |
There was a problem hiding this comment.
| ## Config | |
| ## Configuration |
|
|
||
| The following configs can be set: |
There was a problem hiding this comment.
| The following configs can be set: |
This is redundant (I removed the line from other document store docs in a PR earlier today)
Signed-off-by: Jay Bhambhani <[email protected]>
|
Thanks a lot for your contribution @jay-bhambhani |
|
Glad I could be of service! I'll keep adding and improving things as we go! Great project! |
Goals: