Merged
24 changes: 14 additions & 10 deletions docs/user_guide/storing/doc_store/store_file.md
@@ -1,10 +1,13 @@
# Store on-disk

When you want to use your [DocList][docarray.array.doc_list.doc_list.DocList] in another place, you can use the
[`.push()`][docarray.array.doc_list.pushpull.PushPullMixin.push] function to push the [DocList][docarray.array.doc_list.doc_list.DocList]
to one place and later use the [`.pull()`][docarray.array.doc_list.pushpull.PushPullMixin.pull] function to pull its content back.
When you want to use your [DocList][docarray.array.doc_list.doc_list.DocList] in another place, you can use:

- the [`.push()`][docarray.array.doc_list.pushpull.PushPullMixin.push] method to push the [DocList][docarray.array.doc_list.doc_list.DocList]
to one place.
- the [`.pull()`][docarray.array.doc_list.pushpull.PushPullMixin.pull] method to pull its content back.

## Push and pull

## Push & pull
To use the store locally, you need to pass a local file path to the function starting with `'file://'`.

```python
@@ -21,14 +24,15 @@ dl.push('file://simple_dl')
dl_pull = DocList[SimpleDoc].pull('file://simple_dl')
```

A file with the name of `simple_dl.docs` being created to store the `DocList`.
A file with the name of `simple_dl.docs` will be created in `$HOME/.docarray/cache` to store the `DocList`.
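
To check where the pushed data ends up, the naming rule in the sentence above can be sketched with the standard library alone. This is a minimal illustration, not DocArray's own code: `expected_store_path` is a hypothetical helper, and the cache directory is taken from the text above on the assumption that it has not been overridden.

```python
from pathlib import Path


def expected_store_path(url: str, cache_dir: Path) -> Path:
    """Map a 'file://<name>' URL to '<cache_dir>/<name>.docs'.

    Hypothetical helper for illustration; it is not part of DocArray.
    """
    prefix = 'file://'
    if not url.startswith(prefix):
        raise ValueError(f'expected a URL starting with {prefix!r}, got {url!r}')
    name = url[len(prefix):]
    return cache_dir / f'{name}.docs'


# Assumption: the cache location stated in the text above.
cache_dir = Path.home() / '.docarray' / 'cache'
store_path = expected_store_path('file://simple_dl', cache_dir)

# Report whether a previous push has already created the file.
print(store_path, 'exists' if store_path.exists() else 'not found')
```

Run it after `dl.push('file://simple_dl')` to confirm the file is where you expect. If it reports `not found`, the cache directory in your installation may differ from the one assumed here.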


## Push and pull with streaming

## Push & pull with streaming
When you have a large amount of documents to push and pull, you could use the streaming function.
When you have a large amount of documents to push and pull, you can use the streaming method:
[`.push_stream()`][docarray.array.doc_list.pushpull.PushPullMixin.push_stream] and
[`.pull_stream()`][docarray.array.doc_list.pushpull.PushPullMixin.pull_stream] can help you to stream the `DocList` in
order to save the memory usage. You set multiple `DocList` to pull from the same source as well.
[`.pull_stream()`][docarray.array.doc_list.pushpull.PushPullMixin.pull_stream] stream the `DocList`
to save memory usage. You set multiple `DocList`s to pull from the same source as well:

```python
from docarray import BaseDoc, DocList
@@ -63,4 +67,4 @@ for d1, d2 in zip(dl_pull_stream_1, dl_pull_stream_2):
get SimpleDoc(id='1389877ac97b3e6d0e8eb17568934708', text='doc 6'), get SimpleDoc(id='1389877ac97b3e6d0e8eb17568934708', text='doc 6')
get SimpleDoc(id='264b0eff2cd138d296f15c685e15bf23', text='doc 7'), get SimpleDoc(id='264b0eff2cd138d296f15c685e15bf23', text='doc 7')
```
</details>
</details>
38 changes: 20 additions & 18 deletions docs/user_guide/storing/doc_store/store_jac.md
@@ -1,18 +1,21 @@
# Store on Jina AI Cloud
When you want to use your [`DocList`][docarray.DocList] in another place, you can use the
[`.push()`][docarray.array.doc_list.pushpull.PushPullMixin.push] method to push the `DocList` to Jina AI Cloud and later use the
[`.pull()`][docarray.array.doc_list.pushpull.PushPullMixin.pull] function to pull its content back.

When you want to use your [`DocList`][docarray.DocList] in another place, you can use:
- the [`.push()`][docarray.array.doc_list.pushpull.PushPullMixin.push] method to push the `DocList` to Jina AI Cloud .
- the [`.pull()`][docarray.array.doc_list.pushpull.PushPullMixin.pull] function to pull its content back.

!!! note
To store on Jina AI Cloud, you need to install the extra dependency with the following line
To store documents on Jina AI Cloud, you need to install the extra dependency with the following line:

```cmd
pip install "docarray[jac]"
```

## Push & pull
## Push and pull

To use the store [`DocList`][docarray.DocList] on Jina AI Cloud, you need to pass a Jina AI Cloud path to the function starting with `'jac://'`.

Before getting started, you need to have an account at [Jina AI Cloud](http://cloud.jina.ai/) and created a [Personal Access Token (PAT)](https://cloud.jina.ai/settings/tokens).
Before getting started, create an account at [Jina AI Cloud](http://cloud.jina.ai/) and a [Personal Access Token (PAT)](https://cloud.jina.ai/settings/tokens).

```python
from docarray import BaseDoc, DocList
@@ -34,26 +37,25 @@ dl.push(f'jac://{DL_NAME}')
dl_pull = DocList[SimpleDoc].pull(f'jac://{DL_NAME}')
```


!!! note
When using `.push()` and `.pull()`, `DocList` calls the default boto3 client. Be sure your default session is correctly set up.
When using `.push()` and `.pull()`, `DocList` calls the default `boto3` client. Be sure your default session is correctly set up.

## Push and pull with streaming

## Push & pull with streaming
When you have a large amount of documents to push and pull, you could use the streaming function.
When you have a large amount of documents to push and pull, you can use the streaming function.
[`.push_stream()`][docarray.array.doc_list.pushpull.PushPullMixin.push_stream] and
[`.pull_stream()`][docarray.array.doc_list.pushpull.PushPullMixin.pull_stream] can help you to stream the
[`DocList`][docarray.DocList] in order to save the memory usage.
You set multiple `DocList` to pull from the same source as well.
The usage is the same as using streaming with local files.
Please refer to [Push & Pull with streaming with local files](store_file.md#push-pull-with-streaming).

[`.pull_stream()`][docarray.array.doc_list.pushpull.PushPullMixin.pull_stream] stream the
[`DocList`][docarray.DocList] to save memory usage.
You can set multiple `DocList` to pull from the same source as well.
The usage is the same as streaming with local files.
Please refer to [push and pull with streaming with local files](store_file.md#push-and-pull-with-streaming).

## Delete
To delete the store, you need to use the static method [`.delete()`][docarray.store.jac.JACDocStore.delete] of [`JACDocStore`][docarray.store.jac.JACDocStore] class.

To delete the store, you need to use the static method [`.delete()`][docarray.store.jac.JACDocStore.delete] of [`JACDocStore`][docarray.store.jac.JACDocStore] class:

```python
from docarray.store import JACDocStore

JACDocStore.delete(f'jac://{DL_NAME}')
```
```