docs: refactor getting started section #728
NicholasDunham wants to merge 3 commits into docarray:main from
Codecov Report
Base: 87.58% // Head: 62.13% // Decreases project coverage by 25.46%
Additional details and impacted files
@@ Coverage Diff @@
## main #728 +/- ##
===========================================
- Coverage 87.58% 62.13% -25.46%
===========================================
Files 133 133
Lines 6703 6703
===========================================
- Hits 5871 4165 -1706
- Misses 832 2538 +1706
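The figures in the table above can be cross-checked with a short sketch. It assumes coverage is computed as hits divided by lines and that percentages are rounded down (floored) at two decimals; that rounding rule is inferred from the numbers in this report, not taken from Codecov's documentation, and `pct_floor` is a hypothetical helper name.

```python
import math


def pct_floor(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, floored at two decimals.

    Flooring (rather than rounding to nearest) is an assumption inferred
    from the report's figures.
    """
    return math.floor(numerator * 10000 / denominator) / 100


lines = 6703                       # total lines, same on base and head
base_hits, head_hits = 5871, 4165  # covered lines on main and on #728

base_cov = pct_floor(base_hits, lines)
head_cov = pct_floor(head_hits, lines)
delta = pct_floor(head_hits - base_hits, lines)

# Misses are simply the lines that were not hit.
base_misses = lines - base_hits
head_misses = lines - head_hits

print(base_cov, head_cov, delta)   # 87.58 62.13 -25.46
print(base_misses, head_misses)    # 832 2538
```

Under these assumptions, the 1706 lines that moved from hits to misses account for the whole drop; the file and line counts are unchanged.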
Co-authored-by: Joan Fontanals <[email protected]>
Signed-off-by: Nicholas Dunham <[email protected]>
| (interaction-cloud)= | ||
| # Interaction with Jina AI Cloud | ||
|
| ```{important} |
@samsja I believe this note is not needed, right? They come when you install docarray with `pip install docarray`?
| da = DocumentArray.from_dataframe(df) | ||
| ``` | ||
|
| ## From/to cloud |
I think removing this section breaks some of our content.
In this PR I chose to keep it and document cloud support further in a separate section.
| @@ -0,0 +1,41 @@ | |||
| (interaction-cloud)= | |||
This is a duplicate.
We already added a cloud support section in this PR:
#697
Let's remove this file.
| # Construct | ||
|
| Initializing a Document object is super easy. This chapter introduces the ways of constructing empty Document, filled Document. One can also construct Document from bytes, JSON, Protobuf message as introduced {ref}`in the next chapter<serialize>`. | ||
| Initializing a Document object is super easy. This chapter introduces the ways of constructing empty Documents and filled Documents. One can also construct Documents from bytes, JSON, and Protobuf messages, as introduced {ref}`in the next chapter<serialize>`. |
| Initializing a Document object is super easy. This chapter introduces the ways of constructing empty Documents and filled Documents. One can also construct Documents from bytes, JSON, and Protobuf messages, as introduced {ref}`in the next chapter<serialize>`. | |
| This section introduces the ways of constructing empty Documents and filled Documents. One can also construct Documents from bytes, JSON, and Protobuf messages, as introduced {ref}`in the next chapter<serialize>`. |
We should avoid this language; there is always something that will not be easy for some users.
| da = DocumentArray.pull('myda123', show_progress=True) | ||
| ``` | ||
|
| Now you can continue your work locally, analyzing `da` or visualizing it. Your friends & colleagues who know the token `myda123` can also pull that DocumentArray. It's useful when you want to quickly share the results with your colleagues & friends. |
| Now you can continue your work locally, analyzing `da` or visualizing it. Your friends & colleagues who know the token `myda123` can also pull that DocumentArray. It's useful when you want to quickly share the results with your colleagues & friends. | |
| Now you can continue your work locally, analyzing `da` or visualizing it. Your friends and colleagues who know the token `myda123` can also pull that DocumentArray. It's useful when you want to quickly share the results with your colleagues and friends. |
| from docarray import DocumentArray | ||
|
| da = DocumentArray(...) # heavy lifting, processing, GPU tasks... | ||
| da.push('myda123', show_progress=True) |
Don't we require login for this now?
| @@ -358,43 +357,3 @@ To build a DocumentArray from dataframe, | |||
| df = ... | |||
| da = DocumentArray.from_dataframe(df) | |||
| ``` | |||
As I said, removing this section breaks content that relies on the push/pull usage briefly discussed here.
We also have a cloud support section, so this content should not be moved to docs/fundamentals/documentarray/interaction-cloud.md
| ``` | |
| ``` | |
| ## From/to cloud | |
| ```{important} | |
| This feature requires `rich` and `requests` dependency. You can do `pip install "docarray[full]"` to install it. | |
| ``` | |
| {meth}`~docarray.array.mixins.io.pushpull.PushPullMixin.push` and {meth}`~docarray.array.mixins.io.pushpull.PushPullMixin.pull` allows you to serialize a DocumentArray object to Jina Cloud and share it across machines. | |
| Considering you are working on a GPU machine via Google Colab/Jupyter. After preprocessing and embedding, you got everything you need in a DocumentArray. You can easily store it to the cloud via: | |
| ```python | |
| from docarray import DocumentArray | |
| da = DocumentArray(...) # heavylifting, processing, GPU task, ... | |
| da.push('myda123', show_progress=True) | |
| ``` | |
| ```{figure} images/da-push.png | |
| ``` | |
| Then on your local laptop, simply pull it: | |
| ```python | |
| from docarray import DocumentArray | |
| da = DocumentArray.pull('myda123', show_progress=True) | |
| ``` | |
| Now you can continue the work at local, analyzing `da` or visualizing it. Your friends & colleagues who know the token `myda123` can also pull that DocumentArray. It's useful when you want to quickly share the results with your colleagues & friends. | |
| The maximum size of an upload is 4GB under the `protocol='protobuf'` and `compress='gzip'` setting. The lifetime of an upload is one week after its creation. | |
| To avoid unnecessary download when upstream DocumentArray is unchanged, you can add `DocumentArray.pull(..., local_cache=True)`. | |
| ```{seealso} | |
| DocArray allows pushing, pulling, and managing your DocumentArrays in Jina AI Cloud. | |
| Read more about how to manage your data in Jina AI Cloud, using either the console or the DocArray Python API, in the | |
| {ref}`Data Management section <data-management>`. | |
| ``` | |
| matching | ||
| subindex | ||
| evaluation | ||
| interaction-cloud |
We already have a cloud-support section.
| interaction-cloud |
|
The part about the integration with Jina Cloud should have more concrete examples. Some examples of creating a docarray from PDF, image, and video files would be nice. I know it's explained in the other sections, but we need easy onboarding for newcomers who don't need to know the technical details.
Goals: