From ce3720085643a6c14e430dd009cdfd4c1a17a1b1 Mon Sep 17 00:00:00 2001
From: samsja <sami.jaghouar@hotmail.fr>
Date: Wed, 29 Mar 2023 15:03:16 +0200
Subject: [PATCH 01/11] docs: add user guide

Signed-off-by: samsja <sami.jaghouar@hotmail.fr>
---
 docs/user_guide/first_step.md | 71 +++++++++++++++++++++++++++++++++++
 docs/user_guide/intro.md      | 43 +++++++++++++++++++++
 2 files changed, 114 insertions(+)

diff --git a/docs/user_guide/first_step.md b/docs/user_guide/first_step.md
index 0671e3a096a..f4850f11489 100644
--- a/docs/user_guide/first_step.md
+++ b/docs/user_guide/first_step.md
@@ -1 +1,72 @@
 # First Step : BaseDoc
+
+At the heart of `DocArray` lies the concept of [`BaseDoc`][docarray.base_doc.doc.BaseDoc].
+
+A [BaseDoc][docarray.base_doc.doc.BaseDoc] is very similar to [Pydantic](https://docs.pydantic.dev/)
+[`BaseModel`](https://docs.pydantic.dev/usage/models). It allows to define custom `Document` schema (or `Model` in
+the Pydantic world) to represent your data.
+
+## Basic `Doc` usage.
+
+Before going in detail about what we can do with [BaseDoc][docarray.base_doc.doc.BaseDoc] and how to use it, let's
+take a look at how it looks like in practice.
+
+The following python code will define a `BannerDoc` class that will be used to represent banner data.
+
+```python
+from docarray import BaseDoc
+from docarray.typing import ImageUrl
+
+
+class BannerDoc(BaseDoc):
+    image_url: ImageUrl
+    title: str
+    description: str
+```
+
+you can then instantiate a `BannerDoc` object and access its attributes.
+
+```python
+banner = BannerDoc(
+    image_url="https://example.com/image.png",
+    title="Hello World",
+    description="This is a banner",
+)
+
+assert banner.image_url == "https://example.com/image.png"
+assert banner.title == "Hello World"
+assert banner.description == "This is a banner"
+```
+
+## `BaseDoc` allows to represent MultiModal and nested Data.
+
+more complex example
+
+
+## `BaseDoc` is a Pydantic `BaseModel`
+
+The class [BaseDoc][docarray.base_doc.doc.BaseDoc] inherits from pydantic [BaseModel](https://docs.pydantic.dev/usage/models) from Pydantic. So you can use
+all the features of `BaseModel` in your `Doc` class. 
+
+This namely means that `BaseDoc`:
+
+* Will perform data validation: `BaseDoc` will check that the data you pass to it is valid. If not, it will raise an
+  error. Data being "valid" is actually define by the type use in the docstring itself, but we will come back on this concept later (TODO add typing section)
+
+* Can be configured using a nested `Config` class, see pydantic [documentation](https://docs.pydantic.dev/usage/model_config/) for more details on what kind of config Pydantic offer.
+
+* Can be used as a drop in replacement for `BaseModel` in your code and is compatible with tools using Pydantic like [FastAPI]('https://fastapi.tiangolo.com/').
+
+
+###  What is the difference with Pydantic `BaseModel`? (INCOMPLETE)
+
+here maybe need the link to the versus section
+
+[BaseDoc][docarray.base_doc.doc.BaseDoc] is not only a [BaseModel](https://docs.pydantic.dev/usage/models), 
+
+* it allows to be used with DocArray [Typed](docarray.typing) that are oriented toward MultiModal (image, audio, ...) data and for 
+Machine Learning use case TODO link the type section. 
+
+Another tiny difference is that [BaseDoc][docarray.base_doc.doc.BaseDoc] has a generated by default `id` field that is used to uniquely identify a document.
+
+
diff --git a/docs/user_guide/intro.md b/docs/user_guide/intro.md
index c500c92629f..cb51945c154 100644
--- a/docs/user_guide/intro.md
+++ b/docs/user_guide/intro.md
@@ -1 +1,44 @@
 # User Guide - Intro
+
+This user guide show you how to use `DocArray` with most of its features, step by step.
+
+You wil first need to install `DocArray` in you python environment. 
+## Install DocArray
+
+To install `DocArray` to follow this user guide, you can use the following command:
+
+```console
+pip install "docarray[full]"
+```
+
+This will install the main dependencies of `DocArray` and will work will all the modalities supported.
+
+
+!!! note 
+    To install a very light version of `DocArray` with only the core dependencies, you can use the following command:
+    ```
+    pip install "docarray"
+    ``` 
+    
+    If you want to install user protobuf with the minimal dependencies you can do
+
+    ```
+    pip install "docarray[common]"
+    ``` 
+
+Depending on your usage you might want to only use `DocArray` with only a couple of specific modalities. 
+For instance lets say you only want to work with images, you can do install `DocArray` using the following command:
+
+```
+pip install "docarray[image]"
+```
+
+or with image and audio
+
+
+```
+pip install "docarray[image, audio]"
+```
+
+!!! warning 
+    This way of installing `DocArray` is only valid starting with version `0.30`
\ No newline at end of file

From 7da84f4f51f762f7dad79ccafa1b46900e063085 Mon Sep 17 00:00:00 2001
From: samsja <sami.jaghouar@hotmail.fr>
Date: Wed, 29 Mar 2023 15:55:43 +0200
Subject: [PATCH 02/11] docs: add base docs docs

Signed-off-by: samsja <sami.jaghouar@hotmail.fr>
---
 docs/user_guide/first_step.md | 89 +++++++++++++++++++++++++++++++++--
 1 file changed, 84 insertions(+), 5 deletions(-)

diff --git a/docs/user_guide/first_step.md b/docs/user_guide/first_step.md
index f4850f11489..bc66baff83b 100644
--- a/docs/user_guide/first_step.md
+++ b/docs/user_guide/first_step.md
@@ -9,7 +9,7 @@ the Pydantic world) to represent your data.
 ## Basic `Doc` usage.
 
 Before going in detail about what we can do with [BaseDoc][docarray.base_doc.doc.BaseDoc] and how to use it, let's
-take a look at how it looks like in practice.
+see how it looks like in practice.
 
 The following python code will define a `BannerDoc` class that will be used to represent banner data.
 
@@ -38,9 +38,7 @@ assert banner.title == "Hello World"
 assert banner.description == "This is a banner"
 ```
 
-## `BaseDoc` allows to represent MultiModal and nested Data.
 
-more complex example
 
 
 ## `BaseDoc` is a Pydantic `BaseModel`
@@ -60,13 +58,94 @@ This namely means that `BaseDoc`:
 
 ###  What is the difference with Pydantic `BaseModel`? (INCOMPLETE)
 
-here maybe need the link to the versus section
+LINK TO THE VERSUS (not ready)
 
 [BaseDoc][docarray.base_doc.doc.BaseDoc] is not only a [BaseModel](https://docs.pydantic.dev/usage/models), 
 
 * it allows to be used with DocArray [Typed](docarray.typing) that are oriented toward MultiModal (image, audio, ...) data and for 
 Machine Learning use case TODO link the type section. 
 
-Another tiny difference is that [BaseDoc][docarray.base_doc.doc.BaseDoc] has a generated by default `id` field that is used to uniquely identify a document.
+Another difference is that [BaseDoc][docarray.base_doc.doc.BaseDoc] has a generated by default `id` field that is used to uniquely identify a document.
+
+
+
+## `BaseDoc` allows to represent MultiModal and nested Data.
+
+Let's say you want to represent a Youtube video in your application. Maybe to build a search system of Youtube video.
+A Youtube video is not only composed of a video, but it also has a title, a description, a thumbnail (and more but let's keep it simple).
+
+All of these elements are from different `modalities` LINK TO MODALITIES SECTION (not ready), title and description are text, the thumbnail is an image, and the video in itself is, well, a video.
+
+DocArray allows to represent all of this Multi Modal data in a single object. 
+
+Let's first create an `BaseDoc` for each of elements of that compose the Youtube video.
+
+First for the thumbnail which is an image
+```python
+from docarray import BaseDoc
+from docarray.typing import ImageUrl, ImageBytes
+
+
+class ImageDoc(BaseDoc):
+    url: ImageUrl
+    bytes: ImageBytes = (
+        None  # bytes are not always loaded in memory, so we make it optional
+    )
+```
+
+Then for the video which is a video
+```python
+from docarray import BaseDoc
+from docarray.typing import VideoUrl, VideoBytes
+
+
+class ImageDoc(BaseDoc):
+    url: VideoUrl
+    bytes: VideoBytes = (
+        None  # bytes are not always loaded in memory, so we make it optional
+    )
+``` 
+
+
+Then for the title and description which are text we will just use a `str` type.
+
+All the elements that compose a Youtube video are ready:
+
+```python
+from docarray import BaseDoc
+
+
+class YoutubeVideoDoc(BaseDoc):
+    title: str
+    description: str
+    thumbnail: ImageDoc
+    video: VideoDoc
+```
+
+
+You now hava `YoutubeVideoDoc` that is a pythonic representation of a Youtube video. 
+
+This representation can now be used to send (LINK) or to store (LINK) data. You can even use it directly to [train a machine learning](../how_to/multimodal_training_and_serving.md) [Pytorch](https://pytorch.org/docs/stable/index.html) model on this representation. 
+
+
+!!! note
+
+    You see here that `ImageDoc` and `VideoDoc` are as well [BaseDoc][docarray.base_doc.doc.BaseDoc] that is later use inside another [BaseDoc][docarray.base_doc.doc.BaseDoc]`.
+    This is what we call nested data representation. 
+
+    [BaseDoc][docarray.base_doc.doc.BaseDoc] can be nested to represent any kind of data hierarchy.
+  
+  
+
+
+See also:
+
+* [BaseDoc][docarray.base_doc.doc.BaseDoc] API Reference
+* DOCUMENT_ARARY REF
+* DOCUMENT INDEX REF
+* DOCUMENT STORE REF
+* ...
+
 
 
+See also
\ No newline at end of file

From 58679105c0bc7f26f3f21831542d0395bf305b3e Mon Sep 17 00:00:00 2001
From: samsja <sami.jaghouar@hotmail.fr>
Date: Wed, 29 Mar 2023 16:05:55 +0200
Subject: [PATCH 03/11] docs: add base docs docs

Signed-off-by: samsja <sami.jaghouar@hotmail.fr>
---
 docs/user_guide/intro.md                         | 10 +++++++++-
 docs/user_guide/{ => representing}/first_step.md |  4 ++--
 docs/user_guide/sending/first_step.md            |  1 +
 docs/user_guide/storing/first_step.md            |  1 +
 mkdocs.yml                                       |  4 +++-
 5 files changed, 16 insertions(+), 4 deletions(-)
 rename docs/user_guide/{ => representing}/first_step.md (96%)
 create mode 100644 docs/user_guide/sending/first_step.md
 create mode 100644 docs/user_guide/storing/first_step.md

diff --git a/docs/user_guide/intro.md b/docs/user_guide/intro.md
index cb51945c154..edd59f9663a 100644
--- a/docs/user_guide/intro.md
+++ b/docs/user_guide/intro.md
@@ -1,6 +1,14 @@
 # User Guide - Intro
 
-This user guide show you how to use `DocArray` with most of its features, step by step.
+This user guide show you how to use `DocArray` with most of its features.
+
+They are three main section:
+
+- [Representing Data](representing/first_step.md): This section will show you how to use `DocArray` to represent your data.
+- [Sending Data](sending/first_step.md): This section will show you how to use `DocArray` to send your data.
+- [Storing Data](storing/first_step.md): This section will show you how to use `DocArray` to store your data.
+
+You should first start by reading the [Representing Data](representing/first_step.md) section and both the [Sending Data](sending/first_step.md) and [Storing Data](storing/first_step.md) section can be read in any order.
 
 You wil first need to install `DocArray` in you python environment. 
 ## Install DocArray
diff --git a/docs/user_guide/first_step.md b/docs/user_guide/representing/first_step.md
similarity index 96%
rename from docs/user_guide/first_step.md
rename to docs/user_guide/representing/first_step.md
index bc66baff83b..8fa16f6a0fc 100644
--- a/docs/user_guide/first_step.md
+++ b/docs/user_guide/representing/first_step.md
@@ -1,4 +1,4 @@
-# First Step : BaseDoc
+# Representing
 
 At the heart of `DocArray` lies the concept of [`BaseDoc`][docarray.base_doc.doc.BaseDoc].
 
@@ -125,7 +125,7 @@ class YoutubeVideoDoc(BaseDoc):
 
 You now hava `YoutubeVideoDoc` that is a pythonic representation of a Youtube video. 
 
-This representation can now be used to send (LINK) or to store (LINK) data. You can even use it directly to [train a machine learning](../how_to/multimodal_training_and_serving.md) [Pytorch](https://pytorch.org/docs/stable/index.html) model on this representation. 
+This representation can now be used to send (LINK) or to store (LINK) data. You can even use it directly to [train a machine learning](../../how_to/multimodal_training_and_serving.md) [Pytorch](https://pytorch.org/docs/stable/index.html) model on this representation. 
 
 
 !!! note
diff --git a/docs/user_guide/sending/first_step.md b/docs/user_guide/sending/first_step.md
new file mode 100644
index 00000000000..a18433535b9
--- /dev/null
+++ b/docs/user_guide/sending/first_step.md
@@ -0,0 +1 @@
+# Sending
diff --git a/docs/user_guide/storing/first_step.md b/docs/user_guide/storing/first_step.md
new file mode 100644
index 00000000000..5be8b39165b
--- /dev/null
+++ b/docs/user_guide/storing/first_step.md
@@ -0,0 +1 @@
+# Storing
diff --git a/mkdocs.yml b/mkdocs.yml
index e7749bc2874..9e4209520ef 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -74,7 +74,9 @@ nav:
   - Home: README.md
   - Tutorial - User Guide:
     - user_guide/intro.md
-    - user_guide/first_step.md
+    - user_guide/representing/first_step.md
+    - user_guide/sending/first_step.md
+    - user_guide/storing/first_step.md
 
   - How-to:
     - how_to/add_doc_index.md

From 835ea7e87e59057541ae622f9a01b6c08a48f419 Mon Sep 17 00:00:00 2001
From: samsja <sami.jaghouar@hotmail.fr>
Date: Thu, 30 Mar 2023 11:11:06 +0200
Subject: [PATCH 04/11] fix: apply grammarly on first step

Signed-off-by: samsja <sami.jaghouar@hotmail.fr>
---
 docs/user_guide/representing/first_step.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/user_guide/representing/first_step.md b/docs/user_guide/representing/first_step.md
index 8fa16f6a0fc..2a93cec6032 100644
--- a/docs/user_guide/representing/first_step.md
+++ b/docs/user_guide/representing/first_step.md
@@ -49,11 +49,11 @@ all the features of `BaseModel` in your `Doc` class.
 This namely means that `BaseDoc`:
 
 * Will perform data validation: `BaseDoc` will check that the data you pass to it is valid. If not, it will raise an
-  error. Data being "valid" is actually define by the type use in the docstring itself, but we will come back on this concept later (TODO add typing section)
+  error. Data being "valid"  is actually defined by the type used in the docstring itself, but we will come back to this concept later (TODO add typing section)
 
 * Can be configured using a nested `Config` class, see pydantic [documentation](https://docs.pydantic.dev/usage/model_config/) for more details on what kind of config Pydantic offer.
 
-* Can be used as a drop in replacement for `BaseModel` in your code and is compatible with tools using Pydantic like [FastAPI]('https://fastapi.tiangolo.com/').
+* Can be used as a drop-in replacement for `BaseModel` in your code and is compatible with tools using Pydantic like [FastAPI]('https://fastapi.tiangolo.com/').
 
 
 ###  What is the difference with Pydantic `BaseModel`? (INCOMPLETE)
@@ -71,14 +71,14 @@ Another difference is that [BaseDoc][docarray.base_doc.doc.BaseDoc] has a genera
 
 ## `BaseDoc` allows to represent MultiModal and nested Data.
 
-Let's say you want to represent a Youtube video in your application. Maybe to build a search system of Youtube video.
+Let's say you want to represent a Youtube video in your application. Maybe to build a search system for Youtube video.
 A Youtube video is not only composed of a video, but it also has a title, a description, a thumbnail (and more but let's keep it simple).
 
 All of these elements are from different `modalities` LINK TO MODALITIES SECTION (not ready), title and description are text, the thumbnail is an image, and the video in itself is, well, a video.
 
 DocArray allows to represent all of this Multi Modal data in a single object. 
 
-Let's first create an `BaseDoc` for each of elements of that compose the Youtube video.
+Let's first create an `BaseDoc` for each of the elements that compose the Youtube video.
 
 First for the thumbnail which is an image
 ```python
@@ -123,14 +123,14 @@ class YoutubeVideoDoc(BaseDoc):
 ```
 
 
-You now hava `YoutubeVideoDoc` that is a pythonic representation of a Youtube video. 
+You now have `YoutubeVideoDoc` which is a pythonic representation of a Youtube video. 
 
 This representation can now be used to send (LINK) or to store (LINK) data. You can even use it directly to [train a machine learning](../../how_to/multimodal_training_and_serving.md) [Pytorch](https://pytorch.org/docs/stable/index.html) model on this representation. 
 
 
 !!! note
 
-    You see here that `ImageDoc` and `VideoDoc` are as well [BaseDoc][docarray.base_doc.doc.BaseDoc] that is later use inside another [BaseDoc][docarray.base_doc.doc.BaseDoc]`.
+    You see here that `ImageDoc` and `VideoDoc` are as well [BaseDoc][docarray.base_doc.doc.BaseDoc] that is later used inside another [BaseDoc][docarray.base_doc.doc.BaseDoc]`.
     This is what we call nested data representation. 
 
     [BaseDoc][docarray.base_doc.doc.BaseDoc] can be nested to represent any kind of data hierarchy.

From 4dd920748099e26530f567cdab3a7fcf6de779b6 Mon Sep 17 00:00:00 2001
From: samsja <sami.jaghouar@hotmail.fr>
Date: Thu, 30 Mar 2023 11:12:14 +0200
Subject: [PATCH 05/11] fix: apply grammarly on install

Signed-off-by: samsja <sami.jaghouar@hotmail.fr>
---
 docs/user_guide/intro.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/user_guide/intro.md b/docs/user_guide/intro.md
index edd59f9663a..146902dbcc8 100644
--- a/docs/user_guide/intro.md
+++ b/docs/user_guide/intro.md
@@ -1,14 +1,14 @@
 # User Guide - Intro
 
-This user guide show you how to use `DocArray` with most of its features.
+This user guide shows you how to use `DocArray` with most of its features.
 
-They are three main section:
+They are three main sections:
 
 - [Representing Data](representing/first_step.md): This section will show you how to use `DocArray` to represent your data.
 - [Sending Data](sending/first_step.md): This section will show you how to use `DocArray` to send your data.
 - [Storing Data](storing/first_step.md): This section will show you how to use `DocArray` to store your data.
 
-You should first start by reading the [Representing Data](representing/first_step.md) section and both the [Sending Data](sending/first_step.md) and [Storing Data](storing/first_step.md) section can be read in any order.
+You should first start by reading the [Representing Data](representing/first_step.md) section and both the [Sending Data](sending/first_step.md) and [Storing Data](storing/first_step.md) sections can be read in any order.
 
 You wil first need to install `DocArray` in you python environment. 
 ## Install DocArray
@@ -28,14 +28,14 @@ This will install the main dependencies of `DocArray` and will work will all the
     pip install "docarray"
     ``` 
     
-    If you want to install user protobuf with the minimal dependencies you can do
+    If you want to install user protobuf with minimal dependencies you can do
 
     ```
     pip install "docarray[common]"
     ``` 
 
 Depending on your usage you might want to only use `DocArray` with only a couple of specific modalities. 
-For instance lets say you only want to work with images, you can do install `DocArray` using the following command:
+For instance let's say you only want to work with images, you can install `DocArray` using the following command:
 
 ```
 pip install "docarray[image]"

From bfe447018d2985edea30b76481bc4bd1915eb7d8 Mon Sep 17 00:00:00 2001
From: samsja <sami.jaghouar@hotmail.fr>
Date: Fri, 31 Mar 2023 11:19:13 +0200
Subject: [PATCH 06/11] fix: apply johannes suggestion

Signed-off-by: samsja <sami.jaghouar@hotmail.fr>
---
 README.md                | 4 ++--
 docs/user_guide/intro.md | 4 ++--
 pyproject.toml           | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 52826b5cd8b..8d4b45ae264 100644
--- a/README.md
+++ b/README.md
@@ -482,13 +482,13 @@ INFO - docarray - HnswDocumentIndex[SimpleDoc] has been initialized
 To try out the alpha you can install it via git:
 
 ```shell
-pip install "git+https://github.com/docarray/docarray@2023.01.18.alpha#egg=docarray[common,torch,image]"
+pip install "git+https://github.com/docarray/docarray@2023.01.18.alpha#egg=docarray[proto,torch,image]"
 ```
 
 ...or from the latest development branch
 
 ```shell
-pip install "git+https://github.com/docarray/docarray@feat-rewrite-v2#egg=docarray[common,torch,image]"
+pip install "git+https://github.com/docarray/docarray@feat-rewrite-v2#egg=docarray[proto,torch,image]"
 ```
 
 ## See also
diff --git a/docs/user_guide/intro.md b/docs/user_guide/intro.md
index 146902dbcc8..bf3e14c1cba 100644
--- a/docs/user_guide/intro.md
+++ b/docs/user_guide/intro.md
@@ -28,10 +28,10 @@ This will install the main dependencies of `DocArray` and will work will all the
     pip install "docarray"
     ``` 
     
-    If you want to install user protobuf with minimal dependencies you can do
+    If you want to use protobuf and DocArray you can do
 
     ```
-    pip install "docarray[common]"
+    pip install "docarray[proto]"
     ``` 
 
 Depending on your usage you might want to only use `DocArray` with only a couple of specific modalities. 
diff --git a/pyproject.toml b/pyproject.toml
index 3114ff8dc61..6982b351b47 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -29,7 +29,7 @@ smart-open = {version = ">=6.3.0", extras = ["s3"], optional = true}
 jina-hubble-sdk = {version = ">=0.34.0", optional = true}
 
 [tool.poetry.extras]
-common = ["protobuf", "lz4"]
+proto = ["protobuf", "lz4"]
 pandas = ["pandas"]
 image = ["pillow", "types-pillow"]
 video = ["av"]

From 2a3702149bb099f520e8a5d5d43b333e246bbf9d Mon Sep 17 00:00:00 2001
From: samsja <sami.jaghouar@hotmail.fr>
Date: Fri, 31 Mar 2023 11:25:56 +0200
Subject: [PATCH 07/11] fix: poetry lock

Signed-off-by: samsja <sami.jaghouar@hotmail.fr>
---
 poetry.lock | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/poetry.lock b/poetry.lock
index a9bc680af7f..12dd4370927 100644
--- a/poetry.lock
+++ b/poetry.lock
@@ -4590,7 +4590,6 @@ testing = ["flake8 (<5)", "func-timeout", "jaraco.functools", "jaraco.itertools"
 [extras]
 audio = ["pydub"]
 aws = ["smart-open"]
-common = ["protobuf", "lz4"]
 elasticsearch = ["elasticsearch"]
 full = ["protobuf", "lz4", "pandas", "pillow", "types-pillow", "av", "pydub", "trimesh"]
 hnswlib = ["hnswlib"]
@@ -4598,6 +4597,7 @@ image = ["pillow", "types-pillow"]
 jac = ["jina-hubble-sdk"]
 mesh = ["trimesh"]
 pandas = ["pandas"]
+proto = ["protobuf", "lz4"]
 torch = ["torch"]
 video = ["av"]
 web = ["fastapi"]
@@ -4605,4 +4605,4 @@ web = ["fastapi"]
 [metadata]
 lock-version = "2.0"
 python-versions = ">=3.7,<4.0"
-content-hash = "821f6cd00f78c456f6146f39c14f0704e4f2d113c35db00c58462d8cfbe3a538"
+content-hash = "dd56d7cfa5b6758063baba58a5259f06535e0f425366360d042836aa479eab15"

From b27e09a4c26d53ae33de9a4efa92aeab20327b89 Mon Sep 17 00:00:00 2001
From: samsja <55492238+samsja@users.noreply.github.com>
Date: Fri, 31 Mar 2023 11:50:53 +0200
Subject: [PATCH 08/11] feat: apply johannes suggestion

Co-authored-by: Johannes Messner <44071807+JohannesMessner@users.noreply.github.com>
Signed-off-by: samsja <55492238+samsja@users.noreply.github.com>
---
 docs/user_guide/intro.md                   | 16 +++++++--------
 docs/user_guide/representing/first_step.md | 24 +++++++++++-----------
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/docs/user_guide/intro.md b/docs/user_guide/intro.md
index bf3e14c1cba..9597bcd686d 100644
--- a/docs/user_guide/intro.md
+++ b/docs/user_guide/intro.md
@@ -2,15 +2,15 @@
 
 This user guide shows you how to use `DocArray` with most of its features.
 
-They are three main sections:
+There are three main sections:
 
-- [Representing Data](representing/first_step.md): This section will show you how to use `DocArray` to represent your data.
-- [Sending Data](sending/first_step.md): This section will show you how to use `DocArray` to send your data.
-- [Storing Data](storing/first_step.md): This section will show you how to use `DocArray` to store your data.
+- [Representing Data](representing/first_step.md): This section will show you how to use `DocArray` to represent your data. This is a great starting point if you want to better organize the data in your ML models, or if you are looking for a "pydantic for ML".
+- [Sending Data](sending/first_step.md): This section will show you how to use `DocArray` to send your data. This is a great starting point if you want to serve your ML model, for example through FastAPI.
+- [Storing Data](storing/first_step.md): This section will show you how to use `DocArray` to store your data. This is a great starting point if you are looking for an "ORM for vector databases".
 
 You should first start by reading the [Representing Data](representing/first_step.md) section and both the [Sending Data](sending/first_step.md) and [Storing Data](storing/first_step.md) sections can be read in any order.
 
-You wil first need to install `DocArray` in you python environment. 
+You will first need to install `DocArray` in your Python environment. 
 ## Install DocArray
 
 To install `DocArray` to follow this user guide, you can use the following command:
@@ -19,7 +19,7 @@ To install `DocArray` to follow this user guide, you can use the following comma
 pip install "docarray[full]"
 ```
 
-This will install the main dependencies of `DocArray` and will work will all the modalities supported.
+This will install the main dependencies of `DocArray` and will work will all the supported data modalities.
 
 
 !!! note 
@@ -34,8 +34,8 @@ This will install the main dependencies of `DocArray` and will work will all the
     pip install "docarray[proto]"
     ``` 
 
-Depending on your usage you might want to only use `DocArray` with only a couple of specific modalities. 
-For instance let's say you only want to work with images, you can install `DocArray` using the following command:
+Depending on your usage you might want to use `DocArray` with only a couple of specific modalities and their dependencies. 
+For instance, let's say you only want to work with images, you can install `DocArray` using the following command:
 
 ```
 pip install "docarray[image]"
diff --git a/docs/user_guide/representing/first_step.md b/docs/user_guide/representing/first_step.md
index 2a93cec6032..c65e66fa976 100644
--- a/docs/user_guide/representing/first_step.md
+++ b/docs/user_guide/representing/first_step.md
@@ -2,16 +2,16 @@
 
 At the heart of `DocArray` lies the concept of [`BaseDoc`][docarray.base_doc.doc.BaseDoc].
 
-A [BaseDoc][docarray.base_doc.doc.BaseDoc] is very similar to [Pydantic](https://docs.pydantic.dev/)
-[`BaseModel`](https://docs.pydantic.dev/usage/models). It allows to define custom `Document` schema (or `Model` in
+A [BaseDoc][docarray.base_doc.doc.BaseDoc] is very similar to a [Pydantic](https://docs.pydantic.dev/)
+[`BaseModel`](https://docs.pydantic.dev/usage/models) - in fact it _is_ a specialized Pydantic `BaseModel`. It allows you to define custom `Document` schemas (or `Model` in
 the Pydantic world) to represent your data.
 
 ## Basic `Doc` usage.
 
 Before going in detail about what we can do with [BaseDoc][docarray.base_doc.doc.BaseDoc] and how to use it, let's
-see how it looks like in practice.
+see what it looks like in practice.
 
-The following python code will define a `BannerDoc` class that will be used to represent banner data.
+The following Python code defines a `BannerDoc` class that can be used to represent the data of a website banner.
 
 ```python
 from docarray import BaseDoc
@@ -24,7 +24,7 @@ class BannerDoc(BaseDoc):
     description: str
 ```
 
-you can then instantiate a `BannerDoc` object and access its attributes.
+You can then instantiate a `BannerDoc` object and access its attributes.
 
 ```python
 banner = BannerDoc(
@@ -43,13 +43,13 @@ assert banner.description == "This is a banner"
 
 ## `BaseDoc` is a Pydantic `BaseModel`
 
-The class [BaseDoc][docarray.base_doc.doc.BaseDoc] inherits from pydantic [BaseModel](https://docs.pydantic.dev/usage/models) from Pydantic. So you can use
+The class [BaseDoc][docarray.base_doc.doc.BaseDoc] inherits from pydantic [BaseModel](https://docs.pydantic.dev/usage/models). So you can use
 all the features of `BaseModel` in your `Doc` class. 
 
 This namely means that `BaseDoc`:
 
 * Will perform data validation: `BaseDoc` will check that the data you pass to it is valid. If not, it will raise an
-  error. Data being "valid"  is actually defined by the type used in the docstring itself, but we will come back to this concept later (TODO add typing section)
+  error. Data being "valid"  is actually defined by the type used in the type hint itself, but we will come back to this concept later (TODO add typing section)
 
 * Can be configured using a nested `Config` class, see pydantic [documentation](https://docs.pydantic.dev/usage/model_config/) for more details on what kind of config Pydantic offer.
 
@@ -69,14 +69,14 @@ Another difference is that [BaseDoc][docarray.base_doc.doc.BaseDoc] has a genera
 
 
 
-## `BaseDoc` allows to represent MultiModal and nested Data.
+## `BaseDoc` allows to represent multimodal and nested data.
 
-Let's say you want to represent a Youtube video in your application. Maybe to build a search system for Youtube video.
-A Youtube video is not only composed of a video, but it also has a title, a description, a thumbnail (and more but let's keep it simple).
+Let's say you want to represent a Youtube video in your application, perhaps to build a search system for Youtube videos.
+A Youtube video is not only composed of a video, but it also has a title, a description, a thumbnail (and more, but let's keep it simple).
 
-All of these elements are from different `modalities` LINK TO MODALITIES SECTION (not ready), title and description are text, the thumbnail is an image, and the video in itself is, well, a video.
+All of these elements are from different `modalities` LINK TO MODALITIES SECTION (not ready): title and description are text, the thumbnail is an image, and the video in itself is, well, a video.
 
-DocArray allows to represent all of this Multi Modal data in a single object. 
+DocArray allows to represent all of this multimodal data in a single object. 
 
 Let's first create an `BaseDoc` for each of the elements that compose the Youtube video.
 

From 1a66cff7007689423ad9a32e5a6bbd57def168ea Mon Sep 17 00:00:00 2001
From: samsja <sami.jaghouar@hotmail.fr>
Date: Fri, 31 Mar 2023 12:27:06 +0200
Subject: [PATCH 09/11] fix: fix name

Signed-off-by: samsja <sami.jaghouar@hotmail.fr>
---
 docs/user_guide/representing/first_step.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/user_guide/representing/first_step.md b/docs/user_guide/representing/first_step.md
index c65e66fa976..8de8ac00051 100644
--- a/docs/user_guide/representing/first_step.md
+++ b/docs/user_guide/representing/first_step.md
@@ -99,7 +99,7 @@ from docarray import BaseDoc
 from docarray.typing import VideoUrl, VideoBytes
 
 
-class ImageDoc(BaseDoc):
+class VideoDoc(BaseDoc):
     url: VideoUrl
     bytes: VideoBytes = (
         None  # bytes are not always loaded in memory, so we make it optional

From 09a75e6752d5441226dea96684990baa1b8d407b Mon Sep 17 00:00:00 2001
From: Alex C-G <alexcg@outlook.com>
Date: Fri, 31 Mar 2023 12:31:08 +0200
Subject: [PATCH 10/11] docs: tidy up wording

Signed-off-by: Alex C-G <alexcg@outlook.com>
---
 docs/user_guide/intro.md                   | 13 ++--
 docs/user_guide/representing/first_step.md | 76 +++++++++-------------
 2 files changed, 36 insertions(+), 53 deletions(-)

diff --git a/docs/user_guide/intro.md b/docs/user_guide/intro.md
index 9597bcd686d..084805bddd2 100644
--- a/docs/user_guide/intro.md
+++ b/docs/user_guide/intro.md
@@ -1,4 +1,4 @@
-# User Guide - Intro
+# User Guide - Introduction
 
 This user guide shows you how to use `DocArray` with most of its features.
 
@@ -8,9 +8,10 @@ There are three main sections:
 - [Sending Data](sending/first_step.md): This section will show you how to use `DocArray` to send your data. This is a great starting point if you want to serve your ML model, for example through FastAPI.
 - [Storing Data](storing/first_step.md): This section will show you how to use `DocArray` to store your data. This is a great starting point if you are looking for an "ORM for vector databases".
 
-You should first start by reading the [Representing Data](representing/first_step.md) section and both the [Sending Data](sending/first_step.md) and [Storing Data](storing/first_step.md) sections can be read in any order.
+You should start by reading the [Representing Data](representing/first_step.md) section; the [Sending Data](sending/first_step.md) and [Storing Data](storing/first_step.md) sections can then be read in any order.
 
 You will first need to install `DocArray` in your Python environment. 
+
 ## Install DocArray
 
 To install `DocArray` to follow this user guide, you can use the following command:
@@ -21,14 +22,13 @@ pip install "docarray[full]"
 
 This will install the main dependencies of `DocArray` and will work will all the supported data modalities.
 
-
 !!! note 
     To install a very light version of `DocArray` with only the core dependencies, you can use the following command:
     ```
     pip install "docarray"
     ``` 
     
-    If you want to use protobuf and DocArray you can do
+    If you want to use protobuf and DocArray you can run:
 
     ```
     pip install "docarray[proto]"
@@ -41,12 +41,11 @@ For instance, let's say you only want to work with images, you can install `DocA
 pip install "docarray[image]"
 ```
 
-or with image and audio
-
+...or with images and audio:
 
 ```
 pip install "docarray[image, audio]"
 ```
 
 !!! warning 
-    This way of installing `DocArray` is only valid starting with version `0.30`
\ No newline at end of file
+    This way of installing `DocArray` is only valid starting with version `0.30`
diff --git a/docs/user_guide/representing/first_step.md b/docs/user_guide/representing/first_step.md
index 8de8ac00051..c20b0dc553f 100644
--- a/docs/user_guide/representing/first_step.md
+++ b/docs/user_guide/representing/first_step.md
@@ -3,12 +3,12 @@
 At the heart of `DocArray` lies the concept of [`BaseDoc`][docarray.base_doc.doc.BaseDoc].
 
 A [BaseDoc][docarray.base_doc.doc.BaseDoc] is very similar to a [Pydantic](https://docs.pydantic.dev/)
-[`BaseModel`](https://docs.pydantic.dev/usage/models) - in fact it _is_ a specialized Pydantic `BaseModel`. It allows you to define custom `Document` schemas (or `Model` in
+[`BaseModel`](https://docs.pydantic.dev/usage/models) - in fact it _is_ a specialized Pydantic `BaseModel`. It allows you to define custom `Document` schemas (or `Model` in
 the Pydantic world) to represent your data.
 
 ## Basic `Doc` usage.
 
-Before going in detail about what we can do with [BaseDoc][docarray.base_doc.doc.BaseDoc] and how to use it, let's
+Before going into detail about what we can do with [BaseDoc][docarray.base_doc.doc.BaseDoc] and how to use it, let's
 see what it looks like in practice.
 
 The following Python code defines a `BannerDoc` class that can be used to represent the data of a website banner.
@@ -28,33 +28,27 @@ You can then instantiate a `BannerDoc` object and access its attributes.
 
 ```python
 banner = BannerDoc(
-    image_url="https://example.com/image.png",
-    title="Hello World",
-    description="This is a banner",
+    image_url='https://example.com/image.png',
+    title='Hello World',
+    description='This is a banner',
 )
 
-assert banner.image_url == "https://example.com/image.png"
-assert banner.title == "Hello World"
-assert banner.description == "This is a banner"
+assert banner.image_url == 'https://example.com/image.png'
+assert banner.title == 'Hello World'
+assert banner.description == 'This is a banner'
 ```
 
-
-
-
 ## `BaseDoc` is a Pydantic `BaseModel`
 
-The class [BaseDoc][docarray.base_doc.doc.BaseDoc] inherits from pydantic [BaseModel](https://docs.pydantic.dev/usage/models). So you can use
+The class [BaseDoc][docarray.base_doc.doc.BaseDoc] inherits from Pydantic [BaseModel](https://docs.pydantic.dev/usage/models). So you can use
 all the features of `BaseModel` in your `Doc` class. 
 
-This namely means that `BaseDoc`:
-
-* Will perform data validation: `BaseDoc` will check that the data you pass to it is valid. If not, it will raise an
-  error. Data being "valid"  is actually defined by the type used in the type hint itself, but we will come back to this concept later (TODO add typing section)
-
-* Can be configured using a nested `Config` class, see pydantic [documentation](https://docs.pydantic.dev/usage/model_config/) for more details on what kind of config Pydantic offer.
-
-* Can be used as a drop-in replacement for `BaseModel` in your code and is compatible with tools using Pydantic like [FastAPI]('https://fastapi.tiangolo.com/').
+This means that `BaseDoc`:
 
+* Will perform data validation: `BaseDoc` will check that the data you pass to it is valid. If not, it will raise an 
+error. Data being "valid" is actually defined by the type used in the type hint itself, but we will come back to this concept later. (TODO add typing section)
+* Can be configured using a nested `Config` class; see the Pydantic [documentation](https://docs.pydantic.dev/usage/model_config/) for more detail on what kinds of config Pydantic offers.
+* Can be used as a drop-in replacement for `BaseModel` in your code and is compatible with tools that use Pydantic, like [FastAPI](https://fastapi.tiangolo.com/).
 
 ###  What is the difference with Pydantic `BaseModel`? (INCOMPLETE)
 
@@ -62,25 +56,24 @@ LINK TO THE VERSUS (not ready)
 
 [BaseDoc][docarray.base_doc.doc.BaseDoc] is not only a [BaseModel](https://docs.pydantic.dev/usage/models), 
 
-* it allows to be used with DocArray [Typed](docarray.typing) that are oriented toward MultiModal (image, audio, ...) data and for 
+* You can use it with DocArray [types](docarray.typing) that are oriented toward multimodal (image, audio, ...) data and for 
 Machine Learning use case TODO link the type section. 
 
-Another difference is that [BaseDoc][docarray.base_doc.doc.BaseDoc] has a generated by default `id` field that is used to uniquely identify a document.
-
+Another difference is that [BaseDoc][docarray.base_doc.doc.BaseDoc] has an `id` field, generated by default, that is used to uniquely identify a document.
 
+## `BaseDoc` allows representing multimodal and nested data
 
-## `BaseDoc` allows to represent multimodal and nested data.
+Let's say you want to represent a YouTube video in your application, perhaps to build a search system for YouTube videos.
+A YouTube video is not only composed of a video, but also has a title, description, thumbnail (and more, but let's keep it simple).
 
-Let's say you want to represent a Youtube video in your application, perhaps to build a search system for Youtube videos.
-A Youtube video is not only composed of a video, but it also has a title, a description, a thumbnail (and more, but let's keep it simple).
-
-All of these elements are from different `modalities` LINK TO MODALITIES SECTION (not ready): title and description are text, the thumbnail is an image, and the video in itself is, well, a video.
+All of these elements are from different `modalities` LINK TO MODALITIES SECTION (not ready): the title and description are text, the thumbnail is an image, and the video in itself is, well, a video.
 
 DocArray allows to represent all of this multimodal data in a single object. 
 
-Let's first create an `BaseDoc` for each of the elements that compose the Youtube video.
+Let's first create a `BaseDoc` for each of the elements that compose the YouTube video.
+
+First, for the thumbnail, which is an image:
 
-First for the thumbnail which is an image
 ```python
 from docarray import BaseDoc
 from docarray.typing import ImageUrl, ImageBytes
@@ -93,7 +86,8 @@ class ImageDoc(BaseDoc):
     )
 ```
 
-Then for the video which is a video
+Then for the video itself:
+
 ```python
 from docarray import BaseDoc
 from docarray.typing import VideoUrl, VideoBytes
@@ -106,37 +100,31 @@ class VideoDoc(BaseDoc):
     )
 ``` 
 
+Then for the title and description (which are text) we will just use a `str` type.
 
-Then for the title and description which are text we will just use a `str` type.
-
-All the elements that compose a Youtube video are ready:
+All the elements that compose a YouTube video are ready:
 
 ```python
 from docarray import BaseDoc
 
 
-class YoutubeVideoDoc(BaseDoc):
+class YouTubeVideoDoc(BaseDoc):
     title: str
     description: str
     thumbnail: ImageDoc
     video: VideoDoc
 ```
 
-
-You now have `YoutubeVideoDoc` which is a pythonic representation of a Youtube video. 
+You now have `YouTubeVideoDoc`, which is a Pythonic representation of a YouTube video.
 
 This representation can now be used to send (LINK) or to store (LINK) data. You can even use it directly to [train a machine learning](../../how_to/multimodal_training_and_serving.md) [Pytorch](https://pytorch.org/docs/stable/index.html) model on this representation. 
 
-
 !!! note
 
-    You see here that `ImageDoc` and `VideoDoc` are as well [BaseDoc][docarray.base_doc.doc.BaseDoc] that is later used inside another [BaseDoc][docarray.base_doc.doc.BaseDoc]`.
+    You see here that `ImageDoc` and `VideoDoc` are also [BaseDoc][docarray.base_doc.doc.BaseDoc], and they are later used inside another [BaseDoc][docarray.base_doc.doc.BaseDoc].
     This is what we call nested data representation. 
 
     [BaseDoc][docarray.base_doc.doc.BaseDoc] can be nested to represent any kind of data hierarchy.
-  
-  
-
 
 See also:
 
@@ -145,7 +133,3 @@ See also:
 * DOCUMENT INDEX REF
 * DOCUMENT STORE REF
 * ...
-
-
-
-See also
\ No newline at end of file

From 75624d0b77b32d530abd180c66de6f066dcb2d3a Mon Sep 17 00:00:00 2001
From: samsja <55492238+samsja@users.noreply.github.com>
Date: Mon, 3 Apr 2023 08:59:02 +0200
Subject: [PATCH 11/11] feat: apply saba suggestion

Co-authored-by: Saba Sturua <45267439+jupyterjazz@users.noreply.github.com>
Signed-off-by: samsja <55492238+samsja@users.noreply.github.com>
---
 docs/user_guide/intro.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/user_guide/intro.md b/docs/user_guide/intro.md
index 084805bddd2..5c9fbb14d1f 100644
--- a/docs/user_guide/intro.md
+++ b/docs/user_guide/intro.md
@@ -14,13 +14,13 @@ You will first need to install `DocArray` in your Python environment.
 
 ## Install DocArray
 
-To install `DocArray` to follow this user guide, you can use the following command:
+To install `DocArray`, you can use the following command:
 
 ```console
 pip install "docarray[full]"
 ```
 
-This will install the main dependencies of `DocArray` and will work will all the supported data modalities.
+This will install the main dependencies of `DocArray` and will work with all the supported data modalities.
 
 !!! note 
     To install a very light version of `DocArray` with only the core dependencies, you can use the following command:
@@ -28,7 +28,7 @@ This will install the main dependencies of `DocArray` and will work will all the
     pip install "docarray"
     ``` 
     
-    If you want to use protobuf and DocArray you can run:
+    If you want to use `protobuf` and `DocArray`, you can run:
 
     ```
     pip install "docarray[proto]"