Merged
127 changes: 59 additions & 68 deletions README.md


File renamed without changes.
37 changes: 18 additions & 19 deletions docs/data_types/3d_mesh/3d_mesh.md
DocArray supports many different modalities including `3D Mesh`.
This section will show you how to load and handle 3D data using DocArray.

A 3D mesh is the structural representation of a 3D model, consisting of polygons. Most 3D meshes are created via professional software packages, such as commercial suites like [Unity](https://unity.com/), or the open-source [Blender](https://www.blender.org/).
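Concretely, a polygon mesh boils down to two arrays: vertices in 3D space, and faces that index into them. Here is a minimal sketch with a toy tetrahedron (the arrays are purely illustrative and not tied to any DocArray API):

```python
import numpy as np

# A toy tetrahedron described by its vertices and faces.
vertices = np.array(
    [
        [0.0, 0.0, 0.0],
        [1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0],
    ]
)  # shape (n_vertices, 3): one x, y, z coordinate per vertex

faces = np.array(
    [
        [0, 1, 2],
        [0, 1, 3],
        [0, 2, 3],
        [1, 2, 3],
    ]
)  # shape (n_faces, 3): indices into `vertices`, one triangle per row

# Each face resolves to the three corner points of a triangle.
assert vertices[faces].shape == (4, 3, 3)
```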

!!! note

    This feature requires `trimesh`. You can install all necessary dependencies via:

    ```cmd
    pip install "docarray[mesh]"
    ```

## Vertices and faces representation

A 3D mesh can be represented by its vertices and faces:

### Load vertices and faces

First, let's define our class `MyMesh3D`, which extends [`BaseDoc`][docarray.base_doc.doc.BaseDoc] and provides attributes to store our 3D data:

- The `mesh_url` attribute, of type [`Mesh3DUrl`][docarray.typing.url.url_3d.mesh_url.Mesh3DUrl].
- The optional `tensors` attribute, of type [`VerticesAndFaces`][docarray.documents.mesh.vertices_and_faces.VerticesAndFaces].
- The `VerticesAndFaces` class has the attributes `vertices` and `faces`, both of type [`AnyTensor`](../../../../api_references/typing/tensor/tensor). This comes in handy later when we want to display our 3D mesh.

!!! tip

    Check out our predefined [`Mesh3D`](#getting-started-predefined-docs) to get started and play around with our 3D features.

But for now, let's create a `MyMesh3D` instance with a URL to a remote `.obj` file:

```python
from typing import Optional

from docarray import BaseDoc
from docarray.documents.mesh.vertices_and_faces import VerticesAndFaces
from docarray.typing import Mesh3DUrl


class MyMesh3D(BaseDoc):
    mesh_url: Mesh3DUrl
    tensors: Optional[VerticesAndFaces]

doc = MyMesh3D(mesh_url="https://people.sc.fsu.edu/~jburkardt/data/obj/al.obj")
```

To load the vertices and faces information, you can call [`.load()`][docarray.typing.url.url_3d.mesh_url.Mesh3DUrl.load] on the [`Mesh3DUrl`][docarray.typing.url.url_3d.mesh_url.Mesh3DUrl] instance. This will return a [`VerticesAndFaces`][docarray.documents.mesh.vertices_and_faces.VerticesAndFaces] object:

```python
doc.tensors = doc.mesh_url.load()
```

*An interactive 3D rendering of the mesh appears here in the rendered documentation.*


## Point cloud representation

A point cloud is a representation of a 3D mesh. It is made by repeatedly and uniformly sampling points on the surface of the 3D body. Compared to the mesh representation, a point cloud is a fixed-size `ndarray` and hence easier for deep learning algorithms to handle.
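The sampling idea itself is simple: pick triangles with probability proportional to their area, then draw a uniform point inside each chosen triangle via barycentric coordinates. A rough sketch of the procedure with plain NumPy (this illustrates the idea only, not DocArray's internal implementation):

```python
import numpy as np


def sample_point_cloud(vertices, faces, samples=1000, rng=None):
    """Uniformly sample `samples` points on the surface of a triangle mesh."""
    rng = np.random.default_rng(rng)
    tris = vertices[faces]  # (n_faces, 3, 3)
    # Triangle areas via the cross product of two edge vectors
    cross = np.cross(tris[:, 1] - tris[:, 0], tris[:, 2] - tris[:, 0])
    areas = 0.5 * np.linalg.norm(cross, axis=1)
    # Pick triangles with probability proportional to their area
    idx = rng.choice(len(faces), size=samples, p=areas / areas.sum())
    a, b, c = tris[idx, 0], tris[idx, 1], tris[idx, 2]
    # Uniform barycentric coordinates (fold u + v > 1 back into the triangle)
    u, v = rng.random(samples), rng.random(samples)
    flip = u + v > 1
    u[flip], v[flip] = 1 - u[flip], 1 - v[flip]
    return a + u[:, None] * (b - a) + v[:, None] * (c - a)


# Two triangles forming a unit square in the z=0 plane
verts = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=float)
faces = np.array([[0, 1, 2], [0, 2, 3]])
points = sample_point_cloud(verts, faces, samples=500, rng=0)
assert points.shape == (500, 3)
```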

### Load point cloud


In DocArray, loading a point cloud from a [`PointCloud3DUrl`][docarray.typing.url.url_3d.point_cloud_url.PointCloud3DUrl] instance will return a [`PointsAndColors`][docarray.documents.point_cloud.points_and_colors.PointsAndColors] instance. Such an object has a `points` attribute containing the information about the points in 3D space as well as an optional `colors` attribute.

First, let's define our class `MyPointCloud`, which extends [`BaseDoc`][docarray.base_doc.doc.BaseDoc] and provides attributes to store the point cloud information:

```python
from typing import Optional

from docarray import BaseDoc
from docarray.documents.point_cloud.points_and_colors import PointsAndColors
from docarray.typing import PointCloud3DUrl


class MyPointCloud(BaseDoc):
    url: PointCloud3DUrl
    tensors: Optional[PointsAndColors]


doc = MyPointCloud(url="https://people.sc.fsu.edu/~jburkardt/data/obj/al.obj")
```

Next, we can load a point cloud of size `samples` by calling [`.load()`][docarray.typing.url.url_3d.point_cloud_url.PointCloud3DUrl.load] on the `url` attribute:

```python
doc.tensors = doc.url.load(samples=1000)
doc.summary()
```

<details>
<summary>Output</summary>
``` { .text .no-copy }
```
</details>


### Display 3D point cloud in notebook

You can display your point cloud and interact with it from its URL as well as from a PointsAndColors instance. The first will always display without color, whereas the display from [`PointsAndColors`][docarray.documents.point_cloud.points_and_colors.PointsAndColors] will show with color if `.colors` is not None.

``` { .python }
doc.url.display()
```

*An interactive 3D rendering of the point cloud appears here in the rendered documentation.*

## Getting started - Predefined Docs

To get started and play around with the 3D modalities, DocArray provides the predefined documents [`Mesh3D`][docarray.documents.mesh.Mesh3D] and [`PointCloud3D`][docarray.documents.point_cloud.PointCloud3D], which include all of the previously mentioned functionalities.

### `Mesh3D`
The [`Mesh3D`][docarray.documents.mesh.Mesh3D] class provides a [`Mesh3DUrl`][docarray.typing.Mesh3DUrl] field and a [`VerticesAndFaces`][docarray.documents.mesh.vertices_and_faces.VerticesAndFaces] field:

``` { .python }
class Mesh3D(BaseDoc):
    url: Optional[Mesh3DUrl]
    tensors: Optional[VerticesAndFaces]
    embedding: Optional[AnyEmbedding]
    bytes_: Optional[bytes]
```

### `PointCloud3D`

The [`PointCloud3D`][docarray.documents.point_cloud.PointCloud3D] class similarly provides a [`PointCloud3DUrl`][docarray.typing.PointCloud3DUrl] field and a [`PointsAndColors`][docarray.documents.point_cloud.points_and_colors.PointsAndColors] field:

``` { .python }
class PointCloud3D(BaseDoc):
    url: Optional[PointCloud3DUrl]
    tensors: Optional[PointsAndColors]
    embedding: Optional[AnyEmbedding]
    bytes_: Optional[bytes]
```

You can use them directly, extend them, or compose them:

```python
from docarray import BaseDoc
from docarray.documents import Mesh3D, PointCloud3D


class My3DObject(BaseDoc):
    mesh: Mesh3D
    pc: PointCloud3D


obj_file = "https://people.sc.fsu.edu/~jburkardt/data/obj/al.obj"

doc = My3DObject(
    mesh=Mesh3D(url=obj_file),
pc=PointCloud3D(url=obj_file),
)


doc.mesh.tensors = doc.mesh.url.load()
doc.pc.tensors = doc.pc.url.load(samples=100)
```
28 changes: 16 additions & 12 deletions docs/data_types/audio/audio.md
Moreover, you will learn about DocArray's audio-specific types to represent your audio data.

!!! note

    This requires a `pydub` dependency. You can install all necessary dependencies via:

    ```cmd
    pip install "docarray[audio]"
    ```

    Additionally, you have to install `ffmpeg` (see more info [here](https://github.com/jiaaro/pydub#getting-ffmpeg-set-up)):

    ```cmd
    # on Mac with brew:
    brew install ffmpeg
    ```

    ```cmd
    # on Linux with apt-get:
    apt-get install ffmpeg libavcodec-extra
    ```


## Load audio file

First, let's define a class which extends [`BaseDoc`][docarray.base_doc.doc.BaseDoc] and has a `url` attribute of type [`AudioUrl`][docarray.typing.url.AudioUrl], and an optional `tensor` attribute of type [`AudioTensor`](../../../../api_references/typing/tensor/audio).

!!! tip

    Check out our predefined [`AudioDoc`](#getting-started-predefined-audiodoc) to get started and play around with our audio features.

Next, you can instantiate an object of that class with a local or remote URL:

```python
from typing import Optional

from docarray import BaseDoc
from docarray.typing import AudioTensor, AudioUrl


class MyAudio(BaseDoc):
    url: AudioUrl
    tensor: Optional[AudioTensor]
    frame_rate: Optional[int]


doc = MyAudio(
    url='https://github.com/docarray/docarray/blob/main/tests/toydata/hello.mp3?raw=true'
)
```

Loading the content of the audio file is as easy as calling [`.load()`][docarray.typing.url.AudioUrl.load] on the `url` attribute.

This will return a tuple of:

- An [`AudioNdArray`][docarray.typing.tensor.audio.AudioNdArray] representing the audio file content
- An integer representing the frame rate (the number of samples per second)

```python
doc.tensor, doc.frame_rate = doc.url.load()
doc.summary()
```

<details>
<summary>Output</summary>
``` { .text .no-copy }
```
</details>
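The frame rate ties the tensor's length to playback duration: `duration_seconds = len(tensor) / frame_rate`. A quick sketch with a synthetic tone (the 440 Hz frequency and 8 kHz rate are arbitrary illustration values, not taken from the file above):

```python
import numpy as np

frame_rate = 8000  # samples per second (illustrative value)
duration = 2.0     # seconds

# A 440 Hz sine wave, sampled `frame_rate` times per second
t = np.arange(int(frame_rate * duration)) / frame_rate
tone = np.sin(2 * np.pi * 440 * t).astype(np.float32)

# Length and frame rate together determine the duration
assert len(tone) == 16000
assert len(tone) / frame_rate == duration
```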


## AudioTensor

DocArray offers several [`AudioTensor`s](../../../../api_references/typing/tensor/audio) in which to store your data:

- `AudioNdArray`
- `AudioTorchTensor`
- `AudioTensorFlowTensor`

``` { .python }
assert isinstance(doc.tf_tensor, AudioTensorFlowTensor)
assert isinstance(doc.torch_tensor, AudioTorchTensor)
```


## AudioBytes

Alternatively, you can load your [`AudioUrl`][docarray.typing.url.AudioUrl] instance to [`AudioBytes`][docarray.typing.bytes.AudioBytes], and your [`AudioBytes`][docarray.typing.bytes.AudioBytes] instance to an [`AudioTensor`](../../../../api_references/typing/tensor/audio) of your choice:
``` { .python }
assert isinstance(bytes_from_tensor, AudioBytes)
```
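Under the hood, audio bytes are just the raw samples, so a tensor and its byte representation carry the same data. A minimal sketch of that round trip with plain NumPy and 16-bit PCM (this illustrates the idea only, not DocArray's actual `AudioBytes` encoding):

```python
import numpy as np

# A short 16-bit PCM signal (values are illustrative)
samples = (np.sin(np.linspace(0, 2 * np.pi, 100)) * 32767).astype(np.int16)

raw: bytes = samples.tobytes()                 # tensor -> bytes
restored = np.frombuffer(raw, dtype=np.int16)  # bytes -> tensor

assert isinstance(raw, bytes)
assert np.array_equal(samples, restored)       # lossless round trip
```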

## Save audio to file

You can save your [`AudioTensor`](../../../../api_references/typing/tensor/audio) to an audio file of any format as follows:

``` { .python }
tensor_reversed = doc.tensor[::-1]
tensor_reversed.save(
    file_path='olleh.mp3',
    frame_rate=doc.frame_rate,
)
```
## Play audio in a notebook

You can play your audio sound in a notebook from its URL or tensor, by calling `.display()` on either one.

Play from `url`:
``` { .python }
doc.url.display()
```

Play from `tensor`:

``` { .python }
tensor_reversed.display()
```

<table>
<tr>
<td><audio controls><source src="../olleh.mp3" type="audio/mp3"></audio></td>
</tr>
</table>




## Getting started - Predefined `AudioDoc`

To get started and play around with your audio data, DocArray provides a predefined [`AudioDoc`][docarray.documents.audio.AudioDoc], which includes all of the previously mentioned functionalities:
``` { .python }
class AudioDoc(BaseDoc):
    url: Optional[AudioUrl]
    tensor: Optional[AudioTensor]
    embedding: Optional[AnyEmbedding]
    bytes_: Optional[AudioBytes]
    frame_rate: Optional[int]
```

You can use this class directly or extend it to your preference:

```python
from docarray.documents import AudioDoc
from typing import Optional


class MyAudio(AudioDoc):
    name: Optional[str]

audio = MyAudio(
url='https://github.com/docarray/docarray/blob/main/tests/toydata/hello.mp3?raw=true'
)

audio.name = 'My first audio doc!'
audio.tensor, audio.frame_rate = audio.url.load()
```

6 changes: 3 additions & 3 deletions docs/data_types/first_steps.md
# Introduction

With DocArray you can represent text, image, video, audio, and 3D meshes, whether separate, nested or combined,
and process them as a [`DocList`][docarray.array.doc_list.doc_list.DocList].

This section covers the following topics:

- [Text](text/text.md)
- [Image](image/image.md)
- [Audio](audio/audio.md)
- [Video](video/video.md)
- [3D Mesh](3d_mesh/3d_mesh.md)
- [Table](table/table.md)
- [Multimodal data](multimodal/multimodal.md)
17 changes: 8 additions & 9 deletions docs/data_types/image/image.md
Moreover, we will introduce DocArray's image-specific types to represent your image data.

!!! note

    This requires the `Pillow` dependency. You can install all necessary dependencies via:

    ```cmd
    pip install "docarray[image]"
    ```
!!! tip

    Check out our predefined [`ImageDoc`](#getting-started-predefined-imagedoc) to get started and play around with our image features.

First, let's define the class `MyImage`, which extends [`BaseDoc`][docarray.base_doc.doc.BaseDoc] and has a `url` attribute of type [`ImageUrl`][docarray.typing.url.ImageUrl], as well as an optional `tensor` attribute of type [`ImageTensor`](../../../../api_references/typing/tensor/image).

Next, let's instantiate a `MyImage` object with a local or remote URL:

```python
from typing import Optional

from docarray import BaseDoc
from docarray.typing import ImageTensor, ImageUrl


class MyImage(BaseDoc):
    url: ImageUrl
    tensor: Optional[ImageTensor]


img = MyImage(
    url='https://example.com/image.png'  # placeholder: any local or remote image URL
)
```

To load the image data you can call [`.load()`][docarray.typing.url.ImageUrl.load] on the `url` attribute. By default, [`ImageUrl.load()`][docarray.typing.url.ImageUrl.load] returns an [`ImageNdArray`][docarray.typing.tensor.image.image_ndarray.ImageNdArray] object:

```python
from docarray.typing import ImageNdArray

img.tensor = img.url.load()
assert isinstance(img.tensor, ImageNdArray)
```

You can also construct a document from a tensor directly:

```python
import numpy as np

img = MyImage(tensor=np.ones(shape=(200, 300, 3)))
# img = MyImage(tensor=np.ones(shape=(224, 224, 3)))
```

If you have RGB images of different shapes, you can specify only the dimensions and number of channels:

```python
from typing import Optional

import numpy as np

from docarray import BaseDoc
from docarray.typing import ImageTensor


class MyFlexibleImage(BaseDoc):
    tensor: Optional[ImageTensor['x', 'y', 3]]


img_1 = MyFlexibleImage(tensor=np.zeros(shape=(200, 300, 3)))
img_2 = MyFlexibleImage(tensor=np.ones(shape=(224, 224, 3)))
```



## ImageBytes

Alternatively, you can load your [`ImageUrl`][docarray.typing.url.ImageUrl] instance to [`ImageBytes`][docarray.typing.bytes.ImageBytes], and your [`ImageBytes`][docarray.typing.bytes.ImageBytes] instance to an [`ImageTensor`](../../../../api_references/typing/tensor/image) of your choice.
```python
assert isinstance(bytes_from_tensor, ImageBytes)
```

## Display image in a notebook
You can display your image in a notebook from both an [`ImageUrl`][docarray.typing.url.ImageUrl] instance as well as an
[`ImageTensor`](../../../../api_references/typing/tensor/image) instance.


<figure markdown>
![](display_notebook.jpg){ width="900" }
</figure>


## Getting started - Predefined `ImageDoc`

To get started and play around with the image modality, DocArray provides a predefined [`ImageDoc`][docarray.documents.image.ImageDoc], which includes all of the previously mentioned functionalities:

``` { .python }
class ImageDoc(BaseDoc):
    url: Optional[ImageUrl]
    tensor: Optional[ImageTensor]
    embedding: Optional[AnyEmbedding]
    bytes_: Optional[ImageBytes]
```

You can use this class directly or extend it to your preference:

``` { .python }
from docarray.documents import ImageDoc
from docarray.typing import AnyEmbedding
from typing import Optional


class MyImage(ImageDoc):
    image_title: Optional[str]


image = MyImage(
image_title='My first image',
url='http://www.jina.ai/image.jpg',
)

image.tensor = image.url.load()
model = SomeEmbeddingModel()
image.embedding = model(image.tensor)
```