v0.31.0 release note draft #1456

# Release Note

This release contains 4 new features, 11 bug fixes, and several documentation improvements.

## 💥 Breaking changes

### Return type of `DocVec` Optional Tensor (#1472)

Optional tensor fields in a `DocVec` now return `None` instead of a list of `NaN` values if the column does not hold any tensor.

This code snippet shows the breaking change:

```python
from typing import Optional

from docarray import BaseDoc, DocVec
from docarray.typing import NdArray

class MyDoc(BaseDoc):
    tensor: Optional[NdArray[10]]

# No document sets `tensor`, so the column does not hold any tensor
docs = DocVec[MyDoc]([MyDoc() for _ in range(2)])

print(docs.tensor)
```

| Version | Output of `print(docs.tensor)` |
| --- | --- |
| 0.30.0 | `[nan nan]` |
| 0.31.0 | `None` |
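
Code that has to run against both versions may want a guard that accepts either value. The following is a minimal sketch, not part of DocArray: `column_is_empty` is a hypothetical helper, and it assumes the 0.30.0-style column can be iterated as a flat sequence of floats. It uses only the standard library, so plain lists stand in for the tensor column here.

```python
import math
from typing import Optional, Sequence


def column_is_empty(column: Optional[Sequence[float]]) -> bool:
    """Return True if a tensor column holds no data under either behavior.

    Hypothetical helper for illustration; not part of DocArray.
    """
    if column is None:
        # 0.31.0 behavior: an empty optional tensor column is None
        return True
    # 0.30.0 behavior: an empty column came back as a sequence of NaN values
    return all(math.isnan(value) for value in column)


print(column_is_empty(None))                # 0.31.0-style value
print(column_is_empty([float("nan")] * 2))  # 0.30.0-style value
print(column_is_empty([0.5, 1.0]))          # a column with real data
```

Checking `is None` first matters: iterating over `None` would raise a `TypeError` before the NaN check could run.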

## 🆕 Features

### Add `InMemoryDocIndex` (#1441)

This release introduces `InMemoryDocIndex`, a Document Index that performs exact vector search in memory (as opposed to the approximate nearest neighbor search used by vector databases).

`InMemoryDocIndex` is useful for prototyping and suits small collections (1k-10k documents). A vector database suits larger scales but comes with a performance overhead at smaller ones.

```python
from docarray import BaseDoc, DocList
from docarray.index.backends.in_memory import InMemoryDocIndex
from docarray.typing import NdArray

import numpy as np

class MyDoc(BaseDoc):
    tensor: NdArray[512]

docs = DocList[MyDoc](MyDoc(tensor=i*np.ones(512)) for i in range(10))

doc_index = InMemoryDocIndex[MyDoc]()
doc_index.index(docs)

print(doc_index.find(3*np.ones(512), search_field='tensor', top_k=3))
```

```python
FindResult(documents=<DocList[MyDoc] (length=10)>, scores=array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]))
```

### `DocList` inherits from Python list (#1457)

`DocList` is now a subclass of Python's `list`, so every method available on a Python list is also available on a `DocList`. For example, you can call `len` on a `DocList`, and tools like Pydantic or FastAPI can work with it more easily.
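
To illustrate why subclassing `list` helps generic tooling, here is a small sketch that does not depend on DocArray. `TypedList` is a hypothetical stand-in for a typed container; it is not how `DocList` is implemented. The point is only that code which checks for or expects a `list` accepts the subclass unchanged.

```python
import json


class TypedList(list):
    """Hypothetical typed container; a stand-in, not DocArray's DocList."""

    def __init__(self, items=(), item_type=object):
        items = list(items)
        # Validate up front so the container only ever holds one item type
        for item in items:
            if not isinstance(item, item_type):
                raise TypeError(
                    f"expected {item_type.__name__}, got {type(item).__name__}"
                )
        super().__init__(items)
        self.item_type = item_type


names = TypedList(["b", "a"], item_type=str)

# Because TypedList subclasses list, list-aware code works without adapters
print(isinstance(names, list))
print(len(names), sorted(names))
print(json.dumps(names))  # the standard json encoder accepts list subclasses
```

A container that merely behaves like a list, without inheriting from it, would fail the `isinstance` check and be rejected by `json.dumps`.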

### Add `len` to `DocIndex` (#1454)

You can now call `len(vector_index)`, which is equivalent to `vector_index.num_docs()`.
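
The equivalence can be pictured as `__len__` delegating to `num_docs()`. The sketch below uses a hypothetical `ToyIndex` class to show the pattern. It is an assumption about the general shape, not DocArray's actual implementation.

```python
class ToyIndex:
    """Hypothetical index used only to illustrate len() and num_docs()."""

    def __init__(self):
        self._docs = []

    def index(self, docs):
        # Store the documents; a real index would also build search structures
        self._docs.extend(docs)

    def num_docs(self) -> int:
        return len(self._docs)

    def __len__(self) -> int:
        # Delegate so that len(index) and index.num_docs() cannot disagree
        return self.num_docs()


toy_index = ToyIndex()
toy_index.index(["doc-a", "doc-b", "doc-c"])
print(len(toy_index) == toy_index.num_docs())
```

One general Python consequence of defining `__len__`: an object with length 0 is falsy unless it also defines `__bool__`, so `if toy_index:` is `False` for an empty `ToyIndex`. Whether this applies to DocArray's index classes is not confirmed here, so an explicit `is not None` check is the safer way to test that an index exists.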

### Other minor features

- Add a `to_json` alias to `BaseDoc` (#1494)

## 🐞 Bug Fixes

### Point to older versions when importing `Document` or `DocumentArray` (#1422)

Trying to load `Document` or `DocumentArray` from DocArray would previously raise an error, saying that you needed to downgrade your version of DocArray if you wanted to use these two objects. This behavior has been fixed.

### Fix `AnyDoc` `from_protobuf` (#1437)

`AnyDoc` can now read any `BaseDoc` protobuf file. The same applies to `DocList`.

### Other bug fixes

- Fix `extend` to `DocList` (#1493)
- Fix bug when calling `dict()` on `BaseDoc` (#1481)
- Fix bug when calling `json()` on `BaseDoc` (#1481)
- Support Pandas 2.0 by using `pd.concat()` instead of `df.append()` in `to_dataframe()` to avoid warning (#1478)
- Add logs to Elasticsearch index (#1427)
- Fix a bug in Document Index where Torch tensors that required grad were not able to be converted to `ndarray` (#1429)
- Fix a bug with HNSW (#1426)
- Hubble Binary format version bump (#1414)
- Save index during creation for `hnswlib` (#1424)

## 📗 Documentation Improvements

- Fix FastAPI docs (#1453)
- Index predefined Documents (#1434)
- Clean up data types section (#1412)
- Remove duplicate API reference section (#1408)
- `DocIndex` URLs (#1433)
- Fix install commands hint (#1421)
- Add Google Analytics (#1432)
- Add install instructions for `hnswlib` and `elastic` document indexes (#1431)
- Various fixes (#1436, #1417, #1423, #1418, #1411, #1419)

## 🤟 Contributors

We would like to thank all contributors to this release:

- Alex Cureton-Griffiths (@alexcg1)
- samsja (@samsja)
- Johannes Messner (@JohannesMessner)
- Anne Yang (@AnneYang720)
- Scott Martens (@scott-martens)
- カレン (@RStar2022)
- Aman Agarwal (@agaraman0)
- Yanlong Wang (@nomagick)
- Charlotte Gerhaher (@anna-charlotte)
