Chore: draft release note v0.18.0 #648

# Release Note

This release contains 7 new features, 6 bug fixes and 8 documentation improvements.


## 🆕 Features
### Support geospatial filters in Redis backend (#579)

The Redis Document Store can now accept geospatial filter queries in the `DocumentArray.find()` method:

```python
from docarray import Document, DocumentArray

n_dim = 3
da = DocumentArray(
    storage='redis',
    config={
        'n_dim': n_dim,
        'columns': {'location': 'geo'},
    },
)
with da:
    da.extend(
        [
            Document(id=f'r{i}', tags={'location': f"{-98.17+i},{38.71+i}"})
            for i in range(10)
        ]
    )
max_distance = 300
filter = f'@location:[-98.71 38.71 {max_distance} km] '
results = da.find(filter=filter, limit=10)
print(
    f'Locations within: {max_distance} km',
    [(doc.id, doc.tags['location']) for doc in results],
)
```

Results:

```text
Locations within: 300 km [('r0', '-98.17,38.71'), ('r1', '-97.17,39.71')]
```

### Support multiple metrics in evaluate (#643)

`DocumentArray.evaluate()` now supports computing evaluations for multiple metrics at once. The `metric` parameter is
renamed to `metrics`, and `metric_name` is renamed to `metric_names`.

The `evaluate()` method expects a list for `metrics` and `metric_names` rather than a single value.
For instance, instead of doing:

```python
da2.evaluate(
    ground_truth=da1, metric='precision_at_k', metric_name='precision@k', k=5
)  # returns average_evaluation
```

use:

```python
da2.evaluate(
    ground_truth=da1, metrics=['precision_at_k'], metric_names=['precision@k'], k=5
)  # returns {'precision@k': prec_at_k_average_evaluation}
```

The first usage still works but raises a deprecation warning, and it will be removed soon.

The return type is also changed: `evaluate()` will now return a dict mapping metric names to their average evaluation scores 
instead of a single score value.
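
Code that consumed the old float return value needs a small adjustment to read from the dict. The following is a minimal sketch of one way to bridge both shapes during migration. The helper name `pick_score` is hypothetical (it is not part of DocArray), and the numbers are placeholders standing in for real `evaluate()` results:

```python
from typing import Dict, Union


def pick_score(result: Union[float, Dict[str, float]], name: str) -> float:
    """Return one score from either the old (float) or new (dict) return shape."""
    if isinstance(result, dict):
        # New behavior: look the score up by its metric name.
        return result[name]
    # Old behavior: the result already is the single average score.
    return float(result)


# Placeholder values standing in for real evaluate() results.
print(pick_score({'precision@k': 0.5}, 'precision@k'))  # 0.5
print(pick_score(0.5, 'precision@k'))  # 0.5
```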

For more info, check the [Evaluate Matches](https://docarray.jina.ai/fundamentals/documentarray/evaluation/) section in the documentation.

### Show server error messages in push

When using `DocumentArray.push()`, error messages returned by the server now show up in the stack trace. For instance, pushing a `DocumentArray` with a name reserved by another user raises the following error:

```text
requests.exceptions.HTTPError: 403 Client Error: OperationNotAllowedError: Current user is not allowed to edit this artifact. Permission denied. for url: https://api.hubble.jina.ai/v2/rpc/artifact.upload
```

### Add warnings when using MongoDB-like filter QL syntax in Redis and support native filter QL (#645)

MongoDB-like filter QL is deprecated in the Redis backend, and this release adds support for the native [Redis QL syntax](https://redis.io/docs/stack/search/reference/query_syntax/). Using MongoDB-like filter QL raises a deprecation warning, and support for it will be removed soon.

Therefore, instead of using:

```python
redis_da.find(filter={'field': {'$eq': 'value'}})
```

use:

```python
redis_da.find(filter='@field:value')
```

For more information, check the [Redis Document Store](https://docarray.jina.ai/advanced/document-store/redis/) documentation.

### Add support for labeled datasets to the evaluate function (#617)

As of this release, `DocumentArray.evaluate()` supports labeled datasets. Labels can be added using the `tags` field of
each Document in your DocumentArray:

```python
import numpy as np
from docarray import Document, DocumentArray

example_da = DocumentArray([Document(tags={'label': (i % 2)}) for i in range(10)])
example_da.embeddings = np.random.random([10, 3])
example_da.match(example_da)
print(example_da.evaluate(metrics=['precision_at_k']))
```

The results of the evaluation will be stored in the `evaluations` field of each Document.

You can specify the label field name using the `label_tag` parameter:

```python
example_da = DocumentArray(
    [Document(tags={'my_custom_label': (i % 2)}) for i in range(10)]
)
example_da.embeddings = np.random.random([10, 3])
example_da.match(example_da)
print(example_da.evaluate(metrics=['precision_at_k'], label_tag='my_custom_label'))
```

### Allow progress bar while batching (#628)

You can see the progress of batching documents using `DocumentArray.batch()` with the `show_progress` parameter:

```python
import time
from docarray import Document, DocumentArray

da = DocumentArray.empty(100000)
for i in range(1, 100000):
    da.append(Document(text=str(i)))

print('append finished')

for batch in da.batch(500, show_progress=True):
    time.sleep(0.1)
```

![Progress bar shown while batching](https://user-images.githubusercontent.com/15269265/196729199-8769ecb6-c3b1-45a3-87d2-09502bebb256.gif)

### Add the `n_components` PCA parameter to AnnLite configurations (#606)

The parameter `n_components` is added to [AnnLite](https://github.com/jina-ai/annlite)'s configuration in DocArray. Use this parameter when you want to use
PCA in your AnnLite backend.
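
A minimal sketch of what such a configuration might look like. The dimension values are placeholders, and reading `n_components` as the number of dimensions kept after PCA is an assumption based on the usual meaning of that parameter name, so check the AnnLite documentation before relying on it:

```python
# Placeholder values chosen only for illustration.
annlite_config = {
    'n_dim': 128,  # dimensionality of the original embeddings
    'n_components': 32,  # assumed: number of dimensions kept after PCA
}

# With DocArray and AnnLite installed, the config would be passed like this:
# da = DocumentArray(storage='annlite', config=annlite_config)
print(annlite_config)
```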

## 🐞 Bug Fixes

### Support Qdrant 0.8.0

DocArray adds support for Qdrant versions greater than or equal to v0.8.0 and drops support for previous versions.
Therefore, make sure to use version 0.8.0 or higher for both `qdrant-client` and the Qdrant database. 

### Sync DocumentArray using sync() method and context manager (#625)

Data in a DocumentArray is now guaranteed to be fully persisted (synced) to the database when you use either the
context manager or the `sync()` method. Make sure to wrap write operations to a `DocumentArray` in a context manager like so:

```python
my_da = DocumentArray(storage='my_storage', config=...)
with my_da:
    ...  # write operations
```

or use the `sync()` method:

```python
my_da = DocumentArray(storage='my_storage', config=...)
...  # write operations
my_da.sync()
```
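
The relationship between the two options can be pictured with a toy class whose `__exit__` simply calls `sync()`. This is only an illustration of the pattern: `ToyStore` is hypothetical, and how DocArray actually buffers and flushes data depends on the backend and is not modeled here.

```python
class ToyStore:
    """Toy stand-in for a storage-backed array; not DocArray code."""

    def __init__(self):
        self.pending = []  # writes buffered in memory
        self.persisted = []  # writes flushed to the "database"

    def append(self, item):
        self.pending.append(item)

    def sync(self):
        # Flush buffered writes to the backing store.
        self.persisted.extend(self.pending)
        self.pending.clear()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Leaving the `with` block syncs automatically.
        self.sync()
        return False


store = ToyStore()
with store:
    store.append('a')
store.append('b')  # written outside the context manager
print(store.persisted)  # ['a']
store.sync()
print(store.persisted)  # ['a', 'b']
```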

### Close the file handler properly in `load_uri_to_audio_tensor` (#609)

The `load_uri_to_audio_tensor()` method used to open a file handle without closing it.
This release fixes the bug: the file is now opened with a context manager, so it is always closed properly.
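
The general pattern looks like the sketch below, which uses only the standard-library `wave` module. It is not DocArray's implementation; the file name and audio parameters are made up for the example.

```python
import os
import tempfile
import wave

# Write a tiny one-channel, 16-bit WAV file to read back.
path = os.path.join(tempfile.mkdtemp(), 'sample.wav')
with wave.open(path, 'wb') as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)
    writer.setframerate(8000)
    writer.writeframes(b'\x00\x00' * 80)

# Opening the file in a `with` block closes the handle on exit,
# even if reading raises an exception.
with open(path, 'rb') as fp:
    with wave.open(fp) as reader:
        n_frames = reader.getnframes()

print(n_frames, fp.closed)  # 80 True
```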

### Fix add not performing deep copy (#582)

Concatenation operations in DocumentArray used to operate on objects in-place, without making a copy.
This resulted in the following unexpected behavior:

```python
from docarray import DocumentArray

da1 = DocumentArray.empty(3)
da2 = DocumentArray.empty(4)
da3 = DocumentArray.empty(5)
print(da1 + da2 + da3)

da1 += da2
print('length =', len(da1))  # expected length = 7 but prints length = 16
```

This release fixes the bug. Concatenation with `+` now returns a new, copied object each time instead of modifying the
left operand in place.
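
The difference can be reproduced with a toy container. The classes below are hypothetical and only mimic the behavior described above; they are not DocArray code, and they copy the item list rather than deep-copying each item.

```python
class BuggyBag:
    """Toy container whose `+` mutates the left operand."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)

    def __iadd__(self, other):
        # In-place extension is the expected behavior for `+=`.
        self.items.extend(other.items)
        return self

    def __add__(self, other):
        # Bug: extends and returns `self`, so `a + b` silently grows `a`.
        self.items.extend(other.items)
        return self


class FixedBag(BuggyBag):
    """Same container, but `+` builds a new object."""

    def __add__(self, other):
        result = type(self)(self.items)  # copies the item list
        result.items.extend(other.items)
        return result


def length_after_concat(cls):
    a, b, c = cls(range(3)), cls(range(4)), cls(range(5))
    _ = a + b + c  # should leave `a` untouched
    a += b
    return len(a)


print(length_after_concat(BuggyBag))  # 16
print(length_after_concat(FixedBag))  # 7
```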

### Fix loading from a database with subindices (#581)

Prior to this release, reloading a DocumentArray configured with subindices from a database produced a
`unique ID existing` error (the actual error message depends on the backend). This happened because `DocumentArray`
attempted to index the initial documents in the subindex a second time, although they had already been indexed.

This release fixes the issue.

### Remove check of default value in _non_empty_fields (#565)

Serializing a Document used to ignore scores with value `0.0`. For instance, the string representation of a Document
could treat a score of 0 as an empty field and leave it out. This release fixes the issue.
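
The general class of bug can be sketched as follows. This note does not show the actual DocArray code, so everything here is an assumption made for illustration: `ToyScore`, the `DEFAULTS` table, and both helper functions are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Optional, Tuple

# Hypothetical per-field defaults used by the over-eager check.
DEFAULTS = {'value': 0.0, 'op_name': ''}


@dataclass
class ToyScore:
    value: Optional[float] = None
    op_name: Optional[str] = None


def non_empty_with_default_check(obj) -> Tuple[str, ...]:
    # Treats a field as empty when it is unset OR equal to its default,
    # so an explicitly set score of 0.0 is dropped.
    return tuple(
        f.name
        for f in fields(obj)
        if getattr(obj, f.name) is not None
        and getattr(obj, f.name) != DEFAULTS[f.name]
    )


def non_empty(obj) -> Tuple[str, ...]:
    # Treats a field as empty only when it was never set.
    return tuple(f.name for f in fields(obj) if getattr(obj, f.name) is not None)


score = ToyScore(value=0.0)
print(non_empty_with_default_check(score))  # ()
print(non_empty(score))  # ('value',)
```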

## 📗 Documentation Improvements

* Highlight the importance of using the context manager when it comes to fully persisting data in a database. Read more in [Persistence, mutations and context manager](https://docarray.jina.ai/advanced/document-store/#persistence-mutations-and-context-manager). (#613)
* Fix a mention of the `convert_uri_to_datauri()` method in the documentation. (#608)
* Fix the documentation build stage so that the API reference section appears correctly for Document Stores. Now you can find the API reference for Document Stores in [this section](https://docarray.jina.ai/api/#docarray-array-document-stores). (#594)
* Fix the docstring of the `set_image_normalization()` method so that it mentions proper usage and aligns with PyTorch
conventions. (#585)
* Clarify that the Query Language syntax of filter queries in DocArray depends on the Document Store used with the
DocumentArray instance. (#586)
* Introduce a few improvements to the README example, so that the user takes into consideration the dataset size and
requirements. (#577)
* Fix an example of plotting embeddings in the README. (#576)
* Introduce a few grammatical improvements to the [What is DocArray](https://docarray.jina.ai/get-started/what-is/) section. (#566)

## 💥 Backwards incompatible API changes

### Increased minimum versions for dependencies:

| Package  | Minimum Version  |
|---|---|
| `qdrant`  | `0.8.0`  |

The Qdrant backend in DocArray now requires Qdrant database v0.8.0 or higher.

### Other API Changes:

* The return type of `DocumentArray.evaluate()` changed from a single score float to a dict mapping score names to score values.
* Fully persisting data in a `DocumentArray` that uses a storage backend is now only guaranteed when you use the context manager. Wrap your write operations to a `DocumentArray` in a context manager like so:

```python
my_da = DocumentArray(storage='my_storage', config=...)
with my_da:
    ...  # write operations
```

Alternatively, you can call the `sync()` method when you finish write operations:

```python
my_da = DocumentArray(storage='my_storage', config=...)
...  # write operations
my_da.sync()
```

### Future API Changes:

* The MongoDB-like query language syntax for filtering in the Redis backend is deprecated and will be removed soon.
* The `metric` and `metric_name` parameters of the `DocumentArray.evaluate()` method are renamed to `metrics` and `metric_names`, and they now accept a list rather than a single value (as mentioned above). The old names and types are deprecated and will be removed soon.


## 🤟 Contributors

We would like to thank all contributors to this release:

* Jie Fu (@jemmyshin)
* Wang Bo (@bwanglzu)
* Jonathan Rowley (@jonathan-rowley)
* Alex Cureton-Griffiths (@alexcg1)
* Han Xiao (@hanxiao)
* AlaeddineAbdessalem (@alaeddine-13)
* Michael Günther (@guenthermi)
* samsja (@samsja)
* Johannes Messner (@JohannesMessner)
* dong xiang (@dongxiang123)
* Jackmin801 (@Jackmin801)
* Joan Fontanals (@JoanFM)

