Msudhir/add vector update functionality#14
Conversation
Signed-off-by: Danny C <[email protected]>
update Catalog argument in athena connector Signed-off-by: Gyumin Lee <[email protected]> Co-authored-by: Gyumin Lee <[email protected]>
Signed-off-by: Danny C <[email protected]>
ensure correct precedence with the two operators Signed-off-by: Ben Fletcher <[email protected]>
Signed-off-by: Jiwon Park <[email protected]>
…east-dev#3677) Bumps [tough-cookie](https://github.com/salesforce/tough-cookie) from 4.0.0 to 4.1.3. - [Release notes](https://github.com/salesforce/tough-cookie/releases) - [Changelog](https://github.com/salesforce/tough-cookie/blob/master/CHANGELOG.md) - [Commits](salesforce/tough-cookie@v4.0.0...v4.1.3) --- updated-dependencies: - dependency-name: tough-cookie dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [tough-cookie](https://github.com/salesforce/tough-cookie) from 4.0.0 to 4.1.3. - [Release notes](https://github.com/salesforce/tough-cookie/releases) - [Changelog](https://github.com/salesforce/tough-cookie/blob/master/CHANGELOG.md) - [Commits](salesforce/tough-cookie@v4.0.0...v4.1.3) --- updated-dependencies: - dependency-name: tough-cookie dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…east-dev#3630) * sql.py data_sources.data_source_name String(255) Extend the limit of the data_source_name field from 50 to 255. Signed-off-by: Ross Donnachie <[email protected]>
…east-dev#3680) feat: Optimize bytes processed when retrieving entity df schema to 0 Signed-off-by: Hai Nguyen <[email protected]>
…tore.plan() on python (feast-dev#3640) * fix! KeyError: __dummy on entityless fv Signed-off-by: williamfoschiera <[email protected]> * fix! join_keys typing. Signed-off-by: williamfoschiera <[email protected]> --------- Signed-off-by: williamfoschiera <[email protected]> Co-authored-by: williamfoschiera <[email protected]>
Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 7.1.1 to 7.2.4. - [Release notes](https://github.com/protobufjs/protobuf.js/releases) - [Changelog](https://github.com/protobufjs/protobuf.js/blob/master/CHANGELOG.md) - [Commits](protobufjs/protobuf.js@protobufjs-v7.1.1...protobufjs-v7.2.4) --- updated-dependencies: - dependency-name: protobufjs dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…east-dev#3675) Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 7.1.2 to 7.2.4. - [Release notes](https://github.com/protobufjs/protobuf.js/releases) - [Changelog](https://github.com/protobufjs/protobuf.js/blob/master/CHANGELOG.md) - [Commits](protobufjs/protobuf.js@protobufjs-v7.1.2...protobufjs-v7.2.4) --- updated-dependencies: - dependency-name: protobufjs dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [semver](https://github.com/npm/node-semver) from 6.3.0 to 6.3.1. - [Release notes](https://github.com/npm/node-semver/releases) - [Changelog](https://github.com/npm/node-semver/blob/v6.3.1/CHANGELOG.md) - [Commits](npm/node-semver@v6.3.0...v6.3.1) --- updated-dependencies: - dependency-name: semver dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…-dev#3679) Bumps [semver](https://github.com/npm/node-semver) from 6.3.0 to 6.3.1. - [Release notes](https://github.com/npm/node-semver/releases) - [Changelog](https://github.com/npm/node-semver/blob/v6.3.1/CHANGELOG.md) - [Commits](npm/node-semver@v6.3.0...v6.3.1) --- updated-dependencies: - dependency-name: semver dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
) Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.47.0 to 1.53.0. - [Release notes](https://github.com/grpc/grpc-go/releases) - [Commits](grpc/grpc-go@v1.47.0...v1.53.0) --- updated-dependencies: - dependency-name: google.golang.org/grpc dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# [0.32.0](feast-dev/feast@v0.31.0...v0.32.0) (2023-07-17) ### Bug Fixes * Added generic Feature store Creation for CLI ([feast-dev#3618](feast-dev#3618)) ([bf740d2](feast-dev@bf740d2)) * Broken non-root path with projects-list.json ([feast-dev#3665](feast-dev#3665)) ([4861af0](feast-dev@4861af0)) * Clean up snowflake to_spark_df() ([feast-dev#3607](feast-dev#3607)) ([e8e643e](feast-dev@e8e643e)) * Entityless fv breaks with `KeyError: __dummy` applying feature_store.plan() on python ([feast-dev#3640](feast-dev#3640)) ([ef4ef32](feast-dev@ef4ef32)) * Fix scan datasize to 0 for inference schema ([feast-dev#3628](feast-dev#3628)) ([c3dd74e](feast-dev@c3dd74e)) * Fix timestamp consistency in push api ([feast-dev#3614](feast-dev#3614)) ([9b227d7](feast-dev@9b227d7)) * For SQL registry, increase max data_source_name length to 255 ([feast-dev#3630](feast-dev#3630)) ([478caec](feast-dev@478caec)) * Implements connection pool for postgres online store ([feast-dev#3633](feast-dev#3633)) ([059509a](feast-dev@059509a)) * Manage redis pipe's context ([feast-dev#3655](feast-dev#3655)) ([48e0971](feast-dev@48e0971)) * Missing Catalog argument in athena connector ([feast-dev#3661](feast-dev#3661)) ([f6d3caf](feast-dev@f6d3caf)) * Optimize bytes processed when retrieving entity df schema to 0 ([feast-dev#3680](feast-dev#3680)) ([1c01035](feast-dev@1c01035)) ### Features * Add gunicorn for serve with multiprocess ([feast-dev#3636](feast-dev#3636)) ([4de7faf](feast-dev@4de7faf)) * Use string as a substitute for unregistered types during schema inference ([feast-dev#3646](feast-dev#3646)) ([c474ccd](feast-dev@c474ccd))
* Add fully-qualified-table-name Redshift prop Signed-off-by: Robin Neufeld <[email protected]> * pre-commit Signed-off-by: Robin Neufeld <[email protected]> * Docstring Signed-off-by: Robin Neufeld <[email protected]> * Test fully_qualified_table_name Signed-off-by: Robin Neufeld <[email protected]> * Simplify logic Signed-off-by: Robin Neufeld <[email protected]> * pre-commit Signed-off-by: Robin Neufeld <[email protected]> * pre-commit Signed-off-by: Robin Neufeld <[email protected]> * Test offline_write_batch Signed-off-by: Robin Neufeld <[email protected]> * Bump to trigger CI Signed-off-by: Robin Neufeld <[email protected]> * another bump for ci Signed-off-by: Robin Neufeld <[email protected]> --------- Signed-off-by: Robin Neufeld <[email protected]>
…SA role (feast-dev#3696) Add aws-sts dependency in java sdk Signed-off-by: harmeet-singh-discovery <[email protected]>
| collection_available = utility.has_collection(table_to_keep.name) | ||
| try: | ||
| if collection_available: | ||
| logger.error(f"Collection {table_to_keep.name} already exists.") |
There was a problem hiding this comment.
please make this log statement on "info" level. If a collection exists it is not an error and is expected.
|
|
||
|
|
||
| @dataclass | ||
| class MockFeatureView: |
There was a problem hiding this comment.
is it really necessary to use a mock? Can you use a VectorFeatureView instead?
|
|
||
| # Assert that logger.info was called with expected messages | ||
| expected_log_calls = [ | ||
| mocker.call("Closing the connection to Milvus"), |
There was a problem hiding this comment.
verification through logs is not a good practice in unit tests. Look into methods of "connections" like get_connection_addr() https://github.com/milvus-io/pymilvus/blob/master/pymilvus/orm/connections.py#L396 to use for the assert statements
| assert len(utility.list_collections()) == 1 | ||
| assert utility.has_collection(self.collection_to_write) is True | ||
| assert ( | ||
| f"Collection {self.collection_to_write} has been created successfully." |
There was a problem hiding this comment.
no need to assert a specific log statement here
| partial=None, | ||
| ) | ||
|
|
||
| assert f"Collection {self.collection_to_write} already exists." in caplog.text |
There was a problem hiding this comment.
assert with list_collections() == 1 instead
| ) | ||
|
|
||
| online_store_creator.teardown() | ||
| MilvusOnlineStore().update( |
There was a problem hiding this comment.
using the milvus online store here makes this test dependent on update() working as expected. Instead creating a collection via the pymilvus sdk directly would be better here to simulate that a collection already exists.
| if collection_available: | ||
| logger.error(f"Collection {table_to_keep.name} already exists.") | ||
| else: | ||
| Collection(name=table_to_keep.name, schema=table_to_keep.schema) |
There was a problem hiding this comment.
just catching this. FeatureView schemas are not the same as the schema that Milvus expects. We will need to write logic to translate that.
| ] | ||
| ) | ||
|
|
||
| MilvusOnlineStore().update( |
There was a problem hiding this comment.
here I would also rather create the collection first through the pymilvus sdk so that we do not introduce tests that are interdependent.
| assert utility.has_collection(self.collection_to_delete) is False | ||
| assert len(utility.list_collections()) == 0 | ||
| assert ( | ||
| f"Collection {self.collection_to_delete} has been deleted successfully." |
There was a problem hiding this comment.
do not assert log-statements. If the wording changes one has to go and update the tests.
| ) | ||
|
|
||
| assert ( | ||
| f"Collection {self.collection_to_delete} does not exist or is already deleted." |
There was a problem hiding this comment.
check with assert connections.has_connection(self.collection_to_delete) instead
…vus readable schema
| f"Collection {table_to_delete.name} has been deleted successfully." | ||
| ) | ||
| else: | ||
| return logger.error( |
There was a problem hiding this comment.
remove return statement
| field_name = field.name | ||
| description = field.tags.get("description") | ||
| is_primary = boolean_mapping_from_string.get(field.tags.get("is_primary")) | ||
| dimension = field.tags.get("dimension") |
There was a problem hiding this comment.
dimensions have to be retrieved from VectorFeatureView. Also, the name of the column holding the vector is defined separately in VectorFeatureView ("vector_field")
| is_primary = boolean_mapping_from_string.get(field.tags.get("is_primary")) | ||
| dimension = field.tags.get("dimension") | ||
|
|
||
| if dimension is not None: |
There was a problem hiding this comment.
why is this needed?
|
|
||
| # inheriting from FeatureView wouldn't work due to issue with conflicting proto classes | ||
| # therefore using composition instead | ||
| name: str |
There was a problem hiding this comment.
feature_view already has an attribute name
There was a problem hiding this comment.
It did not reflect and threw an error when I tried to add name
|
|
||
| def setup_method(self, milvus_online_setup): | ||
| # Ensuring that the collections created are dropped before the tests are run | ||
| with PymilvusConnectionContext(): |
There was a problem hiding this comment.
it's ok to re-use MilvusConnectionManager in order to not duplicate code
There was a problem hiding this comment.
I want to keep the test not dependent on any functionality in Milvus online store. This context manager is a very simple open and close connection, purely for testing purposes. Will not add in username, password, etc.
| tags={ | ||
| "is_primary": "False", | ||
| "description": "float32", | ||
| "dimension": "128", |
There was a problem hiding this comment.
thinking about it this is actually a great suggestion to use a tag to pass along the number of dimensions. I think we can do likewise with index algorithm and eliminate the need for a VectorFeatureView altogether. What do you think?
There was a problem hiding this comment.
Yes, thats a good idea
| ) | ||
|
|
||
| # Here we want to open and add a collection using pymilvus directly and close the connection, we need to check if the collection count remains 1 and exists. | ||
| with PymilvusConnectionContext(): |
There was a problem hiding this comment.
does the embedded Milvus actually persist data after it is stopped?
| partial=None, | ||
| ) | ||
|
|
||
| assert "Collection abc does not exist or is already deleted." in caplog.text |
There was a problem hiding this comment.
minor: I'd rather assert against Milvus not having any collection. The update() method should really have a return value, but that is an issue in Feast itself
Signed-off-by: Francisco Javier Arceo <[email protected]>
# [0.49.0](feast-dev/feast@v0.48.0...v0.49.0) (2025-04-29) ### Bug Fixes * Adding brackets to unit tests ([c46fea3](feast-dev@c46fea3)) * Adding logic back for a step ([2bb240b](feast-dev@2bb240b)) * Adjustment for unit test action ([a6f78ae](feast-dev@a6f78ae)) * Allow get_historical_features with only On Demand Feature View ([feast-dev#5256](feast-dev#5256)) ([0752795](feast-dev@0752795)) * CI adjustment ([3850643](feast-dev@3850643)) * Embed Query configuration breaks when switching between DataFrame and SQL ([feast-dev#5257](feast-dev#5257)) ([32375a5](feast-dev@32375a5)) * Fix for proto issue in utils ([1b291b2](feast-dev@1b291b2)) * Fix milvus online_read ([feast-dev#5233](feast-dev#5233)) ([4b91f26](feast-dev@4b91f26)) * Fix tests ([431d9b8](feast-dev@431d9b8)) * Fixed Permissions object parameter in example ([feast-dev#5259](feast-dev#5259)) ([045c100](feast-dev@045c100)) * Java CI [#12](feast-dev#12) ([d7e44ac](feast-dev@d7e44ac)) * Java PR [#15](feast-dev#15) ([a5da3bb](feast-dev@a5da3bb)) * Java PR [#16](feast-dev#16) ([e0320fe](feast-dev@e0320fe)) * Java PR [#17](feast-dev#17) ([49da810](feast-dev@49da810)) * Materialization logs ([feast-dev#5243](feast-dev#5243)) ([4aa2f49](feast-dev@4aa2f49)) * Moving to custom github action for checking skip tests ([caf312e](feast-dev@caf312e)) * Operator - remove default replicas setting from Feast Deployment ([feast-dev#5294](feast-dev#5294)) ([e416d01](feast-dev@e416d01)) * Patch java pr [#14](feast-dev#14) ([592526c](feast-dev@592526c)) * Patch update for test ([a3e8967](feast-dev@a3e8967)) * Remove conditional from steps ([995307f](feast-dev@995307f)) * Remove misleading HTTP prefix from gRPC endpoints in logs and doc ([feast-dev#5280](feast-dev#5280)) ([0ee3a1e](feast-dev@0ee3a1e)) * removing id ([268ade2](feast-dev@268ade2)) * Renaming workflow file ([5f46279](feast-dev@5f46279)) * Resolve `no pq wrapper` import issue ([feast-dev#5240](feast-dev#5240)) ([d5906f1](feast-dev@d5906f1)) * Update actions to remove check skip tests ([feast-dev#5275](feast-dev#5275)) ([b976f27](feast-dev@b976f27)) * Update docling demo ([446efea](feast-dev@446efea)) * Update java pr [#13](feast-dev#13) ([fda7db7](feast-dev@fda7db7)) * Update java_pr ([fa138f4](feast-dev@fa138f4)) * Update repo_config.py ([6a59815](feast-dev@6a59815)) * Update unit tests workflow ([06486a0](feast-dev@06486a0)) * Updated docs for docling demo ([768e6cc](feast-dev@768e6cc)) * Updating action for unit tests ([0996c28](feast-dev@0996c28)) * Updating github actions to filter at job level ([0a09622](feast-dev@0a09622)) * Updating Java CI ([c7c3a3c](feast-dev@c7c3a3c)) * Updating java pr to skip tests ([e997dd9](feast-dev@e997dd9)) * Updating workflows ([c66bcd2](feast-dev@c66bcd2)) ### Features * Add date_partition_column_format for spark source ([feast-dev#5273](feast-dev#5273)) ([7a61d6f](feast-dev@7a61d6f)) * Add Milvus tutorial with Feast integration ([feast-dev#5292](feast-dev#5292)) ([a1388a5](feast-dev@a1388a5)) * Add pgvector tutorial with PostgreSQL integration ([feast-dev#5290](feast-dev#5290)) ([bb1cbea](feast-dev@bb1cbea)) * Add ReactFlow visualization for Feast registry metadata ([feast-dev#5297](feast-dev#5297)) ([9768970](feast-dev@9768970)) * Add retrieve online documents v2 method into pgvector ([feast-dev#5253](feast-dev#5253)) ([6770ee6](feast-dev@6770ee6)) * Compute Engine Initial Implementation ([feast-dev#5223](feast-dev#5223)) ([64bdafd](feast-dev@64bdafd)) * Enable write node for compute engine ([feast-dev#5287](feast-dev#5287)) ([f9baf97](feast-dev@f9baf97)) * Local compute engine ([feast-dev#5278](feast-dev#5278)) ([8e06dfe](feast-dev@8e06dfe)) * Make transform on writes configurable for ingestion ([feast-dev#5283](feast-dev#5283)) ([ecad170](feast-dev@ecad170)) * Offline store update pull_all_from_table_or_query to make timestampfield optional ([feast-dev#5281](feast-dev#5281)) ([4b94608](feast-dev@4b94608)) * Serialization version 2 deprecation notice ([feast-dev#5248](feast-dev#5248)) ([327d99d](feast-dev@327d99d)) * Vector length definition moved to Feature View from Config ([feast-dev#5289](feast-dev#5289)) ([d8f1c97](feast-dev@d8f1c97))
What this PR does / why we need it:
Added updated functionality to MilvusOnlineStore. The update functionality is able to delete and create a collection.
Relevant tests added as well
Which issue(s) this PR fixes:
Fixes #