fix(postgres): Use end_date in synthetic entity_df for non-entity retrieval#6110
Merged
franciscojavierarceo merged 6 commits intofeast-dev:masterfrom Mar 17, 2026
Conversation
…_df for non-entity retrieval The non-entity retrieval path created a synthetic entity_df using pd.date_range(start=start_date, ...)[:1], which placed start_date as the event_timestamp. Since PIT joins use MAX(entity_timestamp) as the upper bound for feature data filtering, using start_date made end_date unreachable — no features after start_date would be returned. Fix: use [end_date] directly, matching the ClickHouse implementation (PR feast-dev#6066) and the Dask offline store behavior. Signed-off-by: yassinnouh21 <[email protected]>
The entity_df fix alone would cause min_event_timestamp to be computed as end_date - TTL (instead of start_date - TTL), clipping valid data from the query window. Override entity_df_event_timestamp_range to (start_date, end_date) in non-entity mode so the full range is used. Also fix ruff formatting in the test file. Signed-off-by: yassinnouh21 <[email protected]>
0f4df15 to
e82371d
Compare
Member
franciscojavierarceo
left a comment
There was a problem hiding this comment.
can you add an integration test for this so we can confirm the behavior?
3abfba5 to
efd3f9b
Compare
Signed-off-by: yassinnouh21 <[email protected]>
efd3f9b to
271825e
Compare
Anarion-zuo
pushed a commit
to Anarion-zuo/feast
that referenced
this pull request
Mar 17, 2026
…rieval (feast-dev#6110) * fix(postgres): Use end_date instead of start_date in synthetic entity_df for non-entity retrieval The non-entity retrieval path created a synthetic entity_df using pd.date_range(start=start_date, ...)[:1], which placed start_date as the event_timestamp. Since PIT joins use MAX(entity_timestamp) as the upper bound for feature data filtering, using start_date made end_date unreachable — no features after start_date would be returned. Fix: use [end_date] directly, matching the ClickHouse implementation (PR feast-dev#6066) and the Dask offline store behavior. Signed-off-by: yassinnouh21 <[email protected]> * fix: preserve timestamp range for min_event_timestamp and fix formatting The entity_df fix alone would cause min_event_timestamp to be computed as end_date - TTL (instead of start_date - TTL), clipping valid data from the query window. Override entity_df_event_timestamp_range to (start_date, end_date) in non-entity mode so the full range is used. Also fix ruff formatting in the test file. Signed-off-by: yassinnouh21 <[email protected]> * test: add integration test for non-entity retrieval Signed-off-by: yassinnouh21 <[email protected]> --------- Signed-off-by: yassinnouh21 <[email protected]> Co-authored-by: Francisco Javier Arceo <[email protected]> Signed-off-by: aaronzuo <[email protected]>
Shizoqua
pushed a commit
to Shizoqua/feast
that referenced
this pull request
Mar 18, 2026
…rieval (feast-dev#6110) * fix(postgres): Use end_date instead of start_date in synthetic entity_df for non-entity retrieval The non-entity retrieval path created a synthetic entity_df using pd.date_range(start=start_date, ...)[:1], which placed start_date as the event_timestamp. Since PIT joins use MAX(entity_timestamp) as the upper bound for feature data filtering, using start_date made end_date unreachable — no features after start_date would be returned. Fix: use [end_date] directly, matching the ClickHouse implementation (PR feast-dev#6066) and the Dask offline store behavior. Signed-off-by: yassinnouh21 <[email protected]> * fix: preserve timestamp range for min_event_timestamp and fix formatting The entity_df fix alone would cause min_event_timestamp to be computed as end_date - TTL (instead of start_date - TTL), clipping valid data from the query window. Override entity_df_event_timestamp_range to (start_date, end_date) in non-entity mode so the full range is used. Also fix ruff formatting in the test file. Signed-off-by: yassinnouh21 <[email protected]> * test: add integration test for non-entity retrieval Signed-off-by: yassinnouh21 <[email protected]> --------- Signed-off-by: yassinnouh21 <[email protected]> Co-authored-by: Francisco Javier Arceo <[email protected]> Signed-off-by: Shizoqua <[email protected]>
aniketpalu
pushed a commit
to aniketpalu/feast
that referenced
this pull request
Mar 23, 2026
…rieval (feast-dev#6110) * fix(postgres): Use end_date instead of start_date in synthetic entity_df for non-entity retrieval The non-entity retrieval path created a synthetic entity_df using pd.date_range(start=start_date, ...)[:1], which placed start_date as the event_timestamp. Since PIT joins use MAX(entity_timestamp) as the upper bound for feature data filtering, using start_date made end_date unreachable — no features after start_date would be returned. Fix: use [end_date] directly, matching the ClickHouse implementation (PR feast-dev#6066) and the Dask offline store behavior. Signed-off-by: yassinnouh21 <[email protected]> * fix: preserve timestamp range for min_event_timestamp and fix formatting The entity_df fix alone would cause min_event_timestamp to be computed as end_date - TTL (instead of start_date - TTL), clipping valid data from the query window. Override entity_df_event_timestamp_range to (start_date, end_date) in non-entity mode so the full range is used. Also fix ruff formatting in the test file. Signed-off-by: yassinnouh21 <[email protected]> * test: add integration test for non-entity retrieval Signed-off-by: yassinnouh21 <[email protected]> --------- Signed-off-by: yassinnouh21 <[email protected]> Co-authored-by: Francisco Javier Arceo <[email protected]> Signed-off-by: Aniket Paluskar <[email protected]>
yuan1j
pushed a commit
to yuan1j/feast
that referenced
this pull request
Apr 2, 2026
…rieval (feast-dev#6110) * fix(postgres): Use end_date instead of start_date in synthetic entity_df for non-entity retrieval The non-entity retrieval path created a synthetic entity_df using pd.date_range(start=start_date, ...)[:1], which placed start_date as the event_timestamp. Since PIT joins use MAX(entity_timestamp) as the upper bound for feature data filtering, using start_date made end_date unreachable — no features after start_date would be returned. Fix: use [end_date] directly, matching the ClickHouse implementation (PR feast-dev#6066) and the Dask offline store behavior. Signed-off-by: yassinnouh21 <[email protected]> * fix: preserve timestamp range for min_event_timestamp and fix formatting The entity_df fix alone would cause min_event_timestamp to be computed as end_date - TTL (instead of start_date - TTL), clipping valid data from the query window. Override entity_df_event_timestamp_range to (start_date, end_date) in non-entity mode so the full range is used. Also fix ruff formatting in the test file. Signed-off-by: yassinnouh21 <[email protected]> * test: add integration test for non-entity retrieval Signed-off-by: yassinnouh21 <[email protected]> --------- Signed-off-by: yassinnouh21 <[email protected]> Co-authored-by: Francisco Javier Arceo <[email protected]> Signed-off-by: yuanjun220 <[email protected]>
franciscojavierarceo
pushed a commit
that referenced
this pull request
Apr 7, 2026
# [0.61.0](v0.60.0...v0.61.0) (2026-04-07) ### Bug Fixes * Add grpcio dependency group to transformation server Dockerfile ([2c2150a](2c2150a)) * Add https readiness check for rest-registry tests ([ea85e63](ea85e63)) * Add website build check for PRs and fix blog frontmatter YAML error ([#6079](#6079)) ([30a3a43](30a3a43)) * Added missing jackc/pgx/v5 entries ([94ad0e7](94ad0e7)) * Added MLflow metric charts across feature selection ([#6080](#6080)) ([a403361](a403361)) * Check duplicate names for feature view across types ([#5999](#5999)) ([95b9af8](95b9af8)) * Fix integration tests ([#6046](#6046)) ([02d5548](02d5548)) * Fix missing error handling for resource_counts endpoint ([d9706ce](d9706ce)) * Fix non-specific label selector on metrics service ([a1a160d](a1a160d)) * fix path feature_definitions.py ([7d7df68](7d7df68)) * Fix regstry Rest API tests intermittent failure ([d53a339](d53a339)) * Fixed IntegrityError on SqlRegistry ([#6047](#6047)) ([325e148](325e148)) * Fixed intermittent failures in get_historical_features ([c335ec7](c335ec7)) * Fixed pre-commit check ([114b7db](114b7db)) * Fixed the intermittent FeatureViewNotFoundException ([661ecc7](661ecc7)) * Fixed uv cache permission error for docker build on mac ([ad807be](ad807be)) * Fixes a `PydanticDeprecatedSince20` warning for trino_offline_store ([#5991](#5991)) ([abfd18a](abfd18a)) * Handle existing RBAC role gracefully in namespace registry ([b46a62b](b46a62b)) * Ignore ipynb files during apply ([#6151](#6151)) ([4ea123d](4ea123d)) * Integration test failures ([#6040](#6040)) ([9165870](9165870)) * Mount TLS volumes for init container ([080a9b5](080a9b5)) * **postgres:** Use end_date in synthetic entity_df for non-entity retrieval ([#6110](#6110)) ([088a802](088a802)), closes [#6066](#6066) * Ray offline store tests are duplicated across 3 workflows ([54f705a](54f705a)) * Reenable tests ([#6036](#6036)) ([82ee7f8](82ee7f8)) * SSL/TLS mode by default for postgres connection ([4844488](4844488)) * Use commitlint pre-commit hook instead of a separate action ([35a81e7](35a81e7)) ### Features * Add Claude Code agent skills for Feast ([#6081](#6081)) ([1e5b60f](1e5b60f)), closes [#5976](#5976) [#6007](#6007) * Add complex type support (Map, JSON, Struct) with schema validation ([#5974](#5974)) ([1200dbf](1200dbf)) * Add decimal to supported feature types ([#6029](#6029)) ([#6226](#6226)) ([cff6fbf](cff6fbf)) * Add feast apply init container to automate registry population on pod start ([#6106](#6106)) ([6b31a43](6b31a43)) * Add feature view versioning support to PostgreSQL and MySQL online stores ([#6193](#6193)) ([940e0f0](940e0f0)), closes [#6168](#6168) [#6169](#6169) [#2728](#2728) * Add materialization, feature freshness, request latency, and push metrics to feature server ([2c6be18](2c6be18)) * Add metadata statistics to registry api ([ef1d4fc](ef1d4fc)) * Add non-entity retrieval support for ClickHouse offline store ([4d08ddc](4d08ddc)), closes [#5835](#5835) * Add OnlineStore for MongoDB ([#6025](#6025)) ([bf4e3fa](bf4e3fa)), closes [golang/go#74462](golang/go#74462) * Add Oracle DB as Offline store in python sdk & operator ([#6017](#6017)) ([9d35368](9d35368)) * Add RBAC aggregation labels to FeatureStore ClusterRoles ([daf77c6](daf77c6)) * Add ServiceMonitor auto-generation for Prometheus discovery ([#6126](#6126)) ([56e6d21](56e6d21)) * Add typed_features field to grpc write request (([#6117](#6117)) ([#6118](#6118)) ([eeaa6db](eeaa6db)), closes [#6116](#6116) * Add UUID and TIME_UUID as feature types ([#5885](#5885)) ([#5951](#5951)) ([5d6e311](5d6e311)) * Add version indicators to lineage graph nodes ([#6187](#6187)) ([73805d3](73805d3)) * Add version tracking to FeatureView ([#6101](#6101)) ([ed4a4f2](ed4a4f2)) * Added Agent skills for AI Agents ([#6007](#6007)) ([99008c8](99008c8)) * Added CodeQL SAST scanning and detect-secrets pre-commit hook ([547b516](547b516)) * Added odfv transformations metrics ([8b5a526](8b5a526)) * Adding optional name to Aggregation (feast-dev[#5994](#5994)) ([#6083](#6083)) ([56469f7](56469f7)) * Created DocEmbedder class ([#5973](#5973)) ([0719c06](0719c06)) * Extended OIDC support to extract groups & namespaces and token injection with multiple methods ([#6089](#6089)) ([7c04026](7c04026)) * Feature Server High-Availability on Kubernetes ([#6028](#6028)) ([9c07b4c](9c07b4c)), closes [Hi#Availability](https://github.com/Hi/issues/Availability) [Hi#Availability](https://github.com/Hi/issues/Availability) * **go:** Implement metrics and tracing for http and grpc servers ([#5925](#5925)) ([2b4ec9a](2b4ec9a)) * Horizontal scaling support to the Feast operator ([#6000](#6000)) ([3ec13e6](3ec13e6)) * Making feature view source optional (feast-dev[#6074](#6074)) ([#6075](#6075)) ([76917b7](76917b7)) * Replace ORJSONResponse with Pydantic response models for faster JSON serialization ([65cf03c](65cf03c)) * Support arm docker build ([#6061](#6061)) ([1e1f5d9](1e1f5d9)) * Support distinct count aggregation [[#6116](#6116)] ([3639570](3639570)) * Support HTTP in MCP ([#6109](#6109)) ([e72b983](e72b983)) * Support nested collection types (Array/Set of Array/Set) ([#5947](#5947)) ([#6132](#6132)) ([ab61642](ab61642)) * Support podAnnotations on Deployment pod template ([1b3cdc1](1b3cdc1)) * Use orjson for faster JSON serialization in feature server ([6f5203a](6f5203a)) * Utilize date partition column in BigQuery ([#6076](#6076)) ([4ea9b32](4ea9b32)) ### Performance Improvements * Online feature response construction in a single pass over read rows ([113fb04](113fb04)) * Optimize protobuf parsing in Redis online store ([#6023](#6023)) ([59dfdb8](59dfdb8)) * Optimize timestamp conversion in _convert_rows_to_protobuf ([33a2e95](33a2e95)) * Parallelize DynamoDB batch reads in sync online_read ([#6024](#6024)) ([9699944](9699944)) * Remove redundant entity key serialization in online_read ([d87283f](d87283f))
franciscojavierarceo
pushed a commit
that referenced
this pull request
Apr 8, 2026
# [0.62.0](v0.61.0...v0.62.0) (2026-04-08) ### Bug Fixes * Added missing jackc/pgx/v5 entries ([94ad0e7](94ad0e7)) * Fix missing error handling for resource_counts endpoint ([d9706ce](d9706ce)) * fix path feature_definitions.py ([7d7df68](7d7df68)) * Fix regstry Rest API tests intermittent failure ([d53a339](d53a339)) * Fixed intermittent failures in get_historical_features ([c335ec7](c335ec7)) * Fixed the intermittent FeatureViewNotFoundException ([661ecc7](661ecc7)) * Handle existing RBAC role gracefully in namespace registry ([b46a62b](b46a62b)) * Ignore ipynb files during apply ([#6151](#6151)) ([4ea123d](4ea123d)) * Mount TLS volumes for init container ([080a9b5](080a9b5)) * **postgres:** Use end_date in synthetic entity_df for non-entity retrieval ([#6110](#6110)) ([088a802](088a802)), closes [#6066](#6066) * SSL/TLS mode by default for postgres connection ([4844488](4844488)) * Sync v0.61-branch so v0.61.0 tag is reachable from master ([af66878](af66878)) ### Features * Add Claude Code agent skills for Feast ([#6081](#6081)) ([1e5b60f](1e5b60f)), closes [#5976](#5976) [#6007](#6007) * Add decimal to supported feature types ([#6029](#6029)) ([#6226](#6226)) ([cff6fbf](cff6fbf)) * Add feast apply init container to automate registry population on pod start ([#6106](#6106)) ([6b31a43](6b31a43)) * Add feature view versioning support to PostgreSQL and MySQL online stores ([#6193](#6193)) ([940e0f0](940e0f0)), closes [#6168](#6168) [#6169](#6169) [#2728](#2728) * Add metadata statistics to registry api ([ef1d4fc](ef1d4fc)) * Add Oracle DB as Offline store in python sdk & operator ([#6017](#6017)) ([9d35368](9d35368)) * Add RBAC aggregation labels to FeatureStore ClusterRoles ([daf77c6](daf77c6)) * Add ServiceMonitor auto-generation for Prometheus discovery ([#6126](#6126)) ([56e6d21](56e6d21)) * Add typed_features field to grpc write request (([#6117](#6117)) ([#6118](#6118)) ([eeaa6db](eeaa6db)), closes [#6116](#6116) * Add UUID and TIME_UUID as feature types ([#5885](#5885)) ([#5951](#5951)) ([5d6e311](5d6e311)) * Add version indicators to lineage graph nodes ([#6187](#6187)) ([73805d3](73805d3)) * Add version tracking to FeatureView ([#6101](#6101)) ([ed4a4f2](ed4a4f2)) * Added Agent skills for AI Agents ([#6007](#6007)) ([99008c8](99008c8)) * Added odfv transformations metrics ([8b5a526](8b5a526)) * Created DocEmbedder class ([#5973](#5973)) ([0719c06](0719c06)) * Extended OIDC support to extract groups & namespaces and token injection with multiple methods ([#6089](#6089)) ([7c04026](7c04026)) * Replace ORJSONResponse with Pydantic response models for faster JSON serialization ([65cf03c](65cf03c)) * Support distinct count aggregation [[#6116](#6116)] ([3639570](3639570)) * Support HTTP in MCP ([#6109](#6109)) ([e72b983](e72b983)) * Support nested collection types (Array/Set of Array/Set) ([#5947](#5947)) ([#6132](#6132)) ([ab61642](ab61642)) * Support podAnnotations on Deployment pod template ([1b3cdc1](1b3cdc1)) * Utilize date partition column in BigQuery ([#6076](#6076)) ([4ea9b32](4ea9b32)) ### Performance Improvements * Online feature response construction in a single pass over read rows ([113fb04](113fb04))
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
entity_df=None) created a synthetic entity_df usingpd.date_range(start=start_date, ...)[:1], which placedstart_dateas the event timestampMAX(entity_timestamp)as the upper bound, usingstart_datemadeend_dateunreachable — no features afterstart_datewould be returned[end_date]directly, matching the ClickHouse (feat: Add non-entity retrieval support for ClickHouse offline store #6066) and Dask implementationsTest plan
test_non_entity_entity_df_uses_end_datethat captures the synthetic entity_df and asserts its timestamp equalsend_dateTestNonEntityRetrievaltests passFixes the bug identified during review of #6066, referenced in #6057.