
fix(postgres): Use end_date in synthetic entity_df for non-entity retrieval#6110

Merged
franciscojavierarceo merged 6 commits into feast-dev:master from YassinNouh21:fix/postgres-entity-df-timestamp
Mar 17, 2026


@YassinNouh21 YassinNouh21 commented Mar 15, 2026

Summary

  • The non-entity retrieval path (entity_df=None) created a synthetic entity_df using pd.date_range(start=start_date, ...)[:1], which placed start_date as the event timestamp
  • Since PIT joins use MAX(entity_timestamp) as the upper bound, using start_date made end_date unreachable — no features after start_date would be returned
  • Fix: use [end_date] directly, matching the ClickHouse implementation (#6066, "feat: Add non-entity retrieval support for ClickHouse offline store") and the Dask implementation
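
To make the upper-bound problem concrete, here is a minimal sketch in plain Python rather than the Postgres query template. The helper `pit_filter`, the daily feature rows, and the dates are all assumptions made for illustration; only the rule "the latest entity timestamp is the upper bound" comes from the summary above.

```python
from datetime import datetime, timedelta, timezone


def pit_filter(feature_rows, entity_timestamps):
    """Keep feature rows at or before the latest entity timestamp (the PIT upper bound)."""
    upper_bound = max(entity_timestamps)
    return [row for row in feature_rows if row["event_timestamp"] <= upper_bound]


# Hypothetical query window.
start_date = datetime(2026, 1, 1, tzinfo=timezone.utc)
end_date = datetime(2026, 1, 8, tzinfo=timezone.utc)

# One hypothetical feature row per day from start_date through end_date.
feature_rows = [
    {"event_timestamp": start_date + timedelta(days=d), "value": d}
    for d in range(8)
]

# Synthetic entity row stamped with start_date (behaviour before the fix).
before_fix = pit_filter(feature_rows, [start_date])
# Synthetic entity row stamped with end_date (behaviour after the fix).
after_fix = pit_filter(feature_rows, [end_date])

print(len(before_fix), len(after_fix))
```

With start_date as the only entity timestamp, every row later than start_date falls above the bound and is dropped; stamping the row with end_date keeps the whole window.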

Test plan

  • Added regression test test_non_entity_entity_df_uses_end_date that captures the synthetic entity_df and asserts its timestamp equals end_date
  • All existing TestNonEntityRetrieval tests pass
  • No changes to the query template or other code paths
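
The "capture and assert" idea behind the regression test can be sketched roughly as follows. This is not the test from the PR: `retrieve_without_entities` and its `run_query` callback are hypothetical stand-ins for the non-entity path, and plain dictionaries stand in for the synthetic entity_df.

```python
from datetime import datetime, timezone
from unittest.mock import MagicMock


def retrieve_without_entities(start_date, end_date, run_query):
    """Hypothetical stand-in for the non-entity path: build one synthetic row and pass it on."""
    synthetic_rows = [{"event_timestamp": end_date}]
    return run_query(entity_rows=synthetic_rows, timestamp_range=(start_date, end_date))


def check_synthetic_rows_use_end_date():
    start = datetime(2026, 1, 1, tzinfo=timezone.utc)
    end = datetime(2026, 1, 8, tzinfo=timezone.utc)

    # The mock records the arguments it was called with, so the synthetic rows can be inspected.
    run_query = MagicMock(return_value=[])
    retrieve_without_entities(start, end, run_query)

    captured = run_query.call_args.kwargs
    assert [row["event_timestamp"] for row in captured["entity_rows"]] == [end]
    assert captured["timestamp_range"] == (start, end)
    return captured


captured = check_synthetic_rows_use_end_date()
```

Asserting on the captured arguments, rather than on query results, keeps a check like this independent of a running database.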

Fixes the bug identified during review of #6066, referenced in #6057.



@YassinNouh21 YassinNouh21 requested a review from a team as a code owner March 15, 2026 17:51
@YassinNouh21 YassinNouh21 self-assigned this Mar 15, 2026

…_df for non-entity retrieval

The non-entity retrieval path created a synthetic entity_df using
pd.date_range(start=start_date, ...)[:1], which placed start_date as
the event_timestamp. Since PIT joins use MAX(entity_timestamp) as the
upper bound for feature data filtering, using start_date made end_date
unreachable — no features after start_date would be returned.

Fix: use [end_date] directly, matching the ClickHouse implementation
(PR feast-dev#6066) and the Dask offline store behavior.

Signed-off-by: yassinnouh21 <[email protected]>
The entity_df fix alone would cause min_event_timestamp to be computed
as end_date - TTL (instead of start_date - TTL), clipping valid data
from the query window. Override entity_df_event_timestamp_range to
(start_date, end_date) in non-entity mode so the full range is used.

Also fix ruff formatting in the test file.

Signed-off-by: yassinnouh21 <[email protected]>
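
The lower-bound arithmetic in this second commit can be illustrated with a small sketch. The helper `min_event_timestamp`, the one-day TTL, and the dates are assumptions chosen for the example; the real computation lives in the offline store code and may differ in detail.

```python
from datetime import datetime, timedelta, timezone


def min_event_timestamp(timestamp_range, ttl):
    """Lower bound for feature rows: the earliest entity timestamp minus the TTL."""
    earliest, _latest = timestamp_range
    return earliest - ttl


# Hypothetical query window and TTL.
start_date = datetime(2026, 1, 1, tzinfo=timezone.utc)
end_date = datetime(2026, 1, 8, tzinfo=timezone.utc)
ttl = timedelta(days=1)

# Range inferred from a synthetic entity_df holding a single row at end_date.
inferred_range = (end_date, end_date)
# Range overridden to the full window in non-entity mode.
overridden_range = (start_date, end_date)

clipped = min_event_timestamp(inferred_range, ttl)
full = min_event_timestamp(overridden_range, ttl)

print(clipped.isoformat(), full.isoformat())
```

With the inferred range the lower bound sits at end_date - TTL, so most of the requested window is excluded; the overridden range restores start_date - TTL.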
@ntkathole ntkathole force-pushed the fix/postgres-entity-df-timestamp branch from 0f4df15 to e82371d Compare March 16, 2026 03:50

@franciscojavierarceo franciscojavierarceo left a comment

can you add an integration test for this so we can confirm the behavior?

@YassinNouh21 YassinNouh21 force-pushed the fix/postgres-entity-df-timestamp branch 2 times, most recently from 3abfba5 to efd3f9b Compare March 16, 2026 15:06
@YassinNouh21 YassinNouh21 force-pushed the fix/postgres-entity-df-timestamp branch from efd3f9b to 271825e Compare March 16, 2026 15:10

@franciscojavierarceo franciscojavierarceo left a comment

nice lgtm

@franciscojavierarceo franciscojavierarceo merged commit 088a802 into feast-dev:master Mar 17, 2026
26 checks passed
Anarion-zuo pushed a commit to Anarion-zuo/feast that referenced this pull request Mar 17, 2026
…rieval (feast-dev#6110)

* fix(postgres): Use end_date instead of start_date in synthetic entity_df for non-entity retrieval

The non-entity retrieval path created a synthetic entity_df using
pd.date_range(start=start_date, ...)[:1], which placed start_date as
the event_timestamp. Since PIT joins use MAX(entity_timestamp) as the
upper bound for feature data filtering, using start_date made end_date
unreachable — no features after start_date would be returned.

Fix: use [end_date] directly, matching the ClickHouse implementation
(PR feast-dev#6066) and the Dask offline store behavior.

Signed-off-by: yassinnouh21 <[email protected]>

* fix: preserve timestamp range for min_event_timestamp and fix formatting

The entity_df fix alone would cause min_event_timestamp to be computed
as end_date - TTL (instead of start_date - TTL), clipping valid data
from the query window. Override entity_df_event_timestamp_range to
(start_date, end_date) in non-entity mode so the full range is used.

Also fix ruff formatting in the test file.

Signed-off-by: yassinnouh21 <[email protected]>

* test: add integration test for non-entity retrieval

Signed-off-by: yassinnouh21 <[email protected]>

---------

Signed-off-by: yassinnouh21 <[email protected]>
Co-authored-by: Francisco Javier Arceo <[email protected]>
Signed-off-by: aaronzuo <[email protected]>
Shizoqua pushed a commit to Shizoqua/feast that referenced this pull request Mar 18, 2026
aniketpalu pushed a commit to aniketpalu/feast that referenced this pull request Mar 23, 2026
yuan1j pushed a commit to yuan1j/feast that referenced this pull request Apr 2, 2026
franciscojavierarceo pushed a commit that referenced this pull request Apr 7, 2026
# [0.61.0](v0.60.0...v0.61.0) (2026-04-07)

### Bug Fixes

* Add grpcio dependency group to transformation server Dockerfile ([2c2150a](2c2150a))
* Add https readiness check for rest-registry tests ([ea85e63](ea85e63))
* Add website build check for PRs and fix blog frontmatter YAML error ([#6079](#6079)) ([30a3a43](30a3a43))
* Added missing jackc/pgx/v5 entries ([94ad0e7](94ad0e7))
* Added MLflow metric charts across feature selection ([#6080](#6080)) ([a403361](a403361))
* Check duplicate names for feature view across types ([#5999](#5999)) ([95b9af8](95b9af8))
* Fix integration tests ([#6046](#6046)) ([02d5548](02d5548))
* Fix missing error handling for resource_counts endpoint ([d9706ce](d9706ce))
* Fix non-specific label selector on metrics service ([a1a160d](a1a160d))
* fix path feature_definitions.py ([7d7df68](7d7df68))
* Fix registry Rest API tests intermittent failure ([d53a339](d53a339))
* Fixed IntegrityError on SqlRegistry ([#6047](#6047)) ([325e148](325e148))
* Fixed intermittent failures in get_historical_features ([c335ec7](c335ec7))
* Fixed pre-commit check ([114b7db](114b7db))
* Fixed the intermittent FeatureViewNotFoundException ([661ecc7](661ecc7))
* Fixed uv cache permission error for docker build on mac ([ad807be](ad807be))
* Fixes a `PydanticDeprecatedSince20` warning for trino_offline_store ([#5991](#5991)) ([abfd18a](abfd18a))
* Handle existing RBAC role gracefully in namespace registry ([b46a62b](b46a62b))
* Ignore ipynb files during apply ([#6151](#6151)) ([4ea123d](4ea123d))
* Integration test failures ([#6040](#6040)) ([9165870](9165870))
* Mount TLS volumes for init container ([080a9b5](080a9b5))
* **postgres:** Use end_date in synthetic entity_df for non-entity retrieval ([#6110](#6110)) ([088a802](088a802)), closes [#6066](#6066)
* Ray offline store tests are duplicated across 3 workflows ([54f705a](54f705a))
* Reenable tests ([#6036](#6036)) ([82ee7f8](82ee7f8))
* SSL/TLS mode by default for postgres connection ([4844488](4844488))
* Use commitlint pre-commit hook instead of a separate action ([35a81e7](35a81e7))

### Features

* Add Claude Code agent skills for Feast ([#6081](#6081)) ([1e5b60f](1e5b60f)), closes [#5976](#5976) [#6007](#6007)
* Add complex type support (Map, JSON, Struct) with schema validation ([#5974](#5974)) ([1200dbf](1200dbf))
* Add decimal to supported feature types ([#6029](#6029)) ([#6226](#6226)) ([cff6fbf](cff6fbf))
* Add feast apply init container to automate registry population on pod start ([#6106](#6106)) ([6b31a43](6b31a43))
* Add feature view versioning support to PostgreSQL and MySQL online stores ([#6193](#6193)) ([940e0f0](940e0f0)), closes [#6168](#6168) [#6169](#6169) [#2728](#2728)
* Add materialization, feature freshness, request latency, and push metrics to feature server ([2c6be18](2c6be18))
* Add metadata statistics to registry api ([ef1d4fc](ef1d4fc))
* Add non-entity retrieval support for ClickHouse offline store ([4d08ddc](4d08ddc)), closes [#5835](#5835)
* Add OnlineStore for MongoDB ([#6025](#6025)) ([bf4e3fa](bf4e3fa)), closes [golang/go#74462](golang/go#74462)
* Add Oracle DB as Offline store in python sdk & operator ([#6017](#6017)) ([9d35368](9d35368))
* Add RBAC aggregation labels to FeatureStore ClusterRoles ([daf77c6](daf77c6))
* Add ServiceMonitor auto-generation for Prometheus discovery ([#6126](#6126)) ([56e6d21](56e6d21))
* Add typed_features field to grpc write request (([#6117](#6117)) ([#6118](#6118)) ([eeaa6db](eeaa6db)), closes [#6116](#6116)
* Add UUID and TIME_UUID as feature types ([#5885](#5885)) ([#5951](#5951)) ([5d6e311](5d6e311))
* Add version indicators to lineage graph nodes ([#6187](#6187)) ([73805d3](73805d3))
* Add version tracking to FeatureView ([#6101](#6101)) ([ed4a4f2](ed4a4f2))
* Added Agent skills for AI Agents ([#6007](#6007)) ([99008c8](99008c8))
* Added CodeQL SAST scanning and detect-secrets pre-commit hook ([547b516](547b516))
* Added odfv transformations metrics ([8b5a526](8b5a526))
* Adding optional name to Aggregation (feast-dev[#5994](#5994)) ([#6083](#6083)) ([56469f7](56469f7))
* Created DocEmbedder class ([#5973](#5973)) ([0719c06](0719c06))
* Extended OIDC support to extract groups & namespaces and token injection with multiple methods ([#6089](#6089)) ([7c04026](7c04026))
* Feature Server High-Availability on Kubernetes ([#6028](#6028)) ([9c07b4c](9c07b4c)), closes [Hi#Availability](https://github.com/Hi/issues/Availability) [Hi#Availability](https://github.com/Hi/issues/Availability)
* **go:** Implement metrics and tracing for http and grpc servers ([#5925](#5925)) ([2b4ec9a](2b4ec9a))
* Horizontal scaling support to the Feast operator ([#6000](#6000)) ([3ec13e6](3ec13e6))
* Making feature view source optional (feast-dev[#6074](#6074)) ([#6075](#6075)) ([76917b7](76917b7))
* Replace ORJSONResponse with Pydantic response models for faster JSON serialization ([65cf03c](65cf03c))
* Support arm docker build ([#6061](#6061)) ([1e1f5d9](1e1f5d9))
* Support distinct count aggregation [[#6116](#6116)] ([3639570](3639570))
* Support HTTP in MCP ([#6109](#6109)) ([e72b983](e72b983))
* Support nested collection types (Array/Set of Array/Set) ([#5947](#5947)) ([#6132](#6132)) ([ab61642](ab61642))
* Support podAnnotations on Deployment pod template ([1b3cdc1](1b3cdc1))
* Use orjson for faster JSON serialization in feature server ([6f5203a](6f5203a))
* Utilize date partition column in BigQuery ([#6076](#6076)) ([4ea9b32](4ea9b32))

### Performance Improvements

* Online feature response construction in a single pass over read rows ([113fb04](113fb04))
* Optimize protobuf parsing in Redis online store ([#6023](#6023)) ([59dfdb8](59dfdb8))
* Optimize timestamp conversion in _convert_rows_to_protobuf ([33a2e95](33a2e95))
* Parallelize DynamoDB batch reads in sync online_read ([#6024](#6024)) ([9699944](9699944))
* Remove redundant entity key serialization in online_read ([d87283f](d87283f))
franciscojavierarceo pushed a commit that referenced this pull request Apr 8, 2026
# [0.62.0](v0.61.0...v0.62.0) (2026-04-08)

### Bug Fixes

* Added missing jackc/pgx/v5 entries ([94ad0e7](94ad0e7))
* Fix missing error handling for resource_counts endpoint ([d9706ce](d9706ce))
* fix path feature_definitions.py ([7d7df68](7d7df68))
* Fix registry Rest API tests intermittent failure ([d53a339](d53a339))
* Fixed intermittent failures in get_historical_features ([c335ec7](c335ec7))
* Fixed the intermittent FeatureViewNotFoundException ([661ecc7](661ecc7))
* Handle existing RBAC role gracefully in namespace registry ([b46a62b](b46a62b))
* Ignore ipynb files during apply ([#6151](#6151)) ([4ea123d](4ea123d))
* Mount TLS volumes for init container ([080a9b5](080a9b5))
* **postgres:** Use end_date in synthetic entity_df for non-entity retrieval ([#6110](#6110)) ([088a802](088a802)), closes [#6066](#6066)
* SSL/TLS mode by default for postgres connection ([4844488](4844488))
* Sync v0.61-branch so v0.61.0 tag is reachable from master ([af66878](af66878))

### Features

* Add Claude Code agent skills for Feast ([#6081](#6081)) ([1e5b60f](1e5b60f)), closes [#5976](#5976) [#6007](#6007)
* Add decimal to supported feature types ([#6029](#6029)) ([#6226](#6226)) ([cff6fbf](cff6fbf))
* Add feast apply init container to automate registry population on pod start ([#6106](#6106)) ([6b31a43](6b31a43))
* Add feature view versioning support to PostgreSQL and MySQL online stores ([#6193](#6193)) ([940e0f0](940e0f0)), closes [#6168](#6168) [#6169](#6169) [#2728](#2728)
* Add metadata statistics to registry api ([ef1d4fc](ef1d4fc))
* Add Oracle DB as Offline store in python sdk & operator ([#6017](#6017)) ([9d35368](9d35368))
* Add RBAC aggregation labels to FeatureStore ClusterRoles ([daf77c6](daf77c6))
* Add ServiceMonitor auto-generation for Prometheus discovery ([#6126](#6126)) ([56e6d21](56e6d21))
* Add typed_features field to grpc write request (([#6117](#6117)) ([#6118](#6118)) ([eeaa6db](eeaa6db)), closes [#6116](#6116)
* Add UUID and TIME_UUID as feature types ([#5885](#5885)) ([#5951](#5951)) ([5d6e311](5d6e311))
* Add version indicators to lineage graph nodes ([#6187](#6187)) ([73805d3](73805d3))
* Add version tracking to FeatureView ([#6101](#6101)) ([ed4a4f2](ed4a4f2))
* Added Agent skills for AI Agents ([#6007](#6007)) ([99008c8](99008c8))
* Added odfv transformations metrics ([8b5a526](8b5a526))
* Created DocEmbedder class ([#5973](#5973)) ([0719c06](0719c06))
* Extended OIDC support to extract groups & namespaces and token injection with multiple methods ([#6089](#6089)) ([7c04026](7c04026))
* Replace ORJSONResponse with Pydantic response models for faster JSON serialization ([65cf03c](65cf03c))
* Support distinct count aggregation [[#6116](#6116)] ([3639570](3639570))
* Support HTTP in MCP ([#6109](#6109)) ([e72b983](e72b983))
* Support nested collection types (Array/Set of Array/Set) ([#5947](#5947)) ([#6132](#6132)) ([ab61642](ab61642))
* Support podAnnotations on Deployment pod template ([1b3cdc1](1b3cdc1))
* Utilize date partition column in BigQuery ([#6076](#6076)) ([4ea9b32](4ea9b32))

### Performance Improvements

* Online feature response construction in a single pass over read rows ([113fb04](113fb04))