RemoteOfflineStore does not support SQL string as entity_df in get_historical_features()
Expected Behavior
get_historical_features() should accept a SQL string as entity_df, as documented and supported by local offline stores (ClickHouse, PostgreSQL, BigQuery). The type signature in RemoteOfflineStore already declares Optional[Union[pd.DataFrame, str]].
entity_sql = f"""
SELECT driver_id, event_timestamp
FROM {store.get_data_source("driver_hourly_stats_source").get_table_query_string()}
WHERE event_timestamp BETWEEN '2021-01-01' and '2021-12-31'
"""
training_df = store.get_historical_features(
    entity_df=entity_sql,
    features=["driver_hourly_stats:conv_rate"],
).to_df()
Current Behavior
Passing a SQL string as entity_df to RemoteOfflineStore raises:
AttributeError: 'str' object has no attribute 'columns'
Two functions in feast/infra/offline_stores/remote.py assume entity_df is always a DataFrame:
- _create_retrieval_metadata() (line 456): calls _get_entity_schema(entity_df), which accesses entity_df.columns
- _put_parameters() (line 564): calls pa.Table.from_pandas(entity_df)
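The failure mode can be shown without Feast installed. The snippet below is only an illustration: get_entity_schema_sketch is a hypothetical stand-in for the schema helper, not the actual Feast function, and it assumes nothing beyond "the helper reads .columns from whatever it receives".

```python
def get_entity_schema_sketch(entity_df):
    # Hypothetical stand-in: like the real helper, it assumes a
    # DataFrame-like object that exposes a .columns attribute.
    return list(entity_df.columns)


entity_sql = "SELECT id, event_timestamp FROM my_table"

try:
    get_entity_schema_sketch(entity_sql)
    error_message = None
except AttributeError as exc:
    # A plain str has no .columns, so the attribute lookup fails.
    error_message = str(exc)

print(error_message)
```

Any code path that touches DataFrame attributes before checking isinstance(entity_df, str) will fail the same way.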
Steps to reproduce
- Deploy Feast with a remote offline store (Arrow Flight) backed by any store that supports SQL entity_df (ClickHouse, PostgreSQL, etc.)
- Run from the client:
from feast import FeatureStore
store = FeatureStore(config=config) # remote offline store
entity_sql = "SELECT id, event_timestamp FROM my_table WHERE event_timestamp > '2025-01-01'"
job = store.get_historical_features(entity_df=entity_sql, features=["my_fv:feature1"])
df = job.to_df() # raises AttributeError
Specifications
- Version: 0.61.0
- Platform: Linux / macOS
- Subsystem: feast.infra.offline_stores.remote (RemoteOfflineStore / Arrow Flight)
Possible Solution
Option A — pass SQL via api_parameters:
- Client (RemoteOfflineStore.get_historical_features): if entity_df is a string, put it into api_parameters["entity_df_sql"] and pass entity_df=None to RemoteRetrievalJob
- Server (OfflineServer.get_historical_features): if command contains entity_df_sql, forward it as entity_df to the local offline store
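A rough sketch of the Option A routing, written as standalone functions so it runs without Feast. The function names split_entity_df and resolve_entity_df are hypothetical, and the assumption that the server sees the command as a plain dict is mine; only the "entity_df_sql" key comes from the proposal above.

```python
from typing import Any, Dict, Optional, Tuple

SQL_KEY = "entity_df_sql"  # key name proposed in Option A


def split_entity_df(
    entity_df: Any, api_parameters: Dict[str, Any]
) -> Tuple[Optional[Any], Dict[str, Any]]:
    """Client side (hypothetical helper): move a SQL string into api_parameters."""
    params = dict(api_parameters)  # copy so the caller's dict is not mutated
    if isinstance(entity_df, str):
        params[SQL_KEY] = entity_df
        return None, params  # nothing left to upload as an Arrow table
    return entity_df, params


def resolve_entity_df(command: Dict[str, Any], uploaded_entity_df: Optional[Any]) -> Any:
    """Server side (hypothetical helper): prefer the SQL string if the client sent one."""
    sql = command.get(SQL_KEY)
    if sql is not None:
        return sql
    return uploaded_entity_df


# Round trip: the SQL string leaves the client in the parameters
# and comes back out on the server as the entity_df argument.
job_entity_df, params = split_entity_df("SELECT 1", {"features": ["my_fv:feature1"]})
server_entity_df = resolve_entity_df(params, uploaded_entity_df=None)
```

With this split, the DataFrame path is unchanged, and the retrieval job would still need to tolerate entity_df=None when building metadata.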
Option B — fix _create_retrieval_metadata and _put_parameters:
- _create_retrieval_metadata: return metadata with empty keys/timestamps when entity_df is a string
- _put_parameters: encode the SQL string in a transport-compatible format (e.g., Flight descriptor command metadata)
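A possible shape for Option B, again as a Feast-free sketch. RetrievalMetadataSketch is a hypothetical stand-in for the real metadata class, and JSON-encoding the command is an assumption about the transport format, not a description of what remote.py does today.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class RetrievalMetadataSketch:
    # Hypothetical stand-in for the retrieval metadata object.
    keys: List[str] = field(default_factory=list)
    min_event_timestamp: Optional[Any] = None
    max_event_timestamp: Optional[Any] = None


def metadata_for(entity_df: Any) -> RetrievalMetadataSketch:
    """Return empty keys/timestamps for SQL strings instead of reading .columns."""
    if isinstance(entity_df, str):
        return RetrievalMetadataSketch()
    keys = [c for c in entity_df.columns if c != "event_timestamp"]
    return RetrievalMetadataSketch(keys=keys)


def encode_command(api_parameters: Dict[str, Any], entity_df: Any) -> bytes:
    """Serialize the command; a SQL string travels inside it rather than as a table."""
    command = dict(api_parameters)
    if isinstance(entity_df, str):
        command["entity_df_sql"] = entity_df
    return json.dumps(command).encode("utf-8")


sql_metadata = metadata_for("SELECT id, event_timestamp FROM my_table")
command_bytes = encode_command({"api": "get_historical_features"}, "SELECT 1")
```

The trade-off is that metadata for SQL entity_df stays empty on the client, since the key columns and timestamp range are only known after the query runs on the server.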