feat: Add Oracle DB as Offline store in python sdk & operator #6017
ntkathole merged 40 commits into feast-dev:master
@aniketpalu Tested the Oracle DB configuration with the Feast operator and the Feast SDK, LGTM.
sdk/python/feast/infra/offline_stores/contrib/oracle_offline_store/oracle.py
def _read_data_source(data_source: DataSource, repo_path: str = "") -> Table:
    table = _read_oracle_table(con, data_source)
    return ibis.memtable(table.execute())
Reading the entire table into memory? Isn't there a better way, maybe reading a table filtered on timestamps?
Pretty sure the _build_data_source_reader_for_retrieval function is completely redundant; it just repackages _build_data_source_reader for no good reason.
Added a pre-filter to avoid reading the whole table. Thanks for the catch.
To be clear, I don't understand why we need _build_data_source_reader_for_retrieval at all. Can't you use _build_data_source_reader in the get_historical_features_ibis call? That way there will be no materialization.
Thanks for the clarification, I misunderstood your earlier comment. You are right, materialization can be avoided by using _build_data_source_reader.
After this fix, I needed a small change to how entity_row_id is built in ibis: r = ibis.literal("") in _generate_row_id() produces a broken entity_row_id because Oracle DB treats the empty string as NULL.
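The point of this thread (filter in the database instead of pulling the whole table into memory) can be illustrated with a small sketch. It deliberately uses sqlite3 from the Python standard library as a stand-in, not ibis or Oracle, and the table name, column names, and helper functions are hypothetical, chosen only for illustration. It is not the code from this PR.

```python
import sqlite3

# Hypothetical stand-in table; sqlite3 is used only so the sketch runs anywhere.
def make_demo_db() -> sqlite3.Connection:
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE driver_stats (driver_id INTEGER, event_ts TEXT, trips INTEGER)"
    )
    rows = [
        (1, "2024-01-01T00:00:00", 10),
        (1, "2024-02-01T00:00:00", 12),
        (2, "2024-03-01T00:00:00", 7),
        (2, "2024-04-01T00:00:00", 9),
    ]
    con.executemany("INSERT INTO driver_stats VALUES (?, ?, ?)", rows)
    return con

def read_all_then_filter(con: sqlite3.Connection, start: str, end: str) -> list:
    # Materializes every row on the client, then filters in Python.
    all_rows = con.execute(
        "SELECT driver_id, event_ts, trips FROM driver_stats ORDER BY event_ts"
    ).fetchall()
    return [row for row in all_rows if start <= row[1] <= end]

def read_prefiltered(con: sqlite3.Connection, start: str, end: str) -> list:
    # Pushes the timestamp filter into the query, so only matching rows
    # ever leave the database.
    return con.execute(
        "SELECT driver_id, event_ts, trips FROM driver_stats "
        "WHERE event_ts BETWEEN ? AND ? ORDER BY event_ts",
        (start, end),
    ).fetchall()

con = make_demo_db()
window = ("2024-02-01T00:00:00", "2024-03-31T23:59:59")
assert read_all_then_filter(con, *window) == read_prefiltered(con, *window)
```

Both functions return the same rows; the difference is how much data crosses the connection. A lazy expression that is handed straight to the query engine, as suggested in the review, avoids the client-side copy in the same way.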
Can we add documentation for the offline store as well?
@aniketpalu Needs a rebase.
Srihari1192 left a comment:
Tested the changes, LGTM.
Thanks @aniketpalu
ntkathole left a comment:
Looks good! Thanks @aniketpalu
…dev#6017)

* feat: Add Oracle DB as Offline store in python sdk & support in feast-operator
* Added oracle db dependency from ibis-framework subgroups
* Operator yaml changes
* Data source writer ignored parameters, fixed
* Replaced raw sql with dedicated truncate_table() to fix SQL Injection Vulnerability
* Minor improvements like single db connection, removal of default creds & validation for mutually exclusive fields
* Fetching pre-filtered table from db
* Minor formatting changes
* Added Oracle DB Offline Store documentation
* Resolved import error by removing OracleSource import from the __init__
* Fixed lint error by updating secret baseline
* fix: Exclude qdrant from docstring tests to avoid qdrant-client 1.17.0 EnumTypeWrapper incompatibility
* Generated secret.baseline to avoid lint error
* Fixed lint error
* Updated .secrets.baseline
* Fixed lint errors
* Fixed lint errors
* Update sdk/python/feast/type_map.py
* Updated dependency lock files
* Fixed lint issues in Trino Offline Store
* Updated requirements
* Updated pixi.lock file
* Restricted non-empty feature_views in get_historical_features() to avoid runtime errors
* Removed _build_data_source_reader_for_retrieval function
* Modified initial query to be _ to avoid empty string casting to Null value in Oracle DB
* cast DATE to TIMESTAMP in _read_oracle_table to preserve time lost by ibis date32[day] Arrow mapping
* Use single database connection for pull_latest_from_table_or_query()
* Improved readibility by breaking down the code into functions
* Updated .secret.baseline
* Updated .secret.baseline and pixi.lock
* Fixed lint issue

---------

Signed-off-by: Aniket Paluskar <[email protected]>
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Signed-off-by: Shizoqua <[email protected]>
What this PR does / why we need it:
Oracle DB Offline Store for Feast (ibis-based)
This PR adds a native Oracle Database offline store for Feast, built on the ibis-framework's Oracle backend (ibis.oracle). It follows the same proven pattern used by the existing MSSQL ibis offline store — delegating to the shared ibis.py functions for point-in-time joins, deduplication, and feature retrieval.
The implementation is ~300 lines total (vs 1096 lines in the SQLAlchemy-based approach), with no Jinja2 templates, no dialect-specific SQL workarounds, and no manual query generation. ibis generates Oracle SQL automatically and provides native Arrow transfer via to_pyarrow().
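For readers unfamiliar with the term, a point-in-time join picks, for each entity row, the most recent feature value whose timestamp is not later than the entity row's timestamp. The following is a minimal pure-Python sketch of that idea only. It is not the shared ibis.py implementation, it ignores TTL and deduplication, and the function name, column names, and sample data are hypothetical.

```python
from datetime import datetime

def point_in_time_join(entity_rows, feature_rows, key, ts, feature):
    """For each entity row, attach the latest feature value at or before its timestamp."""
    joined_rows = []
    for entity in entity_rows:
        best = None
        for candidate in feature_rows:
            if candidate[key] != entity[key]:
                continue  # different entity
            if candidate[ts] > entity[ts]:
                continue  # value from the future: using it would leak information
            if best is None or candidate[ts] > best[ts]:
                best = candidate
        joined = dict(entity)
        joined[feature] = best[feature] if best is not None else None
        joined_rows.append(joined)
    return joined_rows

# Hypothetical sample data.
features = [
    {"DRIVER_ID": 1, "EVENT_TS": datetime(2024, 1, 1), "TRIPS": 10},
    {"DRIVER_ID": 1, "EVENT_TS": datetime(2024, 2, 1), "TRIPS": 12},
]
entities = [
    {"DRIVER_ID": 1, "EVENT_TS": datetime(2024, 1, 15)},
    {"DRIVER_ID": 1, "EVENT_TS": datetime(2024, 3, 1)},
    {"DRIVER_ID": 2, "EVENT_TS": datetime(2024, 3, 1)},
]
result = point_in_time_join(entities, features, "DRIVER_ID", "EVENT_TS", "TRIPS")
```

In this sample the first entity row receives the January value, the second receives the February value, and the third receives None because no feature row exists for that driver.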
Oracle Column Name Handling
Oracle stores unquoted identifiers in UPPERCASE. For example, CREATE TABLE t (user_id INT) stores the column as USER_ID. This store uses a passthrough approach — column names are returned exactly as Oracle stores them, with no automatic case transformation.
This means:
- Unquoted columns (the standard Oracle convention): The user references them in UPPERCASE, matching what they see in Oracle tools like SQL Developer, DBeaver, or DESCRIBE table.
- Quoted columns (e.g., "CamelCase"): The user references them with the exact casing used when creating the table.
This design ensures predictable behavior — what you see in Oracle is what you use in Feast. No hidden transformations, no conventions to learn.
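The folding rule described above can be sketched as a small helper that predicts the name Oracle stores for a column written in DDL, which under the passthrough approach is also the name to reference in Feast. The helper name is hypothetical and is not part of Feast or ibis; it only models the unquoted-to-uppercase and quoted-as-written behavior.

```python
def oracle_stored_name(identifier: str) -> str:
    """Return the column name as Oracle stores it for an identifier written in DDL."""
    is_quoted = (
        len(identifier) >= 2
        and identifier.startswith('"')
        and identifier.endswith('"')
    )
    if is_quoted:
        # Quoted identifiers keep their exact casing; the quotes are not part of the name.
        return identifier[1:-1]
    # Unquoted identifiers are folded to uppercase.
    return identifier.upper()

print(oracle_stored_name("user_id"))      # USER_ID
print(oracle_stored_name('"CamelCase"'))  # CamelCase
```

So a table created with CREATE TABLE t (user_id INT) would be referenced with USER_ID in feature definitions, while a column created as "CamelCase" would be referenced as CamelCase.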
How to Use
feature_store.yaml
What's Tested
Tested against a live Oracle 23ai database (Oracle Free 23.26.1.0.0):
Core Feast Operations:
Non-Entity Retrieval (entity_df=None):
Oracle Column Name Handling:
Which issue(s) this PR fixes:
#6018
Misc