feat: Implement spark offline store offline_write_batch method #3076

Merged
feast-ci-bot merged 6 commits into feast-dev:master from niklasvm:add_offline_write_batch
Aug 18, 2022

@niklasvm
Collaborator

What this PR does / why we need it:

  • Create offline_write_batch method for spark offline store
  • Replace spark testing data sets with a file-based parquet format instead of a temporary view

This PR resolves a further set of failing integration tests.
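
For readers unfamiliar with what a batch write to an offline store involves, here is a minimal sketch of the general pattern: check that the incoming batch has the expected columns in the expected order, then append it to a file-backed dataset. It is not the code from this PR and does not use Feast or Spark APIs. The names `append_batch` and `read_all` and the column names are hypothetical, and CSV stands in for parquet so that the sketch needs only the standard library.

```python
import csv
import tempfile
from pathlib import Path
from typing import Dict, List, Sequence


def append_batch(
    path: Path, expected_columns: Sequence[str], rows: List[Dict[str, object]]
) -> int:
    """Append rows to a file-backed dataset (hypothetical helper, not a Feast API).

    Every row must have exactly the expected columns, in the expected order.
    Validation runs before anything is written, so a rejected batch leaves
    the file untouched.
    """
    expected = list(expected_columns)
    for row in rows:
        if list(row.keys()) != expected:
            raise ValueError(
                f"Column mismatch: expected {expected}, got {list(row.keys())}"
            )

    # Write the header only when the dataset file is being created.
    write_header = not path.exists()
    with path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=expected)
        if write_header:
            writer.writeheader()
        writer.writerows(rows)
    return len(rows)


def read_all(path: Path) -> List[Dict[str, str]]:
    """Read the whole dataset back as a list of dicts (values come back as strings)."""
    with path.open(newline="") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dataset = Path(tmp) / "driver_stats.csv"  # hypothetical dataset name
        columns = ["driver_id", "event_timestamp", "conv_rate"]
        append_batch(
            dataset,
            columns,
            [{"driver_id": 1001, "event_timestamp": "2022-08-12T00:00:00", "conv_rate": 0.5}],
        )
        append_batch(
            dataset,
            columns,
            [{"driver_id": 1002, "event_timestamp": "2022-08-12T01:00:00", "conv_rate": 0.7}],
        )
        print(len(read_all(dataset)))
```

Backing the dataset with a file rather than a session-scoped temporary view is what makes the append meaningful in this sketch: the written rows can be read back by any later reader of the same path.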

Which issue(s) this PR fixes:

None

@niklasvm niklasvm changed the title from "Add offline write batch" to "feat: add spark offline store offline_write_batch method" Aug 12, 2022
@niklasvm niklasvm changed the title from "feat: add spark offline store offline_write_batch method" to "feat: Implement spark offline store offline_write_batch method" Aug 12, 2022
@codecov-commenter

codecov-commenter commented Aug 12, 2022

Codecov Report

❌ Patch coverage is 16.27907% with 36 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.18%. Comparing base (2646a86) to head (3704f4a).
⚠️ Report is 1673 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ffline_stores/contrib/spark_offline_store/spark.py | 11.76% | 30 Missing ⚠️ |
| ...s/contrib/spark_offline_store/tests/data_source.py | 33.33% | 6 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3076      +/-   ##
==========================================
+ Coverage   67.07%   75.18%   +8.11%     
==========================================
  Files         173      207      +34     
  Lines       15124    17553    +2429     
==========================================
+ Hits        10144    13198    +3054     
+ Misses       4980     4355     -625     
| Flag | Coverage Δ |
|---|---|
| integrationtests | 66.93% <ø> (-0.14%) ⬇️ |
| unittests | 58.23% <16.27%> (?) |

@niklasvm niklasvm marked this pull request as ready for review August 12, 2022 15:57
@niklasvm
Collaborator Author

/ok-to-test

@feast-ci-bot
Collaborator

@niklasvm: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

Details

In response to this:

/ok-to-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@niklasvm
Collaborator Author

/assign @kevjumba

@achals
Member

achals commented Aug 17, 2022

@niklasvm can you rebase? I think we've fixed the lingering issues with the go unit tests.

@niklasvm niklasvm force-pushed the add_offline_write_batch branch from 5ed0cf0 to 3704f4a August 17, 2022 17:57
@niklasvm
Collaborator Author

> @niklasvm can you rebase? I think we've fixed the lingering issues with the go unit tests.

Thank you. I've rebased. Waiting for tests now

@niklasvm
Collaborator Author

@achals looks like tests ran successfully. There are still 2 pending ones. I am not sure if you need to approve before they run

Collaborator

@adchia adchia left a comment

/lgtm

@feast-ci-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adchia, niklasvm

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@feast-ci-bot feast-ci-bot merged commit 5b0cc87 into feast-dev:master Aug 18, 2022
kevjumba pushed a commit that referenced this pull request Aug 25, 2022
# [0.24.0](v0.23.0...v0.24.0) (2022-08-25)

### Bug Fixes

* Check if on_demand_feature_views is an empty list rather than None for snowflake provider ([#3046](#3046)) ([9b05e65](9b05e65))
* FeatureStore.apply applies BatchFeatureView correctly ([#3098](#3098)) ([41be511](41be511))
* Fix Feast Java inconsistency with int64 serialization vs python ([#3031](#3031)) ([4bba787](4bba787))
* Fix feature service inference logic ([#3089](#3089)) ([4310ed7](4310ed7))
* Fix field mapping logic during feature inference ([#3067](#3067)) ([cdfa761](cdfa761))
* Fix incorrect on demand feature view diffing and improve Java tests ([#3074](#3074)) ([0702310](0702310))
* Fix Java helm charts to work with refactored logic. Fix FTS image ([#3105](#3105)) ([2b493e0](2b493e0))
* Fix on demand feature view output in feast plan + Web UI crash ([#3057](#3057)) ([bfae6ac](bfae6ac))
* Fix release workflow to release 0.24.0 ([#3138](#3138)) ([a69aaae](a69aaae))
* Fix Spark offline store type conversion to arrow ([#3071](#3071)) ([b26566d](b26566d))
* Fixing Web UI, which fails for the SQL registry ([#3028](#3028)) ([64603b6](64603b6))
* Force Snowflake Session to Timezone UTC ([#3083](#3083)) ([9f221e6](9f221e6))
* Make infer dummy entity join key idempotent ([#3115](#3115)) ([1f5b1e0](1f5b1e0))
* More explicit error messages ([#2708](#2708)) ([e4d7afd](e4d7afd))
* Parse inline data sources ([#3036](#3036)) ([c7ba370](c7ba370))
* Prevent overwriting existing file during `persist` ([#3088](#3088)) ([69af21f](69af21f))
* Register BatchFeatureView in feature repos correctly ([#3092](#3092)) ([b8e39ea](b8e39ea))
* Return an empty infra object from sql registry when it doesn't exist ([#3022](#3022)) ([8ba87d1](8ba87d1))
* Teardown tables for Snowflake Materialization testing ([#3106](#3106)) ([0a0c974](0a0c974))
* UI error when saved dataset is present in registry. ([#3124](#3124)) ([83cf753](83cf753))
* Update sql.py ([#3096](#3096)) ([2646a86](2646a86))
* Updated snowflake template ([#3130](#3130)) ([f0594e1](f0594e1))

### Features

* Add authentication option for snowflake connector ([#3039](#3039)) ([74c75f1](74c75f1))
* Add Cassandra/AstraDB online store contribution ([#2873](#2873)) ([feb6cb8](feb6cb8))
* Add Snowflake materialization engine ([#2948](#2948)) ([f3b522b](f3b522b))
* Adding saved dataset capabilities for Postgres ([#3070](#3070)) ([d3253c3](d3253c3))
* Allow passing repo config path via flag ([#3077](#3077)) ([0d2d951](0d2d951))
* Contrib azure provider with synapse/mssql offline store and Azure registry store ([#3072](#3072)) ([9f7e557](9f7e557))
* Custom Docker image for Bytewax batch materialization ([#3099](#3099)) ([cdd1b07](cdd1b07))
* Feast AWS Athena offline store (again) ([#3044](#3044)) ([989ce08](989ce08))
* Implement spark offline store `offline_write_batch` method ([#3076](#3076)) ([5b0cc87](5b0cc87))
* Initial Bytewax materialization engine ([#2974](#2974)) ([55c61f9](55c61f9))
* Refactor feature server helm charts to allow passing feature_store.yaml in environment variables ([#3113](#3113)) ([85ee789](85ee789))
6 participants