Closed
Collaborator
Hi @windoze, we want to retrospectively follow https://github.com/linkedin/feathr/blob/main/docs/dev_guide/pull_request_guideline.md. Could you create a GitHub issue for your PR? Also, let's sync with @xiaoyongzhu first to align on the technical direction.
Member
Author
This PR addresses #102.
xiaoyongzhu
reviewed
Jun 11, 2022
arguments (str): all the arguments you want to pass into the spark job
job_tags (str): tags of the job, for example you might want to put your user ID, or a tag with certain information
configuration (Dict[str, str]): Additional configs for the spark job
properties (Dict[str, str]): Additional System Properties for the spark job
Member
What's this system property for?
Member
I'm not sure I understand the background of this PR, but from the associated link, it looks like this PR exposes the JDBC sources in the Python package. @windoze, maybe you can add a bit more description to make it clear for the reviewers?
Collaborator
Please add more details to the description, @windoze.
Merged
This is the Python part corresponding to #101.
In #101, I added Scala code to handle JDBC sources, where multiple sources may each need different auth credentials. The Python client still needs this update to let users (1) create a `JdbcSource` and (2) pass the required parameters to the Spark job.
This PR adds a new Spark job argument, `--system-properties`, which passes secrets from the Python client to the Spark job, since secrets should not be stored directly in the config files. The key of each entry is the data source name with a `_USER`/`_PASSWORD` or `_TOKEN` suffix, depending on the data source's auth type, and the value is taken from the environment variable with the same key on the Python client side.
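For reviewers, the key/value rule described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the helper names `collect_system_properties` and `build_job_arguments`, the auth type labels `"userpass"` and `"token"`, and the JSON encoding of the argument value are all assumptions made for the example.

```python
import json
import os
from typing import Dict, List, Mapping, Optional, Tuple

# Hypothetical auth type labels mapped to the key suffixes named in the description.
AUTH_SUFFIXES: Dict[str, Tuple[str, ...]] = {
    "userpass": ("_USER", "_PASSWORD"),
    "token": ("_TOKEN",),
}


def collect_system_properties(
    sources: List[Tuple[str, str]],
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Build {source name + suffix: secret} from environment variables.

    `sources` is a list of (data source name, auth type) pairs.
    """
    env = os.environ if environ is None else environ
    props: Dict[str, str] = {}
    for name, auth_type in sources:
        # Unknown auth types contribute no keys in this sketch.
        for suffix in AUTH_SUFFIXES.get(auth_type, ()):
            key = f"{name}{suffix}"
            if key not in env:
                raise KeyError(f"environment variable {key} is not set")
            props[key] = env[key]
    return props


def build_job_arguments(props: Mapping[str, str]) -> List[str]:
    # Assumption: the value is passed as a JSON object; the real encoding may differ.
    return ["--system-properties", json.dumps(dict(props), sort_keys=True)]
```

With this shape, the secrets are read from the client's environment only when the job is submitted, so they never need to appear in the config files.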