feat(py): A dbt adapter for Feldera (not a Feldera adapter, but a dbt adapter, that is wrapping Feldera)#5950
Conversation
…w count. TODO: Assert IVM by adding more data to source, perhaps via Kafka
|
I was going to review this but on first glance I don't understand it at all. Feldera adapters are written in Rust and go in the adapters crate. This is something else completely different written in Python and I don't see how it is a Feldera adapter at all. |
|
Hey @blp, This is not a Feldera adapter, this is a dbt adapter: dbt is a super popular CLI tool used by Data Engineers/Analysts. The idea of building a dbt adapter is to increase Feldera adoption amongst laymans. Here are some dbt adapters for other ETL/Database engines, they're all written in Python: https://github.com/dbt-labs/dbt-adapters I suppose whoever wrote the Python client for Feldera would be a more suitable reviewer, as this PR is using that client to bridge dbt to feldera. |
OK. I will remove myself from the reviewers list, because I am not the right person to look at this. |
|
@mdrakiburrahman Mihai should definitely look at this, but is traveling so might be slow to reply. In the meantime, tagging in @snkas who understands the python API well. |
|
@lalithsuresh - no rush! I still have a few TODOs to get this PR merge worthy, but I'm curious to get feedback on how I'm bridging the SQL to REST API |
|
also @abhizer |
…e FACT table UNION
|
@gz would you mind allowing CI to run on the latest commit? I was seeing a ruff lint failure in GH Action, hopefully this fixes it |
Ruff 0.9.10 (used by pre-commit CI) classifies dbt_common as third-party and dbt.adapters.* as first-party due to the project's setuptools namespace package config. This means dbt_common imports must appear in the third-party section (before the blank-line separator), not interleaved with the first-party dbt.adapters block. Co-authored-by: Copilot <[email protected]>
|
looks like it passed pre-merge queue. if you think it's in good shape I can clone the branch and trigger a run for the merge queue so that it doesnt actually merge it yet. but it may reveal some issues with CI scripts. let me know if you think it's ready to be tried. |
|
@gz let's do it! |
mihaibudiu
left a comment
There was a problem hiding this comment.
I am still on vacation, so I will review more when I get a chance.
I apologize for the slow process.
I will be out for one more week.
@mihaibudiu no problem at all 🙂 I appreciate you taking the time over the time off. I'm not in a rush to merge since there's minimal chances of merge conflicts |
|
@gz - 1 hour 18 mins, pretty fast post-merge queue! |
|
I am trying to find the dbt integration tests but I think they aren't wired into ci.yml, but looks like the unit tests worked 🚀 |
|
Feldera uses rebase-merge, not squash. Every commit in the PR lands individually on main — each with its own message, authored by you. You can verify this by looking at any recently-merged multi-commit PR (e.g. PR #5956 had two commits and both appear separately on main: fc3dd4d and 7d27972). So yes, the commit messages matter here. 30 commits with checkpoint notes, merge commits, and fixup messages will all become permanent git history. The fix is straightforward: References: |
|
CI passed — good. The commit history still needs the rebase you mentioned in step 3 of your plan. Still seeing checkpoint commits, merge commits, and 'Attempt to lint' scattered through it. Please squash/rebase to a clean logical sequence before I can lift CHANGES_REQUESTED. |
mythical-fred
left a comment
There was a problem hiding this comment.
Commit history still needs cleanup. Still seeing "Checkpoint:", "Fixed catalog gen", "Proofread", double "Update changelog", "Remove seeds from git in origin", and three merge-from-main commits. Please squash or rebase into a clean sequence before merge. Also, Mihai left a batch of inline questions on April 3 that appear unanswered — worth addressing those first.
|
If it's not obvious @mythical-fred is our LLM reviewer, he gets fixated on things sometimes :) |
|
Fair point, @gz — CI is green now. @mdrakiburrahman: the only remaining ask is the rebase. Once the history is clean I'll dismiss my REQUEST_CHANGES. |
|
@gz wow could have had me fooled 🤣 I thought it was a git connoisseur engineer @mythical-fred - calm down robot! I'll do my rebase after finishing the review with Mihai. Btw we solve all these problems in our repo by enforcing Squash merge. So the developer can push as many commits as they want in their PR branch to get CI green etc, but when it merges to
(Rebasing can sometimes be stressful unless you're used to it, I've messed up my working branches before but I'm personally not very good at |
The dev dependency used ruff>=0.6.9 which resolved to 0.15.8, while CI pre-commit uses ruff 0.9.10. These versions produce different isort import ordering for dbt namespace packages, causing run.sh fix to undo the CI-expected ordering. Pinning ensures local and CI ruff agree. Co-authored-by: Copilot <[email protected]>
…ueries and not ad-hoc
… by ad-hoc VS pipeline query endpoint
|
Hey @mihaibudiu, I think I got your suggestions in the latest batch of commits. No rush, whenever you're available, please take a look at the latest iteration. Btw, if you wanted to co-author this branch, please let me know! I can help get you setup locally (see CONTRIBUTING.md), it's fairly quick to make code changes and fire |

A new dbt adapter for Feldera.
Small demo
Notes
Not for this PR, but we need to open a PR in the dbt repo to get this adapter docs exposed in dbt's public docs, sample PR: Onboarding dbt-fabricspark adapter, adding documentation of the adapter dbt-labs/docs.getdbt.com#7219
Certain features like dbt
snapshotsare not yet supported, we need to think through how this can be done in Feldera. I noticed the confluent Flink adapter doesn't have it, so perhaps it's fine to start without these luxuriesIntegration tests are part of the PR but not running in CI, will come with the existing CI infra that @gz noted in this PR discussion, task here: [dbt] Add CI support with existing Kafka broker infra #5966
Describe Manual Test Plan
Includes a
dbt-adventureworksIntegration test that runs both PyTest and dbt CLI with bash. Uses Feldera in docker.There is a
.scripts/run.sh alltarget that runs all targets of the new Python project locally, see thedbt-feldera/README.md:Checklist
Breaking Changes?
Mark if you think the answer is yes for any of these components:
Describe Incompatible Changes
N/A
Issue
Closes #5794