For this organization's knowledge base, see the Data Developer & Engineer document.
Note
I will delegate the tools section of these docs to each project in the Data Dev & Eng Lab.
flowchart LR
1([ddeutil]) --> 2([ddeutil-io]) --> 3([ddeutil-workflow]) ---> 4
subgraph observe
4([ddeutil-observe])
6([ddeutil-observe<br>streamlit])
end
3 --> 6
0([fmtutil]) -.-> 2
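As a rough illustration of the dependency chain in the flowchart above, the sketch below encodes its edges as a plain mapping and derives one valid install order with Python's standard `graphlib` module. Treating the dotted `fmtutil` edge as a regular dependency is an assumption, and `DEPENDS_ON` and `build_order` are hypothetical names used only for this example, not part of any ddeutil package.

```python
from graphlib import TopologicalSorter

# Hypothetical mapping: each package maps to the packages it depends on.
# Edges are read from the flowchart; the dotted fmtutil edge is assumed
# to behave like a normal dependency for the purpose of ordering.
DEPENDS_ON = {
    "ddeutil": set(),
    "fmtutil": set(),
    "ddeutil-io": {"ddeutil", "fmtutil"},
    "ddeutil-workflow": {"ddeutil-io"},
    "ddeutil-observe": {"ddeutil-workflow"},
    "ddeutil-observe (streamlit)": {"ddeutil-workflow"},
}


def build_order(graph):
    """Return one ordering in which every package follows its dependencies."""
    return list(TopologicalSorter(graph).static_order())


order = build_order(DEPENDS_ON)
print(order)
```

Any ordering produced this way places the base packages before the workflow package and the workflow package before both observe applications.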
This organization aims to build a lightweight data orchestration framework for small to mid-sized data platform projects (around 10K workflows).
First, I will implement the base projects, Core (utility functions) and IO (input/output transport utility objects), as the first dependency packages. The main package needs a lot of base code, and I do not want to develop that code inside the main package. For example, it is not ideal to fix a bug in the merge key function, which is unrelated to workflows, inside the workflow package.
The main package of this organization's orchestration framework has 2 layers, which I split into separate projects so that each is an optional installation (you can use just one of these layers without raising an error).
- Workflow - Lightweight workflow orchestration in Python with fewer dependencies.
- Observe (FastAPI) - Lightweight observation application with FastAPI for the workflow package.
- Observe (Streamlit) - Lightweight observation application with Streamlit for the workflow package.
flowchart LR
1([ddeutil-workflow]) ---> 2([deflow])
3([ddeutil-io]) ---> 4([jett])
5([airflow]) ---> 6([dagtool])
- DeFlow - Lightweight Declarative Data Workflow Framework.
- Jett - Just a Template Engine Tool
- DagTool - Friendly Airflow DAG Build Tool
This organization has some mini-projects developed for specific use cases:
- data-orchestra - Full-stack data orchestration from YAML templates with Flask & HTMX
- load-routing - Routing application service deployed to an on-premise server with FastAPI
- Extensions - Additional practices for connecting to data sources through third-party APIs.
Warning
I have some deprecated third-party projects, Extensions, that collect additional practices for connecting to data sources through third-party packages such as polars and duckdb.
They provide dynamic data processing and transformation functions and objects built on external vendor packages.
They can plug in to the Workflow package at the hook stage.
Warning
The above projects have a lot of bugs and need time to be fixed and refactored, so you should not use them.