Software Engineer · Data Engineering · Cloud · AI/LLM Apps
Software Engineer with 4+ years of experience building scalable data products and full‑stack solutions across Snowflake, DBT, Airflow, AWS, and Spark, plus hands-on AI/LLM work (RAG, local models, LangGraph). I enjoy turning ambiguous data and platform problems into reliable, production-grade pipelines and developer-friendly apps.
- Current focus: DBT/Snowflake data products, Airflow orchestration, EMR ops, cost‑efficient LLM agents.
- Strengths: data modeling, pipeline performance, workflow reliability/SLA hygiene, pragmatic AI integrations.
- Open to: data engineering roles, platform/infra data work, privacy-first AI applications.
- Data/Cloud: Snowflake, DBT (Core & Cloud), Airflow (MWAA), Databricks, Hadoop, Spark, Hive, AWS (S3, EMR)
- Programming: Python, SQL, Shell, JavaScript/TypeScript, Java, C/C++
- Web: React, Node.js, Next.js, FastAPI, Laravel, WordPress
- Databases: Oracle, PostgreSQL, MySQL, MongoDB, ChromaDB
- AI/LLM: LangGraph, Streamlit, RAG, Vector DBs, Ollama, LLaMA/Mistral
- Built and maintained DBT+Snowflake data products for enterprise analytics with robust testing, governance, and performance tuning.
- Operated Airflow (MWAA) and Arena environments, covering incident response, EMR reliability, and end‑to‑end workflow SLAs.
- Migrated DBT Core → DBT Cloud with Airflow DAG refactors, validation, and observability.
- Delivered privacy-first AI apps: local LLM RAG and a hybrid token‑efficient agent.
- Full‑stack background: React/Next.js + FastAPI/Laravel, with strong API design and documentation.
- DocuRAG: Document Analysis Platform
React + FastAPI + PostgreSQL + ChromaDB + Ollama (LLaMA 3.2)
- Local, privacy-first RAG for PDFs/TXT/DOCX with vector embeddings and on‑device inference.
- Dockerized, Nginx‑backed deployment; supports multi-user auth and knowledge bases.
- Useful for research analysis and internal KB search.
- TokenSage: Hybrid LLM Agent
LangGraph + Streamlit + OpenAI API + Local LLMs (Ollama)
- Minimizes premium token usage by delegating expansion to local models.
- Orchestrates API+local responses for quality and cost control.
- Applied to docs, support, and education content generation.
- DataGene: Synthetic Data Generator
Java + Spring Boot + RabbitMQ + S3
- High‑volume data synthesis with secure S3 storage and rich observability.
- Admin dashboard for environment variables and connections.
- RAG Pipeline for Jira Ticket Insights
LangChain + Streamlit + LLaMA + ChromaDB
- Ingests Jira ticket dumps, indexes features, and predicts likely error occurrences.
- Fast, searchable interface for triage and analysis.
- Alexion Operations – Data Products (Snowflake, DBT, MWAA/Airflow): DBT model design, testing, and performance tuning; data governance artifacts (Collibra) and conceptual models; end‑to‑end solution testing and production handover.
- BofA Product Support (Arena): Incident handling (Jira), log analysis, Spark upgrade support, and platform stability.
- DBT Core → Cloud migration: Refactored Airflow DAGs to DBT Cloud jobs and validated lineage and outputs.
- Managed Services (Arena, Airflow, EMR, AWS): Proactive monitoring, reruns, biweekly maintenance, and ownership of infra/app issues.
- Software Engineer, TrieDatum — Apr 2023 → Present
- Software Engineer, Ekodus Technologies — Jul 2021 → Apr 2023
- Software Engineer, Riverrun Digital — Sep 2020 → Jun 2021
Education: M.Sc. Information Technology, Gauhati University
Certifications: Snowflake (Udemy, 2023), DBT (Udemy, 2024)
- Email: [email protected]
- LinkedIn: linkedin.com/in/alokdeka
