alokdeka/README.md

Alok Deka

Software Engineer · Data Engineering · Cloud · AI/LLM Apps

Email LinkedIn Portfolio Location


About me

Results-driven Software Engineer with 4+ years of experience building scalable data products and full‑stack solutions across Snowflake, DBT, Airflow, AWS, and Spark, plus hands-on AI/LLM work (RAG, local models, LangGraph). I enjoy turning ambiguous data and platform problems into reliable, production-grade pipelines and developer-friendly apps.

  • Current focus: DBT/Snowflake data products, Airflow orchestration, EMR ops, cost‑efficient LLM agents.
  • Strengths: data modeling, pipeline performance, workflow reliability/SLA hygiene, pragmatic AI integrations.
  • Open to: data engineering roles, platform/infra data work, privacy-first AI applications.

Core tech

  • Data/Cloud: Snowflake, DBT (Core & Cloud), Airflow (MWAA), Databricks, Hadoop, Spark, Hive, AWS (S3, EMR)
  • Programming: Python, SQL, Shell, JavaScript/TypeScript, Java, C/C++
  • Web: React, Node.js, Next.js, FastAPI, Laravel, WordPress
  • Databases: Oracle, PostgreSQL, MySQL, MongoDB, ChromaDB
  • AI/LLM: LangGraph, Streamlit, RAG, Vector DBs, Ollama, LLaMA/Mistral

Highlights

  • Built and maintained DBT+Snowflake data products for enterprise analytics with robust testing, governance, and performance tuning.
  • Operated Airflow (MWAA) and Arena environments with incident response, EMR reliability, and end‑to‑end workflow SLAs.
  • Migrated DBT Core → DBT Cloud with Airflow DAG refactors, validation, and observability.
  • Delivered privacy-first AI apps: local LLM RAG and a hybrid token‑efficient agent.
  • Full‑stack background: React/Next.js + FastAPI/Laravel, strong API design and docs.

Featured projects

  1. DocuRAG: Document Analysis Platform
    React + FastAPI + PostgreSQL + ChromaDB + Ollama (LLaMA 3.2)
  • Local, privacy-first RAG for PDFs/TXT/DOCX with vector embeddings and on‑device inference.
  • Dockerized, Nginx‑backed deployment; supports multi-user auth and knowledge bases.
  • Useful for research analysis and internal KB search.
  2. TokenSage: Hybrid LLM Agent
    LangGraph + Streamlit + OpenAI API + Local LLMs (Ollama)
  • Minimizes premium token usage by delegating expansion to local models.
  • Orchestrates API+local responses for quality and cost control.
  • Applied to docs, support, and education content generation.
  3. DataGene: Synthetic Data Generator
    Java + Spring Boot + RabbitMQ + S3
  • High‑volume data synthesis with secure S3 storage and rich observability.
  • Admin dashboard for environment variables and connections.
  4. RAG Pipeline for Jira Ticket Insights
    LangChain + Streamlit + LLaMA + ChromaDB
  • Ingests Jira ticket dumps, indexes features, and predicts likely error occurrences.
  • Fast, searchable interface for triage and analysis.
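
As a rough illustration of the TokenSage idea (a premium model produces a short outline, local models do the token-heavy expansion), here is a minimal, dependency-free sketch. It is not the project's actual code: `hybrid_answer`, `fake_premium_outline`, and `fake_local_expand` are hypothetical names, and the two stand-in functions take the place of real calls to a hosted API and a local Ollama model.

```python
from typing import Callable, List


def hybrid_answer(
    prompt: str,
    premium_outline: Callable[[str], List[str]],
    local_expand: Callable[[str], str],
) -> str:
    """Get a short outline from the premium model, then expand each point locally."""
    outline = premium_outline(prompt)                      # one small paid call
    sections = [local_expand(point) for point in outline]  # token-heavy work stays local
    return "\n\n".join(sections)


# Hypothetical stand-ins: real code would call a hosted API and a local model here.
def fake_premium_outline(prompt: str) -> List[str]:
    return [f"{prompt}: overview", f"{prompt}: details"]


def fake_local_expand(point: str) -> str:
    return f"[expanded] {point}"


if __name__ == "__main__":
    print(hybrid_answer("caching", fake_premium_outline, fake_local_expand))
```

Passing the two models in as plain callables keeps the routing logic testable without network access. In a fuller version, each step could become a node in a LangGraph graph.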

Enterprise work summary (TrieDatum)

  • Alexion Operations – Data Products (Snowflake, DBT, MWAA/Airflow): DBT model design, testing, performance; data governance artifacts (Collibra), conceptual models; end‑to‑end solution testing and production handover.
  • BofA Product Support (Arena): Incident handling (Jira), log analysis, Spark upgrade support, platform stability.
  • DBT Core → Cloud migration: Refactored Airflow DAGs to DBT Cloud jobs, validated lineage and outputs.
  • Managed Services (Arena, Airflow, EMR, AWS): Proactive monitoring, reruns, biweekly maintenance, infra/app issue ownership.

Experience

  • Software Engineer, TrieDatum — Apr 2023 → Present
  • Software Engineer, Ekodus Technologies — Jul 2021 → Apr 2023
  • Software Engineer, Riverrun Digital — Sep 2020 → Jun 2021

Education: M.Sc. Information Technology, Gauhati University
Certifications: Snowflake (Udemy, 2023), DBT (Udemy, 2024)


Contact

Popular repositories

  1. jquery-filepond (Public, forked from pqina/jquery-filepond)
    🔌 A handy FilePond wrapper for jQuery
    JavaScript
  2. Django (Public)
    Python
  3. ML (Public)
    Jupyter Notebook
  4. NS-2.35 (Public)
    Tcl
  5. LISP (Public)
    Common Lisp
  6. pythonScripts (Public)
    Python