Shreyash Hamal

Full Stack Developer / AI Engineer

Engineering data-native AI systems.

Hi! I'm Shreyash. I build full-stack systems for AI-native products.

Experience / Founding

These are my work experience and open-source contributions, organized as a horizontal set of project cards. Scroll through each one to see what I built, the problems I solved, and the systems I helped improve.

Consilient Labs

Backend Developer (Secure REST API Gateway & AuthZ)

San Francisco, CA

Aug 2024 — Dec 2024
  • Secure login using OAuth 2.0 (AWS Cognito) so users authenticate safely without exposing raw credentials to the application.
  • Token validation with JWKS-backed JWT verification, including signature + expiration checks (with JWKS fetching/caching to keep validation fast and reliable).
  • Authorization enforcement using configurable role/group permissions (admin/member/viewer) mapped to specific REST API endpoints and HTTP methods (read/write/delete style actions).
  • Implemented a reverse-proxy gatekeeper pattern so requests are routed only after the user is authenticated and cleared by the access rules.
  • Designed role-permission configuration that can be updated via scripts and YAML/JSON rule sources, enabling “no downtime” style policy updates.
  • Added automated testing for correctness and reliability: unit tests for login/role checks plus end-to-end speed tests for request responsiveness.
  • Built CI/CD automation to run tests + build + deployment workflows, reducing deployment errors and improving operational stability.

Built a permissioned REST API access layer that keeps authentication and authorization consistent across services. The system verifies OAuth2 tokens, validates JWTs against Cognito JWKS, and enforces fine-grained, role-based access rules at the endpoint/method level through a configurable gatekeeper approach.
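The role/endpoint/method rules described above can be sketched as a small lookup. This is a hypothetical illustration of the gatekeeper pattern, not the actual Consilient Labs code; the names (`ROLE_RULES`, `is_authorized`) and the example endpoints are assumptions.

```python
# Roles map to allowed (endpoint, method) pairs, as would be loaded from a
# YAML/JSON rule source. Every request is checked before being proxied onward.
ROLE_RULES = {
    "admin":  {("/api/items", "GET"), ("/api/items", "POST"), ("/api/items", "DELETE")},
    "member": {("/api/items", "GET"), ("/api/items", "POST")},
    "viewer": {("/api/items", "GET")},
}

def is_authorized(role: str, endpoint: str, method: str) -> bool:
    """Return True only if the role's rules permit this endpoint/method pair."""
    return (endpoint, method.upper()) in ROLE_RULES.get(role, set())
```

Because the rules live in plain data rather than code, swapping in a freshly parsed YAML/JSON dict is enough to change policy without redeploying the gateway.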

Node.js · Express.js · OAuth 2.0 · AWS Cognito (JWKS) · JWT Validation · RBAC / YAML-JSON Policies · Reverse Proxy / REST · Jest Testing · CI/CD · Docker

HeyContext

Full-Stack Developer (AI Memory + Intelligence Platform)

San Francisco, CA

Mar 2025 — Jan 2026
  • Developed the AI memory backend using FastAPI with robust lifecycle management (async initialization, graceful cleanup hooks, shutdown resource handling).
  • Created a modular API surface with thin routers delegating to deeper services, including real-time chat via an SSE streaming endpoint (`/api/v1/chat/stream`).
  • Implemented embedding generation as a validated API (content-type checks + length validation) feeding into storage for vector similarity search.
  • Built intelligence orchestration endpoints that trigger multi-entity processing via background jobs, keeping requests responsive instead of blocking threads.
  • Added production-grade operational tooling for background jobs: enqueue/monitoring endpoints and incident-style flows (including emergency handling for stuck jobs).
  • Integrated configuration-driven behavior for multi-environment support, optional Redis enablement, and production-safe rate-limiting defaults.
  • Delivered a sizable automated test suite (API, concurrency, failure recovery, CORS behavior, and orchestration/integration coverage).
  • Built the Next.js App Router frontend with a modern component stack, including a rich editor experience using TipTap + markdown support for structured content workflows.
  • Implemented authenticated API proxying for streaming chat so the frontend can preserve `text/event-stream` response behavior and maintain real-time UX.
  • Deeply integrated Convex at the schema layer for application state, background jobs, embeddings, and intelligence artifact persistence.
  • Produced operational documentation for stateful systems (including migration workflows) so destructive/config changes are handled safely and repeatably.
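The streaming chat endpoint mentioned above depends on emitting frames in the `text/event-stream` wire format. As a minimal sketch (the helper name `sse_event` is illustrative, not from the HeyContext codebase), each frame is one or more `data:` lines terminated by a blank line:

```python
def sse_event(data: str, event: str = "") -> str:
    """Format one Server-Sent Events frame per the text/event-stream format."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    # Multi-line payloads need one `data:` field per line.
    lines.extend(f"data: {chunk}" for chunk in data.splitlines() or [""])
    return "\n".join(lines) + "\n\n"
```

A FastAPI route would yield frames like these from an async generator wrapped in a `StreamingResponse(..., media_type="text/event-stream")`, which is what the authenticated frontend proxy must pass through unbuffered.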

Worked on a production-style “AI that remembers” platform: a backend that turns user interactions into persistent intelligence artifacts, plus a Next.js frontend with streaming-first UX. The system emphasizes modular APIs, real-time chat streaming, vector-backed state, and background jobs for long-running intelligence processing.
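The validated embedding API above boils down to rejecting bad requests before any model call. A hedged sketch of that shape (the limit `MAX_CHARS`, the function name, and the error strings are assumptions for illustration):

```python
MAX_CHARS = 8192  # assumed payload cap, not the platform's real limit

def validate_embedding_request(content_type: str, text: str) -> list[str]:
    """Return a list of validation errors; an empty list means accepted."""
    errors = []
    # Content-type check: strip parameters like "; charset=utf-8" first.
    if content_type.split(";")[0].strip() != "application/json":
        errors.append("unsupported content type")
    if not text.strip():
        errors.append("empty text")
    if len(text) > MAX_CHARS:
        errors.append(f"text exceeds {MAX_CHARS} characters")
    return errors
```

Returning all errors at once (rather than failing on the first) keeps the API ergonomic for clients batching embedding requests.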

React.js · Next.js · FastAPI · Python (async backend) · SSE Streaming · TipTap / Markdown Editor · Authenticated Streaming Proxy · Convex (Schema + Real-time data) · Vector Stores · Background Jobs (monitor/enqueue) · Observability & Monitoring · Test Automation (pytest/Jest-style tooling)

The Convergence

Core Contributor (Open-Source Optimization Framework)

2024 — Present
  • Implemented an SDK/API layer (`run_optimization`) so backend services can trigger optimization without YAML workflows, including normalization and conversion into the internal optimization schema.
  • Built an online runtime selection loop (select/update) designed for production behavior rather than offline experimentation.
  • Used Thompson Sampling-style arm sampling to evolve decisions at runtime, updating cached arms with stability checks and reward feedback.
  • Centralized Bayesian-style update plumbing so runtime performance signals translate into updated beliefs/estimates over time.
  • Designed an extensible architecture with pluggable evaluators/adapters and multiple storage backends (sqlite/postgresql/multi-backend style).
  • Supported multiple modes: batch evolutionary optimization (generations/population) and runtime per-request selection that improves while the system is live.
  • Documented configuration formats and operational usage patterns so teams can adopt the framework consistently.

Contributed to an open-source Python framework that optimizes API/agent configuration over time. The project combines evolutionary-style search with reinforcement-learning-inspired control logic and supports both batch optimization runs and runtime per-request selection using multi-armed bandit strategies.
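The runtime select/update loop with Thompson Sampling-style arm sampling can be sketched as a minimal Beta-Bernoulli bandit. This is an illustrative reduction of the idea, not the framework's actual arm state, stability checks, or storage backends:

```python
import random

class ThompsonBandit:
    """Minimal Thompson Sampling loop: select an arm, observe a reward, update."""

    def __init__(self, arms):
        # Beta(1, 1) prior per arm: [successes + 1, failures + 1].
        self.state = {arm: [1, 1] for arm in arms}

    def select(self) -> str:
        # Sample a plausible success rate from each arm's posterior,
        # then act greedily on the sampled values.
        return max(self.state, key=lambda a: random.betavariate(*self.state[a]))

    def update(self, arm: str, reward: bool) -> None:
        # Reward feedback shifts the arm's posterior belief over time.
        self.state[arm][0 if reward else 1] += 1
```

Per-request usage is `arm = bandit.select()` followed by `bandit.update(arm, reward)` once the outcome is known, which is what lets selection improve while the system is live instead of in an offline batch.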

Python 3.11+ typing · Multi-armed Bandits · Thompson Sampling · Evolutionary Algorithms · Bayesian Updates · Online Runtime Selection · SDK / API Design · Pluggable Evaluators · Storage Backends (SQLite/Postgres)

Vector-Native

Core Contributor (Vector-Native A2A Protocol Engine)

2024 — Present
  • Implemented the core parser to convert Vector-Native strings into structured operation objects using strict parsing (`parse_vector_native`) plus hybrid and fallback flows for real-world mixed LLM outputs.
  • Added token counting and efficiency measurement with a tokenizer layer that uses `tiktoken` when available and falls back to heuristic estimation when not.
  • Created token-reduction evaluation utilities (before/after + reduction %) so the protocol’s efficiency claims are measurable, not just descriptive.
  • Added hybrid parsing support to preserve token savings while tolerating non-compliant or partially structured model outputs.
  • Developed the project as a protocol-first system with formal documentation: symbol registry, syntax rules, validation rules, and extension guidelines.
  • Set up test coverage for research claims, including parser/tokenizer behavior expectations and dedicated `tests/` coverage.
  • Structured the project as implementation + measurement (parser/tokenizer + explicit reduction calculations) to enable repeatable experiments and reports.

Built a research/engineering system for a “Vector-Native” symbolic structured protocol that enables AI-to-AI communication with less ambiguity. The work includes a strict parser, hybrid parsing fallbacks, token-counting for efficiency measurement, and a full evaluation workflow so token savings/reduction can be proven and reproduced.
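The tokenizer-with-fallback and reduction-percentage measurement described above can be sketched as follows. The `cl100k_base` encoding choice and the ~4-characters-per-token heuristic are common conventions assumed here, not necessarily the project's exact configuration:

```python
def count_tokens(text: str) -> int:
    """Count tokens with tiktoken when installed, else a rough heuristic."""
    try:
        import tiktoken
        return len(tiktoken.get_encoding("cl100k_base").encode(text))
    except ImportError:
        # Fallback: roughly 4 characters per token for English-like text.
        return max(1, len(text) // 4)

def token_reduction(before: str, after: str) -> float:
    """Percent reduction in token count from `before` to `after`."""
    b, a = count_tokens(before), count_tokens(after)
    return round(100 * (b - a) / b, 1)
```

Keeping the before/after measurement in code alongside the parser is what makes the protocol's efficiency claims reproducible rather than anecdotal.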

Python parsing · Symbolic Protocol · Strict + Hybrid Parser · Tokenizer / tiktoken · Token Reduction Metrics · Vector-Native Spec · pytest test suite · Evaluation/Measurement Pipelines

Hackathons / Wins & Awards

A bento grid of hackathon builds — scrub to assemble, hover to inspect.

WeaveHacks 2 • Multi-agent Learning Loop

Agent Playground

Building multi-agent learning systems is only half the job: teams need deterministic safety scoring and deep observability to understand how strategies evolve, converge, and fail over time.

Demonstrated measurable learning improvements: hierarchical learning success rate ~72% → ~88% (+16 points), and quality score ~0.74 → ~0.89 (~+20%). The system includes a judge-friendly end-to-end demo flow (~5–8 minutes) plus automated tests for hierarchical behavior and security scoring integration.

FastAPI (Python 3.11+) · Next.js 15 (App UI) · TypeScript · Tailwind CSS · +4

Devpost • Content Creator Connector

AI-Influencer-Marketing-Agent

Influencer partnership workflows are fragmented: discovering creators, interpreting profiles, and generating personalized outreach requires multiple steps across research, matching, multimodal analysis, and traceable reporting.

Delivered a structured and typed pipeline (Pydantic models + structured workflow outputs) that improves debuggability and iteration speed. Included reusable toolkits for multimodal analysis (Selenium stealth screenshot capture) and streaming/traceability patterns so the workflow stays inspectable across runs.

FastAPI · Agno (multi-agent orchestration) · OpenAIChat · Multimodal (Instagram screenshot analysis) · +5

AWS Hackathon • Natural Selection for AI Tool Generation

Darwin

AI-generated tools are risky by default: you need a system that can safely explore candidates, scan for vulnerabilities, and consistently pick the best outputs instead of trusting a single generation run.

Built a security-first evolution loop with explicit fitness weights (Security 40%, Success 30%, Speed 20%, Quality 10%) and robust Bedrock enablement/fallback so the system can run reliably in different environments. Included generation tracking and leaderboard/specialization insights to show improvement across runs.
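The explicit fitness weights quoted above (Security 40%, Success 30%, Speed 20%, Quality 10%) reduce to a weighted sum. The scoring scale (each component in [0, 1]) and the function shape are assumptions for illustration, not Darwin's actual implementation:

```python
WEIGHTS = {"security": 0.4, "success": 0.3, "speed": 0.2, "quality": 0.1}

def fitness(scores: dict[str, float]) -> float:
    """Weighted fitness over the four components; each score is in [0, 1].

    Missing components default to 0.0, so an unscored candidate is penalized
    rather than silently favored.
    """
    return sum(WEIGHTS[k] * scores.get(k, 0.0) for k in WEIGHTS)
```

With security weighted heaviest, a fast but vulnerable candidate loses to a slower secure one, which is the point of a security-first evolution loop.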

Next.js 15 · TypeScript · Tailwind CSS · shadcn/ui · +5

Systems / Skills Graph

Hover a skill to light up its connected projects; nothing is isolated.

LLMs

Data Modeling

Infra

Observability