Harshit Mani Tripathi

SRE / Infrastructure Engineer

I automate infrastructure, observe everything,
and break things on purpose.

About

I'm a DevOps intern at Unstop (since March 2026), where I work on infrastructure automation and cloud deployments. I also maintain a hybrid homelab where I design and run infrastructure end to end. Most experiments start there before they become repeatable systems.

The homelab pairs bare-metal Proxmox on-prem with Oracle Cloud Infrastructure, connected through a WireGuard mesh. I use it to test clustering, networking, identity, and observability at realistic scale.

I am focused on making infrastructure boring in the best way: automated, observable, and reproducible from a clean checkout. If a fix requires repeated manual SSH steps, the system is not finished yet.

Experience

  1. DevOps Intern

    Unstop
    Mar 2026 · present
    • Built a Kubernetes-based developer platform POC on AWS EKS using Terraform and Helm, provisioning isolated workspaces for 100+ users
    • Debugged a silent PVC failure traced to missing IRSA permissions on the EBS CSI driver
    • Refactored Terraform modules to decouple cluster infrastructure from the application layer
    • Work daily across AWS, Kubernetes, and infrastructure as code

Skills

Infrastructure & Cloud

IaC & Automation

SRE & Kubernetes

Observability

Networking & Identity

Languages

Projects

  1. Hybrid Homelab Platform

    Oct 2021 · present

    Proxmox and Oracle Cloud Infrastructure hybrid setup connected via a WireGuard mesh. Keycloak handles SSO across all services, and the whole environment is defined in OpenTofu with remote state in HCP, so a full rebuild takes a single run.

  2. Fully Automated k3s Cluster

    Mar 2025 · present

    A k3s cluster with three control-plane and three worker nodes running on Proxmox VMs, automated end to end. Terraform provisions the VMs, Ansible configures the OS and bootstraps k3s, and Bash glues the handoffs together.

  3. Observability Stack

    Apr 2025 · present

    Full LGTM stack deployed on the homelab. Prometheus scrapes metrics, Loki aggregates logs, Tempo handles distributed traces, and Grafana ties it all into one pane with unified alerting.

  4. Recursive DNS Infrastructure

    Jul 2025 · present

    Self-hosted Unbound recursive resolver processing roughly 500k queries per month at a 75% cache hit rate and 40 to 80 ms p50 latency, integrated with split-horizon zones for internal service discovery.

Open Source

Writing