
Hello, I'm Enes!

I'm an MLOps Engineer focused on designing scalable, cloud-native inference architectures and automated CI/CD pipelines. I specialize in transforming experimental models into production-ready microservices using Kubernetes, Docker, and AWS.


What I'm working on

I am currently architecting robust ML systems with a focus on:

  • Orchestration: Managing containerized applications with Kubernetes.
  • MLOps & GitOps: End-to-end pipeline automation, monitoring, and continuous delivery using ArgoCD.
  • Microservices: Decoupling monolithic ML code into scalable FastAPI services.
  • Infrastructure: Configuring AWS (EC2, VPC, IAM, ECR) and Linux environments for high availability.
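As a concrete sketch of the orchestration side, this is roughly the kind of Deployment manifest a Helm chart would render and ArgoCD would sync (all names, the image URI, and the resource numbers are illustrative, not taken from the actual projects):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-api            # illustrative service name
spec:
  replicas: 3                    # multiple replicas enable zero-downtime rollouts
  selector:
    matchLabels:
      app: inference-api
  template:
    metadata:
      labels:
        app: inference-api
    spec:
      containers:
        - name: api
          image: example.dkr.ecr.eu-west-1.amazonaws.com/inference-api:latest  # ECR image (placeholder)
          ports:
            - containerPort: 8000          # typical FastAPI/uvicorn port
          resources:
            requests: {cpu: 250m, memory: 256Mi}
            limits:   {cpu: "1",  memory: 512Mi}  # limits guard the node against OOM kills
```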

Featured Projects

1. End-to-End Fashion AI Recommender (H&M Project)

  • Cloud-Native MLOps & IaC: Provisioned AWS S3 storage via Terraform to keep heavy model artifacts out of Docker images, in a decoupled deployment strategy. Cut the model footprint by ~75% (100MB+ to 23MB) with INT8 dynamic quantization (ONNX), lazy-loading the artifact from S3 via Boto3.
  • Architecture & GitOps: Ran a highly available Kubernetes environment managed declaratively with Helm and ArgoCD, automating zero-downtime rollouts; multi-stage builds reduced image sizes by 54%.
  • High-Performance API & Reliability: Built an asynchronous Redis caching layer that cut inference latency to <2ms. Validated robustness with Locust, sustaining 805 RPS (~2.9M requests/hour) under 2,000 concurrent users at a 0% error rate.
  • Vector Search Engine: Integrated Qdrant as a high-throughput vector database for semantic similarity search, tuning persistent storage for millisecond-latency recommendation retrieval.
  • DevOps & Observability: Automated AWS EC2 deployments through a GitHub Actions CI/CD pipeline. Added a Prometheus & Grafana observability stack tracking real-time P99 latency and system stability, and enforced DevSecOps practices (Trivy, Black, isort) with Pytest-based testing.
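The S3 lazy-loading pattern above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `fetch` stands in for a boto3 `s3.download_file` call and `load` for something like `onnxruntime.InferenceSession`.

```python
from pathlib import Path


class LazyModelLoader:
    """Keep the model artifact out of the Docker image; pull it on first use.

    `fetch(dest)` stands in for boto3's s3.download_file(bucket, key, dest);
    `load(path)` stands in for e.g. onnxruntime.InferenceSession(path).
    """

    def __init__(self, cache_path, fetch, load):
        self.cache_path = Path(cache_path)
        self.fetch = fetch
        self.load = load
        self._model = None

    def get(self):
        if self._model is None:                  # only the first request pays the cost
            if not self.cache_path.exists():     # local cache survives process restarts
                self.fetch(self.cache_path)
            self._model = self.load(self.cache_path)
        return self._model
```

The key property is that the image ships without the weights: the download happens once per container, and every later call returns the in-memory model.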

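What Qdrant does at scale with an approximate index can be illustrated by a brute-force cosine top-k, just to show the retrieval contract of a similarity lookup (the names and toy vectors are illustrative):

```python
import math


def top_k_similar(query, catalog, k=3):
    """Brute-force cosine top-k over {item_id: embedding} pairs.

    Qdrant replaces this O(n) scan with an indexed approximate search,
    but the input/output contract of a recommendation lookup is the same.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(x * x for x in b))
        return dot / (norm_a * norm_b)

    ranked = sorted(catalog.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [item_id for item_id, _ in ranked[:k]]
```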
2. Cloud-Native Inference Engine (NYC Taxi Project)

  • Architecture: Deployed a decoupled system on AWS EC2 with zero-trust security via IAM Roles and custom Security Groups.
  • High-Performance: Reduced latency from ~281ms to ~3ms and sustained ~290 RPS by migrating to ONNX Runtime and adding Redis caching.
  • DevSecOps & CI/CD: Established a GitHub Actions pipeline with Trivy for vulnerability scanning. Reduced production Docker image sizes by 68% (2.05GB to 650MB) via multi-stage builds.
  • Model Engineering: Engineered a resource-efficient Random Forest model (size reduced by 97%, from 1.2GB to 33MB) managed with MLflow, which resolved critical Kubernetes OOM errors.
  • Reliability: Validated system robustness via Locust stress testing (1000 concurrent connections, 0% error rate) and implemented real-time monitoring with Prometheus & Grafana.
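The Redis cache-aside pattern used in both projects can be sketched like this. A plain dict stands in for the Redis client here; a real deployment would serialize the result and call something like `r.set(key, value, ex=ttl)` / `r.get(key)` instead.

```python
import hashlib
import json


class CachedPredictor:
    """Cache-aside inference: hash the request payload, return the cached
    prediction on a hit, fall back to the model on a miss.

    `self.store` is a dict standing in for a Redis client; swapping in
    redis-py would add network I/O, serialization, and a TTL.
    """

    def __init__(self, predict):
        self.predict = predict
        self.store = {}
        self.hits = 0

    @staticmethod
    def _key(features):
        # sort_keys makes the cache key insensitive to field order
        blob = json.dumps(features, sort_keys=True).encode()
        return hashlib.sha256(blob).hexdigest()

    def __call__(self, features):
        key = self._key(features)
        if key in self.store:
            self.hits += 1
            return self.store[key]
        result = self.predict(features)   # slow path: real model inference
        self.store[key] = result
        return result
```

The latency win comes from repeated payloads short-circuiting to a key lookup instead of a full model forward pass.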

Tech Stack

Languages & Frameworks

MLOps & Cloud

Data & Databases


📬 Let's Connect

LinkedIn

Pinned Loading

  1. hm-fashion-recommender hm-fashion-recommender Public

    Let's find your style!

    Python 6

  2. nyc-taxi-mlops nyc-taxi-mlops Public

    How long was your taxi drive? Let's find out!

    Python 2

  3. sentinel sentinel Public

    Do you suspect your transactions? Let Sentinel find the truth.

    Python