I design, evaluate, and productionize autonomous coding agents.
Currently building open-source DevOps AI agents at Stakpak, focusing on long-running autonomous systems, evaluation science, and reliable agent execution.
- Autonomous coding agents (24/7 infra agents)
- Agent evaluation frameworks & benchmarking datasets
- Context engineering & dynamic memory systems
- Multi-agent orchestration
- Auto-prompt optimization systems
- Hallucination detection & recovery pipelines
- RAG systems with trusted, fresh retrieval
- Fine-tuning small LLMs (LoRA / QLoRA)
- Long-session reasoning reliability
Founding AI R&D Engineer building a Rust-based DevOps coding agent that competes in the autonomous CLI agent space.
- Architected an agentic search & retrieval system (trusted domains, semantic reranking, freshness guarantees)
- Designed and built internal DevOps evaluation datasets (Terminal-style containerized benchmarks)
- Implemented automated A/B testing pipelines with LLM-as-judge scoring
- Built sub-goal trajectory analysis to detect reasoning failure patterns
- Designed hallucination detection supervisors for real-time agent correction
- Developed an auto-prompt optimization framework (textual gradient refinement + multi-tier evaluation)
- Contributed to Autopilot mode (long-running autonomous infrastructure agent)
- Reliable autonomous coding agents
- Evaluation science for LLM systems
- Context & tool design engineering
- Prompt optimization algorithms
- Agent observability & telemetry
- Infrastructure-aware AI systems
Seqoon — Built and deployed production RAG and voice-agent systems as Founding AI Engineer.
German University in Cairo (GUC) — Senior Teaching Assistant in CS & Mechatronics, supervising deep learning and autonomous systems projects.
Python • Rust • C/C++ • SQL • Bash
LangChain • LangGraph • DSPy • AdalFlow
vLLM • Ollama • Unsloth
LoRA / QLoRA fine-tuning
Multi-agent systems
Milvus • Pinecone • Neo4j • Typesense
Hybrid search (BM25 + embeddings)
Semantic reranking
Metadata-aware retrieval
Docker • Kubernetes • K3s
Harbor • Traefik • NGINX
CI/CD • AWS
Prometheus • Trivy • Checkov
FastAPI • Django • Axum
MongoDB • PostgreSQL
Streamlit • Git • Linux



