Skrisps26/README.md

Hey — I'm a third-year CS student who spends most of my time trying to make small models punch above their weight. Lately that means RL post-training, inference-time memory, and figuring out what's actually possible on a laptop GPU.

I like problems where the constraint is the interesting part.



## What I'm building

### Qwen3-0.6B Reasoning Pipeline · active

Training a 0.6B model to reason using a 4-stage GRPO pipeline — SFT cold start, RL on math, mode fusion, then generalization. At inference I attach a Hopfield episodic memory bank (~20MB) that retrieves similar past problems as context. The bet is that a sub-1B model with the right inference-time setup can match 7B+ on reasoning benchmarks.
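The retrieval step can be sketched in a few lines — a modern Hopfield lookup is a softmax-weighted read over stored pattern embeddings, with `beta` as inverse temperature (names, dimensions, and the `top_k` interface here are illustrative, not the actual pipeline code):

```python
import numpy as np

def hopfield_retrieve(query, memory_keys, memory_values, beta=4.0, top_k=3):
    """Return the top_k stored items most attended to by the query.

    query: (d,) embedding of the current problem
    memory_keys: (n, d) unit-norm embeddings of past problems
    memory_values: n payloads (e.g. problem + solution text)
    """
    sims = memory_keys @ query                    # (n,) dot-product scores
    weights = np.exp(beta * (sims - sims.max()))  # numerically stable softmax
    weights /= weights.sum()
    top = np.argsort(-weights)[:top_k]
    return [(memory_values[i], float(weights[i])) for i in top]

# toy usage: 100 random unit embeddings, query near item 42
rng = np.random.default_rng(0)
keys = rng.normal(size=(100, 64))
keys /= np.linalg.norm(keys, axis=1, keepdims=True)
vals = [f"problem_{i}" for i in range(100)]
q = keys[42] + 0.05 * rng.normal(size=64)
hits = hopfield_retrieve(q, keys, vals)  # item 42 dominates the retrieval
```

The retrieved payloads are what gets prepended to the prompt as context; everything else (how embeddings are produced, how payloads are formatted) is where the real design work lives.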


### PowerBench-Consumer · complete

Benchmarked LLM inference and GRPO training on an RTX 2050 (4GB VRAM) — extending DREAM:Lab's Jetson Orin AGX methodology to hardware most people actually own. The interesting finding: INT8 is ~4× slower than FP16 on this consumer GPU, the opposite of what happens on the Jetson. Quantization benefits don't travel across hardware architectures.

```
FP16  →   2,407 ms  ·  13.29 tok/s  ·  PPL 14.80
INT8  →  10,056 ms  ·   3.18 tok/s  ·  PPL 19.46
INT4  →   3,965 ms  ·   8.07 tok/s  ·  PPL 15.36
```
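The measurement harness behind numbers like these is simple in outline — warm up, time repeated generations, convert to tokens/sec. A minimal sketch (the `generate_fn` callable is a hypothetical stand-in for `model.generate` under a given quantization; the real benchmark also tracks perplexity separately):

```python
import time

def benchmark_generation(generate_fn, new_tokens, warmup=2, runs=5):
    """Time a generation callable; return (mean latency in ms, tokens/sec)."""
    for _ in range(warmup):          # discard cold-cache / compile runs
        generate_fn(new_tokens)
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        generate_fn(new_tokens)
        times.append(time.perf_counter() - t0)
    latency_ms = 1000.0 * sum(times) / len(times)
    return latency_ms, new_tokens / (latency_ms / 1000.0)

# toy stand-in: pretend each token costs 1 ms to generate
lat, tps = benchmark_generation(lambda n: time.sleep(n * 0.001), 64)
```

Warmup runs matter more than they look: kernel compilation and cache effects on the first pass can dwarf the steady-state cost you actually care about.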

### Neural Global Illumination Engine · complete

Reframed Global Illumination as a regression task — an MLP learns to predict radiance instead of ray-tracing it. Brought per-frame compute from 26ms down to 11ms with a +4.1dB PSNR improvement. Runs at 60+ FPS with dynamic lighting and moving occluders.
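The core idea fits in a screenful: a small MLP maps a per-pixel feature vector to RGB radiance, so the expensive indirect-bounce computation becomes a batched forward pass. A NumPy sketch of the forward path (the 12-dim input split — position, normal, view direction, light parameters — and layer sizes are assumptions, not the engine's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(dims):
    """Random weights/biases for a fully connected net with the given widths."""
    return [(rng.normal(scale=0.1, size=(a, b)), np.zeros(b))
            for a, b in zip(dims[:-1], dims[1:])]

def mlp_forward(x, weights):
    """ReLU MLP: hidden layers activated, linear output head."""
    for i, (W, b) in enumerate(weights):
        x = x @ W + b
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)
    return x

# assumed input: position(3) + normal(3) + view dir(3) + light params(3)
weights = init_mlp([12, 128, 128, 128, 3])
pixels = rng.normal(size=(4096, 12))       # one batch of shaded points
radiance = mlp_forward(pixels, weights)    # (4096, 3) predicted RGB
```

In the real engine the same forward pass runs on the GPU for every shaded pixel per frame, which is where the 26ms → 11ms win comes from: inference cost is flat in scene complexity, unlike ray counts.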


### Deepfake Detector · complete

Multimodal detection pipeline across video and audio. 85%+ accuracy, processes 10K+ frames and audio samples per batch in under 3 seconds.
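One common way to combine modalities in a pipeline like this is late fusion — average the per-frame video scores and per-window audio scores, then blend. A hedged sketch (the weighting scheme and function names are illustrative; the actual detector may fuse earlier in the network):

```python
import numpy as np

def fuse_scores(video_scores, audio_scores, w_video=0.6):
    """Blend clip-level deepfake probabilities from two modalities.

    video_scores: per-frame fake probabilities from the video branch
    audio_scores: per-window fake probabilities from the audio branch
    """
    v = float(np.mean(video_scores))
    a = float(np.mean(audio_scores))
    return w_video * v + (1.0 - w_video) * a

# toy usage: strong video signal, weaker audio signal
p = fuse_scores([0.9, 0.8, 0.95], [0.7, 0.6])
```

Late fusion keeps the branches independently trainable and debuggable, at the cost of ignoring cross-modal cues (e.g. lip-sync mismatch) that an earlier fusion point could exploit.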



## A few things I've picked up

```json
{
  "research":  ["PyTorch", "GRPO", "QLoRA", "TRL", "HuggingFace Transformers"],
  "cv / edge": ["OpenCV", "TFLite", "YOLO", "Taichi CUDA"],
  "backend":   ["FastAPI", "Node.js", "Express", "MongoDB", "PostgreSQL"],
  "cloud":     ["AWS Lambda", "Bedrock", "S3", "DynamoDB", "Step Functions"],
  "languages": ["Python", "C", "C++", "JavaScript", "Go"]
}
```


## Recognition

Qualcomm Edge AI Hackathon 2025 — Finalist  ·  top teams from 2000+ participants
DevsHouse '25 — MongoDB Track Winner  ·  4th overall



  

