Winner, AMD Synthetic Data Hackathon
Fine-tuned LLM reasoning via reinforcement learning for a question-answering agent.
Now building agentic LLM workflows and optimized inference systems to get models into production.
ML systems engineer focused on getting real models into production - agentic LLM infrastructure, model serving, and performance-optimized inference across GPUs.
Agentic LLM platforms on Kubernetes - dynamic DAG workflows, multimodal RAG, serverless GGML inference, and domain-pretrained embedding models.
ML infrastructure on AWS - online feature stores, batch inference at scale, and LLM serving (BLOOM, Flan-T5) with tensor and model parallelism.
Graph contrastive learning over circuit layouts, ranking systems for product recommendation, and an AutoML platform with hardware/software co-design for inference throughput.
Demand forecasting, CV-based quality assurance for manufacturing, and operations research for logistics and route planning.
Algorithms, Digital Image Processing, Data Intensive Systems for Machine Learning
Designed Unity3D simulation with PyTorch and built physical prototype with 3D-printed components.
Trained deep RL agents (DDPG, A2C, PPO) for continuous control tasks with >30 DoF, using neural policies to process raw sensor and image inputs. Successfully transferred simulation-trained policies to real-world hardware.
Enhanced Whisper v3 medium model for South Asian language speech recognition. Adopted new tokenizer, incorporated audio augmentations, and expanded dataset with high-confidence transcriptions to significantly reduce word error rate.
Zero-installation LLM inference (Gemma, Llama, Mistral) directly in the browser using WebAssembly SIMD.
Developed compiler system for mobile backends using TVM and CUDA. Designed SIMD kernels for ARM NEON and custom CUDA kernels for Jetson Orin, outperforming OpenCV and TorchScript benchmarks.