Winner, AMD Synthetic Data Hackathon
Fine-tuned LLM reasoning via reinforcement learning for a question-answering agent.
Now building agentic LLM workflows and optimized inference systems to get models into production.
ML systems engineer focused on getting real models into production - agentic LLM infrastructure, model serving, and performance-optimized inference across GPUs.
Built agentic LLM systems spanning workflow orchestration, ingestion pipelines, multimodal RAG, LLM application backends, AWS infrastructure, and GPU-hosted LLM and ML inference on ASU's SOL cluster.
Engineered ML platform infrastructure on AWS, building model deployment pipelines, online feature stores, batch inference systems, and large-model serving while partnering with stakeholders to deliver scalable production ML capabilities.
Engineered deep learning systems for Siemens electrical design analysis, graph-based component understanding, sales recommendation, AutoML training and inference backends, and CNN optimization for edge ML on custom ASICs.
Applied machine learning for demand forecasting, computer vision-based quality assurance, and operations research for logistics and route planning.
Algorithms, Digital Image Processing, Data Intensive Systems for Machine Learning
Designed a Unity3D simulation with PyTorch and built a physical prototype with 3D-printed components.
Trained deep RL agents (DDPG, A2C, PPO) for continuous control tasks with >30 DoF, using neural policies to process raw sensor and image inputs. Successfully transferred simulation-trained policies to real-world hardware.
Enhanced the Whisper v3 medium model for South Asian language speech recognition. Adopted a new tokenizer, incorporated audio augmentations, and expanded the dataset with high-confidence transcriptions to significantly reduce word error rate.
Zero-installation LLM inference (Gemma, Llama, Mistral) directly in the browser using WebAssembly SIMD.
Developed a compiler system for mobile backends using TVM and CUDA. Designed SIMD kernels for ARM NEON and custom CUDA kernels for Jetson Orin, outperforming OpenCV and TorchScript benchmarks.