narain@portfolio:~$

Narain Pattabhiraman

MLOps Engineer · ML systems, inference, infrastructure

now · Building agentic LLM workflows and optimized inference systems to get models into production.

about

ML systems engineer focused on getting real models into production: agentic LLM infrastructure, model serving, and performance-optimized inference across GPUs.

AWS · Azure · C/C++ · CUDA · Golang · JavaScript · MySQL · Python · PyTorch · Spark

achievements

Winner, AMD Synthetic Data Hackathon

Fine-tuned LLM reasoning via reinforcement learning for a question-answering agent.

Winner · LLM · Reinforcement Learning · GPUs

experience

Jan 2024 - Present

MLOps Engineer · Arizona State University

Agentic LLM platforms on Kubernetes: dynamic DAG workflows, multimodal RAG, serverless GGML inference, and domain-pretrained embedding models.

Apr 2022 - Jul 2023

Software Engineer · Fidelity Investments

ML infrastructure on AWS: online feature stores, batch inference at scale, and LLM serving (BLOOM, Flan-T5) with tensor and model parallelism.

May 2021 - Apr 2022

AI Researcher · QPiAI Technologies

Graph contrastive learning over circuit layouts, ranking systems for product recommendation, and an AutoML platform with hardware/software co-design for inference throughput.

Jan 2020 - May 2021

Data Analyst · Thoughtware Analytics

Demand forecasting, CV-based quality assurance for manufacturing, and operations research for logistics and route planning.

education

2023 - 2025

MS Computer Engineering · Arizona State University

Algorithms, Digital Image Processing, Data Intensive Systems for Machine Learning

2016 - 2020

B.Tech Mechanical Engineering · Amrita Vishwa Vidyapeetham

projects

Robotic Arm Maneuvering Using Deep Reinforcement Learning

B.Tech thesis project at Amrita Vishwa Vidyapeetham.

2020

Designed a Unity3D simulation with PyTorch and built a physical prototype from 3D-printed components.

Trained deep RL agents (DDPG, A2C, PPO) for continuous control tasks with more than 30 DoF, using neural policies to process raw sensor and image inputs. Transferred the simulation-trained policies to real-world hardware.

PyTorch · Deep RL · Unity3D · Robotics

Whisper Training on Indian Languages

2023

Enhanced the Whisper v3 medium model for South Asian language speech recognition. Adopted a new tokenizer, added audio augmentations, and expanded the dataset with high-confidence transcriptions to significantly reduce word error rate.

CUDA · Deep Learning · NLP

Steganalysis Detection

2022

CNN-based model that detects steganographic content in digital images. Modified a ResNet architecture with Efficient Channel Attention (ECA) and a reduced convolution stride for greater sensitivity to concealed data.

Python · Deep Learning · Computer Vision

Indic LLaMA

2024

Enhanced a LLaMA model for multiple languages and domains. Integrated a new tokenizer and trained on diverse translated instruction datasets and domain-specific dialogues.

WASM · LLM · JavaScript · Deep Learning

LLM Inference in the Browser

2024

Zero-installation LLM inference (Gemma, LLaMA, Mistral) directly in the browser using WebAssembly SIMD.

JavaScript · WebAssembly · C · GPU

Compiler System for Deep Learning

2022

Developed a compiler system for mobile backends using TVM and CUDA. Designed SIMD kernels for ARM NEON and custom CUDA kernels for Jetson Orin, outperforming OpenCV and TorchScript in benchmarks.

TVM · LLVM · MLIR · NEON · CUDA