Dhivya Sreedhar

I'm a Master's student in Information Systems with a specialization in Machine Learning and NLP at Carnegie Mellon University. Previously, I earned my B.E. in Computer Science (with Honors) from Anna University in 2022. My interests lie at the intersection of large language models, multimodal learning, and real-world ML systems.

I'm currently working with Reclamation Factory, where I build real-time multimodal robotic perception systems for material identification and sorting. Previously, I spent two years at Zoho Corporation, developing distributed ingestion pipelines, search systems, and cloud-scale analytics for security products.

Outside of AI, I enjoy fostering animals, exploring nature, and lifting weights. I'm actively seeking Machine Learning and Applied Scientist roles starting January 2026 — let's connect!

Email  /  CMU Email  /  Resume  /  GitHub  /  LinkedIn  /  Scholar


CMU
MS in Information Systems (ML & NLP)
Aug 2024 – Dec 2025

Reclamation Factory
Computer Vision & ML Intern
May 2025 – Present

Zoho Corporation
Software Developer
Aug 2022 – Aug 2024

Anna University
B.E. Computer Science
Aug 2018 – May 2022

Work & Research Experience
Reclamation Factory (CMU Robotics Startup)
Computer Vision & Machine Learning Intern
  • Built a multimodal robotic identification and sorting system on NVIDIA Jetson AGX Orin using NIR/XRF/RGB fusion and Vision Transformer fine-tuning, achieving 93.5% accuracy across six material classes.
  • Optimized inference pipelines with TensorRT and ONNX, achieving sub-50 ms latency for high-throughput edge deployment.
  • Designed multimodal feature fusion pipelines combining spectral, audio, and visual embeddings to improve robustness and cross-domain generalization.
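The multimodal fusion described above can be sketched as a simple late-fusion module that concatenates per-modality embeddings and projects them into a shared space before classification. The embedding dimensions and class count below are illustrative placeholders, not the production values:

```python
import torch
import torch.nn as nn

class LateFusion(nn.Module):
    """Concatenate per-modality embeddings and project to a shared space.

    Dimensions are hypothetical; real spectral/audio/visual encoders
    would each produce their own embedding size.
    """
    def __init__(self, dims=(256, 128, 512), out_dim=256, n_classes=6):
        super().__init__()
        self.proj = nn.Linear(sum(dims), out_dim)   # fuse concatenated features
        self.head = nn.Linear(out_dim, n_classes)   # material-class logits

    def forward(self, spectral, audio, visual):
        fused = torch.cat([spectral, audio, visual], dim=-1)
        return self.head(torch.relu(self.proj(fused)))

model = LateFusion()
logits = model(torch.randn(4, 256), torch.randn(4, 128), torch.randn(4, 512))
print(logits.shape)  # torch.Size([4, 6])
```

Late fusion like this keeps each modality's encoder independent, which simplifies swapping a sensor in or out at the edge.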
Bank of New York
Applied Scientist – Capstone Project
  • Developed and fine-tuned large language models for text summarization, conversational AI, and compliance automation using RLHF, LoRA, and prompt engineering.
  • Applied knowledge distillation and transfer learning techniques on financial datasets using PyTorch, TensorFlow, and Hugging Face.
Zoho Corporation – ManageEngine (Log360 Cloud)
Software Developer
  • Designed and implemented distributed backend services using a microservice architecture for a cloud-native SIEM platform.
  • Co-developed a high-throughput HTTP Event Collector using Java Struts, Redux, and REST APIs, achieving a 150% improvement in log ingestion performance via parallel processing and critical path optimization.
  • Led end-to-end development of a containerized search system using Docker, AWS Lambda, and EC2, reducing search latency by 40% while enabling real-time analytics across globally distributed tenants.
Carnegie Mellon University – Language Technologies Institute
Teaching Assistant – 11-785 Deep Learning
  • Conduct weekly recitations for a flagship PhD-level deep learning course with 400+ students, covering PyTorch, speech preprocessing, NAS, and memory-efficient data pipelines.
  • Collaborate with Prof. Bhiksha Raj to develop instructional material and mentor student research projects on LLM reasoning, generative AI, and reinforcement learning.
IIITDM Kancheepuram
Research Intern
  • Conducted research on face forgery detection using deep convolutional neural networks and synthetic media datasets.
National Institute of Technology Calicut
Summer Research Intern
  • Built a music instrument recognition system using CNNs and mel-spectrogram representations, achieving 99.17% accuracy.

Projects
Multimodal Product Category Reranking with Bi-Encoder
PyTorch LoRA Contrastive Learning Transformers
  • Fine-tuned a two-tower bi-encoder reranker (SigLIP2-Base + BGE-Base) on 48K Shopify products, fusing vision and text embeddings via a learned sigmoid gate into a shared 512-d space for listwise candidate ranking.
  • Trained with a two-stage LoRA curriculum (heads-only contrastive pretraining → text LoRA alignment) using multi-positive InfoNCE over a 2,042-entry cross-batch MoCo queue, achieving a best listwise ranking loss of 1.24.
  • Built an interactive 3D visualization of the full embedding space (React + Three.js, custom GLSL shaders) deployed as a zero-backend Cloudflare Worker.
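The learned sigmoid gate mentioned above can be sketched as follows; the input dimensions are illustrative assumptions (the actual SigLIP2/BGE hidden sizes may differ), and the real reranker adds training-specific machinery on top:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedFusion(nn.Module):
    """Fuse vision and text embeddings via a learned sigmoid gate
    into a shared 512-d space, L2-normalized for cosine ranking."""
    def __init__(self, vis_dim=768, txt_dim=768, out_dim=512):
        super().__init__()
        self.vis_proj = nn.Linear(vis_dim, out_dim)
        self.txt_proj = nn.Linear(txt_dim, out_dim)
        self.gate = nn.Linear(vis_dim + txt_dim, out_dim)  # per-dim mixing weight

    def forward(self, vis, txt):
        g = torch.sigmoid(self.gate(torch.cat([vis, txt], dim=-1)))
        fused = g * self.vis_proj(vis) + (1 - g) * self.txt_proj(txt)
        return F.normalize(fused, dim=-1)  # unit vectors for dot-product scoring

fusion = GatedFusion()
emb = fusion(torch.randn(8, 768), torch.randn(8, 768))
print(emb.shape)  # torch.Size([8, 512])
```

A per-dimension gate lets the model lean on text for attribute-heavy products and on vision for visually distinctive ones, rather than fixing a global mixing ratio.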
minGPT
Python PyTorch einops
  • Upgraded a baseline GPT model by implementing Rotary Position Embeddings (RoPE) and Grouped-Query Attention (GQA) in PyTorch, improving positional generalization and attention memory efficiency.
  • Conducted pretraining and fine-tuning experiments on the Shakespeare corpus, benchmarking performance trade-offs of RoPE and GQA individually and jointly, replicating core architectural elements of LLaMA-2 at small scale.
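RoPE itself fits in a few lines. This is a minimal sketch of the rotate-half variant with a hypothetical head dimension, not the project's exact implementation; since the rotation is orthogonal, it preserves each position's vector norm:

```python
import torch

def apply_rope(x, base=10000.0):
    """Apply Rotary Position Embeddings (rotate-half variant) to a
    tensor of shape (batch, seq, dim). Assumes an even dim."""
    b, t, d = x.shape
    half = d // 2
    # Per-pair rotation frequencies, as in the RoPE paper.
    freqs = base ** (-torch.arange(half, dtype=torch.float32) / half)
    angles = torch.arange(t, dtype=torch.float32)[:, None] * freqs[None, :]
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., :half], x[..., half:]
    # 2D rotation applied to each (x1[i], x2[i]) pair.
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

q = apply_rope(torch.randn(2, 16, 64))
print(q.shape)  # torch.Size([2, 16, 64])
```

Because relative offsets fall out of the rotation algebra, RoPE encodes position without learned embedding tables, which is what makes it attractive at small scale.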
LoRA Fine-Tuning for Sentiment Classification
Python PyTorch LoRA Transformers Instruction Tuning
  • Fine-tuned GPT-2 Medium on the Rotten Tomatoes dataset using Low-Rank Adaptation (LoRA), achieving 88% test accuracy while updating only ~5.5% of parameters.
  • Ablated rank r and scaling factor α across multiple PEFT configurations, outperforming the full fine-tuning baseline with faster convergence and stronger generalization.
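The core LoRA idea, a frozen weight plus a trainable low-rank update scaled by α/r, can be sketched as a wrapper around nn.Linear. Dimensions and hyperparameters here are illustrative; the project used Hugging Face PEFT rather than this hand-rolled version:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B A x, with B initialized to zero so the
    wrapped layer starts out identical to the base layer."""
    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())

layer = LoRALinear(nn.Linear(1024, 1024), r=8, alpha=16)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 16384, vs ~1.05M in the frozen base layer
```

At r=8 the adapter adds only 2·r·1024 parameters per layer, which is the source of the small trainable-parameter fraction quoted above.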
Automatic Speech Recognition with Transformer Encoder-Decoders
PyTorch HuggingFace TorchMetrics
  • Built an end-to-end ASR system leveraging Transformer-based encoder-decoder architectures with multihead attention, positional encodings, and sequence masking.
  • Implemented greedy and beam search decoding, achieving a 15% reduction in CER via decoder pretraining and progressive training strategies.
  • Experimented with multilingual embeddings and cross-lingual transfer for low-resource datasets.
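The contrast between greedy and beam search decoding can be illustrated with a toy step function (purely hypothetical probabilities) in which the locally best first token leads to a worse overall sequence:

```python
import math

def beam_search(step_fn, beam_width=2, max_len=5, eos=0):
    """Minimal beam search. step_fn(prefix) returns {token: logprob}.
    Beams that have emitted eos are carried forward unchanged."""
    beams = [((), 0.0)]
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            if prefix and prefix[-1] == eos:
                candidates.append((prefix, score))
                continue
            for tok, lp in step_fn(prefix).items():
                candidates.append((prefix + (tok,), score + lp))
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_width]
    return beams[0]

# Toy model: token 1 looks best at step one (p=0.5 vs 0.49),
# but its continuation is much weaker than token 2's.
def toy_step(prefix):
    if not prefix:
        return {1: math.log(0.5), 2: math.log(0.49)}
    if prefix[-1] == 1:
        return {0: math.log(0.3)}
    return {0: math.log(0.9)}

best, score = beam_search(toy_step)
print(best)  # (2, 0): beam search recovers the higher-probability sequence
```

Greedy decoding would commit to token 1 and end with log p ≈ log(0.15), while the beam keeps both candidates alive long enough to find the better sequence; this is the kind of gain beam search provides over greedy CER in ASR.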

Last updated July 2025. Template adapted from Jon Barron.