Dhivya Sreedhar
I'm a Master's student in Information Systems with a specialization in Machine Learning and NLP at
Carnegie Mellon University.
Previously, I earned my B.E. in Computer Science (with Honors) from
Anna University in 2022.
My interests lie at the intersection of large language models, multimodal learning, and real-world ML systems.
I'm currently working with
Reclamation Factory,
where I build real-time multimodal robotic perception systems for material identification and sorting.
Previously, I spent two years at
Zoho Corporation,
developing distributed ingestion pipelines, search systems, and cloud-scale analytics for security products.
Outside of AI, I enjoy fostering animals, exploring nature, and lifting weights.
I'm actively seeking Machine Learning and Applied Scientist roles starting January 2026 — let's connect!
Email /
CMU Email /
Resume /
GitHub /
LinkedIn /
Scholar
Work & Research Experience
Computer Vision & Machine Learning Intern, Reclamation Factory
- Built a multimodal robotic identification and sorting system on NVIDIA Jetson AGX Orin using NIR/XRF/RGB fusion and Vision Transformer fine-tuning, achieving 93.5% accuracy across six material classes.
- Optimized inference pipelines using TensorRT and ONNX kernels, achieving sub-50ms latency for high-throughput edge deployment.
- Designed multimodal feature fusion pipelines combining spectral, audio, and visual embeddings to improve robustness and cross-domain generalization.
Bank of New York
Applied Scientist – Capstone Project
- Developed and fine-tuned large language models for text summarization, conversational AI, and compliance automation using RLHF, LoRA, and prompt engineering.
- Applied knowledge distillation and transfer learning techniques on financial datasets using PyTorch, TensorFlow, and Hugging Face.
Software Developer, Zoho Corporation
- Designed and implemented distributed backend services using a microservice architecture for a cloud-native SIEM platform.
- Co-developed a high-throughput HTTP Event Collector using Java Struts, Redux, and REST APIs, achieving a 150% improvement in log ingestion performance via parallel processing and critical path optimization.
- Led end-to-end development of a containerized search system using Docker, AWS Lambda, and EC2, reducing search latency by 40% while enabling real-time analytics across globally distributed tenants.
Teaching Assistant – 11-785 Deep Learning, Carnegie Mellon University
- Conduct weekly recitations for a flagship PhD-level deep learning course with 400+ students, covering PyTorch, speech preprocessing, NAS, and memory-efficient data pipelines.
- Collaborate with Prof. Bhiksha Raj to develop instructional material and mentor student research projects on LLM reasoning, generative AI, and reinforcement learning.
Research Intern
- Conducted research on face forgery detection using deep convolutional neural networks and synthetic media datasets.
Summer Research Intern
- Built a music instrument recognition system using CNNs and mel-spectrogram representations, achieving 99.17% accuracy.
Projects
Multimodal Product Category Reranking with Bi-Encoder
PyTorch
LoRA
Contrastive Learning
Transformers
- Fine-tuned a two-tower bi-encoder reranker (SigLIP2-Base + BGE-Base) on 48K Shopify products, fusing vision and text embeddings via a learned sigmoid gate into a shared 512-d space for listwise candidate ranking.
- Trained with a two-stage LoRA curriculum (heads-only contrastive pretraining → text LoRA alignment) using multi-positive InfoNCE over a 2,042-entry cross-batch MoCo queue, achieving a best listwise ranking loss of 1.24.
- Built an interactive 3D visualization of the full embedding space (React + Three.js, custom GLSL shaders) deployed as a zero-backend Cloudflare Worker.
minGPT
Python
PyTorch
einops
- Upgraded a baseline GPT model by implementing Rotary Position Embeddings (RoPE) and Grouped-Query Attention (GQA) in PyTorch, reducing attention memory overhead and improving positional generalization.
- Conducted pretraining and fine-tuning experiments on the Shakespeare corpus, benchmarking performance trade-offs of RoPE and GQA individually and jointly, replicating core architectural elements of LLaMA-2 at small scale.
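The rotary embeddings mentioned above can be sketched in a few lines. This is a minimal, dependency-free illustration of the standard RoPE formulation (rotating consecutive channel pairs by a position-dependent angle), not the project's actual PyTorch implementation:

```python
import math

def rope(x, base=10000.0):
    """Apply Rotary Position Embeddings.

    x: list of vectors (one per position), each of even length `dim`.
    Each channel pair (2i, 2i+1) is rotated by angle
    pos * base**(-i / (dim / 2)), so attention scores between rotated
    query/key vectors depend only on the relative offset between positions.
    """
    out = []
    for pos, vec in enumerate(x):
        half = len(vec) // 2
        rotated = list(vec)
        for i in range(half):
            theta = pos * base ** (-i / half)
            c, s = math.cos(theta), math.sin(theta)
            a, b = vec[2 * i], vec[2 * i + 1]
            # 2-D rotation of the (even, odd) channel pair.
            rotated[2 * i] = a * c - b * s
            rotated[2 * i + 1] = a * s + b * c
        out.append(rotated)
    return out
```

Position 0 is left unchanged (all angles are zero), which is a quick sanity check when wiring RoPE into an attention block.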
LoRA Fine-Tuning for Sentiment Classification
Python
PyTorch
LoRA
Transformers
Instruction Tuning
- Fine-tuned GPT-2 Medium on the Rotten Tomatoes dataset using Low-Rank Adaptation (LoRA), achieving 88% test accuracy while updating only ~5.5% of parameters.
- Ablated rank r and scaling factor α across multiple PEFT configurations, outperforming the full fine-tuning baseline with faster convergence and stronger generalization.
Automatic Speech Recognition with Transformer Encoder-Decoders
PyTorch
HuggingFace
TorchMetrics
- Built an end-to-end ASR system leveraging Transformer-based encoder-decoder architectures with multihead attention, positional encodings, and sequence masking.
- Implemented greedy and beam search decoding, achieving a 15% reduction in CER via decoder pretraining and progressive training strategies.
- Experimented with multilingual embeddings and cross-lingual transfer for low-resource datasets.
Last updated July 2025. Template adapted from Jon Barron.