I'm a Lead Audio AI/ML Engineer at Logitech, where I work on Echo Cancellation for our video-conferencing devices and manage the audio pipeline for an upcoming flagship product. I’ve also co-led the development of a Speech Enhancement algorithm for real-time denoising and dereverberation, and our training approach was recently published at Interspeech 2025.
Previously, I co-founded Echolair, a music AI startup that transforms samples into unique variations, and served as Founder-in-Residence at The Sound of AI Accelerator. I also co-developed the first AI-powered Acoustic Echo Cancellation system at Zoom, successfully deployed in Zoom Rooms, and built NLP models for recommendation systems at Shopee serving eight countries.
I graduated from the Indian Institute of Technology Ropar in 2018 and conducted graduate research at the National University of Singapore on Computational Sound Scene Analysis, publishing in IEEE ICASSP 2019.
When I'm not at my desk, I enjoy composing music and creating original pieces such as Child's Play. I'm also a former member of the NUS Guitar Ensemble. Outside of music, I’ve traveled to over 23 countries, and I like to play badminton, run, and climb rocks.
Winner of 'Most Interesting Use of AI' at the 1st Sound of AI Hackathon 2022. AI-based application that creates mashup videos from a user's favorite videos and songs.
Introduced a one-stage, step-invariant flow-matching model for speech enhancement (SFMSE) that enables high-quality denoising in a single step while matching the perceptual performance of diffusion-based baselines that use ~60 neural evaluations.
Proposed a novel paradigm that uses a model's own encoder as the loss function for speech enhancement, improving performance over traditional handcrafted or deep-feature losses.
Created a metric that evaluates the statistical similarity between two data sources (for example, "does your training data match the deployment conditions?").
Built a two-stage system for claim verification that improves evidence retrieval using enriched question generation, achieving strong results on the AVeriTeC benchmark.
End-to-end unsupervised anomaly detection system for CCTV factory videos at Panasonic. Novel ML algorithm for real-time detection of unauthorized access and machine anomalies.
Novel deep learning architecture for acoustic scene classification. Leverages band-wise temporal information, achieving a 14% relative improvement over the DCASE 2018 baseline.
GitHub Source Code
Top-10 team in the REFUGE Challenge for glaucoma assessment from fundus photographs. Developed a two-level model for optic disc localization and cup/disc segmentation.
Enhanced Capsule Networks with multiple capsule levels and DenseNet integration. Published in ACCV 2018 and WiML NeurIPS 2019. Achieved state-of-the-art on MNIST with 20-fold reduction in training iterations.
GitHub Source Code
Multi-modal system that uses SVMs for audio processing and an MLP for image processing to create synchronized multi-instrument videos, deployed as an Android application.