Graduate TA
Sep 2023 - Dec 2023
Stony Brook University
Stony Brook, NY
- Graded an EDA assignment on a Kaggle NLP dataset for 240 students.
- Developed final exam questions covering various data science concepts and assessed responses on k-nearest neighbors and distance metrics for the same cohort.
Data Science
Exploratory Data Analysis
Teaching Assistant
Data Science Intern
May 2023 - Aug 2023
HPE Aruba Networking
San Jose, CA
- Developed a metric to quantify connectivity failures and applied it within a clustering framework in PySpark to detect failures caused by device misconfiguration and other interoperability issues in telemetry data.
- Devised a new method to compute the threshold for structured failures, reducing clustering run-time by 5% and improving its resilience to random failures.
Data Science
Computer Networks
Clustering
Metrics
Graduate TA
Feb 2023 - May 2023
Stony Brook University
Stony Brook, NY
- Prepared and delivered lectures to 100+ students on NLP topics such as word embeddings (Word2Vec) and sentence embeddings (Deep Averaging Networks, Long Short-Term Memory (LSTM), Transformers).
- Helped develop PyTorch assignments on the training (gradient descent types), validation (perplexity, BLEU score) and explainability (probing, de-biasing) of large language models.
Natural Language Processing
Deep Learning
Large Language Models
Embeddings
Transformers
LSTM
RNN
Data Scientist
May 2021 - Jul 2022
Bottomline Technologies
Bengaluru, KA
- Created synthetic data using NLP augmentation and built a multi-model classification pipeline using Scikit-Learn to auto-match payments to invoices, improving precision by 20%.
- Applied quantile regression in Python to forecast invoice payment dates, scoring 45% higher than baseline models on a custom accuracy metric, and collaborated with the data visualization team to generate plots of future cash flow.
- Achievements: Received a Certificate of Excellence in the Spot Award category in Feb 2022 for 'valuable contribution to the multi-model classification architecture'.
FinTech
Quantile Regression
Classification Metrics
Payment Forecasting
Visualization
Decision Scientist
Jul 2019 - May 2021
Mu Sigma Services Limited
Bengaluru, KA
- Led a team of 5 Data Scientists to build a Sammon projection of stock price correlations and train a classification model on engineered network metrics to identify the market regime with 60% precision.
- Implemented Bollinger Bands, Ornstein-Uhlenbeck and Regime-Switch models to detect anomalies in financial security prices. Used Tidyverse in R to vectorize the anomaly detection engine, decreasing run-time by 83%.
- Built an RMarkdown notebook to automate end-to-end analysis of multivariate time-series data using ARIMA, Vector AutoRegression and their variants, which enabled client-side teams to reduce model development time by 2 weeks.
- Developed a Tableau dashboard summarizing the usage of in-house artefacts by client-facing teams at Mu Sigma and derived actionable insights from the data to assist C-suite executives in making critical decisions for business units.
- Achievements: Received a Certificate of Excellence in the Spot Award category in Feb 2020, Jul 2020, Nov 2020, Feb 2021 and Apr 2021 for 'driving the Algos thread and delivering the highest quality of work'.
Decision Sciences
Research and Development Labs
Statistical Arbitrage
Time-Series Analysis
Classification Models
R Programming
Anomaly Detection
Vector AutoRegression
LSTM
Intern
Jan 2019 - May 2019
NEC Technologies India
Noida, UP
- Built a multi-class classification model to identify the lease absorption potential of channel partners, thus decreasing the default rate by 25% for a financial services firm.
Lease Absorption
Financial Services
Classification Model