Hi! I'm Atharva Kulkarni

Computer Engineering Graduate at Stony Brook University, NY

Python Developer | AWS Cloud Foundations Certified | Aspiring Data Scientist | Artificial Intelligence & Machine Learning Enthusiast

About

I'm an aspiring software professional with a diverse technical skill set spanning cloud platforms, machine learning, and software development. My cloud experience includes the AWS Cloud Foundations certification, training through the AWS Graduate Academy, and working knowledge of Azure cloud services gained through several Azure courses. Although I'm new to professional software development, I've built a solid foundation in Python programming, machine learning concepts, and data analysis. Through self-study and hands-on projects, I've developed the skills needed to contribute effectively while continuing to learn and grow. I'm now eager to bring my technical knowledge and enthusiasm for innovation to a professional role where I can contribute to challenging projects.

ML / AI

TensorFlow | PyTorch | Neural Networks | Deep Learning | LLM Fine-Tuning | RAG | LangChain | Hugging Face Transformers | CUDA

Cloud

Microsoft Azure Services | Amazon Web Services | Docker | Git

Data Science

Pandas | Visualization Tools | SQL | Feature Engineering | Dimensionality Reduction | Clustering | NumPy


Projects

View some of my latest projects

Project
Handwritten Equation Solver using HOG & SVM

Developed a machine learning pipeline that reads handwritten numerical equations and computes their results. The system uses image segmentation techniques, namely adaptive thresholding and morphological operations, to isolate individual characters. For feature extraction, it uses Histogram of Oriented Gradients (HOG) to capture shape and texture details, and these features are then classified using a Support Vector Machine (SVM) with a polynomial kernel. I used grid search to systematically tune the hyperparameters of both the HOG descriptor and the SVM.
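
To illustrate the feature-extraction step, here is a minimal sketch of the core of HOG: a magnitude-weighted histogram of gradient orientations for a single cell. It is not the project's code. The function name `cell_orientation_histogram`, the 9-bin default, and the simple central-difference gradients are assumptions for illustration, and a full HOG descriptor would also add block normalization before the features reach the SVM.

```python
import math


def cell_orientation_histogram(cell, n_bins=9):
    """Return an L2-normalized histogram of unsigned gradient orientations.

    `cell` is a list of rows of grayscale intensities. Border pixels are
    skipped because central differences need a neighbor on each side.
    """
    rows, cols = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins  # unsigned orientations cover 0..180 degrees

    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical gradient
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Each pixel votes for one bin, weighted by gradient magnitude.
            hist[int(angle // bin_width) % n_bins] += magnitude

    norm = math.sqrt(sum(h * h for h in hist))
    return [h / norm for h in hist] if norm > 0 else hist
```

A cell whose intensity rises from left to right puts all of its weight in the first bin (orientation near 0 degrees), while one that rises from top to bottom lands in the middle bin (near 90 degrees). That difference in histogram shape is what lets a classifier tell strokes apart.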

Machine Learning
NLP
Computer Vision
Image Segmentation
HOG (Histogram of Oriented Gradients)
SVM (Support Vector Machines)
Hyperparameter Tuning
Data Science
PyTorch
Feature Engineering
Polynomial kernel
Project
AWS Hosted Portfolio

Built and deployed a personal portfolio application hosted on AWS, leveraging an end-to-end cloud infrastructure for scalability and reliability. The deployment includes AWS CloudFront as a Content Delivery Network (CDN) for faster global content delivery and EC2 instances for hosting the application. A custom VPC configuration ensures secure network isolation, while tailored security group policies manage access control. The site uses a custom domain from Namecheap, with SSL/TLS encryption configured through AWS Certificate Manager to ensure secure communication.

Django
CDN
AWS CloudFront
EC2
VPC
SSL/TLS
Domain Integration
Automated Deployment Pipeline
DNS Management
Bootstrap 5
Back-End Development
Cloud Architecture
Project
AI Financial Analyst

Developed an AI financial chatbot that uses a fine-tuned LLaMA-3 8B model to provide personalized investment recommendations and financial insights. The system integrates real-time market data from Yahoo Finance and NewsAPI, combining quantitative metrics with sentiment analysis of financial news to deliver actionable advice. The model was fine-tuned using QLoRA (Quantized Low-Rank Adaptation). Key features include contextual financial Q&A, sentiment evaluation, and trend analysis.

LLM
Large Language Model
QLoRA
yfinance
Hugging Face Transformers
PEFT
API integration
Project
Volatility Forecasting Using GARCH Model

Built a volatility forecasting model using the GARCH(1,1) framework on 13 years of S&P 500 historical data. The model achieved an RMSE of 0.0058 on its volatility predictions. Implemented a rolling-window backtesting framework to validate the model's performance and compute Value at Risk (VaR) at a 95% confidence level, yielding a VaR of 0.0205.
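
As a rough illustration of what GARCH(1,1) computes, the sketch below runs the conditional-variance recursion and turns a volatility estimate into a VaR figure. It is a sketch under stated assumptions, not the project's implementation: the parameters `omega`, `alpha`, and `beta` are taken as given rather than fitted, returns are assumed to be zero-mean, and the VaR uses a normal-distribution assumption that the project may not share.

```python
import math
from statistics import NormalDist


def garch11_variances(returns, omega, alpha, beta):
    """Conditional variances from the GARCH(1,1) recursion.

    sigma2[t] = omega + alpha * returns[t-1]**2 + beta * sigma2[t-1]
    The recursion is seeded with the mean squared return (zero-mean assumption).
    """
    if not returns:
        return []
    sigma2 = [sum(r * r for r in returns) / len(returns)]
    for r in returns[:-1]:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2


def forecast_next_variance(returns, omega, alpha, beta):
    """One-step-ahead variance forecast after the last observed return."""
    sigma2 = garch11_variances(returns, omega, alpha, beta)
    return omega + alpha * returns[-1] ** 2 + beta * sigma2[-1]


def parametric_var(sigma, confidence=0.95):
    """VaR as a positive loss fraction, assuming zero-mean normal returns."""
    return NormalDist().inv_cdf(confidence) * sigma
```

In a rolling-window backtest, the forecast would be recomputed on each window, and `parametric_var(math.sqrt(forecast))` would then be compared against the return actually realized on the following day.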

Machine Learning
Financial Risk Modeling
Volatility Analysis
GARCH
Time Series Analysis
Rolling-Window Backtesting
Risk Assessment
Quantitative Finance
Python
Statistical Modeling
Project
Evaluate Student Summaries

Engineered an NLP pipeline to predict the quality of student summaries, achieving an MSE of 0.21 using a Random Forest model. The system extracts linguistic features by analyzing patterns, syntactic structures, and semantic coherence in the written content. In addition, a chi-square statistical analysis module identifies the vocabulary that best discriminates between high- and low-quality summaries, providing actionable insights for educational feedback.
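
The chi-square idea can be sketched as follows: for each word, build a 2x2 table of how many high- and low-quality summaries contain it, then score how far that table departs from independence. This is a minimal illustration, not the project's module. The function names, the whitespace tokenization, and the use of document presence rather than raw counts are assumptions.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for a 2x2 contingency table.

    a: high-quality docs containing the term   b: low-quality docs containing it
    c: high-quality docs without the term      d: low-quality docs without it
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # the term appears in every doc or in none: no signal
    return n * (a * d - b * c) ** 2 / denom


def rank_terms(high_docs, low_docs):
    """Rank vocabulary by how strongly each term separates the two groups."""
    high_sets = [set(doc.lower().split()) for doc in high_docs]
    low_sets = [set(doc.lower().split()) for doc in low_docs]
    vocab = set().union(*high_sets, *low_sets)

    scores = {}
    for term in vocab:
        a = sum(term in s for s in high_sets)
        b = sum(term in s for s in low_sets)
        scores[term] = chi_square_2x2(a, b, len(high_sets) - a, len(low_sets) - b)

    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Words that appear equally often in both groups score zero, so common function words drop to the bottom of the ranking without needing a stop-word list.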

Machine Learning
NLP
Feature Engineering
Python
Feature Extraction
Chi-Square Analysis
Linguistic Analysis
Project
Bim-Viewer

Developed a lightweight 3D BIM visualization engine that loads and renders Industry Foundation Classes (IFC) files. Built on a modular Python codebase, the application integrates the PyQt5 GUI framework with an OpenGL rendering pipeline and uses IfcOpenShell to parse BIM files. The viewer features an interactive navigation system with smooth pan, zoom, and rotation, implemented through PyQt5 event handling, so users can explore and analyze detailed building models.

BIM Visualization
IfcOpenShell
OpenGL
PyQt5
Python
Project
Movie-Data-Analysis

This project involves the development of a comprehensive movie data analysis dashboard, focusing on trends and insights from movies released between 2021 and 2023. Data was collected using the OMDB API and stored in a PostgreSQL database, with SQLAlchemy used for database integration. The analysis pipeline utilized Pandas for data cleaning and transformation, followed by detailed exploratory data analysis (EDA) to uncover patterns in movie ratings, view counts, and title word frequencies.

Python
Pandas
Data Visualization
Exploratory Data Analysis (EDA)
SQLAlchemy
PostgreSQL
Database integration
API integration
Seaborn
Matplotlib
Statistical Analysis
Trend Analysis
Project
Dynamic Embedding Model for Retrieval-Augmented Generation (RAG)

This project builds a dynamic embedding pipeline that classifies incoming queries with a Random Forest model trained on a custom dataset and then selects the most suitable domain-specific embedding model. It re-embeds both queries and documents, indexes them in an in-memory vector database using ChromaDB, and uses cosine similarity for retrieval. The retrieved context is then used to construct prompts for language model generation, improving semantic relevance and retrieval accuracy in a retrieval-augmented generation (RAG) system.
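
The routing idea can be sketched in a few lines: pick a domain for the query, embed the query and documents with that domain's embedder, and rank the documents by cosine similarity. Everything below is a toy stand-in, not the project's code. The keyword-overlap `classify_domain` replaces the trained Random Forest, the bag-of-words embedders replace real embedding models, a plain list replaces ChromaDB, and the `DOMAIN_VOCAB` entries are made up for illustration.

```python
import math
from collections import Counter

# Hypothetical domain vocabularies standing in for domain-specific models.
DOMAIN_VOCAB = {
    "finance": ["stock", "bond", "yield", "risk"],
    "medicine": ["dose", "symptom", "drug", "patient"],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def make_bow_embedder(vocabulary):
    """Toy embedder: a vector of word counts over a fixed vocabulary."""
    def embed(text):
        counts = Counter(text.lower().split())
        return [float(counts[word]) for word in vocabulary]
    return embed


def classify_domain(query):
    """Stand-in classifier: the domain whose vocabulary overlaps the query most."""
    words = set(query.lower().split())
    return max(DOMAIN_VOCAB, key=lambda d: len(words & set(DOMAIN_VOCAB[d])))


def retrieve(query, documents, top_k=1):
    """Route the query to a domain embedder, then rank documents by similarity."""
    domain = classify_domain(query)
    embed = make_bow_embedder(DOMAIN_VOCAB[domain])
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return domain, [doc for _, doc in scored[:top_k]]
```

The key design point is that the query and the documents are embedded by the same model after routing, since vectors from different embedding models are not comparable under cosine similarity.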

Machine Learning
NLP
PyTorch
LLM
Hugging Face Transformers
Python
RAG
Embeddings
ChromaDB
Dynamic Embedding