Projects - Hi, I'm Sebastian Marrero
- Built fraud detection models using Logistic Regression and Random Forest (scikit-learn)
- Implemented SMOTE resampling to address severe class imbalance (imbalanced-learn)
- Performed exploratory analysis on application velocity, credit risk, email domain, and behavioral patterns (pandas, seaborn)
- Engineered features and imputed missing values with NumPy and pandas before model training
- Visualized class imbalance, fraud likelihood by categorical attributes, and feature impacts (matplotlib)
- Identified key fraud signals such as high velocity metrics and weak credit histories using feature importance rankings
- Deployed notebook-based reporting with Jupyter and integrated all outputs into a Jekyll-based portfolio site
GitHub Repo
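For readers unfamiliar with SMOTE, the resampling step in the fraud project can be illustrated with a small standard-library sketch. It is a simplified version of the idea (interpolating between a minority sample and one of its nearest minority neighbors), not imbalanced-learn's implementation and not the project's actual code. The function name `smote_like_oversample`, the toy feature values, and the class counts are all assumptions made for illustration.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Simplified illustration of the SMOTE idea, not imbalanced-learn's code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbor.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Hypothetical toy data: two numeric features per fraudulent application.
fraud = [(0.9, 0.8), (1.0, 0.7), (0.8, 0.9), (1.1, 1.0)]
legit_count = 12  # assumed size of the majority class

new_points = smote_like_oversample(fraud, n_new=legit_count - len(fraud))
balanced_fraud = fraud + new_points
print(len(balanced_fraud), "fraud samples after oversampling")
```

In practice, resampling of this kind is applied to the training split only, so that synthetic points never leak into the evaluation data.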
- Built churn prediction models using Logistic Regression and Random Forest (scikit-learn)
- Performed exploratory analysis on tenure, support behavior, subscription type, and user engagement
- Visualized churn patterns and feature impacts using matplotlib and seaborn
- Improved model AUC through Random Forest tuning and feature scaling with StandardScaler
- Delivered business insights to guide customer retention strategies based on high-risk behaviors
GitHub Repo
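The churn project reports its improvement in terms of AUC. As a reminder of what that metric measures, the sketch below computes ROC AUC as the fraction of (churned, retained) pairs in which the churned customer receives the higher score. The labels and scores are invented toy values, not results from the project, and `roc_auc` is a hypothetical helper written for this illustration rather than the scikit-learn function.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. Quadratic in the class sizes, fine for a small example.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = churned, scores are predicted churn probabilities.
churned = [1, 0, 1, 0, 0, 1]
baseline_scores = [0.60, 0.55, 0.40, 0.30, 0.45, 0.70]
tuned_scores = [0.80, 0.35, 0.65, 0.20, 0.40, 0.90]

baseline_auc = roc_auc(churned, baseline_scores)
tuned_auc = roc_auc(churned, tuned_scores)
print(f"baseline AUC={baseline_auc:.3f}, tuned AUC={tuned_auc:.3f}")
```

Because AUC depends only on how customers are ranked, it is a convenient way to compare models before choosing a probability threshold for retention outreach.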
- Analyzed global start-up trends using SQL (MySQL Workbench) and Python (pandas, SQLAlchemy)
- Explored investment, funding, and valuation metrics across industries, countries, and investors
- Created business-driven visualizations and ROI metrics using Tableau Public
GitHub Repo
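The kind of SQL aggregation behind the start-up analysis can be sketched as follows. This is a minimal example under stated assumptions: an in-memory SQLite database stands in for MySQL so the snippet runs without a server, and the `startups` table, its columns, and all rows are hypothetical. The valuation-to-funding ratio is one possible ROI-style metric, not necessarily the one used in the project.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE startups (name TEXT, industry TEXT, country TEXT, "
    "funding_usd_m REAL, valuation_usd_m REAL)"
)

# Made-up rows for illustration only.
rows = [
    ("A", "Fintech", "US", 100.0, 1000.0),
    ("B", "Fintech", "UK", 200.0, 1400.0),
    ("C", "Health", "US", 50.0, 1000.0),
    ("D", "Health", "IN", 150.0, 1200.0),
]
conn.executemany("INSERT INTO startups VALUES (?, ?, ?, ?, ?)", rows)

# Per-industry company count, total funding, and valuation per funding dollar.
query = """
    SELECT industry,
           COUNT(*)                                  AS companies,
           SUM(funding_usd_m)                        AS total_funding,
           SUM(valuation_usd_m) / SUM(funding_usd_m) AS valuation_per_dollar
    FROM startups
    GROUP BY industry
    ORDER BY valuation_per_dollar DESC
"""
summary = conn.execute(query).fetchall()
for industry, companies, total_funding, ratio in summary:
    print(industry, companies, total_funding, round(ratio, 2))
conn.close()
```

The same query text could be passed to `pandas.read_sql` with a SQLAlchemy engine to get a DataFrame for further analysis or export to a visualization tool.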
- Analyzed NYC short-term rental trends in Airbnb housing data using SQL and Python
- Performed sentiment analysis using TextBlob in a Jupyter notebook
- Integrated and enhanced multiple datasets using Python and PostgreSQL to uncover time-based and categorical trends
GitHub Repo