Projects - Hi, I'm Sebastian Marrero

Credit Card Fraud Data Set Analysis

  • Built fraud detection models using Logistic Regression and Random Forest (scikit-learn)
  • Implemented SMOTE resampling to address severe class imbalance (imbalanced-learn)
  • Performed exploratory analysis on application velocity, credit risk, email domain, and behavioral patterns (pandas, seaborn)
  • Engineered features and imputed missing values with NumPy and pandas ahead of model training
  • Visualized class imbalance, fraud likelihood by categorical attributes, and feature impacts (matplotlib)
  • Identified key fraud signals such as high velocity metrics and weak credit histories using feature importance rankings
  • Deployed notebook-based reporting with Jupyter and integrated all outputs into a Jekyll-based portfolio site

GitHub Repo
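
The SMOTE step in the fraud project can be illustrated with a small sketch. The project itself used imbalanced-learn; the code below is not that implementation but a hand-rolled, NumPy-only version of the core idea (interpolating between a minority row and one of its nearest minority neighbors). The function name `smote_like_oversample`, the synthetic data, and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def smote_like_oversample(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows by interpolating between neighbors.

    Hypothetical helper for illustration; needs at least two minority rows.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)

    # Pairwise squared distances between minority rows.
    diffs = X_min[:, None, :] - X_min[None, :, :]
    d2 = (diffs ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a row is never its own neighbor

    # Indices of the k nearest minority neighbors of each row.
    neighbors = np.argsort(d2, axis=1)[:, :k]

    # Pick a base row, one of its neighbors, and a random point between them.
    base = rng.integers(0, n, size=n_new)
    picked = neighbors[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[picked] - X_min[base])


# Synthetic stand-in data (assumed): 990 legitimate rows, 10 fraud rows.
rng = np.random.default_rng(42)
X_legit = rng.normal(0.0, 1.0, size=(990, 3))
X_fraud = rng.normal(3.0, 1.0, size=(10, 3))

X_new = smote_like_oversample(X_fraud, n_new=len(X_legit) - len(X_fraud))
X_fraud_balanced = np.vstack([X_fraud, X_new])
print(X_legit.shape, X_fraud_balanced.shape)
```

Because each synthetic row lies on a segment between two real fraud rows, the new points stay inside the region the minority class already occupies. Resampling of this kind is normally applied to the training split only, so that the test set keeps the original class balance.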

SaaS Subscription Churn Modeling Analysis

  • Built churn prediction models using Logistic Regression and Random Forest (scikit-learn)
  • Performed exploratory analysis on tenure, support behavior, subscription type, and user engagement
  • Visualized churn patterns and feature impacts using matplotlib and seaborn
  • Improved model AUC through Random Forest tuning and feature scaling with StandardScaler
  • Delivered business insights to guide customer retention strategies based on high-risk behaviors

GitHub Repo
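
The churn modeling steps listed above (feature scaling with StandardScaler, Random Forest tuning, comparison by AUC) might look roughly like the following scikit-learn sketch. It runs on synthetic data from `make_classification` rather than the project's data set, and the parameter grid, split sizes, and variable names are assumptions chosen only to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a churn table (assumed): about 20% churners.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4,
    weights=[0.8, 0.2], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0,
)

# Logistic Regression benefits from scaled inputs, so the scaler sits in
# the same pipeline and is fitted on the training split only.
logit = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
logit.fit(X_train, y_train)
logit_auc = roc_auc_score(y_test, logit.predict_proba(X_test)[:, 1])

# Small illustrative grid for the Random Forest, scored by cross-validated AUC.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [4, None]},
    scoring="roc_auc",
    cv=3,
)
search.fit(X_train, y_train)
forest_auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])

print(f"Logistic Regression AUC: {logit_auc:.3f}")
print(f"Random Forest AUC: {forest_auc:.3f} with {search.best_params_}")
```

Tree-based models such as Random Forest are insensitive to feature scale, which is why the scaler is attached only to the Logistic Regression pipeline in this sketch.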

Unicorn Company (Start-up) Analysis

  • Analyzed global start-up trends using SQL (MySQL Workbench) and Python (pandas, SQLAlchemy)
  • Explored investment, funding, and valuation metrics across industries, countries, and investors
  • Created business-driven visualizations and ROI metrics using Tableau Public

GitHub Repo

Airbnb Market Analysis

  • Analyzed NYC short-term rental trends in Airbnb housing data using SQL and Python
  • Performed sentiment analysis using TextBlob in Jupyter Notebook
  • Integrated and enhanced multiple datasets using Python and PostgreSQL to uncover time-based and categorical trends

GitHub Repo