📉 Telecom Customer Churn: End-to-End ML Pipeline

Production-style churn prediction workflow with SHAP explainability and profit-driven evaluation.

Streamlit dashboard showing churn probability predictions

Project Overview

Built on the IBM Telco dataset, this project delivers an end-to-end churn prediction system covering data acquisition and cleaning, feature engineering, model selection and evaluation, and deployment as a Streamlit app. It incorporates leakage prevention, profit-optimized decision thresholds, and SHAP-based interpretability so that predictions translate into actionable insights.
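To illustrate what a profit-optimized decision threshold can look like, here is a minimal sketch in plain Python. It is not the project's actual evaluation code. The function names and the payoff figures (`retained_value`, `offer_cost`) are hypothetical, and it assumes, for simplicity, that a retention offer always keeps a customer who would otherwise churn.

```python
def campaign_profit(y_true, proba, threshold, retained_value=100.0, offer_cost=10.0):
    """Profit of contacting every customer whose churn probability >= threshold.

    retained_value and offer_cost are hypothetical payoffs, not project figures.
    """
    profit = 0.0
    for label, p in zip(y_true, proba):
        if p >= threshold:
            # Every contacted customer costs one offer; only real churners pay back.
            profit += (retained_value if label == 1 else 0.0) - offer_cost
    return profit


def best_threshold(y_true, proba, **payoffs):
    """Sweep thresholds 0.05, 0.10, ..., 0.95 and return (threshold, profit)."""
    grid = [i / 100 for i in range(5, 100, 5)]
    scored = ((t, campaign_profit(y_true, proba, t, **payoffs)) for t in grid)
    # max() keeps the first (lowest) threshold when several tie on profit.
    return max(scored, key=lambda pair: pair[1])


# Toy validation labels (1 = churned) and predicted churn probabilities.
y_valid = [1, 0, 1, 0, 0, 1]
p_valid = [0.91, 0.22, 0.63, 0.44, 0.12, 0.33]
threshold, profit = best_threshold(y_valid, p_valid)
print(threshold, profit)  # 0.25 260.0
```

In this toy case the most profitable cutoff is well below the default 0.5, because missing a churner costs far more than sending an unnecessary offer. The threshold should be chosen on validation data, not on the test set used for the final report.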

Key Features

Tech Stack ⚙️

  • Python · pandas, scikit-learn, xgboost, shap
  • Data ingestion via kagglehub
  • Visualization: matplotlib, Altair, Plotly
  • Deployment: Streamlit
  • Model persistence: joblib
  • Environment: venv (dependency pins for reproducibility)
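As a rough sketch of how scikit-learn and joblib from this stack can combine to prevent leakage, the example below wraps preprocessing and the classifier in a single `Pipeline`, so scaling and encoding statistics are learned from the training split only. The data is synthetic and the column layout is hypothetical, not the real Telco schema. `LogisticRegression` stands in for the project's model to keep the example light.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data (hypothetical columns):
# 0 = tenure in months, 1 = monthly charges, 2 = contract type code.
rng = np.random.default_rng(0)
n = 400
tenure = rng.integers(1, 72, size=n)
charges = rng.uniform(20, 120, size=n)
contract = rng.integers(0, 3, size=n)
X = np.column_stack([tenure, charges, contract]).astype(float)
# Toy label: short tenure and high charges make churn more likely.
logit = -0.05 * tenure + 0.03 * charges - 1.0
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int)

# Split first, so nothing about the test rows reaches the fitted transformers.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

preprocess = ColumnTransformer([
    ("num", StandardScaler(), [0, 1]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), [2]),
])
model = Pipeline([("prep", preprocess), ("clf", LogisticRegression(max_iter=1000))])
model.fit(X_train, y_train)  # scaler and encoder are fitted on training rows only

# Persist the whole pipeline so the app applies identical preprocessing.
path = os.path.join(tempfile.mkdtemp(), "churn_pipeline.joblib")
joblib.dump(model, path)
restored = joblib.load(path)
test_proba = restored.predict_proba(X_test)[:, 1]
```

Saving the fitted pipeline as one object, instead of the bare classifier, means the dashboard cannot accidentally apply different preprocessing from the one used in training.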

Try It Yourself

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python -m src.models.train --csv "data/raw/Telco-Customer-Churn.csv"
streamlit run app/Home.py

🔗 View Source Code on GitHub
📊 See Full Analysis

🔍 Blog Post