Loan defaulter prediction using data-driven methods

Loan default prediction is a fundamental problem in credit risk assessment, where early identification of high-risk borrowers can support informed lending and risk management decisions. With the increasing availability of large-scale financial data, machine learning methods have become widely adopted for modeling default risk due to their ability to capture complex, non-linear relationships.

This project implements a supervised machine learning pipeline for loan default prediction using public loan-level data, inspired by prior ACM research on credit risk modeling. The study focuses on leakage-aware preprocessing, feature engineering across borrower and loan attributes, and handling class imbalance through cost-sensitive learning.
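As a minimal sketch of the cost-sensitive idea mentioned above, the snippet below derives per-class weights that are inversely proportional to class frequency and uses them in a weighted log loss. It is an illustration under stated assumptions, not the project's actual implementation: the function names `balanced_class_weights` and `weighted_log_loss`, the toy labels, and the choice of the `n_samples / (n_classes * class_count)` heuristic are all hypothetical.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def weighted_log_loss(labels, probs, weights, eps=1e-12):
    """Class-weighted binary cross-entropy, normalised by the total weight."""
    total_loss = 0.0
    total_weight = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        w = weights[y]
        total_loss += -w * (math.log(p) if y == 1 else math.log(1.0 - p))
        total_weight += w
    return total_loss / total_weight


# Hypothetical toy labels: 1 = default, 0 = repaid (8 repaid, 2 defaults).
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
weights = balanced_class_weights(y)
```

With these weights, an error on a rare default contributes more to the loss than an error on a common repaid loan, which is the effect cost-sensitive learning aims for on imbalanced data.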

To evaluate robustness and generalization, experiments are conducted across two heterogeneous financial datasets: peer-to-peer installment loan data and an academic credit card default benchmark. Model performance is assessed using ranking-based metrics, and interpretability is emphasized through analysis of key drivers influencing default risk.
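The text does not name the specific ranking-based metrics used. One common choice is ROC AUC, the probability that a randomly chosen defaulter receives a higher score than a randomly chosen non-defaulter. The sketch below computes it with the rank-sum (Mann-Whitney U) formulation; the function name `roc_auc` and the convention that label 1 means default are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """ROC AUC via the rank-sum (Mann-Whitney U) formulation, with tie handling."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    # Sort indices by score and assign 1-based average ranks to tied scores.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        avg_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = avg_rank
        start = end + 1

    # U statistic for the positive class, normalised by the number of pairs.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = pos_rank_sum - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)
```

Because the metric depends only on how the scores are ordered, it is unaffected by the decision threshold, which makes it a reasonable fit for comparing models across datasets with different default rates.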

Repository contents:

data
papers
.gitignore
.python-version
README.md
main.py
pyproject.toml
uv.lock
