Major Project · Data Science

UPI Fraud Detection

250K Transactions
₹328M Volume Analysed
4 ML Models Evaluated
Best: CatBoost · AUC 0.92
Total Transactions
250K
Across all UPI types
Total Amount
₹328M
Total transaction volume
Success Rate
95.05%
237.6K successful
Failed Transactions
12.38K
4.95% failure rate
Merchant Category
Which Merchants Drive the Most Transaction Value?
Transaction Status
95% of UPI Transactions Complete Successfully
237,620
SUCCESS · 95.05%
12,380
FAILED · 4.95%
Transaction Type
Which Transaction Type Do Users Prefer?
Monthly Volume
Transaction Volume Stays Consistent Year-Round
Total Transactions
250K
Total Amount
₹328M
Success Rate
95.05%
Age Group Activity
Ages 26–35 Are the Most Active UPI Senders
Bank Performance
Which Bank Sends the Highest Transaction Value?
Network Type
Which Network Type Has the Highest Fraud Rate?
Age Group Risk
Which Age Group Has the Highest Fraud Rate?
Fraudulent Transactions
480
Out of 250K total
Fraudulent Amount
₹720K
Total fraud exposure
Overall Fraud Rate
0.19%
Industry avg: ~0.1–0.3%
State-wise Fraud
Which States Have the Most Fraudulent Transactions?
Merchant Targeting
Shopping & Grocery Merchants Are Most Targeted
Time of Day
Fraud Spikes Sharply Between 8PM and 11PM
Day of Week
Which Days of the Week See the Most Fraud?
Best Accuracy
0.97
CatBoost & Random Forest
Best F1 Score
0.38
CatBoost
Best Precision
0.25
CatBoost
Best Recall
0.83
CatBoost
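The F1 score is the harmonic mean of precision and recall, so the three CatBoost cards above can be cross-checked against each other. The sketch below assumes the reported precision (0.25) and recall (0.83) come from the same operating point; the function name is illustrative and not taken from the project code.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported CatBoost figures from the metric cards.
catboost_f1 = f1_from_precision_recall(precision=0.25, recall=0.83)
print(round(catboost_f1, 2))  # 0.38
```

A recall of 0.83 with a precision of 0.25 means most fraud is caught, but roughly three of every four alerts are false alarms, which is why F1 stays low despite the high recall.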
Model Comparison
How Do All 4 Models Compare Across Every Metric?
Model · Accuracy · F1 Score · Precision · Recall · Verdict
F1 Score Comparison
CatBoost Achieves the Best Fraud Detection F1 of 0.38
Multi-Metric Overview
High Accuracy Alone Is Misleading — F1 Score Decides the Winner
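A quick way to see why accuracy misleads here is to score a trivial classifier that labels every transaction as legitimate. The sketch below uses the counts reported in this deck (480 fraudulent out of 250K) and assumes the evaluation data has the same class balance, which the deck does not state.

```python
TOTAL_TRANSACTIONS = 250_000  # reported total
FRAUD_TRANSACTIONS = 480      # reported fraudulent count


def always_legit_metrics(total: int, fraud: int) -> tuple[float, float]:
    """Accuracy and recall of a classifier that never flags fraud."""
    true_negatives = total - fraud  # every legitimate transaction is "correct"
    accuracy = true_negatives / total
    recall = 0.0                    # no fraud case is ever caught
    return accuracy, recall


baseline_accuracy, baseline_recall = always_legit_metrics(
    TOTAL_TRANSACTIONS, FRAUD_TRANSACTIONS
)
print(f"{baseline_accuracy:.4f}")  # 0.9981
print(baseline_recall)             # 0.0
```

Under that assumption, a model that detects nothing still reaches about 99.8% accuracy, so accuracy cannot separate a useful fraud detector from a useless one.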
0.92
CatBoost AUC
0.89
Random Forest AUC
0.85
Decision Tree AUC
0.82
Logistic Regression AUC
ROC Curve
ROC Curve — All Models
Precision-Recall
Precision-Recall Curve — All Models
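The AUC values above can be read as the probability that a randomly chosen fraudulent transaction receives a higher score than a randomly chosen legitimate one. The sketch below computes ROC AUC that way on a small made-up sample; the labels and scores are illustrative only and are not drawn from the project data.

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """ROC AUC as the fraction of (fraud, legit) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in sample size, so this
    is for illustration rather than for large datasets.
    """
    fraud_scores = [s for y, s in zip(labels, scores) if y == 1]
    legit_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not fraud_scores or not legit_scores:
        raise ValueError("AUC needs at least one sample of each class")

    correct_pairs = 0.0
    for fraud in fraud_scores:
        for legit in legit_scores:
            if fraud > legit:
                correct_pairs += 1.0
            elif fraud == legit:
                correct_pairs += 0.5
    return correct_pairs / (len(fraud_scores) * len(legit_scores))


# Made-up labels (1 = fraud) and model scores.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.60, 0.90]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)  # 0.875
```

Because AUC is computed over all thresholds, it does not by itself say how the model behaves at the single threshold used in production.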
Best Model
CatBoost
Optimal threshold: 0.657
Top Feature
Amount
Highest SHAP impact
Features Used
19
In final model
SHAP Mean Feature Importance
Top Features Driving CatBoost Predictions
Threshold Tuning
CatBoost — Optimal Threshold = 0.657
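One common way to arrive at a threshold like this is to sweep candidate cutoffs and keep the one with the highest F1. The deck does not state which criterion produced 0.657, so the sketch below assumes F1 maximisation and runs on made-up scores; the function name and data are illustrative.

```python
def best_f1_threshold(labels: list[int], scores: list[float]) -> tuple[float, float]:
    """Return (threshold, f1) maximising F1 when predicting fraud if score >= threshold."""
    best_threshold, best_f1 = 0.0, -1.0
    for threshold in sorted(set(scores)):  # each distinct score is a candidate
        tp = fp = fn = 0
        for label, score in zip(labels, scores):
            flagged = score >= threshold
            if flagged and label == 1:
                tp += 1
            elif flagged and label == 0:
                fp += 1
            elif not flagged and label == 1:
                fn += 1
        denominator = 2 * tp + fp + fn
        f1 = 2 * tp / denominator if denominator else 0.0
        if f1 > best_f1:  # strict '>' keeps the lowest threshold on ties
            best_threshold, best_f1 = threshold, f1
    return best_threshold, best_f1


# Made-up labels (1 = fraud) and model scores.
demo_labels = [0, 0, 1, 0, 1, 1, 0, 1]
demo_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.60, 0.90]
demo_threshold, demo_f1 = best_f1_threshold(demo_labels, demo_scores)
print(demo_threshold, round(demo_f1, 3))  # 0.7 0.857
```

In practice the sweep would be run on a held-out validation set rather than the test set, so that the chosen threshold is not tuned to the data used for the final metrics.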
Key Findings
What the Model Learned About Fraud Patterns
💰 Amount (INR)
Highest SHAP value. High-value transactions at night are the strongest fraud signal.
🏪 Merchant Category
Shopping & Grocery merchants are the top fraud targets, so merchant category is a strong risk indicator.
📅 Weekend Flag
Weekend combined with a high amount is a strong engineered feature. Fraudsters exploit off-hours.
🕐 Hour of Day
8PM–11PM spike confirmed by SHAP. Night transactions are high risk.
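The time and amount signals in the findings above can be turned into simple boolean features. The sketch below is a minimal illustration, not the project's feature pipeline: the ₹10,000 high-amount cutoff, the function name, and the reading of the 8PM–11PM window as hours 20 to 22 are all assumptions.

```python
from datetime import datetime

HIGH_AMOUNT_INR = 10_000  # hypothetical cutoff; the deck does not give one


def engineer_flags(timestamp: datetime, amount_inr: float) -> dict[str, bool]:
    """Derive weekend, evening-window, and high-amount flags for one transaction."""
    is_weekend = timestamp.weekday() >= 5       # Saturday = 5, Sunday = 6
    is_evening_peak = 20 <= timestamp.hour < 23  # 8 PM up to, not including, 11 PM
    is_high_amount = amount_inr >= HIGH_AMOUNT_INR
    return {
        "is_weekend": is_weekend,
        "is_evening_peak": is_evening_peak,
        "is_high_amount": is_high_amount,
        # Interaction feature: weekend and high amount together.
        "weekend_high_amount": is_weekend and is_high_amount,
    }


# Saturday 1 June 2024, 9:30 PM, ₹25,000.
risky_flags = engineer_flags(datetime(2024, 6, 1, 21, 30), 25_000)
print(risky_flags["weekend_high_amount"])  # True
```

Tree-based models such as CatBoost can learn interactions on their own, but an explicit interaction flag makes the pattern easier to inspect in a SHAP plot.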