Pig E. Bank Predictive Modeling
Predictive Modeling | Data Mining | Time-Series Forecasting
Project Overview
-
As an analyst in the anti-money-laundering compliance department of a global bank, I'm tasked with:
Understanding and predicting what causes members to leave the bank.
Building models to assess transaction risk and flag suspicious activity.
-
The Pig E. Bank client dataset includes:
Account balance
Customer demographic information (country, age, gender)
Bank member status
Credit card holder status
Estimated salary
Credit score
-
Data bias
Data security & data privacy
Data mining (CRISP-DM, clustering, decision tree)
Predictive modeling (regression & classification models)
Time-series forecasting (stationarity, autocorrelation, seasonality, ARIMA, Facebook Prophet)
Analysis
Who leaves the bank?
On average, clients who left the bank had a lower credit score, a higher age, and a higher account balance.
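A comparison of this kind can be sketched by grouping clients by exit status and averaging each numeric column. The records and field names below (`exited`, `credit_score`, `age`, `balance`) are made-up placeholders for illustration, not the Pig E. Bank data or its actual column names.

```python
from statistics import mean

# Hypothetical sample records; field names and values are illustrative only.
clients = [
    {"exited": True,  "credit_score": 600, "age": 50, "balance": 120000.0},
    {"exited": True,  "credit_score": 620, "age": 46, "balance": 100000.0},
    {"exited": False, "credit_score": 660, "age": 36, "balance": 70000.0},
    {"exited": False, "credit_score": 680, "age": 40, "balance": 90000.0},
]

def group_means(rows, flag, columns):
    """Return {flag_value: {column: mean}} for each distinct value of the flag."""
    result = {}
    for value in {row[flag] for row in rows}:
        subset = [row for row in rows if row[flag] == value]
        result[value] = {col: mean(row[col] for row in subset) for col in columns}
    return result

means = group_means(clients, "exited", ["credit_score", "age", "balance"])
for status, stats in sorted(means.items()):
    label = "left" if status else "stayed"
    print(label, stats)
```

With real data, the same grouping would show whether the gaps between the two groups are large enough to matter before any model is fitted.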
What contributes to attrition?
Contribution % of a factor =
(# of clients with the factor who left the bank) / (# of bank clients, all time)
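As a minimal sketch of the ratio above, the snippet below counts the clients who both have a factor and left, then divides by the total client count. The records and the factor names (`has_credit_card`, `is_female`) are hypothetical and only stand in for the real dataset.

```python
# Hypothetical records: an exit flag plus boolean factor flags per client.
clients = [
    {"exited": True,  "has_credit_card": True,  "is_female": True},
    {"exited": True,  "has_credit_card": True,  "is_female": True},
    {"exited": False, "has_credit_card": True,  "is_female": True},
    {"exited": False, "has_credit_card": False, "is_female": False},
    {"exited": True,  "has_credit_card": False, "is_female": True},
]

def contribution_pct(rows, factor):
    """Clients with the factor who left, as a percentage of all clients."""
    left_with_factor = sum(1 for r in rows if r[factor] and r["exited"])
    return 100.0 * left_with_factor / len(rows)

factors = ["has_credit_card", "is_female"]

# Rank factors from highest to lowest contribution.
ranking = sorted(
    ((f, contribution_pct(clients, f)) for f in factors),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, pct in ranking:
    print(f"{name}: {pct:.1f}%")
```

Because the denominator is the full client count rather than the number of clients who have the factor, widespread factors tend to score higher under this definition.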
Top contributing factors for leaving Pig E. Bank:
Higher-than-average age
Has a credit card
Has 1 bank product
Higher-than-average balance
Is female
Decision tree for leaving bank
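The source does not say how the tree was built, so the following is only an illustration of the core step in growing a decision tree: choosing the threshold on one feature that minimises weighted Gini impurity. The function names, the `age` feature, and the sample records are all assumptions made for this sketch.

```python
def gini(labels):
    """Gini impurity of a list of boolean labels (0.0 means a pure group)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_threshold(rows, feature, target):
    """Return (threshold, weighted_gini) for the best single split on `feature`."""
    best = (None, float("inf"))
    values = sorted({r[feature] for r in rows})
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [r[target] for r in rows if r[feature] <= threshold]
        right = [r[target] for r in rows if r[feature] > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical records, chosen so that one split separates the groups cleanly.
clients = [
    {"age": 28, "exited": False},
    {"age": 35, "exited": False},
    {"age": 41, "exited": False},
    {"age": 47, "exited": True},
    {"age": 52, "exited": True},
    {"age": 60, "exited": True},
]

threshold, impurity = best_threshold(clients, "age", "exited")
print(threshold, impurity)
```

A full tree repeats this search over every feature and then recurses into each side of the chosen split until a stopping rule is met.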
Conclusions
Recommendations
Workers should be deployed to the 9 states (California, Texas, Florida, New York, Pennsylvania, Illinois, Ohio, Michigan, North Carolina) with the most deaths and the largest vulnerable populations.
The number of staff deployed should be proportional to the size of each state's vulnerable population.
Staff should be sent at the beginning of December to mitigate the peak of influenza season (December to March).
Next Steps
Additional analysis should be conducted to determine the best course of action for preventing and treating influenza.
Follow-up analysis should be conducted to measure the effectiveness of the additional medical staff.