Theme: Applications
28 Mar 2024
Liu, Lanbiao, Chen, Chen, & Wang, Bo (2022). Predicting financial crises with machine learning methods. Journal of Forecasting, 41(5), 871–910. https://doi.org/10.1002/for.2840
The article (Liu, Chen, and Wang 2022) defines financial crises by their effects:
A financial crisis severely affects both the crisis country and the global economy. On average, real estate prices and stock market indices in crisis countries fall by 35% and 56%, respectively, and unemployment increases by approximately 7% during the downturn of the financial crisis cycle (Reinhart & Rogoff, 2009). To cope with a financial crisis, the government must incur large fiscal expenditures. According to the estimation of Laeven and Valencia (2018), the median fiscal expenditure incurred by governments to deal with crises is equivalent to 6.7% of the GDP in high-income countries, while it is higher in middle-income and low-income countries (approximately 10% of the GDP).
… to detect crises in advance and provide policymakers with sufficient time to formulate and implement macroeconomic policies as effective countermeasures. Therefore, the financial crisis early warning system studied in this study focuses on the pre-crisis period to determine whether financial imbalances tend to intensify and whether the risk of systemic crises increases. (Liu, Chen, and Wang 2022)
Following Alessi and Detken (2018), we regard the 5 years before any crisis as risky periods and other years as safe periods; thus, the dependent variables in this study are dummy variables used to judge whether the country is in a risky period before the financial crisis. As the crisis dates in the database are up to 2017, we cannot judge whether a country was in a safe period after 2012. (Liu, Chen, and Wang 2022)
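A minimal sketch of how such dummy variables could be built, assuming a hypothetical table of crisis dates (the column names and the two example countries are made up for illustration; the paper takes its crisis dates from Laeven and Valencia, 2018):

```python
import pandas as pd

# Hypothetical crisis dates (the paper uses the Laeven & Valencia, 2018 database).
crises = pd.DataFrame({"country": ["ARG", "BRA"], "crisis_year": [2001, 2015]})

# Hypothetical country-year panel to be labelled.
panel = pd.DataFrame(
    [(c, y) for c in ["ARG", "BRA"] for y in range(1995, 2018)],
    columns=["country", "year"],
)

# Flag the 5 years before each crisis as risky (1); everything else stays safe (0).
panel["risky"] = 0
for _, row in crises.iterrows():
    pre_crisis = (
        (panel["country"] == row["country"])
        & (panel["year"] >= row["crisis_year"] - 5)
        & (panel["year"] < row["crisis_year"])
    )
    panel.loc[pre_crisis, "risky"] = 1
```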
| Baseline model | Other “linear model” | Machine learning models | Ensemble methods | Post-hoc explainability techniques |
|---|---|---|---|---|
| Logistic regression | LASSO logistic regression | SVM; k-nearest neighbours; decision trees; AdaBoost; gradient-boosted decision trees (GBDT); random forest | Linear models and machine learning models further employed as base learners and used to construct ensemble models by voting, averaging, and stacking | Shapley values and Shapley regressions |
A regression model can overfit or underfit, and in either case it will perform poorly on unseen data.
Remember the cost function for a linear regression model (i.e. the residual sum of squares)?
\[ \begin{align} \text{RSS}&=\sum_{i=1}^N (y_i-\hat{y}_i)^2\\ &= \sum_{i=1}^N(y_i-\sum_{j=1}^p x_{ij} \beta_j)^2 \end{align} \]
To prevent overfitting, lasso regression adds a penalty term to this cost function: \[ \sum_{i=1}^N(y_i-\sum_{j=1}^p x_{ij} \beta_j)^2 + \lambda \sum_{j=1}^p |\beta_j| \]
This is called L1-regularization. The idea is to penalize large coefficients so that the coefficients of irrelevant predictors are shrunk towards, and often exactly to, zero.
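A minimal sketch of this shrinkage effect with scikit-learn on synthetic data (not the paper's data); `alpha` plays the role of lambda:

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                          # 10 candidate predictors
y = 2 * X[:, 0] - 3 * X[:, 1] + rng.normal(size=200)    # only 2 of them matter

# Larger alpha (= lambda) shrinks more coefficients exactly to zero.
lasso = Lasso(alpha=0.1).fit(X, y)
ols = LinearRegression().fit(X, y)

print(np.round(lasso.coef_, 2))  # irrelevant coefficients are (mostly) exactly 0
print(np.round(ols.coef_, 2))    # OLS keeps small non-zero noise coefficients
```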
The same principle applies to LASSO logistic regression: we add a penalty term to the cost function for logistic regression.
In logistic regression, we seek to minimize the log loss, i.e.:
\[ L_{log}= - \sum_{i=1}^N \left[-\ln(1+\exp(\beta_0+\beta_1 x_i))+y_i(\beta_0+\beta_1x_i)\right] \]
The cost function for LASSO logistic regression is simply:
\[ L_{log}+\lambda \sum_{j=1}^p |\beta_j| = - \sum_{i=1}^N \left[-\ln(1+\exp(\beta_0+\beta_1 x_i))+y_i(\beta_0+\beta_1x_i)\right] + \lambda \sum_{j=1}^p |\beta_j| \]
See this YouTube video for an explanation…
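A corresponding sketch for LASSO logistic regression with scikit-learn (again on synthetic data); note that scikit-learn parameterises the penalty through `C = 1/lambda`, so a smaller `C` means stronger regularisation:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 13))                                   # e.g. 13 indicators
y = (X[:, 0] - X[:, 1] + rng.normal(size=500) > 0).astype(int)   # binary outcome

# penalty="l1" adds the lambda * sum(|beta_j|) term to the log loss.
lasso_logit = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
print(np.round(lasso_logit.coef_, 2))  # many coefficients are shrunk to exactly 0
```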
Decision tree classifiers are grown by choosing, at each node, the split that minimizes the Gini impurity metric.
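For a node whose observations fall into K classes with proportions p_1, …, p_K, the Gini impurity is

\[ G = \sum_{k=1}^K p_k (1-p_k) = 1 - \sum_{k=1}^K p_k^2 \]

and the tree picks the split that minimizes the weighted impurity of the resulting child nodes.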
AdaBoost and gradient-boosted decision trees (GBDT) are both boosted tree learners: AdaBoost corrects the errors of the previous decision tree by increasing the weights of misclassified points, whereas GBDT corrects the errors of the previous decision tree by constructing a new tree that optimizes the loss function in the negative gradient direction.
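A minimal sketch contrasting the two boosting algorithms with scikit-learn on synthetic data (default shallow trees; this is not the authors' configuration):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier

X, y = make_classification(n_samples=1000, n_features=13, random_state=0)

# AdaBoost: each new (shallow) tree reweights the points the previous trees misclassified.
ada = AdaBoostClassifier(n_estimators=200, random_state=0).fit(X, y)

# GBDT: each new tree is fitted to the negative gradient of the loss (here, the log loss).
gbdt = GradientBoostingClassifier(n_estimators=200, random_state=0).fit(X, y)

print(ada.score(X, y), gbdt.score(X, y))
```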
In addition to considering the linear models and machine learning models separately, the authors employ these models as base learners to construct ensemble models by voting, averaging, and stacking (weighted averaging and stacking are the methods of choice here).
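A minimal sketch of how voting and stacking ensembles can be assembled with scikit-learn (a simplified stand-in for the paper's setup, on synthetic data):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=13, random_state=0)

base_learners = [
    ("logit", LogisticRegression(max_iter=1000)),
    ("svm", SVC(probability=True)),
    ("rf", RandomForestClassifier(random_state=0)),
]

# Soft voting averages the predicted probabilities of the base learners.
voting = VotingClassifier(estimators=base_learners, voting="soft").fit(X, y)

# Stacking trains a meta-learner (here a logistic regression) on the base learners' outputs.
stacking = StackingClassifier(
    estimators=base_learners, final_estimator=LogisticRegression(max_iter=1000)
).fit(X, y)
```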
After referring to the relevant literature and considering the availability of data, we finally selected 13 early warning indicators that can be classified into four categories: macroeconomic fundamentals, external sector, domestic credit, and industrial structure. (Liu, Chen, and Wang 2022)
| Category | Variables | Definition |
|---|---|---|
| Dependent variables | Crisis | Risky periods before financial crisis, dummy variable |
| | Currency | Risky periods before currency crisis, dummy variable |
| | Banking | Risky periods before banking crisis, dummy variable |
| | Sovereign | Risky periods before sovereign debt crisis, dummy variable |
| Early warning indicators (macroeconomic fundamentals) | Inflation, GDPGrowth, M2Growth, M2_GDP | |
| Early warning indicators (external sector) | Export_GDP, CA_GDP, FDI_GDP, OFDI_GDP, NFA_GDP | |
| Early warning indicators (domestic credit) | Credit_GDP | |
| Early warning indicators (industrial structure) | Agriculture_GDP, Industry_GDP, Man_GDP | |
… all the observations were randomly classified into the training-validation group (75%) and test group (25%). We performed cross-validation on the training-validation group to build models and select the optimal hyperparameters and then carried out the out-of-sample test on the test group to evaluate the early warning performance of these models. (Liu, Chen, and Wang 2022)
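A minimal sketch of this split-then-cross-validate workflow with scikit-learn (synthetic, imbalanced data; the random forest and its hyperparameter grid are illustrative, not the paper's):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=2000, n_features=13, weights=[0.85], random_state=0)

# 75% training-validation group, 25% test group.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Cross-validation on the training-validation group to pick hyperparameters.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [100, 300], "max_depth": [3, 5, None]},
    scoring="roc_auc",
    cv=5,
).fit(X_trainval, y_trainval)

# Out-of-sample evaluation on the held-out test group.
print(search.best_params_, search.score(X_test, y_test))
```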
Evaluation metrics
Because financial crises are episodic, high accuracy can be achieved if all the periods are predicted as safe periods, which makes it meaningless to consider the accuracy only. Therefore, in addition to accuracy, we consider the receiver operating characteristic (ROC) curve and area under the ROC curve (AUC) when evaluating the out-of-sample predictive performance of different models. (Liu, Chen, and Wang 2022)
In this study, a high TPR implies a low probability of missing a risky period, whereas a high FPR implies a high probability of misjudging a safe period as a risky period. If the classification threshold of the classifier decreases, more samples will be predicted as positive; thus, the TPR and FPR will increase simultaneously. However, an ideal classifier should achieve a high TPR and a low FPR; hence, we need to find an early warning model for which the AUC value is close to the maximum value of 1 or the ROC curve slopes toward the upper left corner. Therefore, we use the AUC value as the key evaluation index and attempt to determine the optimal threshold to predict financial crises using the ROC curve. (Liu, Chen, and Wang 2022)
Confusion matrix
| Actual period | Predicted: No warning | Predicted: Warning |
|---|---|---|
| Safe | True negative (TN) | False positive (FP) |
| Risky | False negative (FN) | True positive (TP) |
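A minimal sketch of how these quantities can be computed from any fitted classifier's predicted crisis probabilities, together with one common way of picking the warning threshold (the ROC point closest to the upper-left corner, the criterion the paper uses later to arrive at its 37% threshold); the function names below are made up for illustration:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve

def evaluate(y_true, proba, threshold=0.5):
    """TPR, FPR and AUC for a given warning threshold.

    y_true: true labels (1 = risky period); proba: predicted crisis probabilities,
    e.g. model.predict_proba(X_test)[:, 1].
    """
    pred = (proba >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, pred).ravel()
    return {"TPR": tp / (tp + fn), "FPR": fp / (fp + tn),
            "AUC": roc_auc_score(y_true, proba)}

def optimal_threshold(y_true, proba):
    """Threshold whose ROC point is closest to the upper-left corner (FPR=0, TPR=1)."""
    fpr, tpr, thresholds = roc_curve(y_true, proba)
    distance = np.sqrt(fpr ** 2 + (1 - tpr) ** 2)
    return thresholds[np.argmin(distance)]
```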
Logistic models
The 13 early warning indicators selected in this study are all good indicators that can warn of risky periods before crises. A country with a high inflation rate, slow GDP growth, and large-scale domestic credit is more likely to face a financial crisis; excessively rapid growth of broad money also tends to increase the probability of a crisis. The ratio of broad money to GDP is generally used to describe the degree of financial deepening of a country, reflecting the ability of financial institutions to provide liquidity. According to the empirical results, countries with a high degree of financialization and strong liquidity have a relatively low probability of crises. In addition, maintaining exports of goods and services and current account surpluses, attracting foreign direct investment, and increasing foreign net assets are also conducive to reducing the probability of crises. In terms of industrial structure, countries that are more dependent on the added value of agriculture and manufacturing are more vulnerable to financial crises. (Liu, Chen, and Wang 2022)
Both logistic models perform comparably in terms of accuracy and AUC.
ML models
Ensemble models
ML algorithms are geared towards prediction: they do not, by themselves, tell you which features contributed to a prediction or why. That is not very useful if you want to single out the factors responsible for previous financial crises and take action in time to prevent the next one.
Enter Shapley values and Shapley regressions.
Shapley values (and, by extension, Shapley regressions) originate from cooperative game theory.
As explained in the paper (Liu, Chen, and Wang 2022):
In the framework of cooperative game theory, the Shapley value is used to calculate each player’s payoff averaged over every possible sequence. Similarly, when the Shapley value is used to analyze machine learning methods, it can reflect the contribution of each predictor to the final prediction. Furthermore, Bluwstein et al. (2020) suggested that the Shapley value can only reflect the importance of predictors in crisis prediction and, thus, they needed to use the Shapley regressions constructed by Joseph (2019) to further judge the economic and statistical significance of these predictors, so as to uncover the relationships between the predictors and crisis risk. Referring to the above-mentioned studies, we use Shapley regressions to find predictors with economic and statistical significance, and we use the Shapley values of these predictors to analyze the causal relationships between them and financial crises.
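For reference, the Shapley value of a player j (here, a predictor) in a cooperative game with player set P and payoff function v is its marginal contribution averaged over all possible coalitions:

\[ \phi_j = \sum_{S \subseteq P \setminus \{j\}} \frac{|S|!\,(|P|-|S|-1)!}{|P|!} \left[ v(S \cup \{j\}) - v(S) \right] \]

In the machine learning setting, v(S) is (roughly) the model's prediction when only the predictors in S are used, so the Shapley value measures how much predictor j contributes to a given prediction.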
This video gives an intuitive explanation of Shapley values.
For more details on Shapley values, have a look at this link.
The authors were not only interested in the predictions but also in explaining them. For simplicity, the rest of the analysis focuses on the best learner, i.e. the random forest.
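A minimal sketch of computing Shapley values for a random forest with the shap library (synthetic data; an illustration of the technique, not the authors' code):

```python
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=13, random_state=0)
rf = RandomForestClassifier(random_state=0).fit(X, y)

# TreeExplainer computes Shapley values efficiently for tree ensembles.
explainer = shap.TreeExplainer(rf)
shap_values = explainer.shap_values(X)

# Depending on the shap version, a binary classifier yields either a list
# (one array per class) or a 3-D array; keep the values for the "risky" class (1).
if isinstance(shap_values, list):
    shap_values = shap_values[1]
elif shap_values.ndim == 3:
    shap_values = shap_values[:, :, 1]

# Global view: mean absolute Shapley value per feature, plus direction of effects.
shap.summary_plot(shap_values, X)
```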
The authors attempted to answer this question: at what (crisis) probability level should a warning be triggered?
Their answer, based on the ROC curve of the random forest model, is 37%: when the warning threshold is 37%, the model achieves the optimal result with a high TPR and low FPR, shown as the point closest to the upper left corner…
… the above-mentioned 3,032 observations from 119 countries were used as the training-validation group, and the observations from 2013 to 2017 were used as the test group for out-of-sample testing. As mentioned above, it is impossible to judge whether a country was in a safe period after 2012, and it is more important for the warning model to avoid missing crises and be able to send warning signals before crises; hence, the test samples we selected were countries that were hit by financial crises. Moreover, to demonstrate the ability of the models to issue warning signals before a crisis, we further selected countries that experienced financial crises after 2015 as test samples, thus ensuring that there is an early warning period of at least two years before a crisis. Based on the availability of data, we selected Argentina (ARG), Belarus (BLR), Brazil (BRA), the Democratic Republic of the Congo (COD), and Swaziland (SWZ) as test samples. As shown in the financial crisis database established by Laeven and Valencia (2018), Argentina experienced a sovereign debt crisis in 2016. Currency crises hit Belarus, Brazil, and Swaziland in 2015 and the Democratic Republic of the Congo in 2016. We still regarded 37% as the optimal warning threshold. (Liu, Chen, and Wang 2022)
(Again Random Forest + Shapley values for explanation)
Chen, Mary, DeHaven, Matthew, Kitschelt, Isabel, Lee, Seung Jung, & Sicilian, Martin (2023). Identifying Financial Crises Using Machine Learning on Textual Data. International Finance Discussion Paper No. 1374. Available at SSRN: https://ssrn.com/abstract=4438789 or http://dx.doi.org/10.17016/IFDP.2023.1374
(Shows how to predict financial crises not only from tabular data, but also how to improve predictions by including higher-frequency textual data.)