🗓️ Week 11: Predicting financial crises with ML techniques

Theme: Applications

28 Mar 2024

The article we’ll discuss today

Liu, Lanbiao, Chen, Chen, & Wang, Bo (August, 2022). Predicting financial crises with machine learning methods. Journal of Forecasting, 41(5), 871–910. https://doi.org/10.1002/for.2840

What are financial crises?

The article (Liu, Chen, and Wang 2022) characterises financial crises by their effects:

A financial crisis severely affects both the crisis country and the global economy. On average, real estate prices and stock market indices in crisis countries fall by 35% and 56%, respectively, and unemployment increases by approximately 7% during the downturn of the financial crisis cycle (Reinhart & Rogoff, 2009). To cope with a financial crisis, the government must incur large fiscal expenditures. According to the estimation of Laeven and Valencia (2018), the median fiscal expenditure incurred by governments to deal with crises is equivalent to 6.7% of the GDP in high-income countries, while it is higher in middle-income and low-income countries (approximately 10% of the GDP)

The data

  • crisis dates are taken from the dataset of (Laeven and Valencia 2018)
  • covers financial crisis dates for 165 countries from 1970 to 2017
  • different categories of crises (currency, banking and sovereign debt crises):
    • currency crisis: sharp depreciation of the nominal exchange rate of a country’s currency against the US dollar
    • banking crisis: financial distress in the banking system and the implementation of relevant banking policy intervention measures are signs of the emergence of a banking crisis
  • based on integrity and availability of data, 119 countries (out of the original 165) were selected for the study: 27 classified by the World Bank as high-income, 43 as upper-middle-income, 35 as lower-middle-income and 14 as low-income countries. Between 1970 and 2017, these countries experienced a total of 278 financial crises, of which 135 were currency crises, 88 banking crises, and 55 sovereign debt crises.

The data (continued)

  • the goal: build an early warning system for predicting financial crises
  • the aim is

to detect crises in advance and provide policymakers with sufficient time to formulate and implement macroeconomic policies as effective countermeasures. Therefore, the financial crisis early warning system studied in this study focuses on the pre-crisis period to determine whether financial imbalances tend to intensify and whether the risk of systemic crises increases. (Liu, Chen, and Wang 2022)

  • Following Alessi and Detken (2018), we regard the 5 years before any crisis as risky periods and other years as safe periods; thus, the dependent variables in this study are dummy variables used to judge whether the country is in a risky period before the financial crisis. As the crisis dates in the database are up to 2017, we cannot judge whether a country was in a safe period after 2012. (Liu, Chen, and Wang 2022)
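A minimal sketch of how this dependent variable could be built for a single country. The function name, its arguments, and the use of `None` for years that cannot be judged are illustrative assumptions, not the authors' code. Treating the crisis year itself as a non-risky year is also an assumption, based on a literal reading of the quote above.

```python
def label_risky_periods(years, crisis_years, horizon=5, last_data_year=2017):
    """Label each year 1 (risky), 0 (safe) or None (cannot be judged).

    Hypothetical helper: a year is risky if a crisis starts within the
    next `horizon` years. Years too close to the end of the crisis
    database cannot be confirmed as safe, so they are left as None.
    """
    labels = {}
    for year in years:
        if any(0 < crisis - year <= horizon for crisis in crisis_years):
            labels[year] = 1
        elif year + horizon <= last_data_year:
            labels[year] = 0
        else:
            labels[year] = None
    return labels


# Made-up country with a single crisis in 2000
labels = label_risky_periods(range(1990, 2018), crisis_years=[2000])
print([year for year, label in labels.items() if label == 1])
# [1995, 1996, 1997, 1998, 1999]
```

With the default settings, 2012 is the last year that can still be labelled safe, which matches the remark in the quote about the period after 2012.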

Methodologies

  • Baseline model: logistic regression
  • Other “linear model”: LASSO logistic regression
  • Machine learning models:
    • SVM
    • K-nearest neighbours
    • decision trees
    • Adaboost
    • Gradient Boosted Decision Trees
    • Random Forest
    • Artificial neural networks (ANN)
  • Ensemble methods: linear models and machine learning models further employed as base learners and used to construct ensemble models by voting, averaging, and stacking
  • Post-hoc explainability techniques: Shapley values and Shapley regressions

LASSO regression

A regression model can overfit or underfit unseen data.

Remember the cost function for the linear regression model (i.e. the residual sum of squares)?

\[ \begin{aligned} \text{RSS}&=\sum_{i=1}^N (y_i-\hat{y}_i)^2\\ &= \sum_{i=1}^N\Big(y_i-\sum_{j=1}^p x_{ij} \beta_j\Big)^2 \end{aligned} \]

To prevent overfitting, lasso regression adds a penalty term to this cost function, where λ ≥ 0 sets the strength of the penalty: \[ \sum_{i=1}^N\Big(y_i-\sum_{j=1}^p x_{ij} \beta_j\Big)^2 + \lambda \sum_{j=1}^p |\beta_j| \]

This is called L1-regularization. The idea is to penalize irrelevant terms in the regression…
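As a rough illustration, the penalized cost can be evaluated directly. The sketch below is plain Python with made-up data; `lasso_cost` is a hypothetical helper name, and the model has no intercept, as in the formula above.

```python
def lasso_cost(X, y, beta, lam):
    """Residual sum of squares plus the L1 penalty lam * sum(|beta_j|)."""
    rss = 0.0
    for row, target in zip(X, y):
        prediction = sum(x_ij * b_j for x_ij, b_j in zip(row, beta))
        rss += (target - prediction) ** 2
    penalty = lam * sum(abs(b_j) for b_j in beta)
    return rss + penalty


# Made-up data: only the first predictor is needed to reproduce y exactly
X = [[1, 0], [0, 1], [1, 1]]
y = [1, 0, 1]

sparse = [1.0, 0.0]   # uses one predictor
dense = [1.2, -0.2]   # spreads weight over both predictors

for lam in (0.0, 1.0):
    print(lam, lasso_cost(X, y, sparse, lam), lasso_cost(X, y, dense, lam))
```

Raising `lam` widens the gap between the two candidates: the coefficient vector with the larger absolute values pays the larger penalty, which is what pushes irrelevant coefficients towards zero.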

LASSO logistic regression

The same principle applies for LASSO logistic regression: we add a penalty term to the cost function for logistic regression.

In logistic regression, we seek to minimize the log loss, i.e.:

\[ L_{log}= - \sum_{i=1}^N \left[-\ln\left(1+\exp\left(\beta_0+\sum_{j=1}^p x_{ij}\beta_j\right)\right)+y_i\left(\beta_0+\sum_{j=1}^p x_{ij}\beta_j\right)\right] \]

The cost function for LASSO logistic regression is simply:

\[ L_{log}+\lambda \sum_{j=1}^p |\beta_j| = - \sum_{i=1}^N \left[-\ln\left(1+\exp\left(\beta_0+\sum_{j=1}^p x_{ij}\beta_j\right)\right)+y_i\left(\beta_0+\sum_{j=1}^p x_{ij}\beta_j\right)\right] + \lambda \sum_{j=1}^p |\beta_j| \]
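The penalized log loss can be written out in a few lines. This is a sketch under assumptions: `lasso_log_loss` is a hypothetical name, the data are made up, the linear predictor is the intercept plus a weighted sum of the predictors, and the intercept is left out of the penalty because the penalty sum starts at j = 1.

```python
import math


def lasso_log_loss(X, y, beta0, beta, lam):
    """Logistic log loss plus an L1 penalty on the slope coefficients."""
    loss = 0.0
    for row, label in zip(X, y):
        eta = beta0 + sum(x_ij * b_j for x_ij, b_j in zip(row, beta))
        # Contribution of one observation: -(y * eta - ln(1 + exp(eta)))
        loss -= label * eta - math.log(1 + math.exp(eta))
    return loss + lam * sum(abs(b_j) for b_j in beta)


# Made-up data: two predictors, four observations
X = [[0.5, 1.0], [1.5, -0.5], [-1.0, 0.3], [2.0, 0.1]]
y = [0, 1, 0, 1]

print(lasso_log_loss(X, y, beta0=0.0, beta=[0.0, 0.0], lam=1.0))
print(lasso_log_loss(X, y, beta0=-0.5, beta=[1.0, 0.0], lam=1.0))
```

With all coefficients at zero, every observation gets a predicted probability of 0.5, so the first value is simply N times ln 2.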

K-nearest neighbors

See this Youtube video for an explanation…

Artificial neural networks (ANN)

Notes on the other methods used in the article

  • Decision tree classification chooses its splits by optimizing the Gini impurity metric

  • Adaboost and Gradient Boosting Decision Trees (GBDT) are both boosted tree learners: Adaboost corrects the error of the previous decision tree by increasing the weight of misclassified points, while GBDT corrects the error of the previous decision tree by constructing a new tree that optimizes the loss function in the negative gradient direction

  • In addition to considering the linear models and machine learning models separately, these models are employed as base learners and used to construct ensemble models by voting, averaging, and stacking (weighted averaging and stacking are mentioned as the methods of choice here):

    • three schemes are considered when choosing the number and types of base learners to be selected:
      1. all nine base learners;
      2. the benchmark logistic model and base learners with better predictive performance than the logistic model; and
      3. three base learners with the best predictive performance.
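To make the combination step concrete, here is a minimal sketch of voting and weighted averaging over the outputs of already-fitted base learners. The function names, votes, probabilities and weights are all made up for illustration; how the paper sets the weights is not covered here, and stacking (fitting a meta-learner on the base learners' outputs) is not shown.

```python
def majority_vote(votes_per_learner):
    """Hard voting: issue a warning (1) when a strict majority of learners do."""
    n_learners = len(votes_per_learner)
    return [
        1 if 2 * sum(votes) > n_learners else 0
        for votes in zip(*votes_per_learner)
    ]


def weighted_average(probs_per_learner, weights):
    """Weighted average of the crisis probabilities predicted by each learner."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, probs)) / total
        for probs in zip(*probs_per_learner)
    ]


# Three hypothetical base learners, one inner list per learner
votes = [[1, 0, 1], [1, 1, 0], [0, 0, 1]]
print(majority_vote(votes))  # [1, 0, 1]

probabilities = [[0.2, 0.8], [0.4, 0.6], [0.9, 0.1]]
print(weighted_average(probabilities, weights=[0.5, 0.3, 0.2]))
```

Plain averaging is the special case where all weights are equal.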

The predictors

After referring to the relevant literature and considering the availability of data, we finally selected 13 early warning indicators that can be classified into four categories: macroeconomic fundamentals, external sector, domestic credit, and industrial structure. (Liu, Chen, and Wang 2022)

The predictors (continued)

  • Dependent variables:
    • Crisis: risky periods before financial crisis, dummy variable
    • Currency: risky periods before currency crisis, dummy variable
    • Banking: risky periods before banking crisis, dummy variable
    • Sovereign: risky periods before sovereign debt crisis, dummy variable
  • Early warning indicators:
    • Macroeconomic fundamentals: Inflation, GDPGrowth, M2Growth, M2_GDP
    • External sector: Export_GDP, CA_GDP, FDI_GDP, OFDI_GDP, NFA_GDP
    • Domestic credit: Credit_GDP
    • Industrial structure: Agriculture_GDP, Industry_GDP, Man_GDP

Results

  • Data split into a training-validation set and a test set (to allow tuning of hyperparameters and optimization of accuracy on the validation data)
  • k-fold cross-validation

all the observations were randomly classified into the training-validation group (75%) and test group (25%). We performed cross-validation on the training-validation group to build models and select the optimal hyperparameters and then carried out the out-of-sample test on the test group to evaluate the early warning performance of these models. (Liu, Chen, and Wang 2022)
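The splitting scheme can be sketched with the standard library alone. The helper names, the seed, and the choice of k = 5 folds are assumptions made for illustration (the number of folds is not stated above); only the 75%/25% split comes from the quote.

```python
import random


def train_test_split_indices(n, test_share=0.25, seed=0):
    """Randomly assign observation indices to training-validation and test groups."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_share)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=5):
    """Yield (training, validation) index lists, one pair per fold."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, folds[i]


train_val, test = train_test_split_indices(100)
print(len(train_val), len(test))  # 75 25

for training, validation in k_fold_indices(train_val, k=5):
    # Fit on `training`, score each hyperparameter setting on `validation`
    print(len(training), len(validation))
```

The test indices are never touched inside the cross-validation loop, which is what keeps the final evaluation out-of-sample.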

Results (continued)

Evaluation metrics

  • case of class imbalance (accuracy is not suitable as a metric!)

Because financial crises are episodic, high accuracy can be achieved if all the periods are predicted as safe periods, which makes it meaningless to consider the accuracy only. Therefore, in addition to accuracy, we consider the receiver operating characteristic (ROC) curve and area under the ROC curve (AUC) when evaluating the out-of-sample predictive performance of different models. (Liu, Chen, and Wang 2022)
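A small numerical illustration of this point, using made-up labels in which 10% of periods are risky (the true share in the paper's data is not given here):

```python
def accuracy(y_true, y_pred):
    """Share of periods whose predicted label matches the actual label."""
    return sum(a == p for a, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical sample: 10 risky periods (1) and 90 safe periods (0)
y_true = [1] * 10 + [0] * 90

# A useless "model" that never issues a warning
always_safe = [0] * len(y_true)

print(accuracy(y_true, always_safe))  # 0.9

# ...yet it catches none of the risky periods
caught = sum(1 for a, p in zip(y_true, always_safe) if a == 1 and p == 1)
print(caught)  # 0
```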

Results (continued)

Evaluation metrics

  • TPR = true positive rate, FPR= false positive rate

In this study, a high TPR implies a low probability of missing a risky period, whereas a high FPR implies a high probability of misjudging a safe period as a risky period. If the classification threshold of the classifier decreases, more samples will be predicted as positive; thus, the TPR and FPR will increase simultaneously. However, an ideal classifier should achieve a high TPR and a low FPR; hence, we need to find an early warning model for which the AUC value is close to the maximum value of 1 or the ROC curve slopes toward the upper left corner. Therefore, we use the AUC value as the key evaluation index and attempt to determine the optimal threshold to predict financial crises using the ROC curve. (Liu, Chen, and Wang 2022)
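One way to see what the AUC measures: it equals the probability that a randomly chosen risky period receives a higher predicted score than a randomly chosen safe period. The sketch below computes it that way on made-up scores; it is an illustration of the metric, not the authors' evaluation code, and the pairwise loop is only practical for small samples.

```python
def auc(y_true, scores):
    """AUC as the share of (risky, safe) pairs ranked correctly; ties count half."""
    risky = [s for s, y in zip(scores, y_true) if y == 1]
    safe = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum(
        1.0 if r > s else 0.5 if r == s else 0.0
        for r in risky
        for s in safe
    )
    return wins / (len(risky) * len(safe))


# Made-up labels and predicted crisis probabilities
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

print(auc(y_true, scores))  # 0.75
```

A model that ranks every risky period above every safe one scores 1, and random scores give about 0.5, regardless of how imbalanced the classes are.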

Confusion matrix

                          Predicted: no warning    Predicted: warning
Actual period: safe       True negative (TN)       False positive (FP)
Actual period: risky      False negative (FN)      True positive (TP)
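The four cells, and the two rates derived from them, can be computed as follows. This is a sketch on made-up data; the helper names are hypothetical, and issuing a warning when the probability is at or above the threshold (rather than strictly above) is an assumption.

```python
def confusion_counts(y_true, probabilities, threshold):
    """Return (TN, FP, FN, TP), where a warning means probability >= threshold."""
    tn = fp = fn = tp = 0
    for actual, p in zip(y_true, probabilities):
        warning = p >= threshold
        if actual == 1 and warning:
            tp += 1
        elif actual == 1:
            fn += 1
        elif warning:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def rates(tn, fp, fn, tp):
    """TPR = TP / (TP + FN) and FPR = FP / (FP + TN)."""
    return tp / (tp + fn), fp / (fp + tn)


# Made-up periods: 1 = risky, 0 = safe
y_true = [0, 0, 0, 1, 1]
probabilities = [0.1, 0.5, 0.2, 0.7, 0.3]

counts = confusion_counts(y_true, probabilities, threshold=0.37)
print(counts)          # (2, 1, 1, 1)
print(rates(*counts))
```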

Results (continued)

Logistic models

The 13 early warning indicators selected in this study are all good indicators that can warn of risky periods before crises. A country with a high inflation rate, slow GDP growth, and large-scale domestic credit is more likely to face a financial crisis; excessively rapid growth of broad money also tends to increase the probability of a crisis. The ratio of broad money to GDP is generally used to describe the degree of financial deepening of a country, reflecting the ability of financial institutions to provide liquidity. According to the empirical results, countries with a high degree of financialization and strong liquidity have a relatively low probability of crises. In addition, maintaining exports of goods and services and current account surpluses, attracting foreign direct investment, and increasing foreign net assets are also conducive to reducing the probability of crises. In terms of industrial structure, countries that are more dependent on the added value of agriculture and manufacturing are more vulnerable to financial crises. (Liu, Chen, and Wang 2022)

The two logistic models (plain and LASSO) perform comparably in terms of both accuracy and AUC.

Results (continued)

ML models

Results (continued)

  • most of the machine learning methods have higher out-of-sample predictive accuracy than the logistic and LASSO-logistic models when predicting financial crises, currency crises, banking crises, and sovereign debt crises.
  • Among the seven machine learning methods, k-NN, SVM, RF, and GBDT were particularly prominent, and their out-of-sample predictive accuracy was stable above 80%.
  • Due to the class imbalance in the dataset, the authors consider the AUC value more important than accuracy.
  • Most of the machine learning methods were also superior to the logistic and LASSO-logistic models with regard to the AUC value of out-of-sample early warning, especially RF and GBDT, which exhibited outstanding early warning ability.

Results (continued)

Ensemble models

  • (a) combines all nine base learners
  • (b) combines the logistic model with the learners that have better predictive performance than it (i.e. RF, GBDT, SVM, AdaBoost, k-NN, ANN)
  • (c) combines the three base learners with the best predictive performance, i.e. RF, GBDT, and SVM

Results (continued)

  • As expected, ensemble models (particularly stacking-based methods) perform much better than single learners.

Opening the black-box: Shapley values

ML algorithms are built for prediction: on their own, they don’t tell you which features contributed to a prediction, nor do they explain it => not that useful if you want to single out which factors were responsible for previous financial crises and take action in time to prevent the next one

Enter Shapley values and Shapley regressions.

  • Shapley values (and by extension Shapley value regression) originate from game theory

  • As explained in the paper (Liu, Chen, and Wang 2022)

    In the framework of cooperative game theory, the Shapley value is used to calculate each player’s payoff averaged over every possible sequence. Similarly, when the Shapley value is used to analyze machine learning methods, it can reflect the contribution of each predictor to the final prediction. Furthermore, Bluwstein et al. (2020) suggested that the Shapley value can only reflect the importance of predictors in crisis prediction and, thus, they needed to use the Shapley regressions constructed by Joseph (2019) to further judge the economic and statistical significance of these predictors, so as to uncover the relationships between the predictors and crisis risk. Referring to the above-mentioned studies, we use Shapley regressions to find predictors with economic and statistical significance, and we use the Shapley values of these predictors to analyze the causal relationships between them and financial crises.

  • This video gives an intuitive explanation of Shapley values

  • For more details on Shapley values, have a look at this link
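To make the game-theory definition concrete, here is a minimal sketch that computes exact Shapley values by averaging each player's marginal contribution over every possible ordering. The two "players" and the coalition payoffs are made up for illustration. In practice, Shapley values for ML models are approximated, because the number of orderings grows factorially with the number of predictors.

```python
from itertools import permutations


def shapley_values(players, value):
    """Average marginal contribution of each player over all orderings."""
    totals = {player: 0.0 for player in players}
    orderings = list(permutations(players))
    for ordering in orderings:
        coalition = frozenset()
        for player in ordering:
            with_player = coalition | {player}
            totals[player] += value(with_player) - value(coalition)
            coalition = with_player
    return {player: total / len(orderings) for player, total in totals.items()}


# Hypothetical payoffs: predicted crisis probability (above a baseline)
# when only the listed predictors are "switched on"
payoffs = {
    frozenset(): 0.0,
    frozenset({"inflation"}): 0.10,
    frozenset({"credit"}): 0.20,
    frozenset({"inflation", "credit"}): 0.40,
}

result = shapley_values(["inflation", "credit"], lambda c: payoffs[frozenset(c)])
print(result)
```

In this toy game the two values add up to the payoff of the full coalition (0.40), which is the property that lets a prediction be split into per-predictor contributions.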

Some more results

The authors were interested not only in the predictions but also in explaining them. For simplicity, the rest of the analysis focuses on the best learner, i.e. Random Forest.

  • the blue dots represent early warning indicators with lower values, while the red dots represent early warning indicators with higher values.
  • The early warning indicators are ranked from top to bottom according to their importance to the model output, measured using the mean absolute Shapley value
  • the probability of a financial crisis is linearly related to the most predictive indicators
  • the probability of a financial crisis is positively correlated with inflation, broad money growth, net domestic credit, and the ratio of industrial and manufacturing added value to GDP.
  • the probability of a financial crisis is negatively correlated with net foreign assets, the degree of financial deepening, FDI net inflows, GDP growth, and exports of goods and services
  • these results are consistent with those of the logistic model
  • other findings: there are nonlinear relationships between the probability of crises and the proportion of agricultural added value in GDP as well as the current account balance. Countries that depend too strongly or too weakly on agriculture are more likely to suffer crises, and a moderate current account surplus can maintain or even mitigate this risk
  • net foreign assets have a significant impact on all three types of crises (currency, banking and sovereign debt) : the expansion of net foreign assets reduces the probability of a crisis
  • high inflation increases the probability of currency and banking crises
  • Countries with a low degree of financial deepening are more likely to be hit by currency crises, while the probability of a banking crisis tends to increase with greater reliance on manufacturing
  • the probability of a sovereign debt crisis is non-linearly affected by net domestic credit in the percentage of GDP: maintaining domestic credit at a reasonable level can reduce this risk to some extent.
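The importance ranking described above (mean absolute Shapley value per indicator) is simple to compute once the Shapley values are available. The sketch below uses made-up Shapley values for three of the indicators; the numbers are not taken from the paper and the helper name is hypothetical.

```python
def rank_by_mean_abs_shapley(shapley_rows, names):
    """Rank predictors by the mean absolute Shapley value across observations."""
    n = len(shapley_rows)
    importance = {
        name: sum(abs(row[j]) for row in shapley_rows) / n
        for j, name in enumerate(names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


names = ["Inflation", "NFA_GDP", "GDPGrowth"]

# One row per observation, one (hypothetical) Shapley value per predictor
shapley_rows = [
    [0.10, -0.30, 0.02],
    [-0.05, -0.20, 0.01],
    [0.15, 0.25, -0.03],
]

ranking = rank_by_mean_abs_shapley(shapley_rows, names)
for name, importance in ranking:
    print(name, round(importance, 3))
```

Taking absolute values matters: a predictor that pushes some predictions up and others down would otherwise average out to roughly zero and look unimportant.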

Optimal warning threshold

The authors attempted to answer this question: at what (crisis) probability level should a warning be triggered?

Their answer, based on the above figure, is 37%: when the warning threshold is 37%, the random forest model can achieve the optimal result with a high TPR and low FPR, which is shown as the point closest to the upper left corner…
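The "closest to the upper left corner" rule can be sketched as a search over candidate thresholds for the one whose (FPR, TPR) point has the smallest Euclidean distance to (0, 1). The data and the threshold grid below are made up, so the selected value is not meant to reproduce the paper's 37%. Using Euclidean distance as the closeness measure is an assumption.

```python
import math


def closest_to_corner(y_true, probabilities, thresholds):
    """Return (threshold, distance, TPR, FPR) for the ROC point nearest (0, 1)."""
    n_risky = sum(y_true)
    n_safe = len(y_true) - n_risky
    best = None
    for t in thresholds:
        tp = sum(1 for y, p in zip(y_true, probabilities) if y == 1 and p >= t)
        fp = sum(1 for y, p in zip(y_true, probabilities) if y == 0 and p >= t)
        tpr, fpr = tp / n_risky, fp / n_safe
        distance = math.hypot(fpr, 1 - tpr)
        if best is None or distance < best[1]:
            best = (t, distance, tpr, fpr)
    return best


# Made-up periods (1 = risky) and predicted crisis probabilities
y_true = [0, 0, 0, 0, 1, 1, 1]
probabilities = [0.05, 0.20, 0.30, 0.45, 0.40, 0.60, 0.90]
thresholds = [i / 10 for i in range(1, 10)]

best = closest_to_corner(y_true, probabilities, thresholds)
print(best)  # (0.4, 0.25, 1.0, 0.25)
```

Lower thresholds in this toy sample catch the same risky periods but raise more false alarms, while higher ones start missing risky periods.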

Trials on more realistic out-of-sample data

  • Random Forest used as an example in another out-of-sample test
  • the above-mentioned 3,032 observations from 119 countries were used as the training-validation group, and the observations from 2013 to 2017 were used as the test group for out-of-sample testing. As mentioned above, it is impossible to judge whether a country was in a safe period after 2012, and it is more important for the warning model to avoid missing crises and be able to send warning signals before crises; hence, the test samples we selected were countries that were hit by financial crises. Moreover, to demonstrate the ability of the models to issue warning signals before a crisis, we further selected countries that experienced financial crises after 2015 as test samples, thus ensuring that there is an early warning period of at least two years before a crisis. Based on the availability of data, we selected Argentina (ARG), Belarus (BLR), Brazil (BRA), the Democratic Republic of the Congo (COD), and Swaziland (SWZ) as test samples. As shown in the financial crisis database established by Laeven and Valencia (2018), Argentina experienced a sovereign debt crisis in 2016. Currency crises hit Belarus, Brazil, and Swaziland in 2015 and the Democratic Republic of the Congo in 2016. We still regarded 37% as the optimal warning threshold.

  • the crisis probability of Argentina has been much higher than the warning threshold of 37% since 2013, and warning signals were constantly issued => the early warning model could effectively predict the currency crisis and sovereign debt crisis that struck in 2013 and 2016, respectively.
  • Warning signals issued in Belarus and Swaziland since 2013 suggested currency crises in both countries in 2015. After 2015, the probability of crises in both countries decreased and fell below the threshold in 2018.
  • For the currency crisis in Brazil in 2015, the model sent out persistent warning signals between 2013 and 2014. Although the risks in Brazil fell after this crisis, there was a sharp rebound in 2018, which should be considered seriously.
  • the early warning model was also effective for the Democratic Republic of the Congo, and the model results showed that risks there further intensified in the post-crisis period.
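A minimal sketch of this time-based test: observations before 2013 go to training, later ones to testing, and a warning is issued in every year whose predicted crisis probability exceeds the 37% threshold. The country code, the probabilities and the helper names are invented for illustration; only the 2013 cut-off and the 37% threshold come from the text.

```python
def temporal_split(observations, first_test_year=2013):
    """Split observations by year instead of at random."""
    train = [obs for obs in observations if obs["year"] < first_test_year]
    test = [obs for obs in observations if obs["year"] >= first_test_year]
    return train, test


def warning_years(probability_by_year, threshold=0.37):
    """Years in which the predicted crisis probability exceeds the threshold."""
    return [
        year
        for year, probability in sorted(probability_by_year.items())
        if probability > threshold
    ]


# Hypothetical country "XYZ" observed from 2010 to 2015
observations = [{"country": "XYZ", "year": year} for year in range(2010, 2016)]
train, test = temporal_split(observations)
print(len(train), len(test))  # 3 3

# Made-up out-of-sample crisis probabilities for the test years
predicted = {2013: 0.55, 2014: 0.62, 2015: 0.30}
print(warning_years(predicted))  # [2013, 2014]
```

Splitting by time rather than at random means the model never sees data from the years it is asked to warn about, which is closer to how an early warning system would be used.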

Analysis of crisis risk and factors in 2018

(Again Random Forest + Shapley values for explanation)

  • Egypt is high risk (57% probability): mainly caused by hyperinflation and its relatively high dependence on industry. The high scale of domestic credit also increased the risk to some extent.
  • China has a low risk of crisis (19%): mainly due to China’s large-scale net foreign assets, high degree of financial deepening, and low inflation. Similar to Egypt, the high ratio of manufacturing added value to GDP and the large-scale net domestic credit put China at an increased risk of a financial crisis. Therefore, China would need to accelerate the high-quality transformation of its industrial structure and control the domestic credit scale within a more reasonable range
  • For low-risk Singapore (9%), its abundant net foreign assets, reasonable inflation, and high degree of financial deepening constituted insurance against risk

Applicability of models to different types of countries

  • The machine learning methods, especially AdaBoost, RF, and GBDT, performed well in both the higher income group and the lower income group

Follow-up article

Chen, Mary and DeHaven, Matthew and Kitschelt, Isabel and Lee, Seung Jung and Sicilian, Martin, Identifying Financial Crises Using Machine Learning on Textual Data (March, 2023). International Finance Discussion Paper No. 1374, Available at SSRN: https://ssrn.com/abstract=4438789 or http://dx.doi.org/10.17016/IFDP.2023.1374

(Shows how financial crises can be predicted not only from tabular data, but also how predictions can be improved by including higher-frequency textual data)

References

Bluwstein, Kristina, Marcus Buckmann, Andreas Joseph, Sujit Kapadia, and Özgür Şimşek. 2023. “Credit Growth, the Yield Curve and Financial Crisis Prediction: Evidence from a Machine Learning Approach.” Journal of International Economics 145: 103773. https://doi.org/10.1016/j.jinteco.2023.103773.
Chen, Mary, Matthew DeHaven, Isabel Kitschelt, Seung Jung Lee, and Martin J. Sicilian. 2023. “Identifying Financial Crises Using Machine Learning on Textual Data.” Journal of Risk and Financial Management. https://api.semanticscholar.org/CorpusID:237263041.
Laeven, Luc, and Fabian Valencia. 2018. Systemic Banking Crises Revisited. International Monetary Fund.
Liu, Lanbiao, Chen Chen, and Bo Wang. 2022. “Predicting Financial Crises with Machine Learning Methods.” Journal of Forecasting 41 (5): 871–910.