After completing this reading, you should be able to:
– Credit risk
– Market risk
– Operational risk
– Regulatory compliance
Machine learning is a branch of artificial intelligence (AI) that uses algorithms to identify patterns in a data set and then make decisions in a way that resembles human judgment. It aims to mimic the way humans learn, gradually improving its predictive power and accuracy as it is exposed to more data. Machine learning is premised on the idea that machines can learn from data without being explicitly programmed to perform specific tasks. Machine learning algorithms use statistical methods to uncover key insights within a data set and then make relevant classifications or predictions.
In recent years, machine learning has gained a strong foothold in the financial industry, particularly banking and insurance. It has been used to decide how much to lend to customers, provide warning signals to traders, detect fraud, and improve compliance with rules and regulations. This chapter explores ways machine learning and AI can improve risk management by leveraging the large volume of data available. We also look at the core machine learning techniques which can be applied to improve risk management.
Machine learning falls into two broad categories: supervised and unsupervised machine learning.
Supervised learning is a machine learning technique where models are trained using labelled data. The goal is to find the mapping function that maps the input variable (X) with the output variable (Y).
Y = f(X)
The word “supervised” comes from the fact that the algorithms used aren’t left to deduce the relationship between X and Y on their own. Instead, the machine is trained using data that are already labelled. It’s much like providing the machine with some questions that are already tagged with the correct answers and then asking it to find the answers to untagged but similar questions.
Regression is a supervised technique that predicts a single, continuous output value (the dependent variable) from a set of input variables (the independent variables), using relationships learned from training data. For example, we can use regression to model the risk of loan repayment using a range of explanatory variables, including average nonpayment rates, employment status, credit history, and other outstanding liabilities.
One advantage of machine learning regression over traditional regression is that we can include a larger number of independent variables, and those that lack explanatory power can be discarded automatically. For example, LASSO regression shrinks the coefficients of variables with little explanatory power all the way to zero, which removes them from the model, while Ridge regression shrinks coefficients towards zero without eliminating them and gives lower weights to variables that are highly correlated with other variables in the model. We can also go the other way: begin with no variables in the model and gradually add those found to have explanatory power.
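The difference between the two penalties can be illustrated with a minimal sketch. It uses only the Python standard library and synthetic data; the data set, the penalty strength `lam`, and the helper names `soft_threshold` and `fit_penalized` are illustrative assumptions rather than anything prescribed by the reading, and the model omits an intercept for brevity. The first variable drives the outcome, while the second is pure noise.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; return exactly 0.0 inside the threshold band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_penalized(X, y, lam, penalty, sweeps=100):
    """Coordinate descent for least squares with a 'lasso' (L1) or 'ridge' (L2) penalty."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0  # covariance of feature j with the residual that ignores feature j
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(w[k] * X[i][k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            if penalty == "lasso":
                w[j] = soft_threshold(rho, lam) / z  # can land exactly on zero
            else:
                w[j] = rho / (z + lam)               # shrinks, but never exactly to zero
    return w


# Synthetic data (assumption): column 0 is informative, column 1 is noise.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
y = [2.0 * row[0] + rng.gauss(0, 0.1) for row in X]

lasso_w = fit_penalized(X, y, lam=0.2, penalty="lasso")
ridge_w = fit_penalized(X, y, lam=0.2, penalty="ridge")
print("LASSO coefficients:", lasso_w)
print("Ridge coefficients:", ridge_w)
```

In this setup the LASSO fit should drop the noise variable entirely, while the Ridge fit keeps a small non-zero weight on it. In practice an established statistics or machine learning library would normally be used instead of hand-written loops.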
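Principal component analysis (PCA) is often used alongside regression to reduce the number of variables by combining them and extracting common factors. PCA does not use the output labels, so it is strictly an unsupervised technique, but it is commonly applied as a preparatory step in supervised models. Let’s say you’re modelling credit repayment risk as the dependent variable, and among the independent variables, you’ve got (I) owns a house, (II) owns a car, and (III) has bank savings. Rather than working with all three, PCA can combine them into a single variable that captures most of their shared variation, which we might interpret as asset ownership.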
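The asset ownership example can be sketched as follows. The sketch assumes three continuous asset measures (house, car, and savings values) that are all driven by a hypothetical latent "wealth" factor; the data are synthetic, and the helper names are illustrative. It finds the first principal component by power iteration on the covariance matrix, using only the standard library.

```python
import math
import random


def covariance_matrix(rows):
    """Return the sample covariance matrix and the mean-centered rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    return cov, centered


def first_component(cov, iterations=200):
    """Power iteration: dominant eigenvector (loadings) and its eigenvalue."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p)) for a in range(p))
    return v, eigenvalue


# Synthetic borrowers (assumption): house, car, and savings all track latent wealth.
rng = random.Random(1)
borrowers = []
for _ in range(300):
    wealth = rng.gauss(0, 1)
    borrowers.append([wealth + rng.gauss(0, 0.5) for _ in range(3)])

cov, centered = covariance_matrix(borrowers)
loadings, eigenvalue = first_component(cov)
explained_share = eigenvalue / sum(cov[j][j] for j in range(3))

# One combined "asset ownership" score per borrower replaces the three inputs.
asset_score = [sum(l * x for l, x in zip(loadings, row)) for row in centered]
print("Loadings:", loadings)
print("Share of variance explained:", round(explained_share, 3))
```

Because the three variables move together, the first component should carry most of the total variance, which is what justifies replacing them with a single score.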
Classification involves grouping data into labelled classes. For example, when modelling the likelihood of default, we could have two categories: potential defaulters and non-defaulters. The model would then be trained to classify each observation into one of the two classes as accurately as possible. In binary classification, the model works with just two labels, 0 and 1 (no and yes). In multi-class classification, the model classifies data into more than two classes.
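A minimal sketch of binary classification, assuming a logistic regression model trained by gradient descent. The ten borrowers, their two features (debt-to-income ratio and number of missed payments), and the labels are invented for illustration; logistic regression is just one of many possible classifiers.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, learning_rate=0.5, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, p = len(features), len(features[0])
    weights, bias = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for x, y in zip(features, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j in range(p):
                grad_w[j] += error * x[j] / n
            grad_b += error / n
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


def predict_class(x, weights, bias, threshold=0.5):
    """Return 1 (potential defaulter) or 0 (non-defaulter)."""
    probability = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
    return 1 if probability >= threshold else 0


# Hypothetical labelled data: (debt-to-income ratio, missed payments) -> defaulted?
features = [(0.10, 0), (0.15, 0), (0.20, 1), (0.25, 0), (0.30, 1),
            (0.55, 2), (0.60, 3), (0.65, 2), (0.70, 4), (0.80, 3)]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

weights, bias = train_logistic(features, labels)

low_risk = predict_class((0.12, 0), weights, bias)
high_risk = predict_class((0.70, 3), weights, bias)
print("Low-risk applicant class:", low_risk)
print("High-risk applicant class:", high_risk)
```

The 0.5 threshold is a design choice: a lender that wants to be more cautious could lower it, at the cost of flagging more borrowers who would have repaid.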
In unsupervised learning, models are not supervised or trained using labelled data. Instead, models find hidden patterns and insights from the given data without human intervention. The goal is to uncover previously undetected patterns and discover the internal structure of a data set without predefined output categories. Unsupervised learning methods are used to perform more complex processing tasks compared to supervised learning. For example, a bank could create an algorithm to scrutinize customer accounts and identify those with similarities. This could help the bank develop a product that specifically targets those account holders.
Clustering mainly deals with finding a natural structure or pattern in a collection of raw, unclassified data. Clustering algorithms scour the data to identify any notable clusters (groups). One area where clustering is applied is the detection of spam emails. If an email looks like other emails deemed spam, it is likely to also be spam. There are several clustering approaches, but the most popular one is k-means clustering. But how exactly does it work?
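Under k-means clustering, the desired number of clusters, k, is chosen in advance. The algorithm is then tasked with clustering the data into the k groups through an iterative process. A larger k means you’ve got smaller groupings with more granularity, while a lower k means larger groupings with less granularity. Each group or cluster has its own centroid (central focal point), which is the mean of the points assigned to it. In each iteration, every data point is assigned to the cluster with the nearest centroid, and the centroids are then recalculated; this repeats until the assignments stop changing. The aim is to make the points within a cluster as close as possible to their own centroid, so that the clusters are as distinct from one another as possible. If we have two clusters, A and B, and a data point Y is closer to the centroid of A than to the centroid of B, then Y is put in cluster A.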
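The iterative process can be sketched in a few lines. The six customer records below (average balance and monthly transactions, both on arbitrary scales) are invented for illustration, and the initial centroids are simply the first k points, which is a simplification; real implementations usually choose starting centroids more carefully.

```python
import math


def k_means(points, k, max_iterations=100):
    """Alternate between assigning points to the nearest centroid and updating centroids."""
    centroids = [points[i] for i in range(k)]  # naive initialisation (assumption)
    labels = [0] * len(points)
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster with the closest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # converged: nothing moved
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical customers: (average balance, monthly transactions).
customers = [(1.0, 1.2), (0.8, 1.0), (1.2, 0.9),
             (5.0, 5.2), (5.3, 4.9), (4.8, 5.1)]

labels, centroids = k_means(customers, k=2)
print("Cluster labels:", labels)
print("Centroids:", centroids)
```

Note that the cluster numbers themselves carry no meaning; only the grouping does. Here the first three customers should end up in one cluster and the last three in the other.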
Dimensionality reduction is used to analyze and obtain a more compact representation of data. At the end of the process, the data set should contain less redundant information, while its most important features are retained and emphasized. In practice, this technique is used to hive off a section of a large amount of data for closer scrutiny.
In addition to supervised and unsupervised machine learning techniques, we have deep learning and neural networks, which are other branches of machine learning that can be supervised, unsupervised, or semi-supervised. The two are used to model highly complex relationships between variables and ultimately to better mimic human decision-making. A key feature of deep learning is that problems are modelled in a multi-layer network whose inner workings are extremely difficult to interpret. As the input data progresses through the model, it’s combined and recombined to form new factors, with weights that depend on the combinations made in the previous layer. This leads to output that’s essentially been worked out in a “black box.” This perceived lack of transparency and clarity over decisions made by the model can complicate risk management and can be a source of risk for firms. The issue has been widely raised in the digital lending market, where the software used essentially runs borrower data through a black box to determine whether borrowers are eligible for loans and how much they should get.
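To make the idea of data being combined and recombined concrete, here is a minimal sketch of a forward pass through a small multi-layer network. Every number in it is hypothetical: the three borrower features, the layer sizes, and the weights, which in a real model would be learned from data rather than typed in.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases):
    """Each unit takes a weighted sum of ALL inputs, then applies a nonlinearity."""
    return [sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
            for unit_weights, b in zip(weights, biases)]


# Hypothetical borrower features scaled to [0, 1]:
# income, debt ratio, length of credit history.
borrower = [0.6, 0.3, 0.8]

# Hypothetical weights and biases (learned from data in a real model).
hidden1_w = [[0.5, -1.2, 0.8], [-0.7, 1.5, 0.1], [0.3, 0.3, -0.9], [1.1, -0.4, 0.6]]
hidden1_b = [0.0, 0.1, -0.2, 0.05]
hidden2_w = [[0.9, -0.6, 0.2, 0.4], [-0.3, 0.8, -0.5, 0.7]]
hidden2_b = [0.0, -0.1]
output_w = [[1.3, -1.1]]
output_b = [0.2]

h1 = dense_layer(borrower, hidden1_w, hidden1_b)  # 3 inputs  -> 4 new factors
h2 = dense_layer(h1, hidden2_w, hidden2_b)        # 4 factors -> 2 new factors
approval_score = dense_layer(h2, output_w, output_b)[0]

print("First hidden layer:", h1)
print("Second hidden layer:", h2)
print("Approval score:", approval_score)
```

Even in this tiny network, the intermediate factors in `h1` and `h2` have no obvious business meaning, which hints at why models with many layers and thousands of weights are hard to explain to a regulator.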
For a long time, firms relied on classical linear, logit, and probit regressions to model credit risk. However, in recent years, there’s been a realization that AI and machine learning can significantly improve credit risk management and help firms make better lending decisions. Studies have shown that credit risk can be modelled more accurately by combining traditional statistical methods of distress and bankruptcy prediction with neural network algorithms.
One area where machine learning has proved extremely useful is the credit risk analysis of credit default swaps. This is because, in the CDS market, there are many uncertain elements involved in the determination of the likelihood of a default (credit) event and estimation of the cost of default in case a default materializes. In a 14-year study conducted between 2001 and 2014 involving CDSs of different maturities and rating groups, nonparametric machine learning models outperformed traditional benchmark models regarding prediction accuracy. The nonparametric machine learning models also performed better in terms of suggesting the most practical hedging tools.
Banks are increasingly relying on machine learning to make better decisions regarding consumer lending and SME lending.
Market risk emanates from exposure to the financial market, including investing and trading in various assets such as stocks and bonds. Machine learning has been used in several market risk management areas:
Operational risk covers risks emanating from both internal and external operational breakdowns. These include failures relating to people (e.g., strikes and go-slows) and systems, as well as fraud, neglected procedures, and natural disasters. In recent years, operational risks have become more complex and more frequent, prompting firms to explore artificial intelligence and machine learning-based solutions.
AI and machine learning can help firms to:
AI and machine learning can be effective tools against fraud by automating routine tasks to minimize human error, processing unstructured data to pick out relevant content or negative news, and evaluating the extent of interconnectedness among individuals to assess how prone they might be to an external attack. AI tools can also be used to monitor individual traders by combining trade data with their electronic and voice communications records. They can also single out alerts that need a more urgent response.
Any firm that wants a sound risk management system must comply with all risk management regulations. To help with compliance, most firms have turned to RegTech – a subset of fintech that focuses on technologies that can facilitate the delivery of regulatory requirements more efficiently and effectively when compared with the existing traditional capabilities. AI is an excellent RegTech tool because it allows for continuous monitoring of firm activities. This way, the firm has access to real-time insights that help it avoid compliance breaches rather than dealing with the consequences of breaches after they have occurred.
For risk management, AI and machine learning offer several key benefits:
Improved data processing: AI and machine learning techniques have made it possible to process structured and unstructured data in massive amounts. Datasets can even be combined to form new variables that unravel key relationships.
Improved efficiency: Automation of repetitive tasks can help firms to reduce costs.
Real-time and predictive insights: AI and machine learning tools can alert firms about new exposures faster than traditional tools, increase preventative risk advice, and help firms develop faster response times in critical situations.
Improved decision making: Machine learning is associated with better decision-making through greater (predictive) insights and risk visibility.
Several practical issues plague the use of AI and machine learning techniques in risk management:
The availability of suitable data: Although multiple AI and machine learning tools have been developed with the ability to read all types of data, firms have been slow to organize their internal data in a way that makes its analysis and processing simple and straightforward. In some cases, data is held by different departments (silos) or systems, and sometimes the sharing of such data with other internal departments is restricted. In other cases, firms do not record important data but merely keep it as “informal knowledge.”
Availability of skilled staff: There’s an overall shortage of skilled employees who can work with modern AI and machine learning tools. In the same vein, staff training is a painstaking, time-consuming exercise that has so far failed to keep up with the development of new AI solutions. However, there have been efforts to overcome this problem, notably by creating learning campuses that solely focus on AI and machine learning techniques. For example, Goldman Sachs has set up a campus in India aiming to train more than 7,000 individuals on how to use these tools effectively.
Accuracy of machine learning solutions: There’s no doubt that AI and machine learning solutions can help improve risk management, but not all the underlying tools are effective. Some have been found to suggest unrealistic or inaccurate solutions that cannot be implemented. As such, the use of all machine learning solutions has to be accompanied by continuous monitoring and constant evaluation. Firms have to incorporate human input at various stages instead of giving algorithms full control (from data gathering to decision making).
Transparency and ethics: As good as AI and machine learning tools can be, there seems to be a consensus that some of the underlying solutions are not as transparent as firms and regulatory authorities would want. One of the most contentious solutions is deep learning, where models work out the output in a black box system. If a firm cannot clearly demonstrate how certain decisions are made, it would be difficult to convince regulatory authorities that the models being used are valid and ethically sound.