Machine Learning Methods
AI risks refer to the potential harms that AI systems may cause to organizations, consumers, or society at large. Such risks may originate from a variety of sources.
According to the Artificial Intelligence/Machine Learning Risk & Security Working Group (AIRS), potential risks of AI can be categorized into data-related risks, AI/ML attacks, risks related to testing and trust, and compliance (people) risks.
a. Learning Limitations
The effectiveness of any AI/ML system depends on the data used to train it and the scenarios considered during the training. Sadly, training the system on all possible scenarios and data is not always possible. For example, if we were to develop a model that seeks to predict the occurrence of a major financial crisis, we would only have data gathered from past crises. Even then, history shows that past financial crises were not preceded by the exact same conditions or scenarios. There have always been unique situations and “surprise” factors. In the end, we would develop a model that enjoys a certain degree of accuracy, but it would be far from perfect.
Therefore, learning limitation is a key issue that’s often discussed during risk reviews and brainstorming sessions that precede the deployment of an AI system.
b. Data Quality
Poor data quality limits the learning capacity of AI/ML systems, leading to inaccurate or unreliable output. It also undermines future decisions and inferences and may ultimately cause the system to fall short of its intended objectives.
Potential attacks against AI/ML systems include the following:
a. Data Privacy Attacks
If an attacker is able to infer the data used to train an AI/ML system, they may gain access to sensitive information, compromising the privacy of the individuals and entities represented in that data. There are two major types of data privacy attacks: membership inference attacks and model inversion attacks.
In a successful membership inference attack, the attacker determines whether a particular record is present in the training data set; that is, they establish, with some degree of confidence, whether a given input or set of inputs was used to train the system. In a model inversion attack, by contrast, the attacker extracts representations of the training data, usually by reconstructing it from the model's parameters.
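To make the mechanism concrete, below is a minimal sketch of a threshold-based membership inference attack. It assumes a scikit-learn-style classifier exposing `predict_proba`; the toy data, the deliberately overfit model, and the confidence threshold are all hypothetical choices for illustration.

```python
# Minimal membership inference sketch (hypothetical data and model).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Noisy toy data standing in for sensitive records; the unpruned tree memorizes it.
X_members = rng.normal(size=(300, 5))
y_members = (X_members[:, 0] + X_members[:, 1] + rng.normal(size=300) > 0).astype(int)
X_nonmembers = rng.normal(size=(300, 5))
y_nonmembers = (X_nonmembers[:, 0] + X_nonmembers[:, 1] + rng.normal(size=300) > 0).astype(int)

model = DecisionTreeClassifier(random_state=0).fit(X_members, y_members)

def infer_membership(model, X, y, threshold=0.95):
    """Flag records on which the model is suspiciously confident in the true label."""
    confidence = model.predict_proba(X)[np.arange(len(y)), y]
    return confidence > threshold

# A markedly higher flag rate on true members than on non-members indicates leakage.
print("flag rate, true members:    ", infer_membership(model, X_members, y_members).mean())
print("flag rate, true non-members:", infer_membership(model, X_nonmembers, y_nonmembers).mean())
```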
b. Training Data Poisoning
Data poisoning refers to the contamination of the data used to train the AI/ML system.
Data poisoning may increase the error rate and thus negatively impact the learning process and the overall output. "Label-flipping" and "frog-boil" attacks are examples of attacks under this category.
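As a rough illustration of how label-flipping degrades learning, the sketch below poisons an increasing fraction of training labels for a toy classifier and tracks the resulting loss of test accuracy; the data, model, and flip fractions are hypothetical.

```python
# Minimal label-flipping (data poisoning) sketch on hypothetical data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 4))
y = (X @ np.array([1.0, -1.0, 0.5, 0.0]) > 0).astype(int)
X_train, y_train, X_test, y_test = X[:700], y[:700], X[700:], y[700:]

def flip_labels(y, fraction, rng):
    """Poison the training set by flipping the labels of a random subset."""
    y_poisoned = y.copy()
    idx = rng.choice(len(y), size=int(fraction * len(y)), replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]
    return y_poisoned

for fraction in (0.0, 0.2, 0.4):
    model = LogisticRegression().fit(X_train, flip_labels(y_train, fraction, rng))
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"flipped {fraction:.0%} of training labels -> test accuracy {acc:.3f}")
```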
c. Adversarial Inputs
Some AI systems take inputs from external systems or users, interpret them, and then take some action, such as classifying the data. An adversary could craft malicious inputs specifically designed to bypass the AI system's classifier. Most adversarial attacks aim to degrade a classifier's performance on specific tasks by essentially "fooling" the machine learning algorithm. Such malicious inputs are referred to as adversarial inputs.
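The sketch below illustrates the idea on a simple linear classifier: each input is nudged against its true class in the direction indicated by the model's weights, in the spirit of gradient-sign attacks. The data, model, and perturbation size are hypothetical, and real attacks on complex models are considerably more sophisticated.

```python
# Minimal adversarial-input (evasion) sketch against a linear classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 10))
y = (X @ rng.normal(size=10) > 0).astype(int)
model = LogisticRegression().fit(X, y)

def craft_adversarial_inputs(model, X, y, epsilon=0.5):
    """Shift each point against its true class along the sign of the model weights."""
    w = model.coef_.ravel()
    # Moving class-1 points along -sign(w) lowers their score; class-0 points the opposite.
    direction = np.where(y[:, None] == 1, -np.sign(w), np.sign(w))
    return X + epsilon * direction

X_adv = craft_adversarial_inputs(model, X, y)
print("accuracy on clean inputs:      ", model.score(X, y))
print("accuracy on adversarial inputs:", model.score(X_adv, y))
```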
d. Model Extraction
In a model extraction attack, a malicious user attempts to steal the model itself. The adversary first accesses the prediction API of the target model and then issues queries designed to extract information about the model's vital components. The goal is to use this information to gradually train a pseudo model that behaves much like the target model. Attempts can then be made to sell the pseudo model. In some cases, the adversary may study the model even more closely with the aim of launching further attacks on the parent AI system.
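Below is a minimal sketch of the extraction workflow: the attacker can only call a prediction endpoint (simulated here by `target_predict`), harvests query/response pairs, and fits a surrogate ("pseudo") model. The target model, the query strategy, and all data are hypothetical stand-ins.

```python
# Minimal model extraction sketch: black-box queries train a surrogate model.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

# Stand-in for the victim's proprietary model sitting behind a prediction API.
X_private = rng.normal(size=(2000, 6))
y_private = (np.sin(X_private[:, 0]) + X_private[:, 1] > 0).astype(int)
target = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_private, y_private)

def target_predict(X):
    """The only access the attacker has: a label-returning prediction endpoint."""
    return target.predict(X)

# Attacker side: query the endpoint with synthetic inputs and fit a surrogate.
X_queries = rng.normal(size=(5000, 6))
surrogate = LogisticRegression().fit(X_queries, target_predict(X_queries))

# How often the surrogate mimics the target on fresh inputs.
X_fresh = rng.normal(size=(1000, 6))
agreement = (surrogate.predict(X_fresh) == target_predict(X_fresh)).mean()
print(f"surrogate agrees with the target on {agreement:.1%} of fresh queries")
```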
One of the trickiest aspects of AI/ML systems is that some issues do not come to light in the early stages of implementation. Rather, these issues become apparent after continued use. The AI/ML system might also evolve and generate complexities that might worsen over time. Here are potential concerns related to testing and trust risk:
a. Incorrect Output
Certain AI/ML systems are inherently dynamic and prone to changes over time. In particular, the output may evolve over time and differ significantly from the output produced in the early stages of implementation. This poses a real challenge to the testing and validation of the AI/ML system. It may not be possible to carry out reliable tests for all the scenarios, combinations, and permutations of the available data.
b. Lack of Transparency
Although AI/ML systems have been around for a while, they are still considered an emerging technology with which most people are not fully conversant. Understanding of how these systems work varies widely, a situation that has led to persistent trust issues. It is commonly believed, for example, that AI systems work out problems in a "black box" that is not open to scrutiny.
c. Bias
AI bias refers to injustice or unfairness against a person, entity, or corporation. Evidence shows that AI systems usually churn out biased outcomes that can have serious negative effects on individuals and organizations. Technologies are not neutral; they are only as good (or bad) as the people who develop them. For example, in 2014, software engineers at Amazon developed an employee recruitment program that used the applicants’ resumes as input. However, the program was found to be biased against women for technical roles. This discrimination forced the company to retire the program prematurely.
Biased AI outcomes can also lead to legal, regulatory, reputational, and operational risks.
Policy Non-compliance
As the implementation of AI continues apace in the financial industry, there is a need to consider its effects on the existing internal policies. Indeed, regulatory authorities have expressed a lot of interest in AI, and multiple working groups have been formed to discuss supervisory challenges posed by emerging technologies. This has led to a trove of guidelines, white papers, and surveys on the subject as regulators seek to lay bare all the emerging challenges and how they impact their work.
As the use of AI becomes more widespread within the financial industry, there's a general consensus among regulatory bodies and individual firms on the need to keep a close eye on emerging AI risks. However, there's also an acknowledgment that there are multiple ways to govern these risks, and each firm should be allowed to develop or modify its own risk management framework to address them. There are four key components of AI governance: definitions, inventory, policy/standards, and framework (including controls).
Definitions of AI/ML may vary from one organization to another depending on the organization’s culture, environment, and adoption. The first step toward sound and robust AI governance should include a clear definition of what constitutes AI (and what doesn’t). This definition is vital since it provides the foundation and a clear understanding of the other AI governance components.
For the best results, AI definitions should be clearly documented. The definitions and the supporting documentation should also clearly indicate how an organization's stakeholders identify with them. Such stakeholders include senior management, system developers, and legal, compliance, and information security officers.
An AI inventory is simply a centralized repository that helps an organization keep track of all AI systems in use and monitor associated risks. An inventory describes the role of each AI system deployed, its uses, and any restrictions on such use. Inventories can also provide a list of data elements that constitute each AI/ML system.
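As a rough illustration, an inventory entry might be represented as a simple structured record such as the sketch below; the field names and the example system are hypothetical, not a prescribed standard.

```python
# Minimal sketch of an AI inventory record (illustrative field names only).
from dataclasses import dataclass
from typing import List

@dataclass
class AIInventoryEntry:
    system_name: str
    role: str                    # what the AI system does
    approved_uses: List[str]     # permitted uses
    use_restrictions: List[str]  # prohibited or limited uses
    data_elements: List[str]     # data that feeds the system
    owner: str                   # accountable business owner
    risk_rating: str             # e.g., "low", "medium", "high"

inventory: List[AIInventoryEntry] = [
    AIInventoryEntry(
        system_name="credit-scoring-v2",
        role="Scores retail loan applications",
        approved_uses=["unsecured consumer lending"],
        use_restrictions=["not approved for mortgage underwriting"],
        data_elements=["credit bureau attributes", "income", "loan history"],
        owner="Retail Credit Risk",
        risk_rating="high",
    )
]

# A simple oversight query: list all high-risk systems for closer monitoring.
print([entry.system_name for entry in inventory if entry.risk_rating == "high"])
```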
In some cases, the use of AI systems can be governed on the basis of existing policies and standards. However, there may be a need for the formulation of new policies and standards or some modification of the existing ones to ensure that AI is deployed appropriately.
It’s difficult to discuss AI policies and standards without mentioning ethical principles and accepted norms. It should be noted that ethical principles for AI have been discussed in the financial industry for a long time. As a result of these discussions, the financial industry has developed a binding set of principles. Indeed, some institutions have gone as far as publicizing the principles by which they abide. These principles have had positive impacts on organizations and should therefore be developed further.
A robust AI governance framework is important as it helps organizations learn about, govern, monitor, and develop AI systems. The first part of an AI governance framework might involve the identification of key stakeholders, formalized in a Center of Excellence (CoE), working group, or council. The stakeholders from various groups and departments collectively form what we call a 'coalition.' Such groups facilitate the sharing of best practices and knowledge and provide guidance on the use of AI systems. These efforts, in most cases, bear fruit when close links are established with technology, data engineering, and business line stakeholders.
In reviewing an AI-enabled initiative, the 'coalition' should consider several factors, beginning with the potential AI/ML risks involved.
It should be noted that identifying the potential AI/ML risks helps formulate an operational risk and control framework. Once potential risks have been identified, a gap analysis can be conducted against existing controls. The number of participating control owners depends on an institution's control library. Thorough planning and a structured approach are necessary to complete the gap analysis, and its results are then used to create new or improved controls to mitigate the identified AI/ML risks.
Other factors to be considered under the AI framework include central monitoring, third-party risk management, and clearly defined roles and responsibilities.
A central monitoring process is important since it provides visibility into the decisions made and an opportunity to raise concerns or challenges appropriately. It is, therefore, necessary that the monitoring structure considers the changing needs of an organization as its adoption of AI matures or as changes occur in the industry. Some firms believe that the monitoring and oversight procedures already in place sufficiently address potential AI risks.
In most cases, existing governance is designed for scenarios with a high degree of human involvement. AI, however, may improve the accuracy, consistency, and efficiency of existing processes precisely by reducing or removing such human intervention, so governance structures may need to be adapted accordingly.
When deploying an AI/ML system, an organization may engage a dedicated third party that has the knowledge and expertise required to pull off a successful launch. Such a move also enables scalability, increased computing power, and broader access to vendors within the larger fintech ecosystem. It is, therefore, necessary that firms strengthen their third-party risk management (TPRM) capabilities. In some cases, institutions may include contractual clauses with third parties covering the testing methodology of the AI system.
A three-lines-of-defense model includes: (1) the business and development teams that own and manage AI risk on a day-to-day basis; (2) risk management and compliance functions that set standards and provide oversight; and (3) internal audit, which provides independent assurance.
An AI framework should clearly define the roles and responsibilities of various parties within the organization. The following are examples of roles and responsibilities usually considered within an AI framework:
$$\begin{array}{l|l} \textbf{Party} & \textbf{Role} \\\hline
\text{Ethics Review Board} & {\text{Ensures that all AI projects observe}\\ \text{the organization’s norms and principles.}} \\\hline
{\text{Center of Excellence}\\\text{(CoE)}}& {\text{Works across business units or product lines}\\ \text{to provide expert knowledge or training while}\\ \text{also keeping in touch with industry developments.}}\\\hline
\text{Data Science} &{ \text{Building and running AI algorithms}\\\text{and monitoring the system.}} \\\hline \text{ML Operations} & \text{Data creation and documentation.} \end{array}$$
Interpretability refers to the presentation of the AI system results in a format that a human can understand. On the other hand, discrimination refers to an unfairly biased outcome. The two form a major part of the risk management framework for any organization deploying an AI/ML system. Let’s see the potential risks associated with each.
AI may lead to discriminatory and unfairly biased outcomes if not implemented appropriately. Sources of such poor implementation include biased data, poor AI system training, or the failure to use alternative AI systems or data sources that could potentially generate better outcomes for certain disadvantaged groups. An AI system that produces unfairly biased outcomes is likely to cause regulatory non-compliance issues as well as legal and reputational risk.
Discrimination in areas that affect our day-to-day life, e.g., housing, lending, etc., is prohibited by both federal and state statutes.
According to federal banking regulators, discrimination can be of three types: overt discrimination, disparate treatment, and disparate impact.
Overt discrimination occurs when an AI system openly or actively discriminates. For example, a lending model might be designed to give applicants from certain regions instant "free" points even before the rest of their data has been considered. The key issue here is that the model blatantly offers more favorable terms to one group at the expense of another. However, overt discrimination doesn't have to be intentional. For example, a loan department might come up with a product that can only be accessed by applicants above a certain age and, in so doing, lock out younger applicants who have nonetheless attained the legally accepted minimum age.
Disparate treatment discrimination is the treatment of members of a protected class differently from members of an unprotected class. In the context of AI lending, it occurs when a model treats an applicant unfairly compared to other applicants based on the applicant's personal characteristics, e.g., race, gender, or sexual orientation. A good example would be an insurance company using an AI model that favors white applicants over black applicants when assessing eligibility for coverage. It should be noted that a system could be statistically sound and yet still be legally discriminatory.
Disparate impact discrimination occurs when an AI system uses a neutral factor to make a decision that affects a protected class more than an unprotected class. For example, a lending model might give a loan to an applicant from a majority-white zip code while turning down a similarly situated applicant from a majority-black zip code. The neutral factor here is the zip code.
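One common heuristic for spotting possible disparate impact is the "four-fifths rule": compare approval rates across groups and flag ratios below 0.8. The sketch below applies it to hypothetical lending decisions grouped by the neutral factor in question; the data and threshold are purely illustrative and are not legal guidance.

```python
# Minimal disparate impact check using the four-fifths heuristic (toy data).
import numpy as np

# Hypothetical lending decisions: 1 = approved, 0 = denied.
approved = np.array([1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0])
group = np.array(["A"] * 8 + ["B"] * 8)  # groups defined by the neutral factor

def disparate_impact_ratio(approved, group, protected, reference):
    """Approval rate of the protected group divided by that of the reference group."""
    rate_protected = approved[group == protected].mean()
    rate_reference = approved[group == reference].mean()
    return rate_protected / rate_reference

ratio = disparate_impact_ratio(approved, group, protected="B", reference="A")
print(f"disparate impact ratio: {ratio:.2f}")
if ratio < 0.8:
    print("potential disparate impact: review the neutral factor driving the decisions")
```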
Input data may lead to AI-related illegal discrimination in several ways.
Traditional data inputs, for example, credit bureau attributes, have a lower probability of causing disparate impact. This is because they are thoroughly vetted prior to approval and publication for use by lenders. On the other hand, non-traditional data, for example, rental payments, may raise more disparate impact concerns as compared to traditional data. Such data usually raises coverage and accuracy concerns.
The complexity and opacity of algorithms may lead to discriminatory outcomes. Algorithms may create interactions between variables and non-linear relationships that are too complex for humans to understand. Such relationships may cause disparate treatment by creating proxies for protected class status. However, some of these concerns have been addressed by the use of AI methods that allow for a comprehensive understanding of such complex relationships.
System misspecification may also lead to discriminatory outcomes. In this case, the prediction features may themselves be independent of protected class status, yet a protected-class effect nonetheless ends up embedded in the prediction.
As an example, assume that a lending model takes a loan applicant’s shopping habits into account, particularly whether they buy goods at a discount store. At first thought, this might look like an objective variable because it can reasonably be viewed as a measure of wealth and, therefore, a predictor of repayment. But if the system goes ahead to capture the store’s location, it may unintentionally capture a race effect because different neighborhoods have different racial makeups. In this scenario, shopping as a variable may serve as a proxy for the neighborhood, which in turn acts as a proxy for race.
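A simple diagnostic for this kind of proxy effect is to test how well the seemingly neutral feature predicts protected-class status on its own. The sketch below does this with a hypothetical data-generating process in which neighborhood drives both the shopping variable and the protected attribute; the variables, simulation, and AUC interpretation are illustrative only.

```python
# Minimal proxy-variable check on hypothetical simulated data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
n = 2000

# Neighborhood influences both protected status and the shopping pattern,
# making the "neutral" shopping flag an indirect proxy.
neighborhood = rng.integers(0, 5, size=n)
protected = (rng.random(n) < neighborhood / 5).astype(int)
shops_at_discount_store = (rng.random(n) < 0.2 + 0.15 * neighborhood).astype(int)

auc = cross_val_score(
    LogisticRegression(),
    shops_at_discount_store.reshape(-1, 1),
    protected,
    cv=5,
    scoring="roc_auc",
).mean()

# An AUC well above 0.5 suggests the feature leaks protected-class information.
print(f"AUC of the shopping variable for predicting protected status: {auc:.2f}")
```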
Unlike the case with traditional linear systems, the same training data may be used to train multiple AI/ML systems. Although the output of such systems is likely to be similar, each model is likely to have a unique logical explanation as to how its output was generated. The presence of different logical explanations for the same outcome can ignite debate and serious discussion among the system's developers and monitoring teams.
Interpretability methods enhance human understanding of the AI/ML system, which helps mitigate risks associated with the use of AI/ML systems.
At the core of most AI/ML systems lie probabilistic tools that help the system make decisions based on the likelihood of events. This means the system may make incorrect decisions, because probabilistic outputs do not determine outcomes. Even if the system shows there's a 50% chance a borrower will default, there's simply no way to predict the actual outcome with certainty. The variables considered when computing the probabilities also play a big role: the more unrealistic such variables are, the more likely the system is to end up with an incorrect decision.
In some more complicated systems, neither the developer nor the user has a clear understanding of the decision made or whether it is right or wrong. Thus, the interpretability of high-impact AI/ML decisions is a huge source of risk. If there are doubts over the correctness of an AI-driven decision that has a major impact on individuals or the organization as a whole, there will be attempts to improve the model or even replace it entirely to mitigate the effects.
Malicious actors could potentially misuse AI/ML. Interpretability is therefore vital in keeping AI/ML systems protected as security practices in the AI/ML world evolve. Red-team or white-hat hacking audits of AI/ML systems may, for example, apply post hoc explanation techniques in simulated attacks against those systems.
Several legal regulations may require the use of interpretable systems, post hoc explanations, and the documentation they facilitate. Such regulations include the Equal Credit Opportunity Act, the Fair Credit Reporting Act, and the EU General Data Protection Regulation, among others.
To mitigate AI risk, there are three main areas of interest: the training data, the learning procedure, and the output predictions. This gives rise to three corresponding risk management approaches: pre-processing, in-processing, and post-processing approaches. Post-processing approaches are suitable for runtime environments since they do not necessarily require access to the training data. Furthermore, post-processing approaches operate in a black-box environment, meaning that they don’t need access to the internals of models. Thus, they can be applied to any machine learning model.
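To illustrate what a post-processing approach can look like, the sketch below treats the trained model as a black box that emits scores and then tunes a group-specific decision threshold so that approval rates are roughly equal across groups. The data, the groups, and the choice of equalized approval rates as the fairness target are all hypothetical simplifications.

```python
# Minimal post-processing sketch: group-specific thresholds on black-box scores.
import numpy as np

rng = np.random.default_rng(5)
n = 5000
group = rng.integers(0, 2, size=n)               # 0 = reference group, 1 = protected group
scores = rng.beta(2, 2, size=n) - 0.08 * group   # protected-group scores skew lower

def approval_rate(scores, threshold):
    return (scores >= threshold).mean()

# Fix the reference group's threshold, then pick the protected group's threshold
# that brings its approval rate closest to the reference rate.
t_ref = 0.5
target_rate = approval_rate(scores[group == 0], t_ref)
candidates = np.linspace(0.0, 1.0, 201)
gaps = [abs(approval_rate(scores[group == 1], t) - target_rate) for t in candidates]
t_prot = candidates[int(np.argmin(gaps))]

print(f"reference group: threshold {t_ref:.2f}, approval rate {target_rate:.2%}")
print(f"protected group: threshold {t_prot:.2f}, approval rate "
      f"{approval_rate(scores[group == 1], t_prot):.2%}")
```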
Oversight of an AI system, coupled with intensive monitoring to validate various aspects of the system, helps ensure its accuracy and efficiency. An oversight process can begin with the creation of an inventory of all the AI systems at a given institution, together with the uses of each system, the techniques employed, the developers' names, and risk ratings. The evaluation includes assessing the inputs, the outputs, and the AI system itself. Assessing training data is important for ensuring data quality as well as identifying potential biases that the data may contain. To evaluate the AI system itself, it is benchmarked against alternative models, and known methods are utilized to ensure the interpretability of the model.
Drift may result in a number of errors and risks in AI systems. Detection of drift can help mitigate some AI-based risks. Monitoring helps to provide insight into the ‘‘accuracy drift’’ of the model by estimating the accuracy of the model. Monitoring also helps to provide insight into the “data drift” by checking for the deviation of the input data from the training data.
Accuracy drift signals that the model's performance is deteriorating, while monitoring data drift helps businesses understand how the characteristics of the data are changing at runtime.
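One common way to quantify data drift is the population stability index (PSI), which compares the distribution of a feature at runtime with its distribution in the training data. The sketch below is a minimal version with hypothetical income data; the bin count and the 0.25 alert level are conventional rules of thumb rather than fixed standards.

```python
# Minimal data drift check using the population stability index (PSI).
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """PSI between a training-time sample (expected) and a runtime sample (actual)."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    e_counts = np.histogram(np.clip(expected, edges[0], edges[-1]), bins=edges)[0]
    a_counts = np.histogram(np.clip(actual, edges[0], edges[-1]), bins=edges)[0]
    e_frac = np.clip(e_counts / len(expected), 1e-6, None)  # avoid log(0)
    a_frac = np.clip(a_counts / len(actual), 1e-6, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))

rng = np.random.default_rng(6)
training_income = rng.lognormal(mean=10.5, sigma=0.4, size=10_000)
runtime_income = rng.lognormal(mean=10.7, sigma=0.5, size=2_000)  # drifted population

psi = population_stability_index(training_income, runtime_income)
print(f"PSI = {psi:.3f}")
if psi > 0.25:  # a commonly cited alert level
    print("significant data drift: investigate before relying on the model's output")
```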
In most lending organizations, the review of input variables and systems for evidence of discrimination is done by compliance, fair lending, and system governance teams. Technological advances can enable the automation of most of these tasks. Nevertheless, a human-centric approach may be required for fair AI. An automated process cannot fully substitute for the experience and knowledge of a well-informed team reviewing the AI system for discriminatory bias. Therefore, it is important that the first line of defense against discrimination in AI include some manual review.
A number of recently researched algorithms have been shown to reduce discrepancies between protected-class and control groups while maintaining the predictive quality of the system. To reduce these discrepancies, the mitigation algorithms search for the "optimal" system given a measure of predictive quality and a measure of discrimination.
Variables that cause discrepancies are excluded from the systems, and other tested variables are used in their place. However, these methods have been shown to have low rates of success in complex AI/ML systems.
More recently developed approaches for minimizing discrimination involve pre-processing the data, intervening within the algorithm itself, and post-processing the output.
One of the major challenges to many institutions is ensuring that the explanations of AI/ML are reliable and useful. Rough estimates and inaccurate or inconsistent explanations in financial service institutions, for example, raise special concerns, especially for credit lending decisions.
To reduce explainability-based risks, institutions may test the explanatory techniques they use for accuracy and stability on simulated data.
Malicious actors could use available AI system explanations and predictions to attack an organization. Organizations can mitigate such potential risks by sharing only the required information. For instance, an institution may share only the information that a given consumer requires. Similarly, a firm should disclose only the information whose dissemination is legally required.
Strong traditional technology and cyber controls can be used for effective AI-based risk mitigation. The use of strong information security practices and watermarking can help mitigate model extraction attacks. Watermarking involves training the AI/ML system to produce a unique output for a given input.
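The sketch below illustrates the watermark verification step: the owner keeps a small set of secret "trigger" inputs whose unusual outputs were trained into the model and checks whether a suspect model reproduces them. The triggers, the stand-in models, and the 90% match threshold are hypothetical.

```python
# Minimal watermark verification sketch (hypothetical triggers and models).
import numpy as np

rng = np.random.default_rng(7)

# Secret trigger inputs and the unusual labels the owner trained into the model.
trigger_inputs = rng.normal(size=(10, 6))
watermark_labels = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1])

def watermark_match_rate(predict_fn):
    """Fraction of trigger inputs on which a model reproduces the watermark."""
    return (predict_fn(trigger_inputs) == watermark_labels).mean()

# Stand-ins: a stolen copy reproduces the watermark; an unrelated model does not.
stolen_model = lambda X: watermark_labels
independent_model = lambda X: rng.integers(0, 2, size=len(X))

for name, model in [("suspect A", stolen_model), ("suspect B", independent_model)]:
    rate = watermark_match_rate(model)
    verdict = "likely derived from our model" if rate > 0.9 else "no evidence of theft"
    print(f"{name}: watermark match rate {rate:.0%} -> {verdict}")
```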
Practice Question
A financial firm has deployed a sophisticated AI system for credit scoring. Recently, the firm’s IT department detected anomalous system behaviors. The anomaly report showed that certain borrowers’ credit scores had been unusually boosted during the model’s retraining phase. On further investigation, it was found that the scores were manipulated through the subtle addition of incorrect labels to the training data. The IT department suspects an orchestrated attack to distort the system’s learning process. Which of the following potential AI/ML attacks is the firm most likely experiencing?
A. Membership inference attack
B. Model inversion attack
C. Training data poisoning attack
D. Adversarial inputs attack
The correct answer is C.
A training data poisoning attack typically involves the contamination of the AI/ML system's training data in a way that negatively influences its learning process or output. In the given scenario, incorrect labels were subtly added to the training data during the model's retraining phase. This is a clear indication of a training data poisoning attack aimed at manipulating the system's learning process and consequently altering the borrowers' credit scores.
Option A is incorrect because a membership inference attack focuses on an attacker’s attempt to ascertain whether a specific record was included in the training data set used for the AI system. In the scenario provided, there’s no indication that such an inference is being made; the problem revolves around the manipulation of training data, not the disclosure of whether certain records were included in the training set.
Option B is incorrect because a model inversion attack involves an attacker extracting specific information about the training data directly from the model. While this type of attack also concerns the training data, the scenario described does not involve an extraction of data from the model, but rather a manipulation of the labels within the training data to distort the AI system’s learning process.
Option D is incorrect because an adversarial inputs attack typically involves the intentional provision of malicious inputs designed to bypass the AI system’s classification or decision-making mechanisms. In the presented scenario, the problem is not related to the input data being used by the AI system after its training but rather pertains to the alteration of the training data itself.