Information Risk and Data Quality Management

After completing this reading you should be able to:

  • Identify the most common issues that result in data errors.
  • Explain how a firm can set expectations for its data quality and describe some key dimensions of data quality used in this process.
  • Describe the operational data governance process, including the use of scorecards in managing information risk.

Firms today rely heavily on information to run their operations and meet business objectives; this dependency introduces risks that threaten the achievement of the organization’s goals. For this reason, every organization needs a process that measures, reports, reacts to, and controls the risk of poor data quality.

Most Common Issues that Result in Data Errors

Missing Data

The problem of missing data arises when there is no data value for an observation of interest. Missing data can have a significant effect on the conclusions eventually drawn from the data. The following are examples of missing data in the context of a bank:

  • A client list without the “address” field; or
  • A customer profile without details on the official residence.
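As a rough illustration, the sketch below (in Python, using pandas and a hypothetical client table) flags records whose mandatory “address” field is missing:

```python
import pandas as pd

# Hypothetical client list; "address" is treated as a mandatory field.
clients = pd.DataFrame({
    "client_id": [101, 102, 103],
    "name": ["A. Wong", "B. Osei", "C. Ruiz"],
    "address": ["12 Main St", None, "9 Bank Rd"],
})

# Count missing values per column, then isolate the incomplete records.
missing_per_column = clients.isna().sum()
incomplete_records = clients[clients["address"].isna()]

print(missing_per_column)
print(incomplete_records)
```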

Data Capturing Errors

This occurs when information is input in the wrong way. Data capturing errors can occur when transcribing words and also when recording numerical data. For example, a data entry operator may use a value of 0.02 to represent the portfolio standard deviation in a VaR model when the actual deviation is in fact 0.2 (20%).
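To see why this matters, here is a minimal sketch of a simple parametric VaR calculation; the z-score, portfolio value, and deviations below are assumed figures for illustration only. The mis-keyed standard deviation understates the risk measure tenfold:

```python
# Simple parametric VaR: z-score x standard deviation x portfolio value.
Z_99 = 2.33                    # approximate one-tailed z-score at 99% confidence
PORTFOLIO_VALUE = 50_000_000   # assumed portfolio value

def parametric_var(sigma: float) -> float:
    return Z_99 * sigma * PORTFOLIO_VALUE

var_keyed_in = parametric_var(0.02)  # mis-keyed standard deviation
var_actual = parametric_var(0.20)    # true standard deviation (20%)

print(f"VaR with keyed-in sigma: {var_keyed_in:,.0f}")  # 2,330,000
print(f"VaR with actual sigma:   {var_actual:,.0f}")    # 23,300,000
```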

Duplicate Records

Duplicate records are multiple copies of the same record. The most common form of duplicate data is a complete carbon copy of another record. For example, the details of a particular bondholder may appear twice on the list of all bondholders. Such a duplicate is easy to spot and usually comes up while moving data between systems.

The biggest challenge comes in the form of a partial duplicate. In the context of a bank, a list of bondholders may contain records that have the same name, phone number, email, or residential address, but which also have other non-matching, probably correct data, such as the principal amounts subscribed for. Duplicate data may produce skewed or inaccurate insights.
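A minimal sketch of how both kinds of duplicates might be flagged, assuming a hypothetical bondholder table in pandas:

```python
import pandas as pd

# Hypothetical bondholder list with one exact duplicate and one partial duplicate.
bondholders = pd.DataFrame({
    "name":      ["J. Smith", "J. Smith", "J. Smith"],
    "email":     ["js@mail.com", "js@mail.com", "js@mail.com"],
    "principal": [100_000, 100_000, 250_000],
})

# Exact duplicates: every field matches another record.
exact_dupes = bondholders[bondholders.duplicated(keep=False)]

# Partial duplicates: identifying fields match, but other values differ.
partial_dupes = bondholders[
    bondholders.duplicated(subset=["name", "email"], keep=False)
]

print(exact_dupes)
print(partial_dupes)
```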

Inconsistent Data

Data inconsistency occurs when there are different and conflicting versions of the same data at different places. On a list of prospective borrowers, for example, there may be an individual with two distinctive annual income amounts. In such a situation, the risk analyst would have a hard time trying to gauge the borrower’s ability to pay.

Poorly Defined Data

This occurs when data is placed in the wrong category. For example, a company account may be filed under an individual person’s contact details.

Transformation Errors

Errors may be made when converting data from one format to another. For example, when converting monthly incomes into annual amounts, some monthly figures may be left unconverted.
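As a rough illustration, the sketch below (with hypothetical income records) converts monthly figures to annual ones and then checks that nothing was left unconverted:

```python
# Hypothetical income records; "period" indicates whether the figure is
# monthly or already annual.
incomes = [
    {"client_id": 1, "income": 4_000, "period": "monthly"},
    {"client_id": 2, "income": 60_000, "period": "annual"},
]

def to_annual(record: dict) -> dict:
    """Convert a monthly income figure to an annual one; leave annual figures as-is."""
    if record["period"] == "monthly":
        return {**record, "income": record["income"] * 12, "period": "annual"}
    return record

converted = [to_annual(r) for r in incomes]

# Post-conversion check: every record should now be expressed on an annual basis.
assert all(r["period"] == "annual" for r in converted), "unconverted rows remain"
print(converted)
```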

Incorrect or Misleading Metadata

Metadata means “data about data” or data that describes other data. A depositor’s metadata, for example, may contain information about when the account was opened, the last date when the account was analyzed for money laundering, and the number of deposit transactions. Errors may be made when entering such data.

Business Impacts of Poor Data Quality

Every successful business operation is built upon high-quality data. Flawed data either delays or obstructs the successful completion of business activities. Since the determination of the specific impacts linked to different data issues is a challenging process, it is important to put these impacts into categories. To this end, we define six primary categories for assessing both the negative impacts incurred as a result of a flaw, and the potential opportunities for improvement that may result from improved data quality:

  • Financial impacts, such as increased operating costs, decreased revenues, or increased penalties.
  • Risk impacts with respect to issues such as credit assessment, competitiveness, fraud, and leakage.
  • Confidence-based impacts, such as waning organizational trust, low confidence in forecast results, and delayed or improper decisions.
  • Satisfaction impacts on customers, employees, suppliers, and the general market.
  • Compliance impacts with respect to government regulations, industry expectations, or firm-specific policies.
  • Productivity impacts such as increased workloads, increased processing time, or compromised end-product quality.

A Note on Banks

As high-risk, high-impact institutions, banks are sensitive to more than just financial impacts. Compromised data may put a bank on a collision course with regulations such as:

  • Bank Secrecy Act
  • USA Patriot Act
  • Sarbanes-Oxley Act
  • Basel II Accord

Bank Secrecy Act

Also known as the Currency and Foreign Transactions Reporting Act, the Bank Secrecy Act (BSA) is U.S. legislation passed in 1970. It is aimed at preventing financial institutions from being used as tools by criminals to conceal or launder their ill-gotten wealth. The Act establishes recordkeeping and reporting requirements for individuals and financial institutions. In particular, banks must document and report transactions involving more than $10,000 in cash from one customer as a result of a single transaction. As such, any data entry mistakes made while recording transaction details can put banks on a collision course with the authorities.

Patriot Act

The Patriot Act was developed in response to the September 11 attacks. It is intended to help the U.S. government intercept and obstruct terrorism. The Act requires banks to perform due diligence with regard to accounts established for foreign financial institutions and private banking accounts established for non-U.S. persons. In addition, the Act encourages financial institutions to share information with regulators and law enforcement whenever they encounter suspicious transactions or individuals.

Sarbanes-Oxley Act

The Sarbanes-Oxley Act (or SOX Act) is a U.S. federal law that aims to protect investors by ensuring that any data released to the public is reliable and accurate. The Act was passed in the aftermath of major accounting scandals, such as Enron and WorldCom, that were marked by the release of incorrect data to investors and inflated stock prices. Section 302 requires the principal executive officer and the principal financial officer to certify the accuracy and correctness of financial reports.

According to the Act, financial reports and statements must be:

  • Reviewed by signing officers and must have passed internal controls within the last 90 days;
  • Free of untrue statements or misleading omissions;
  • A representative of the company’s true financial health and position; and
  • Accompanied by a list of all deficiencies or changes in internal controls and information on any fraud involving company employees.

Basel II Accord

The Basel II Accord is a set of requirements and recommendations for banks issued by the Basel Committee on Banking Supervision. It guides the quantification of operational and credit risk as a way to determine the amount of capital needed to guard against those risks. For this reason, banks must ensure that all risk metrics and models are based on high-quality data.

Data Quality Expectations

In order to manage the risks associated with the use of flawed data, it is important to articulate expectations among business users with respect to data quality. These expectations are defined in terms of “data quality dimensions” that can be quantified, measured, and reported.

The academic literature on data quality puts forth many different dimensions of data quality. For financial institutions, the initial development of a data quality scorecard can rely on six dimensions: accuracy, completeness, consistency, reasonableness, currency, and uniqueness.

Accuracy

As the name implies, accuracy is all about correct information. When screening data for accuracy, we must ask ourselves one simple question:  does the information reflect the real-world situation? In the realm of banking, for example, does a depositor really have $10 million in their account?

Inaccurate information may hinder the bank in tasks such as:

  • Calculating the required economic capital;
  • Determining the actual amount of insurable deposits; and
  • Monitoring accounts for unauthorized access.

Completeness

Completeness refers to the extent to which the expected attributes of data are provided. When looking at data completeness, we seek to establish whether all of the data needed is available. For example, it may be mandatory to have every client’s primary phone number, while their middle name may be optional.

Completeness is important since it affects the usability of data. When a customer’s email address and phone number are unavailable, for example, it may not be possible to contact and notify them of suspicious account activity.

Reasonableness

Reasonableness measures conformance to consistency expectations relevant to specific operational contexts. For example, the cost of sales on a given day is not expected to exceed 105% of the running average cost of sales for the previous 30 days.
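A minimal sketch of such a reasonableness rule, assuming a hypothetical daily cost-of-sales series in pandas:

```python
import pandas as pd

# Hypothetical daily cost-of-sales series; the last day spikes above the threshold.
cost_of_sales = pd.Series([100.0] * 30 + [112.0])

# Running average of the previous 30 days (excluding the current day).
trailing_avg = cost_of_sales.shift(1).rolling(window=30).mean()

# Flag days that exceed 105% of the trailing 30-day average.
unreasonable = cost_of_sales > 1.05 * trailing_avg
print(cost_of_sales[unreasonable])  # the 112.0 observation is flagged
```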

Consistency

Data is considered consistent if the information stored in one place matches the relevant data stored elsewhere. In other words, values in one data set must be reasonably comparable to those in another data set. For example, if records at the human resources department show that an employee has already left the company but the payroll department still sends out a check to that individual, the two data sets are inconsistent.
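A minimal sketch of this kind of cross-system check, using hypothetical employee identifiers:

```python
# Hypothetical extracts from two systems that should agree with each other.
hr_active_employees = {"E001", "E002"}            # active per the HR system
payroll_this_cycle = {"E001", "E002", "E003"}     # paid per the payroll system

# Inconsistency: employees being paid who are no longer active in HR.
inconsistent_ids = payroll_this_cycle - hr_active_employees
print(inconsistent_ids)  # {'E003'}
```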

Currency

The currency of data refers to the lifespan of data: is the data still relevant and useful in the present circumstances? To assess the currency of data, the organization has to establish how frequently the data needs to be updated.

Uniqueness

Uniqueness implies that no entity should exist more than once within the data set and that there should be a key that can be used to uniquely access each entity within the data set. For example, ‘Andrew A. Peterson’ and ‘Andrew Peterson’ may well be one and the same person.
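A minimal sketch of a crude uniqueness check, using hypothetical account records and a simple name-normalization rule (real matching logic would be far more sophisticated):

```python
import re

# Hypothetical customer records keyed by account number.
customers = [
    {"account": "AC-1001", "name": "Andrew A. Peterson"},
    {"account": "AC-1002", "name": "Andrew Peterson"},
]

def normalize(name: str) -> str:
    """Crude normalization: lowercase, drop single-letter initials and punctuation."""
    name = re.sub(r"\b[a-z]\.\s*", "", name.lower())
    return re.sub(r"[^a-z ]", "", name).strip()

# Group accounts whose names normalize to the same value for manual review.
groups: dict = {}
for record in customers:
    groups.setdefault(normalize(record["name"]), []).append(record["account"])

possible_duplicates = {name: accts for name, accts in groups.items() if len(accts) > 1}
print(possible_duplicates)  # {'andrew peterson': ['AC-1001', 'AC-1002']}
```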

What is Operational Data Governance?

Operational data governance refers to the processes and protocols put in place to ensure that an acceptable level of confidence in the data effectively satisfies the organization’s business needs. The data governance program defines the roles and responsibilities associated with managing data quality. Once a data error is identified, corrective action is taken immediately to avoid or minimize downstream impacts.

Corrective action usually entails notifying the right individuals to address the issue and determining whether the issue can be resolved within an agreed-upon time frame. The data is then inspected to ensure that it complies with data quality rules. Service-level agreements (SLAs) specify the reasonable expectations for response and remediation.

Data Quality Inspection vs. Data Validation

Data validation is a one-step exercise carried out to establish whether the data complies with a set of defined business rules. Data inspection, on the other hand, is a continuous process aimed at:

  • Reducing the number of errors to a reasonable and manageable level;
  • Identifying data flaws and making adjustments to pave the way for the completion of data processing; and
  • Setting in motion a mitigation or remediation program to solve the problem within an agreed-to time frame.

As such, the goal of data inspection is not to keep data issues at zero. Rather, the exercise is aimed at catching issues early, before they cause significant damage to the business. An operational data governance program should include the following roles:

  • Data Governance Team: A data governance manager leads a program office that includes data architects and governance specialists.
  • Data Stewards: Oversee data sets and are in charge of the implementation of policies and monitoring compliance.
  • Data Quality Analyst: Works with the governance team and stewards to fix data errors and monitor data quality metrics.
  • Data Governance Council: Made up of executives from all business units. It sets data policies and standards and resolves issues.
  • Chief Data Officer: Bears overall responsibility for the firm’s data governance program.

Data Quality Scorecard

A data quality scorecard is a fundamental tool that enables a company to perform successful data quality management. It measures the quality of data against pre-defined business rules over a given period of time. A scorecard can point out the rule or entry behind a weak or strong data quality score, enabling the establishment of targeted measures that optimize data quality.

Sample Data Quality Scorecard

Types of Data Quality Scorecards

Basic-level metric

A basic-level metric is measured against clear data quality criteria (e.g., accuracy). It quantifies observance of acceptable levels of a defined data quality rule and is relatively easy to compute.

Complex metric

This is a weighted average of several metrics. Various weights are applied to a collection of different metrics, both base-level and complex ones.
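A minimal sketch of a complex metric computed as a weighted average of base-level scores; the dimensions, scores, and weights below are assumed for illustration:

```python
# Hypothetical base-level scores (0-100) for one data set, one per dimension.
base_scores = {"accuracy": 92.0, "completeness": 85.0, "consistency": 78.0}

# Assumed weights reflecting the relative business importance of each dimension.
weights = {"accuracy": 0.5, "completeness": 0.3, "consistency": 0.2}

def complex_metric(scores: dict, w: dict) -> float:
    """Weighted average of base-level (or other complex) metrics."""
    return sum(scores[dim] * w[dim] for dim in scores) / sum(w.values())

print(round(complex_metric(base_scores, weights), 1))  # 87.1
```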

Complex data quality metrics can be accumulated in the following three ways:

  1. By issue: Evaluating the impact of a specific data quality issue across multiple processes demonstrates the spread of trouble caused by a single data flaw across the organization;
  2. By business process: In this view, managers can examine the risks and failures that prevent a business process from being completed; or
  3. By business impact: Since an impact may be incurred as a result of a number of different data quality issues, this view displays the aggregation of business impacts rolled up from the different issues across different process flows.

Practice Question

Which of the following is NOT a factor to be considered by Higher-North Bank when setting data quality expectations?

A. Accuracy

B. Completeness

C. Consistency

D. Variability

The correct answer is D.

Accuracy, completeness, and consistency are all dimensions used to set data quality expectations. Variability is not among the dimensions discussed in this reading.
