# Data Types

This reading will teach you the tools and techniques used to organize, visualize, and describe data. In addition, you will learn how to convert data into useful information that analysts can use to make important investment decisions.

## Data Types

Data refer to a collection of facts such as numbers, characters, measurements, observations, audio recordings, and videos. Data can be either raw or formatted, and they can be classified in the following ways:

1. Numerical versus categorical data.
2. Cross-sectional versus time-series versus panel data.
3. Structured versus unstructured data.

### Numerical versus Categorical Data

Numerical data represent values that can be measured or counted. Examples of numerical data include age, height, weight, or the number of shares in a portfolio.

Numerical (quantitative) data can be classified into two types: continuous and discrete data.

• Continuous data: Data that can be measured on an infinite scale. Such data can take any numerical value within a specified range, no matter how finely that range is divided. Examples include temperature (25.5 degrees Celsius), height (1.81 meters), and length (25.256 meters).
• Discrete data: Data that can be counted and that take only a finite or countable set of possible values. Examples include the days in a week (7 days) and the number of employees in a company (a company can have 3 employees, but not 3.5).

Categorical data (also called qualitative data) describe a quality or characteristic of a group of observations. The categories are mutually exclusive, meaning that each observation fits into only one category. Examples of categorical data include investment style (e.g., growth vs. value stocks) and bond credit quality (e.g., investment-grade vs. junk bonds).

Categorical data are split into the following two types:

• Nominal data: Categorical values that have no inherent numerical significance since they do not rank data. A good example would be gender representation, e.g., 1 to represent ‘male’ and 2 to represent ‘female.’
• Ordinal data: Categorical values that rank data according to some characteristic, so that each category has an ordered relationship to all the other categories. Although ordinal scales rank data, the magnitude of the difference between categories cannot be quantified or measured. A good example is a rating scale from 1 to 3, where 1 represents ok, 2 represents good, and 3 represents excellent: a 3 is better than a 2, but the scale does not say by how much.
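The distinction between nominal and ordinal data can be sketched in code. The following is a minimal illustration, not a standard recipe: the class names `InvestmentStyle` and `Rating` and the sample lists are hypothetical, invented for this example. It shows that nominal values can only be counted (so the mode is meaningful), while ordinal values can also be ranked (so sorting and the median are meaningful).

```python
from enum import Enum, IntEnum
from statistics import median, mode


class InvestmentStyle(Enum):
    """Nominal: labels only, with no ranking between members."""
    GROWTH = "growth"
    VALUE = "value"


class Rating(IntEnum):
    """Ordinal: the codes 1 < 2 < 3 encode rank, not a measurable distance."""
    OK = 1
    GOOD = 2
    EXCELLENT = 3


# Hypothetical sample data, invented for illustration.
styles = [InvestmentStyle.GROWTH, InvestmentStyle.VALUE, InvestmentStyle.GROWTH]
ratings = [Rating.GOOD, Rating.OK, Rating.EXCELLENT, Rating.GOOD, Rating.EXCELLENT]

# Nominal data support counting, so the most frequent category is meaningful.
most_common_style = mode(styles)

# Nominal categories have no order: a plain Enum rejects "<" comparisons.
try:
    InvestmentStyle.GROWTH < InvestmentStyle.VALUE
    nominal_is_orderable = True
except TypeError:
    nominal_is_orderable = False

# Ordinal data also support ranking, so sorting and the median are meaningful.
sorted_ratings = sorted(ratings)
median_rating = Rating(median(ratings))

print(most_common_style.name, nominal_is_orderable, median_rating.name)
```

Note that averaging the rating codes would treat the gap between ok and good as equal to the gap between good and excellent, which an ordinal scale does not guarantee.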

### Cross-sectional versus Time-series versus Panel Data

Data can be classified into cross-sectional, time-series, and panel data depending on the data collection method employed.

• Cross-sectional data: Refer to a set of observations made at a single point in time. Samples are constructed by simultaneously collecting the data of interest across a range of observational units, such as people, objects, or firms. A good example of cross-sectional data is the stock returns that Microsoft, IBM, and Samsung shareholders earned in the year ended 31st December 2021.
• Time-series data: Refer to a set of observations made over a given period at specific, equally spaced time intervals. Because the observations are recorded at specific points in time, the time index is discrete. A good example of time-series data is the daily or weekly closing price of a stock recorded over a period spanning 13 weeks. Another example is the set of monthly profits (both positive and negative) Samsung earned between the 1st of October 2021 and the 1st of December 2021.
• Panel data: Comprise a combination of time-series data and cross-sectional data. An example of panel data is the GDP of three developing countries observed over a period spanning three years, from 2019 to 2021.
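The relationship between the three types can be sketched with a small panel keyed by (unit, period). This is only an illustration under stated assumptions: the country names and GDP figures are invented, and the helper functions `cross_section` and `time_series` are hypothetical names, not part of any library. Fixing the period yields a cross-section; fixing the unit yields a time series.

```python
# Hypothetical GDP figures (in billions); names and values are invented.
panel = {
    ("Country A", 2019): 100.0,
    ("Country A", 2020): 96.0,
    ("Country A", 2021): 103.0,
    ("Country B", 2019): 250.0,
    ("Country B", 2020): 245.0,
    ("Country B", 2021): 260.0,
    ("Country C", 2019): 80.0,
    ("Country C", 2020): 78.0,
    ("Country C", 2021): 85.0,
}


def cross_section(data, year):
    """All units observed in one period."""
    return {unit: value for (unit, y), value in data.items() if y == year}


def time_series(data, unit):
    """One unit observed across periods, in chronological order."""
    return [(y, value) for (u, y), value in sorted(data.items()) if u == unit]


print(cross_section(panel, 2020))
print(time_series(panel, "Country B"))
```

Three units observed over three periods give nine observations in total, which is why panel data are often laid out as a table with one row per unit and one column per period.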

A variable is a characteristic or quantity that can be measured, counted, or categorized and is subject to change. A variable can also be called a field, an attribute, or a feature. For example, stock price, market capitalization, dividend and dividend yield, earnings per share (EPS), and price-to-earnings ratio (P/E) are basic data variables for the financial analysis of a public company.

An observation is the value of a specific variable collected at a point in time or over a specified time period. For example, last year DEF Inc. recorded an EPS (earnings per share) of \$7.50 — this could be our first observation related to the EPS variable. That value represented a 15% annual increase — this could be our second observation.
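The DEF Inc. example can be written out as two observations and used for a small piece of arithmetic. This is a sketch under assumptions: the `Observation` class and the period label are hypothetical, and the 15% figure is read as year-over-year EPS growth, so the prior-year EPS is implied by dividing the current EPS by one plus the growth rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    variable: str  # the characteristic being measured, e.g. "EPS"
    period: str    # when it was measured (placeholder label)
    value: float


# Values from the DEF Inc. example; the period label is a placeholder.
eps = Observation(variable="EPS", period="last year", value=7.50)
eps_growth = Observation(variable="EPS growth", period="last year", value=0.15)

# Assumption: growth = current / prior - 1, so prior = current / (1 + growth).
implied_prior_eps = eps.value / (1 + eps_growth.value)

print(round(implied_prior_eps, 2))  # 6.52
```

Under that reading, an EPS of \$7.50 after a 15% increase implies a prior-year EPS of roughly \$6.52.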

### Structured versus Unstructured Data

Structured data are organized in a pre-defined manner (e.g., rows and columns), with defined relationships between the rows and columns. Because they are highly organized and formatted, they are easy to access, process, and store, and they work with most standard analytical models. Typical examples of structured company financial data include:

• Market data: For example, daily bond yields, daily closing stock prices, bond prices, and trading volumes.
• Fundamental data: Data available in the financial statements, e.g., earnings per share, dividend per share, and gross margins.
• Analytical data: Data derived from analytics, e.g., forecast operating profit growth and forecast cash flow from operations.

Unstructured data are not organized in any pre-defined manner. They may contain text, numbers, dates, and other content. Examples of unstructured data include financial news, posts on social media, company filings with a regulator, and audio or video recordings. Because they lack a regular format, unstructured data are difficult to process and analyze directly. Unstructured data are usually collected from unconventional sources and can be classified by source into the following three groups:

1. Produced by individuals (e.g., social media posts and web searches).
2. Generated by business processes (e.g., credit card transactions and corporate regulatory filings).
3. Generated by sensors (e.g., satellite imagery and foot traffic measured by mobile devices).
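The practical difference between the two can be sketched as follows. All values here are invented for illustration, and the regular expression is just one hypothetical way to pull a number out of free text. Structured records can be aggregated directly by field name, whereas an unstructured headline must first be parsed before the same figure becomes usable.

```python
import re
from statistics import mean

# Structured: every record has the same pre-defined fields (hypothetical values).
closing_prices = [
    {"date": "2021-10-01", "ticker": "DEF", "close": 50.0},
    {"date": "2021-10-04", "ticker": "DEF", "close": 51.0},
    {"date": "2021-10-05", "ticker": "DEF", "close": 52.0},
]
average_close = mean(row["close"] for row in closing_prices)

# Unstructured: free text (an invented headline); the figure must be located first.
headline = "DEF Inc. reports EPS of $7.50, up 15% from a year earlier"
match = re.search(r"EPS of \$(\d+(?:\.\d+)?)", headline)
eps = float(match.group(1)) if match else None

print(average_close, eps)
```

The pattern only works for headlines phrased in exactly this way, which illustrates why unstructured data generally need more preparation than structured data.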

#### Example: Data Types

Identify the data type for each of the following items:

1. Microsoft, IBM, and Samsung stock returns shareholders earned for the year ended 31st December 2020.
2. Price change of a stock.
3. The number of students in a class.
4. Color of a smartphone.
5. Grades of a student in a quiz.

Solution

1. Cross-sectional data: Microsoft, IBM, and Samsung stock returns shareholders earned for the year ended 31st December 2020.
2. Continuous data: Price change of a stock.
3. Discrete data: Number of students in a class.
4. Nominal data: Color of a smartphone.
5. Ordinal data: Grades of a student in a quiz.

## Question

Which of the following is most likely panel data?

1. Yearly remittances of five countries from Asia for the past 10 years.
2. Customers’ online comments regarding the quality of a product of a company.
3. Monthly profits a company earned from 1st of July 2019 to 30th of June 2020.

Solution

Remember that panel data are a mix of time-series and cross-sectional data. Panel data consist of observations through time on one or more variable(s) for multiple observational units. The observations in panel data are usually organized in a matrix format called a data table.

Option 1 is correct. Yearly remittances of five countries from Asia for the past 10 years are panel data, since they cover multiple observational units (five countries) over multiple periods (10 years).

Option 2 is incorrect. Customers’ online comments regarding the quality of a product of a company are an example of unstructured data.

Option 3 is incorrect. Monthly profits a company earned from 1st of July 2019 to 30th of June 2020 are an example of time-series data.
