This reading will teach you the tools and techniques used to organize, visualize, and describe data. In addition, you will learn how to convert data into useful information that analysts can use to make important investment decisions.
Data refer to a collection of facts such as numbers, characters, measurements, observations, audio recordings, and videos. Data can either be raw or formatted. Further, data can be classified as follows:
Numerical data represent values that can be measured or counted. Examples of numerical data include age, height, weight, or the number of shares in a portfolio.
Numerical (quantitative) data can be classified into two types: continuous and discrete data.
Categorical data (also called qualitative data) describe a quality or characteristic of a group of observations. The categories are mutually exclusive, meaning that each observation fits into exactly one category. Examples of categorical data include investment style (e.g., growth vs. value stocks) and bond credit quality (e.g., investment-grade vs. junk bonds).
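The difference between the two kinds of data can be sketched in a few lines of Python. This is only an illustration: the tickers, share counts, and the `InvestmentStyle` name are hypothetical and do not come from the reading.

```python
from collections import Counter
from enum import Enum


class InvestmentStyle(Enum):
    """Categorical variable: labels with no numeric meaning."""
    GROWTH = "growth"
    VALUE = "value"


# Hypothetical tickers, for illustration only.
# A dict maps each stock to exactly one style, so the categories
# are mutually exclusive by construction.
styles = {
    "STOCK_A": InvestmentStyle.GROWTH,
    "STOCK_B": InvestmentStyle.VALUE,
    "STOCK_C": InvestmentStyle.GROWTH,
}

# Numerical variable: number of shares held (hypothetical counts).
shares_held = {"STOCK_A": 100, "STOCK_B": 250, "STOCK_C": 50}

# Categorical data can be counted per category ...
style_counts = Counter(styles.values())

# ... while numerical data support arithmetic such as sums.
total_shares = sum(shares_held.values())

print(style_counts[InvestmentStyle.GROWTH], total_shares)
```

Counting observations per category is meaningful for the style labels, whereas adding or averaging the labels themselves would not be.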
Categorical data are split into two types: nominal and ordinal data.
Data can be classified into cross-sectional, time-series, and panel data depending on the data collection method employed.
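As a rough sketch of this classification, the function below labels a dataset by looking at how many observational units and how many time periods it covers. It assumes the standard definitions: cross-sectional data cover several units at one point in time, time-series data cover one unit over several periods, and panel data cover several units over several periods. The function name `classify_layout` and the company labels are hypothetical.

```python
from typing import Iterable, Tuple


def classify_layout(records: Iterable[Tuple[str, str]]) -> str:
    """Classify a dataset from its (unit, period) pairs.

    unit   -- the observational unit, e.g. a company or a country
    period -- the time label of the observation, e.g. a year
    """
    records = list(records)
    if not records:
        raise ValueError("no observations to classify")
    units = {unit for unit, _ in records}
    periods = {period for _, period in records}
    if len(units) > 1 and len(periods) > 1:
        return "panel"
    if len(periods) > 1:
        return "time-series"
    if len(units) > 1:
        return "cross-sectional"
    return "single observation"


# Hypothetical examples of each layout.
cross_section = [("Company A", "2020"), ("Company B", "2020"), ("Company C", "2020")]
time_series = [("Company A", "2018"), ("Company A", "2019"), ("Company A", "2020")]
panel = [(u, p) for u in ("Company A", "Company B") for p in ("2019", "2020")]

for data in (cross_section, time_series, panel):
    print(classify_layout(data))
```

The check only inspects the unit and period labels, not the values, because the three layouts differ in how observations are arranged rather than in what is measured.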
A variable is a characteristic or quantity that can be measured, counted, or categorized and is subject to change. A variable can also be called a field, an attribute, or a feature. For example, stock price, market capitalization, dividend and dividend yield, earnings per share (EPS), and price-to-earnings ratio (P/E) are basic data variables for the financial analysis of a public company.
An observation is the value of a specific variable collected at a point in time or over a specified time period. For example, last year DEF Inc. recorded an EPS of $7.50; this could be our first observation of the EPS variable. That value represented a 15% annual increase, which could be our second observation, this time of the EPS growth variable.
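One way to picture the relationship between variables and observations is a small record type. The sketch below is an assumption for illustration: the `Observation` class and its field names are hypothetical, and only the figures ($7.50 and 15%) come from the DEF Inc. example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One recorded value of one variable for one unit and period."""
    variable: str   # what is measured, e.g. "EPS"
    unit: str       # who or what it is measured for
    period: str     # when it was measured
    value: float    # the observed value


# First observation: the EPS variable.
eps = Observation("EPS", "DEF Inc.", "last year", 7.50)

# Second observation: the annual EPS growth variable (15% as a decimal).
eps_growth = Observation("EPS annual growth", "DEF Inc.", "last year", 0.15)

# The two observations together imply the prior year's EPS:
# 7.50 / (1 + 0.15), which is roughly 6.52.
implied_prior_eps = eps.value / (1 + eps_growth.value)
print(round(implied_prior_eps, 2))
```

The variable is the column heading (EPS, EPS growth); each observation is a single filled-in cell under that heading.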
Structured data are organized in a pre-defined manner (e.g., rows and columns), and there is a relationship between the different rows and columns. Because they are highly organized and formatted, they are easy to access, process, and store, and they work easily with most standard analytical models. Typical examples of structured company financial data include:
Unstructured data are not organized in any pre-defined manner. They can contain text, numbers, dates, etc. Examples of unstructured data include financial news, posts on social media, company filings with a regulator, and audio or video recordings. Because unstructured data are irregular and disorganized, they are difficult to handle and understand. Unstructured data are usually collected from unconventional sources. Based on their source, they can be classified into the following three groups:
Identify the data type for each of the following items:
Solution
Question
Which of the following is most likely panel data?
A. Yearly remittances of five countries from Asia for the past 10 years.
B. Customers’ online comments regarding the quality of a product of a company.
C. Monthly profits a company earned from 1 July 2019 to 30 June 2020.
Solution
The correct answer is A.
Remember that panel data are a mix of time-series and cross-sectional data. Panel data consist of observations through time on one or more variable(s) for multiple observational units. The observations in panel data are usually organized in a matrix format called a data table.
Therefore, yearly remittances of five countries from Asia for the past 10 years qualify as panel data: there are multiple observational units (five countries), each observed through time (10 years).
B is incorrect. Customers’ online comments regarding the quality of a product of a company are an example of unstructured data.
C is incorrect. Monthly profits a company earned from 1 July 2019 to 30 June 2020 are an example of time-series data, since they track a single observational unit through time.
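To make the data table in option A concrete, the sketch below arranges one record per country and year into a matrix with years as rows and countries as columns. The country labels and the 10-year window are hypothetical, and the values are left as `None` placeholders because the question does not supply any remittance figures.

```python
from itertools import product

# Hypothetical labels: the question names neither the countries nor the years.
countries = [f"Country {i}" for i in range(1, 6)]   # five observational units
years = list(range(2011, 2021))                     # an assumed 10-year window

# Long format: one (unit, period, value) record per country and year.
# Values are placeholders, not real remittance figures.
long_records = [(c, y, None) for c, y in product(countries, years)]


def to_data_table(records):
    """Pivot (unit, period, value) records into {period: {unit: value}}."""
    table = {}
    for unit, period, value in records:
        table.setdefault(period, {})[unit] = value
    return table


table = to_data_table(long_records)
n_rows = len(table)                       # one row per year
n_cols = len(next(iter(table.values())))  # one column per country
print(n_rows, n_cols, n_rows * n_cols)
```

With five countries and 10 years the table has 10 rows and 5 columns, or 50 observations in total. Option C, by contrast, would fill a single column with 12 monthly values.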