Big data refers to extremely large, complex datasets that can be analyzed computationally to reveal patterns, trends, and associations, especially those relating to human behavior. It encompasses traditional data sources, such as company reports, stock exchange data, and data gathered from governments, as well as nontraditional (alternative) data from social media, sensor networks, and electronic devices.
$$ \begin{array}{c|c|c} \textbf{Abbreviation} & \textbf{Unit} & \textbf{Size} \\ \hline \text{MB} & \text{Megabyte} & \text{1 million bytes} \\ \text{GB} & \text{Gigabyte} & \text{1 billion bytes} \\ \text{TB} & \text{Terabyte} & \text{1 trillion bytes} \\ \text{PB} & \text{Petabyte} & \text{1 quadrillion bytes} \\ \end{array} $$
As the table shows, as more data are generated, captured, and stored, data volumes grow from megabytes (MB) and gigabytes (GB) to far larger sizes, such as terabytes (TB) and petabytes (PB). At the same time, more data, both traditional and nontraditional, become available on a real-time or near-real-time basis, and the variety of data sources and formats also grows.
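To make the scale of the units in the table concrete, the following is a minimal sketch that expresses a raw byte count in the largest listed unit that fits. The function name `humanize` and the sample values are hypothetical, and the sketch assumes the decimal definitions given in the table (1 MB = 1 million bytes, and so on).

```python
# Decimal unit sizes taken from the table: each step up is a factor of 1,000.
UNITS = [("PB", 10**15), ("TB", 10**12), ("GB", 10**9), ("MB", 10**6)]


def humanize(num_bytes: int) -> str:
    """Express a byte count in the largest unit from the table that fits."""
    for symbol, size in UNITS:
        if num_bytes >= size:
            return f"{num_bytes / size:.1f} {symbol}"
    return f"{num_bytes} bytes"


print(humanize(2_500_000))   # 2.5 MB
print(humanize(7 * 10**12))  # 7.0 TB
```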
Structured data refers to information with a high degree of organization. Items can be organized in tables and are commonly stored in a database where each field represents the same type of information.
Unstructured data refers to information with a low degree of organization. Items such as text messages, tweets, and emails have no predefined fields and cannot be presented in tabular form.
Semi-structured data has qualities of both structured and unstructured data: it carries some organizing markers, but its items do not fit neatly into a fixed table.
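The three categories can be illustrated with a short sketch. All tickers, field names, and text below are made up for illustration. CSV stands in for structured data and JSON for semi-structured data, which is one common way to represent each, not the only one.

```python
import csv
import io
import json

# Structured: every row has the same fields, so the data fits a table.
structured = io.StringIO("ticker,price,volume\nABC,10.5,1200\nXYZ,22.0,800\n")
rows = list(csv.DictReader(structured))

# Semi-structured: fields are tagged, but records need not share one shape.
semi_structured = json.loads(
    '[{"ticker": "ABC", "tags": ["earnings"]},'
    ' {"ticker": "XYZ", "note": "guidance raised", "analyst": {"id": 7}}]'
)

# Unstructured: free text with no predefined fields.
unstructured = "Heard ABC beat estimates today, stock looks strong!"

print(rows[0]["price"])                         # a column lookup works directly
print([sorted(r.keys()) for r in semi_structured])  # keys differ between records
print(len(unstructured.split()))                # only generic text operations apply
```

Note that the structured rows can be queried by column name, while the free text would first need to be processed before any field-level analysis is possible.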
In broad terms, artificial intelligence (AI) refers to machines that can carry out tasks in ways we would consider “intelligent” or “smart.” It concerns the development of computer systems that exhibit cognitive and decision-making abilities comparable or superior to those of humans. AI can take the form of simple “if-then” statements or of complex statistical models that map raw sensory data to symbolic categories.
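The “if-then” form of AI mentioned above can be as simple as a hand-written rule. The sketch below is purely illustrative: the function name, the moving-average comparison, and the 2% thresholds are assumptions chosen for the example, not a recommended trading rule.

```python
def rule_based_signal(price: float, moving_average: float) -> str:
    """Hypothetical hand-written rule comparing a price with its moving average."""
    if price > moving_average * 1.02:   # more than 2% above the average
        return "buy"
    if price < moving_average * 0.98:   # more than 2% below the average
        return "sell"
    return "hold"


print(rule_based_signal(103.0, 100.0))  # buy
print(rule_based_signal(97.0, 100.0))   # sell
print(rule_based_signal(100.0, 100.0))  # hold
```

Here the decision logic is written entirely by a person. Nothing is learned from data, which is the key contrast with machine learning.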
Machine learning (ML) is a current application of AI that extracts knowledge from large amounts of data without making assumptions about the probability distribution of the data. The idea is that, when exposed to huge amounts of data, machines can adjust on their own and come up with solutions to problems without relying on human expertise.
In machine learning, a computer algorithm is provided with inputs, which can be a set of variables or datasets, and possibly with outputs, which are the target data. The algorithm then learns from the data how best to model the inputs into the output or, if no output is provided, how best to identify and describe the structure of the data. It learns by identifying relationships in the data and uses this information to refine its learning process.
ML typically divides the dataset into three distinct subsets: a training dataset, a validation dataset, and a test dataset. The training dataset allows the algorithm to identify the link between inputs and outputs based on historical patterns in the data. These relationships are then validated, and the model is adjusted, using the validation dataset. As the name suggests, the test dataset is used to test how well the model predicts on new data.
Note that machine learning still needs human intervention in understanding the underlying data and choosing suitable techniques for data analysis. In particular, before data are used, they must be cleaned and checked for bias and spurious values.
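The three-way division of a dataset can be sketched as follows. The 60/20/20 proportions, the function name `split_dataset`, and the random shuffle are assumptions made for illustration. The text does not prescribe particular proportions, and time-ordered financial data are often split chronologically rather than shuffled.

```python
import random


def split_dataset(records, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle records and split them into training, validation, and test subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever remains is held out for testing
    return train, validation, test


train, validation, test = split_dataset(range(100))
print(len(train), len(validation), len(test))  # 60 20 20
```

The test subset is set aside and used only once the model has been fitted on the training subset and adjusted on the validation subset.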
Under supervised learning, computers learn to model data based on labeled training data that contains both the inputs and the desired outputs. Each training example has one or more inputs and a desired output.
Trying to predict the performance of a stock (up, down, or level) during the next business day can be modeled through supervised learning.
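A toy version of this stock-direction example is sketched below using a 1-nearest-neighbor classifier, chosen only because it is short. The features (previous-day return and volume change, both in percent) and all data values are hypothetical, and a realistic model would need far more data and careful feature scaling.

```python
import math

# Hypothetical labeled training data:
# (previous-day return %, volume change %) -> next-day direction
training_data = [
    ((1.2, 10.0), "up"),
    ((0.9, 6.0), "up"),
    ((-1.1, 12.0), "down"),
    ((-0.8, 7.0), "down"),
    ((0.05, -1.0), "level"),
    ((-0.02, 0.5), "level"),
]


def predict(features):
    """Return the label of the closest labeled example (1-nearest neighbor)."""
    _, nearest_label = min(
        training_data, key=lambda example: math.dist(example[0], features)
    )
    return nearest_label


print(predict((1.0, 8.0)))   # up
print(predict((-1.0, 9.0)))  # down
print(predict((0.0, 0.0)))   # level
```

The defining feature of supervised learning is visible in `training_data`: every input comes paired with its desired output.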
Under unsupervised learning, computers are only given input data and tasked with describing the data, for instance, by grouping or clustering data points. In this instance, computers learn from data that has not been labeled or categorized. The computers then “react” based on the presence or absence of commonalities in the data.
Grouping companies based on their financial characteristics, rather than their geographical or industrial characteristics, would be a good example of unsupervised learning.
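The company-grouping example can be sketched with a plain k-means clustering routine, one common unsupervised technique. The company names, the two financial features (profit margin and debt-to-equity ratio), the choice of two clusters, and the starting centroids are all assumptions made for illustration.

```python
import math

# Hypothetical companies described by (profit margin %, debt-to-equity ratio).
# No labels are supplied: the algorithm only sees the inputs.
companies = {
    "A": (25.0, 0.3), "B": (27.0, 0.4), "C": (24.0, 0.2),
    "D": (5.0, 2.1), "E": (6.0, 1.9), "F": (4.0, 2.3),
}


def nearest_index(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def k_means(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[nearest_index(point, centroids)].append(point)
        # Each new centroid is the mean of its cluster; an empty cluster keeps its old one.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids


points = list(companies.values())
centroids = k_means(points, centroids=[points[0], points[3]])
labels = {name: nearest_index(features, centroids) for name, features in companies.items()}
print(labels)  # {'A': 0, 'B': 0, 'C': 0, 'D': 1, 'E': 1, 'F': 1}
```

The cluster numbers carry no meaning by themselves. Interpreting what distinguishes the groups is left to the analyst.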
Question
Machine learning most likely refers to:
A. The autonomous acquisition of knowledge through the use of computer programs.
B. The ability of machines to execute coded instructions.
C. The selective acquisition of knowledge through the use of computer programs.
Solution
The correct answer is A.
Machine learning refers to the autonomous acquisition of knowledge through the use of computer programs such that computers learn to work out solutions to problems without human intervention. Machine learning is the idea that computers have the ability to “learn” and execute changes independently.