Feature Extraction, Selection, and Engineering of Textual Data

Feature Extraction: Feature extraction entails mapping textual data to real-valued vectors. After the text has been normalized, the next step is to create a bag-of-words (BOW), a representation used to analyze text. It does not, however, represent the…

Preparing, Wrangling, and Exploring Textual Data for Financial Forecasting

Sentiment analysis refers to the analysis of opinions or emotions from text data. In other words, it refers to how positive, negative, or neutral a particular phrase or statement is regarding a “target.” Such sentiment can provide critical predictive power…

Model Training

Machine learning (ML) model training entails three tasks: method selection, performance evaluation, and tuning. While there are no standard rules for training an ML model, having a fundamental understanding of domain-specific training data and ML algorithm principles is key to…

Data Exploration

The main objective of data exploration is to investigate and comprehend data distributions and relationships. Data exploration involves three critical tasks: exploratory data analysis, feature selection, and feature engineering. Exploratory Data Analysis: Exploratory data analysis (EDA) is the first step…

Preparing and Wrangling Data

Data preparation and wrangling is a crucial step that entails cleaning and organizing raw data into a consolidated format that is more convenient to consume. Data collection precedes the data preparation and wrangling stage. Recall that before…

Steps in a Data Analysis Project

The term “big data” refers to structured or unstructured data that is so large, fast-moving, or complex that it is difficult or even impossible to process using traditional methods. Incorporating big data has immediate implications for building a machine learning model…

Neural Networks (NNs), Deep Learning Nets (DLNs), and Reinforcement Learning (RL)

Neural networks, deep learning nets, and reinforcement learning are sophisticated algorithms that handle complex tasks involving non-linearity and interactions among large numbers of feature inputs. Examples of such tasks include image classification, speech recognition, and face recognition. We describe…

Unsupervised Machine Learning Algorithms

Recall that unlike supervised learning, unsupervised learning does not use labeled data. The algorithm finds patterns within the data. The two main categories of unsupervised ML algorithms are dimension reduction, using principal components analysis, and clustering, which includes k-means and…

Supervised Machine Learning Algorithms

1. Penalized Regression: Penalized regression is a technique that is useful for shrinking a large number of features to a manageable set and for making good predictions in a variety of large data sets. It is used to avoid overfitting…

Overfitting and Methods of Addressing It

Overfitting is a problem that arises when a machine learning algorithm fits the training data too closely, so that it predicts poorly on new data. Overfitting means training a model to such a degree of specificity to the training…
