Introduction to Linear Regression
Linear regression is a mathematical method used for analyzing how the variation in one variable can explain the variation in another variable. Let \(Y\) be the variable we wish to explain. As such, the observation of this variable is \(Y_i\),…
Applications of Big Data and Data Science
Data science is an interdisciplinary field that uses developments in computer science, statistics, and other fields to extract information from Big Data or data in general. Data Processing Methods: Data analysts and scientists in big data analysis use different data…
Big Data
Big data is a term that describes large, complex datasets. These datasets are analyzed with computers to uncover patterns and trends, particularly those related to human behavior. Big data includes traditional sources like company reports and government data and non-traditional…
Introduction to Big Data Techniques
Fintech refers to technological innovation in designing and delivering financial services and products. At its core, fintech has helped companies, business owners, and investment managers better manage their operations through specialized software and algorithms. Note that the term fintech is…
Parametric and Non-Parametric Test
Parametric versus Non-parametric Tests of Independence: A parametric test is a hypothesis test concerning a population parameter, used when the data meet specific distributional assumptions. If these assumptions are not met, non-parametric tests are used. In summary, researchers use non-parametric…
Hypothesis Tests of Risk and Return
Hypothesis Test Concerning Single Mean: The z-test is the appropriate hypothesis test when the sampling distribution of the sample mean is normal and the population standard deviation is known. The z-statistic is the test statistic used in this test. Testing \(\bf{H_0: \mu…
Hypothesis Testing
A hypothesis is an assumed statement about a population’s characteristics, often considered an opinion or claim about an issue. To determine if a hypothesis is accurate, statistical tests are used. Hypothesis testing uses sample data to evaluate if a sample…
Resampling
Resampling refers to repeatedly drawing samples from the original observed data sample for the statistical inference of population parameters. The two commonly used resampling methods are the bootstrap and the jackknife. Bootstrap: Using a computer, the bootstrap resampling…
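As a rough illustration of the bootstrap idea described above, the sketch below repeatedly draws samples with replacement from one observed sample and uses the spread of the resampled means as an estimate of the standard error of the mean. The function names, the number of resamples, and the return figures are all hypothetical and chosen only for illustration; they do not come from the article.

```python
import random
import statistics


def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Return the means of n_resamples bootstrap resamples of `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        # Draw n observations from the original sample, with replacement.
        resample = rng.choices(sample, k=n)
        means.append(statistics.fmean(resample))
    return means


def bootstrap_standard_error(sample, n_resamples=1000, seed=0):
    """Estimate the standard error of the sample mean from the resampled means."""
    return statistics.stdev(bootstrap_means(sample, n_resamples, seed))


# Hypothetical periodic returns, used only as example data.
returns = [0.02, -0.01, 0.03, 0.015, -0.005, 0.01, 0.025, -0.02]
se = bootstrap_standard_error(returns)
print(f"bootstrap standard error of the mean: {se:.4f}")
```

Each resample has the same size as the original sample, so every resampled mean lies between the smallest and largest observed values.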
The Central Limit Theorem
The central limit theorem asserts that “given a population described by any probability distribution having mean \(\mu\) and finite variance \(\sigma^2\), the sampling distribution of the sample mean \(\bar{X}\) computed from random samples of size \(n\) from this population will…
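A small simulation can make the theorem concrete. The sketch below assumes an exponential population with mean 1 and variance 1, which is clearly non-normal, and checks that the sample means cluster around \(\mu\) with a standard deviation close to \(\sigma/\sqrt{n}\). The population choice, sample size, number of samples, and function name are assumptions made for illustration only.

```python
import math
import random
import statistics


def simulate_sample_means(n, n_samples, seed=0):
    """Draw n_samples random samples of size n from an exponential(1)
    population and return the mean of each sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(n_samples)
    ]


mu, sigma = 1.0, 1.0   # mean and standard deviation of exponential(1)
n = 50                 # size of each sample
sample_means = simulate_sample_means(n=n, n_samples=2000)

observed_mean = statistics.fmean(sample_means)
observed_sd = statistics.stdev(sample_means)
expected_sd = sigma / math.sqrt(n)  # standard error implied by the theorem

print(f"mean of sample means: {observed_mean:.3f} (population mean {mu})")
print(f"sd of sample means:   {observed_sd:.3f} (sigma/sqrt(n) = {expected_sd:.3f})")
```

Increasing `n` shrinks the spread of the sample means in proportion to \(1/\sqrt{n}\), while their average stays near the population mean.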
Probability Sampling Methods
Sampling is the systematic process of selecting a subset or sample from a larger population. Sampling is essential because it is costly and time-consuming to analyze the whole population. Sampling methods can be broadly categorized into probability sampling and non-probability…