Candidate’s objectives:

After completing this reading you should be able to:

- Describe the basic steps to conduct a Monte Carlo simulation.
- Describe ways to reduce Monte Carlo sampling error.
- Explain how to use the antithetic variate technique to reduce Monte Carlo sampling error.
- Explain how to use control variates to reduce Monte Carlo sampling error and when it is effective.
- Describe the benefits of reusing sets of random number draws across Monte Carlo experiments and how to reuse them.
- Describe the bootstrapping method and its advantage over Monte Carlo simulation.
- Describe the pseudo-random number generation method and how a good simulation design alleviates the effects the choice of the seed has on the properties of the generated series.
- Describe situations where the bootstrapping method is ineffective.
- Describe disadvantages of the simulation approach to financial problem solving.

## Monte Carlo Simulations

Simulation studies are used to analyze the properties and characteristics of statistics of interest. In econometrics, the technique is useful when the properties of a particular estimation method are unknown.

The following are some of the areas in econometrics where simulations are applicable:

- Quantifying the simultaneous equations bias induced when an endogenous variable is treated as exogenous.
- Determining the appropriate critical values for a Dickey-Fuller test.
- Determining the impact of heteroskedasticity on the size and power of a test for autocorrelation.

In finance, applications of simulation include:

- Pricing exotic options.
- Determining the impact of changes in the macroeconomic environment.
- Stress-testing risk management models.

Next, we study how a Monte Carlo simulation is conducted. The steps are as follows:

- First, the data are generated according to the desired data generating process, with the errors drawn from some specified distribution.
- Next, the regression is run and the test statistic is computed.
- Then the test statistic (or any other parameter of interest) is saved.
- Finally, the first three steps are repeated \(N\) times.

The quantity \(N\) is the number of replications. It should be as large as is feasible, because using too few replications makes the results sensitive to odd combinations of random number draws.
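The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the data generating process (a simple regression with intercept 1, slope `beta` and standard normal errors), the function names and the seed are all assumptions made for the example.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    return s_xy / s_xx


def monte_carlo_slopes(n_reps, sample_size, beta=0.5, seed=12345):
    """Return the slope estimate saved from each of n_reps replications."""
    rng = random.Random(seed)
    # Regressors are held fixed across replications (an assumption of this sketch).
    x = [rng.uniform(0.0, 10.0) for _ in range(sample_size)]
    saved = []
    for _ in range(n_reps):  # step 4: repeat steps 1 to 3
        # Step 1: generate data from the assumed DGP y = 1 + beta * x + u, u ~ N(0, 1).
        y = [1.0 + beta * xi + rng.gauss(0.0, 1.0) for xi in x]
        # Step 2: run the regression and compute the statistic of interest.
        b_hat = ols_slope(x, y)
        # Step 3: save the statistic.
        saved.append(b_hat)
    return saved


slopes = monte_carlo_slopes(n_reps=2000, sample_size=50)
mean_slope = statistics.fmean(slopes)
# Standard error of the Monte Carlo average: sqrt(var(X) / N).
std_error = (statistics.variance(slopes) / len(slopes)) ** 0.5
print(f"average slope = {mean_slope:.4f}, Monte Carlo s.e. = {std_error:.5f}")
```

Fixing the seed makes the experiment reproducible; changing it gives a slightly different average, which is exactly the sampling variation that the standard error is meant to capture.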

## Techniques of Reducing Variance

Let \({ x }_{ i }\) denote the value of the parameter of interest for replication \(i\). If the average value of this parameter is computed and an otherwise identical study is then run with a different set of random draws, a different average value of \(x\) will almost certainly be obtained. The sampling variation in a Monte Carlo study is measured by the standard error estimate, \({ S }_{ X }\):

$$ { S }_{ X }=\sqrt { \frac { var\left( X \right) }{ N } } $$

Here, \(var\left( X \right)\) is the variance of the estimates of the quantity of interest over the \(N\) replications. Because \({ S }_{ X }\) falls only in proportion to \(\sqrt { N } \), \(N\) may have to be set at an infeasibly high level to achieve acceptable accuracy.
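As a quick arithmetic illustration with hypothetical numbers (not taken from any particular study), suppose \(var\left( X \right) =4\):

$$ N=100:\quad { S }_{ X }=\sqrt { \frac { 4 }{ 100 } } =0.2\quad \quad \quad N=10,000:\quad { S }_{ X }=\sqrt { \frac { 4 }{ 10,000 } } =0.02 $$

A hundredfold increase in the number of replications reduces the standard error only by a factor of ten.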

Variance reduction techniques can also be applied to reduce Monte Carlo sampling error. There are many such techniques; two of them are described here:

- Antithetic Variates technique, and
- Control Variates technique.

## Antithetic Variates

A Monte Carlo study requires many replications because the entire probability space is adequately covered only by sampling repeatedly.

In the antithetic variate technique, the complement of a set of random numbers is taken and a parallel simulation is run on it.

Consider two sets of Monte Carlo replications. The average value of the parameter of interest across them is:

$$ \bar { x } =\frac { { x }_{ 1 }+{ x }_{ 2 } }{ 2 } \quad \quad \quad \quad \quad equation\quad I $$

where \({ x }_{ 1 }\) and \({ x }_{ 2 }\) are the average parameter values for replication sets 1 and 2 respectively.

The following is the equation of the variance of \( \bar { x }\):

$$ var\left( \bar { x } \right) =\frac { 1 }{ 4 } \left( var\left( { x }_{ 1 } \right) +var\left( { x }_{ 2 } \right) +2cov\left( { x }_{ 1 },{ x }_{ 2 } \right) \right) \quad \quad \quad equation\quad II $$

Without antithetic variates, the two sets of Monte Carlo replications are independent.

Therefore, the covariance is zero, such that:

$$ var\left( \bar { x } \right) =\frac { 1 }{ 4 } \left( var\left( { x }_{ 1 } \right) +var\left( { x }_{ 2 } \right) \right) \quad \quad \quad equation\quad III $$

If antithetic variates are applied, the covariance in equation \(II\) is negative, which reduces the Monte Carlo sampling error.

Since \(corr\left( { u }_{ t },-{ u }_{ t } \right) =-1\) (and \(cov\left( { u }_{ t },-{ u }_{ t } \right) =-1\) for draws with unit variance), the first impression is that antithetic variates will produce a huge reduction in Monte Carlo sampling variation. However, the relevant covariance is the one between the simulated quantity of interest for the standard replications and that for the replications using the antithetic variates. The perfect negative covariance holds only between the random draws and their antithetic variates, so the actual reduction is usually smaller.

Quasi-random sequences of draws are alternative variance reduction techniques that operate on similar principles.

They include stratified sampling, moment-matching, and low-discrepancy sequencing. In these techniques, a specific sequence of representative samples is selected from a specified probability distribution. Then subsequent replications are used to fill the unselected gaps in the probability distribution, by selecting successive samples. This will yield a set of appropriately distributed random draws across all the outcomes of interest.
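The antithetic variate idea described in this section can be sketched as follows. The quantity of interest, `max(u, 0)`, is a hypothetical choice made only for illustration, as are the function names and the seed. Each pair averages the quantity over two draws: either two independent standard normal draws, or a draw \({ u }_{ t }\) and its antithetic counterpart \(-{ u }_{ t }\).

```python
import random
import statistics


def quantity(u):
    # Hypothetical quantity of interest computed from one random draw.
    return max(u, 0.0)


def pair_averages(n_pairs, antithetic, seed=2024):
    """Average of the quantity over pairs of draws, as in equation I."""
    rng = random.Random(seed)
    averages = []
    for _ in range(n_pairs):
        u1 = rng.gauss(0.0, 1.0)
        # Antithetic case: the second draw is -u1; otherwise it is a fresh draw.
        u2 = -u1 if antithetic else rng.gauss(0.0, 1.0)
        averages.append(0.5 * (quantity(u1) + quantity(u2)))
    return averages


plain = pair_averages(20000, antithetic=False)
anti = pair_averages(20000, antithetic=True)
var_plain = statistics.variance(plain)
var_anti = statistics.variance(anti)
print(f"variance with independent pairs: {var_plain:.4f}")
print(f"variance with antithetic pairs:  {var_anti:.4f}")
```

The draws themselves are perfectly negatively correlated, but the quantities computed from them are not, so the variance falls by less than the correlation of the draws alone would suggest.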

## Control Variates

The control variates technique uses a variable that is similar to the one in the simulation but whose properties are known before the simulation.

Let \(y\) denote the variable with known properties and \(x\) the variable whose properties are being simulated. The simulation is carried out on both \(x\) and \(y\), using the same set of random number draws for each. The simulation estimate of \(x\) is denoted \(\hat { x } \), and that of \(y\) is denoted \(\hat { y } \).

Therefore, we can derive a new estimate of \(x\) in the following manner:

$$ { x }^{ \ast }=y+\left( \hat { x } -\hat { y } \right) \quad \quad \quad equation\quad IV $$

Under certain conditions, \({ x }^{ \ast }\) has a smaller Monte Carlo sampling error than \(\hat { x } \). Taking the variance of both sides of equation \(IV\), we have that:

$$ var\left( { x }^{ \ast } \right) =var\left( y+\left( \hat { x } -\hat { y } \right) \right) \quad \quad \quad equation\quad V $$

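Because \(y\) is known analytically and is therefore not subject to sampling variation, \(var\left( { y } \right)=0\), so equation \(V\) expands to:

$$ var\left( { x }^{ \ast } \right) =var\left( \hat { x } \right) +var\left( \hat { y } \right) -2cov\left( \hat { x } ,\hat { y } \right) $$

For the Monte Carlo sampling variance to be lower with control variates than without them, we need: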

$$ var\left( { x }^{ \ast } \right) <var\left( \hat { x } \right) $$

Therefore:

$$ var\left( \hat { y } \right) -2cov\left( \hat { x },\hat { y } \right) <0 $$

Or:

$$ cov\left( \hat { x } ,\hat { y } \right) >\frac { 1 }{ 2 } var\left( \hat { y } \right) \quad \quad equation\quad VI $$

Dividing both sides of inequality \(VI\) by the product of the standard deviations of \(\hat { x } \) and \(\hat { y } \) gives:

$$ corr\left( \hat { x } ,\hat { y } \right) >\frac { 1 }{ 2 } \sqrt { \frac { var\left( \hat { y } \right) }{ var\left( \hat { x } \right) } } $$
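A minimal sketch of the control variate estimator in equation \(IV\) follows. The choices here are assumptions made for illustration: \(x\) is the mean of \(exp(U)\) for \(U\) uniform on (0, 1), and the control variate \(y\) is the mean of \(U\) itself, which is known to be 0.5. The experiment is repeated many times so that the variances of \(\hat { x } \) and \({ x }^{ \ast }\) can be compared and condition \(VI\) checked.

```python
import math
import random
import statistics

Y_KNOWN = 0.5  # E[U] for U ~ Uniform(0, 1), known analytically


def run_experiment(rng, n_draws):
    """One simulation: x_hat and y_hat from the same draws, plus x_star."""
    draws = [rng.random() for _ in range(n_draws)]
    x_hat = statistics.fmean([math.exp(u) for u in draws])  # simulated E[exp(U)]
    y_hat = statistics.fmean(draws)                          # simulated E[U]
    x_star = Y_KNOWN + (x_hat - y_hat)                       # equation IV
    return x_hat, y_hat, x_star


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)


rng = random.Random(99)
results = [run_experiment(rng, n_draws=200) for _ in range(500)]
x_hats = [r[0] for r in results]
y_hats = [r[1] for r in results]
x_stars = [r[2] for r in results]

var_x_hat = statistics.variance(x_hats)
var_x_star = statistics.variance(x_stars)
# Condition VI: cov(x_hat, y_hat) > 0.5 * var(y_hat).
condition_vi = covariance(x_hats, y_hats) > 0.5 * statistics.variance(y_hats)
print(f"var(x_hat) = {var_x_hat:.6f}, var(x_star) = {var_x_star:.6f}")
print(f"condition VI satisfied: {condition_vi}")
```

The technique works here because \(exp(U)\) and \(U\) are strongly positively correlated; a control variate that is only weakly related to \(x\) would fail condition \(VI\) and could increase the variance.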

## Re-Usage of Random Numbers across Experiments

The variability of the difference in the estimates across experiments can fall substantially if the same set of draws is used in each experiment. An alternative is to take one long series of draws and split it into several smaller sets, one for each experiment.

Re-using random numbers is unlikely to save much computational time, because making the random draws usually takes only a very small proportion of the overall time needed to conduct the whole experiment.
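The effect of re-using draws on the variability of a difference between experiments can be illustrated with a small sketch. The two "experiments" here are hypothetical: both evaluate `max(u + shift, 0)`, with `shift` equal to 0 in experiment A and 0.1 in experiment B. The function names and seed are likewise assumptions.

```python
import random
import statistics


def quantity(u, shift):
    # Hypothetical quantity of interest under one experiment's setting.
    return max(u + shift, 0.0)


def difference_estimates(n_draws, reuse, seed=31):
    """Per-draw differences between experiment B (shift 0.1) and A (shift 0)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_draws):
        u_a = rng.gauss(0.0, 1.0)
        # Either re-use the same draw for experiment B or take a fresh one.
        u_b = u_a if reuse else rng.gauss(0.0, 1.0)
        diffs.append(quantity(u_b, 0.1) - quantity(u_a, 0.0))
    return diffs


same = difference_estimates(5000, reuse=True)
fresh = difference_estimates(5000, reuse=False)
var_same = statistics.variance(same)
var_fresh = statistics.variance(fresh)
print(f"variance of difference, same draws:  {var_same:.5f}")
print(f"variance of difference, fresh draws: {var_fresh:.5f}")
```

With the same draws, the random noise largely cancels in the difference, so what remains mostly reflects the change in the experiment's setting.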

## Bootstrapping

Bootstrapping is used to obtain a description of the properties of empirical estimators. It uses the sample data points themselves, sampling repeatedly with replacement from the actual data.

Let us consider the estimation of some parameter \(\theta\) given a sample of data:

$$ y={ y }_{ 1 },{ y }_{ 2 },\dots ,{ y }_{ T } $$

We can study a sample of bootstrap estimators to approximate the statistical features of \({ \hat { \theta } }_{ T }\). To do this, we take \(N\) samples of size \(T\) with replacement from \(y\), and \(\hat { \theta } \) is re-computed with each new sample. We then obtain a series of \(\hat { \theta } \) estimates and can consider their distribution.

With bootstrapping, the researcher can make inferences without strong distributional assumptions, because the distribution employed is that of the actual data. The sample is treated as a population from which new samples are drawn with replacement (sampling from the sample), and the test statistic of interest is computed from each new sample.

Let \({ \hat { \theta } }^{ \ast }\) be the test statistic computed from a new sample. We can obtain a distribution of values of \({ \hat { \theta } }^{ \ast }\) and compute the standard errors or any other statistics of interest from that distribution.
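The procedure can be sketched as below. The data are hypothetical (100 simulated observations standing in for a real sample), the median is used as an example of \(\hat { \theta } \), and the function name `bootstrap_estimates` is an assumption of this illustration.

```python
import random
import statistics


def bootstrap_estimates(sample, estimator, n_boot, seed=7):
    """Recompute the estimator on n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    size = len(sample)
    estimates = []
    for _ in range(n_boot):
        # Sample T observations with replacement from the original sample.
        resample = rng.choices(sample, k=size)
        estimates.append(estimator(resample))
    return estimates


# Hypothetical data standing in for an observed sample of size T = 100.
data_rng = random.Random(1)
sample = [data_rng.gauss(5.0, 2.0) for _ in range(100)]

theta_stars = bootstrap_estimates(sample, statistics.median, n_boot=2000)
boot_se = statistics.stdev(theta_stars)
print(f"sample median = {statistics.median(sample):.3f}")
print(f"bootstrap standard error of the median = {boot_se:.3f}")
```

Any estimator that takes a list of observations can be passed in place of `statistics.median`; no distributional assumption is needed to obtain its standard error this way.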

## Bootstrapping in the Context of Regression

The following is a standard regression model:

$$ y=X\beta +u $$

There are two methods to bootstrap the regression model.

### Resampling the Data

In this method, the entire row of data corresponding to observation \(i\) is resampled together. The steps are as follows:

- A sample of size \(T\) is drawn from the original data by sampling whole rows with replacement.
- The coefficient vector, \({ \hat { \beta } }^{ \ast }\), is computed for the bootstrap sample.
- The above steps are repeated \(N\) times to obtain a set of \(N\) coefficient vectors, \({ \hat { \beta } }^{ \ast }\), which will generally all be different. This yields a distribution of estimates for each of the coefficients.
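A minimal sketch of resampling the data, assuming a simple regression with one regressor so that each row is an `(x, y)` pair. The data, the helper `ols` and the function name `pairs_bootstrap` are hypothetical and serve only to illustrate the steps.

```python
import random
import statistics


def ols(rows):
    """Return (intercept, slope) from regressing y on x; rows are (x, y) pairs."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


def pairs_bootstrap(rows, n_boot, seed=11):
    """Resample whole rows with replacement and re-estimate each time."""
    rng = random.Random(seed)
    size = len(rows)
    return [ols(rng.choices(rows, k=size)) for _ in range(n_boot)]


# Hypothetical data set of T = 60 rows.
data_rng = random.Random(3)
xs = [data_rng.uniform(0.0, 10.0) for _ in range(60)]
rows = [(x, 2.0 + 0.5 * x + data_rng.gauss(0.0, 1.0)) for x in xs]

beta_hat = ols(rows)
beta_stars = pairs_bootstrap(rows, n_boot=1000)
slope_stars = [b[1] for b in beta_stars]
slope_se = statistics.stdev(slope_stars)
print(f"slope estimate = {beta_hat[1]:.3f}, bootstrap s.e. = {slope_se:.3f}")
```

Because each `(x, y)` pair is kept intact, the relationship between the regressor and the dependent variable within a row is preserved in every bootstrap sample.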

### Resampling from Residuals

The following steps are applied:

- First, the model is estimated on the actual data, the fitted values \(\hat { y } \) are obtained, and the residuals \(\hat { u } \) are computed.
- Next, a sample of size \(T\) is taken with replacement from these residuals, giving the bootstrapped residuals \({ \hat { u } }^{ \ast }\). The bootstrapped dependent variable is generated by adding the fitted values to the bootstrapped residuals:
$$ { y }^{ \ast }=\hat { y } +{ \hat { u } }^{ \ast } $$

- The new dependent variable is then regressed on the original \(X\) data to obtain a bootstrapped coefficient vector, \({ \hat { \beta } }^{ \ast }\).
- Finally, return to the second step and repeat a total of \(N\) times.
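The residual-based procedure can be sketched in the same way, again assuming a simple regression with one regressor. The data and the names `ols` and `residual_bootstrap` are hypothetical.

```python
import random
import statistics


def ols(xs, ys):
    """Return (intercept, slope) from regressing ys on xs."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


def residual_bootstrap(xs, ys, n_boot, seed=17):
    rng = random.Random(seed)
    # Step 1: estimate on the actual data, then get fitted values and residuals.
    intercept, slope = ols(xs, ys)
    fitted = [intercept + slope * x for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]
    betas = []
    for _ in range(n_boot):  # step 4: repeat steps 2 and 3
        # Step 2: resample residuals with replacement and build y* = y_hat + u*.
        u_star = rng.choices(residuals, k=len(residuals))
        y_star = [f + u for f, u in zip(fitted, u_star)]
        # Step 3: regress y* on the original x data.
        betas.append(ols(xs, y_star))
    return betas


# Hypothetical data set of T = 60 observations.
data_rng = random.Random(5)
xs = [data_rng.uniform(0.0, 10.0) for _ in range(60)]
ys = [2.0 + 0.5 * x + data_rng.gauss(0.0, 1.0) for x in xs]

beta_hat = ols(xs, ys)
beta_stars = residual_bootstrap(xs, ys, n_boot=1000)
slope_stars = [b[1] for b in beta_stars]
print(f"slope estimate = {beta_hat[1]:.3f}, "
      f"bootstrap s.e. = {statistics.stdev(slope_stars):.3f}")
```

Unlike resampling whole rows, this approach keeps the \(X\) data fixed in every bootstrap sample and varies only the disturbances.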

## Situations Where the Bootstrap Will Be Ineffective

The bootstrap will not be sufficiently effective in the following two situations:

- Outliers in the data: outliers are likely to affect the conclusions of the bootstrap.
- Non-independent data: the bootstrap assumes that the data are independent of one another.

## Random Number Generation

The following recursion can be used as the basis for generating numbers that are continuous uniform (0, 1):

$$ { y }_{ i+1 }=\left( a{ y }_{ i }+c \right) \quad modulo\quad m,\quad i=0,1,\dots ,T $$

Then:

$$ { R }_{ i+1 }=\frac { { y }_{ i+1 } }{ m } \quad for\quad i=0,1,\dots ,T $$

where there are \(T\) random draws, \({ y }_{ 0 }\) is the initial value of \(y\) (otherwise called the seed), \(a\) is the multiplier, \(c\) is the increment, and \(m\) is the modulus.

In all simulation studies, the initial value \({ y }_{ 0 }\) must be specified for the random draws to be generated. The choice of this value affects the properties of the generated series, most strongly for the earliest draws. A good simulation design alleviates this by generating more draws than are required and discarding the first few.
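The recursion can be written directly in Python. The constants `a`, `c` and `m` below are one widely quoted set of illustrative values (those given in Numerical Recipes), not values prescribed by this reading, and the `burn_in` argument is an assumption added to show one way of discarding the earliest draws.

```python
def lcg_uniforms(seed, n_draws, a=1664525, c=1013904223, m=2**32, burn_in=0):
    """Generate n_draws pseudo-random numbers in [0, 1) from the recursion
    y_{i+1} = (a * y_i + c) mod m, R_{i+1} = y_{i+1} / m.

    The first burn_in values are generated but discarded.
    """
    y = seed  # y_0, the seed
    draws = []
    for i in range(burn_in + n_draws):
        y = (a * y + c) % m
        if i >= burn_in:
            draws.append(y / m)
    return draws


first_run = lcg_uniforms(seed=42, n_draws=5)
second_run = lcg_uniforms(seed=42, n_draws=5)
other_seed = lcg_uniforms(seed=43, n_draws=5)
print(first_run)
print(first_run == second_run)  # same seed, same sequence
print(first_run == other_seed)  # different seed, different sequence
```

The sequence is fully determined by the seed, which is why such numbers are called pseudo-random and why recording the seed makes a simulation reproducible.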

### Demerits of the Simulation Approach

- The calculations involved might be long and sophisticated.
- The results might not be precise.
- In many situations, replicating the results is difficult.
- Simulation results are experiment-specific.

### Monte Carlo Simulation in Econometrics: Deriving a Set of Critical Values for a Dickey-Fuller Test

The following is the regression for a Dickey-Fuller test applied to some series \({ y }_{ t }\):

$$ { y }_{ t }=\phi { y }_{ t-1 }+{ u }_{ t } $$

We then test for \({ H }_{ 0 }:\phi =1\), against \({ H }_{ 1 }:\phi <1\).

In that case, the following equation gives the relevant test statistic:

$$ \tau =\frac { \hat { \phi } -1 }{ SE\left( \hat { \phi } \right) } $$

Because the test statistic does not follow a standard distribution under the null hypothesis of a unit root, simulation is needed to obtain the relevant critical values.
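A minimal sketch of such a simulation is given below. It assumes the regression has no constant or trend, standard normal errors, a starting value of zero, 100 observations per replication and 2,000 replications; the function names and the seed are hypothetical. Each replication generates a random walk (the null hypothesis \(\phi =1\)), estimates \(\phi \), and saves \(\tau \); the critical values are then read off as percentiles of the simulated \(\tau \) values.

```python
import random


def df_tau(rng, n_obs):
    """tau statistic from one random walk simulated under H0: phi = 1."""
    lagged, current = [], []
    y_prev = 0.0  # assumed starting value y_0 = 0
    for _ in range(n_obs):
        y_now = y_prev + rng.gauss(0.0, 1.0)
        lagged.append(y_prev)
        current.append(y_now)
        y_prev = y_now
    # OLS of y_t on y_{t-1} without a constant.
    s_ll = sum(lag * lag for lag in lagged)
    phi_hat = sum(lag * cur for lag, cur in zip(lagged, current)) / s_ll
    rss = sum((cur - phi_hat * lag) ** 2 for lag, cur in zip(lagged, current))
    se_phi = (rss / (n_obs - 1) / s_ll) ** 0.5
    return (phi_hat - 1.0) / se_phi


def df_critical_values(n_reps, n_obs, levels=(0.01, 0.05, 0.10), seed=2023):
    """Empirical lower-tail percentiles of tau over n_reps replications."""
    rng = random.Random(seed)
    taus = sorted(df_tau(rng, n_obs) for _ in range(n_reps))
    return {level: taus[int(level * n_reps)] for level in levels}


critical_values = df_critical_values(n_reps=2000, n_obs=100)
for level, value in critical_values.items():
    print(f"{level:.0%} critical value: {value:.2f}")
```

For this no-constant specification, the simulated 5% value should land close to the commonly tabulated figure of about -1.95, with the exact number depending on the seed and the number of replications.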