Purpose: Detect Non-Randomness, Time Series Modeling |
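The autocorrelation function (Box and Jenkins, 1976) can be used for the following two purposes:
1. To detect non-randomness in data.
2. To identify an appropriate time series model if the data are not random.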
Definition | Given measurements $Y_1, Y_2, \ldots, Y_N$ at times $X_1, X_2, \ldots, X_N$, the lag $k$ autocorrelation function is defined as

$$r_k = \frac{\sum_{i=1}^{N-k} (Y_i - \bar{Y})(Y_{i+k} - \bar{Y})}{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}$$

where $\bar{Y}$ is the mean of the $N$ measurements. Although the time variable, $X$, is not used in the formula for autocorrelation, the assumption is that the observations are equi-spaced.

Autocorrelation is a correlation coefficient. However, instead of measuring the correlation between two different variables, it measures the correlation between two values of the same variable at times $X_i$ and $X_{i+k}$.

When the autocorrelation is used to detect non-randomness, it is usually only the first (lag 1) autocorrelation that is of interest. When the autocorrelation is used to identify an appropriate time series model, the autocorrelations are usually plotted for many lags.
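The definition above can be sketched in a few lines of plain Python. This is a minimal illustration, not the handbook's or Dataplot's implementation: the function name `autocorrelation` and the five-point `series` are hypothetical, and the series is assumed to be equi-spaced.

```python
from typing import Sequence


def autocorrelation(y: Sequence[float], k: int) -> float:
    """Lag-k sample autocorrelation r_k of an equi-spaced series y."""
    n = len(y)
    if not 0 <= k < n:
        raise ValueError("lag k must satisfy 0 <= k < len(y)")
    mean = sum(y) / n
    # Denominator: sum of squared deviations over all N observations.
    denom = sum((v - mean) ** 2 for v in y)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Numerator: products of deviations for the N - k pairs that are k steps apart.
    numer = sum((y[i] - mean) * (y[i + k] - mean) for i in range(n - k))
    return numer / denom


# Hypothetical illustrative series (not taken from the handbook).
series = [1.0, 2.0, 3.0, 4.0, 5.0]
r0 = autocorrelation(series, 0)  # lag 0 is always 1
r1 = autocorrelation(series, 1)  # 4 / 10 = 0.4 for this series
```

Note that the numerator sums over only $N - k$ pairs while the denominator uses all $N$ observations, so the estimate shrinks toward zero as the lag grows relative to the series length.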
Sample Output | Dataplot generated the following autocorrelation output using the LEW.DAT data set:

THE LAG-ONE AUTOCORRELATION COEFFICIENT OF THE
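Questions | The autocorrelation function can be used to answer the following questions:
1. Was this sample data set generated from a random process?
2. Would a time series model or a non-linear model be more appropriate for these data than a simple constant plus error model?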
Importance | Randomness is one of the key assumptions in determining if a univariate statistical process is in control. If the assumptions of constant location and scale, randomness, and fixed distribution are reasonable, then the univariate process can be modeled as:

$$Y_i = A_0 + E_i$$

where $A_0$ is a constant and $E_i$ is an error term.

If the randomness assumption is not valid, then a different model needs to be used. This will typically be either a time series model or a non-linear model (with time as the independent variable).
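To make the contrast concrete, the following sketch simulates a series that follows the constant plus error model and one that does not, then compares their lag 1 autocorrelations. Everything specific here is an assumption for illustration: the constant `a0 = 10.0`, the Gaussian errors, the sinusoid with a period of 50 observations, the noise levels, and the fixed seed. None of it comes from the handbook's data sets.

```python
import math
import random


def lag1_autocorrelation(y):
    """Lag-1 sample autocorrelation of an equi-spaced series."""
    n = len(y)
    mean = sum(y) / n
    denom = sum((v - mean) ** 2 for v in y)
    numer = sum((y[i] - mean) * (y[i + 1] - mean) for i in range(n - 1))
    return numer / denom


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 1000
a0 = 10.0  # assumed constant location A_0 (illustrative value)

# Constant plus error model: Y_i = A_0 + E_i with independent errors.
random_series = [a0 + rng.gauss(0.0, 1.0) for _ in range(n)]

# A non-random alternative: a sinusoid in time plus small noise.
sinusoid_series = [
    a0 + math.sin(2 * math.pi * i / 50) + rng.gauss(0.0, 0.1) for i in range(n)
]

r1_random = lag1_autocorrelation(random_series)
r1_sinusoid = lag1_autocorrelation(sinusoid_series)
print(f"lag-1 autocorrelation, random series:   {r1_random:+.3f}")
print(f"lag-1 autocorrelation, sinusoid series: {r1_sinusoid:+.3f}")
```

Under these assumptions the random series should give a lag 1 autocorrelation near zero, while the sinusoidal series should give a value close to one, which is the kind of signal that points toward a time series or non-linear model.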
Related Techniques | Autocorrelation Plot, Run Sequence Plot, Lag Plot, Runs Test
Case Study | The heat flow meter data demonstrate the use of autocorrelation in determining if the data are from a random process.

The beam deflection data demonstrate the use of autocorrelation in developing a non-linear sinusoidal model.
Software | The autocorrelation capability is available in most general-purpose statistical software programs, including Dataplot.