Augmented Dickey-Fuller ADF test Manual, using Python, R, STATA
Learn how to perform the Augmented Dickey-Fuller (ADF) test manually to check stationarity in time series data. This guide explains the ADF test step by step, with practical examples in Python, R, and STATA. It is aimed at students, researchers, and data analysts who want to understand unit root testing and strengthen their econometrics and time series analysis skills. The ADF test is one of the most widely used tests for checking stationarity in time series analysis.
Econometrics is a core subject for research analysis in many fields, especially economics and finance, at the Masters, M.Phil and PhD levels. It is taught at these levels in universities around the globe, such as Bonn, Freie, Konstanz, DIW, PU, QAU, AIOU, MU, DU and many others.
Introduction
The ADF test helps us figure out if a time series is stable over time or if it changes in some way. A stable (stationary) series means its average, spread, and how values relate to each other don’t shift as time passes. In contrast, a non-stationary series changes in these aspects, like a trend that keeps going up or down.
The Augmented Dickey-Fuller (ADF) test checks if a time series has a “unit root,” which basically means it’s non-stationary.
- The test’s null hypothesis assumes the series is non-stationary (has a unit root).
- The alternative hypothesis suggests the series is stationary (no unit root).
Stationarity is important because many forecasting models, like ARIMA, rely on data that doesn’t change its basic properties over time. If you use non-stationary data without fixing it, the model’s predictions can be unreliable and misleading.
The “Augmented” in ADF means the test adds lagged differences of the series to the regression, so it can handle more complex patterns in the data, such as past values influencing future values beyond a single time step.
Types of ADF Model
The Augmented Dickey-Fuller (ADF) test has three common model specifications that differ based on whether they include a constant and/or a trend, depending on the nature of the time series data.
1. Model without constant or trend
2. Model with constant only (drift)
3. Model with constant and trend
Model 1: No Constant, No Trend (Random Walk)
- Use when the series fluctuates around zero with no trend. This applies to already demeaned or differenced data.
- It assumes no drift or trend, is the most restrictive model, and is rarely used with raw economic or financial data.
- Tests for a zero-mean random walk; null hypothesis: unit root with no drift or trend vs. alternative: stationary around zero.
Model 2: With Constant Only (Drift)
- Use when the series fluctuates around a non-zero mean with no clear trend, typical for financial returns or difference-stationary series.
- Includes a constant term capturing drift (average change).
- Most commonly applied model; appropriate for series with wandering means but no deterministic trend.
- Null hypothesis: unit root with drift; alternative: stationary around a non-zero mean.
Model 3: With Constant and Trend
- Use when the series shows a clear upward or downward deterministic trend, such as macroeconomic variables in levels (GDP, population, prices).
- Includes both constant and a time trend term.
- The most general and conservative model, suitable when a trend may be present.
- Null hypothesis: unit root with drift and trend; alternative: stationary around a deterministic trend.
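The three specifications differ only in the deterministic terms added to the test regression. The sketch below illustrates this, assuming NumPy is available. It omits the lagged differences for brevity, so it is the simple Dickey-Fuller regression rather than a full ADF test. The helper name `df_regression`, the model labels "n", "c" and "ct", and the simulated random walk are illustrative choices, not part of the original article.

```python
import numpy as np

def df_regression(y, model="c"):
    """Fit dy_t = gamma*y_{t-1} [+ alpha] [+ beta*t] + e_t by OLS.

    model: "n" (no constant), "c" (constant only), "ct" (constant + trend).
    Returns (gamma_hat, t_stat). Hypothetical helper for illustration only.
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)          # dependent variable: first difference
    y_lag = y[:-1]           # regressor of interest: lagged level
    cols = [y_lag]
    if model in ("c", "ct"):
        cols.append(np.ones_like(y_lag))                          # constant
    if model == "ct":
        cols.append(np.arange(1, len(y_lag) + 1, dtype=float))    # time trend
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ coef
    dof = len(dy) - X.shape[1]                # observations minus parameters
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[0], coef[0] / np.sqrt(cov[0, 0])

# Simulated random walk (assumed example data)
rng = np.random.default_rng(0)
walk = np.cumsum(rng.normal(size=200))
for model in ("n", "c", "ct"):
    gamma_hat, t_stat = df_regression(walk, model)
    print(model, round(gamma_hat, 4), round(t_stat, 3))
```

The t-ratio on the lagged level has to be compared with Dickey-Fuller critical values for the chosen specification, not with the usual Student t table.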
Why Model Choice Matters
- Models 1 and 2 test stationarity around a constant mean (zero or non-zero).
- Model 3 tests stationarity around a deterministic trend, which is essential when the series has a trending behavior.
- Using a model without appropriate deterministic components may lead to false conclusions (wrong rejection or low power).
Practical Decision Framework
Visual Inspection:
- Fluctuates around zero, no trend → Model 1
- Fluctuates around a non-zero mean, no trend → Model 2
- Shows a clear trend → Model 3
Statistical Strategy:
- Start with the most general model (constant + trend).
- If the trend coefficient is insignificant, simplify to Model 2 or Model 1.
- Use information criteria (AIC/BIC) to compare fits.
Economic Context:
- Stock prices: usually tested with constant and trend.
- Interest rates: typically with constant only.
- GDP growth: usually with constant only.
Example for Manual Calculation
Year | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 |
Yt | 10 | 12 | 15 | 20 | 22 | 25 | 28 | 30 | 35 |
Yt | ∆Yt = Yt – Yt-1 | Yt-1 | Yt-1 x ∆Yt | Y²t-1 |
10 | | | | |
12 | 2 | 10 | 20 | 100 |
15 | 3 | 12 | 36 | 144 |
20 | 5 | 15 | 75 | 225 |
22 | 2 | 20 | 40 | 400 |
25 | 3 | 22 | 66 | 484 |
28 | 3 | 25 | 75 | 625 |
30 | 2 | 28 | 56 | 784 |
35 | 5 | 30 | 150 | 900 |
Sum | | | ∑(Yt-1 x ∆Yt) = 518 | ∑(Y²t-1) = 3662 |
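Estimation of Slope (OLS Regression)
The example uses Model 1 (no constant, no trend) without lagged differences: ∆Yt = γ Yt-1 + ɛt
γ̂ = ∑(Yt-1 x ∆Yt) / ∑(Y²t-1) = 518 / 3662 ≈ 0.1414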
Calculation of Residuals
γ̂ | Yt-1 | ∆Yt | ∆Ŷt = γ̂ Yt-1 | ɛ̂ = ∆Yt – ∆Ŷt | ɛ̂² |
0.1414 | 10 | 2 | 1.414 | 0.586 | 0.343396 |
0.1414 | 12 | 3 | 1.6968 | 1.3032 | 1.69833 |
0.1414 | 15 | 5 | 2.121 | 2.879 | 8.288641 |
0.1414 | 20 | 2 | 2.828 | -0.828 | 0.685584 |
0.1414 | 22 | 3 | 3.1108 | -0.1108 | 0.012277 |
0.1414 | 25 | 3 | 3.535 | -0.535 | 0.286225 |
0.1414 | 28 | 2 | 3.9592 | -1.9592 | 3.838465 |
0.1414 | 30 | 5 | 4.242 | 0.758 | 0.574564 |
∑ ɛ̂² = 15.72748 |
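Standard Error of γ̂
With n = 8 observations and one estimated coefficient, the residual variance is:
σ̂² = ∑ ɛ̂² / (n – 1) = 15.72748 / 7 ≈ 2.2468
SE(γ̂) = √(σ̂² / ∑(Y²t-1)) = √(2.2468 / 3662) ≈ 0.0248
Test Statistic
τ = γ̂ / SE(γ̂) = 0.1414 / 0.0248 ≈ 5.71
Decision
Critical value at 5% (no constant, small sample) ≈ -1.95
Our statistic ≈ +5.71 (positive)
Since the statistic is greater than the critical value (it is not more negative), we fail to reject H₀.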
Conclusion
The series has a unit root → it is non-stationary.
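The manual steps above can be checked with a few lines of plain Python. This is a minimal sketch using only the standard library. It assumes the no-constant regression without lagged differences and n – 1 = 7 degrees of freedom for the residual variance, which gives a statistic of about 5.7. Variable names are illustrative.

```python
import math

y = [10, 12, 15, 20, 22, 25, 28, 30, 35]

# First differences and lagged levels (8 usable observations)
dy = [b - a for a, b in zip(y, y[1:])]
y_lag = y[:-1]

# OLS slope without a constant: gamma = sum(x*y) / sum(x^2)
sum_xy = sum(l * d for l, d in zip(y_lag, dy))   # 518
sum_xx = sum(l * l for l in y_lag)               # 3662
gamma_hat = sum_xy / sum_xx

# Residuals and their sum of squares
residuals = [d - gamma_hat * l for l, d in zip(y_lag, dy)]
ssr = sum(e * e for e in residuals)

# Standard error of gamma_hat (assumed dof: observations minus one parameter)
dof = len(dy) - 1
se_gamma = math.sqrt(ssr / dof / sum_xx)

# Dickey-Fuller test statistic
t_stat = gamma_hat / se_gamma

print("gamma_hat:", round(gamma_hat, 4))
print("SSR:", round(ssr, 4))
print("SE:", round(se_gamma, 4))
print("t statistic:", round(t_stat, 2))
```

Small differences from the hand calculation come from rounding γ̂ to 0.1414 before computing the residuals.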

How to Calculate ADF Full Version Test in Python
# Import libraries
import pandas as pd
from statsmodels.tsa.stattools import adfuller
# Your dataset
data = [10, 12, 15, 20, 22, 25, 28, 30, 35]
# Convert to pandas Series
y = pd.Series(data)
# Run Augmented Dickey-Fuller test
# (adfuller includes a constant by default: regression="c")
result = adfuller(y)
# Print results
print("ADF Statistic:", result[0])
print("p-value:", result[1])
print("Used Lags:", result[2])
print("Number of Observations:", result[3])
print("Critical Values:", result[4])
# Interpretation
if result[1] < 0.05:
    print("Reject H0: Series is stationary")
else:
    print("Fail to reject H0: Series is non-stationary (unit root present)")
How to Calculate ADF Full Version Test in R
# Install package if not already installed
install.packages("tseries")
# Load the library
library(tseries)
# Your dataset
y <- c(10, 12, 15, 20, 22, 25, 28, 30, 35)
# Run Augmented Dickey-Fuller test
adf_result <- adf.test(y)
# Print the result
print(adf_result)
How to Calculate ADF Full Version Test in STATA
clear
input year Yt
2000 10
2001 12
2002 15
2003 20
2004 22
2005 25
2006 28
2007 30
2008 35
end
tsset year
dfuller Yt, lags(1)
What should we do for Stationarity?
Many econometric models, such as ARIMA, VAR and OLS regressions on time series, require stationarity. If a time series is non-stationary, its mean, variance and autocorrelation change over time, which makes it unsuitable for such models.
The following measures can make a series stationary:
1. Differencing
Take the first or second difference:
First Difference: ∆Yt = Yt – Yt-1
Second Difference: ∆²Yt = ∆Yt – ∆Yt-1 = Yt – 2Yt-1 + Yt-2
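Differencing can be sketched in plain Python as follows. The helper name `difference` is illustrative, and the data are the nine observations from the manual example.

```python
def difference(series, order=1):
    """Return the series differenced `order` times (hypothetical helper)."""
    out = list(series)
    for _ in range(order):
        # Each pass replaces the series with consecutive changes
        out = [b - a for a, b in zip(out, out[1:])]
    return out

y = [10, 12, 15, 20, 22, 25, 28, 30, 35]
first = difference(y)       # [2, 3, 5, 2, 3, 3, 2, 5]
second = difference(y, 2)   # [1, 2, -3, 1, 0, -1, 3]
print(first)
print(second)
```

Each round of differencing shortens the series by one observation, which matters in small samples like this one.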
2. Transformation (For Non-Constant Variance)
If the size of the series' ups and downs is not consistent over time (called heteroscedasticity), you can apply a transformation to make the variance more stable. Common methods include taking the logarithm or the square root of the values, or using a more flexible approach called the Box-Cox transformation, which covers both log and square root transformations as special cases. These transformations shrink the bigger values more than the smaller ones, which evens out the variation across the data. This is especially helpful for series that grow exponentially.
How to use it: First, apply one of these transformations to your data. But since the series might still show a trend after that, you may also need to take differences (for example, difference the log-transformed series) to fully stabilize it.
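As a small illustration of transforming and then differencing, the sketch below uses a made-up series that grows by 10% per period (the numbers are assumptions, not data from the article). The raw differences keep growing, while the differences of the logs stay constant.

```python
import math

# Hypothetical series growing 10% each period
y = [100.0, 110.0, 121.0, 133.1, 146.41]

# Raw first differences grow with the level of the series
raw_diff = [b - a for a, b in zip(y, y[1:])]

# Log transform, then first difference: approximately the growth rate
log_y = [math.log(v) for v in y]
log_diff = [b - a for a, b in zip(log_y, log_y[1:])]

print([round(d, 2) for d in raw_diff])
print([round(d, 4) for d in log_diff])
```

The log transformation only works for strictly positive values, so series containing zeros or negatives need a different treatment.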
3. Detrending
If a time series is non-stationary only because it has a clear, predictable trend, such as a straight line, you can deal with this by modelling that trend and then removing it.
Y’t = Yt – (α + βt)
By taking away the trend, you’re left with the more stable, stationary parts of the series that fluctuate around a constant level.
How to do it: You fit a simple linear or polynomial regression to the time points and then subtract the trend values you’ve estimated from the original data.
Keep in mind, though, this works well if the trend is deterministic (fixed and predictable). For more random or unpredictable trends, differencing the data is usually a better choice because it handles such stochastic trends more effectively.
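A minimal sketch of linear detrending in plain Python, assuming a time index t = 0, 1, 2, ... and an ordinary least squares fit of Yt on t. The helper name `detrend_linear` is illustrative, and the data are the nine observations from the manual example.

```python
def detrend_linear(series):
    """Fit Yt = alpha + beta*t by OLS and return (alpha, beta, detrended)."""
    n = len(series)
    t = list(range(n))                       # assumed time index 0..n-1
    t_mean = sum(t) / n
    y_mean = sum(series) / n
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, series))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    beta = sxy / sxx
    alpha = y_mean - beta * t_mean
    # Subtract the fitted trend from each observation
    detrended = [yi - (alpha + beta * ti) for ti, yi in zip(t, series)]
    return alpha, beta, detrended

y = [10, 12, 15, 20, 22, 25, 28, 30, 35]
alpha, beta, detrended = detrend_linear(y)
print(round(alpha, 3), round(beta, 3))
print([round(d, 3) for d in detrended])
```

The detrended values sum to zero by construction, so any remaining pattern in them is what a stationarity test would then examine.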
4. Seasonal Differencing
If your time series shows clear seasonal patterns, like sales always peaking every December, the series isn’t stable across those seasons.
Seasonal differencing helps fix this by subtracting from each value the value at the same point in the previous season. For example, with monthly data that has yearly seasonality, you’d subtract the value from 12 months ago:
Y’t = Yt – Yt-m
(where m is the seasonal period, like 12 for monthly data).
It removes repeating seasonal effects by comparing each point to its counterpart in the previous cycle.
Seasonal differencing is often combined with regular differencing and is a key step in models like SARIMA.
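Seasonal differencing can be sketched in a few lines. The helper name `seasonal_difference` and the quarterly data (period m = 4, with the level rising by 2 each year) are assumptions made for illustration.

```python
def seasonal_difference(series, m):
    """Return Yt - Yt-m for t = m, ..., len(series) - 1 (hypothetical helper)."""
    return [series[i] - series[i - m] for i in range(m, len(series))]

# Hypothetical quarterly data: same seasonal pattern, level rising by 2 per year
pattern = [10, 20, 15, 30]
quarterly = [value + 2 * year for year in range(3) for value in pattern]

print(quarterly)
print(seasonal_difference(quarterly, 4))   # [2, 2, 2, 2, 2, 2, 2, 2]
```

The repeating pattern disappears and only the year-on-year change remains. The first m observations are lost in the process.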
5. Decomposition
Decomposition takes a detailed approach by splitting the series into its main parts: trend, seasonality, and residuals (the leftover noise).
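In the additive model: Yt = Tt + St + Rt
In the multiplicative model: Yt = Tt x St x Rt
(where Tt is the trend, St the seasonal component and Rt the residual).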
After breaking the series down, the residual part should be stable and random, which makes it easier to model.
How to use it: Apply decomposition methods like STL to separate these components, then focus on modelling the stationary residuals.
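To make the additive split concrete, here is a rough sketch of a classical decomposition in plain Python. Note that this is the simple moving-average method, not STL. It estimates the trend with a centred moving average and the seasonal component with per-season averages of the detrended values. The function name `classical_additive` and the test series are illustrative assumptions.

```python
def classical_additive(series, m):
    """Classical additive decomposition with seasonal period m.

    Returns (trend, seasonal, resid); trend and resid are None at the
    edges where the centred moving average is undefined.
    """
    n = len(series)
    half = m // 2
    trend = [None] * n
    for i in range(half, n - half):
        window = series[i - half:i + half + 1]
        if m % 2:   # odd period: plain centred average of m points
            trend[i] = sum(window) / m
        else:       # even period: 2 x m average, half weight on the end points
            trend[i] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / m
    detrended = [None if t is None else y - t for y, t in zip(series, trend)]
    # Average the detrended values season by season
    means = []
    for s in range(m):
        vals = [d for i, d in enumerate(detrended) if d is not None and i % m == s]
        means.append(sum(vals) / len(vals))
    offset = sum(means) / m          # centre the seasonal effects on zero
    seasonal = [means[i % m] - offset for i in range(n)]
    resid = [None if t is None else y - t - s
             for y, t, s in zip(series, trend, seasonal)]
    return trend, seasonal, resid

# Hypothetical series: linear trend plus a quarterly pattern, no noise
pattern = [3.0, -1.0, -4.0, 2.0]
series = [10 + 2 * i + pattern[i % 4] for i in range(16)]
trend, seasonal, resid = classical_additive(series, 4)
print(seasonal[:4])
print([round(r, 6) for r in resid if r is not None])
```

With real data the residuals will not be zero, and they are the part you would then test for stationarity. STL is more flexible because it lets the seasonal pattern change over time.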
