2. What is the KPSS Test, How to Perform KPSS in Python, R, and STATA
In this blog post, we explore the KPSS (Kwiatkowski–Phillips–Schmidt–Shin) test for stationarity, a crucial tool in time series analysis within econometrics. Understanding whether a series is stationary is fundamental before applying forecasting models or regression techniques, as non-stationary data can lead to misleading results.
This post provides a clear explanation of the KPSS test along with practical applications in Python, R, and STATA. We also highlight the importance of econometrics in research, with a special focus on how the KPSS test and the Augmented Dickey-Fuller (ADF) test complement each other to strengthen empirical analysis and ensure robust findings in economics, finance, and social science studies.
Econometrics is a core subject for research analysis across many fields, particularly economics and finance, at the Masters, M.Phil., and PhD levels. It is taught at these levels in major universities around the globe, including Bonn, Freie, Konstanz, DIW, PU, QAU, AIOU, MU, DU, and many others.
The KPSS test, named after Kwiatkowski, Phillips, Schmidt, and Shin, is a statistical method used in econometrics to determine whether a time series is stationary. Stationarity means that the series’ statistical properties, like its mean and variance, do not change over time.
How Does KPSS Differ from the ADF Test?
The KPSS test is unique because it reverses the usual setting of the Augmented Dickey-Fuller (ADF) test, which is more commonly used. In the ADF test, the null hypothesis assumes the series is non-stationary (has a unit root), while in the KPSS test, the null hypothesis assumes the series is stationary.
KPSS Null Hypothesis (H₀): The series is stationary (either around a constant mean or a deterministic trend).
KPSS Alternative Hypothesis (H₁): The series is non-stationary (has a unit root).
Why Use the KPSS Test?
Because the KPSS test assumes the opposite null hypothesis compared to the ADF test, using both tests together allows for stronger, more reliable conclusions about stationarity:
- If the ADF test rejects its null (indicating stationarity) and the KPSS test fails to reject its null (also indicating stationarity), you gain confidence that the series is indeed stationary.
- If the tests disagree, it signals more complex types of non-stationarity, such as sudden shifts in the mean or rapid adjustments, warranting further analysis.
Versions of the KPSS Test
The KPSS test can be applied in two main ways to assess stationarity:
Level Stationarity Test: Assumes the time series fluctuates around a constant mean.
Trend Stationarity Test: Assumes the time series is stationary around a deterministic trend over time.
This makes the KPSS test a valuable complement to traditional tests like the ADF, helping researchers get a fuller picture of a series’ behavior.
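For example, in Python's statsmodels library the two versions correspond to the regression argument of the kpss function ("c" for level stationarity, "ct" for trend stationarity). The sketch below uses a simulated trend-stationary series purely for illustration; the seed and coefficients are arbitrary assumptions, not part of the worked example later in this post.
import numpy as np
from statsmodels.tsa.stattools import kpss
rng = np.random.default_rng(0)
trend_series = 0.5 * np.arange(200) + rng.normal(size=200)  # stationary around a linear trend
# Level stationarity test: null = stationary around a constant mean
stat_c, p_c, _, _ = kpss(trend_series, regression="c", nlags="auto")
# Trend stationarity test: null = stationary around a deterministic trend
stat_ct, p_ct, _, _ = kpss(trend_series, regression="ct", nlags="auto")
print("Level test:", round(stat_c, 3), "p =", round(p_c, 3))    # typically rejects
print("Trend test:", round(stat_ct, 3), "p =", round(p_ct, 3))  # typically does not reject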
The Test Statistic
The KPSS statistic is calculated as:
KPSS = ∑St² / (T² × σ̂²), where the sum runs over t = 1, …, T.
Where:
St is the partial sum of the residuals, St = e₁ + e₂ + … + et. You first regress your series on a constant (for level stationarity) or on a constant and a trend (for trend stationarity) to obtain the residuals et, and then accumulate them.
T is the sample size (number of observations).
σ̂² is a consistent estimator of the long-run variance of the residuals. This is the tricky part in a manual calculation, as it involves selecting a lag truncation parameter to account for serial correlation. For simplicity, we will assume no serial correlation, so σ̂² is just the variance of the residuals, ∑et² / T.
Manual Calculation Example
Let’s use a simple, hypothetical dataset of 10 observations to test for level stationarity.
The Data
We have the following time series:
y | 6 | 7 | 5 | 4 | 8 | 7 | 8 | 9 | 10 | 10 |
Regress on a Constant and Find Residuals
Under the null of level stationarity, the model is:
yt = α + εt
The best estimate for α is the mean of y, which is ȳ = 74 / 10 = 7.4.
y | 6 | 7 | 5 | 4 | 8 | 7 | 8 | 9 | 10 | 10 | ∑y=74 |
t | y | et = y − ȳ | St | St² | et² |
1 | 6 | -1.4 | -1.4 | 1.96 | 1.96 |
2 | 7 | -0.4 | -1.8 | 3.24 | 0.16 |
3 | 5 | -2.4 | -4.2 | 17.64 | 5.76 |
4 | 4 | -3.4 | -7.6 | 57.76 | 11.56 |
5 | 8 | 0.6 | -7 | 49 | 0.36 |
6 | 7 | -0.4 | -7.4 | 54.76 | 0.16 |
7 | 8 | 0.6 | -6.8 | 46.24 | 0.36 |
8 | 9 | 1.6 | -5.2 | 27.04 | 2.56 |
9 | 10 | 2.6 | -2.6 | 6.76 | 6.76 |
10 | 10 | 2.6 | 0 | 0 | 6.76 |
∑St² = 264.4 | ∑et² = 36.4 |
Note: The final St should always be 0 (or very close) when using OLS residuals from a regression that includes a constant.
Calculation of the Variance of the Residuals and the Test Statistic
Assuming no serial correlation, the variance of the residuals is σ̂² = ∑et² / T = 36.4 / 10 = 3.64.
The KPSS statistic is then KPSS = ∑St² / (T² × σ̂²) = 264.4 / (10² × 3.64) = 264.4 / 364 ≈ 0.7264.
Compare with Critical Value
We need to compare our calculated statistic (0.7264) with the critical value from the KPSS distribution table.
For level stationarity at a 5% significance level, the asymptotic critical value is approximately 0.463. (The published KPSS critical values are asymptotic; no exact small-sample table exists for T = 10, so this comparison is only illustrative.)
Conclusion:
Since our test statistic (0.7264) is greater than the critical value (0.463), we reject the null hypothesis.
Interpretation: There is statistical evidence to suggest that this time series is not level-stationary.
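As a quick check, the hand calculation above can be reproduced in a few lines of Python. This is a minimal sketch that, like the manual example, assumes no serial correlation and uses only numpy:
import numpy as np
y = np.array([6, 7, 5, 4, 8, 7, 8, 9, 10, 10], dtype=float)
T = len(y)
e = y - y.mean()             # residuals from regressing on a constant (mean = 7.4)
S = np.cumsum(e)             # partial sums St
sigma2 = np.sum(e**2) / T    # residual variance, no serial-correlation correction (= 3.64)
kpss_stat = np.sum(S**2) / (T**2 * sigma2)
print(round(kpss_stat, 4))   # approximately 0.7264, matching the hand calculation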
Note: the KPSS test depends heavily on the lag truncation used to estimate the long-run variance, which is difficult to handle in a manual calculation, so in practice you will run the KPSS test in Python, R, or STATA. Below, the commands for the same data are given for Python, R, and STATA.

KPSS Calculation for the Above Data in Python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import kpss
# Given data
y = [6, 7, 5, 4, 8, 7, 8, 9, 10, 10]
# Run KPSS test (default = level stationarity, i.e., around mean)
statistic, p_value, n_lags, critical_values = kpss(y, regression="c", nlags="auto")
# Print results
print("KPSS Test Statistic:", statistic)
print("p-value:", p_value)
print("Number of Lags:", n_lags)
print("Critical Values:", critical_values)
# Note: statsmodels interpolates the p-value from a lookup table, so it is
# bounded between 0.01 and 0.10, and an InterpolationWarning may appear
# for very small samples like this one.
# Interpretation
if p_value < 0.05:
    print("Reject Null Hypothesis: The series is likely NON-STATIONARY.")
else:
    print("Fail to Reject Null Hypothesis: The series is likely STATIONARY.")
KPSS Calculation for the Above Data in R
# Install the package if not already installed
# install.packages("tseries")
# Load the library
library(tseries)
# Given data
y <- c(6, 7, 5, 4, 8, 7, 8, 9, 10, 10)
# Run KPSS test (stationarity around mean)
result <- kpss.test(y, null = "Level")
# Print results
print(result)
# If you want to test stationarity around a trend:
result_trend <- kpss.test(y, null = "Trend")
print(result_trend)
KPSS Calculation for the Above Data in STATA
* KPSS is not installed by default in STATA, so first install
ssc install kpss
* Step 1: Enter your data (clear any data already in memory first)
clear
input y
6
7
5
4
8
7
8
9
10
10
end
* Step 2: Create a simple time index
gen t = _n
* Step 3: Declare the dataset as time series
tsset t
* Step 4: Run KPSS test
* For level stationarity (around a mean)
kpss y, notrend
* For trend stationarity (around a deterministic trend); this is the default in the user-written kpss command, so no option is needed
kpss y
Why Checking Stationarity with KPSS is Important
Stationarity: The Foundation of Time Series Analysis
Many econometric models and procedures, such as ARIMA, VAR, cointegration tests, and Granger causality, rely on the key assumption that the data series is stationary. A stationary time series has a constant mean, variance, and autocovariance over time. Without stationarity, models risk producing spurious results, meaning relationships that appear significant but are actually meaningless.
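To see the difference in practice, here is a small illustrative sketch (simulated data with an arbitrary seed) contrasting a stationary white-noise series with a random walk using the KPSS test:
import numpy as np
from statsmodels.tsa.stattools import kpss
rng = np.random.default_rng(42)
white_noise = rng.normal(size=500)             # constant mean and variance over time
random_walk = np.cumsum(rng.normal(size=500))  # mean and variance drift over time
for name, series in [("white noise", white_noise), ("random walk", random_walk)]:
    stat, p_value, _, _ = kpss(series, regression="c", nlags="auto")
    print(name, "KPSS statistic =", round(stat, 3), "p-value =", round(p_value, 3))
# The white-noise series typically fails to reject the null of stationarity,
# while the random walk typically rejects it.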
KPSS as a Complement to Other Tests
The KPSS test stands out because its null hypothesis assumes the series is stationary, unlike the ADF or PP tests that assume non-stationarity as their null. Using KPSS alongside these tests is a best practice called confirmatory testing:
- If the ADF test fails to reject its null of non-stationarity and the KPSS test rejects its null of stationarity, there is a strong case for a unit root (non-stationarity).
- If both tests agree, confidence in the result increases significantly.
This complementary approach reduces the chance of drawing the wrong conclusions about your data.
Guiding Model Choice
Knowing if your series is stationary determines how you model it:
- Stationary series: Model directly (e.g., ARMA).
- Non-stationary series: May require differencing (ARIMA) or checking for cointegration with other series.
Correct identification prevents model misspecification.
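As a rough sketch of how this decision can feed into model choice (the simulated series, the 5% threshold, and the ARIMA order below are illustrative assumptions, not a recommendation for the small dataset used earlier):
import numpy as np
from statsmodels.tsa.stattools import kpss
from statsmodels.tsa.arima.model import ARIMA
rng = np.random.default_rng(1)
series = np.cumsum(rng.normal(size=300))   # a simulated non-stationary series
_, p_value, _, _ = kpss(series, regression="c", nlags="auto")
d = 1 if p_value < 0.05 else 0             # difference once if KPSS rejects stationarity
model = ARIMA(series, order=(1, d, 0)).fit()
print("Chosen d =", d, "AIC =", round(model.aic, 1))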
Avoiding Spurious Regression
Regressing one non-stationary series on another can produce deceptively high statistical measures, falsely suggesting a meaningful relationship. By applying stationarity tests like KPSS (with ADF/PP), you confirm if differencing or transformations are needed, thus avoiding misleading results.
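The classic demonstration is to regress one simulated random walk on another, completely independent one; a minimal sketch (the seed and sample size are arbitrary):
import numpy as np
import statsmodels.api as sm
rng = np.random.default_rng(7)
x = np.cumsum(rng.normal(size=500))   # independent random walk
y = np.cumsum(rng.normal(size=500))   # another, unrelated random walk
ols = sm.OLS(y, sm.add_constant(x)).fit()
print("R-squared:", round(ols.rsquared, 2), "t-statistic on x:", round(ols.tvalues[1], 1))
# Despite there being no true relationship, the R-squared and t-statistic are
# often large, which is exactly the spurious-regression problem that
# stationarity tests such as KPSS and ADF help to avoid.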
Implications for Forecasting and Policy
Knowing the stationarity status helps determine whether to forecast in levels or growth rates:
- Non-stationary data (like GDP or stock prices) often grow over time; forecasts use growth rates.
- Stationary data (such as inflation or interest rate spreads) are forecast in levels.
- KPSS aids in making these crucial decisions.
The Importance of Stationarity
Stationarity ensures that the characteristics of your time series are consistent over time, specifically regarding mean, variance, and covariance. Using non-stationary data can lead to:
- Spurious regression: False correlations.
- Unreliable estimates: Biased coefficients and statistics.
- Poor predictive performance: Inaccurate and invalid models.
Why KPSS is Specifically Valuable
A Robust Partner to the ADF Test
The KPSS test’s unique value lies in its opposite null hypothesis compared to ADF:
Test | Null Hypothesis (H₀) | Alternative (H₁) |
ADF | Non-Stationary (unit root) | Stationary |
KPSS | Stationary | Non-Stationary (unit root) |
Using both together gives a clearer picture through four possible scenarios:
Scenario | ADF Result | KPSS Result | Interpretation |
1. Clearly Stationary | Reject H₀ | Do not reject H₀ | Strong evidence of stationarity |
2. Clearly Non-Stationary | Do not reject H₀ | Reject H₀ | Strong evidence of unit root |
3. Difference-Stationary | Do not reject H₀ | Do not reject H₀ | Series needs differencing (stochastic trend) |
4. Trend-Stationary | Reject H₀ | Reject H₀ | Series stationary around a deterministic trend; needs de-trending |
This partnership is essential for choosing the right data transformation.
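A small helper that runs both tests and maps the outcome onto the four scenarios in the table above can make this routine. The function name and the 5% threshold below are illustrative choices, and the scenario labels simply follow the table:
import numpy as np
from statsmodels.tsa.stattools import adfuller, kpss
def classify_stationarity(series, alpha=0.05):
    adf_p = adfuller(series)[1]                              # ADF: H0 = unit root
    kpss_p = kpss(series, regression="c", nlags="auto")[1]   # KPSS: H0 = stationary
    adf_rejects = adf_p < alpha                              # evidence of stationarity
    kpss_rejects = kpss_p < alpha                            # evidence against stationarity
    if adf_rejects and not kpss_rejects:
        return "1. Clearly stationary"
    if not adf_rejects and kpss_rejects:
        return "2. Clearly non-stationary (unit root)"
    if not adf_rejects and not kpss_rejects:
        return "3. Difference-stationary (see table above)"
    return "4. Trend-stationary (see table above)"
# Example with a simulated random walk, which should fall into scenario 2
print(classify_stationarity(np.cumsum(np.random.default_rng(3).normal(size=300))))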
Clarifying Types of Non-Stationarity
- If ADF says non-stationary but KPSS says stationary (scenario 3), this indicates a stochastic trend; differencing the series (subtracting the previous observation from each value) is required to achieve stationarity.
- If ADF says stationary but KPSS rejects stationarity (scenario 4), this suggests a deterministic trend where de-trending (removing linear trend) is the correct treatment.
Applying the wrong treatment can reduce forecast accuracy, so KPSS helps avoid costly errors.
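In code, the two treatments look like this (a sketch with simulated series; which treatment applies to real data depends on the test outcome described above):
import numpy as np
from statsmodels.tsa.stattools import kpss
rng = np.random.default_rng(5)
t = np.arange(300)
stochastic = np.cumsum(rng.normal(size=300))      # stochastic trend: difference it
deterministic = 0.05 * t + rng.normal(size=300)   # deterministic trend: de-trend it
differenced = np.diff(stochastic)                 # first differences
detrended = deterministic - np.polyval(np.polyfit(t, deterministic, 1), t)  # remove fitted linear trend
for name, series in [("differenced", differenced), ("de-trended", detrended)]:
    _, p_value, _, _ = kpss(series, regression="c", nlags="auto")
    print(name, "KPSS p-value =", round(p_value, 3))  # both should now appear stationary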
Higher Power in Certain Cases
The KPSS test sometimes detects stationarity more reliably than ADF, especially when the series is close to non-stationary but actually stationary. It strengthens confidence when ADF’s evidence is weak.
Real-World Example: Analyzing GDP
An economist testing GDP would:
- Run the ADF test: if it fails to reject non-stationarity, GDP likely has a unit root.
- Run the KPSS test: if it also rejects stationarity, this confirms non-stationarity.
- Conclude that the series has a stochastic trend and needs differencing, i.e., model GDP growth rather than the level.
Summary
Using the KPSS test is crucial because it:
- Provides a vital robustness check alongside ADF.
- Helps diagnose the exact type of non-stationarity to apply the correct transformation.
- Improves model accuracy and forecast reliability by proper treatment of trends.
- Offers stronger evidence for stationarity when it fails to reject its null.
In short, ignoring KPSS in time series analysis means missing key diagnostic information that ensures your conclusions and models are trustworthy.
Related Articles
1. Augmented Dickey-Fuller ADF test Manual, using Python, R, STATA