Decoding Statistical Testing with a Model-Based Perspective
statistics
How to view classical statistical hypothesis testing in a unified model-based framework.
Author
Matt Bowers
Published
January 19, 2026
Ahh, statistical hypothesis testing—it’s a cornerstone of classical statistical inference. It can also seem a bit daunting when presented as a list of over 100 different methods from which the user is expected to choose the right procedure based on their application domain, data, and question. But it doesn’t have to be that way—there is a unifying perspective that can help us cut right through all that complexity.
It turns out that many traditional frequentist tests can be viewed as special cases of regression models.
This is a powerful and liberating idea, because it can help us transition from choosing from the zoo of canned methods to building our own bespoke analyses, tailored to our exact use case. Adopting this perspective will allow us to
- transcend rote memorization of which test should be applied where and instead reason from first principles
- make our assumptions clear and explicit
- make extensions like incorporating covariates and interactions more natural
So the plan for this post is to take a look at three of the most common statistical tests—the two-sample t-test, the ANOVA, and the chi-squared test for independence. For each test, we’ll simulate some data and implement it using the traditional test approach as well as a regression-based approach. I’m going to do a detailed analytical breakdown for the two-sample t-test to show the equivalence between the two approaches, leaving the analytical breakdowns of the other tests to you, dear reader, as exercises.
Let’s roll!
Two-Sample t-test
We’ll look at the t-test from two perspectives—the classical setup and a linear regression reformulation. In each case we’ll break the approach down into these items: data generating process, estimator, expectation and variance of the estimator, test statistic, and sampling distribution of the test statistic. You can use this kind of breakdown to understand pretty much any classical statistical test. In this case, the point is to clearly show that the classical t-test and the linear regression formulation yield identical tests.
The Classical t-test Approach
The data generating process
You have two populations or processes \(Y_0\) and \(Y_1\), and you want to know whether their true means \(\mu_0\) and \(\mu_1\) are equal. We assume that both processes are Gaussian with equal but unknown variance \(\sigma^2\):
\[ Y_0 \sim N(\mu_0, \sigma^2), \qquad Y_1 \sim N(\mu_1, \sigma^2) \]
You draw \(n_0\) samples from group 0 and \(n_1\) samples from group 1 for a total of \(n = n_0 + n_1\) samples, and compute the sample means \(\bar{Y}_0\) and \(\bar{Y}_1\).
The estimator
Your estimator for the difference in means is simply:
\[\hat{\delta} = \bar{Y}_1 - \bar{Y}_0\]
Expectation of the estimator
Since \(E[\bar{Y}_0] = \mu_0\) and \(E[\bar{Y}_1] = \mu_1\), we have:
\[E[\hat{\delta}] = \mu_1 - \mu_0\]
so \(\hat{\delta}\) is an unbiased estimator of the true difference in means.
Variance of the estimator
Because the two samples are independent,
\[\operatorname{Var}(\hat{\delta}) = \sigma^2\left(\frac{1}{n_0} + \frac{1}{n_1}\right)\]
We don’t know \(\sigma^2\), so we estimate it with the pooled sample variance
\[\hat{\sigma}_{\text{pooled}}^2 = \frac{(n_0 - 1)s_0^2 + (n_1 - 1)s_1^2}{n_0 + n_1 - 2}\]
where \(s_0^2\) and \(s_1^2\) are the sample variances of the two groups.
The test statistic
Standardizing \(\hat{\delta}\) by its estimated standard error gives
\[t = \frac{\bar{Y}_1 - \bar{Y}_0}{\sqrt{\hat{\sigma}_{\text{pooled}}^2 \left(\frac{1}{n_0} + \frac{1}{n_1}\right)}}\]
Sampling distribution of the test statistic
Under the null hypothesis \(H_0: \mu_1 = \mu_0\), this test statistic follows a Student’s t-distribution with \(n_0 + n_1 - 2\) degrees of freedom.
Having horrifying flashbacks to your intro to stats class yet? No worries. Let’s look at it from a new perspective.
The Regression Approach
The data generating process
We can express the exact same data generating process as a linear regression model. Stack all observations into a single length-\(n\) vector \(Y\) and create a dummy variable \(X \in \{0,1\}\) indexing which group each observation came from:
\[ Y = \beta_0 + \beta_1 X + \epsilon \]
where \(\epsilon \overset{iid}{\sim} N(0, \sigma^2)\).
The estimator
When we fit this model by least squares, the dummy coding means the fitted intercept is the group 0 sample mean and the fitted slope is the difference in sample means:
\[\hat{\beta}_0 = \bar{Y}_0, \qquad \hat{\beta}_1 = \bar{Y}_1 - \bar{Y}_0\]
So \(\hat{\beta}_1\) is exactly the \(\hat{\delta}\) from the classical setup.
The test statistic
The regression t-statistic for \(\hat{\beta}_1\) divides the estimate by its standard error:
\[t = \frac{\hat{\beta}_1}{\sqrt{\hat{\sigma}^2 / \sum_{i=1}^{n}(X_i - \bar{X})^2}}\]
where \(\hat{\sigma}^2\) is the residual variance. For our dummy variable, it turns out that:
- The residual variance \(\hat{\sigma}^2\) equals the pooled variance \(\hat{\sigma}_{\text{pooled}}^2\)
- The sum \(\sum_{i=1}^{n}(X_i - \bar{X})^2 = \frac{n_0 n_1}{n_0 + n_1}\)
Substituting these in, the standard error becomes \(\sqrt{\hat{\sigma}_{\text{pooled}}^2(1/n_0 + 1/n_1)}\), and the t-statistic is identical to the classical one.
Sampling distribution of the test statistic
Under the null hypothesis \(H_0: \beta_1 = 0\), this test statistic follows a Student’s t-distribution with \(n_0 + n_1 - 2\) degrees of freedom (the residual degrees of freedom from the regression).
The Punchline
See what just happened? The two approaches give us:
- The same point estimate: \(\hat{\delta} = \hat{\beta}_1 = \bar{Y}_1 - \bar{Y}_0\)
- The same standard error: \(\sqrt{\hat{\sigma}_{\text{pooled}}^2(1/n_0 + 1/n_1)}\)
- The same test statistic: \(t = \frac{\bar{Y}_1 - \bar{Y}_0}{\sqrt{\hat{\sigma}_{\text{pooled}}^2 (1/n_0 + 1/n_1)}}\)
- The same sampling distribution: \(t_{n_0+n_1-2}\)
- Therefore, the same p-value
In other words, these approaches are mathematically equivalent.
Implementation
Let’s simulate some data and implement both testing approaches.
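Here’s one way to set that up, using scipy.stats.ttest_ind for the classical test and a statsmodels OLS fit with a group dummy for the regression version; the seed, sample sizes, and group means below are just illustrative choices.

```python
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

# Simulate two groups with different means and equal variance
np.random.seed(42)
n0, n1 = 100, 100
y0 = np.random.normal(loc=10.0, scale=2.0, size=n0)  # group 0
y1 = np.random.normal(loc=11.0, scale=2.0, size=n1)  # group 1

# Classical two-sample t-test (equal variances)
t_stat, p_val = stats.ttest_ind(y1, y0, equal_var=True)
print(f"Classical t-test:       t = {t_stat:.6f}, p-value = {p_val:.6f}")

# Linear regression with a dummy variable for group membership
df = pd.DataFrame({
    "y": np.concatenate([y0, y1]),
    "x": np.concatenate([np.zeros(n0), np.ones(n1)]),
})
ols_model = smf.ols("y ~ x", data=df).fit()
print(f"Regression dummy coeff: t = {ols_model.tvalues['x']:.6f}, "
      f"p-value = {ols_model.pvalues['x']:.6f}")
```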
As promised, the two-sample equal-variance t-test yields identical results to a linear regression with a dummy variable.
One-Way ANOVA
The two-sample t-test generalizes naturally to comparing means across more than two groups—that’s one-way ANOVA. The classical ANOVA asks: “Are the means of \(k\) groups all equal, or does at least one differ from the others?”
From a regression perspective, this is just a linear model with a categorical predictor that has \(k\) levels. We create \(k-1\) dummy variables (leaving one group as the reference), and the F-test from ANOVA is equivalent to testing whether all the dummy variable coefficients are simultaneously zero.
Let’s see this in action. We’ll simulate data from three groups with different means and compare the classical ANOVA F-test to the F-test from a linear regression.
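A sketch of that comparison, using scipy.stats.f_oneway for the classical F-test and a statsmodels OLS fit with a categorical predictor; the group means, common variance, and sample sizes are illustrative choices.

```python
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

# Simulate three groups with different means and equal variance
np.random.seed(42)
n = 100
groups = ["a", "b", "c"]
means = {"a": 10.0, "b": 11.0, "c": 12.5}
df = pd.DataFrame({
    "group": np.repeat(groups, n),
    "y": np.concatenate([np.random.normal(means[g], 2.0, n) for g in groups]),
})

# Classical one-way ANOVA F-test
f_stat, p_val = stats.f_oneway(*[df.loc[df.group == g, "y"] for g in groups])
print(f"Classical ANOVA:   F = {f_stat:.6f}, p-value = {p_val:.6f}")

# Regression with k-1 dummies via a categorical predictor; the overall
# F-test asks whether all group coefficients are simultaneously zero
ols_model = smf.ols("y ~ C(group)", data=df).fit()
print(f"Regression F-test: F = {ols_model.fvalue:.6f}, p-value = {ols_model.f_pvalue:.6f}")
```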
Nice! The F-statistics and p-values match perfectly. The ANOVA is testing whether the group coefficients in the regression are all zero—which is exactly what the classical ANOVA F-test does.
Chi-Squared Test of Independence
The chi-squared test asks whether two categorical variables are independent. The classic example: is group (A vs. B) independent of outcome (success vs. failure)? If you arrange the counts in a 2×2 contingency table, the chi-squared test tells you whether the proportions differ significantly between groups.
From a regression perspective, this is testing whether a binary outcome’s probability depends on a categorical predictor—which is exactly what logistic regression does. For a 2×2 table, we model the log-odds of the outcome as a function of group membership, and testing independence is equivalent to testing whether the logistic regression coefficient equals zero.
Note: unlike linear regression where we can use t-tests on coefficients to determine if a predictor matters, logistic regression and other GLMs typically use likelihood ratio tests to compare the full model with one that has the predictor of interest dropped.
Let’s simulate some binary outcome data for two groups and compare the classical chi-squared test to a logistic regression.
```python
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.api as sm
from statsmodels.formula.api import logit

# Simulate binary outcomes for two groups
np.random.seed(42)
n_a = 100
n_b = 100

# Group A: 30% success rate
group_a_outcomes = np.random.binomial(1, 0.30, n_a)

# Group B: 50% success rate
group_b_outcomes = np.random.binomial(1, 0.50, n_b)

# Create contingency table
contingency_table = pd.crosstab(
    index=pd.Series(['A']*n_a + ['B']*n_b, name='group'),
    columns=pd.Series(np.concatenate([group_a_outcomes, group_b_outcomes]), name='outcome'))

# Classical chi-squared test
chi2_stat, p_val_chi2, dof, expected = stats.chi2_contingency(contingency_table)

# Logistic regression approach
df = pd.DataFrame({
    'outcome': np.concatenate([group_a_outcomes, group_b_outcomes]),
    'group': ['A']*n_a + ['B']*n_b})
logit_model = logit('outcome ~ C(group)', data=df).fit(disp=0)

# Get the likelihood ratio test statistic (comparable to chi-squared)
lr_stat = logit_model.llr  # Likelihood ratio test statistic
p_val_lr = logit_model.llr_pvalue

print("Classical Chi-Squared Test:")
print(f" χ² statistic: {chi2_stat:.6f}")
print(f" p-value: {p_val_chi2:.6f}")
print("\nLogistic Regression (Likelihood Ratio Test):")
print(f" LR χ² statistic: {lr_stat:.6f}")
print(f" p-value: {p_val_lr:.6f}")
```
```
Classical Chi-Squared Test:
 χ² statistic: 8.299616
 p-value: 0.003965

Logistic Regression (Likelihood Ratio Test):
 LR χ² statistic: 9.232498
 p-value: 0.002378
```
Whoa, this time we get slightly different test statistics and p-values; what’s happening here?
The classical Pearson chi-squared test and the likelihood ratio test from logistic regression are actually different test statistics testing the same hypothesis. Both are asymptotically chi-squared distributed under the null, but they use different formulas and so give different values in finite samples. (On top of that, scipy's chi2_contingency applies Yates' continuity correction by default for 2×2 tables, which shrinks the Pearson statistic a bit.) The p-values are close and lead to the same conclusion: both tests ask whether group membership and outcome are independent.
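If you want to see the two statistics side by side without the logistic regression machinery, scipy can compute both directly from the contingency table; this sketch assumes the contingency_table from the code above is still in scope.

```python
from scipy import stats

# Pearson chi-squared without Yates' continuity correction
chi2_pearson, p_pearson, _, _ = stats.chi2_contingency(
    contingency_table, correction=False)

# Likelihood-ratio (G-test) flavor of the same independence test
chi2_lr, p_lr, _, _ = stats.chi2_contingency(
    contingency_table, correction=False, lambda_="log-likelihood")

print(f"Pearson χ²: {chi2_pearson:.6f}, p-value: {p_pearson:.6f}")
print(f"LR (G) χ²:  {chi2_lr:.6f}, p-value: {p_lr:.6f}")
```

The G-test statistic is the same likelihood-ratio quantity the logistic regression reports, so it should line up with the llr value above, while the uncorrected Pearson value lands nearby.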
There are many valid ways to answer questions using data, and appropriate approaches will tend to agree with each other.
Advantages of a Model-Based Testing Approach
We’ve seen that classical tests like the t-test, ANOVA, and chi-squared are special cases of regression models. But so what? Why bother with this reformulation when the canned tests work just fine?
The model-based perspective offers some real practical advantages that become apparent once you start working with messier, real-world data. Let’s look at two key benefits.
Clear and Explicit Assumptions
In the model-based approach all key assumptions are expressed compactly in the model specification itself. Here’s the model we wrote down for comparing group means:
\[ Y = \beta_0 + \beta_1 X + \epsilon \]
where \(\epsilon \overset{iid}{\sim} N(0, \sigma^2)\).
From this model spec, we can immediately see the assumptions being made:
- The outcome is assumed to be normally distributed
- The variance of the outcome is assumed to be equal in the two groups (homoscedasticity)
- The samples in each group are assumed to be independent and identically distributed
We don’t have to memorize separate lists of assumptions for each test in the statistical zoo. We can simply read the model specification and see exactly what we’re assuming about the data generating process. And if one of these assumptions seems questionable for our context, we know exactly which part of the model to modify.
Extension to Covariates
This model-based approach makes it straightforward to include covariates in the analysis. Suppose we’re comparing means between two groups, but we also have a continuous covariate (like age, tenure, marketing spend, baseline measurement, etc.) that might affect the outcome.
In the classical testing framework, we’d need to reach for a different test—maybe ANCOVA—and navigate a new set of conditions and assumptions. But in the regression framework, we simply add another term to our model:
\[ Y = \beta_0 + \beta_1 X + \beta_2 Z + \epsilon \]
where \(Z\) is the covariate. Now \(\beta_1\) represents the group difference adjusted for the covariate. We’re still testing \(H_0: \beta_1 = 0\), just as before. The logic is identical; we’ve just extended the model.
The same logic applies for testing interactions between group membership and covariates.
I don’t even know what canned stat test lets you test interactions like that, but luckily it doesn’t matter. The model-based approach makes these extensions feel like natural elaborations rather than jumps to entirely different procedures. We’re reasoning from first principles about how we think the data were generated, rather than searching through flowcharts for the “right” test.
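As a rough sketch of how these extensions look in code, here’s the same statsmodels recipe with a made-up continuous covariate z, first added as an adjustment and then interacted with the group indicator; all variable names and simulated values are illustrative.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Illustrative data: a binary group indicator plus a continuous covariate z
np.random.seed(0)
n = 200
x = np.random.binomial(1, 0.5, n)    # group membership (0 or 1)
z = np.random.normal(50, 10, n)      # covariate, e.g. age
y = 2.0 + 1.5 * x + 0.1 * z + np.random.normal(0, 2, n)
df = pd.DataFrame({"y": y, "x": x, "z": z})

# Group difference adjusted for the covariate: still testing H0: beta_1 = 0
adjusted = smf.ols("y ~ x + z", data=df).fit()
print(f"Adjusted group effect: {adjusted.params['x']:.3f} "
      f"(p-value = {adjusted.pvalues['x']:.4f})")

# Interaction: does the group effect depend on the covariate?
interaction = smf.ols("y ~ x * z", data=df).fit()
print(f"Interaction term:      {interaction.params['x:z']:.3f} "
      f"(p-value = {interaction.pvalues['x:z']:.4f})")
```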
Wrapping Up
We’ve just scratched the surface of this simple idea: in many cases where the default would be a traditional canned statistical procedure, we can take a model-based approach to inference. Doing so has many benefits: it makes our assumptions clear and explicit, it extends naturally to more complex scenarios with additional covariates and interactions, and it opens the door to Bayesian inference on the model parameters. Try it out; next time you reach for an off-the-shelf statistical test, see if you can take a model-based approach instead.