Table of Contents
- What Is Goodness-of-Fit?
- Understanding Goodness-of-Fit
- Establish an Alpha Level
- Types of Goodness-of-Fit Tests
- Kolmogorov-Smirnov (K-S) Test
- The Anderson-Darling (A-D) Test
- Shapiro-Wilk (S-W) Test
- Other Goodness-of-Fit Tests
- Importance of Goodness-of-Fit Tests
- Goodness-of-Fit Test vs. Independence Test
- Goodness-of-Fit Example
- What Does Goodness-of-Fit Mean?
- Why Is Goodness-of-Fit Important?
- What Is Goodness-of-Fit in the Chi-Square Test?
- How Do You Do the Goodness-of-Fit Test?
- The Bottom Line
What Is Goodness-of-Fit?
Goodness-of-fit is a statistical test that checks how well your sample data matches the distribution you would expect to see in a population. In simple terms, it tests whether your sample is skewed or whether it truly represents the broader population.
Goodness-of-fit measures the discrepancy between the values you observe and the values your model predicts. You have several methods to calculate it, including the chi-square test.
Key Takeaways
- Goodness-of-fit shows whether your sample data matches the distribution you'd expect in the population.
- There are various goodness-of-fit tests, with chi-square being the most common.
- The chi-square goodness-of-fit test compares observed frequencies of categorical data against the frequencies you'd expect.
- The Kolmogorov-Smirnov test verifies if a sample comes from a specific population distribution.
Understanding Goodness-of-Fit
You should know that goodness-of-fit tests are statistical tools for making inferences about observed values. For example, they can tell you if a sample group really represents the whole population.
These tests show how actual values relate to the predicted ones in your model. When you're making decisions, they help you predict future trends and patterns more easily.
As I mentioned, there are several types, like the chi-square (the most common), Kolmogorov-Smirnov, and Shapiro-Wilk. You usually run these with software, but statisticians can use specific formulas for each.
To do the test, you need a variable with an assumed distribution, plus a data set including observed values from your actual data, expected values from your assumptions, and the total categories in the set.
Remember, these tests are often used to check normality of residuals or if two samples come from identical distributions.
Establish an Alpha Level
When interpreting a goodness-of-fit test, you must set an alpha level: the significance threshold against which you compare the test's p-value.
The p-value is the probability of getting results as extreme as what you observed, assuming the null hypothesis is correct.
A null hypothesis says there's no relationship between variables, while the alternative says there is one.
You measure the frequency of observed values, then use them with the expected values and degrees of freedom to calculate the chi-square statistic and its p-value. If the p-value falls below alpha, you reject the null hypothesis, meaning the observed data differ significantly from what you expected.
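The decision rule above can be sketched in Python with SciPy. The counts below are hypothetical, chosen purely for illustration: 120 rolls of a six-sided die, compared against the 20-per-face frequencies a fair die would predict.

```python
from scipy import stats

# Hypothetical observed counts from 120 rolls of a six-sided die
observed = [25, 18, 22, 16, 24, 15]
# A fair die predicts each face 120 / 6 = 20 times
expected = [20] * 6

# chisquare returns the chi-square statistic and its p-value
chi2, p_value = stats.chisquare(f_obs=observed, f_exp=expected)

alpha = 0.05
if p_value < alpha:
    print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}: reject the null (die looks unfair)")
else:
    print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}: fail to reject the null")
```

With these made-up counts the p-value comes out well above 0.05, so the data are consistent with a fair die; note that failing to reject the null is not the same as proving it.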
Types of Goodness-of-Fit Tests
Let's break down the main types. The chi-square goodness-of-fit test evaluates claims about a population distribution using a random sample. It works on data sorted into classes or bins and needs a sufficiently large sample for accuracy, but it doesn't show the type or strength of a relationship, such as whether it's positive or negative. To run it, set your alpha (say, 0.05 for 95% confidence), identify the categorical variable, and define your hypotheses about its distribution. Categories must be mutually exclusive, and the test isn't suited to continuous data.
Kolmogorov-Smirnov (K-S) Test
The Kolmogorov-Smirnov test checks whether a sample comes from a specified distribution. It's non-parametric, so it makes no assumptions about the population's underlying shape, and it's well suited to large samples (a common rule of thumb is more than 2,000 observations). The null hypothesis is that the sample follows the specified distribution, often the normal; the alternative is that it doesn't. The test applies to continuous distributions and produces a statistic, D, the largest gap between the sample's empirical distribution and the reference distribution. If D exceeds the critical value at your chosen alpha, you reject the null.
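As a sketch, SciPy's one-sample `kstest` compares an empirical distribution against a named reference distribution. The samples and seed below are arbitrary illustrations, not real data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)  # arbitrary seed for reproducibility

# A sample actually drawn from a standard normal distribution
sample = rng.normal(loc=0.0, scale=1.0, size=500)
# One-sample K-S test against the standard normal CDF
d_stat, p_value = stats.kstest(sample, "norm")
print(f"normal sample:  D = {d_stat:.3f}, p = {p_value:.3f}")

# A clearly non-normal (uniform) sample should produce a much larger D
uniform_sample = rng.uniform(-1, 1, size=500)
d_bad, p_bad = stats.kstest(uniform_sample, "norm")
print(f"uniform sample: D = {d_bad:.3f}, p = {p_bad:.5f}")
```

The second test rejects the null decisively: the uniform sample's empirical CDF sits far from the normal CDF, so D is large and the p-value is tiny.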
The Anderson-Darling (A-D) Test
This is a variation of the K-S test that weights the tails of the distribution more heavily. While K-S is most sensitive to differences near the center of the distribution, A-D focuses on tail variations, which makes it useful in finance for assessing tail risk. It produces a statistic, A², that you compare against critical values to decide whether to reject the null hypothesis.
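A brief sketch with SciPy, whose `anderson` function reports A² alongside critical values at several fixed significance levels rather than a single p-value. The simulated daily returns are purely illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)  # arbitrary seed
returns = rng.normal(0.0, 0.02, size=250)  # hypothetical daily returns

# anderson returns A^2 plus critical values at fixed significance levels
result = stats.anderson(returns, dist="norm")
for crit, sig in zip(result.critical_values, result.significance_level):
    decision = "reject" if result.statistic > crit else "fail to reject"
    print(f"alpha = {sig}%: critical = {crit:.3f} -> {decision} normality")
```

Because the critical values are precomputed per significance level, you read the decision off the row matching your chosen alpha instead of comparing a p-value.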
Shapiro-Wilk (S-W) Test
The Shapiro-Wilk test checks whether a sample follows a normal distribution. It works best on small samples, up to about 2,000 observations, with a single continuous variable. It's closely related to the QQ plot, which charts sample quantiles against theoretical ones; if the points fall on a straight line, the sample matches the distribution. In effect, the test compares a variance estimate built from the ordered sample to the ordinary sample variance; if the ratio is close to 1, you fail to reject the null that the data are normal. Like the others, it uses an alpha level, a null hypothesis (the sample is normal), and an alternative.
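A minimal sketch with SciPy's `shapiro`, which returns the W statistic (near 1 for normal-looking data) and a p-value. Both samples below are simulated for illustration; the seed is arbitrary:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)  # arbitrary seed
# A small hypothetical sample drawn from a normal distribution
sample = rng.normal(loc=50, scale=5, size=30)
w_stat, p_value = stats.shapiro(sample)
print(f"normal-looking sample: W = {w_stat:.3f}, p = {p_value:.3f}")

# A strongly skewed (exponential) sample should be flagged as non-normal
skewed = rng.exponential(scale=1.0, size=200)
w_skew, p_skew = stats.shapiro(skewed)
print(f"skewed sample: W = {w_skew:.3f}, p = {p_skew:.6f}")
```

For the skewed sample, W drops well below 1 and the p-value is tiny, so you reject normality.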
Other Goodness-of-Fit Tests
Beyond these, you have options like the Bayesian information criterion (BIC) for model selection, balancing complexity and fit. The Cramer-von Mises criterion assesses fit to a hypothesized distribution, common in economics or finance. The Akaike information criterion (AIC) measures model quality with a trade-off on fit and complexity. The Hosmer-Lemeshow test compares expected and observed frequencies for binary outcomes in groups. Kuiper's test is like K-S but more tail-sensitive. Moran's I test checks spatial autocorrelation in data.
A general rule: ensure each group in your test has at least five data points for sufficient information.
Importance of Goodness-of-Fit Tests
These tests matter in statistics for several reasons. They assess how well your model fits observed data and whether the data are consistent with your assumed model, which helps you choose between candidate models. They also flag outliers or abnormalities that distort the fit, which you may need to investigate or remove. Plus, they provide information on data variability and parameters for predictions. You may need to refine your model based on the dataset, residuals, and p-value.
Goodness-of-Fit Test vs. Independence Test
Don't confuse goodness-of-fit with independence tests; both involve observed versus expected frequencies but answer different questions. Goodness-of-fit evaluates how well observed data fit a probability distribution. Independence tests check whether two variables are associated, such as whether one tends to change with the other (e.g., is smoking associated with lung cancer?). Use an independence test for relationships between specific variables, and goodness-of-fit for overall model appropriateness.
Goodness-of-Fit Example
Consider this hypothetical: a gym assumes highest attendance on Mondays, Tuesdays, and Saturdays, average on Wednesdays and Thursdays, and lowest on Fridays and Sundays. They staff accordingly, but finances are poor, so the owner counts attendees for six weeks and compares to assumptions using chi-square. With new data, they adjust management for better profitability.
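The gym example above can be sketched numerically. All attendance figures and assumed shares below are invented for illustration; the owner's assumed pattern (high Monday/Tuesday/Saturday, average Wednesday/Thursday, low Friday/Sunday) is expressed as expected shares of total attendance:

```python
from scipy import stats

# Order: Mon, Tue, Wed, Thu, Fri, Sat, Sun
# Hypothetical attendance counts over six weeks
observed = [380, 390, 300, 290, 200, 410, 180]
total = sum(observed)

# Owner's assumed pattern: high Mon/Tue/Sat, average Wed/Thu, low Fri/Sun
assumed_share = [0.20, 0.20, 0.12, 0.12, 0.08, 0.20, 0.08]
expected = [total * share for share in assumed_share]

chi2, p_value = stats.chisquare(observed, expected)
if p_value < 0.05:
    print(f"chi2 = {chi2:.1f}, p = {p_value:.4f}: assumptions don't match attendance")
else:
    print(f"chi2 = {chi2:.1f}, p = {p_value:.4f}: assumptions look consistent")
```

With these made-up counts the test rejects the owner's assumed pattern, which is the signal to restaff around what the data actually show.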
What Does Goodness-of-Fit Mean?
It's a hypothesis test to see how closely observed data matches expected data. It checks if samples follow normal distributions, if categorical variables relate, or if random samples share distributions.
Why Is Goodness-of-Fit Important?
These tests confirm if data aligns with expectations, aiding decisions. For instance, a retailer surveys preferences and uses chi-square to find, with 95% confidence, young people prefer product A, leading to marketing changes.
What Is Goodness-of-Fit in the Chi-Square Test?
In the chi-square setting, goodness-of-fit measures how closely the observed frequencies of a categorical variable match the frequencies you'd expect, telling you whether the sample plausibly represents the population.
How Do You Do the Goodness-of-Fit Test?
Choose based on your goal—Shapiro-Wilk for small-sample normality, Kolmogorov-Smirnov for specific distributions. Each has its formula, but all involve null hypotheses and significance levels.
The Bottom Line
Goodness-of-fit tests check how well sample data fits population expectations. You compare observed to expected values with a discrepancy measure. Pick the test based on your sample size and what you need to know.