Class ChiSquareTest
- java.lang.Object
- org.apache.commons.statistics.inference.ChiSquareTest
public final class ChiSquareTest extends Object
Implements chi-square test statistics.
This implementation handles both known and unknown distributions.
Two-sample tests can be used when the distribution is unknown a priori but provided by one sample, or when the hypothesis under test is that the two samples come from the same underlying distribution.
- Since:
- 1.1
- See Also:
- Chi-square test (Wikipedia)
Method Summary
Modifier and type, method, and description:
- double statistic(double[] expected, long[] observed)
  Computes the chi-square goodness-of-fit statistic comparing observed and expected frequency counts.
- double statistic(long[] observed)
  Computes the chi-square goodness-of-fit statistic comparing the observed counts to a uniform expected value (each category is equally likely).
- double statistic(long[][] counts)
  Computes the chi-square statistic associated with a chi-square test of independence based on the input counts array, viewed as a two-way table in row-major format.
- double statistic(long[] observed1, long[] observed2)
  Computes a chi-square statistic associated with a chi-square test of independence of frequency counts in observed1 and observed2.
- SignificanceResult test(double[] expected, long[] observed)
  Perform a chi-square goodness-of-fit test evaluating the null hypothesis that the observed counts conform to the expected counts.
- SignificanceResult test(long[] observed)
  Perform a chi-square goodness-of-fit test evaluating the null hypothesis that the observed counts conform to a uniform distribution (each category is equally likely).
- SignificanceResult test(long[][] counts)
  Perform a chi-square test of independence based on the input counts array, viewed as a two-way table.
- SignificanceResult test(long[] observed1, long[] observed2)
  Perform a chi-square test of independence of frequency counts in observed1 and observed2.
- static ChiSquareTest withDefaults()
  Return an instance using the default options.
- ChiSquareTest withDegreesOfFreedomAdjustment(int v)
  Return an instance with the configured degrees of freedom adjustment.
Method Detail
-
withDefaults
public static ChiSquareTest withDefaults()
Return an instance using the default options.
- Returns:
- default instance
-
withDegreesOfFreedomAdjustment
public ChiSquareTest withDegreesOfFreedomAdjustment(int v)
Return an instance with the configured degrees of freedom adjustment.
The default degrees of freedom for a sample of length n are n - 1. An intrinsic null hypothesis is one where you estimate one or more parameters from the data in order to get the numbers for your null hypothesis. For a distribution with p parameters where up to p parameters have been estimated from the data, the degrees of freedom is in the range [n - 1 - p, n - 1].
- Parameters:
v - Value.
- Returns:
- an instance
- Throws:
IllegalArgumentException
- if the value is negative
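The effect of the adjustment on the degrees of freedom can be sketched in plain Java. This is an illustration only, not the library's code: the class and method names are hypothetical, and it assumes the adjustment is subtracted from the default n - 1, which is consistent with the stated range [n - 1 - p, n - 1].

```java
final class DegreesOfFreedomSketch {
    /**
     * Hypothetical helper: degrees of freedom for n categories after
     * subtracting an adjustment for parameters estimated from the data.
     * Assumption: the adjustment is subtracted from the default n - 1.
     */
    public static int degreesOfFreedom(int n, int adjustment) {
        if (adjustment < 0) {
            throw new IllegalArgumentException("Negative adjustment: " + adjustment);
        }
        final int df = n - 1 - adjustment;
        if (df <= 0) {
            throw new IllegalArgumentException("Degrees of freedom not strictly positive: " + df);
        }
        return df;
    }

    public static void main(String[] args) {
        // Six categories, no estimated parameters
        System.out.println(degreesOfFreedom(6, 0)); // prints 5
        // Six categories, two parameters estimated from the data
        System.out.println(degreesOfFreedom(6, 2)); // prints 3
    }
}
```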
-
statistic
public double statistic(long[] observed)
Computes the chi-square goodness-of-fit statistic comparing the observed counts to a uniform expected value (each category is equally likely).
Note: This is a specialized version of a comparison of observed with an expected array of uniform values. The computation is faster than calling statistic(double[], long[]) and the statistic is the same, with an allowance for accumulated floating-point error due to the optimized routine.
- Parameters:
observed - Observed frequency counts.
- Returns:
- Chi-square statistic
- Throws:
IllegalArgumentException - if the sample size is less than 2; observed has negative entries; or all the observations are zero.
- See Also:
test(long[])
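As a rough illustration of the uniform case, the textbook statistic sums (observed - expected)^2 / expected with every expected count equal to the mean of the observed counts. The sketch below uses a hypothetical class name, omits the input validation described above, and is not the library's optimized routine.

```java
final class UniformChiSquareSketch {
    /** Textbook goodness-of-fit statistic against a uniform expectation. */
    public static double statistic(long[] observed) {
        double total = 0;
        for (final long o : observed) {
            total += o;
        }
        // Each category is equally likely, so each expects total / n counts
        final double expected = total / observed.length;
        double chi2 = 0;
        for (final long o : observed) {
            final double dev = o - expected;
            chi2 += dev * dev / expected;
        }
        return chi2;
    }

    public static void main(String[] args) {
        // Expected count is 20 per category: (100 + 0 + 100) / 20
        System.out.println(statistic(new long[] {10, 20, 30})); // prints 10.0
    }
}
```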
-
statistic
public double statistic(double[] expected, long[] observed)
Computes the chi-square goodness-of-fit statistic comparing observed and expected frequency counts.
Note: This implementation rescales the expected array if necessary to ensure that the sums of the expected and observed counts are equal.
- Parameters:
expected - Expected frequency counts.
observed - Observed frequency counts.
- Returns:
- Chi-square statistic
- Throws:
IllegalArgumentException - if the sample size is less than 2; the array sizes do not match; expected has entries that are not strictly positive; observed has negative entries; or all the observations are zero.
- See Also:
test(double[], long[])
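The rescaling note means the expected array may be given as proportions or as counts on any scale. A minimal sketch of that idea, with a hypothetical class name and without the validation described above (the library's internal approach may differ):

```java
final class RescaledChiSquareSketch {
    /** Goodness-of-fit statistic with expected values rescaled to the observed total. */
    public static double statistic(double[] expected, long[] observed) {
        double sumExpected = 0;
        double sumObserved = 0;
        for (int i = 0; i < observed.length; i++) {
            sumExpected += expected[i];
            sumObserved += observed[i];
        }
        // Scale factor that makes the expected total match the observed total
        final double ratio = sumObserved / sumExpected;
        double chi2 = 0;
        for (int i = 0; i < observed.length; i++) {
            final double e = expected[i] * ratio;
            final double dev = observed[i] - e;
            chi2 += dev * dev / e;
        }
        return chi2;
    }

    public static void main(String[] args) {
        // Proportions 1:1:2 rescale to expected counts 25, 25, 50
        final double[] expected = {0.25, 0.25, 0.5};
        final long[] observed = {20, 30, 50};
        System.out.println(statistic(expected, observed)); // prints 2.0
    }
}
```

Passing expected counts that already sum to the observed total gives a ratio of 1, so the same routine covers both cases.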
-
statistic
public double statistic(long[][] counts)
Computes the chi-square statistic associated with a chi-square test of independence based on the input counts array, viewed as a two-way table in row-major format.
- Parameters:
counts - 2-way table.
- Returns:
- Chi-square statistic
- Throws:
IllegalArgumentException - if the number of rows or columns is less than 2; the array is non-rectangular; the array has negative entries; or the sum of a row or column is zero.
- See Also:
test(long[][])
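For a two-way table, the textbook independence statistic compares each cell with the expected count (row sum * column sum / total). The following sketch uses a hypothetical class name, assumes a rectangular table with non-zero row and column sums, and is offered only as an illustration of that formula.

```java
final class IndependenceChiSquareSketch {
    /** Textbook chi-square statistic for a rectangular two-way table. */
    public static double statistic(long[][] counts) {
        final int rows = counts.length;
        final int cols = counts[0].length;
        final double[] rowSum = new double[rows];
        final double[] colSum = new double[cols];
        double total = 0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                rowSum[r] += counts[r][c];
                colSum[c] += counts[r][c];
                total += counts[r][c];
            }
        }
        double chi2 = 0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                // Expected count under independence of rows and columns
                final double e = rowSum[r] * colSum[c] / total;
                final double dev = counts[r][c] - e;
                chi2 += dev * dev / e;
            }
        }
        return chi2;
    }

    public static void main(String[] args) {
        // Expected counts are 12, 18, 28, 42; the statistic is 200/252, about 0.7937
        System.out.println(statistic(new long[][] {{10, 20}, {30, 40}}));
    }
}
```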
-
statistic
public double statistic(long[] observed1, long[] observed2)
Computes a chi-square statistic associated with a chi-square test of independence of frequency counts in observed1 and observed2. The sums of frequency counts in the two samples are not required to be the same. The formula used to compute the test statistic is:
\[ \sum_i{ \frac{(K * a_i - b_i / K)^2}{a_i + b_i} } \]
where
\[ K = \sqrt{ \sum_i{b_i} / \sum_i{a_i} } \]
and \(a_i\) and \(b_i\) are the counts at index i in observed1 and observed2, respectively.
Note: This is a specialized version of a 2-by-n contingency table. The computation is faster than calling statistic(long[][]) with the table composed as new long[][]{observed1, observed2}. The statistic is the same, with an allowance for accumulated floating-point error due to the optimized routine.
- Parameters:
observed1 - Observed frequency counts of the first data set.
observed2 - Observed frequency counts of the second data set.
- Returns:
- Chi-square statistic
- Throws:
IllegalArgumentException - if the sample size is less than 2; the array sizes do not match; either array has entries that are negative; all counts of either observed1 or observed2 are zero; or if the count at some index is zero for both arrays.
- See Also:
test(long[], long[])
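The equivalence with the 2-by-n contingency table can be checked numerically. The sketch below is not the library's routine and uses hypothetical names. It assumes a and b denote the counts in observed1 and observed2, and it takes K = sqrt(sum(b) / sum(a)), the orientation for which the weighted sum agrees with the textbook table statistic.

```java
final class TwoSampleChiSquareSketch {
    /** Weighted two-sample form with K = sqrt(sum(b) / sum(a)). */
    public static double weighted(long[] a, long[] b) {
        double sumA = 0;
        double sumB = 0;
        for (int i = 0; i < a.length; i++) {
            sumA += a[i];
            sumB += b[i];
        }
        final double k = Math.sqrt(sumB / sumA);
        double chi2 = 0;
        for (int i = 0; i < a.length; i++) {
            final double dev = k * a[i] - b[i] / k;
            chi2 += dev * dev / (a[i] + b[i]);
        }
        return chi2;
    }

    /** Reference: textbook statistic for the 2-by-n table with rows a and b. */
    public static double table(long[] a, long[] b) {
        double sumA = 0;
        double sumB = 0;
        for (int i = 0; i < a.length; i++) {
            sumA += a[i];
            sumB += b[i];
        }
        final double n = sumA + sumB;
        double chi2 = 0;
        for (int i = 0; i < a.length; i++) {
            final double col = a[i] + b[i];
            final double eA = col * sumA / n;
            final double eB = col * sumB / n;
            chi2 += (a[i] - eA) * (a[i] - eA) / eA + (b[i] - eB) * (b[i] - eB) / eB;
        }
        return chi2;
    }

    public static void main(String[] args) {
        final long[] observed1 = {10, 20, 30};
        final long[] observed2 = {5, 5, 20};
        // Both values are close to 3, up to floating-point error
        System.out.println(weighted(observed1, observed2));
        System.out.println(table(observed1, observed2));
    }
}
```

Two samples with identical proportions, such as {10, 20} and {1, 2}, give a statistic of zero up to rounding, which is a quick sanity check on the orientation of K.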
-
test
public SignificanceResult test(long[] observed)
Perform a chi-square goodness-of-fit test evaluating the null hypothesis that the observed counts conform to a uniform distribution (each category is equally likely).
- Parameters:
observed - Observed frequency counts.
- Returns:
- test result
- Throws:
IllegalArgumentException - if the sample size is less than 2; observed has negative entries; or all the observations are zero.
- See Also:
statistic(long[])
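To give a feel for how a statistic becomes a p-value, the sketch below evaluates the upper-tail probability of a chi-square distribution for the special case of an even number of degrees of freedom, where the closed form exp(-x/2) * sum_{j<k} (x/2)^j / j! applies with df = 2k. It assumes the p-value is this upper-tail probability with n - 1 degrees of freedom. The names are hypothetical, and a general implementation would need the regularized incomplete gamma function instead.

```java
final class ChiSquarePValueSketch {
    /**
     * Upper-tail probability P(X > x) for a chi-square distribution
     * with an even number of degrees of freedom (closed form only).
     */
    public static double survivalEvenDf(double x, int df) {
        if (df <= 0 || df % 2 != 0) {
            throw new IllegalArgumentException("Requires positive even degrees of freedom: " + df);
        }
        final double half = x / 2;
        double term = 1; // (x/2)^j / j! for j = 0
        double sum = 1;
        for (int j = 1; j < df / 2; j++) {
            term *= half / j;
            sum += term;
        }
        return Math.exp(-half) * sum;
    }

    public static void main(String[] args) {
        // Three categories give 2 degrees of freedom; statistic 10 gives exp(-5)
        System.out.println(survivalEvenDf(10.0, 2));
    }
}
```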
-
test
public SignificanceResult test(double[] expected, long[] observed)
Perform a chi-square goodness-of-fit test evaluating the null hypothesis that the observed counts conform to the expected counts.
The test can be configured to apply an adjustment to the degrees of freedom if the observed data has been used to create the expected counts.
- Parameters:
expected - Expected frequency counts.
observed - Observed frequency counts.
- Returns:
- test result
- Throws:
IllegalArgumentException - if the sample size is less than 2; the array sizes do not match; expected has entries that are not strictly positive; observed has negative entries; all the observations are zero; or the adjusted degrees of freedom are not strictly positive.
- See Also:
withDegreesOfFreedomAdjustment(int), statistic(double[], long[])
-
test
public SignificanceResult test(long[][] counts)
Perform a chi-square test of independence based on the input counts array, viewed as a two-way table.
- Parameters:
counts - 2-way table.
- Returns:
- test result
- Throws:
IllegalArgumentException - if the number of rows or columns is less than 2; the array is non-rectangular; the array has negative entries; or the sum of a row or column is zero.
- See Also:
statistic(long[][])
-
test
public SignificanceResult test(long[] observed1, long[] observed2)
Perform a chi-square test of independence of frequency counts in observed1 and observed2.
Note: This is a specialized version of a 2-by-n contingency table.
- Parameters:
observed1 - Observed frequency counts of the first data set.
observed2 - Observed frequency counts of the second data set.
- Returns:
- test result
- Throws:
IllegalArgumentException - if the sample size is less than 2; the array sizes do not match; either array has entries that are negative; all counts of either observed1 or observed2 are zero; or if the count at some index is zero for both arrays.
- See Also:
statistic(long[], long[])