Class MannWhitneyUTest

    • Method Detail

      • with

        public MannWhitneyUTest with(ContinuityCorrection v)
        Return an instance with the configured continuity correction.

        If ENABLED, and a normal approximation is used to compute the p-value, the U rank statistic is adjusted by 0.5 towards its mean value when computing the z-statistic.

        Parameters:
        v - Continuity correction setting.
        Returns:
        an instance
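
        As a rough illustration of the adjustment described for this option, the sketch below computes a z-statistic from U with and without a 0.5 shift towards the mean. It does not call the library: the class ContinuityCorrectionSketch and the method zStatistic are hypothetical names, the variance n*m*(n+m+1)/12 assumes no tied ranks, and clamping the shifted difference at zero is a choice made for this sketch only.

```java
// Hypothetical helper for illustration; not part of the library API.
final class ContinuityCorrectionSketch {
    private ContinuityCorrectionSketch() {}

    /**
     * z-statistic for U under the normal approximation, assuming no tied ranks.
     *
     * @param u U statistic
     * @param n size of the first sample
     * @param m size of the second sample
     * @param correct whether to move U by 0.5 towards the mean
     * @return the z-statistic
     */
    static double zStatistic(double u, int n, int m, boolean correct) {
        final double mean = 0.5 * n * m;
        final double sd = Math.sqrt((double) n * m * (n + m + 1) / 12.0);
        double diff = u - mean;
        if (correct) {
            // Shrink |u - mean| by 0.5; clamp at zero so the sign never flips
            // (an assumption of this sketch).
            diff = Math.signum(diff) * Math.max(0, Math.abs(diff) - 0.5);
        }
        return diff / sd;
    }

    public static void main(String[] args) {
        System.out.println(zStatistic(80, 10, 10, false));
        System.out.println(zStatistic(80, 10, 10, true));
    }
}
```

        The corrected z-statistic is always at least as close to zero as the uncorrected one, so the resulting p-value is slightly more conservative.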
      • statistic

        public double statistic(double[] x,
                                double[] y)
        Computes the Mann-Whitney U statistic comparing two independent samples possibly of different length.

        This statistic can be used to perform a Mann-Whitney U test evaluating the null hypothesis that the two independent samples differ by a location shift of mu.

        This returns the U1 statistic. Compute the U2 statistic using:

         u2 = (long) x.length * y.length - u1;
         
        Parameters:
        x - First sample values.
        y - Second sample values.
        Returns:
        Mann-Whitney U1 statistic
        Throws:
        IllegalArgumentException - if x or y is zero-length or contains NaN values.
        See Also:
        withMu(double)
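
        To make the relation between U1 and U2 concrete, the following sketch counts pairs directly. It assumes the common convention that U1 is the number of pairs where (x - mu) exceeds y, with each tie counting 0.5; a rank-based computation gives the same value under that convention. The class UStatisticSketch and the method u1 are hypothetical and do not show how the library computes the statistic.

```java
// Hypothetical helper for illustration; not part of the library API.
final class UStatisticSketch {
    private UStatisticSketch() {}

    /** Pairwise U1: count of (x - mu) > y, plus 0.5 for each tie. O(n*m) time. */
    static double u1(double[] x, double[] y, double mu) {
        check(x);
        check(y);
        double u = 0;
        for (final double xi : x) {
            final double a = xi - mu;
            for (final double yj : y) {
                if (a > yj) {
                    u += 1;
                } else if (a == yj) {
                    u += 0.5;
                }
            }
        }
        return u;
    }

    /** Reject empty samples and NaN values, mirroring the documented contract. */
    private static void check(double[] v) {
        if (v.length == 0) {
            throw new IllegalArgumentException("zero-length sample");
        }
        for (final double d : v) {
            if (Double.isNaN(d)) {
                throw new IllegalArgumentException("NaN value");
            }
        }
    }

    public static void main(String[] args) {
        final double[] x = {19, 22, 16, 29, 24};
        final double[] y = {20, 11, 17, 12};
        final double u1 = u1(x, y, 0);
        final double u2 = (long) x.length * y.length - u1;
        System.out.println(u1 + " " + u2);
    }
}
```

        For the samples in main, 17 of the 20 pairs have x greater than y, so U1 is 17 and U2 is 3.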
      • test

        public MannWhitneyUTest.Result test(double[] x,
                                            double[] y)
        Performs a Mann-Whitney U test comparing the location for two independent samples. The location is specified using mu.

        The test is defined by the AlternativeHypothesis.

        • 'two-sided': the distribution underlying (x - mu) is not equal to the distribution underlying y.
        • 'greater': the distribution underlying (x - mu) is stochastically greater than the distribution underlying y.
        • 'less': the distribution underlying (x - mu) is stochastically less than the distribution underlying y.

        If the p-value method is auto, an exact p-value is computed if the samples contain fewer than 50 values; otherwise a normal approximation is used.

        Computation of the exact p-value is only valid if there are no tied ranks in the data; otherwise the p-value falls back to the asymptotic approximation using a tie correction and an optional continuity correction.
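
        For small samples without ties, an exact p-value can be obtained by counting how many orderings of the combined sample produce each value of U. The sketch below uses the standard recurrence f(n, m, u) = f(n, m-1, u) + f(n-1, m, u-m). It is an illustration only: the class ExactPValueSketch, its methods, and the string labels are hypothetical, and doubling the smaller tail for the two-sided case is one common convention that may differ from the library.

```java
// Hypothetical helper for illustration; not part of the library API.
final class ExactPValueSketch {
    private ExactPValueSketch() {}

    /** Counts of orderings giving each U in [0, n*m], assuming no ties. */
    static long[] counts(int n, int m) {
        final long[][][] f = new long[n + 1][m + 1][n * m + 1];
        for (int i = 0; i <= n; i++) {
            for (int j = 0; j <= m; j++) {
                if (i == 0 || j == 0) {
                    f[i][j][0] = 1;  // an empty sample admits only U = 0
                    continue;
                }
                for (int u = 0; u <= i * j; u++) {
                    // The largest value is a y (adds 0 to U) or an x (adds j to U).
                    f[i][j][u] = f[i][j - 1][u] + (u >= j ? f[i - 1][j][u - j] : 0);
                }
            }
        }
        return f[n][m];
    }

    /** Exact p-value for an integer U1; suitable for small n and m only. */
    static double pValue(int u1, int n, int m, String alternative) {
        final long[] c = counts(n, m);
        double total = 0;
        double lower = 0;
        double upper = 0;
        for (int u = 0; u < c.length; u++) {
            total += c[u];
            if (u <= u1) {
                lower += c[u];
            }
            if (u >= u1) {
                upper += c[u];
            }
        }
        switch (alternative) {
            case "greater":
                return upper / total;
            case "less":
                return lower / total;
            default:
                // Two-sided: double the smaller tail, capped at 1 (sketch convention).
                return Math.min(1, 2 * Math.min(lower, upper) / total);
        }
    }

    public static void main(String[] args) {
        // x = {4, 5, 6}, y = {1, 2, 3} gives U1 = 9, the most extreme of 20 orderings.
        System.out.println(pValue(9, 3, 3, "greater"));
    }
}
```

        The table in counts has (n+1)*(m+1)*(n*m+1) entries, which is why this style of tabulation becomes expensive as the samples grow.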

        Note: Exact computation requires tabulation of values not exceeding size (n+1)*(m+1)*(u+1) where u is the minimum of the U1 and U2 statistics and n and m are the sample sizes. This may use a very large amount of memory and result in an OutOfMemoryError. Exact computation requires a finite binomial coefficient binom(n+m, m) which is limited to n+m <= 1029 for any n and m, or min(n, m) <= 37 for any max(n, m). An OutOfMemoryError is not expected using the limits configured for the auto p-value computation as the maximum required memory is approximately 23 MiB.
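
        The figures in this note can be checked with simple arithmetic. The sketch below evaluates the table bound (n+1)*(m+1)*(u+1) and tests whether binom(n+m, m) fits in a double. The class ExactLimitsSketch and its methods are hypothetical. The memory figure assumes 8 bytes per table entry and that the auto mode is only used when both samples have at most 49 values; both are assumptions of this sketch rather than statements about the implementation.

```java
// Hypothetical helper for illustration; not part of the library API.
final class ExactLimitsSketch {
    private ExactLimitsSketch() {}

    /** Upper bound on the number of tabulated values: (n+1)*(m+1)*(u+1). */
    static long tableEntries(int n, int m, long u) {
        return (n + 1L) * (m + 1L) * (u + 1L);
    }

    /** Natural log of binom(n, k), accumulated as a sum of log ratios. */
    static double logBinomial(long n, int k) {
        double sum = 0;
        for (int i = 1; i <= k; i++) {
            sum += Math.log((double) (n - k + i) / i);
        }
        return sum;
    }

    /** True if binom(n, k) is representable as a finite double. */
    static boolean isFiniteBinomial(long n, int k) {
        return logBinomial(n, k) < Math.log(Double.MAX_VALUE);
    }

    public static void main(String[] args) {
        // Largest case under the assumed auto limit: n = m = 49, u = floor(n*m/2).
        final long entries = tableEntries(49, 49, 49 * 49 / 2);
        final double mib = entries * 8.0 / (1 << 20);
        System.out.println(entries + " entries, " + mib + " MiB");
        System.out.println(isFiniteBinomial(1029, 514));
        System.out.println(isFiniteBinomial(1030, 515));
    }
}
```

        Under these assumptions the bound is 50 * 50 * 1201 = 3,002,500 entries, or about 22.9 MiB, in line with the approximate 23 MiB quoted above.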

        Parameters:
        x - First sample values.
        y - Second sample values.
        Returns:
        test result
        Throws:
        IllegalArgumentException - if x or y is zero-length or contains NaN values.
        OutOfMemoryError - if exact computation is explicitly requested for large samples and there is not enough memory.
        See Also:
        statistic(double[], double[]), withMu(double), with(AlternativeHypothesis), with(ContinuityCorrection)