Class StandardDeviation

  • All Implemented Interfaces:
    DoubleConsumer, DoubleSupplier, IntSupplier, LongSupplier, DoubleStatistic, StatisticAccumulator<StandardDeviation>, StatisticResult

    public final class StandardDeviation
    extends Object
    implements DoubleStatistic, StatisticAccumulator<StandardDeviation>
    Computes the standard deviation of the available values. The default implementation uses the following definition of the sample standard deviation:

    \[ \sqrt{ \tfrac{1}{n-1} \sum_{i=1}^n (x_i-\overline{x})^2 } \]

    where \( \overline{x} \) is the sample mean, and \( n \) is the number of samples.

    • The result is NaN if no values are added.
    • The result is NaN if any of the values is NaN or infinite.
    • The result is NaN if the sum of the squared deviations from the mean is infinite.
    • The result is zero if there is one finite value in the data set.

    The use of the term \( n - 1 \) is called Bessel's correction. Omitting the square root, this provides an unbiased estimator of the variance of a hypothetical infinite population. If the biased option is enabled the normalisation factor is changed to \( \frac{1}{n} \) for a biased estimator of the sample variance. Note however that the square root is a concave function and thus introduces negative bias (by Jensen's inequality), which depends on the distribution. The corrected sample standard deviation (using Bessel's correction) is therefore less biased, but still biased.
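The definition and the effect of the biased option can be illustrated with a direct transcription of the formula. This is a minimal sketch, not the library's implementation: the class name StdDevDefinitionSketch and its method are hypothetical, and the handling of a single value (divisor clamped to 1) is an assumption chosen to mirror the special cases listed above.

```java
final class StdDevDefinitionSketch {

    /**
     * Naive transcription of the definition (hypothetical helper).
     * biased = false divides by n - 1 (Bessel's correction); biased = true divides by n.
     */
    static double standardDeviation(double[] x, boolean biased) {
        final int n = x.length;
        if (n == 0) {
            return Double.NaN; // no values
        }
        double sum = 0;
        for (final double v : x) {
            sum += v;
        }
        final double mean = sum / n;
        double ss = 0; // sum of squared deviations from the mean
        for (final double v : x) {
            final double d = v - mean;
            ss += d * d;
        }
        // Assumption: clamp the divisor so a single finite value gives zero.
        final int divisor = biased ? n : Math.max(n - 1, 1);
        return Math.sqrt(ss / divisor);
    }

    public static void main(String[] args) {
        final double[] data = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println("unbiased (n - 1): " + standardDeviation(data, false));
        System.out.println("biased (n):       " + standardDeviation(data, true));
    }
}
```

For this data the mean is 5 and the sum of squared deviations is 32, so the biased result is \( \sqrt{32/8} = 2 \) and the corrected result is \( \sqrt{32/7} \).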

    The accept(double) method uses a recursive updating algorithm based on West's algorithm (see Chan and Lewis (1979)).
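A one-pass recursive update of this kind can be sketched as follows. This is an illustrative Welford/West-style update written for this page; the class StreamingStdDevSketch and its fields are hypothetical, and it is an assumption that the library's internal update has the same shape.

```java
final class StreamingStdDevSketch {
    private long n;      // number of values seen so far
    private double mean; // running mean
    private double m2;   // running sum of squared deviations from the mean

    /** Folds one value into the running state without storing it. */
    void accept(double x) {
        n++;
        final double dev = x - mean; // deviation from the old mean
        mean += dev / n;             // updated mean
        m2 += dev * (x - mean);      // uses both the old and the new mean
    }

    /** Corrected sample standard deviation of the values seen so far. */
    double getAsDouble() {
        if (n == 0) {
            return Double.NaN;
        }
        // Assumption: clamp the divisor so a single finite value gives zero.
        return Math.sqrt(m2 / Math.max(n - 1, 1));
    }

    public static void main(String[] args) {
        final StreamingStdDevSketch sd = new StreamingStdDevSketch();
        for (final double v : new double[] {2, 4, 4, 4, 5, 5, 7, 9}) {
            sd.accept(v);
        }
        System.out.println(sd.getAsDouble());
    }
}
```

The state is three numbers regardless of how many values are added, which is what makes this form suitable when the full array is not available.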

    The of(double...) method uses the corrected two-pass algorithm from Chan et al. (1983).
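The corrected two-pass idea can be sketched as below: the first pass computes the mean, the second accumulates the squared deviations together with the plain sum of deviations, which would be exactly zero in exact arithmetic and is used to correct for rounding error in the computed mean. The class TwoPassStdDevSketch is hypothetical and is not the library's code.

```java
final class TwoPassStdDevSketch {

    /** Corrected two-pass sample standard deviation (illustrative sketch). */
    static double standardDeviation(double... values) {
        final int n = values.length;
        if (n == 0) {
            return Double.NaN;
        }
        // Pass 1: the mean.
        double sum = 0;
        for (final double v : values) {
            sum += v;
        }
        final double mean = sum / n;
        // Pass 2: squared deviations plus the sum of deviations.
        double ss = 0;
        double dev = 0;
        for (final double v : values) {
            final double d = v - mean;
            ss += d * d;
            dev += d;
        }
        // Correction term: removes the error caused by an inexact mean.
        ss -= dev * dev / n;
        // Assumption: clamp the divisor so a single finite value gives zero.
        return Math.sqrt(ss / Math.max(n - 1, 1));
    }

    public static void main(String[] args) {
        // Large offset, small spread: a case where naive formulas lose accuracy.
        System.out.println(standardDeviation(1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16));
    }
}
```

In the example the deviations from the mean are -6, -3, 3 and 6, so the sum of squares is 90 and the corrected variance is 30.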

    Note that adding values using accept and then executing getAsDouble will sometimes give a different, less accurate, result than executing of with the full array of values. The former approach should only be used when the full array of values is not available.

    Supports up to \( 2^{63} \) (exclusive) observations. This implementation does not check for overflow of the count.

    This class is designed to work with (though does not require) streams.

    Note that this instance is not synchronized. If multiple threads access an instance of this class concurrently, and at least one of the threads invokes the accept or combine method, it must be synchronized externally.

    However, it is safe to use accept and combine as accumulator and combiner functions of Collector on a parallel stream, because the parallel implementation of Stream.collect() provides the necessary partitioning, isolation, and merging of results for safe and efficient parallel execution.
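The accumulator and combiner pattern can be sketched with a stand-in class. CombinableStdDevSketch below is hypothetical and only mimics the accept/combine shape described here; its combine step uses the standard pairwise merge of means and sums of squared deviations, which is an assumption about, not a copy of, the library's behaviour. Each worker thread fills its own container, so no external synchronization is needed.

```java
import java.util.stream.DoubleStream;

final class CombinableStdDevSketch {
    private long n;
    private double mean;
    private double m2; // sum of squared deviations from the mean

    /** Accumulator: folds one value into this container. */
    void accept(double x) {
        n++;
        final double dev = x - mean;
        mean += dev / n;
        m2 += dev * (x - mean);
    }

    /** Combiner: merges the state of another container into this one. */
    CombinableStdDevSketch combine(CombinableStdDevSketch other) {
        if (other.n == 0) {
            return this;
        }
        if (n == 0) {
            n = other.n;
            mean = other.mean;
            m2 = other.m2;
            return this;
        }
        final long total = n + other.n;
        final double delta = other.mean - mean;
        m2 += other.m2 + delta * delta * ((double) n * other.n / total);
        mean += delta * other.n / total;
        n = total;
        return this;
    }

    double getAsDouble() {
        // Assumption: clamp the divisor so a single finite value gives zero.
        return n == 0 ? Double.NaN : Math.sqrt(m2 / Math.max(n - 1, 1));
    }

    /** Collects a parallel stream using a fresh container per partition. */
    static double parallelStdDev(double... values) {
        return DoubleStream.of(values).parallel()
            .collect(CombinableStdDevSketch::new,
                     CombinableStdDevSketch::accept,
                     CombinableStdDevSketch::combine)
            .getAsDouble();
    }

    public static void main(String[] args) {
        System.out.println(parallelStdDev(2, 4, 4, 4, 5, 5, 7, 9));
    }
}
```

Because partitions may be merged in any grouping, the parallel result can differ from a sequential one in the last few bits.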

    References:

    • Chan and Lewis (1979) Computing standard deviations: accuracy. Communications of the ACM, 22, 526-531. doi: 10.1145/359146.359152
    • Chan, Golub and Levesque (1983) Algorithms for Computing the Sample Variance: Analysis and Recommendations. American Statistician, 37, 242-247. doi: 10.2307/2683386
    Since:
    1.1
    See Also:
    Standard deviation (Wikipedia), Bessel's correction, Jensen's inequality, Variance
    • Method Detail

      • create

        public static StandardDeviation create()
        Creates an instance.

        The initial result is NaN.

        Returns:
        StandardDeviation instance.
      • of

        public static StandardDeviation of(double... values)
        Returns an instance populated using the input values.

        Note: the standard deviation computed by adding the same values one at a time with accept may differ from the result of this method.

        See StandardDeviation for details on the computing algorithm.

        Parameters:
        values - Values.
        Returns:
        StandardDeviation instance.
      • accept

        public void accept(double value)
        Updates the state of the statistic to reflect the addition of value.
        Specified by:
        accept in interface DoubleConsumer
        Parameters:
        value - Value.
      • getAsDouble

        public double getAsDouble()
        Gets the standard deviation of all input values.

        When no values have been added, the result is NaN.

        Specified by:
        getAsDouble in interface DoubleSupplier
        Returns:
        standard deviation of all values.
      • setBiased

        public StandardDeviation setBiased(boolean v)
        Sets the value of the biased flag. The default value is false. The bias term refers to the computation of the variance; the standard deviation is returned as the square root of the biased or unbiased sample variance. For further details see Variance.setBiased.

        This flag only controls the final computation of the statistic. The value of this flag will not affect compatibility between instances during a combine operation.

        Parameters:
        v - Value.
        Returns:
        this instance
        See Also:
        Variance.setBiased(boolean)