Calculating variance with large numbers
Question
I haven't really used variance calculations much, and I don't quite know what to expect. Actually, I'm not very good at math at all.
I have an array of 1000000 random numeric values in the range 0-10000.
The array could grow even larger, so I use a 64-bit int for the sum.
I have tried to find code showing how to calculate variance, but I don't know whether I'm getting correct output.
The mean is 4692 and the median is 4533. I get a variance of 1483780.469308 using the following code:
// size is the element count, in this case 1000000
// value_sum is __int64
double p2 = pow( (double)(value_sum - (value_sum/size)), (double)2.0 );
double variance = sqrt( (double)(p2 / (size-1)) );
Am I getting a reasonable value?
Is something wrong with the calculation?
Recommended answer
Note: It doesn't look like you're calculating the variance.
Variance is calculated by subtracting the mean from every element, squaring each difference, and taking the weighted sum of these squared differences (here every element gets the same weight, 1/size).
So what you need to do is:
// Get mean
double mean = static_cast<double>(value_sum) / size;
// Calculate variance
double variance = 0;
for (int i = 0; i < size; ++i)
{
    variance += (MyArray[i] - mean) * (MyArray[i] - mean) / size;
}
// Display
std::cout << variance;
Note that this is the (uncorrected) sample variance, which divides by size. It is used when the underlying distribution is unknown, so every element is given the same weight, 1/size.
Also, after some digging around, I found that this is not an unbiased estimator. Wolfram Alpha has something to say about this. As an example, when MATLAB computes the variance, it returns the "bias-corrected sample variance".
The bias-corrected variance can be obtained by dividing each squared difference by size-1 instead of size, or:
//Please check that size > 1
variance += (MyArray[i]-mean)*(MyArray[i]-mean)/(size-1);
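Written as formulas, the two divisors discussed above look as follows. The symbols are notation introduced here for illustration and do not appear in the original answer: N stands for size, x_i for MyArray[i], and \bar{x} for mean.

```latex
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
\sigma_N^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^2, \qquad
s^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2
```

Here \sigma_N^2 is the uncorrected form (divide by size) and s^2 is the bias-corrected form (divide by size-1).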
Also note that the value of mean remains the same.
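For completeness, here is a minimal self-contained sketch of both variants. The function names (mean_of, sum_squared_deviations, population_variance, sample_variance, population_std_dev) and the use of std::vector<int> are assumptions made for illustration; they are not from the original question or answer. The sum is kept in a 64-bit integer, as in the question.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: arithmetic mean, summing in a 64-bit integer
// so that large arrays of values in 0-10000 do not overflow.
double mean_of(const std::vector<int>& values) {
    assert(!values.empty());
    std::int64_t sum = 0;
    for (int v : values) {
        sum += v;
    }
    return static_cast<double>(sum) / static_cast<double>(values.size());
}

// Sum of squared deviations from the mean, shared by both variants.
double sum_squared_deviations(const std::vector<int>& values, double mean) {
    double total = 0.0;
    for (int v : values) {
        const double d = v - mean;
        total += d * d;
    }
    return total;
}

// Uncorrected variance: divides by size.
double population_variance(const std::vector<int>& values) {
    const double n = static_cast<double>(values.size());
    return sum_squared_deviations(values, mean_of(values)) / n;
}

// Bias-corrected variance: divides by size - 1, so size must exceed 1.
double sample_variance(const std::vector<int>& values) {
    assert(values.size() > 1);
    const double n = static_cast<double>(values.size());
    return sum_squared_deviations(values, mean_of(values)) / (n - 1.0);
}

// The square root of the variance is the standard deviation,
// not the variance itself.
double population_std_dev(const std::vector<int>& values) {
    return std::sqrt(population_variance(values));
}
```

For the small input {2, 4, 4, 4, 5, 5, 7, 9}, the mean is 5 and the squared deviations sum to 32, so the uncorrected variance is 4 and the bias-corrected variance is 32/7. The sketch divides once at the end rather than inside the loop; that is a design choice here, not something the original answer prescribes.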