Amortized analysis of std::vector insertion(std::vector 插入的摊销分析)
问题描述
我们如何对 std::vector 中的后面(push_back)插入进行分析?每次插入的摊销时间为 O(1).特别是在 Stephan T Lavavej 的 channel9 视频 和 在此(从 17:42 开始) 他说,为了获得最佳性能,Microsoft 实施此方法将向量的容量增加了大约 1.5.
How do we do the analysis of insertion at the back (push_back) in a std::vector? It's amortized time is O(1) per insertion. In particular in a video in channel9 by Stephan T Lavavej and in this ( 17:42 onwards ) he says that for optimal performance Microsoft's implementation of this method increases capacity of the vector by around 1.5.
这个常数是如何确定的?
How is this constant determined?
推荐答案
假设你的意思是 push_back
而不是插入,我相信重要的部分是乘以某个常数(而不是抓住 N每次都有更多元素),只要你这样做,你就会得到摊销固定时间.更改因子会改变平均情况和最坏情况的性能.
Assuming you mean push_back
and not insertion, I believe that the important part is the multiply by some constant (as opposed to grabbing N more elements each time) and as long as you do this you'll get amortized constant time. Changing the factor changes the average case and worst case performance.
具体来说:如果您的常数因子太大,您将获得良好的平均情况性能,但最坏情况下的性能会很差,尤其是当数组变大时.例如,想象一下将一个 10000 大小的向量加倍 (2x) 只是因为您推送了第 10001 个元素.正如 Michael Burr 间接指出的那样,这里的实际成本可能是您将内存增长得比您需要的大得多.我想补充一点,如果您的因素太大,缓存问题会影响速度.可以说,如果您增长得比您需要的大得多,就会产生实际成本(内存和计算).
Concretely: If your constant factor is too large, you'll have good average case performance, but bad worst case performance especially as the arrays get big. For instance, imagine doubling (2x) a 10000 size vector just because you have the 10001th element pushed. As Michael Burr indirectly pointed out, the real cost here is probably that you'll grow your memory much larger than you need it to be. I would add to this that there are cache issues that affect speed if your factor is too large. Suffice it to say that there are real costs (memory and computation) if you grow much larger than you need.
但是,如果您的常数因子太小,例如 (1.1x),那么您的最坏情况性能将很好,但平均性能很差,因为您将不得不承担重新分配太多次的成本.
However if your constant factor is too small, say (1.1x) then you're going to have good worst case performance, but bad average performance, because you're going to have to incur the cost of reallocating too many times.
另外,请参阅 Jon Skeet 对之前有一个类似的问题.(感谢@Bo Persson)
Also, see Jon Skeet's answer to a similar question previously. (Thanks @Bo Persson)
关于分析的更多信息:假设您有 n
个要推回的项目和 M
的乘法因子.那么重新分配的次数将大致为 n
(log_M(n)
) 的对数基数 M
.而第 i
次重新分配的成本将与 M^i
成正比(M
的 i
次幂).那么所有回推的总时间将是M^1 + M^2 + ... M^(log_M(n))
.回推的次数是n
,因此你得到这个级数(这是一个几何级数,在极限中大约减少到(nM)/(M-1)
) 除以 n
.这大致是一个常数,M/(M-1)
.
A little more about the analysis: Say you have n
items you are pushing back and a multiplication factor of M
. Then the number of reallocations will be roughly log base M
of n
(log_M(n)
). And the i
th reallocation will cost proportional to M^i
(M
to the i
th power). Then the total time of all the pushbacks will be M^1 + M^2 + ... M^(log_M(n))
. The number of pushbacks is n
, and thus you get this series (which is a geometric series, and reduces to roughly (nM)/(M-1)
in the limit) divided by n
. This is roughly a constant, M/(M-1)
.
对于较大的 M
值,您会超调很多并分配比合理经常需要的更多(我在上面提到过).对于 M
的小值(接近 1),这个常数 M/(M-1)
会变大.这个因素直接影响平均时间.
For large values of M
you will overshoot a lot and allocate much more than you need reasonably often (which I mentioned above). For small values of M
(close to 1) this constant M/(M-1)
becomes large. This factor directly affects the average time.
这篇关于std::vector 插入的摊销分析的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!
本文标题为:std::vector 插入的摊销分析
基础教程推荐
- 如何在不破坏 vtbl 的情况下做相当于 memset(this, ...) 的操作? 2022-01-01
- 如何“在 Finder 中显示"或“在资源管理器中显 2021-01-01
- 如何使图像调整大小以在 Qt 中缩放? 2021-01-01
- 在 C++ 中循环遍历所有 Lua 全局变量 2021-01-01
- 使用从字符串中提取的参数调用函数 2022-01-01
- Windows Media Foundation 录制音频 2021-01-01
- 为 C/C++ 中的项目的 makefile 生成依赖项 2022-01-01
- 从 std::cin 读取密码 2021-01-01
- 管理共享内存应该分配多少内存?(助推) 2022-12-07
- 为什么语句不能出现在命名空间范围内? 2021-01-01