HashMap resize方法实现细节

HashMap resize method implementation detail(HashMap resize方法实现细节)

本文介绍了HashMap resize方法实现细节的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

正如标题所示,这是一个关于 HashMap#resize 的实现细节的问题 - 即内部数组的大小翻倍.这有点罗嗦,但我真的试图证明我已经尽我所能理解这一点......

这发生在此特定存储桶/bin 中的条目以 Linked 方式存储时 - 因此具有准确的顺序并且在问题的上下文中 这很重要.

resize 通常也可以从其他地方调用,但我们只看这种情况.

假设您将这些字符串作为键放在 HashMap 中(右侧是 hashcode after HashMap#hash - 这是内部重新散列.)是的,这些是精心生成的,不是随机的.

 DFHXR - 11111YSXFJ-01111图迪 - 11111AXVUH - 01111RUTWZ - 11111德杜克 - 01111WFCVW-11111ZETCU-01111GCVUR - 11111

这里有一个简单的模式需要注意 - 最后 4 位对于所有这些都是相同的 - 这意味着当我们插入 8 个这些键(总共有 9 个)时,它们最终将在同一个桶中;并且在第 9 天 HashMap#put,将调用 resize.

因此,如果当前 HashMap 中有 8 个条目(具有上述键之一) - 这意味着此映射中有 16 个桶,它们的最后 4 位键决定了条目的位置结果.

我们放了第九把钥匙.此时 TREEIFY_THRESHOLD 被命中并且 resize 被调用.bin 加倍为 32 并且键中的另一位决定该条目的去向(所以,现在是 5 位).

最终到达这段代码(当 resize 发生时):

 节点<K,V>loHead = null, loTail = null;节点<K,V>hiHead = null, hiTail = null;节点<K,V>下一个;做 {下一个 = e.下一个;if ((e.hash & oldCap) == 0) {如果(loTail == null)低头 = e;别的loTail.next = e;loTail = e;}别的 {如果(hiTail == null)hiHead = e;别的hiTail.next = e;hiTail = e;}} while ((e = next) != null);如果(loTail!= null){loTail.next = null;新标签[j] = loHead;}如果(hiTail!= null){hiTail.next = null;newTab[j + oldCap] = hiHead;}

实际上并没有那么复杂...它的作用是将当前 bin 拆分为 将移动 到其他 bin 的条目和 不会移动 到其他 bin 的条目垃圾箱 - 但肯定会留在这个垃圾箱中.

它实际上是非常聪明的——通过这段代码:

 if ((e.hash & oldCap) == 0)

它的作用是检查下一位(在我们的例子中是第 5 位)是否实际上为零 - 如果是,则意味着该条目将保持在原位;如果不是,它将在新 bin 中以 2 次方偏移量移动.

现在最后的问题是:调整大小中的那段代码是经过精心制作的,以便它保持条目的顺序在那个 bin 中.

所以在你将这 9 个键放入 HashMap 之后,顺序将是:

DFHXR ->TUDDY ->RUTWZ ->WFCVW->GCVUR(一箱)YSXFJ->AXVUH ->推理->ZETCU(另一个垃圾箱)

为什么要保留 HashMap 中某些条目的顺序.Map 中的顺序真的 很糟糕,详细这里或这里.

解决方案

设计考虑已记录在同一源文件中,在 第 211 行

<块引用>

* 当 bin 列表被树化、拆分或未树化时,我们保持* 它们以相同的相对访问/遍历顺序(即,字段* Node.next) 以更好地保持局部性,并稍微* 简化调用的拆分和遍历的处理*迭代器.删除.在插入时使用比较器,以保持* 总排序(或此处要求的接近)* 重新平衡,我们将类和 identityHashCodes 比较为* 决胜局.

由于通过迭代器删除映射不会触发调整大小,因此在 resize 中特别保留顺序的原因是更好地保留局部性,并稍微简化拆分处理",以及与政策保持一致.

As the title suggests this is a question about an implementation detail from HashMap#resize - that's when the inner array is doubled in size. It's a bit wordy, but I've really tried to prove that I did my best understanding this...

This happens at a point when entries in this particular bucket/bin are stored in a Linked fashion - thus having an exact order and in the context of the question this is important.

Generally the resize could be called from other places as well, but let's look at this case only.

Suppose you put these strings as keys in a HashMap (on the right there's the hashcode after HashMap#hash - that's the internal re-hashing.) Yes, these are carefully generated, not random.

 DFHXR - 11111
 YSXFJ - 01111 
 TUDDY - 11111 
 AXVUH - 01111 
 RUTWZ - 11111
 DEDUC - 01111
 WFCVW - 11111
 ZETCU - 01111
 GCVUR - 11111 

There's a simple pattern to notice here - the last 4 bits are the same for all of them - which means that when we insert 8 of these keys (there are 9 total), they will end-up in the same bucket; and on the 9-th HashMap#put, the resize will be called.

So if currently there are 8 entries (having one of the keys above) in the HashMap - it means there are 16 buckets in this map and the last 4 bits of they key decided where the entries end-up.

We put the nine-th key. At this point TREEIFY_THRESHOLD is hit and resize is called. The bins are doubled to 32 and one more bit from the keys decides where that entry will go (so, 5 bits now).

Ultimately this piece of code is reached (when resize happens):

 Node<K,V> loHead = null, loTail = null;
 Node<K,V> hiHead = null, hiTail = null;
 Node<K,V> next;
 do {
     next = e.next;
     if ((e.hash & oldCap) == 0) {
          if (loTail == null)
               loHead = e;
          else
               loTail.next = e;
          loTail = e;
     }
     else {
        if (hiTail == null)
            hiHead = e;
        else
            hiTail.next = e;
        hiTail = e;
     }
 } while ((e = next) != null);



 if (loTail != null) {
     loTail.next = null;
     newTab[j] = loHead;
 }
 if (hiTail != null) {
     hiTail.next = null;
     newTab[j + oldCap] = hiHead;
 }

It's actually not that complicated... what it does it splits the current bin into entries that will move to other bins and to entries that will not move to other bins - but will remain into this one for sure.

And it's actually pretty smart how it does that - it's via this piece of code:

 if ((e.hash & oldCap) == 0) 

What this does is check if the next bit (the 5-th in our case) is actually zero - if it is, it means that this entry will stay where it is; if it's not it will move with a power of two offset in the new bin.

And now finally the question: that piece of code in the resize is carefully made so that it preserves the order of the entries there was in that bin.

So after you put those 9 keys in the HashMap the order is going to be :

DFHXR -> TUDDY -> RUTWZ -> WFCVW -> GCVUR (one bin)

YSXFJ -> AXVUH -> DEDUC -> ZETCU (another bin)

Why would you want to preserve order of some entries in the HashMap. Order in a Map is really bad as detailed here or here.

解决方案

The design consideration has been documented within the same source file, in a code comment in line 211

* When bin lists are treeified, split, or untreeified, we keep 
* them in the same relative access/traversal order (i.e., field 
* Node.next) to better preserve locality, and to slightly 
* simplify handling of splits and traversals that invoke 
* iterator.remove. When using comparators on insertion, to keep a 
* total ordering (or as close as is required here) across 
* rebalancings, we compare classes and identityHashCodes as 
* tie-breakers. 

Since removing mappings via an iterator can’t trigger a resize, the reasons to retain the order specifically in resize are "to better preserve locality, and to slightly simplify handling of splits", as well as being consistent regarding the policy.

这篇关于HashMap resize方法实现细节的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!

本文标题为:HashMap resize方法实现细节

基础教程推荐