.NET Tuple 和 Equals 性能

.NET Tuple and Equals performance(.NET Tuple 和 Equals 性能)

本文介绍了.NET Tuple 和 Equals 性能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这是我直到今天才注意到的.显然,.NET 实现的常用元组类(Tuple<T>Tuple<T1、T2> 等)会导致 值类型 执行基于等式的操作时.

This is something I had not noticed until today. Apparently, the .NET implementation of the much used tuple classes (Tuple<T>, Tuple<T1, T2> etc) causes boxing penalties for value types when equality based operations are performed.

以下是该类在框架中的实现方式(来自 ILSpy):

Here is how the class is kind of implemented in the framework (source via ILSpy):

public class Tuple<T1, T2> : IStructuralEquatable 
{
    public T1 Item1 { get; private set; }
    public T2 Item2 { get; private set; }

    public Tuple(T1 item1, T2 item2)
    {
        this.Item1 = item1;
        this.Item2 = item2;
    }

    public override bool Equals(object obj)
    {
        return this.Equals(obj, EqualityComparer<object>.Default);
    }

    public override int GetHashCode()
    {
        return this.GetHashCode(EqualityComparer<object>.Default);
    }

    public bool Equals(object obj, IEqualityComparer comparer)
    {
        if (obj == null)
        {
            return false;
        }

        var tuple = obj as Tuple<T1, T2>;
        return tuple != null 
            && comparer.Equals(this.Item1, tuple.Item1) 
            && comparer.Equals(this.Item2, tuple.Item2);
    }

    public int GetHashCode(IEqualityComparer comparer)
    {
        int h1 = comparer.GetHashCode(this.Item1);
        int h2 = comparer.GetHashCode(this.Item2);

        return (h1 << 5) + h1 ^ h2;
    }
}

我看到的问题是它会导致两阶段装箱-拆箱,例如 Equals 调用,一个,在 comparer.Equals 将项目装箱,两个,EqualityComparer 调用 non-generic Equals,而后者又必须在内部将项目拆箱为原始类型.

The problem I see is it causes a two stage boxing-unboxing, say for Equals calls, one, at the comparer.Equals which boxes the item, two, the EqualityComparer<object> calls the non-generic Equals which in turn will internally have to unbox the item to orginal type.

相反,他们为什么不做类似的事情:

Instead why wouldn't they do something like:

public override bool Equals(object obj)
{
    var tuple = obj as Tuple<T1, T2>;
    return tuple != null
        && EqualityComparer<T1>.Default.Equals(this.Item1, tuple.Item1)
        && EqualityComparer<T2>.Default.Equals(this.Item2, tuple.Item2);
}

public override int GetHashCode()
{
    int h1 = EqualityComparer<T1>.Default.GetHashCode(this.Item1);
    int h2 = EqualityComparer<T2>.Default.GetHashCode(this.Item2);

    return (h1 << 5) + h1 ^ h2;
}

public bool Equals(object obj, IEqualityComparer comparer)
{
    var tuple = obj as Tuple<T1, T2>;
    return tuple != null
        && comparer.Equals(this.Item1, tuple.Item1)
        && comparer.Equals(this.Item2, tuple.Item2);
}

public int GetHashCode(IEqualityComparer comparer)
{
    int h1 = comparer.GetHashCode(this.Item1);
    int h2 = comparer.GetHashCode(this.Item2);

    return (h1 << 5) + h1 ^ h2;
}

我很惊讶在 .NET 元组类中以这种方式实现相等.我在其中一个字典中使用元组类型作为键.

I was surprised to see equality implemented this way in .NET tuple class. I was using tuple type as a key in one of the dictionaries.

有什么理由必须按照第一个代码中所示来实现吗?在那种情况下使用这个类有点令人沮丧.

Is there any reason why this has to be implemented as shown in the first code? Its a bit discouraging to make use of this class in that case.

我不认为代码重构和非重复数据应该是主要问题.同样的非泛型/装箱实现也落后于 IStructuralComparable,但由于 IStructuralComparable.CompareTo 使用较少,因此它经常不是问题.

I dont think code refactoring and non-duplicating data should have been the major concerns. The same non-generic/boxing implementation has gone behind IStructuralComparable too, but since IStructuralComparable.CompareTo is less used its not a problem often.

我用第三种方法对上述两种方法进行了基准测试,第三种方法仍然不那么费力,就像这样(仅是必需品):

I benchmarked the above two approaches with a third approach which is still less taxing, like this (only the essentials):

public override bool Equals(object obj)
{
    return this.Equals(obj, EqualityComparer<T1>.Default, EqualityComparer<T2>.Default);
}

public bool Equals(object obj, IEqualityComparer comparer)
{
    return this.Equals(obj, comparer, comparer);
}

private bool Equals(object obj, IEqualityComparer comparer1, IEqualityComparer comparer2)
{
    var tuple = obj as Tuple<T1, T2>;
    return tuple != null
        && comparer1.Equals(this.Item1, tuple.Item1)
        && comparer2.Equals(this.Item2, tuple.Item2);
} 

对于几个 Tuple 字段,有 1000000 个 Equals 调用.结果如下:

for a couple of Tuple<DateTime, DateTime> fields a 1000000 Equals calls. This is the result:

第一种方法(原始 .NET 实现)- 310 毫秒

1st approach (original .NET implementation) - 310 ms

第二次接近 - 60 毫秒

2nd approach - 60 ms

第三种方法 - 130 毫秒

3rd approach - 130 ms

默认实现比最优方案慢约 4-5 倍.

The default implementation is about 4-5 times slower than the optimal solution.

推荐答案

你想知道它是否必须"以这种方式实现.简而言之,我会说不:有许多功能等效的实现.

You wondered if it 'has to' be implemented that way. In short, I would say no: there are many functionally equivalent implementations.

但是为什么现有的实现会如此明确地使用 EqualityComparer.Default?这可能只是写这篇文章的人在心理上针对错误"进行优化的人的情况,或者至少与您在内部循环中的速度场景不同.根据他们的基准,它可能看起来是正确"的事情.

But why does the existing implementation make such explicit usage of EqualityComparer<object>.Default? It may just be a case of the person who wrote this mentally optimizing for the 'wrong', or at least different thing than your scenario of speed in an inner loop. Depending on their benchmark it may appear be the 'right' thing.

但是,什么样的基准场景会导致他们做出这样的选择?好吧,他们所针对的优化似乎是针对最少数量的 EqualityComparer 类模板实例化进行优化.他们可能会选择这个,因为模板实例化会带来内存或加载时间成本.如果是这样,我们可以猜测他们的基准场景可能是基于应用程序启动时间或内存使用情况,而不是一些紧密循环的场景.

But what benchmark scenario could lead them to make that choice? Well the optimization they have targeted seems to be to optimize for the minimum number of EqualityComparer class template instantiations. They might likely choose this because template instantiation comes with memory or load-time costs. If so, we can guess their benchmark scenario could have been based on app-startup-time or memory usage rather than some tight looping scenario.

这是支持该理论的一个知识点(通过使用确认偏差找到:) - 如果 T 是一个结构,则不能共享 EqualityComparer 实现方法体.摘自 http://blogs.microsoft.co.il/sasha/2012/09/18/runtime-representation-of-genericspart-2/

Here is one knowledge point to support the theory (found by using confirmation bias :) - EqualityComparer implementations method bodies cannot be shared if T is a struct. Excerpted from http://blogs.microsoft.co.il/sasha/2012/09/18/runtime-representation-of-genericspart-2/

当 CLR 需要创建一个封闭的泛型类型的实例时,比如List,它会创建一个方法表和EEClass开放式.与往常一样,方法表包含方法指针,它由 JIT 编译器动态编译.然而,有一个这里的关键优化:在封闭泛型上编译的方法体可以共享具有引用类型参数的类型.[...]相同想法不适用于值类型.例如,当 T 很长时,赋值语句 items[size] = item 需要不同的指令,因为必须复制 8 个字节而不是 4 个.甚至更大值类型甚至可能需要不止一条指令;等等.

When the CLR needs to create an instance of a closed generic type, such as List, it creates a method table and EEClass based on the open type. As always, the method table contains method pointers, which are compiled on the fly by the JIT compiler. However, there is a crucial optimization here: compiled method bodies on closed generic types that have reference type parameters can be shared. [...] The same idea does not work for value types. For example, when T is long, the assignment statement items[size] = item requires a different instruction, because 8 bytes must be copied instead of 4. Even larger value types may even require more than one instruction; and so on.

这篇关于.NET Tuple 和 Equals 性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!

本文标题为:.NET Tuple 和 Equals 性能

基础教程推荐