Generate Unique Hash for a Field in SQL Server


Problem description

I'm in the process of writing a Membership Provider for use with our existing membership base. I use EF4.1 for all of my database access, and one of the issues I'm running into is that when the DB was originally set up, the relationships were done programmatically instead of in the database. One of the relationships needs to be made on a column that isn't required for all of our users, but in order to create the relationship the column does need to be unique (from my understanding).

The solution that I believe will work is to do an MD5 hash on the userid field (which is unique, so it would/should guarantee a unique value in that field). The part I'm having issues with on SQL Server is the query that would do this WITHOUT replacing the existing values stored in the employeeNum field (the one in question).

So, in a nutshell, my question is: what is the best way to get a unique value in the employeeNum field (possibly based on an MD5 hash of the userid field) on all the rows in which a value isn't already present? Also, to a minor/major extent, does this sound like a good plan?

Recommended answer

If your question is just how to generate a hash value for userid, you can do it this way using a computed column (or generate this value as part of the insert process). It isn't clear to me whether you know about the HASHBYTES function or what other criteria you're looking at when you say "best."

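-- Table variable with two computed hash columns derived from userid.
DECLARE @foo TABLE
(
  userid INT, 
  -- HASHBYTES takes character or binary input, so the INT is converted to VARCHAR first.
  hash1 AS HASHBYTES('MD5',  CONVERT(VARCHAR(12), userid)),
  hash2 AS HASHBYTES('SHA1', CONVERT(VARCHAR(12), userid))
);

INSERT @foo(userid) SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 500;

SELECT userid, hash1, hash2 FROM @foo;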

Results:

userid  hash1                               hash2
------  ----------------------------------  ------------------------------------------
1       0xC4CA4238A0B923820DCC509A6F75849B  0x356A192B7913B04C54574D18C28D46E6395428AB
2       0xC81E728D9D4C2F636F067F89CC14862C  0xDA4B9237BACCCDF19C0760CAB7AEC4A8359010B0
500     0xCEE631121C2EC9232F3A2F028AD5C89B  0xF83A383C0FA81F295D057F8F5ED0BA4610947817

In SQL Server 2012 or later, I highly recommend at least SHA2_256 instead of either of the above. (You forgot to mention what version you're using, which is always useful information.)
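For illustration, here is a minimal sketch of the same computed-column approach with SHA2_256. It assumes SQL Server 2012 or later, and the column names hash3 and hash_bytes are hypothetical, chosen only for this example.

```sql
-- Sketch only: the same table variable pattern as above, using SHA2_256.
-- Assumes SQL Server 2012 or later; hash3 is a hypothetical column name.
DECLARE @foo TABLE
(
  userid INT,
  hash3 AS HASHBYTES('SHA2_256', CONVERT(VARCHAR(12), userid))
);

INSERT @foo(userid) SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 500;

-- SHA2_256 produces a 32-byte value, so hash_bytes is 32 on every row.
SELECT userid, hash3, DATALENGTH(hash3) AS hash_bytes FROM @foo;
```

Note that the result is 32 bytes wide, twice the size of the MD5 value, so any column that stores it needs to be sized accordingly.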

All that said, I still want to call attention to the point I made in the comments: the "best" solution here is to fix the model. If employeeNum is optional, EF shouldn't be made to think it is required or unique, and it shouldn't be used in relationships if it is not, in fact, some kind of identifier. Why would a user care about collisions between employeeNum and userid if you're using the right attribute for the relationship in the first place?

EDIT as requested by the OP

So what is wrong with saying UPDATE table SET EmployeeNum = 1000000 + UserID WHERE EmployeeNum IS NULL? If EmployeeNum will stay below 1000000, then you've guaranteed no collisions and you've avoided hashing altogether.
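To make that concrete, here is a small sketch against a table variable. The table name @members, the column types, and the sample values are assumptions made for illustration; the pre-check is an addition of mine to confirm that no existing value already sits in the generated range.

```sql
-- Hypothetical stand-in for the real table; names and types are assumptions.
DECLARE @members TABLE
(
  UserID      INT PRIMARY KEY,
  EmployeeNum INT NULL
);

INSERT @members(UserID, EmployeeNum)
SELECT 1, 4521 UNION ALL   -- already has a value: must not change
SELECT 2, NULL UNION ALL   -- missing: should become 1000002
SELECT 500, NULL;          -- missing: should become 1000500

-- The offset is only safe if no existing value is at or above 1000000.
IF EXISTS (SELECT 1 FROM @members WHERE EmployeeNum >= 1000000)
  RAISERROR('Existing EmployeeNum values overlap the generated range.', 16, 1);
ELSE
  UPDATE @members
    SET EmployeeNum = 1000000 + UserID
    WHERE EmployeeNum IS NULL;  -- rows that already have a value are skipped

SELECT UserID, EmployeeNum FROM @members ORDER BY UserID;
```

Because UserID is unique, 1000000 + UserID is unique among the generated values, and the WHERE clause leaves the existing 4521 untouched.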

You could generate similar padding if employeeNum might contain a string, but again, is it EF that promotes these horrible column names? Why would a column with a Num suffix contain anything but a number?
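As a rough sketch of what that padding could look like for a string column: the 'U' prefix, the VARCHAR(20) type, and the @members table are all hypothetical, and the approach assumes that UserID is non-negative and that no existing employeeNum value already starts with 'U' followed by ten digits.

```sql
-- Hypothetical stand-in for the real table; names and types are assumptions.
DECLARE @members TABLE
(
  UserID      INT PRIMARY KEY,
  EmployeeNum VARCHAR(20) NULL
);

INSERT @members(UserID, EmployeeNum)
SELECT 1, 'A-4521' UNION ALL  -- existing string value: left alone
SELECT 2, NULL UNION ALL      -- missing: becomes 'U0000000002'
SELECT 500, NULL;             -- missing: becomes 'U0000000500'

-- Zero-pad UserID to 10 digits (the widest a non-negative INT can be)
-- and add a prefix that, by assumption, no real employeeNum uses.
UPDATE @members
  SET EmployeeNum = 'U' + RIGHT(REPLICATE('0', 10) + CONVERT(VARCHAR(10), UserID), 10)
  WHERE EmployeeNum IS NULL;

SELECT UserID, EmployeeNum FROM @members ORDER BY UserID;
```

Padding to ten digits means RIGHT never truncates a non-negative INT, so distinct UserID values always yield distinct strings.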
