natural key vs surrogate key an innodb foreign key(自然键与代理键一个innodb外键)
问题描述
一个问题:
我有 2 张桌子:
Product
id INT
name VARCHAR(64)
something TEXT
else INT
entirely BOOL
和
Ingredient
id INT
name VARCHAR(64)
description TEXT
现在我也有一个链接表
Products_Ingredients
product_id INT
ingredient_id INT
对于我的多对多关系.
现在产品和成分都将拥有唯一的名称.所以我可以使用名称作为自然键......但这会是个好主意吗?
Now both products and ingredients will have unique names. So I can use names as natural keys... however will that be a good idea?
假设我有一个产品:Paint Thinner Supreme
含成分:丁基硝基四环素
Say I have a product: Paint Thinner Supreme
with ingredient: Butylonitrotetrocycline
将这些名称用作链接表中的复合键是个好主意吗?
Will that be a good idea to use those names as composite key in the link table?
尽管我了解在代理项上使用自然键背后的想法,但我还是忍不住想,使用简单整数作为主键(和外键)会快得多.MySQL 服务器消化这些不同密钥的方式会有所不同吗?
As much as I understand idea behind using natural keys over the surrogates, I kinda can't stop thinking that using simple integers as primary keys (and foreign ones) will be much faster. Will there be a difference in a way in which MySQL server digests those different keys?
你的意见是什么?
推荐答案
什么时候可以衡量,意见并不重要.
Opinions don't matter when you can measure.
我使用自然键和代理在 PostgreSQL 上实现了这一点.我总共使用了 300,000 种产品、180 种成分,并为 100,000 种随机选择的产品(1053462 行)填充了两个产品成分"表,每种产品包含 3 到 17 种成分.
I implemented this on PostgreSQL using both natural keys and surrogates. I used 300,000 total products, 180 ingredients, and populated two "product ingredient" tables with 3 to 17 ingredients per product, for 100,000 randomly selected products (1053462 rows).
使用自然键选择单个产品的所有成分,返回时间为 0.067 毫秒.使用代理,0.199ms.
Selecting all the ingredients for a single product using natural keys returned in 0.067 ms. Using surrogates, 0.199ms.
使用在 0.145 毫秒内返回的自然键返回单个产品的所有非 ID 列.使用代理,0.222 ms
Returning all the non-id columns for a single product using natural keys returned in 0.145 ms. Using surrogates, 0.222 ms
所以自然键在这个数据集上的速度大约快 2 到 3 倍.
So natural keys were about 2 to 3 times faster on this data set.
自然键不需要任何连接即可返回此数据.代理键需要两个连接.
Natural keys don't require any joins to return this data. Surrogate keys require two joins.
实际的性能差异取决于表的宽度、行数、页面大小和名称的长度,等等.会有一个点,代理键开始优于自然键,但很少有人尝试衡量这一点.
The actual performance difference depends on the width of your tables, number of rows, page size, and length of names, and things like that. There will be a point where surrogate keys start outperforming natural keys, but few people try to measure that.
当我为雇主的运营数据库设计数据库时,我构建了一个测试平台,其中包含围绕自然键设计的表格和围绕 ID 数字设计的表格.这两种模式都有超过 1300 万行计算机生成的样本数据.在少数情况下,对 id 编号模式的查询比自然键模式高出 50%.(因此,使用 id 号花费 20 秒的复杂查询使用自然键需要 30 秒.)但是 80% 的测试查询对自然键模式具有更快的 SELECT 性能.有时它的速度快得惊人——相差 30 比 1.
When I was designing the database for my employer's operational database, I built a testbed with tables designed around natural keys and with tables designed around id numbers. Both those schemas have more than 13 million rows of computer-generated sample data. In a few cases, queries on the id number schema outperformed the natural key schema by 50%. (So a complex query that took 20 seconds with id numbers took 30 seconds with natural keys.) But 80% of the test queries had faster SELECT performance against the natural key schema. And sometimes it was staggeringly faster--a difference of 30 to 1.
我们预计在未来几年内,自然键的性能将优于我们数据库中的代理.(除非我们将某些表移至 SSD,在这种情况下,自然键可能永远胜过代理.)
We expect natural keys to outperform surrogates in our database for years to come. (Unless we move certain tables over to an SSD, in which case natural keys will probably outperform surrogates forever.)
这篇关于自然键与代理键一个innodb外键的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!
本文标题为:自然键与代理键一个innodb外键
基础教程推荐
- 将数据从 MS SQL 迁移到 PostgreSQL? 2022-01-01
- 使用pyodbc“不安全"的Python多处理和数据库访问? 2022-01-01
- SQL Server:只有 GROUP BY 中的最后一个条目 2021-01-01
- SQL Server 中单行 MERGE/upsert 的语法 2021-01-01
- 如何在 SQL Server 的嵌套过程中处理事务? 2021-01-01
- 在 VB.NET 中更新 SQL Server DateTime 列 2021-01-01
- SQL Server 2016更改对象所有者 2022-01-01
- Sql Server 字符串到日期的转换 2021-01-01
- ERROR 2006 (HY000): MySQL 服务器已经消失 2021-01-01
- 无法在 ubuntu 中启动 mysql 服务器 2021-01-01