Gaps and islands on 2 columns - if column A consecutive and column B identical(2 列上的间隙和孤岛 - 如果 A 列连续且 B 列相同)
问题描述
我有一张如下表:
CREATE TABLE `table` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`cc` int(3) unsigned NOT NULL,
`number` int(10) NOT NULL,
`name` varchar(64) NOT NULL,
`datetime` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB
DBMS 是 Debian 9.1 上的 MariaDB 10.1.26.我一直试图让它列出连续数字的范围.通过以下查询,我能够做到这一点:
The DBMS is MariaDB 10.1.26 on Debian 9.1. I had been trying to get it to list ranges of consecutive numbers. With the following query, I am able to accomplish that:
SELECT min(number) first_number, max(number) last_number, count(*) AS no_records FROM (
SELECT c.*, @rn := @rn + 1 rn
from (SELECT number FROM table WHERE cc = 1 GROUP BY number ORDER BY number) AS c
CROSS JOIN (SELECT @rn := 0) r
) c
GROUP BY number - rn ORDER BY number ASC
但是,如果我希望根据附加列中的值将项目组合在一起,则此方法行不通.假设我希望仅当 name
的值都相同时才对项目进行分组.说这是我的数据:
But if I wanted items bunched together based on the value in an additional column, this doesn't work. Say I want the items grouped only when their values for name
are all the same. Say this is my data:
INSERT INTO `table` (`id`, `cc`, `number`, `name`) VALUES
(1, 1, 12, 'Hello'),
(2, 1, 2, 'Apple'),
(3, 1, 3, 'Bean'),
(4, 1, 10, 'Hello'),
(5, 1, 11, 'Hello'),
(6, 1, 1, 'Apple'),
(7, 1, 14, 'Deer'),
(8, 1, 14, 'Door'),
(9, 1, 15, 'Hello'),
(10, 1, 17, 'Hello'),
我想得到这样的报告:
first last count name
1 2 2 Apple
3 3 1 Bean
10 12 3 Hello
14 14 1 Deer
14 14 1 Door
15 15 1 Hello
17 17 1 Hello
换句话说,除了对连续的项目进行分组外,当它们的 name
值不同时,这些组还会被分成不同的组.(换句话说,只有当它们都是连续的并且具有相同的确切 name
时,它们才一起在一个岛中).我来的最近(而且不是很近)是这样做的:
In other words, in addition to grouping items that are consecutive, these groups are split up into separate groups when their values of name
differ. (In other words, items are only in an island together if they are all consecutive AND have the same exact name
). The closest (and it's not very close) that I have come, is doing this:
SELECT min(number) first_number, max(number) last_number, count(*) AS no_records FROM (
SELECT c.*, @rn := @rn + 1 rn
from (SELECT number FROM table WHERE cc = 1 GROUP BY number, name ORDER BY number) AS c
CROSS JOIN (SELECT @rn := 0) r
) c
GROUP BY number - rn, name ORDER BY number ASC
不过,这不起作用,发生的情况是它似乎将名称的第一次出现返回为 first
,最后一次出现返回为 last
,其中 no_records
是它们之间的数量差异,这肯定是不对的.
This doesn't work, though, and what happens is it seems to return the first appearance of a name as first
and the last appearance as last
, with no_records
being the difference in number between them, which is certainly not right at all.
我觉得这个问题可能是相关的,但我无法理解它,当我尝试将它调整到我的表时,它或多或少地相当于一个简单的 SELECT *
.我需要对我的查询进行哪些修改才能使其正常工作?
I feel like this question might be related, but I've not been able to make much sense of it, and when I tried tweaking it to my table, it just did the equivalent of a simple SELECT *
more or less. What modifications to my query do I need to make to get it to work?
记住:
- 项目可以按任意顺序插入
- 数字可以重复
- 名称可以重复,不一定连续
推荐答案
您的查询没有太大变化.您基本上需要在子查询中选择 name
和 number
并按相同的顺序排序.然后您可以在外部查询中按 name, number - rn
进行分组.
There is not much to change in your query. You basically need to select name
and number
in the subquery and sort in the same order. Then you can group by name, number - rn
in the outer query.
SELECT
min(number) first_number,
max(number) last_number,
count(*) AS no_records,
name
FROM (
SELECT c.*, @rn := @rn + 1 rn
from (
SELECT name, number
FROM `table`
WHERE cc = 1
ORDER BY name, number
LIMIT 99999999999999999
) AS c
CROSS JOIN (SELECT @rn := 0) r
) c
GROUP BY name, number - rn
ORDER BY first_number ASC, name ASC;
结果:
first_number last_number no_records name
1 2 2 Apple
3 3 1 Bean
10 12 3 Hello
14 14 1 Deer
14 14 1 Door
15 15 1 Hello
17 17 1 Hello
db<>小提琴
我通常反对以这种方式使用会话变量.原因是此类解决方案依赖于内部实现,并且可能被版本更新或设置更改破坏.例如:一旦 MariaDB 决定在没有 LIMIT 的情况下忽略子查询中的 ORDER BY 子句.这就是为什么我包含了一个巨大的 LIMIT.
I usually advocate against the use of session variables in this way. The reason is that such solutions depend on internal implementation, and can be broken by version updates or settings changes. For example: Once MariaDB decided to ignore the ORDER BY clause in subqueries without LIMIT. This is why I included a huge LIMIT.
我还在外部 ORDER BY 子句中将 number
替换为 first_number
以避免 ONLY_FULL_GROUP_BY 模式出现问题.
I also replaced number
with first_number
in the outer ORDER BY clause to avoid problems with ONLY_FULL_GROUP_BY mode.
一种更稳定的生成行号的方法是在临时表中使用 AOTO_INCREMENT 列:
A more stable way to generate row numbers is to use an AOTO_INCREMENT column in a temporary table:
drop temporary table if exists tmp_tbl;
create temporary table tmp_tbl (
rn int unsigned auto_increment primary key,
name varchar(64) not null,
number int not null
);
insert into tmp_tbl (name, number)
select name, number
from `table`
order by name, number;
最终的 SELECT 查询与上面的外部查询相同:
The final SELECT query is identical with the outer query above:
SELECT
min(number) first_number,
max(number) last_number,
count(*) AS no_records,
name
FROM tmp_tbl
GROUP BY name, number - rn
ORDER BY first_number ASC, name ASC;
db<>小提琴
在更新的版本中(从 MariaDB 10.2 开始),您可以改用 ROW_NUMBER()
窗口函数:
In a more recent version (starting from MariaDB 10.2) you can use ROW_NUMBER()
window function instead:
SELECT
min(number) first_number,
max(number) last_number,
count(*) AS no_records,
name
FROM (
SELECT
name,
number,
row_number() OVER (ORDER BY name, number) as rn
FROM `table`
WHERE cc = 1
) c
GROUP BY name, number - rn
ORDER BY first_number ASC, name ASC;
db<>小提琴
这篇关于2 列上的间隙和孤岛 - 如果 A 列连续且 B 列相同的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!
本文标题为:2 列上的间隙和孤岛 - 如果 A 列连续且 B 列相同
基础教程推荐
- SQL Server:只有 GROUP BY 中的最后一个条目 2021-01-01
- 如何在 SQL Server 的嵌套过程中处理事务? 2021-01-01
- SQL Server 中单行 MERGE/upsert 的语法 2021-01-01
- 将数据从 MS SQL 迁移到 PostgreSQL? 2022-01-01
- 使用pyodbc“不安全"的Python多处理和数据库访问? 2022-01-01
- 无法在 ubuntu 中启动 mysql 服务器 2021-01-01
- Sql Server 字符串到日期的转换 2021-01-01
- SQL Server 2016更改对象所有者 2022-01-01
- 在 VB.NET 中更新 SQL Server DateTime 列 2021-01-01
- ERROR 2006 (HY000): MySQL 服务器已经消失 2021-01-01