使用“ group by”子句优化MySQL查询“ count（*）”|软件开发问答库

我正在使用MySQL 5.7，下面的sql运行大约10+秒，而总共只有60K数据，并且没有太多列，我不为什么要执行那么长时间。为了优化SQL，我添加了索引并更新了SQL，但是没有用。

研究：当我删除where子句时，它运行400毫秒，添加它并删除count（*）后，它也要花费400毫秒。

SELECT adt.data_source_id AS groupName, count(1) AS count
    FROM assets_data_table adt
    WHERE adt.is_deleted = 0
      AND adt.tenant_id = 2
      AND adt.sync_status = 1
    GROUP BY adt.data_source_id;

如下所示显示我的表DDL：

create table assets_data_table
(
    id bigint unsigned auto_increment comment 'pk'
        primary key,
    create_at datetime default CURRENT_TIMESTAMP not null comment 'create date',
    update_at datetime default CURRENT_TIMESTAMP not null on update CURRENT_TIMESTAMP comment 'update date',
    create_by int default 0 not null ,
    update_by int default 0 not null ,
    is_deleted tinyint(1) default 0 not null comment '1 deleted，0 undeleted',
    table_name varchar(100) not null ,
    table_name_cn varchar(100) null ,
    data_source_id bigint not null ,
    sync_status tinyint(1) default 0 not null ,
    db_id bigint not null ,
    schema_name varchar(100) null ,
    table_storage bigint default 0 not null ,
    table_owner bigint default 0 not null ,
    table_hot bigint default 0 not null ,
    extra_attribute json null comment 'json string',
    origin_table_id varchar(32) null ,
    tenant_id bigint not null ,
    sync_date datetime null ,
    table_create_at datetime null 
)
comment 'table';
# added to optimize sql
create index assets_data_table_ck
    on assets_data_table (is_deleted, tenant_id, sync_status);
# added to optimize sql
create index assets_data_table_ck_1
    on assets_data_table (is_deleted, tenant_id, sync_status, data_source_id);

create index idx_datasource_id
    on assets_data_table (data_source_id);

create index idx_dbid
    on assets_data_table (db_id);

解释：

| id | select\_type | table | partitions | type | possible\_keys | key | key\_len | ref | rows | filtered | Extra |
| 1 | SIMPLE | adt | NULL | ref | idx\_datasource\_id,assets\_data\_table\_ck,assets\_data\_table\_ck\_1 | assets\_data\_table\_ck\_1 | 10 | const,const,const | 28218 | 100 | Using where; Using index |

在此处附加查询结果：

我的优化：使用子查询，第一步：选择所需的列； 2.使用group by子句。它可以使用索引，只需要400毫秒即可执行。

SELECT adt.data_source_id AS groupName, count(*) AS count
FROM (select is_deleted, tenant_id, sync_status, data_source_id from assets_data_table
      WHERE is_deleted = 0
        AND tenant_id = 2
        AND sync_status = 1) adt
GROUP BY adt.data_source_id;

| id | select\_type | table | partitions | type | possible\_keys | key | key\_len | ref | rows | filtered | Extra |

| 1 | SIMPLE | assets\_data\_table | NULL | ref | idx\_datasource\_id,assets\_data\_table\_ck,assets\_data\_table\_ck\_1 | assets\_data\_table\_ck\_1 | 10 | const,const,const | 28218 | 100 | Using where; Using index |

我的问题：

是否有其他方法可以优化此查询？
group by子句如何工作，为什么在我使用子查询之后它可以通过索引，它可以以其他方式使用索引吗？

谢谢！

WHERE adt.is_deleted = 0
  AND adt.tenant_id = 2
  AND adt.sync_status = 1
GROUP BY adt.data_source_id;

需要4列的复合索引：

 INDEX(is_deleted, tenant_id, sync_status, -- in any order
       data_source_id)   -- last

并且，由于仅使用了这4列，因此索引处于“覆盖”状态，因此使其速度可能是scaisEdge的Answer的两倍。在EXPLAIN中用“使用索引”（不是“使用索引条件”）表示“覆盖”。

嵌套查询建议不太可能对您有帮助。（它在其他情况下也有帮助。）

如果您同时拥有

(is_deleted, tenant_id, sync_status)
(is_deleted, tenant_id, sync_status, data_source_id)

第一个是“冗余”。另外，当第二个更好时，优化器可能会错误地使用第一个。

BIGINT很大（8个字节）。缩小它可能会有所帮助。