
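How should I choose compaction settings for Doris in production?

Asked by: 帅平 | Category: Databases

How should I choose compaction settings for Doris in production? I see there are Vertical Compaction and Segment Compaction. Which one should I pick?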

Doris Compaction

Posted: 7 months ago (12-17) | IP location: Sichuan Province

4 answers

没你的江山怎么画

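These two kinds of compaction target different scenarios.
1. Vertical Compaction suits wide tables with many columns. It lowers memory overhead and speeds up compaction. Configure it in be.conf as follows: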

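# Enable the feature
enable_vertical_compaction = true
# Number of columns per column group. In testing, the default of 5 columns per group gave a good balance of compaction efficiency and memory usage.
vertical_compaction_num_columns_per_group = 5
# Size of the segment files written by vertical compaction. Default: 268435456 bytes (256 MiB)
vertical_compaction_max_segment_size = 268435456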

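2. Segment Compaction suits loads that import a large amount of data in a single batch. It merges multiple segments to reduce the number of files ultimately produced. Configure it in be.conf as follows: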

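# Enable the feature
enable_segcompaction = true
# Merge interval. The default of 10 means a segment compaction runs each time 10 segment files are generated. Values of 10 to 30 are typical; larger values increase the memory used by segment compaction.
segcompaction_batch_size = 25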

For our everyday workloads, we use Segment Compaction.
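To make the rule of thumb above concrete, here is a minimal Python sketch that prints the be.conf lines for a given workload. The function name and its two flags (`wide_table`, `large_single_batch_loads`) are hypothetical and exist only for illustration; the config keys and values are the ones quoted in this answer. It assumes the two features are independent and can be enabled together, which you should confirm against the documentation for your Doris version.

```python
def render_be_conf(wide_table: bool, large_single_batch_loads: bool) -> str:
    """Return be.conf lines for the compaction features discussed above.

    Both parameters are hypothetical workload flags, not Doris settings.
    """
    lines = []
    if wide_table:
        # Vertical compaction: merge column groups one at a time.
        lines += [
            "enable_vertical_compaction = true",
            "vertical_compaction_num_columns_per_group = 5",
            "vertical_compaction_max_segment_size = 268435456",
        ]
    if large_single_batch_loads:
        # Segment compaction: merge segments while a large load is running.
        lines += [
            "enable_segcompaction = true",
            "segcompaction_batch_size = 25",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_be_conf(wide_table=True, large_single_batch_loads=True))
```

Run as a script, it prints all five settings, which can be pasted into be.conf on each BE node.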

Posted: 7 months ago (12-17) | IP location: Sichuan Province

穿越到古代找美女

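Remember to include this property when you create the table:

"enable_single_replica_compaction" = "true"

This enables single-replica compaction: one replica performs the compaction and the other replicas fetch the result, which saves CPU and I/O across the cluster. For example: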

CREATE TABLE `orders` (
  `order_id` int NULL,
  `user_id` int NULL,
  `order_status` int NULL,
  `payment_method` int NULL,
  `payable_amount` decimal(38,9) NULL,
  `cts` date NULL
) ENGINE=OLAP
DUPLICATE KEY(`order_id`)
DISTRIBUTED BY HASH(`order_id`) BUCKETS 2
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"enable_single_replica_compaction" = "true"
);

Posted: 7 months ago (12-17) | IP location: Sichuan Province

伤疤

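Remember to include this property when you create the table:

"enable_single_replica_compaction" = "true"

This enables single-replica compaction, which saves CPU and I/O across the cluster. For example: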

CREATE TABLE `orders` (
  `order_id` int NULL,
  `user_id` int NULL,
  `order_status` int NULL,
  `payment_method` int NULL,
  `payable_amount` decimal(38,9) NULL,
  `cts` date NULL
) ENGINE=OLAP
DUPLICATE KEY(`order_id`)
DISTRIBUTED BY HASH(`order_id`) BUCKETS 2
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"enable_single_replica_compaction" = "true"
);

Note that enable_single_replica_compaction = true is supported only on DUPLICATE and AGGREGATE tables, not on UNIQUE tables.
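If you generate DDL from scripts, that restriction is easy to encode. The sketch below is only an illustration: `table_properties` and `SUPPORTED_MODELS` are hypothetical names, and the supported-model list simply restates the claim above rather than anything checked against the Doris source. It builds a PROPERTIES clause and adds the property only for DUPLICATE and AGGREGATE tables.

```python
# Key models for which the answer above says the property is supported.
SUPPORTED_MODELS = {"DUPLICATE", "AGGREGATE"}


def table_properties(key_model: str, replicas: int) -> str:
    """Build a PROPERTIES clause for a CREATE TABLE statement.

    key_model is the table's key model, e.g. "DUPLICATE" or "UNIQUE".
    replicas is the replica count for the default resource tag.
    """
    props = {"replication_allocation": f"tag.location.default: {replicas}"}
    if key_model.strip().upper() in SUPPORTED_MODELS:
        props["enable_single_replica_compaction"] = "true"
    body = ",\n".join(f'"{key}" = "{value}"' for key, value in props.items())
    return f"PROPERTIES (\n{body}\n)"


if __name__ == "__main__":
    print(table_properties("DUPLICATE", 3))
    print(table_properties("UNIQUE", 3))
```

The first call includes the property and the second omits it, so a UNIQUE table never receives a setting it is said not to support.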

Posted: 7 months ago (12-17) | IP location: Sichuan Province
