ClickHouse: Cluster Setup and Data Replication

Published: 2024/4/14

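Earlier posts gave a brief introduction to ClickHouse and ran some simple performance tests. This post covers setting up a cluster and replicating data. Replication requires ZooKeeper.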

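Environment:

1. Three machines (three virtual machines in my case), each with ClickHouse installed.

2. Add hosts entries on all three machines. This is optional; you can write the IPs directly in the configuration file instead. The entries are:

192.168.0.10 db_server_yayun_01
192.168.0.20 db_server_yayun_02
192.168.0.30 db_server_yayun_03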

3. Create the substitutions file. It does not exist by default. /etc/clickhouse-server/config.xml contains this hint:
If element has 'incl' attribute, then for it's value will be used corresponding substitution from another file.
By default, path to file with substitutions is /etc/metrika.xml. It could be changed in config in 'include_from' element.
Values for substitutions are specified in /yandex/name_of_substitution elements in that file.

The contents of /etc/metrika.xml are as follows (internal_replication belongs directly under each shard, not inside replica):

<yandex>
    <clickhouse_remote_servers>
        <perftest_3shards_1replicas>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>db_server_yayun_01</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>db_server_yayun_02</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>db_server_yayun_03</host>
                    <port>9000</port>
                </replica>
            </shard>
        </perftest_3shards_1replicas>
    </clickhouse_remote_servers>
    <zookeeper-servers>
        <node index="1">
            <host>192.168.0.30</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>
    <macros>
        <replica>192.168.0.10</replica>
    </macros>
    <networks>
        <ip>::/0</ip>
    </networks>
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
        </case>
    </clickhouse_compression>
</yandex>
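To double-check the layout after editing, the cluster section can be read back programmatically. What follows is a minimal sketch in Python and not part of the original setup. It embeds a trimmed copy of the cluster section as a string (in practice one would read /etc/metrika.xml), and the helper name list_shards is hypothetical. It looks for internal_replication only as a direct child of a shard, which is where the ClickHouse documentation places it, so a misplaced flag shows up as False.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the cluster section, embedded so the sketch is self-contained.
CONFIG = """
<yandex>
  <clickhouse_remote_servers>
    <perftest_3shards_1replicas>
      <shard>
        <internal_replication>true</internal_replication>
        <replica><host>db_server_yayun_01</host><port>9000</port></replica>
      </shard>
      <shard>
        <internal_replication>true</internal_replication>
        <replica><host>db_server_yayun_02</host><port>9000</port></replica>
      </shard>
      <shard>
        <internal_replication>true</internal_replication>
        <replica><host>db_server_yayun_03</host><port>9000</port></replica>
      </shard>
    </perftest_3shards_1replicas>
  </clickhouse_remote_servers>
</yandex>
"""


def list_shards(xml_text, cluster):
    """Return one dict per shard: its internal_replication flag and (host, port) replicas."""
    root = ET.fromstring(xml_text)
    cluster_node = root.find(f"clickhouse_remote_servers/{cluster}")
    if cluster_node is None:
        raise KeyError(f"cluster {cluster!r} not found")
    shards = []
    for shard in cluster_node.findall("shard"):
        # Only a direct child of <shard> counts; a missing flag is treated as false.
        flag = shard.findtext("internal_replication", default="false").strip()
        replicas = [
            (replica.findtext("host"), int(replica.findtext("port")))
            for replica in shard.findall("replica")
        ]
        shards.append({"internal_replication": flag == "true", "replicas": replicas})
    return shards


shards = list_shards(CONFIG, "perftest_3shards_1replicas")
for number, shard in enumerate(shards, start=1):
    print(number, shard["internal_replication"], shard["replicas"])
```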

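The configuration file is the same on all three machines except for the <replica> value inside <macros>.

Set it to each server's own IP. It does not have to be an IP; any value works as long as it is unique across the three machines. Replication relies on this setting. The ZooKeeper configuration is:

<zookeeper-servers>
    <node index="1">
        <host>192.168.0.30</host>
        <port>2181</port>
    </node>
</zookeeper-servers>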

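My ZooKeeper runs on the .30 machine as a single instance. In production it should run on dedicated machines and be configured as a cluster. After editing the configuration file, restart all three servers.
The steps given in the official documentation are:

ClickHouse deployment to cluster

ClickHouse cluster is a homogenous cluster. Steps to set up:
1. Install ClickHouse server on all machines of the cluster
2. Set up cluster configs in configuration file
3. Create local tables on each instance
4. Create a Distributed table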

The first two steps are done. Next, create the local table and then the Distributed table. (Create both on all three machines; DDL is not synchronized between nodes, which is a nuisance.)

CREATE TABLE ontime_local (FlightDate Date, Year UInt16)
ENGINE = MergeTree(FlightDate, (Year, FlightDate), 8192);

CREATE TABLE ontime_all AS ontime_local
ENGINE = Distributed(perftest_3shards_1replicas, default, ontime_local, rand());
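The last argument of Distributed, rand(), is the sharding key. As documented for the Distributed engine, a row goes to the shard whose weight interval contains the remainder of the key divided by the total weight of all shards. The Python sketch below models that rule. It assumes every shard keeps the default weight of 1 (no weight is set in the configuration above), uses Python's random module as a stand-in for rand(), and the function name pick_shard is hypothetical.

```python
import random
from collections import Counter


def pick_shard(key, weights):
    """Return the 0-based shard index for a sharding-key value.

    The remainder key % sum(weights) is matched against consecutive
    intervals, one per shard, each as wide as that shard's weight.
    """
    remainder = key % sum(weights)
    upper = 0
    for index, weight in enumerate(weights):
        upper += weight
        if remainder < upper:
            return index
    raise AssertionError("unreachable: remainder is always below the total weight")


WEIGHTS = [1, 1, 1]  # three shards, assumed default weight 1 each

rng = random.Random(42)  # stand-in for rand(), seeded so runs are repeatable
counts = Counter(pick_shard(rng.getrandbits(32), WEIGHTS) for _ in range(30000))
print(sorted(counts.items()))
```

With a random key the rows spread roughly evenly over many inserts, but a handful of rows will not necessarily land on different shards.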

Insert some data (from any one of the machines):

:) insert into ontime_all (FlightDate,Year)values('2001-10-12',2001);

INSERT INTO ontime_all (FlightDate, Year) VALUES

Ok.

1 rows in set. Elapsed: 0.013 sec.

:) insert into ontime_all (FlightDate,Year)values('2002-10-12',2002);

INSERT INTO ontime_all (FlightDate, Year) VALUES

Ok.

1 rows in set. Elapsed: 0.004 sec.

:) insert into ontime_all (FlightDate,Year)values('2003-10-12',2003);

INSERT INTO ontime_all (FlightDate, Year) VALUES

Ok.

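I inserted three rows here. They can then be queried from any of the machines.

When a query runs on one machine, a packet capture on the other machines shows that they receive requests: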

tcpdump -i any -s 0 -l -w - dst port 9000

What happens if one of the machines is shut down?

:) select * from ontime_all;

SELECT * FROM ontime_all

┌─FlightDate─┬─Year─┐
│ 2001-10-12 │ 2001 │
└────────────┴──────┘
┌─FlightDate─┬─Year─┐
│ 2002-10-12 │ 2002 │
└────────────┴──────┘
┌─FlightDate─┬─Year─┐
│ 2003-10-12 │ 2003 │
└────────────┴──────┘
↓ Progress: 6.00 rows, 24.00 B (292.80 rows/s., 1.17 KB/s.)
Received exception from server:
Code: 279. DB::Exception: Received from localhost:9000, ::1. DB::NetException. DB::NetException: All connection tries failed. Log:

Code: 210, e.displayText() = DB::NetException: Connection refused: (db_server_yayun_02:9000, 192.168.0.20), e.what() = DB::NetException
Code: 210, e.displayText() = DB::NetException: Connection refused: (db_server_yayun_02:9000, 192.168.0.20), e.what() = DB::NetException
Code: 210, e.displayText() = DB::NetException: Connection refused: (db_server_yayun_02:9000, 192.168.0.20), e.what() = DB::NetException

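As you can see, the query throws an error. So this setup is not highly available? Later I found another configuration in the documentation: two nodes with two replicas. In my tests high availability worked with that layout, and queries were still distributed and run in parallel. Interested readers can try it themselves.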
https://clickhouse.yandex/reference_en.html#Distributed
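The failure follows from the layout: every shard has exactly one replica, so when db_server_yayun_02 is down, no server is left to answer for its shard. Below is a rough Python model of that reasoning. It assumes a SELECT on a Distributed table needs at least one reachable replica in every shard, which matches the error above. The two-replica layout in the sketch is a hypothetical example, not the configuration tested in this post, and the function name unreachable_shards is made up for illustration.

```python
def unreachable_shards(cluster, down_hosts):
    """Return the 1-based numbers of shards in which every replica is down."""
    return [
        number
        for number, replicas in enumerate(cluster, start=1)
        if all(host in down_hosts for host in replicas)
    ]


# Layout from this post: three shards, one replica each.
three_shards_one_replica = [
    ["db_server_yayun_01"],
    ["db_server_yayun_02"],
    ["db_server_yayun_03"],
]

# Hypothetical alternative: a single shard holding two replicas.
one_shard_two_replicas = [["db_server_yayun_01", "db_server_yayun_02"]]

down = {"db_server_yayun_02"}
print(unreachable_shards(three_shards_one_replica, down))  # prints [2]
print(unreachable_shards(one_shard_two_replicas, down))    # prints []
```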

Now to test data replication. ZooKeeper is already configured, so create the table directly (on all three machines):

CREATE TABLE ontime_replica (FlightDate Date, Year UInt16)
ENGINE = ReplicatedMergeTree('/clickhouse_perftest/tables/ontime_replica', '{replica}', FlightDate, (Year, FlightDate), 8192);
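The '{replica}' argument is filled in from the macros section of each server's configuration, which is why that value has to differ per machine: all three tables share one ZooKeeper path, and the replica name is what tells them apart. The Python sketch below imitates the substitution. It assumes each machine uses its own IP as the macro value (the configuration shown earlier only lists 192.168.0.10, so the other two values are assumptions), and expand_macros is a hypothetical helper, not a ClickHouse API.

```python
import re

ZK_PATH = "/clickhouse_perftest/tables/ontime_replica"
REPLICA_ARG = "{replica}"

# Assumed per-host macro values: each machine uses its own IP.
MACROS_BY_HOST = {
    "db_server_yayun_01": {"replica": "192.168.0.10"},
    "db_server_yayun_02": {"replica": "192.168.0.20"},
    "db_server_yayun_03": {"replica": "192.168.0.30"},
}


def expand_macros(text, macros):
    """Replace every {name} in text with macros[name]; unknown names raise KeyError."""
    return re.sub(r"\{(\w+)\}", lambda match: macros[match.group(1)], text)


replica_names = {
    host: expand_macros(REPLICA_ARG, macros) for host, macros in MACROS_BY_HOST.items()
}

# All replicas share one ZooKeeper path, so their names must not collide.
assert len(set(replica_names.values())) == len(replica_names), "duplicate replica name"

for host, name in replica_names.items():
    print(f"{host}: ReplicatedMergeTree('{ZK_PATH}', '{name}', ...)")
```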

Insert a row to test:

insert into ontime_replica (FlightDate,Year)values('2018-10-12',2018);

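The row can be queried from any of the machines. To be honest, I still have not fully worked out how clustering and replication relate: the Distributed table also seemed to replicate data, which left me a bit confused. If any experts are reading, I would welcome a discussion.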
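One way to separate the two ideas, offered as a hedged sketch and not a definitive answer: the Distributed table decides which shard a row goes to, while ReplicatedMergeTree copies data between replicas of one table through ZooKeeper. Because ontime_replica was created with the same ZooKeeper path on all three machines, the three tables are replicas of each other regardless of any Distributed table. Where the two meet is the internal_replication flag: per the ClickHouse documentation, with true the Distributed table writes to one healthy replica per shard and leaves the copying to the replicated table, and with false it writes to every replica itself. The function and host names below are hypothetical.

```python
def write_targets(replicas, internal_replication, is_healthy=lambda host: True):
    """Hosts a Distributed table would write a block to within a single shard.

    replicas: replica hosts of the shard, in configuration order.
    internal_replication: value of the shard's internal_replication flag.
    is_healthy: callable reporting whether a host is reachable (simplified).
    """
    if internal_replication:
        # Only the first healthy replica receives the data; the Replicated*
        # table is expected to propagate it to the other replicas.
        healthy = [host for host in replicas if is_healthy(host)]
        return healthy[:1]
    # The Distributed table itself writes the data to every replica.
    return list(replicas)


shard = ["replica_a", "replica_b"]  # hypothetical shard with two replicas
print(write_targets(shard, internal_replication=True))
print(write_targets(shard, internal_replication=False))
```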

References:

https://clickhouse.yandex/reference_en.html#Distributed

https://clickhouse.yandex/tutorial.html

Reposted from: https://www.cnblogs.com/gomysql/p/6708650.html
