16_clickhouse,HDFS引擎,JDBC引擎
21.HDFS引擎
21.1.HDFS引擎
ClickHouse通過該引擎管理HDFS上的數據,從而與Apache Hadoop生態系統集成。
指定表引擎:
ENGINE = HDFS(URI, format)
URI參數是HDFS文件URI,format參數指定文件格式。
URI路徑支持globs,這樣該表是只讀的:
1.* : 替換任意字符,除了空字符和’/’。
2.? : 替換任意單個字符。
3.{some_string,another_string,yet_another_one} : 替換字符串列表中的任意一個。
4.{N…M} : 替換N到M范圍的任意數字,包括左右的邊界值。
示例:假設有如下HDFS的URI的文件,格式為TSV
'hdfs://hdfs1:9000/some_dir/some_file_1' 'hdfs://hdfs1:9000/some_dir/some_file_2' 'hdfs://hdfs1:9000/some_dir/some_file_3' 'hdfs://hdfs1:9000/another_dir/some_file_1''hdfs://hdfs1:9000/another_dir/some_file_2' 'hdfs://hdfs1:9000/another_dir/some_file_3'有幾種方法可以生成包含所有6個文件的表:
CREATE TABLE table_with_range (name String, value UInt32) ENGINE = HDFS('hdfs://hdfs1:9000/{some,another}_dir/some_file_{1..3}', 'TSV') CREATE TABLE table_with_question_mark (name String, value UInt32) ENGINE = HDFS('hdfs://hdfs1:9000/{some,another}_dir/some_file_?', 'TSV') CREATE TABLE table_with_asterisk (name String, value UInt32) ENGINE = HDFS('hdfs://hdfs1:9000/{some,another}_dir/*', 'TSV')21.2.案例
1.創建HDFS引擎表(以下的ClickHouse最好是hdfs所在的機器上,這樣的話才可以按照下面的方式訪問通。)
drop table hdfs_engine_table; CREATE TABLE hdfs_engine_table (name String,value UInt32) ENGINE=HDFS('hdfs://tqHadoopCluster/toto_2/tmp/hdfs','TSV'); 此處配置了ha,設置了名稱。tqHadoopCluster在單擊下是:ip:8021效果圖:
middleware :) CREATE TABLE hdfs_engine_table (name String,value UInt32) ENGINE=HDFS('hdfs://tqHadoopCluster/toto_2/tmp/hdfs','TSV');CREATE TABLE hdfs_engine_table (`name` String,`value` UInt32 ) ENGINE = HDFS('hdfs://tqHadoopCluster/toto_2/tmp/hdfs', 'TSV')Ok.0 rows in set. Elapsed: 0.007 sec. middleware :)2.插入數據
INSERT INTO hdfs_engine_table VALUES (‘one’, 1), (‘two’, 2), (‘three’, 3);
3.查看數據
SELECT * FROM hdfs_engine_table LIMIT 2; ┌─name──┬─value─┐ │ one │ 1 │ │ two │ 2 │ │ three │ 3 │ └───────┴───────┘4.查看HDFS路徑
hdfs dfs -ls hdfs://tqHadoopCluster/tmp/hdfs再如:
CREATE TABLE hdfs_engine_table1 (name String, value UInt32) ENGINE=HDFS('hdfs://tqHadoopCluster/tmp/other_storage1/*', 'TSV');22.JDBC引擎
22.1.JDBC引擎
ClickHouse通過JDBC引擎連接到外部數據庫, 如MySQL、Oracle、PostgreSQL等。
ClickHouse使用一個單獨的項目clickhouse-jdbc-bridge,其作為一個守護進程運行。
Clickhouse-jdbc-bridge項目鏈接:https://github.com/ClickHouse/clickhouse-jdbc-bridge (git地址:https://github.com/ClickHouse/clickhouse-jdbc-bridge.git)
Jar包地址:https://github.com/ClickHouse/clickhouse-jdbc-bridge/releases/tag/release_1.0.1
指定表引擎:
1.dbms_uri : 外部的DBMS的URI。格式:
jdbc:<driver_name>://<host_name>:<port>/?user=<username>&password=<password>。例如:MySQL的格式:jdbc:mysql://localhost:3306/?user=root&password=root。
2.external_database : 外部DBMS的數據庫。
3.external_database : 外部DBMS的數據庫。
啟動服務:
java -jar ./clickhouse-jdbc-bridge-1.0.1.jar --driver-path /root/jdbc/lib/
其中/root/jdbc/lib為數據庫驅動包所在的目錄。
修改配置:在config.xml修改:
<jdbc_bridge><host>192.168.106.102</host><port>9019</port> </jdbc_bridge>
然后重啟ClickHouse
下面是關于clickhouse-jdbc-bridge的一張圖:
22.2.案例
下載或編譯源碼,生成jar包:clickhouse-jdbc-bridge-2.0.0-SNAPSHOT.jar
clickhouse-jdbc-bridge-2.0.0-SNAPSHOT.jar放到服務器上。
啟動服務:
[root@middleware ~]# pwd /root [root@middleware ~]# java -jar ./clickhouse-jdbc-bridge-1.0.1.jar /root/mysql-connector-java-5.1.34.jar 2020-12-09 16:33:11.011 [main] [INFO ] {Utils:837} - Loading JSON from file [config/vertx.json]... 2020-12-09 16:33:12.012 [main] [WARN ] {Utils:846} - Failed to load JSON from file config/vertx.json 2020-12-09 16:33:17.017 [vert.x-eventloop-thread-0] [INFO ] {Utils:837} - Loading JSON from file [config/server.json]... 2020-12-09 16:33:18.018 [vert.x-eventloop-thread-0] [WARN ] {Utils:846} - Failed to load JSON from file config/server.json 2020-12-09 16:33:18.018 [vert.x-eventloop-thread-0] [INFO ] {JdbcBridgeVerticle:531} - Registering consumer to monitor configuration file(s) at [config/datasources] 2020-12-09 16:33:18.018 [vert.x-eventloop-thread-0] [INFO ] {JdbcBridgeVerticle:531} - Registering consumer to monitor configuration file(s) at [config/schemas] 2020-12-09 16:33:18.018 [vert.x-eventloop-thread-0] [INFO ] {JdbcBridgeVerticle:531} - Registering consumer to monitor configuration file(s) at [config/queries] 2020-12-09 16:33:18.018 [vert.x-eventloop-thread-0] [INFO ] {DefaultDataSourceManager:196} - Registering new type of datasource: [jdbc] -> [ru.yandex.clickhouse.jdbcbridge.impl.JdbcDataSource] 2020-12-09 16:33:18.018 [vert.x-eventloop-thread-0] [INFO ] {DefaultDataSourceManager:200} - Default datasource type is set to [jdbc] 2020-12-09 16:33:18.018 [vert.x-eventloop-thread-0] [INFO ] {DefaultDataSourceManager:196} - Registering new type of datasource: [config] -> [ru.yandex.clickhouse.jdbcbridge.impl.ConfigDataSource] 2020-12-09 16:33:18.018 [vert.x-eventloop-thread-0] [INFO ] {DefaultDataSourceManager:196} - Registering new type of datasource: [script] -> [ru.yandex.clickhouse.jdbcbridge.impl.ScriptDataSource] 2020-12-09 16:33:18.018 [vert.x-eventloop-thread-0] [INFO ] {Utils:837} - Loading JSON from file [config/httpd.json]... 2020-12-09 16:33:18.018 [vert.x-eventloop-thread-0] [WARN ] {Utils:846} - Failed to load JSON from file config/httpd.json 2020-12-09 16:33:18.018 [vert.x-eventloop-thread-0] [INFO ] {JdbcBridgeVerticle:220} - Starting web server... 2020-12-09 16:33:19.019 [vert.x-worker-thread-1] [INFO ] {DefaultDataSourceManager:262} - No datasource configuration found 2020-12-09 16:33:19.019 [vert.x-worker-thread-1] [INFO ] {DefaultQueryManager:77} - No schema configuration found 2020-12-09 16:33:19.019 [vert.x-worker-thread-1] [INFO ] {DefaultQueryManager:76} - No query configuration found 2020-12-09 16:33:19.019 [vert.x-eventloop-thread-0] [INFO ] {JdbcBridgeVerticle:224} - Server http://0.0.0.0:9019 started in 14397 ms2 準備MySQL表的數據
use test; drop table if exists `test`; CREATE TABLE `test` (`int_id` INT NOT NULL AUTO_INCREMENT,`int_nullable` INT NULL DEFAULT NULL,`float` FLOAT NOT NULL,`float_nullable` FLOAT NULL DEFAULT NULL, PRIMARY KEY (`int_id`));插入數據:
insert into test (`int_id`, `float`) VALUES (1,2);select * from test; +--------+--------------+-------+----------------+ | int_id | int_nullable | float | float_nullable | +--------+--------------+-------+----------------+ | 1 | NULL | 2 | NULL | +--------+--------------+-------+----------------+3 在ClickHouse創建JDBC引擎表
drop table if exists jdbc_table; CREATE TABLE jdbc_table (`int_id` Int32,`int_nullable` Nullable(Int32),`float` Float32,`float_nullable` Nullable(Float32) ) ENGINE JDBC('jdbc:mysql://192.168.110.173:3306/?user=root&password=xxxx', 'test', 'test');效果圖:
middleware :) CREATE TABLE jdbc_table :-] ( :-] `int_id` Int32, :-] `int_nullable` Nullable(Int32), :-] `float` Float32, :-] `float_nullable` Nullable(Float32) :-] ) :-] ENGINE JDBC('jdbc:mysql://xxx.xxx.xxx.xxx:3306/?user=root&password=xxxxxxx', 'test', 'test');CREATE TABLE jdbc_table (`int_id` Int32,`int_nullable` Nullable(Int32),`float` Float32,`float_nullable` Nullable(Float32) ) ENGINE = JDBC('jdbc:mysql://xxx.xxx.xxx.xxx:3306/?user=root&password=xxxxxxx', 'test', 'test')Ok.0 rows in set. Elapsed: 0.011 sec. middleware :)查看數據:
:) select int_id, float from jdbc_table; ? SELECT int_id, float FROM jdbc_table ? ┌─int_id─┬─float─┐ │ 1 │ 2 │ └────────┴───────┘總結
以上是生活随笔為你收集整理的16_clickhouse,HDFS引擎,JDBC引擎的全部內容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: 15_clickhouse,MySQL引
- 下一篇: 泰米时空神武(泰米时空)