Hive Reader
The Hive Reader plugin reads data from an Apache Hive database.
It was added mainly to solve the Kerberos authentication problem that arises when the RDBMS Reader plugin is used to read Hive. If your Hive database does not have Kerberos authentication enabled, you can use RDBMS Reader directly. If Kerberos authentication is enabled, use this plugin.
Example
We create the following table in Hive's default database and insert two records:
```sql
create table default.hive_reader
(
    col1 int,
    col2 string,
    col3 timestamp
)
stored as orc;
insert into hive_reader values(1, 'hello', current_timestamp()), (2, 'world', current_timestamp());
```

The following configuration reads this table to the terminal:
```json
{
  "job": {
    "setting": {
      "speed": {
        "byte": -1,
        "channel": 1
      },
      "errorLimit": {
        "record": 0,
        "percentage": 0
      }
    },
    "content": {
      "reader": {
        "name": "hivereader",
        "parameter": {
          "column": [
            "*"
          ],
          "username": "hive",
          "password": "",
          "connection": {
            "jdbcUrl": "jdbc:hive2://localhost:10000/default;principal=hive/[email protected]",
            "table": [
              "hive_reader"
            ]
          },
          "where": "logdate='20211013'",
          "haveKerberos": true,
          "kerberosKeytabFilePath": "/etc/security/keytabs/hive.headless.keytab",
          "kerberosPrincipal": "[email protected]"
        }
      },
      "writer": {
        "name": "streamwriter",
        "parameter": {
          "print": true
        }
      }
    }
  }
}
```

Save the above configuration file as job/hive2stream.json
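
If the job file needs to be produced for several tables or environments, it can be generated instead of written by hand. The following is a minimal Python sketch, not part of the plugin itself: the helper names `build_hive_job` and `write_job` are hypothetical, and the sketch assumes the same parameter layout as the configuration above. The Kerberos keys (`haveKerberos`, `kerberosKeytabFilePath`, `kerberosPrincipal`) are only emitted when both a keytab path and a principal are supplied.

```python
import json
from pathlib import Path


def build_hive_job(jdbc_url, table, username, keytab_path=None,
                   principal=None, where=None):
    """Return a job dict that reads one Hive table and prints it (hypothetical helper)."""
    parameter = {
        "column": ["*"],
        "username": username,
        "password": "",
        "connection": {"jdbcUrl": jdbc_url, "table": [table]},
    }
    if where:
        parameter["where"] = where
    # Assumption: Kerberos settings are added only when both values are given.
    if keytab_path and principal:
        parameter.update({
            "haveKerberos": True,
            "kerberosKeytabFilePath": keytab_path,
            "kerberosPrincipal": principal,
        })
    return {
        "job": {
            "setting": {
                "speed": {"byte": -1, "channel": 1},
                "errorLimit": {"record": 0, "percentage": 0},
            },
            "content": {
                "reader": {"name": "hivereader", "parameter": parameter},
                "writer": {"name": "streamwriter", "parameter": {"print": True}},
            },
        }
    }


def write_job(job, path):
    """Write the job as indented JSON, creating the parent directory if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(job, indent=2), encoding="utf-8")
    return path
```

A call such as `write_job(build_hive_job("jdbc:hive2://localhost:10000/default;principal=hive/_HOST@EXAMPLE.COM", "hive_reader", "hive", keytab_path="/etc/security/keytabs/hive.headless.keytab", principal="hive@EXAMPLE.COM"), "job/hive2stream.json")` would write a file in the same shape as the one above. The principal values in this call are placeholders and must be replaced with those of your own Kerberos realm.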
Execute Collection Command
Execute the following command for data collection:

```bash
bin/addax.sh job/hive2stream.json
```

Parameters
| Configuration | Required | Type | Default Value | Description |
|---|---|---|---|---|
| jdbcUrl | Yes | list | None | JDBC connection information of the target database |
| driver | No | string | None | Custom driver class name, used to solve compatibility issues; see description below |
| username | Yes | string | None | Username for the data source |
| password | No | string | None | Password for the specified username; can be omitted if there is no password |