MongoDB Reader
The MongoDBReader plugin uses MongoDB's Java client, MongoClient, to read data from MongoDB.
Configuration Example
This example reads a collection from MongoDB and prints it to the terminal:
{
  "job": {
    "setting": {
      "speed": {
        "channel": 1,
        "bytes": -1
      }
    },
    "content": {
      "reader": {
        "name": "mongodbreader",
        "parameter": {
          "username": "",
          "password": "",
          "column": [
            "unique_id",
            "sid",
            "user_id",
            "auction_id",
            "content_type",
            "pool_type",
            "frontcat_id",
            "catagoryid",
            "gmt_create",
            "taglist",
            "property",
            "scorea",
            "scoreb",
            "scorec"
          ],
          "connection": {
            "address": [
              "127.0.0.1:27017"
            ],
            "database": "tag_per_data",
            "collection": "tag_data",
            "authDb": "admin"
          }
        }
      },
      "writer": {
        "name": "streamwriter",
        "parameter": {
          "print": "true"
        }
      }
    }
  }
}
Parameters
| Configuration | Required | Type | Default Value | Description |
|---|---|---|---|---|
| address | Yes | list | None | MongoDB server addresses in host:port form; more than one may be listed |
| username | No | string | None | MongoDB username |
| password | No | string | None | MongoDB password |
| database | Yes | string | None | MongoDB database |
| collection | Yes | string | None | MongoDB collection name |
| column | Yes | list | None | Names of the document fields to read; ["*"] is not supported as a way to select all fields |
| query | No | string | None | Custom query conditions |
| fetchSize | No | int | 2048 | Batch size for retrieving records |
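
The required settings in the table above can be checked before a job is submitted. The following is a minimal sketch in Python, not part of Addax itself; the function name `validate_reader_parameter` is hypothetical, and the sketch assumes the layout used in the configuration example, where address, database and collection sit under "connection".

```python
# Hypothetical pre-flight check for a mongodbreader "parameter" block.
# Assumes the layout from the configuration example: address, database and
# collection live under "connection", column sits directly under "parameter".
REQUIRED_CONNECTION_KEYS = ("address", "database", "collection")


def validate_reader_parameter(parameter):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    connection = parameter.get("connection", {})

    for key in REQUIRED_CONNECTION_KEYS:
        if not connection.get(key):
            problems.append(f"missing required setting: connection.{key}")

    # address is a list; collection is a single string.
    if "address" in connection and not isinstance(connection["address"], list):
        problems.append("address must be a list")
    if not isinstance(connection.get("collection", ""), str):
        problems.append("collection must be a single string, not a list")

    column = parameter.get("column")
    if not column:
        problems.append("missing required setting: column")
    elif column == ["*"]:
        problems.append('column does not support ["*"]; list the fields explicitly')

    return problems
```

Calling it with the "parameter" object from a job file returns an empty list when every required setting is present.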
collection
Only a single collection is currently supported, so this parameter is a string rather than the array type common in other plugins.
column
column is used to specify the field names to be read. Here we make two assumptions about field name composition:
- Cannot start with a single quote (')
- Cannot consist entirely of digits and dots (.)
Based on the above assumptions, we can simplify the column configuration while also specifying some constants as supplementary fields. For example, when collecting a table, we generally need to add collection time, collection source and other constants, which can be configured like this:
{
"column": ["col1", "col2", "col3", "'source_mongodb'", "20211026", "123.12"]
}
The last three fields in the above configuration are constants, treated as string type, integer type, and floating point type respectively.
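
The two naming assumptions are what make it possible to tell constants apart from field names. Here is a minimal Python sketch of that rule, offered as an illustration only; it is not Addax's actual parser, and the helper name `classify_column` is hypothetical.

```python
import re

# Entries made up only of digits and dots are taken to be numeric constants.
DIGITS_AND_DOTS = re.compile(r"[0-9.]+")


def classify_column(entry):
    """Classify one "column" entry as ("constant", value) or ("field", name)."""
    if entry.startswith("'"):
        # Quoted entry: a string constant. Drop the surrounding quotes.
        if len(entry) >= 2 and entry.endswith("'"):
            return ("constant", entry[1:-1])
        return ("constant", entry[1:])
    if DIGITS_AND_DOTS.fullmatch(entry):
        # No dot means an integer constant, otherwise a floating point one.
        # Malformed input such as "1.2.3" raises ValueError in this sketch;
        # how the real plugin treats it is not stated in this document.
        if "." in entry:
            return ("constant", float(entry))
        return ("constant", int(entry))
    # Anything else is a field name.
    return ("field", entry)


columns = ["col1", "col2", "col3", "'source_mongodb'", "20211026", "123.12"]
classified = [classify_column(c) for c in columns]
```

With the configuration above, the first three entries come back as fields and the last three as a string, an integer and a floating point constant.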
If the field is nested, you can use a dot (.) to indicate the hierarchical relationship, for example:
{
"column": ["col1", "col2.subcol1", "col2.subcol2", "col3"]
}
query
query is a BSON string that conforms to MongoDB query format, for example:
{
"query": "{amount: {$gt: 140900}, oc_date: {$gt: 20190110}}"
}
The above query is similar to WHERE amount > 140900 AND oc_date > 20190110 in SQL.
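
To make the AND semantics concrete, the sketch below applies the same two `$gt` conditions to a few sample documents in plain Python. It only illustrates what the filter means; it does not call MongoDB, the helper `matches_gt_filter` is hypothetical, and the sample documents are invented for the example.

```python
def matches_gt_filter(document, query):
    """Return True if the document satisfies every $gt condition in query.

    Conditions on different fields are combined with AND, and a document
    that lacks a field does not match that field's condition.
    """
    for field, condition in query.items():
        value = document.get(field)
        if value is None or not value > condition["$gt"]:
            return False
    return True


# The query from the configuration above, written as a Python dict.
query = {"amount": {"$gt": 140900}, "oc_date": {"$gt": 20190110}}

# Invented sample documents.
documents = [
    {"amount": 150000, "oc_date": 20190215},  # passes both conditions
    {"amount": 150000, "oc_date": 20190101},  # oc_date too small
    {"amount": 100000, "oc_date": 20190215},  # amount too small
    {"oc_date": 20190215},                    # amount missing
]

selected = [d for d in documents if matches_gt_filter(d, query)]
```

Only the first sample document ends up in `selected`, because it is the only one that satisfies both conditions.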
Type Conversion
| Addax Internal Type | MongoDB Data Type |
|---|---|
| Long | int, long |
| Double | double |
| String | string, array |
| Date | date |
| Boolean | boolean |
| Bytes | bytes |
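
The table can be read as a simple lookup from a value's MongoDB type to an Addax internal type. The Python sketch below mirrors it using Python values as stand-ins for BSON values (an assumption made for illustration; Addax itself is written in Java, and the function name `addax_type` is hypothetical). It only names the target type and does not show how a value, such as an array, is actually rendered.

```python
import datetime


def addax_type(value):
    """Return the Addax internal type name for a Python stand-in value."""
    # bool must be tested before int, because bool is a subclass of int
    # in Python and would otherwise be reported as Long.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        return "Long"
    if isinstance(value, float):
        return "Double"
    if isinstance(value, (str, list)):
        # Both string and array map to String in the table.
        return "String"
    if isinstance(value, datetime.datetime):
        return "Date"
    if isinstance(value, (bytes, bytearray)):
        return "Bytes"
    raise TypeError(f"no mapping listed for {type(value).__name__}")
```

Types that the table does not list raise a `TypeError` in this sketch rather than guessing a mapping.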