MongoDB Reader

The MongoDBReader plugin reads data from MongoDB using MongoDB's Java client, MongoClient.

Configuration Example

This example reads a collection from MongoDB and prints it to the terminal:

```json
{
  "job": {
    "setting": {
      "speed": {
        "channel": 1,
        "bytes": -1
      }
    },
    "content": {
      "reader": {
        "name": "mongodbreader",
        "parameter": {
          "username": "",
          "password": "",
          "column": [
            "unique_id",
            "sid",
            "user_id",
            "auction_id",
            "content_type",
            "pool_type",
            "frontcat_id",
            "catagoryid",
            "gmt_create",
            "taglist",
            "property",
            "scorea",
            "scoreb",
            "scorec"
          ],
          "connection": {
            "address": [
              "127.0.0.1:27017"
            ],
            "database": "tag_per_data",
            "collection": "tag_data",
            "authDb": "admin"
          }
        }
      },
      "writer": {
        "name": "streamwriter",
        "parameter": {
          "print": "true"
        }
      }
    }
  }
}
```

Parameters

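| Configuration | Required | Type   | Default Value | Description                                                              |
|---------------|----------|--------|---------------|--------------------------------------------------------------------------|
| address       | Yes      | list   | None          | MongoDB address information; multiple addresses can be listed            |
| username      | No       | string | None          | MongoDB username                                                         |
| password      | No       | string | None          | MongoDB password                                                         |
| database      | Yes      | string | None          | MongoDB database                                                         |
| collection    | Yes      | string | None          | MongoDB collection name                                                  |
| column        | Yes      | list   | None          | MongoDB document column names; ["*"] (all columns) is not supported      |
| query         | No       | string | None          | Custom query conditions                                                  |
| fetchSize     | No       | int    | 2048          | Batch size for retrieving records                                        |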
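
The required settings in the table can be checked before a job is submitted. The following is a minimal sketch, not part of Addax itself: the function names are hypothetical, and it assumes the layout used in the configuration example, where address, database, and collection sit under "connection" and column sits directly under "parameter".

```python
# Hypothetical pre-flight check for a mongodbreader "parameter" block.
# Assumes the layout of the configuration example above.
REQUIRED_CONNECTION_KEYS = ("address", "database", "collection")
DEFAULT_FETCH_SIZE = 2048  # default value listed in the parameter table


def check_reader_parameter(parameter: dict) -> list:
    """Return a list of human-readable problems; empty means no problem found."""
    problems = []
    connection = parameter.get("connection", {})

    # Required settings must be present and non-empty.
    for key in REQUIRED_CONNECTION_KEYS:
        if not connection.get(key):
            problems.append(f"missing required setting: connection.{key}")

    # address is a list, collection is a single string.
    if not isinstance(connection.get("address", []), list):
        problems.append("connection.address must be a list")
    if not isinstance(connection.get("collection", ""), str):
        problems.append("connection.collection must be a string")

    # column is required and ["*"] is not supported.
    column = parameter.get("column")
    if not column:
        problems.append("missing required setting: column")
    elif column == ["*"]:
        problems.append('column does not support ["*"]')
    return problems


def effective_fetch_size(parameter: dict) -> int:
    """Return fetchSize, falling back to the documented default."""
    return parameter.get("fetchSize", DEFAULT_FETCH_SIZE)
```

Returning a list of problems rather than raising on the first one lets a caller report every misconfigured setting at once.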

collection

Only a single collection is currently supported, so the type is string rather than the array type common in other plugins. Keep this in mind when adapting a configuration from another plugin.

column

column specifies the names of the fields to read. Two assumptions are made about how a field name is composed:

  • Cannot start with single quote (')
  • Cannot consist entirely of numbers and dots (.)

These assumptions keep the column configuration simple while still allowing constants to be specified as supplementary fields. For example, when collecting a table you often need to add constants such as the collection time or the collection source, which can be configured like this:

```json
{
  "column": ["col1", "col2", "col3", "'source_mongodb'", "20211026", "123.12"]
}
```

The last three entries in the above configuration are constants, treated as string, integer, and floating-point types respectively.
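
The two naming assumptions are what make it possible to tell a constant from a field name. The following is a minimal sketch of that rule as described here, not Addax's actual implementation; the function name `classify_column` and the returned labels are hypothetical.

```python
import re


def classify_column(spec: str):
    """Classify one "column" entry as a field name or a constant.

    Returns a (kind, value) pair, where kind is one of
    "field", "string", "long", or "double".
    """
    if spec.startswith("'"):
        # Starts with a single quote: string constant, quotes removed.
        return ("string", spec.strip("'"))
    if re.fullmatch(r"\d+", spec):
        # Digits only: integer constant.
        return ("long", int(spec))
    if re.fullmatch(r"\d+\.\d*|\d*\.\d+", spec):
        # Digits with one dot: floating-point constant.
        return ("double", float(spec))
    if re.fullmatch(r"[\d.]+", spec):
        # Only digits and dots, but not a valid number (e.g. "1.2.3").
        # Rejecting it is an assumption of this sketch.
        raise ValueError(f"not a valid numeric constant: {spec!r}")
    # Anything else is treated as a field name.
    return ("field", spec)
```

Applied to the configuration above, the first three entries come back as fields and the last three as a string, an integer, and a floating-point constant.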

If the field is nested, you can use a dot (.) to indicate the hierarchical relationship, for example:

```json
{
  "column": ["col1", "col2.subcol1", "col2.subcol2", "col3"]
}
```
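
The dot notation can be read as a path walked one level at a time. The following is a minimal sketch on a plain Python dict standing in for a MongoDB document; the helper name `get_nested` is hypothetical, and returning None for a missing field is an assumption of this sketch rather than documented plugin behavior.

```python
def get_nested(document: dict, path: str):
    """Resolve a dotted path such as "col2.subcol1" inside a nested dict."""
    current = document
    for key in path.split("."):
        # Stop as soon as the path leaves the nested-document structure.
        if not isinstance(current, dict) or key not in current:
            return None  # assumption: a missing field yields None
        current = current[key]
    return current


doc = {"col1": 1, "col2": {"subcol1": "a", "subcol2": "b"}, "col3": 3}
values = [get_nested(doc, c) for c in ["col1", "col2.subcol1", "col2.subcol2", "col3"]]
```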

query

query is a BSON string that conforms to MongoDB query format, for example:

```json
{
  "query": "{amount: {$gt: 140900}, oc_date: {$gt: 20190110}}"
}
```

The above query is similar to the SQL condition WHERE amount > 140900 AND oc_date > 20190110: every condition in the query must hold for a document to be returned.
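
To make the AND semantics concrete, the following toy matcher evaluates the same conditions against in-memory dicts. It is an illustration only: it is not how MongoDB or the plugin evaluates queries, it handles only `$gt` and plain equality, and the function name `matches` is hypothetical.

```python
def matches(document: dict, query: dict) -> bool:
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        if field not in document:
            return False
        value = document[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {"$gt": 140900}.
            for operator, operand in condition.items():
                if operator == "$gt":
                    if not value > operand:
                        return False
                else:
                    raise NotImplementedError(f"unsupported operator: {operator}")
        elif value != condition:
            # Plain value: equality test.
            return False
    return True


# The query from the example, written as a Python dict.
query = {"amount": {"$gt": 140900}, "oc_date": {"$gt": 20190110}}
```

A document is selected only when both `amount` and `oc_date` exceed their thresholds, just as both sides of the SQL AND must be true.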

Type Conversion

| Addax Internal Type | MongoDB Data Type |
|---------------------|-------------------|
| Long                | int, Long         |
| Double              | double            |
| String              | string, array     |
| Date                | date              |
| Boolean             | boolean           |
| Bytes               | bytes             |
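
The mapping in this table can be expressed as a small lookup. The following sketch uses Python values as stand-ins for MongoDB values (int for int/Long, list for array, datetime for date), which is an assumption made for illustration; the function name `addax_type` is hypothetical and the code does not reproduce the plugin's actual conversion logic.

```python
from datetime import datetime


def addax_type(value) -> str:
    """Return the Addax internal type name for a Python stand-in value."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        return "Long"
    if isinstance(value, float):
        return "Double"
    if isinstance(value, (str, list)):
        # Both strings and arrays map to String.
        return "String"
    if isinstance(value, datetime):
        return "Date"
    if isinstance(value, (bytes, bytearray)):
        return "Bytes"
    raise TypeError(f"no mapping for {type(value).__name__}")
```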