An Overview of MongoDB Schema Validation

Akash Kathiriya

MongoDB is widely known as a schemaless database, so why would you need schema validation? The schemaless behavior makes it quick and easy to build an application as a proof of concept. But once the application moves to production and becomes stable and mature, the schema rarely needs to change, and frequent changes are not advisable. At that point, it is important to enforce some schema validation in your database so that unwanted data, which can break your application, is not inserted. This becomes even more important when data is inserted into the same database from multiple sources.

Schema validation allows you to define a specific structure for the documents in each collection. If anyone tries to insert documents that don't match the defined schema, MongoDB can either reject the operation or log a warning, depending on the validation action.
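As a rough sketch of how the validation action is chosen: the collMod command accepts a validationAction option ("error" or "warn") and a validationLevel option ("off", "strict" or "moderate"). The helper name buildValidationModeCommand and the "users" collection name below are assumptions made for illustration. The snippet is plain JavaScript that only builds the command document, which would then be passed to db.runCommand in the mongo shell.

```javascript
// Hypothetical helper: builds a collMod command document that changes how
// an existing collection reacts to documents that fail validation.
function buildValidationModeCommand(collectionName, action, level) {
  const allowedActions = ["error", "warn"];            // reject the write vs. log a warning
  const allowedLevels = ["off", "strict", "moderate"]; // which writes are validated
  if (!allowedActions.includes(action)) {
    throw new Error("validationAction must be one of: " + allowedActions.join(", "));
  }
  if (!allowedLevels.includes(level)) {
    throw new Error("validationLevel must be one of: " + allowedLevels.join(", "));
  }
  return {
    collMod: collectionName,
    validationAction: action,
    validationLevel: level
  };
}

// "warn" lets invalid writes through but records the violation in the MongoDB log.
const warnOnly = buildValidationModeCommand("users", "warn", "strict");
console.log(JSON.stringify(warnOnly));

// In the mongo shell, the command would be applied with:
//   db.runCommand(warnOnly)
```

Starting with "warn" on an existing collection can be a gentle way to discover how much data would be rejected before switching to "error".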

MongoDB provides two ways to validate your schema: document validation and JSON Schema validation. JSON Schema validation is an extended version of document validation, so let's start with document validation.

Document Validation

Most developers who have worked with relational databases know the importance of a predictable data model or schema. For this reason, MongoDB introduced document validation in version 3.2. Let's see how to add validation rules to MongoDB collections.

Suppose you have a users collection that holds documents like the following.

{
    "name": "Alex",
    "email": "[email protected]",
    "mobile": "123-456-7890"
} 

The following are the validations we want to check when new documents are added to the users collection:

  • name and email fields are mandatory
  • mobile numbers should follow a specific structure: xxx-xxx-xxxx

To add this validation, we can use the "validator" construct while creating a new collection. Run the following query in the mongo shell:

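db.createCollection("users", {
    validator: {
        $and: [
            {
                // name is mandatory and must be a string
                "name": {$type: "string", $exists: true}
            },
            {
                // mobile must be a string of the form xxx-xxx-xxxx
                "mobile": {$type: "string", $regex: /^[0-9]{3}-[0-9]{3}-[0-9]{4}$/}
            },
            {
                // email is mandatory and must be a string
                "email": {$type: "string", $exists: true}
            }
        ]
    }
})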

You should see the following output:

{ "ok" : 1 }
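To get a feel for what the mobile rule accepts, the same pattern can be tried out in plain JavaScript before it goes into a validator. This is only a sketch: the sample values are made up for illustration, and although MongoDB evaluates $regex with its own regular expression engine, a simple pattern like this one should behave the same way.

```javascript
// Same pattern as in the validator: three digits, three digits, four digits,
// separated by hyphens, with nothing before or after.
const mobilePattern = /^[0-9]{3}-[0-9]{3}-[0-9]{4}$/;

// Illustrative sample values (assumptions, not taken from real data).
const samples = [
  "123-456-7890",   // well-formed
  "1234567890",     // hyphens missing
  "123-456-789",    // last group too short
  "123-456-78901",  // last group too long
  "abc-def-ghij"    // letters instead of digits
];

const results = samples.map(function (value) {
  return { value: value, matches: mobilePattern.test(value) };
});

results.forEach(function (r) {
  console.log(r.value + " -> " + r.matches);
});
```

Only the first sample matches; the anchors ^ and $ are what make the longer and shorter variants fail.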

Now, if you try to add a new document that does not follow the validation rules, mongo will throw a validation error. Try running the following insert queries.

Query:1

db.users.insert({
    "name": "akash"
})

Output:

WriteResult({
    "nInserted" : 0,
    "writeError" : {
        "code" : 121,
        "errmsg" : "Document failed validation"
    }
})

Query:2

db.users.insert({
    "name": "akash",
    "email": "[email protected]",
    "mobile": "123-456-7890"
})

Output:

WriteResult({ "nInserted" : 1 })

However, the document validation approach has some restrictions. For example, one can add any number of new key-value pairs to a document and still insert it into the collection; document validation can't prevent this. Consider the following example:

db.users.insert({
    "name": "akash",
    "email": "[email protected]",
    "mobile": "123-456-7890",
    "gender": "Male"
})

Output:

WriteResult({ "nInserted" : 1 })

Apart from this, document validation only checks the fields named in the validator. A misspelled key such as "nmae" (a typo for "name") is treated as just another new field; unless the typo removes a field that the validator requires, the document will still be inserted in the DB. These things should be avoided when you are working with a production database. To address this, MongoDB introduced the "$jsonSchema" operator for the "validator" construct in version 3.6. Let's see how to add the same validation rules as above and prevent new or misspelled fields from being added.


jsonSchema Validation

Run the following command in the mongo shell to add the validation rules using the "$jsonSchema" operator.

db.runCommand(
  {
    "collMod": "users",
    "validator": {
      "$jsonSchema": {
        "bsonType": "object",
        "additionalProperties": false,
        "required": [
          "name",
          "email"
        ],
        "properties": {
          "_id": {},
          "name": {
            "bsonType": "string"
          },
          "email": {
            "bsonType": "string"
          },
          "mobile": {
            "bsonType": "string",
            "pattern": "^[0-9]{3}-[0-9]{3}-[0-9]{4}$"
          }
        }
      }
    }
  })

Now let's see what happens when we try to insert the following document.

db.users.insert({
    "name": "akash",
    "email": "[email protected]",
    "mobile": "123-456-7890",
    "gender": "Male"
})

It will throw an error because the gender field is not defined in the schema.

WriteResult({
    "nInserted" : 0,
    "writeError" : {
        "code" : 121,
        "errmsg" : "Document failed validation"
    }
})

In the same way, if you have typos in any field names, mongo will throw the same error.
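To illustrate why both the extra gender field and a misspelled key are rejected, the following plain JavaScript sketch mimics the check implied by "additionalProperties": false. It is an illustration only and not how MongoDB implements validation internally; the function name findUnexpectedFields and the sample documents are assumptions.

```javascript
// Field names taken from the "properties" section of the schema above.
const allowedFields = ["_id", "name", "email", "mobile"];

// Hypothetical helper: returns the keys of a document that are not
// listed among the allowed fields.
function findUnexpectedFields(doc, allowed) {
  return Object.keys(doc).filter(function (key) {
    return !allowed.includes(key);
  });
}

// An extra field that the schema does not know about.
const withExtraField = { name: "akash", email: "a@example.com", gender: "Male" };

// A misspelled key: "nmae" instead of "name".
const withTypo = { nmae: "akash", email: "a@example.com" };

console.log(findUnexpectedFields(withExtraField, allowedFields));
console.log(findUnexpectedFields(withTypo, allowedFields));
```

In both cases the list of unexpected fields is non-empty, which corresponds to the situation in which the real validator rejects the insert.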

The schema defined above enforces the same rules that we used in document validation, with one difference: mobile is not listed under "required", so its pattern is checked only when the field is present. Additionally, we added the "additionalProperties" field to guard against typos in field names and the addition of new fields in documents. It allows only the fields that are defined under the "properties" field. Here is an overview of some properties which we can use under the "$jsonSchema" operator.

  • bsonType: the BSON type of the field, for example array | object | string | bool | number | null
  • required: an array of all mandatory fields
  • enum: an array of the only allowed values for a field
  • minimum: minimum value of a numeric field
  • maximum: maximum value of a numeric field
  • minLength: minimum length of a string field
  • maxLength: maximum length of a string field
  • properties: an object that maps each field name to the JSON schema used to validate it
  • additionalProperties: when set to false, prevents fields other than those listed under properties from being added
  • title: a title for the schema or field
  • description: a short description of the schema or field
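As a sketch of how several of these keywords fit together, the schema below extends the users example. The gender and age fields, their allowed values and limits, and the variable names are assumptions made for illustration; they are not part of the schema defined earlier.

```javascript
// Hypothetical extension of the users schema. "gender" and "age" are
// illustrative fields that do not appear in the original validator.
const extendedUserSchema = {
  bsonType: "object",
  title: "users",
  additionalProperties: false,
  required: ["name", "email"],
  properties: {
    _id: {},
    name: {
      bsonType: "string",
      minLength: 1,          // reject empty names
      maxLength: 50,         // assumed upper bound
      description: "full name of the user"
    },
    email: { bsonType: "string" },
    mobile: {
      bsonType: "string",
      pattern: "^[0-9]{3}-[0-9]{3}-[0-9]{4}$"
    },
    gender: {
      enum: ["Male", "Female", "Other"]   // only these values are accepted
    },
    age: {
      bsonType: "number",
      minimum: 0,
      maximum: 150
    }
  }
};

// Command document that would apply the schema to the existing collection.
const applySchemaCommand = {
  collMod: "users",
  validator: { $jsonSchema: extendedUserSchema }
};

console.log(JSON.stringify(applySchemaCommand, null, 2));

// In the mongo shell, the command would be applied with:
//   db.runCommand(applySchemaCommand)
```

Because gender and age are not listed under required, a document may omit them; their rules are checked only when the fields are present.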

Apart from schema validation, the "$jsonSchema" operator can also be used in find queries and in the $match stage of the aggregation pipeline.
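A minimal sketch of that usage follows. The simplified schema, the variable names and the validUsers output field are assumptions for illustration. The snippet only builds the query and pipeline documents in plain JavaScript; the shell calls that would use them are shown as comments.

```javascript
// Simplified schema used as a query condition (illustrative).
const userSchema = {
  bsonType: "object",
  required: ["name", "email"],
  properties: {
    name: { bsonType: "string" },
    email: { bsonType: "string" }
  }
};

// Matches documents that satisfy the schema.
const conformingQuery = { $jsonSchema: userSchema };

// Matches documents that do NOT satisfy the schema, which is handy for
// auditing existing data before validation is enforced.
const nonConformingQuery = { $nor: [{ $jsonSchema: userSchema }] };

// Aggregation pipeline that counts the conforming documents.
const countValidPipeline = [
  { $match: { $jsonSchema: userSchema } },
  { $count: "validUsers" }
];

console.log(JSON.stringify(nonConformingQuery));

// In the mongo shell, these would be used as:
//   db.users.find(conformingQuery)
//   db.users.find(nonConformingQuery)
//   db.users.aggregate(countValidPipeline)
```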

Conclusion

Document/schema validation is not required or desirable in every situation, but it is generally good practice to add it to your database, as it increases the productivity of the developers who work with your database. They will know what kind of response to expect from the database, since there won't be any random data.

In this article, we learned about the importance of schema validation in MongoDB and how to add validations at the document level using document validation and the "$jsonSchema" operator.
