An Overview of Client-Side Field Level Encryption in MongoDB

Onyancha Brian Henry

Data often requires strong security at nearly every stage of a transaction in order to meet security policies, compliance requirements, and government regulations. Unauthorized access to sensitive data means a failure to comply with those mandates, and it can wreck an organization's reputation.

In this blog we will discuss some of the security measures you can employ with MongoDB, focusing especially on the client side.

Scenarios Where Data May Be Accessed

There are several ways someone can gain access to your MongoDB data. Here are some of them:

  1. Capture of data over an insecure network. Someone can reach your data through an API from behind a VPN, which makes them difficult to track down. Data in transit is usually what is exposed in this case.
  2. A superuser, such as an administrator, having direct access. This happens when you fail to define user roles and restrictions.
  3. Access to on-disk data by reading database or backup files.
  4. Reading the server memory and logged data.
  5. Accidental disclosure of data by a staff member.

MongoDB Data Categories and How They are Secured

In general, any database system involves two types of data:

  1. Data at rest: data that is stored in the database files.
  2. Data in transit: data that is moving between a client, a server and the database.

MongoDB has an Encryption at Rest feature that encrypts database files on disk, preventing anyone who obtains those files from reading them.

Data in transit over a network can be secured in MongoDB through Transport Encryption using TLS/SSL.

For data that is accidentally disclosed by a staff member, for instance a receptionist leaving it visible on a desktop screen, MongoDB provides Role-Based Access Control, which allows administrators to grant and restrict collection-level permissions for users.

Data sent to the server may still remain in server memory, and none of these approaches addresses access to data in server memory. MongoDB therefore introduced Client-Side Field Level Encryption for encrypting specific fields of a document that hold confidential data.

Field Level Encryption

MongoDB works with documents that have defined fields. Some fields may be required to hold confidential information such as a credit card number, a social security number, patient diagnosis data and much more.

Field Level Encryption enables us to secure these fields so that they can only be read by authorized personnel who hold the decryption keys.

Encryption can be done in two ways:

  1. Using a secret key. A single key is used for both encrypting and decrypting, so it has to be present at both the source and the destination of a transmission and kept secret by all parties.
  2. Using a public key. A pair of keys is used, where one key encrypts and the other decrypts.
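To make the two approaches concrete, here is a minimal sketch that uses only the crypto module built into Node.js. The helper names and the choice of AES-256-GCM and RSA are assumptions made for illustration; they are not what MongoDB uses internally.

```javascript
const crypto = require('crypto');

// 1. Secret (symmetric) key: the same key encrypts and decrypts.
function encryptWithSecretKey(secretKey, plaintext) {
  const iv = crypto.randomBytes(12); // fresh nonce for every message
  const cipher = crypto.createCipheriv('aes-256-gcm', secretKey, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return { iv, ciphertext, tag: cipher.getAuthTag() };
}

function decryptWithSecretKey(secretKey, { iv, ciphertext, tag }) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', secretKey, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8');
}

const secretKey = crypto.randomBytes(32);
const sealed = encryptWithSecretKey(secretKey, 'confidential value');
const symmetricResult = decryptWithSecretKey(secretKey, sealed);

// 2. Public key: the public half encrypts, only the private half decrypts.
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });
const rsaCiphertext = crypto.publicEncrypt(publicKey, Buffer.from('confidential value'));
const asymmetricResult = crypto.privateDecrypt(privateKey, rsaCiphertext).toString('utf8');

console.log(symmetricResult, asymmetricResult);
```

With the secret key, both sides must share the same 32 bytes. With the key pair, the public key can be handed out freely because only the private key can decrypt.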

When applying Field Level Encryption consider using a new database setup rather than an existing one.

Client-Side Field Level Encryption (CSFLE)

Client-Side Field Level Encryption was introduced in MongoDB 4.2 Enterprise to give database administrators a way to encrypt fields whose values need to be secured. The sensitive data is encrypted and decrypted by the client and only travels to and from the server in encrypted form. Moreover, even superusers who do not have the encryption keys cannot read these encrypted fields.

How to Implement CSFLE

To implement Client-Side Field Level Encryption, you need the following:

  1. MongoDB Server 4.2 Enterprise
  2. A MongoDB driver compatible with CSFLE
  3. File system permissions
  4. A specific language driver (in this blog we are going to use Node.js)

The implementation procedure involves:

  • Setting up a local development environment with the software for running the client and server.
  • Generating and validating the encryption keys.
  • Configuring the client for automatic field-level encryption.
  • Running operations, such as queries, against the encrypted fields.

CSFLE Implementation

CSFLE uses the envelope encryption strategy, whereby data encryption keys are encrypted with another key known as the master key. The client application creates a master key that is stored in the Local Key Provider, which is essentially the local file system. However, this storage approach is insecure, so in production you are advised to configure the key in a Key Management System (KMS) that stores and decrypts data encryption keys remotely.

After the data encryption keys are generated, they are stored in the key vault collection in the same MongoDB replica set as the encrypted data.
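The envelope strategy can be sketched with the built-in crypto module alone. This is only an illustration: the seal and open helpers are hypothetical, a 32-byte master key and AES-256-GCM are assumed for simplicity, and none of it reflects the exact format MongoDB uses.

```javascript
const crypto = require('crypto');

// Hypothetical helper: AES-256-GCM encrypt, returning iv + tag + ciphertext in one buffer.
function seal(key, data) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const body = Buffer.concat([cipher.update(data), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]);
}

// Hypothetical helper: reverse of seal; throws if the key or the data is wrong.
function open(key, sealed) {
  const iv = sealed.subarray(0, 12);
  const tag = sealed.subarray(12, 28);
  const body = sealed.subarray(28);
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(body), decipher.final()]);
}

const masterKey = crypto.randomBytes(32);        // stays with the client or the KMS
const dataKey = crypto.randomBytes(32);          // encrypts the actual field values
const wrappedDataKey = seal(masterKey, dataKey); // the form a key vault would store
const encryptedField = seal(dataKey, Buffer.from('sample field value'));

// To read the field: unwrap the data key with the master key, then decrypt the field.
const unwrappedKey = open(masterKey, wrappedDataKey);
const field = open(unwrappedKey, encryptedField).toString('utf8');
console.log(field);
```

The point of the extra layer is that the database only ever holds the wrapped data key, which is useless without the master key.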

Create Master Key

In Node.js, we need to generate a 96-byte locally managed master key and write it to a file in the directory where the main script is executed from. The fs and crypto modules used for this are built into Node.js, so there is nothing to install with npm.

In the script:

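const crypto = require('crypto');
const fs = require('fs');

// Generate 96 random bytes and save them as the local master key.
// writeFileSync throws on failure, so any error still surfaces to the caller.
fs.writeFileSync('masterKey.txt', crypto.randomBytes(96));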
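Since the procedure also calls for validating the key, a small check when the key is read back can catch a truncated or missing file early. The sketch below is an assumption about how such a check might look: loadLocalMasterKey is a hypothetical helper, and the demo writes to a temporary directory so that no real key file is touched.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: fail fast if the key file is missing or has the wrong size.
function loadLocalMasterKey(filePath) {
  const key = fs.readFileSync(filePath);
  if (key.length !== 96) {
    throw new Error(`Expected a 96-byte master key, got ${key.length} bytes`);
  }
  return key;
}

// Demo in a temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'csfle-'));
const keyPath = path.join(dir, 'masterKey.txt');
// mode 0o600 asks for owner-only read/write where the platform supports it
fs.writeFileSync(keyPath, crypto.randomBytes(96), { mode: 0o600 });

const masterKey = loadLocalMasterKey(keyPath);
console.log(masterKey.length);
```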

Create Data Encryption Key

This key is stored in a key vault collection where CSFLE-enabled clients can access it for encryption and decryption. To generate one, you need the following:

  • The locally managed master key
  • A connection to your database, that is, the MongoDB connection string
  • The key vault namespace (database and collection)

Steps to Generate the Data Encryption Key

  1. Read the local master key generated earlier.

const localMasterKey = fs.readFileSync('./masterKey.txt');

  2. Specify the KMS provider settings that will be used by the client to discover the master key.

const kmsProvider = {
  local: {
    key: localMasterKey
  }
}
  3. Create the Data Encryption Key. We need to create a client with the MongoDB connection string and key vault namespace configuration. Let's say we have a database called users and inside it a keyVault collection. You need to install uuid-base64 first by running the command:

$ npm install uuid-base64

Then in your script:

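const { MongoClient } = require('mongodb');
// Assumption: with the driver generation this post targets, ClientEncryption ships in the
// separate mongodb-client-encryption package; newer drivers export it from 'mongodb' itself.
const { ClientEncryption } = require('mongodb-client-encryption');
const base64 = require('uuid-base64');

const keyVaultNamespace = 'users.keyVault';
const client = new MongoClient('mongodb://localhost:27017', {
  useNewUrlParser: true,
  useUnifiedTopology: true,
});

async function createKey() {
  try {
    await client.connect();
    const encryption = new ClientEncryption(client, {
      keyVaultNamespace,
      kmsProviders: kmsProvider, // the option is named kmsProviders
    });
    // Creates a data encryption key, encrypts it with the local master key
    // and stores it in the key vault collection
    const key = await encryption.createDataKey('local');
    const base64DataKeyId = key.toString('base64');
    const uuidDataKeyId = base64.decode(base64DataKeyId);
    console.log('DataKeyId [UUID]: ', uuidDataKeyId);
    console.log('DataKeyId [base64]: ', base64DataKeyId);
  } finally {
    await client.close();
  }
}

createKey();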

You will then see output that resembles:

DataKeyId [UUID]: ad4d735a-44789-48bc-bb93-3c81c3c90824

DataKeyId [base64]: 4K13FkSZSLy7kwABP4HQyD==
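The two printed forms are the same 16 bytes in different encodings. If you would rather not add a dependency for the conversion, it can be sketched with Buffer alone. The helper names and the sample id below are made up for the demo.

```javascript
// Hypothetical helpers for converting a 16-byte key id between base64 and UUID text.
function base64ToUuid(b64) {
  const hex = Buffer.from(b64, 'base64').toString('hex');
  if (hex.length !== 32) {
    throw new Error('A UUID must decode to exactly 16 bytes');
  }
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join('-');
}

function uuidToBase64(uuid) {
  return Buffer.from(uuid.replace(/-/g, ''), 'hex').toString('base64');
}

const sampleUuid = '00112233-4455-6677-8899-aabbccddeeff'; // made-up id for the demo
const roundTrip = base64ToUuid(uuidToBase64(sampleUuid));
console.log(roundTrip === sampleUuid);
```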

The client must have readWrite permissions on the specified key vault namespace.

  4. Verify that the Data Encryption Key was created:

const { MongoClient, Binary } = require('mongodb');

const client = new MongoClient('mongodb://localhost:27017', {
  useNewUrlParser: true,
  useUnifiedTopology: true,
});

async function checkClient() {
  try {
    await client.connect();
    const keyDB = client.db('users');
    const keyColl = keyDB.collection('keyVault');
    // The key vault stores _id as a BSON binary UUID (subtype 4), not as a base64 string
    const query = {
      _id: new Binary(Buffer.from('4K13FkSZSLy7kwABP4HQyD==', 'base64'), 4),
    };
    const dataKey = await keyColl.findOne(query);
    console.log(dataKey);
  } finally {
    await client.close();
  }
}

checkClient();

You should receive a result like this:

{
  _id: Binary {
    _bsontype: 'Binary',
    sub_type: 4,
    position: 2,
    buffer: 
  },
  keyMaterial: Binary {
    _bsontype: 'Binary',
    sub_type: 0,
    position: 20,
    buffer: 
  },
  creationDate: 2020-02-08T11:10:20.021Z,
  updateDate: 2020-02-08T11:10:25.021Z,
  status: 0,
  masterKey: { provider: 'local' }
}

The returned document contains the data encryption key id (UUID), the data encryption key in encrypted form, the KMS provider information for the master key, and metadata such as the creation date.

Specifying Fields to be Encrypted Using the JSON Schema

A JSON Schema extension is used by the MongoDB drivers to configure automatic client-side encryption and decryption of the specified fields of documents in a collection. The CSFLE configuration for this schema requires the encryption algorithm to use when encrypting each field, one or more data encryption keys encrypted with the CSFLE master key, and the BSON type of each field.

However, the CSFLE JSON Schema does not support document validation keywords; any validation keyword in the schema causes the client to throw an error.

Clients that are not configured with the appropriate client-side JSON Schema can be prevented from writing unencrypted data to a field by using a server-side JSON Schema.
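As a rough sketch of that server-side rule, the options below describe a collection validator that requires the field to be encrypted. The helper name and the accounts collection are hypothetical; the validator, $jsonSchema and encrypt keywords are the ones MongoDB uses for schema validation, but treat the overall shape as an assumption to check against the documentation for your server version.

```javascript
// Hypothetical helper: createCollection options that reject plaintext in one field.
function buildEncryptedFieldValidator(fieldName, bsonType, algorithm) {
  return {
    validator: {
      $jsonSchema: {
        bsonType: 'object',
        properties: {
          [fieldName]: { encrypt: { bsonType, algorithm } },
        },
      },
    },
  };
}

const collectionOptions = buildEncryptedFieldValidator(
  'bankAccountNumber',
  'int',
  'AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic'
);

// In a real script these options would be passed along the lines of:
//   await client.db('users').createCollection('accounts', collectionOptions)
console.log(JSON.stringify(collectionOptions, null, 2));
```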

There are two main encryption algorithms: random and deterministic.
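The practical difference is whether encrypting the same value twice gives the same ciphertext. The toy sketch below shows that property with the built-in crypto module. It is not MongoDB's actual AEAD construction (it has no authentication tag, for one), and the function names are hypothetical.

```javascript
const crypto = require('crypto');

const key = crypto.randomBytes(32);
const macKey = crypto.randomBytes(32);

// Random: a fresh IV per call, so equal plaintexts give different ciphertexts.
function encryptRandom(plaintext) {
  const iv = crypto.randomBytes(16);
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  return Buffer.concat([iv, cipher.update(plaintext, 'utf8'), cipher.final()]).toString('base64');
}

// Deterministic: the IV is derived from the plaintext, so equal plaintexts
// always give the same ciphertext and can be matched by an equality query.
function encryptDeterministic(plaintext) {
  const iv = crypto.createHmac('sha256', macKey).update(plaintext).digest().subarray(0, 16);
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  return Buffer.concat([iv, cipher.update(plaintext, 'utf8'), cipher.final()]).toString('base64');
}

const sameDeterministic = encryptDeterministic('12345678') === encryptDeterministic('12345678');
const sameRandom = encryptRandom('12345678') === encryptRandom('12345678');
console.log({ sameDeterministic, sameRandom });
```

Repeatable ciphertext is what makes a deterministically encrypted field queryable, and it is also why that mode leaks more when a field has only a few possible values.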

We will define an encryptMetadata key at the root level of the JSON Schema and list the fields to be encrypted in the properties field of the schema, so that those fields inherit this encryption key.

{
    "bsonType" : "object",
    "encryptMetadata" : {
        "keyId" : // array containing the keyId (UUID) generated above
    },
    "properties": {
        // field schemas here
    }
}

Let's say you want to encrypt a bank account number field. You would do something like:

"bankAccountNumber": {
    "encrypt": {
        "bsonType": "int",
        "algorithm": "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"
    }
}

Because the field has high cardinality and needs to be queryable, we use the deterministic approach. Sensitive fields such as blood type, which are rarely queried and have low cardinality, may be encrypted using the random approach.

Array fields should use random encryption with CSFLE, in which case automatic encryption covers the array as a whole, including all of its elements.
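Putting the root-level encryptMetadata and the field definitions together, a complete schema might be assembled as in the sketch below. The buildSchema helper, the bloodType and previousAddresses fields, and the use of Extended JSON for the key id are assumptions for illustration. Before handing such an object to a driver, the key id would typically need to become a real BSON binary, for example via EJSON.deserialize from the bson package.

```javascript
// Hypothetical helper: build a CSFLE schema for one collection from a base64 key id.
function buildSchema(base64KeyId) {
  const random = 'AEAD_AES_256_CBC_HMAC_SHA_512-Random';
  const deterministic = 'AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic';
  return {
    bsonType: 'object',
    encryptMetadata: {
      // keyId is an array of UUIDs, written here as Extended JSON (binary subtype 04)
      keyId: [{ $binary: { base64: base64KeyId, subType: '04' } }],
    },
    properties: {
      // queried by equality, so deterministic
      bankAccountNumber: { encrypt: { bsonType: 'int', algorithm: deterministic } },
      // low cardinality and rarely queried, so random
      bloodType: { encrypt: { bsonType: 'string', algorithm: random } },
      // arrays use random encryption
      previousAddresses: { encrypt: { bsonType: 'array', algorithm: random } },
    },
  };
}

const schema = buildSchema('4K13FkSZSLy7kwABP4HQyD==');
console.log(JSON.stringify(schema, null, 2));
```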

Mongocryptd Application

Installed with MongoDB Enterprise Server 4.2 and later, mongocryptd is a separate application that supports automatic Client-Side Field Level Encryption. Whenever a CSFLE-enabled client is created, this process is started automatically by default to:

  • Validate the encryption instructions outlined in the JSON Schema and detect which fields are to be encrypted in read and write operations.
  • Prevent unsupported operations from being executed on the encrypted fields.
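A CSFLE-enabled client is an ordinary MongoClient created with an autoEncryption option. The sketch below only builds that options object so it can run without a database. The helper name, the users.accounts namespace and the all-zero placeholder key are assumptions for the demo; a real script would pass in the master key read from disk and the schema defined earlier.

```javascript
// Hypothetical helper: options for creating a CSFLE-enabled client.
function buildAutoEncryptionOptions(localMasterKey, schema) {
  return {
    useNewUrlParser: true,
    useUnifiedTopology: true,
    autoEncryption: {
      keyVaultNamespace: 'users.keyVault',
      kmsProviders: { local: { key: localMasterKey } },
      // maps "database.collection" to the encryption schema for that collection
      schemaMap: { 'users.accounts': schema },
      extraOptions: {
        // set to true only if mongocryptd is already running and must not be spawned
        mongocryptdBypassSpawn: false,
      },
    },
  };
}

// Buffer.alloc(96) is an all-zero placeholder, not a usable key.
const options = buildAutoEncryptionOptions(Buffer.alloc(96), {
  bsonType: 'object',
  properties: {},
});

// In a real script: new MongoClient('mongodb://localhost:27017', options)
console.log(Object.keys(options.autoEncryption));
```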

To insert the data we run a normal insert query, and the resulting document will hold data like the sample below for the bank account field.

{
…
"bankAccountNumber":"Ac+ZbPM+sk7gl7CJCcIzlRAQUJ+uo/0WhqX+KbTNdhqCszHucqXNiwqEUjkGlh7gK8pm2JhIs/P3//nkVP0dWu8pSs6TJnpfUwRjPfnI0TURzQ==",
…
}

When authorized personnel perform a query, the driver decrypts this data and returns it in a readable format, i.e.:

{
…
"bankAccountNumber":43265436456456456756,
…
}

Note: It is not possible to query for documents by a randomly encrypted field. To find such a document, you have to query on another field of the same document.

Conclusion

Data security should be considered at all levels, both at rest and in transit. MongoDB Enterprise 4.2 Server gives developers a way to encrypt data on the client side using Client-Side Field Level Encryption, securing the data from database host providers and from insecure network access. CSFLE uses envelope encryption, where a master key is used to encrypt data encryption keys. The master key should therefore be kept safe using a key management tool such as a Key Management System.
