Severalnines Blog
The automation and management blog for open source databases

Scaling Connections in PostgreSQL using Connection Pooling

Opening a database connection is an expensive operation, and connection pooling is used to keep database connections open so they can be reused. This avoids having to repeatedly open network sessions, authenticate and check authorisation. Pooling keeps the connections active so that, when a connection is later requested, one of the active ones is used rather than having to create one from scratch.

Connection Pooling

Connection pooling has become one of the most common methods of handling database connections prior to a query request. We normally think that a connection to the database is fast but that ain’t the case especially when a large number of clients are connecting. Without connection pooling, a request would take up to 35-50 ms to connect but 1-2ms if connection pooling is employed. Connection pooling is therefore preallocating database connections, and then recycling them when new clients are connecting

Reasons for Connection Pooling

  1. To avoid crashing your server. PostgreSQL servers are limited to a number of clients they handle at a time depending on the memory parameter. If this number is surpassed, then you will end up crashing the server. With connection pooling, clients use a set number of connections.
  2. Facilitate query processing. Normally, database requests are executed in a serial manner with a criterion of first-in first-out. With a large set of clients, this would take ages to achieve this process. Therefore the approach should be to make a single connection with pipelined requests which can be executed simultaneously rather than each at a time.
  3. Improve on security. Often a connection involves a handshake that may take 25-35 ms average over which an SSL is established, passwords are checked and sharing of the configuration information. All this work for each connected user will result in extensive usage of memory. However, with connection pooling, the number of connections is reduced hence saving on memory.

Types of Connection Pooling

There are basically two types of connection pooling, however, there is a third type work-around that acts like a connection pooling strategy known as persistent connections.

Persistent Connection Pooling

This approach intends to keep an initial connection active from the time it is initiated. It doesn’t fully hold the connection pooling features but well enough to provide some continuous connection. It is quite helpful for a small set of client connections whose overhead may range between 25-50 ms. A limitation with this approach is that it is limited to a number of connections to the db with normally a single connection per entry to the server.

Framework Connection Pooling

Framework connection pooling occurs at an application level whereby, whenever your server script is started, a pool of connections is established to handle query requests that will be arriving later.

Standalone Connection Pooling

For every connection to the database, an overhead memory between 5 to 10 MB is used up to cater for a query request. This is not quite good for a large number of connections. Using the framework connection pooling can be limited by this number of connections as it may encounter a large memory size usage. We thus opt to use the Standalone connection pooling which is configured in accordance to Postgres sessions, statements and transactions. The core advantage associated with this approach is: minimal overhead cost of about 2kb for every connection.

When you create a connection pooling class, it should meet the following factors for an increased database performance:

  1. Preallocate the connections
  2. Oversee connections that are available
  3. Assign new connections
  4. Wait for a connection to be available
  5. Close connection
Download the Whitepaper Today
 
PostgreSQL Management & Automation with ClusterControl
Learn about what you need to know to deploy, monitor, manage and scale PostgreSQL

Preallocating the Connections

Ensuring more connections in advance will facilitate requests handling at the point when the application has been started. For example if your server is developed with Java, you can use vectors to store available idle connections using the code below.

availableConnections = new Vector(connections); 
busyConnections = new Vector();
for(int i=0; i<connections; i++) {
availableConnections.addElement(makeNewConnection()); 
}

Overseeing Available Connections

The class should be able to check for any idle connection in a list of busy connections and return it. This is basically done in order to reuse a connection or close connections that are not in use. Sometimes connections time out hence while returning a connection it is quite important to check if it is still open. If not, you will need to discard this connection and repeat the process. When a connection is discarded a slot is opened which can be used to process a new connection when the limit has been reached. This can be achieved with

public synchronized Connection getConnection() throws SQLException {
if (!availableConnections.isEmpty()) { Connection existingConnection =
(Connection)availableConnections.lastElement(); int lastIndex = availableConnections.size() - 1; availableConnections.removeElementAt(lastIndex); if (existingConnection.isClosed()) {
notifyAll(); // Freed up a spot for anybody waiting.
return(getConnection()); // Repeat process. } else {
busyConnections.addElement(existingConnection);
return(existingConnection); }
} }

Assigning a New Connection

You should be able to start a background thread to assign a new connection if there is no idle on available and if the connection limit is almost being reached.

if ((totalConnections() < maxConnections) && !connectionPending) { // Pending = connecting in bg
makeBackgroundConnection(); }
try {
wait(); // Give up lock and suspend self.
} catch(InterruptedException ie) {} return(getConnection()); // Try again.

Waiting for a New Connection

When there is no idle connection and the connection limit has been reached, the configuration should be able to wait for a new connection to be available without continual pooling. We can do so using the wait method which provides a thread synchronization lock and suspends the thread until a notification has been provided.

try {
     wait();
} catch(InterruptedException ie) {} 
return(getConnection());

For good application ethics, clients should not wait in real time for a connection rather you will throw an exception when connections are absent with the code below:

throw new SQLException("Connection limit reached");

Closing the Connection

When connections are garbage collected, you should close them rather than doing it explicitly. However, if you want an explicit approach to close a connection then you can use:

public synchronized void closeAllConnections() {
// The closeConnections method loops down Vector, calling // close and ignoring any exceptions thrown. closeConnections(availableConnections); availableConnections = new Vector(); closeConnections(busyConnections);
busyConnections = new Vector();
}

Conclusion

Connection pooling for PostgreSQL helps us reduce the number of resources required for connecting to the database and improves the speed of connectivity to the database. This is achieved by pooling connections to the DB, maintaining these connections and consequently reducing the number of connections that must be opened.