As you may know, you can extend PostgreSQL functionality by using PostgreSQL extensions. You can use different extensions to make PostgreSQL like an enterprise version and address specific needs. In this blog, we’ll see some of the most popular open source extensions in PostgreSQL and how it can be used to expand its capability.
PostgreSQL is the most advanced open source database. Every year, PostgreSQL community release a new version with some new features but there are many features developed using extensions which can be used for a particular use case.
Some of these extensions are very powerful and popular to build an enterprise-grade PostgreSQL environment.
The pg_stat_statements module provides a means for tracking execution statistics of all SQL statements executed by a PostgreSQL server. When the pg_stat_statements module is loaded, it tracks statistics across all databases of the server. The statistics gathered by the pg_stat_statements module are made available via a view named pg_stat_statements. You can check more details about pg_stat_statements functions and configuration parameter here.
PostGIS is an extension to the PostgreSQL database system which allows GIS (Geographic Information Systems) objects to be stored in the database. It includes support for GiST-based R-Tree spatial indexes, and functions for analysis and processing of GIS objects.
PostGIS is used for Spatial and Geographic objects for PostgreSQL. PostGIS adds types (geometry, geography, raster etc) to the PostgreSQL database. PostGIS also adds functions, operators, and index enhancements that apply to these spatial types. You can check the installation method here.
The postgres_fdw module provides the foreign-data wrapper postgres_fdw, which can be used to access data stored in an external PostgreSQL database servers. When there are two PostgreSQL databases, postgres_fdw refers to the table of the other database from one database. Once you define FOREIGN TABLE, you can do SELECT, INSERT, UPDATE, DELETE like a local database.
The functionality provided by postgres_fdw module overlaps substantially with the functionality of the older dblink module. But postgres_fdw module provides more transparent and standards-compliant syntax for accessing remote tables, and it can give better performance in many cases.
The main use case of the foreign tables is to make the data available to systems without replicating or duplicating it. There are even simple implementations of sharding using FDW, because data in the other shards can be available for queries though FDWs.
In Oracle, you can get the data from remote database tables using DBLinks but main difference between DBLinks and FDW is that FDW can maintain the meta data or table definition about foreign table locally.
Mysql_fdw is a PostgreSQL extension implements a Foreign Data Wrapper (FDW) for MySQL. Mysql_fdw module facilitates the use of PostgreSQL server as a client for MySQL Server, which means it can then fetch data from the MySQL database as a client. Currently, EnterpriseDB is maintaining this module on github.
The previous version of mysql_fdw was supported only read-only, but the latest version provides the write capability. The user can now issue insert, update and delete statements for the foreign tables using the mysql_fdw. It uses the PostgreSQL type casting mechanism to provide opposite typecasting between MySQL and PostgreSQL data types.
The hstore module implements hstore data type for storing key-value pairs in a single value in PostgreSQL. The hstore data type is very useful in many use cases, such as semi-structured data or rows with many attributes that are rarely queried. Note that keys and values are just text strings only.
Before working with hstore data type, you need to create the hstore extension which loads the contrib module to your PostgreSQL instance.
pgAudit is a PostgreSQL Audit Extension which provides detailed session and/or object audit logging via the standard PostgreSQL logging facility.
The main goal of the pgAudit module is to provide PostgreSQL users with the capability to produce audit logs often required to comply with government, financial, or ISO certifications.
Generally, an audit is an official inspection of an individual’s or organization’s accounts, typically by an independent body. The information gathered by pgAudit module is properly called an audit trail or audit log.
pgAudit module was developed to support PostgreSQL 9.5 and above versions.
Session audit logging from pgAudit module provides detailed logs of all statements executed by a user in the backend. You can check more details, configuration and how to install and use it in PostgreSQL here.
Orafce is a module which implements Oracle compatible functions, data types and packages. This is an open source tool with BSD licence so anyone can use this tool. This module is very useful in Oracle to PostgreSQL migration task, it has many Oracle functions implemented in PostgreSQL.
Applications usually use those functions with multiple occurrences. You can reduce the modification cost of SQL by using this tool. You can check the how to migrate from Oracle to PostgreSQL blog.
All the functions and packages are implemented correctly and it is well tested.
Some of the functions are:
- utl_file – filesystem related functions
- Dbms_pipe and dbms_alert
- PLVdate,PLVstr, PLVchr
- Oracle compatible DATE data type and functions like ADD_MONTHS, LAST_DAY,NEXT_DAY and so on.
- NVL function
- SUBSTR and SUBSTRB function
- VARCHAR2 and NVARCHAR2 support
Pg_bulkload module provides high-speed data loading capability to PostgreSQL users. pg_bulkload module is designed to load a huge amount of data to a database. You can load the data to table bypassing PostgreSQL shared buffers. pg_bulkload module also has some ETL features like input data validation and data transformation with filter functions.
The original goal of pg_bulkload module was a faster alternative of COPY command in PostgreSQL.
Pg_bulkload module is developed and maintained by NTT OSS Center.
IMPORTANT NOTE: Under streaming replication environment in PostgreSQL, pg_bulkload module does not work properly. See here for more details.
The pgstattuple module provides various functions to obtain tuple-level statistics in PostgresQL. The function in the pgstattuple module returns a relation’s physical length, percentage of “dead” tuples, and other info. This may be helpful to users to determine whether vacuum is necessary or not. The argument to the function is the target relation’s name (optionally schema-qualified) or OID.
pg_trgm is a PostgreSQL extension providing simple fuzzy string matching in PostgreSQL. Its operational and conceptual overhead is much lower than PostgreSQL full-text search or a separate search engine.
A trigram or trigraph is a group of three consecutive characters taken from a string. You can measure the similarity of two strings by counting the number of trigrams they share. This simple idea of trigram turns out to be very effective for measuring the similarity of words in many natural languages.
In general, pg_trgm module can help when:
- You need fuzzy case-insensitive string matching in PostgreSQL.
- You want to speed up LIKE, ILIKE, ~ or ~* in PostgreSQL.
- You want to search for patterns that aren’t left-anchored (e.g. %john%). Such patterns are not supported by B-tree indexes in PostgreSQL.
The above list of modules/extensions are very useful to expand the PostgreSQL capability. There are some more extensions which are not included in the blog but it is very useful to expand PostgreSQL functionality. These are as follow.
- Language Extensions: PL/Python, PL/Perl, PL/R, PL/v8, PL/sh etc.
There are many extensions in PostgreSQL community supported extensions and some third party extensions, all used for a particular use case. You can expand PostgreSQL capability or functionality by using those modules.