Introduction to SQL Server on Linux

Rajendra Gupta

Microsoft SQL Server is an excellent choice for a relational database with key benefits including performance, security, reliability, and total cost of ownership. A few exciting features of SQL Server are outlined below:

  • SQL Server supports Python, R, Java, Spark with the combination of relational database and Artificial Intelligence (AI)

  • Database administrators and developers can choose the supported platform and language.

    • Available Platforms: Windows, Linux( RedHat, SUSE, Ubuntu), Docker containers.

  • Available Languages: C\C++, PHP, Java, Node.js, Python, Ruby.

  • SQL Server is the best relational database according to performance benchmarks. TPC-H 1TB, 10TB and 30TB SQL Server is considered the most secure database as per the National Institute of Standards and Technology (NIST). It supports Transparent Data Encryption, Column level encryption, Static and Dynamic data masking, Data discovery and classification, Certificates, SSL. SQL Server also allows for accessing external data from Hadoop clusters, NoSQL databases, Oracle, SAP Hana, Big data as external tables using PolyBase as well. SQL Server provides high availability and disaster recovery solutions such as the restoration of backups, log shipping, Always On Availability groups with multiple secondary replicas in synchronous and asynchronous data commit

SQL Server On Linux - Introduction

 

Image Reference: Microsoft cloud blogs

When we think about SQL Server, we always think about it as running on Windows. Starting from SQL Server 2017, you can run it on Linux as well.  

Microsoft executive vice president Scott Guthrie states: “Bringing SQL Server to Linux is another way we are making our products and new innovations more accessible to a broader set of users and meeting them where they are”.

SQL Server on Linux is an enterprise-ready relational database with industry-leading capabilities and robust business continuity. It combines the Microsoft SQL Server on a best-known and most-used open-source operating system Linux.

Supported SQL Server Flavors on Linux

  • Supported SQL Server flavors on Linux include Red Hat Enterprise Linux, SUSE Linux, Enterprise Server, Ubuntu, Kubernetes clusters, and Docker containers. That means that you will no longer have to worry about what Linux servers does your SQL Server support when choosing one!

The following image gives a high-level process model overview of the platform abstraction layer and its communication with Linux OS:

Image Reference: Microsoft cloud blogs

 Why Should You Run SQL Server on Linux?

You might be curious to learn more about SQL Server on Linux and wonder whether you can use it for critical databases. Therefore, let me tell you a few reasons you should:

  • Open-source platform: Linux is an open-source operating system. The Linux operating system requires low computing resources (RAM, CPU) if we compare it to other operating systems. Therefore, you can reduce the cost of an operating system license.
  • SQL Server license: You can move your existing Windows-based SQL Server Licenses to SQL Server at Linux without any additional cost. Therefore, you can plan to move Linux based SQL Server without worrying about licenses.
  • Enterprise-level features: SQL Server on Linux is an enterprise-ready database. It has features such as high availability and disaster recovery with Always On Availability Groups. You can also combine Always-on availability groups between Windows and Linux operating systems.

  • Simple backups: You can restore your database from Windows to Linux and vice-a-versa using a simple backup and restore method. Therefore, you can quickly move databases without worrying about the underlying operating system with multiple database environments.

  • Industry-leading performance: SQL Server Linux is tested on the TPC-E benchmark. It is ranked number 1 in the TPC-H1 TB, 10TB, 30 TB benchmarks.

  • Security: As per the Microsoft docs, the NIST institute rated SQL Server on Linux as the most secure database.

  • Simple installation: SQL Server on Linux supports command-line installation. It is quick, simple, and comparatively faster than installing on Windows Server.

  • Database upgrades: You can move out of unsupported SQL Server versions such as SQL Server 2008 R2 into SQL Server 2017 or 2019 Linux with simplified database migrations.

  • Quick deployments: SQL Server on Linux supports a docker container where you can deploy SQL Server within a few seconds. It helps developers build a container with SQL Server image and test their code without waiting for virtual machines or higher-end servers. The container can be deployed in the Azure cloud infrastructure. You can also use Kubernetes or Docker Swarm as an orchestration tool for managing many containers.

  • Similar functionality to T-SQL: The SQL Server on Linux uses similar T-SQL scripts, maintenance plans, backup mechanisms and routine administrative tasks. The users with Windows background can quickly get familiar with it without realizing much difference in the underlying operating system.

  • Data virtualization hub: SQL Server on Linux can act as a data virtualization hub by setting up external tables from Hadoop, Azure Blob Storage accounts, Oracle, PostgreSQL, MongoDB, and ODBC data sources.

  • Platform abstraction layer: Microsoft introduced a Platform Abstraction Layer (PAL) for database compatibility into a Linux environment. The PAL aligns operating system or platform-specific code in a single place. For example, the Linux setup includes about 81 MB of the uncompressed Windows libraries for SQLPAL.

Features of SQL Server on Linux

This section explores a few key features of the SQL Server on Linux that can justify migrating to SQL Server on Linux.

  1. Performance

SQL Server on Linux offers Hybrid Transactional Analytical Processing, a.k.a. HTAP, for fast transaction throughput, responsive analytics. The HTAP uses the following performance features:

  • In-Memory Online Transaction Processing (OLTP) - It contains memory-optimized tables and compiled stored procedures for improving the performance of heavy transactional applications. It can increase workload performance up to 30-100x.
  • Columnstore indexes - SQL Server on Linux supports the Columnstore index and the Rowstore indexes to improve  the performance of analytical queries.

  • Query store - The query store helps database administrators to monitor query performance and query regressions, their execution plan changes over time and reverts to the plan with the lowest query overhead.

  • Automatic tuning - In automatic tuning, SQL Server monitors the query performance based on the query store data. If it finds a new query plan causing an impact on performance, it reverts the old plan automatically without DBA intervention.

  • Intelligent query processing - There are features available from SQL Server 2019 for automatically improving query performance based on query workloads and collected statistics. It contains the following features:

    • Adaptive joins - SQL Server can automatically and dynamically select the join type as per the number of input rows.

    • Distinct approximate count - SQL Server can return an approximate count of the distinct number of rows with high performance and minimum resources.

    • Memory grant feedback - Sometimes we observe a spill to disk while executing large resource-intensive queries. It wastes assigned memory and impacts other queries as well. Therefore, the memory grant feedback helps SQL Server to avoid memory wastage based on the memory feedback.

    • Table variable cardinality - SQL Server on Linux can use the actual table variable cardinality instead of a fixed guess.

  • Security

Security is a critical feature for a relational database. Therefore, SQL Server on Linux contains advanced security features in all editions, including the standard edition:

  • Transparent Data Encryption (TDE) - The TDE features encrypt data files and database backups in rest. It protects the database from any malicious activity where the intruder grabs data files or backup files for accessing data.

  • Always Encrypted - The always encrypted features allow only applications to view and process the data. The developers and database administrators (highest privilege) cannot view original data (data that is decrypted). The encryption and decryption both take place on the client drivers. Therefore, data is encrypted both in-rest and in-motion.

  • Column-level Encryption - In column-level encryption, you can use certificates to encrypt the columns with sensitive information such as PII data and credit card numbers.

  • SQL Server Certificates - You can use SQL Server certificates for securing and encrypting all connections to the SQL Server on Linux.

  • Auditing - Auditing tracks specific events for capturing any malicious activity. You can view the event logs and audit files to help you investigate any data breaches.

  • Row-level Security - Row-level security allows users to view data based on user credentials. For example, the user can view only rows he is allowed to see. It prevents users from viewing or modifying other users' data.

  • Dynamic Data Masking - Dynamic data masking can mask the data in a column based on the masking functions. It can work with data such as email addresses, credit card numbers, social security numbers. For example, you can mask credit card numbers to display only the last four-digit numbers in the format of XXXX-XXXX-XXXX-1234.

  • Data Discovery and Classification - It is essential to identify, label and report sensitive data stored in your database. The Data discovery and classification tool can generate a report to discover sensitive data such as PII and classify the data based on the sensitivity.

  • Vulnerability Assessment - The vulnerability assessment can identify configurations and database design that can be vulnerable to common malicious attacks for your instance and database.3. High Availability

For a production database, the high availability and disaster recovery mechanism is very essential. Therefore, SQL Server on Linux includes the following high availability features:

  • Always on availability groups - You can configure availability groups between standalone SQL Server on Linux instances in an asynchronous way.

  • Always On failover clusters - SQL Server on Linux supports a pacemaker for providing a synchronous copy of the database in the same or a different data Centre. You can extend your Windows-based SQL Server Availability group with a Linux SQL Server replica node.

  • Log Shipping Using the SQL Agent - The log shipping works on the transaction log backups to provide warm stand-by data copies without complex configurations.

  • Containers Orchestrated Tools - You can use container orchestrated tools such as Kubernetes for enhancing SQL Server availability. It ensures that if a specific node of SQL Server is down, another node is bootstrapped automatically. Further, you can use always-on availability groups in Kubernetes clusters.

 

  1. Machine Learning Services

SQL Server on Linux supports machine learning models using R and Python scripts for data stored in your databases. Machine learning can help you do real-time predictive analytics on both operational and analytic data. You can add data science frameworks PyTorch, TensorFlow, Scikit-learn for enhancing Automation tasks capabilities using machine learning.

  1. PolyBase

SQL Server on Linux supports PolyBase, where you can configure external tables with Oracle, Big Data, SAP HANA, Hadoop, NoSQL Databases as a data virtualization tool. It eliminates the ETL transformation where you need to import or export data into SQL Server before querying it.

  1. Graph Database

SQL Server on Linux supports graph databases where it stores data as entities (nodes) and relationships(edges) for semantic queries

  1. Full-text Search

The SQL Server Linux supports full-text services for executing queries against the text data efficiently.

  1. SQL Server Integration Service (SSIS) Packages

SSIS packages can connect with the SQL Server on Linux databases or SQL Server in a container similar to the Windows-based SQL Server instance.

Conclusion

This article provided you with some high-level introduction to SQL Server on Linux, we went through its features. It lists why organizations should consider using SQL Server as their database solution on Linux operating systems and containers.

In subsequent articles, we will explore more useful SQL Server features on Linux and explore them practically.

ClusterControl
The only management system you’ll ever need to take control of your open source database infrastructure.