blog
MySQL in 2018: What’s in 8.0 and Other Observations
With most, if not all of 2018 behind us (depending on when you are reading this post), there is no doubt that it was a fantastic year for open-source SQL databases.
PostgreSQL 11 and MySQL 8 were both released, providing both communities with plenty to ‘talk about‘. Truth be told, both vendors have introduced many significant changes and additions in their respective releases and deserve their praise and accolades.
I normally guest post about the former here on the Severalnines blog (Many thanks to a great organization!) but I also have an interest in the latter. With many blog posts on my own website (link in my bio section), mostly targeting MySQL version 5.7, it (MySQL) is always in my peripherals.
So what does MySQL 8 have that version 5.7 does not have? What are the improvements? Well, there are many. In fact, too many to cover in just one blog post.
I recently upgraded to version 8 in my current Linux learning/development environment, so I thought to try my hand at pointing some of them out.
I cannot guarantee you an in-depth discussion on your ‘favorite‘ new feature(s). On the other hand, I will visit those that have caught my attention either via a personal interest or by way of the many terrific blog posts published throughout the year on version 8.
MySQL is getting better and better…Terrific improvements in version 8!
Roles
With Roles, DBA’s can mitigate redundancy, where many users would share the same privilege or set of privileges.
Roles are a part of the SQL standard.
After creating a specific role with the desired/required privilege(s), you can then assign users that particular role via the GRANT command or likewise, ‘taketh away‘ with REVOKE.
Roles come with numerous benefits and to make life a bit easier, there are a couple of tables to help you keep track of them:
-
mysql.role_edges – Here you find those roles and the users they are assigned.
mysql> DESC mysql.role_edges; +-------------------+---------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------------------+---------------+------+-----+---------+-------+ | FROM_HOST | char(60) | NO | PRI | | | | FROM_USER | char(32) | NO | PRI | | | | TO_HOST | char(60) | NO | PRI | | | | TO_USER | char(32) | NO | PRI | | | | WITH_ADMIN_OPTION | enum('N','Y') | NO | | N | | +-------------------+---------------+------+-----+---------+-------+ 5 rows in set (0.01 sec)
-
mysql.default_roles – Stores any default roles and those users assigned.
mysql> DESC mysql.default_roles; +-------------------+----------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------------------+----------+------+-----+---------+-------+ | HOST | char(60) | NO | PRI | | | | USER | char(32) | NO | PRI | | | | DEFAULT_ROLE_HOST | char(60) | NO | PRI | % | | | DEFAULT_ROLE_USER | char(32) | NO | PRI | | | +-------------------+----------+------+-----+---------+-------+ 4 rows in set (0.00 sec)
The combination of both tables (not in the SQL JOIN sense) essentially provides a ‘centralized location‘ where you can: know, monitor, and assess all of your implemented user-role privilege relationships and assignments.
Likely the simplest example role usage scenario would be:
You have several users who need ‘read-only access‘ on a specific table, therefore, requiring at least the SELECT privilege. Instead of granting it (SELECT) individually to each user, you can establish (create) a role having that privilege, then assign that role to those users.
But, roles come with a small ‘catch‘. Once created and assigned to a user, the receiving user must have an active default role set, during authentication upon login.
While on the subject of roles and users, I feel it is important to mention the change implemented in MySQL 8 concerning the validate_password component, which is a variant of the validate_password plugin used in version 5.7 .
This component provides various distinct ‘categories‘ of password checking: low, medium (default), and strong. Visit the validate_password component documentation for a full rundown on each levels’ validation specifics.
NoSQL Mingling with SQL – The Document Store
This feature is one I am still learning about, despite a fleeting interest in MongoDB in early 2016. To date, my interest, study, and learning have been focused solely on ‘SQL’. However, I am aware (through much reading on the web) that many are excited about this type of structuring (document-oriented) intertwined with ‘relational SQL’ now available in the MySQL 8 document store.
Below are many benefits available when using the document store. Be sure and mention your favorites I may have missed in the comments section:
- The JSON data type has been supported since MySQL version 5.7.8 yet, version 8 introduced significant enhancements for working with JSON. New JSON specific functions along with ‘shorthand‘ operators that can be used in place of multiple function calls – with equal results/output.
- Perhaps one of the foremost benefits is you no longer need to implement and work with multiple database solutions since NoSQL, SQL, or a combination of the two are supported in the document store.
- A “DevAPI”, provides seamless workflow capabilities within a NoSQL data context (collections and documents). (Visit the official DevAPI user guide documentation for more information).
- Powerful command-line sessions using Python, SQL, or Javascript as the ‘shell’ language.
- ACID compliant.
- Quickly explore and discover your data without defining a schema as you would in a relational model.
Common Table Expressions (CTE’s or the WITH clause)
What else can you say about CTE’s? These things are a game-changer! For starters, what exactly is a common table expression?
From Wikipedia:
“A common table expression, or CTE, (in SQL) is a temporary named result set, derived from a simple query and defined within the execution scope of a SELECT, INSERT, UPDATE, or DELETE statement.”
I’ll provide a simple example, demonstrating CTE’s. However, their full power is not harnessed in this section, as there are many more complex use-case examples than these.
I have a simple name table with this description and data:
mysql> DESC name;
+--------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------+-------------+------+-----+---------+-------+
| f_name | varchar(20) | YES | | NULL | |
| l_name | varchar(20) | YES | | NULL | |
+--------+-------------+------+-----+---------+-------+
2 rows in set (0.00 sec)
mysql> SELECT * FROM name;
+--------+------------+
| f_name | l_name |
+--------+------------+
| Jim | Dandy |
| Johhny | Applesauce |
| Ashley | Zerro |
| Ashton | Zerra |
| Ashmon | Zerro |
+--------+------------+
5 rows in set (0.00 sec)
Let’s find out how many last names start with ‘Z’:
mysql> SELECT *
-> FROM name
-> WHERE l_name LIKE 'Z%';
+--------+--------+
| f_name | l_name |
+--------+--------+
| Ashley | Zerro |
| Ashton | Zerra |
| Ashmon | Zerro |
+--------+--------+
3 rows in set (0.00 sec)
Easy enough.
However, using the WITH clause, you can ‘access‘ this same query results set (which can be thought of as a derived table) and refer to it later on within the same statement – or ‘scope‘:
WITH last_Z AS (
SELECT *
FROM name
WHERE l_name LIKE 'Z%')
SELECT * FROM last_Z;
+--------+--------+
| f_name | l_name |
+--------+--------+
| Ashley | Zerro |
| Ashton | Zerra |
| Ashmon | Zerro |
+--------+--------+
3 rows in set (0.00 sec)
I basically assign a name to the query, wrapping it in parenthesis. Then just select the data I want from what is now the last_Z CTE.
The last_Z CTE provides a complete result set, so you can filter it even further within the same statement:
WITH last_Z AS (
SELECT *
FROM name
WHERE l_name LIKE 'Z%')
SELECT f_name, l_name FROM last_Z WHERE l_name LIKE '%a';
+--------+--------+
| f_name | l_name |
+--------+--------+
| Ashton | Zerra |
+--------+--------+
1 row in set (0.00 sec)
A couple of the more powerful features are ‘chaining‘ multiple CTE’s together and referencing to other CTE’s within CTE’s.
Here is an example to give you an idea (although not so much useful):
WITH last_Z AS (
SELECT *
FROM name
WHERE l_name LIKE 'Z%'),
best_friend AS (
SELECT f_name, l_name
FROM last_Z
WHERE l_name LIKE '%a')
SELECT * from best_friend;
+--------+--------+
| f_name | l_name |
+--------+--------+
| Ashton | Zerra |
+--------+--------+
1 row in set (0.00 sec)
In the above query, you can see where I separated the last_Z CTE from the best_friend CTE with a comma then wrapped that query in parenthesis after the AS keyword.
Notice I am then able to refer to (and use) the last_Z CTE to essentially define the best_friend CTE.
Here are a few reasons why CTE’s are such a significant improvement in version 8:
- Other SQL vendors have supported CTE’s (many since earlier versions within their individual ecosystem’s) and now MySQL 8, has closed the gap in this area.
- A standard SQL inclusion.
- In some cases (where appropriate), CTE’s are a better option than Temporary Tables, Views, Derived Tables (or Inline Views), and some subqueries.
- CTE’s can provide an ‘on-the-fly‘ calculations results set you can query against.
- A CTE can reference itself – known as a recursive CTE (not demonstrated here).
- CTE’s can name and use other CTE’s
Window Functions
Analytic queries are now possible in MySQL 8. As Window functions are not my strong suit, I am focused on a more in-depth study and better understanding of them, on a whole, moving forward. These next example(s) are mostly elementary according to my understanding. Suggestions, advice, and best practices are welcome from readers.
I have this VIEW that provides a fictitious pipe data result set (something I somewhat understand):
mysql> SELECT * FROM pipe_vw;
+---------+-------------+-----------+-------+-------------+------------+----------------+
| pipe_id | pipe_name | joint_num | heat | pipe_length | has_degree | wall_thickness |
+---------+-------------+-----------+-------+-------------+------------+----------------+
| 181 | Joint-278 | 39393A | 9111 | 17.40 | 1 | 0.393 |
| 182 | Joint-8819 | 19393Y | 9011 | 16.60 | 0 | 0.427 |
| 183 | Joint-9844 | 39393V | 8171 | 10.40 | 0 | 0.393 |
| 184 | Joint-2528 | 34493U | 9100 | 11.50 | 1 | 0.427 |
| 185 | Joint-889 | 18393z | 9159 | 13.00 | 0 | 0.893 |
| 186 | Joint-98434 | 19293Q | 8174 | 9.13 | 0 | 0.893 |
| 187 | Joint-78344 | 17QTT | 179 | 44.40 | 1 | 0.893 |
| 188 | Joint-171C | 34493U | 17122 | 9.45 | 1 | 0.893 |
| 189 | Joint-68444 | 17297Q | 6114 | 11.34 | 0 | 0.893 |
| 190 | Joint-4841R | 19395Q | 5144 | 25.55 | 0 | 0.115 |
| 191 | Joint-1224C | 34493U | 8575B | 15.22 | 1 | 0.893 |
| 192 | Joint-2138 | 34493C | 91 | 13.55 | 1 | 0.893 |
| 193 | Joint-122B | 34493U | 9100B | 7.78 | 1 | 0.893 |
+---------+-------------+-----------+-------+-------------+------------+----------------+
13 rows in set (0.00 sec)
Imagine, I need the pipe asset records presented in some sort of row ranking depending on the length of each individual pipe. (E.g., The longest length is ‘labeled’ the number 1 position, the second longest length is ‘labeled’ position 2, etc…)
Based on the RANK() Window Function description in the documentation:
“Returns the rank of the current row within its partition, with gaps. Peers are considered ties and receive the same rank. This function does not assign consecutive ranks to peer groups if groups of size greater than one exist; the result is noncontiguous rank numbers.”
It looks to be a well-suited for this requirement.
mysql> SELECT pipe_name, pipe_length,
-> RANK() OVER(ORDER BY pipe_length DESC) AS long_to_short
-> FROM pipe_vw;
+-------------+-------------+---------------+
| pipe_name | pipe_length | long_to_short |
+-------------+-------------+---------------+
| Joint-78344 | 44.40 | 1 |
| Joint-4841R | 25.55 | 2 |
| Joint-278 | 17.40 | 3 |
| Joint-8819 | 16.60 | 4 |
| Joint-1224C | 15.22 | 5 |
| Joint-2138 | 13.55 | 6 |
| Joint-889 | 13.00 | 7 |
| Joint-2528 | 11.50 | 8 |
| Joint-68444 | 11.34 | 9 |
| Joint-9844 | 10.40 | 10 |
| Joint-171C | 9.45 | 11 |
| Joint-98434 | 9.13 | 12 |
| Joint-122B | 7.78 | 13 |
+-------------+-------------+---------------+
13 rows in set (0.01 sec)
In the next scenario, I want to build even further on the previous example by ranking the records of longest to shortest lengths, but, per each individual group of the distinct wall_thickness values.
Perhaps the below query and results will explain better where my prose may have not:
mysql> SELECT pipe_name, pipe_length, wall_thickness,
-> RANK() OVER(PARTITION BY wall_thickness ORDER BY pipe_length DESC) AS long_to_short
-> FROM pipe_vw;
+-------------+-------------+----------------+---------------+
| pipe_name | pipe_length | wall_thickness | long_to_short |
+-------------+-------------+----------------+---------------+
| Joint-4841R | 25.55 | 0.115 | 1 |
| Joint-278 | 17.40 | 0.393 | 1 |
| Joint-9844 | 10.40 | 0.393 | 2 |
| Joint-8819 | 16.60 | 0.427 | 1 |
| Joint-2528 | 11.50 | 0.427 | 2 |
| Joint-78344 | 44.40 | 0.893 | 1 |
| Joint-1224C | 15.22 | 0.893 | 2 |
| Joint-2138 | 13.55 | 0.893 | 3 |
| Joint-889 | 13.00 | 0.893 | 4 |
| Joint-68444 | 11.34 | 0.893 | 5 |
| Joint-171C | 9.45 | 0.893 | 6 |
| Joint-98434 | 9.13 | 0.893 | 7 |
| Joint-122B | 7.78 | 0.893 | 8 |
+-------------+-------------+----------------+---------------+
13 rows in set (0.00 sec)
This query uses the PARTITION BY clause on the wall_thickness column because we want the ranking (that ORDER BY pipe_length DESC provides) however, we need it in the context of the individual wall_thickness groups.
Each long_to_short column ranking resets back to 1 as you encounter (or change) to a different wall_thickness column value.
Let’s concentrate on the results of one single group.
Targeting the records with wall_thickness values 0.893, the row with pipe_length 44.40 has a corresponding long_to_short ‘ranking’ of 1 (it’s the longest), while the row with pipe_length 7.78 has a corresponding long_to_short ‘ranking’ of 8 (the shortest) all within that specific group (0.893) of wall_thickness values.
Window functions are quite powerful and their entire scope and breadth could not possibly be covered in one section alone. Be sure and visit the Window Functions supported in MySQL 8 documentation for more information on those currently available.
Improved Spatial Support and Capabilities
This is a tremendous set of features included in MySQL 8. Previous versions’ support, or lack thereof, simply could not compare to other vendor implementation(s) (think PostGIS for PostgreSQL).
For the past 10 plus years, I have worked in the field as a Pipeline Surveyor, collecting GPS and asset data, so this group of changes definitely catches my attention.
Spatial data expertise is a comprehensive subject in its own right and be assured, I am far from an expert on it. However, I hope to summarize the significant changes between versions 5.7 and 8 and convey them in an clear and concise manner.
Let’s familiarize ourselves with 2 key terms (and concepts) for the purposes of this section.
-
Spatial Reference System or SRS – Here is a partial definition from Wikipedia:
“A spatial reference system (SRS) or coordinate reference system (CRS) is a coordinate-based local, regional or global system used to locate geographical entities. A spatial reference system defines a specific map projection, as well as transformations between different spatial reference systems.”
-
Spatial Reference System Identifier or SRID – Also, Wikipedia has SRID’s defined as such:
“A Spatial Reference System Identifier (SRID) is a unique value used to unambiguously identify projected, unprojected, and local spatial coordinate system definitions. These coordinate systems form the heart of all GIS applications.”
MySQL supports many spatial data types. One of the more common ones is a POINT. If you use your GPS to navigate to your favorite restaurant, then that location is a POINT on a map.
MySQL 5.7 treats pretty much every ‘spatial object‘ as having an SRID of 0, which is significant for computations. Those calculations are computed in a Cartesian type of coordinate system. However, we all know that our globe is a sphere and far from flat. Therefore, in version 8, you have the ability to consider it as either flat or spherical in computations.
Back to those two terms, we defined previously.
Even though 0 is the default SRID in MySQL version 8, many (approximately 5,000+) other SRID’s are supported.
But why is that important?
This fantastic explanation via the blog post, Spatial Reference Systems in MySQL 8.0, sums it up nicely:
“By default, if we don’t specify an SRID, MySQL will create geometries in SRID 0. SRID 0 is MySQL’s notion of an abstract, unitless, infinite, Catesian plane. While all other SRSs refer to some surface and defines units for the axes, SRID 0 does not.”
Essentially, when performing calculations with SRID’s other than SRID 0, then the shape of our Earth comes into play, is considered, and affects those calculations. This is crucial for any meaningful/accurate computations. For an in-depth rundown and better extrapolation, see this blog post covering geography in MySQL 8.
I also highly recommend the MySQL Server Team blog post, Geographic Spatial Reference Systems in MySQL 8.0, for clarity on SRS’s. Do make sure and give it a read!
Finally, for spatial data upgrade concerns from version 5.7 to 8, visit some of the incompatible changes listed here for more information.
Other Notable Observations
Below are other release enhancements that I must acknowledge, although they are not covered in-depth in this blog post:
- utf8mb4 is now the default character set (previously latin1) – Better support for those must have emojis in addition to some languages…
- Transactional Data Dictionary – MySQL metadata is now housed in InnoDB tables.
- Invisible Indexes – Set the visibility of an index for the optimizer, ultimately determining if adding or removing it (the index), is a good or bad thing. Adding an index to an existing large table can be ‘expensive‘ in terms of locking and resources.
- Descending Indexes – Better performance on indexed values that are stored in descending order.
- Instant Add Column – For schema changes, specify ALGORITHM=INSTANT in ALTER TABLE statements and (if feasible for the operation) avoid metadata locks. ( For more information, see this great post by the MySQL Server Team, and the ALTER TABLE section from the official docs.)
Bonus Section: Something I Had Hoped to See…
Check constraints have not made their way into the MySQL product yet.
As with previous MySQL versions, check constraint syntax is allowed in your CREATE TABLE commands but it is ignored. To my knowledge, most other SQL vendors support check constraints. Come join the party MySQL!
MySQL has significantly ‘stepped up‘ its offering in version 8. Supporting robust spatial capabilities, convenient user management role options, ‘hybrid’ SQL/NoSQL data solutions, and analytical functions among the numerous additional improvements, is truly remarkable.
In my opinion, with version 8, MySQL continues to provide a solid option in the ever-growing, competitive open-source SQL ecosystem, full of relevant and feature-rich solutions.
Thank you for reading.