I am continuously asked by wide-eyed, green web 2.0 developers about Master-Master Replication (M-M). Development Managers and CTOs/CIOs aren’t immune to this either. They read a couple of posts about it on the web and seem to think that you can put together a couple of commodity servers, throw in a Load Balancer and voila! Instant cluster, without the high price tag of Oracle’s RAC.
Unfortunately, it’s not that simple. MySQL actually has a cluster product, which it acquired from Ericsson, but it has limitations: V5.0 needs both data and indexes to fit completely in memory, and V5.1 still needs all indexes in memory. Either version performs a two-phase commit, so at any time the same query returns the same result from any node. In an M-M environment, depending on the load on either node, replication can lag, so the same query can return different results from each node.
If such a case is not handled by the application, at best you will have inconsistent results, at worst dire ones.
Some applications will simply not be able to use it at all. Here is an example. Say your DB Schema has a column that needs to be unique but is not part of the Primary Key. If you insert a record on one node, then insert an identical record on the second node, you will end up with two legal records on their respective nodes. When the time comes to replicate that record in both directions, the unique constraint will make replication fail on both servers.
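To make the unique-column problem concrete, here is a toy simulation in Python. It is only a sketch: the Node class, its outbox list and the sample value are all made up for illustration, and real replication replays binary log events rather than a Python list. What it shows is that each local insert is legal on its own, and the collision only appears when each node replays the other's write.

```python
class Node:
    """A toy stand-in for one master that has a UNIQUE column (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.unique_values = set()  # values currently stored in the UNIQUE column
        self.outbox = []            # local writes waiting to be replicated to the peer

    def insert(self, value):
        """A write made directly by the application on this node."""
        if value in self.unique_values:
            raise ValueError(f"duplicate entry {value!r} on {self.name}")
        self.unique_values.add(value)
        self.outbox.append(value)

    def apply_from(self, peer):
        """Replay the peer's pending writes and return the values that collided."""
        conflicts = []
        for value in peer.outbox:
            if value in self.unique_values:
                conflicts.append(value)  # this is where real replication would stop
            else:
                self.unique_values.add(value)
        return conflicts


def demo():
    a, b = Node("node_a"), Node("node_b")
    a.insert("bob@example.com")  # legal on node_a
    b.insert("bob@example.com")  # also legal on node_b, which has not seen it yet
    return a.apply_from(b), b.apply_from(a)
```

Calling demo() reports a conflict in both directions, which is the "replication fails on both servers" case described above.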
But doesn’t it offer High-Availability?
It does, if you are willing to live with a potentially inconsistent dataset after a node failure. Remember we specified earlier that replication can lag between nodes. If disaster strikes on one of your nodes, you have lost every transaction for the period that the other node was lagging behind the lost one.
The result? Your website or app is still running, but data that was previously on one server is no longer available. Depending on your app, that could have huge consequences.
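A back-of-the-envelope way to picture the loss is sketched below. The function name and the transaction labels are hypothetical, and it assumes the survivor can only ever apply what it had already received before the other node died.

```python
def lost_on_failover(committed_on_failed_node, received_by_survivor):
    """Return the transactions that existed only on the node that died.

    committed_on_failed_node: ids committed on the failed node, in commit order.
    received_by_survivor: ids the surviving node had received before the failure.
    """
    received = set(received_by_survivor)
    return [tx for tx in committed_on_failed_node if tx not in received]
```

For example, if the failed node had committed tx1 through tx4 and the survivor had only received tx1 and tx2, then tx3 and tx4 are gone. The longer the lag, the longer that list gets.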
So Why use it?
If you can mitigate the above-mentioned risks and design your app properly, you can take advantage of M-M. Reads would be shared between both nodes, theoretically allowing you to handle more load. You could also pull out one node at a time, deploy possibly long schema changes, rebuild indexes, etc. while the other handles the whole load, then do the same to the other.
I tend to stay away from M-M. It adds an extra, often unnecessary, layer of complexity to your environment. It does have its place, and should be part of your little bag of tricks, but only pull it out when you need to, and when it absolutely fits. Don’t let that green PHP developer tell you he built it at home and it works, then get stuck supporting a nightmare. His pager isn’t going to go off at 3am…yours is!
I have finished my move to the Bay Area (Silicon Valley) and now have more time to return to the blog. I have learned a few new tricks as well I want to share.
So…..
I am relaunching the site. There will be a new format too. The podcast episodes will still be published for more high level discussions, then supported by a series of blog posts.
I think this will be a more productive format, since most users want to hear the high level discussion and then go to the site and refer to the material.
I just started using MONyog a month ago, and purchased it today. I have used MySQL Enterprise Monitor for a year and a half, and although some believe it’s a superior product, it’s 5K per server per year for the Enterprise version.
I bought the Unlimited Pack for $999 and that is a perpetual license for an unlimited amount of servers. It gives me all the monitoring I need and without an agent installed on the MySQL server. That makes the sys-admins I work with very happy.
My favorite feature is the log analyzer. It makes analyzing the slow-query-log and the general-log really easy and efficient.
International Business Machines said on Friday it has agreed to buy in-memory database software provider Solid Information Technology from private owners for an undisclosed sum. Solid’s largest owners were private equity firms Apax Partners and CapMan.
Solid is expected to have 2007 sales of around $14.4 million, Vesa Wallden, a member of Solid’s board told Reuters. IBM said the acquisition is expected to close in the first quarter of 2008. “IBM’s acquisition of Solid Information Technology supports the company’s growth strategy and capital allocation model, and it is expected to contribute to the achievement of the company’s objective for earnings-per-share growth through 2010,” IBM said in a statement.
Recently we ran into a wall at one of my customers’ sites. They built an application that processed EDI documents. Each document contained a list of transactions, and their application would launch a thread for each transaction in the document. On the surface this sounds good: the multi-threaded approach should speed up processing of a document.
InnoDB is the only built-in transactional storage engine and unfortunately has some limitations.
TX1 SET TRANSACTION ISOLATION LEVEL READ COMMITTED
TX2 SET TRANSACTION ISOLATION LEVEL READ COMMITTED
TX1 START TRANSACTION
TX2 START TRANSACTION
TX1 INSERT INTO child
TX2 INSERT INTO child (with same parent)
TX1 UPDATE parent
TX2 UPDATE parent (same parent row)
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
We solved this issue by using a third-party storage engine: solidDB.
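For anyone who cannot switch engines, the error message itself points at the usual mitigation: restart the transaction. Here is a minimal Python sketch of that idea, not what we deployed at this customer. DeadlockError is a hypothetical stand-in for whatever exception your driver raises for error 1213, and the attempt count and delays are arbitrary assumptions.

```python
import random
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for the driver exception carrying MySQL error 1213."""


def run_with_retry(transaction, max_attempts=3, base_delay=0.05, sleep=time.sleep):
    """Call transaction() and run it again if it was chosen as the deadlock victim.

    transaction must do the whole unit of work (START TRANSACTION through COMMIT),
    because the server has already rolled back the losing transaction.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DeadlockError:
            if attempt == max_attempts:
                raise  # give up and let the caller see the deadlock
            # Randomized back-off so competing threads do not collide again in lockstep.
            sleep(base_delay * attempt * random.random())
```

Retrying hides the symptom rather than removing the lock contention, so with many threads hitting the same parent row it can still be slow.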
The Federated Storage Engine (FSE) allows you to connect to a remote server and “mount” a table on your local server. The local table stores no data of its own; it links to the data on the remote server, and queries against it are executed there.
$ mysql -u root -ppassword -h remote
mysql> CREATE DATABASE test;
mysql> use test;
mysql> CREATE TABLE drivers (id INT,name VARCHAR(100));
mysql> INSERT INTO drivers (id, name) VALUES (1, 'Chris');
mysql> INSERT INTO drivers (id, name) VALUES (2, 'Sheeri');
mysql> INSERT INTO drivers (id, name) VALUES (3, 'Elie');
mysql> select * from drivers;
+------+--------+
| id | name |
+------+--------+
| 1 | Chris |
| 2 | Sheeri |
| 3 | Elie |
+------+--------+
3 rows in set (0.08 sec)
mysql> exit;
$ mysql -u root -ppassword -h local
mysql> CREATE DATABASE test;
mysql> use test;
mysql> CREATE TABLE drivers (id INT,name VARCHAR(100)) ENGINE=FEDERATED
CONNECTION='mysql://root:password@remote:3306/test/drivers';
mysql> select * from drivers;
+------+--------+
| id | name |
+------+--------+
| 1 | Chris |
| 2 | Sheeri |
| 3 | Elie |
+------+--------+
3 rows in set (0.14 sec)
mysql> exit;
I wanted to let you know about a blog and podcast I have been reading and listening to lately. It’s called OurSQL and the author is Sheeri Kritzer. I have been in email contact with Sheeri and she seems like a really great person. Why wouldn’t she be? She’s a MySQL DBA after all! The self-proclaimed “She-BA”.
Check out her blog at sheeri.net and her Podcast OurSQL on iTunes.
One of the best features of MySQL is the fact that you can have a log of all queries that take longer than n seconds.
Activating the Log
To activate this log, start MySQL with the --log-slow-queries[=file_name] option, or add the option to your my.cnf or my.ini and restart the server:
[mysqld]
...
log-slow-queries[=file_name]
You can specify a file name for the slow-query-log or it will be called host_name-slow.log. By default the log is in the data directory, unless you specify an absolute path with the file name.
By default a slow query is one that takes longer than 10 seconds. You can change that by specifying long_query_time either at startup or in the my.cnf or my.ini file.
[mysqld]
...
log-slow-queries
long_query_time=8
Viewing the Log
The log is in text format and can easily be viewed by any text editor and looks like this:
# Time: 070116 5:16:35
# User@Host: nvusr[nvusr] @ app01 [10.30.5.226]
# Query_time: 21 Lock_time: 0 Rows_sent: 20 Rows_examined: 4078677
SELECT ectransactions.*, interchanges.interchange_datetime as transaction_datetime, interchanges.partner_name, interchanges.direction, functional_groups.functional_group_control_numb, fo_name, functional_organization_qualid, partner_name, partner_qualid, interchanges.interchange_control_number, operators.name as operator_name, trading_participants.name as client_name FROM ectransactions left join functional_groups ON ectransactions.functional_group_id=functional_groups.id left join interchanges ON ectransactions.interchange_id=interchanges.id left join trading_participants ON ectransactions.trading_participant_id=trading_participants.id left join operators ON ectransactions.operator_id=operators.id ORDER BY client_name asc, transaction_datetime desc LIMIT 0, 20;
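Since the log is plain text, it is also easy to pick apart in a script. The sketch below is my own illustration, assuming the statistics line has exactly the layout of the sample above; the function names are made up. It pulls out the counters and computes how many rows were examined for each row sent, which is a quick hint that an index is missing.

```python
import re

# Matches the "# Query_time: ..." statistics line of a slow-query-log entry.
STATS_LINE = re.compile(
    r"# Query_time: (?P<query_time>[\d.]+)\s+Lock_time: (?P<lock_time>[\d.]+)\s+"
    r"Rows_sent: (?P<rows_sent>\d+)\s+Rows_examined: (?P<rows_examined>\d+)"
)


def parse_stats(line):
    """Return the counters from a statistics line, or None for any other line."""
    match = STATS_LINE.match(line)
    if match is None:
        return None
    return {
        "query_time": float(match["query_time"]),
        "lock_time": float(match["lock_time"]),
        "rows_sent": int(match["rows_sent"]),
        "rows_examined": int(match["rows_examined"]),
    }


def examined_per_row_sent(stats):
    """Rows read for each row returned; large values suggest a full scan."""
    return stats["rows_examined"] / max(stats["rows_sent"], 1)
```

For the entry above, that ratio comes to roughly 200,000 rows examined for every row returned.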
However, it’s a lot more readable with the mysqldumpslow command, whose output looks like this:
Count: 1 Time=24.00s (24s) Lock=0.00s (0s) Rows=20.0 (20), nvusr[nvusr]@tqapp02
SELECT ectransactions.*, interchanges.interchange_datetime as transaction_datetime, interchanges.partner_name, interchanges.direction, functional_groups.functional_group_control_numb, fo_name, functional_organization_qualid, partner_name, partner_qualid, interchanges.interchange_control_number, operators.name as operator_name, trading_participants.name as client_name FROM ectransactions left join functional_groups ON ectransactions.functional_group_id=functional_groups.id left join interchanges ON ectransactions.interchange_id=interchanges.id left join trading_participants ON ectransactions.trading_participant_id=trading_participants.id left join operators ON ectransactions.operator_id=operators.id ORDER BY transaction_datetime asc, description desc LIMIT N, N
Count: 2 Time=22.00s (44s) Lock=0.00s (0s) Rows=20.0 (40), nvusr[nvusr]@2hosts
SELECT ectransactions.*, interchanges.interchange_datetime as transaction_datetime, interchanges.partner_name, interchanges.direction, functional_groups.functional_group_control_numb, fo_name, functional_organization_qualid, partner_name, partner_qualid, interchanges.interchange_control_number, operators.name as operator_name, trading_participants.name as client_name FROM ectransactions left join functional_groups ON ectransactions.functional_group_id=functional_groups.id left join interchanges ON ectransactions.interchange_id=interchanges.id left join trading_participants ON ectransactions.trading_participant_id=trading_participants.id left join operators ON ectransactions.operator_id=operators.id ORDER BY client_name asc, transaction_datetime desc LIMIT N, N
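Notice how LIMIT 0, 20 became LIMIT N, N in the output above: queries that differ only in their literal values are grouped together and counted once. The Python sketch below imitates that grouping idea. It is a rough approximation of the technique and not mysqldumpslow's actual code; the function names and the two regular expressions are my own assumptions.

```python
import re
from collections import defaultdict


def fingerprint(query):
    """Replace literal values so that similar queries compare equal."""
    query = re.sub(r"'[^']*'", "'S'", query)  # quoted strings become 'S'
    query = re.sub(r"\b\d+\b", "N", query)    # standalone numbers become N
    return query.strip().rstrip(";")


def summarize(entries):
    """Group (query_time, query) pairs by fingerprint and total them up."""
    groups = defaultdict(lambda: {"count": 0, "total_time": 0.0})
    for query_time, query in entries:
        group = groups[fingerprint(query)]
        group["count"] += 1
        group["total_time"] += query_time
    return dict(groups)
```

Feeding it two 22-second entries that differ only in their LIMIT offset gives one group with a count of 2 and a total time of 44 seconds, the same shape as the Count: 2 entry above.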
I Want More!
You can also have MySQL add all queries that don’t use indexes to the slow-query-log. Pass --log-queries-not-using-indexes at startup, or add log-queries-not-using-indexes to your my.cnf or my.ini file.