Long-lasting MySQL connections in Python and Tornado

From the DBMS point of view, a client connection is a valuable resource that must be carefully managed to control the use of system resources. In practice, every DBMS sets a connection inactivity timeout, after which the connection is closed unilaterally. Usually, applications find out about this only after the connection has already been broken.

In the case of MySQL, the client receives the message “The MySQL server has gone away” (error 2006). In this article, we will look at approaches that allow applications with long-lived connections to keep them alive for as long as necessary. The examples use the standard MySQL connection library, mysql.connector, with Python 3 as the sample language; the application example is a microservice built on the Tornado framework.

Why keep the connection open

Since the reason for keeping the connection open is important, let’s start with it. Every opening and closing of a connection takes time and incurs overhead, both on the server side and on the client side. This factor is especially noticeable when the server and the client are located in geographically separated networks with a significant packet transmission delay (RTT of 100+ ms).

For example, one of our applications uses a DBMS server located in Russia while the application server is in the USA. In that setup, establishing a connection takes more than 4 seconds. Obviously, the naive approach of opening a connection on demand is not reasonable for a service that is expected to respond over HTTP as quickly as possible.
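
To get a feel for this cost in your own environment, you can time the connection setup directly. A minimal sketch, assuming the database connection settings (db_host, db_user, db_passwd, db_dbase, db_port) that are used later in the article:

import time
import mysql.connector

# rough measurement of the connection setup cost; db_* values come from your configuration
start = time.monotonic()
conn = mysql.connector.connect(host=db_host, user=db_user, password=db_passwd,
                               database=db_dbase, port=db_port)
print("connect took %.2f s" % (time.monotonic() - start))
conn.close()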

Naive solution with increased timeouts

The naive approach assumes setting timeouts on the client and server side. It is rarely sufficient on its own; it can work, for example, when the application is deployed in a secure corporate network with a limited number of users, typically in systems built on a two-tier architecture.

This approach can also help applications that access the DBMS rarely: raising the default connection timeouts simply reduces the number of reconnects.

Three MySQL global variables are responsible for connection timeout:

SET GLOBAL connect_timeout=3600;
SET GLOBAL wait_timeout=3600;
SET GLOBAL interactive_timeout=3600;

They also can be set in the configuration file:

[mysqld]
connect_timeout=3600
wait_timeout=3600
interactive_timeout=3600

These variables mean the following:

  • connect_timeout – the maximum time the server waits for the “connect” packet on a newly opened connection;
  • interactive_timeout – the maximum time the server waits for activity on an interactive connection;
  • wait_timeout – the maximum time the server waits for activity on a non-interactive connection.
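
Before changing anything, it is useful to check which values are currently in effect. A minimal sketch, assuming an already opened mysql.connector connection named connection:

# list the current timeout-related server variables
cursor = connection.cursor()
cursor.execute("SHOW VARIABLES LIKE '%timeout%'")
for name, value in cursor.fetchall():
    print(name, "=", value)
cursor.close()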

It is worth noting that this method does not protect against server restarts, and client applications may still encounter a broken connection. We do not recommend using this approach on its own, without the additional techniques described below.

Ping MySQL before the first query and reconnect

This approach is universal and should be implemented in most cases.


import mysql.connector

# db_host, db_user, db_passwd, db_dbase and db_port come from your configuration
connection = None

def init_db():
    return mysql.connector.connect(
        host=db_host,
        user=db_user,
        password=db_passwd,
        database=db_dbase,
        port=db_port)

def get_cursor():
    global connection
    try:
        # ping checks the connection and reconnects automatically if it was dropped
        connection.ping(reconnect=True, attempts=3, delay=5)
    except mysql.connector.Error:
        # ping could not restore the connection, open a new one from scratch
        connection = init_db()
    return connection.cursor()


connection = init_db()

...

# before queries issued after a long pause
cursor = get_cursor()

# queries

Periodic ping calls (Tornado)

The previous method is universal and can be applied everywhere. However, there are times when the connection must be kept highly available in order to minimize response time. Let’s consider a simple solution for a Tornado microservice that uses an additional thread.

It should be noted that MySQL connections are not thread-safe, so you cannot call ping directly from another thread. To solve this problem, we will implement an additional API endpoint (/keepalive) and call it periodically from an auxiliary thread.

First, consider the auxiliary thread. The implementation is typical:

import logging
import threading
import socket
import os

from time import sleep
from http.client import HTTPConnection


class Watchdog(threading.Thread):
    def __init__(self, ip, port, url, timeout, pause):
        threading.Thread.__init__(self)
        self.ip = ip
        self.port = port
        self.url = url
        self.timeout = timeout
        self.pause = pause

    def run(self):
        while True:
            logging.debug("Sleeping between KeepAlives for %d seconds" % self.pause)
            sleep(self.pause)
            try:
                logging.debug("Calling for KeepAlive [%s:%d, %s]" % (self.ip, self.port, self.url))

                # resolve the host name once and connect by IP address and port
                host, port = self.ip, self.port
                ip, port = socket.getaddrinfo(host, port)[0][-1]

                conn = HTTPConnection(ip, port, timeout=self.timeout)
                conn.request('GET', self.url)
                resp = conn.getresponse()
                data = resp.read()
                logging.debug(data)
                conn.close()
            except socket.timeout:
                # the keepalive request did not complete in time: terminate the
                # whole service so that an external supervisor can restart it
                logging.debug("Keepalive timeout happened. Stop whole app.")
                os._exit(1)

The only notable detail of this implementation is connecting to the server by a resolved IP address and port. This is what ensures that the wait timeout is handled correctly and an exception is raised for the HTTP connection when it expires.

In Tornado, we implement a KeepAlive handler:

import logging

import mysql.connector
import tornado.web

from libs.commons import Commons

class KeepaliveHandler(tornado.web.RequestHandler):
    def get(self):
        logging.debug("Checking the state of connections")
        try:
            # ping reconnects automatically if the connection was dropped
            Commons.connection.ping(reconnect=True, attempts=3, delay=5)
        except mysql.connector.Error:
            # fall back to a full reconnect with init_db() from the previous section
            Commons.connection = init_db()
        self.write('OK')

In the body of the service, run our thread:


import tornado.ioloop
import tornado.web

app = tornado.web.Application([
        ('/keepalive', KeepaliveHandler),
    ])
app.listen(Commons.server_port)

# start the watchdog before the IOLoop: IOLoop.current().start() blocks until the service stops
watchdog = Watchdog(ip='127.0.0.1', port=Commons.server_port, url='/keepalive', pause=60, timeout=60)
watchdog.start()

tornado.ioloop.IOLoop.current().start()

Now the database connection is checked every minute, and if a problem is detected, a reconnect is performed, which keeps the connection highly available. If the watchdog thread does not get a response to its keepalive request in time, it terminates the entire service so that an external control system can restart it (for example, docker run --restart=always).

Conclusion

We have looked at three ways to handle a long-lived client connection to a MySQL DBMS which, when applied together, allow applications to maintain connections to the DBMS for an unlimited time.

Of course, if you use an advanced ORM such as SQLAlchemy, the problem can be solved effectively by the ORM itself. When a high-level ORM is unnecessary, use the practices provided in this article to keep your applications running reliably.
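
For completeness, here is a minimal sketch of the SQLAlchemy route, assuming the same db_* settings as above: pool_pre_ping checks a pooled connection before each use and transparently reconnects if it has gone away, while pool_recycle replaces connections before they hit wait_timeout.

from sqlalchemy import create_engine, text

# pool_pre_ping pings a connection taken from the pool before use;
# pool_recycle replaces connections older than the given number of seconds
engine = create_engine(
    "mysql+mysqlconnector://%s:%s@%s:%d/%s" % (db_user, db_passwd, db_host, db_port, db_dbase),
    pool_pre_ping=True,
    pool_recycle=3600,
)

with engine.connect() as conn:
    conn.execute(text("SELECT 1"))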

If you like the article, share it with friends.