A Quick Performance Test of Tornado with uvloop and asyncio

Python 3.6 has been released, so I am trying to build services on Python 3. Since I know Tornado best, I tested the common ways of using Tornado under Python 3.

Business code usually needs to call third-party services and databases, so the tests focus on asynchronous HTTP and database I/O.

Event loops

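In Python 3.5+, the standard-library asyncio module provides an event loop for running coroutines, and the async/await keywords are used to define them. Tornado implements coroutines with yield-based generators and ships its own event loop. Because many third-party libraries are built on asyncio, and to make better use of the new async I/O features in Python 3, I measured Tornado's performance on different event loops and tried out how it pairs with third-party libraries (motor, asyncpg, aiomysql).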
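
As a minimal illustration of the async/await syntax and an event loop, here is a sketch that uses only the standard library. The coroutine name fetch_greeting and its argument are made up for this example and do not come from the tests in this post.

```python
import asyncio


async def fetch_greeting(name):
    # "await" hands control back to the event loop until the sleep finishes.
    await asyncio.sleep(0.01)
    return "hello " + name


# Create an event loop explicitly and run the coroutine to completion.
loop = asyncio.new_event_loop()
greeting = loop.run_until_complete(fetch_greeting("tornado"))
loop.close()
print(greeting)
```

The loop is created explicitly so the snippet behaves the same on Python 3.6 and later versions.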

Basic structure of a Tornado app

A basic Tornado app looks like this:

import tornado.httpserver as httpserver
import tornado.ioloop as ioloop
import tornado.options as options
import tornado.web as web

options.parse_command_line()


class IndexHandler(web.RequestHandler):
    def get(self):
        self.finish("It works")


class App(web.Application):
    def __init__(self):
        settings = {
            'debug': True
        }
        super(App, self).__init__(
            handlers=[
                (r'/', IndexHandler)
            ],
            **settings)


if __name__ == '__main__':
    app = App()
    server = httpserver.HTTPServer(app, xheaders=True)
    server.listen(5010)
    ioloop.IOLoop.instance().start()

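This drives the app with Tornado's default event loop. IOLoop creates an event loop that responds to epoll events and calls the corresponding handler to process each request.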
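
To make the idea of "respond to readiness events, then call the matching handler" concrete, here is a rough sketch built on the standard-library selectors module. It is not Tornado's implementation; the callback on_readable and the socket pair are assumptions chosen for illustration.

```python
import selectors
import socket

# DefaultSelector picks the best mechanism for the platform (epoll on Linux).
sel = selectors.DefaultSelector()
received = []


def on_readable(sock):
    # The "handler": called when the socket has data to read.
    received.append(sock.recv(1024))
    sel.unregister(sock)


# A connected socket pair stands in for a client and a server connection.
client_side, server_side = socket.socketpair()
server_side.setblocking(False)
sel.register(server_side, selectors.EVENT_READ, on_readable)

client_side.send(b"ping")

# One iteration of the loop: wait for events, then dispatch to the callbacks.
for key, _events in sel.select(timeout=1):
    callback = key.data
    callback(key.fileobj)

client_side.close()
server_side.close()
sel.close()
print(received)
```

A real event loop repeats the select-and-dispatch step forever and also handles timers and writes.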

Asynchronous HTTP client

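Tornado provides an asynchronous HTTP client, AsyncHTTPClient, for calling third-party APIs from inside a handler. Even if the current third-party call is stalled, Tornado can still serve other handlers.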

import tornado.gen as gen
import tornado.httpclient as httpclient


class GenHttpHandler(web.RequestHandler):
    @gen.coroutine
    def get(self):
        url = 'http://127.0.0.1:5000/'
        client = httpclient.AsyncHTTPClient()
        resp = yield client.fetch(url)
        print(resp.body)
        self.finish(resp.body)

gen is the coroutine module that Tornado provides. In Python 3 you can also use the async/await syntax:

class AsyncHttpHandler(web.RequestHandler):
    async def get(self):
        url = 'http://127.0.0.1:5000/'
        client = httpclient.AsyncHTTPClient()
        resp = await client.fetch(url)
        print(resp.body)
        self.finish(resp.body)
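
The non-blocking behaviour described in this section can be sketched without Tornado. In the following standard-library example, asyncio.sleep stands in for a slow upstream fetch; the names slow_handler and fast_handler are hypothetical and only illustrate that a pending call does not hold up other work.

```python
import asyncio

finished = []  # records the order in which the handlers complete


async def slow_handler():
    # Stand-in for "await client.fetch(url)" against a slow upstream API.
    await asyncio.sleep(0.2)
    finished.append("slow")


async def fast_handler():
    # A request that returns almost immediately.
    await asyncio.sleep(0.01)
    finished.append("fast")


async def main():
    # The slow handler is started first, then the fast one.
    await asyncio.gather(slow_handler(), fast_handler())


loop = asyncio.new_event_loop()
loop.run_until_complete(main())
loop.close()
print(finished)
```

The fast handler finishes first even though it was started second, because the slow one yields to the event loop while it waits.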

asyncio event loop

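Coroutines defined with async are largely compatible with Tornado's coroutines, but not fully. For example, asyncio.sleep will not work under Tornado's default IOLoop.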

import asyncio


class SleepHandler(web.RequestHandler):
    async def get(self):
        print("hello tornado")
        await asyncio.sleep(5)
        self.write('It works!')

To make the asyncio.sleep call above work, replace Tornado's IOLoop with the asyncio event loop.

import tornado.platform.asyncio as tornado_asyncio

if __name__ == '__main__':
    tornado_asyncio.AsyncIOMainLoop().install()
    app = App()
    server = httpserver.HTTPServer(app, xheaders=True)
    server.listen(5020)
    asyncio.get_event_loop().run_forever()

Calling tornado_asyncio.AsyncIOMainLoop().install() replaces the default IOLoop with one backed by asyncio.

uvloop event loop

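Besides the standard-library asyncio event loop, the community has built another event loop, uvloop, implemented in Cython as a replacement for the standard one. It is billed as the fastest asynchronous I/O library for Python. It is used as follows: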

import uvloop

if __name__ == '__main__':
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    tornado_asyncio.AsyncIOMainLoop().install()
    app = App()
    server = httpserver.HTTPServer(app, xheaders=True)
    server.listen(5030)
    asyncio.get_event_loop().run_forever()

Because uvloop depends on Cython, Cython has to be installed as well. Both can be installed directly with pip.
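
If uvloop might not be installed everywhere the service runs, one option is to fall back to the standard loop. The following is only a sketch under that assumption; the helper new_loop and the coroutine ping are hypothetical names and not part of the tested code.

```python
import asyncio

try:
    import uvloop  # third-party; treated as optional in this sketch
except ImportError:
    uvloop = None


def new_loop():
    """Return a uvloop event loop when available, else the stdlib one."""
    if uvloop is not None:
        return uvloop.new_event_loop()
    return asyncio.new_event_loop()


async def ping():
    await asyncio.sleep(0)
    return "pong"


loop = new_loop()
result = loop.run_until_complete(ping())
loop_kind = type(loop).__module__  # shows which implementation was picked
loop.close()
print(result, "via", loop_kind)
```

Application code stays the same either way, since both loops implement the asyncio event loop interface.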

Performance of the three event loops

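Of the three event loops, Tornado's IOLoop does not work well with asyncio.sleep, so the comparison focuses mainly on the other two. Three kinds of endpoint are tested: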

1. Returning a plain string
2. Async HTTP client performance
3. Database read/write performance

Returning a plain string

IOLoop

Load test with 100 concurrent connections and 10,000 requests:

ab -k -c100 -n10000 http://127.0.0.1:5010/

Server Software:        TornadoServer/4.5.1
Server Hostname:        127.0.0.1
Server Port:            5010

Document Path:          /
Document Length:        8 bytes

Concurrency Level:      100
Time taken for tests:   5.615 seconds
Complete requests:      10000
Failed requests:        0
Keep-Alive requests:    10000
Total transferred:      2260000 bytes
HTML transferred:       80000 bytes
Requests per second:    1780.84 [#/sec] (mean)
Time per request:       56.153 [ms] (mean)
Time per request:       0.562 [ms] (mean, across all concurrent requests)
Transfer rate:          393.04 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.2      0       3
Processing:     2   56   5.9     56     154
Waiting:        2   56   5.9     56     154
Total:          5   56   5.8     56     158

QPS: 1780.84
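
The headline figures in the ab report follow from three of its lines. As a worked check using the values printed above (the variable names are mine):

```python
requests = 10000     # "Complete requests"
elapsed_s = 5.615    # "Time taken for tests", as rounded in the report
concurrency = 100    # "Concurrency Level"

# Requests per second (mean).
qps = requests / elapsed_s

# Time per request (mean): what one of the 100 concurrent clients sees.
ms_per_request = concurrency * elapsed_s / requests * 1000

# Time per request across all concurrent requests.
ms_across_all = elapsed_s / requests * 1000

print(round(qps, 2), round(ms_per_request, 2), round(ms_across_all, 4))
```

The computed QPS differs from the reported 1780.84 in the first decimal place because the elapsed time shown in the report is rounded to three decimals.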

Results from wrk, with 12 threads and 500 concurrent connections, running for one minute:

➜  ~ wrk -t12 -c500 -d60 http://127.0.0.1:5010/
Running 1m test @ http://127.0.0.1:5010/
  12 threads and 500 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   284.66ms   57.85ms 422.16ms   85.62%
    Req/Sec   139.33     94.69   696.00     64.84%
  99270 requests in 1.00m, 19.12MB read
  Socket errors: connect 0, read 582, write 0, timeout 0
Requests/sec:   1651.92
Transfer/sec:    325.87KB

asyncio

ab results:

Concurrency Level:      100
Time taken for tests:   5.616 seconds
Complete requests:      10000
Failed requests:        0
Keep-Alive requests:    10000
Total transferred:      2260000 bytes
HTML transferred:       80000 bytes
Requests per second:    1780.69 [#/sec] (mean)
Time per request:       56.158 [ms] (mean)
Time per request:       0.562 [ms] (mean, across all concurrent requests)
Transfer rate:          393.00 [Kbytes/sec] received

QPS: 1780.69

wrk results:

➜  ~ wrk -t12 -c500 -d60 http://127.0.0.1:5020/
Running 1m test @ http://127.0.0.1:5020/
  12 threads and 500 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   265.34ms   32.16ms 453.76ms   83.32%
    Req/Sec   157.85    104.58   696.00     63.36%
  108364 requests in 1.00m, 20.88MB read
  Socket errors: connect 0, read 458, write 2, timeout 0
Requests/sec:   1803.34
Transfer/sec:    355.74KB

uvloop

ab results for uvloop:

Concurrency Level:      100
Time taken for tests:   5.612 seconds
Complete requests:      10000
Failed requests:        0
Keep-Alive requests:    10000
Total transferred:      2260000 bytes
HTML transferred:       80000 bytes
Requests per second:    1781.98 [#/sec] (mean)
Time per request:       56.117 [ms] (mean)
Time per request:       0.561 [ms] (mean, across all concurrent requests)
Transfer rate:          393.29 [Kbytes/sec] received

wrk results:

➜  ~ wrk -t12 -c500 -d60 http://127.0.0.1:5030/
Running 1m test @ http://127.0.0.1:5030/
  12 threads and 500 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   272.23ms   47.65ms 457.63ms   87.26%
    Req/Sec   148.17    103.62   570.00     63.33%
  104625 requests in 1.00m, 20.16MB read
  Socket errors: connect 0, read 567, write 0, timeout 0
Requests/sec:   1740.76
Transfer/sec:    343.39KB

Async HTTP client performance

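Async HTTP client performance here means calling another API from inside a handler, such as a third-party request. The measured throughput (requests per second) is roughly as follows: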

tool    IOLoop    asyncio   uvloop
ab      571.12    462.64    534.99
wrk     448.11    444.63    411.19

Conclusion

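In these load tests, the three event loops perform within the same order of magnitude and no large gap opens up between them, so in terms of performance it hardly matters which one is used. Third-party libraries target the standard async I/O interface, other uvloop-driven frameworks such as sanic and japronto perform quite well, and uvloop can also be accelerated with Cython. For these reasons, the database driver tests below use uvloop as the event loop.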
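
To put a number on "the same order of magnitude", the ratio between the fastest and slowest loop can be computed from the HTTP client figures. This is only a quick calculation on the values reported above, reading the "loop" column as Tornado's IOLoop.

```python
# Requests per second from the async HTTP client test.
results = {
    "ab": {"IOLoop": 571.12, "asyncio": 462.64, "uvloop": 534.99},
    "wrk": {"IOLoop": 448.11, "asyncio": 444.63, "uvloop": 411.19},
}

# Ratio of the best to the worst throughput for each tool.
spread = {
    tool: max(numbers.values()) / min(numbers.values())
    for tool, numbers in results.items()
}

for tool, ratio in spread.items():
    print(tool, round(ratio, 2))
```

Both ratios stay well below 2, which matches the observation that none of the loops pulls clearly ahead.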

Database tests

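The most commonly used MySQL driver in Python is MySQLdb, but MySQLdb does not support Python 3. On Python 3, the async MySQL driver is aiomysql, which is built on PyMySQL. PostgreSQL and MongoDB both have drivers based on the asyncio event loop.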

asyncpg

For PostgreSQL, a good driver is asyncpg, which is more actively maintained and faster than aiopg. asyncpg is used as follows:


import asyncpg


class DatabaseHandler(web.RequestHandler):
    async def get(self):
        conn = await asyncpg.connect('postgresql://postgres@localhost/test')

        # rows = await conn.fetchrow('select pg_sleep(5)')
        rows = await conn.fetchrow('select * from public.user')
        print(rows[0])
        await conn.close()

        self.finish("ok")


class PoolHandler(web.RequestHandler):
    async def get(self):
        pool = self.application.pool
        async with pool.acquire() as connection:
            # Open a transaction.
            async with connection.transaction():
                # Run the query passing the request argument.
                rows = await connection.fetch("SELECT * FROM public.user ")
                # rows = await connection.fetch("SELECT pg_sleep(1) ")
                print(rows)

        self.finish("ok")


class App(web.Application):
    def __init__(self, pool):
        settings = {
            'debug': True
        }
        self._pool = pool
        super(App, self).__init__(
            handlers=[
                (r'/', IndexHandler),
                (r'/db', DatabaseHandler),
                (r'/pool', PoolHandler),
            ],
            **settings)

    @property
    def pool(self):
        return self._pool


async def init_db_pool():
    return await asyncpg.create_pool(database='test',
                                     user='postgres')


def init_app(pool):
    app = App(pool)
    return app


if __name__ == '__main__':
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    tornado_asyncio.AsyncIOMainLoop().install()

    loop = asyncio.get_event_loop()
    pool = loop.run_until_complete(init_db_pool())
    app = init_app(pool=pool)
    server = httpserver.HTTPServer(app, xheaders=True)
    server.listen(5040)
    loop.run_forever()

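The first approach uses short-lived connections: for every request, the handler opens a database connection, runs the query, and closes the connection again. The other approach uses a database connection pool. When requests exceed the pool size, the handler waits for a free connection, but this does not block the whole service.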
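
The pool behaviour can be sketched with the standard library alone. TinyPool below is a hypothetical stand-in, not how asyncpg implements its pool, and asyncio.sleep simulates the query. It shows a second request waiting for the single connection while an unrelated handler is still served.

```python
import asyncio

events = []  # order in which things happen


class TinyPool:
    """Hypothetical fixed-size pool; real drivers ship their own."""

    def __init__(self, size):
        self._free = asyncio.Queue()
        for i in range(size):
            self._free.put_nowait("conn-%d" % i)

    async def acquire(self):
        # Suspends only this coroutine while no connection is free.
        return await self._free.get()

    def release(self, conn):
        self._free.put_nowait(conn)


async def db_handler(pool, name):
    conn = await pool.acquire()
    try:
        events.append(name + " got " + conn)
        await asyncio.sleep(0.05)  # simulated query
    finally:
        pool.release(conn)


async def index_handler():
    events.append("index served")


async def main():
    pool = TinyPool(size=1)  # created inside the running loop
    await asyncio.gather(
        db_handler(pool, "req1"),
        db_handler(pool, "req2"),
        index_handler(),
    )


loop = asyncio.new_event_loop()
loop.run_until_complete(main())
loop.close()
print(events)
```

The second database request only proceeds after the first releases its connection, yet the index handler is served in between.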

aiomysql

import aiomysql


class PoolHandler(web.RequestHandler):
    async def get(self):
        pool = self.application.pool
        async with pool.acquire() as conn:
            async with conn.cursor() as cur:
                await cur.execute("SELECT * FROM users_account LIMIT 1")
                ret = await cur.fetchone()
                print(ret)

        self.finish("ok")

class App(web.Application):
    def __init__(self, pool):
        settings = {
            'debug': True
        }
        self._pool = pool
        super(App, self).__init__(
            handlers=[
                (r'/pool', PoolHandler),
            ],
            **settings)

    @property
    def pool(self):
        return self._pool


async def init_db_pool(loop):
    # Create the aiomysql connection pool on the given event loop.
    return await aiomysql.create_pool(host='127.0.0.1', port=3306,
                                      user='root', password='root',
                                      db='hydra', loop=loop)

def init_app(pool):
    app = App(pool)
    return app




if __name__ == '__main__':
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    tornado_asyncio.AsyncIOMainLoop().install()

    loop = asyncio.get_event_loop()
    pool = loop.run_until_complete(init_db_pool(loop=loop))
    app = init_app(pool=pool)
    server = httpserver.HTTPServer(app, xheaders=True)
    server.listen(5070)
    loop.run_forever()

motor

The MongoDB driver is motor, which also supports asyncio. It is used as follows:


import motor.motor_asyncio as motor_asyncio


class MongodbHandler(web.RequestHandler):
    async def get(self):
        ret = await self.application.motor_client.hello.find_one()
        # ret = await self.application.motor_client.hello.insert({'hello': 'world'})
        print(ret)
        self.finish("It works !")

class App(web.Application):
    def __init__(self):
        settings = {
            'debug': True
        }
        super(App, self).__init__(
            handlers=[
                (r'/', IndexHandler),
                (r'/mongodb', MongodbHandler),

            ],
            **settings)

    @property
    def motor_client(self):
        # Note: a new client is created on every access to this property.
        client = motor_asyncio.AsyncIOMotorClient('mongodb://localhost:27017')
        return client['test']


if __name__ == '__main__':
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    tornado_asyncio.AsyncIOMainLoop().install()
    app = App()
    server = httpserver.HTTPServer(app, xheaders=True)
    server.listen(5060)
    asyncio.get_event_loop().run_forever()
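
Note that the motor_client property in the example above constructs a new client each time it is read. One alternative, assuming a single client can safely be shared by all handlers, is to build it once when the application starts. The sketch below uses a hypothetical FakeClient in place of the real driver so that the difference can be counted; it is not motor code.

```python
class FakeClient:
    """Hypothetical stand-in for a database client; counts constructions."""

    created = 0

    def __init__(self, uri):
        FakeClient.created += 1
        self.uri = uri


class PerAccessApp:
    @property
    def client(self):
        # A new client (and its connections) on every access.
        return FakeClient("mongodb://localhost:27017")


class SharedApp:
    def __init__(self):
        # Built once at startup and reused by every handler.
        self._client = FakeClient("mongodb://localhost:27017")

    @property
    def client(self):
        return self._client


per_access = PerAccessApp()
for _ in range(3):
    per_access.client  # simulates three requests
per_access_count = FakeClient.created

FakeClient.created = 0
shared = SharedApp()
for _ in range(3):
    shared.client
shared_count = FakeClient.created

print(per_access_count, shared_count)
```

The figures reported below were measured with the per-access version, so a shared client may behave differently under load.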

Read performance

ab -c100 -n10000

wrk -t12 -c100 -d60s

tool    asyncpg-db   asyncpg-pool   aiomysql   motor
ab      305.49       898.84         669.75     236.82
wrk     281.60       819.23         655.58     252.51

During load testing, running wrk with 500 connections against the db endpoint produces connection errors (Too Many Connection). The MongoDB endpoint also raises Can't assign requested address errors.

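Because database reads and writes are non-blocking, in the db and mongodb modes the number of connections grows with the number of requests, and an exception is raised as soon as the maximum connection count is reached. The pool approach instead waits for a connection to be released before issuing the query, and it also performs best. The aiomysql connection pool behaves much like the PostgreSQL one.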
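
The difference between the two modes can be reproduced in a small simulation. FakeServer and its cap of 5 connections are assumptions made for this sketch; asyncio.Semaphore plays the role of the pool by making requests wait for a slot.

```python
import asyncio


class TooManyConnections(Exception):
    pass


class FakeServer:
    """Hypothetical database server that rejects connections past a cap."""

    def __init__(self, max_connections):
        self.max_connections = max_connections
        self.open = 0

    def connect(self):
        if self.open >= self.max_connections:
            raise TooManyConnections()
        self.open += 1

    def close(self):
        self.open -= 1


async def query_per_request(server):
    server.connect()  # one new connection per request
    try:
        await asyncio.sleep(0.01)  # simulated query
    finally:
        server.close()


async def query_bounded(server, limit):
    async with limit:  # wait for a slot instead of opening more connections
        await query_per_request(server)


async def run(n_requests, bounded):
    server = FakeServer(max_connections=5)
    limit = asyncio.Semaphore(5)
    if bounded:
        jobs = [query_bounded(server, limit) for _ in range(n_requests)]
    else:
        jobs = [query_per_request(server) for _ in range(n_requests)]
    results = await asyncio.gather(*jobs, return_exceptions=True)
    # Count how many requests failed with the connection error.
    return sum(isinstance(r, TooManyConnections) for r in results)


loop = asyncio.new_event_loop()
unbounded_errors = loop.run_until_complete(run(20, bounded=False))
bounded_errors = loop.run_until_complete(run(20, bounded=True))
loop.close()
print(unbounded_errors, bounded_errors)
```

With 20 simultaneous requests, the per-request mode fails for every request beyond the cap, while the bounded mode completes all of them.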

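With synchronous MySQL drivers, it is common to keep a single long-lived MySQL connection. That does not work with asynchronous drivers: while one coroutine is waiting on the connection, other coroutines still cannot use it. The best approach is to manage connections with a connection pool.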

Conclusion

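Tornado's author has also pointed out that, in his own tests, asyncio and Tornado's built-in epoll event loop perform about the same, and that Tornado 5.0 will consider adopting asyncio fully. Until then, whether Tornado runs on its own event loop, on asyncio, or on uvloop, the performance difference is small. When compatibility with asyncio-based database or HTTP libraries is needed, driving Tornado with uvloop gives the best compatibility.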
