Building RESTful APIs with Tornado

This post is reposted from here.

Tornado is a Python Web framework and asynchronous networking library that provides excellent scalability due to its non-blocking network I/O. It also greatly facilitates building a RESTful API quickly. These features are central to Tornado, as it is the open-source version of FriendFeed’s Web server. A few weeks ago, Tornado 3. was released, and it introduced many improvements. In this article, I show how to build a RESTful API with the latest Tornado Web framework and I illustrate how to take advantage of its asynchronous features.

Mapping URL Patterns to Request Handlers

To get going, download the latest stable version and perform a manual installation or execute an automatic installation with pip by running pip install tornado.

To build a RESTful API with Tornado, it is necessary to map URL patterns to your subclasses of tornado.web.RequestHandler, which override the methods to handle HTTP requests to the URL. For example, if you want to handle an HTTP GET request with a synchronous operation, you must create a new subclass of tornado.web.RequestHandler and define the get() method. Then, you map the URL pattern in tornado.web.Application.

Listing One shows a very simple RESTful API that declares two subclasses of tornado.web.RequestHandler that define the get method: VersionHandler and GetGameByIdHandler.

Listing One: A simple RESTful API in Tornado.

from datetime import date
import tornado.escape
import tornado.ioloop
import tornado.web
 
class VersionHandler(tornado.web.RequestHandler):
    def get(self):
        response = { 'version': '3.5.1',
            'last_build':  date.today().isoformat() }
        self.write(response)

class GetGameByIdHandler(tornado.web.RequestHandler):
    def get(self, id):
        response = { 'id': int(id),
            'name': 'Crazy Game',
            'release_date': date.today().isoformat() }
        self.write(response)
        
application = tornado.web.Application([
    (r"/getgamebyid/([0-9]+)", GetGameByIdHandler),
    (r"/version", VersionHandler)
])

if __name__ == "__main__":
    application.listen(8888)
    tornado.ioloop.IOLoop.instance().start()

The code is easy to understand. It creates an instance of tornado.web.Application named application with the collection of request handlers that make up the Web application. The code passes a list of tuples to the Application constructor. Each tuple is composed of a regular expression (regexp) and a tornado.web.RequestHandler subclass (request_class). The application.listen method builds an HTTP server for the application with the defined rules on the specified port. In this case, the code uses port 8888. Then, the call to tornado.ioloop.IOLoop.instance().start() starts the I/O loop that runs the server created with application.listen.

When the Web application receives a request, Tornado iterates over that list and creates an instance of the first tornado.web.RequestHandler subclass whose associated regular expression matches the request path, and then calls the head(), get(), post(), delete(), patch(), put() or options() method with the corresponding parameters for the new instance based on the HTTP request. For example, Table 1 shows some HTTP requests that match the regular expressions defined in the previous code.

HTTP verb and request URL	Tuple (regexp, request_class) that matches the request path	RequestHandler subclass and method that is called
GET http://localhost:8888/getgamebyid/500	(r"/getgamebyid/([0-9]+)", GetGameByIdHandler)	GetGameByIdHandler.get
GET http://localhost:8888/version	(r"/version", VersionHandler)	VersionHandler.get

Table 1: Matching HTTP requests.
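The first-match rule can be pictured with a small stand-in that uses only Python's re module. This is a sketch of the idea, not Tornado's actual router: the ROUTES list and the resolve function are names I made up for illustration, and I assume each pattern must match the whole request path.

```python
import re

# Hypothetical stand-in for the (regexp, request_class) list; the handler
# classes are replaced by plain strings to keep the sketch self-contained.
ROUTES = [
    (r"/getgamebyid/([0-9]+)", "GetGameByIdHandler"),
    (r"/version", "VersionHandler"),
]


def resolve(path):
    """Return (handler_name, captured_groups) for the first matching pattern."""
    for pattern, handler_name in ROUTES:
        match = re.fullmatch(pattern, path)
        if match:
            return handler_name, match.groups()
    return None  # no pattern matched; a real server would answer 404


print(resolve("/getgamebyid/500"))  # ('GetGameByIdHandler', ('500',))
print(resolve("/version"))          # ('VersionHandler', ())
print(resolve("/getgamebyid/abc"))  # None
```

Notice that the captured group arrives as the string '500', which is why GetGameByIdHandler.get converts id with int().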

The simplest case is the VersionHandler.get method, which just receives self as a parameter because the URL pattern doesn’t include any parameter. The method creates a response dictionary, then calls the self.write method with response as a parameter. The self.write method writes the received chunk to the output buffer. Because the chunk (response) is a dictionary, self.write writes it as JSON and sets the Content-Type of the response to application/json. The following lines show the example response for GET http://localhost:8888/version and the response headers:

{"last_build": "2013-08-08", "version": "3.5.1"}
Date: Thu, 08 Aug 2013 19:45:04 GMT
Etag: "d733ae69693feb59f735e29bc6b93770afe1684f"
Content-Type: application/json; charset=UTF-8
Server: TornadoServer/3.1
Content-Length: 48
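To see what the JSON body looks like without starting a server, you can serialize the same dictionary with tornado.escape.json_encode. This is only a sketch: the date is fixed to 2013-08-08 so that the result lines up with the sample response, and the key order follows the dictionary's insertion order on current Python versions.

```python
from datetime import date

import tornado.escape

# Same dictionary as VersionHandler.get, with a fixed date for reproducibility.
response = {'version': '3.5.1', 'last_build': date(2013, 8, 8).isoformat()}

body = tornado.escape.json_encode(response)
print(body)
# The encoded body is 48 bytes long, the same value as the Content-Length header.
print(len(body.encode("utf-8")))
```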

If you want to send the data with a different Content-Type, you can call self.set_header with “Content-Type” as the response header name and the desired value for it. When self.write receives a dictionary, it sets the Content-Type itself, so in that case you have to call self.set_header after calling self.write to override it. Listing Two follows the same order in a new version of the VersionHandler class: it writes a string and sets the Content-Type to text/plain instead of the application/json used by the previous version. Tornado encodes all header values as UTF-8.

Listing Two: Changing the content type.

class VersionHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Version: 3.5.1. Last build: " + date.today().isoformat())
        self.set_header("Content-Type", "text/plain")

The following lines show the example response for GET http://localhost:8888/version and the response headers with the new version of the VersionHandler class:

Server: TornadoServer/3.1
Content-Type: text/plain
Etag: "c305b564aa650a7d5ae34901e278664d2dc81f37"
Content-Length: 38
Date: Fri, 09 Aug 2013 02:50:48 GMT

The GetGameByIdHandler.get method receives two parameters: self and id. The method creates a response dictionary that includes the integer value received for the id parameter, then calls the self.write method with response as a parameter. The sample doesn’t include any validation for the id parameter in order to keep the code as simple as possible, as I’m focused on the way in which the get method works. I assume you already know how to perform validations in Python. The following lines show the example response for GET http://localhost:8888/getgamebyid/500 and the response headers:

{"release_date": "2013-08-09", "id": 500, "name": "Crazy Game"}
 
Content-Length: 63
Server: TornadoServer/3.1
Content-Type: application/json; charset=UTF-8
Etag: "489191987742a29dd10c9c8e90c085bd07a22f0e"
Date: Fri, 09 Aug 2013 03:17:34 GMT
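Because the handler skips validation, here is one way a check could look. It is a minimal sketch under my own assumptions: MAX_GAME_ID and parse_game_id are hypothetical names, and answering with a 400 status code is just one reasonable choice. The URL pattern already guarantees that id contains only digits, so the check covers the range only.

```python
from datetime import date

import tornado.web

MAX_GAME_ID = 100000  # hypothetical upper bound, chosen for illustration


def parse_game_id(raw_id):
    """Convert the captured URL group to an int and reject out-of-range ids."""
    game_id = int(raw_id)
    if not 1 <= game_id <= MAX_GAME_ID:
        # Tornado turns this exception into a 400 response.
        raise tornado.web.HTTPError(400, "game id %d is out of range", game_id)
    return game_id


class GetGameByIdHandler(tornado.web.RequestHandler):
    def get(self, id):
        response = {'id': parse_game_id(id),
                    'name': 'Crazy Game',
                    'release_date': date.today().isoformat()}
        self.write(response)
```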

If you need to access additional request parameters such as the headers and body data, you can access them through self.request. This variable is a tornado.httpserver.HTTPRequest instance that provides all the information about the HTTP request. The HTTPRequest class is defined in httpserver.py.
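As a quick illustration of self.request, the following sketch echoes parts of a POST request back to the client. EchoHandler and the /echo route are hypothetical names that are not part of the previous listings, and returning a 400 status code for an invalid body is my own choice.

```python
import tornado.escape
import tornado.web


class EchoHandler(tornado.web.RequestHandler):
    def post(self):
        try:
            # self.request.body holds the raw bytes sent by the client.
            payload = tornado.escape.json_decode(self.request.body)
        except ValueError:
            raise tornado.web.HTTPError(400, "request body is not valid JSON")
        self.write({
            'method': self.request.method,
            # Header lookups on self.request.headers are case-insensitive.
            'user_agent': self.request.headers.get("User-Agent", ""),
            'received': payload,
        })


application = tornado.web.Application([
    (r"/echo", EchoHandler),
])
```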

Working with Asynchronous Code

If you’ve ever worked with callbacks, you know how difficult it is to read and understand code split into different methods. Tornado provides a generator-based interface (tornado.gen) to enable you to write asynchronous code in handlers in a single generator.

You simply need to use the @tornado.gen.coroutine decorator for asynchronous generators in the required method and you don’t need to add the @tornado.web.asynchronous decorator. Listing Three shows a new subclass of tornado.web.RequestHandler and defines the get() method with the @tornado.gen.coroutine decorator. You need to add two imports to add the code to the previous listing: import tornado.gen and import tornado.httpclient.

Listing Three: A different subclass of tornado.web.RequestHandler.

class GetFullPageAsyncHandler(tornado.web.RequestHandler):
    @tornado.gen.coroutine
    def get(self):
        http_client = tornado.httpclient.AsyncHTTPClient()
        http_response = yield http_client.fetch("http://www.drdobbs.com/web-development")
        response = http_response.body.decode().replace(
            "Most Recent Premium Content", "Most Recent Content")
        self.write(response)
        self.set_header("Content-Type", "text/html")

Because I’ve added a new subclass of RequestHandler, it is necessary to map the URL pattern in tornado.web.Application. Listing Four shows the new code that maps the /getfullpage URL to the GetFullPageAsyncHandler.

Listing Four: Mapping a URL to a handler.

application = tornado.web.Application([
    (r"/getfullpage", GetFullPageAsyncHandler),
    (r"/getgamebyid/([0-9]+)", GetGameByIdHandler),
    (r"/version", VersionHandler),            
])

The GetFullPageAsyncHandler.get method creates a tornado.httpclient.AsyncHTTPClient instance (http_client) that represents a non-blocking HTTP client. Then, the code calls the http_client.fetch method of this instance to asynchronously execute a request. The fetch method returns a Future, whose result is an HTTPResponse, and raises an HTTPError if the request returns a response code other than 200. The code uses the yield keyword to retrieve the HTTPResponse from the Future returned by the fetch method.

The call to fetch retrieves the Dr. Dobb’s Web Development page from http://www.drdobbs.com/web-development with an asynchronous execution. When fetch finishes its execution with a successful response code equal to 200, http_response will be an HTTPResponse instance with the contents of the retrieved HTML page in http_response.body. The method continues its execution with the line after the call to fetch. You have all the code that needs to be executed in the get method with the @tornado.gen.coroutine decorator, and you don’t have to worry about writing a callback for on_fetch. The next line decodes the response body to a string and replaces “Most Recent Premium Content” with “Most Recent Content.” Then, the code calls the self.write method to write the modified string and sets the Content-Type of the response to text/html.
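Because fetch raises an HTTPError when the remote server answers with an error status, the coroutine can handle that case with a regular try/except block around the yield. The following sketch shows one way to do it. GetFullPageSafeHandler and its upstream_url class attribute are hypothetical names, and mapping the failure to a 502 status code is my own choice. Other failures, such as connection problems, may surface as different exceptions that this sketch doesn't handle.

```python
import tornado.gen
import tornado.httpclient
import tornado.web


class GetFullPageSafeHandler(tornado.web.RequestHandler):
    # Hypothetical class attribute, so the URL isn't hard-coded inside get().
    upstream_url = "http://www.drdobbs.com/web-development"

    @tornado.gen.coroutine
    def get(self):
        http_client = tornado.httpclient.AsyncHTTPClient()
        try:
            http_response = yield http_client.fetch(self.upstream_url)
        except tornado.httpclient.HTTPError as error:
            # The remote server answered with an error status code.
            raise tornado.web.HTTPError(
                502, "upstream returned status %s", error.code)
        self.set_header("Content-Type", "text/html")
        self.write(http_response.body)
```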

Listing Five is the equivalent code, which uses the @tornado.web.asynchronous decorator instead of @tornado.gen.coroutine. In this case, it is necessary to define the on_fetch method that works as the callback for the http_client.fetch method; therefore, the code is split into two methods.

Listing Five: The equivalent functionality using the @tornado.web.asynchronous decorator and a callback.

class GetFullPageAsyncNewHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def get(self):
        http_client = tornado.httpclient.AsyncHTTPClient()
        http_client.fetch("http://www.drdobbs.com/web-development", callback=self.on_fetch)

    def on_fetch(self, http_response):
        if http_response.error: raise tornado.web.HTTPError(500)
        response = http_response.body.decode().replace("Most Recent Premium Content", "Most Recent Content")
        self.write(response)
        self.set_header("Content-Type", "text/html")
        self.finish()

When the fetch method finishes retrieving the content, it executes the code in on_fetch. Because the get method uses @tornado.web.asynchronous, it is your responsibility to call self.finish() to finish the HTTP request. Thus, the on_fetch method calls self.finish in its last line, after calling self.write and self.set_header. As you can see, it is much easier to use the @tornado.gen.coroutine decorator.

Understanding How Tornado Works with a RequestHandler Subclass

The RequestHandler class defines a SUPPORTED_METHODS class variable with the following code. If you need support for different methods, you need to override the SUPPORTED_METHODS class variable in your RequestHandler subclass:

SUPPORTED_METHODS = ("GET", "HEAD", "POST", "DELETE", "PATCH", "PUT", "OPTIONS")
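For example, a subclass could extend the tuple with an additional verb. In this sketch, CacheHandler and the PURGE verb are hypothetical; I'm assuming that Tornado looks up the handler method by the lowercase name of the HTTP method, so the request would be handled by purge().

```python
import tornado.web


class CacheHandler(tornado.web.RequestHandler):
    # Keep the default verbs and add the hypothetical PURGE verb.
    SUPPORTED_METHODS = tornado.web.RequestHandler.SUPPORTED_METHODS + ("PURGE",)

    def purge(self):
        # Called for PURGE requests, under the assumption stated above.
        self.write({'purged': True})
```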

The default code for the head(), get(), post(), delete(), patch(), put(), and options() methods is a single line that raises an HTTPError. Listing Six shows the code for the get method:

Listing Six: The get() method.

def get(self, *args, **kwargs):
    raise HTTPError(405)

Whenever the Web application receives a request and matches the URL pattern, Tornado performs the following actions:

1. It creates a new instance of the RequestHandler subclass that has been mapped to the URL pattern.
2. It calls the initialize method with the keyword arguments specified in the application configuration. You can override the initialize method to save the arguments into member variables.
3. No matter what the HTTP request, Tornado calls the prepare method. If you call either finish or send_error in prepare, Tornado won’t call the method for the HTTP request. You can override the prepare method to execute code that is necessary for any HTTP request, then write your specific code in the head(), get(), post(), delete(), patch(), put() or options() method.
4. It calls the method according to the HTTP request with the arguments based on the URL regular expression that captured the different groups. As you already know, you must override the methods you want your RequestHandler subclass to be able to process. For example, if there is an HTTP GET request, Tornado will call the get method with the different arguments.
5. If the handler is synchronous, Tornado calls on_finish after the method called for the HTTP request returns. But if the handler is asynchronous, Tornado executes on_finish after the code calls finish. The previous asynchronous example showed the usage of finish. You can override the on_finish method to perform cleanup or logging. Notice that Tornado calls on_finish after it sends the response to the client.

If the client closes the connection in asynchronous handlers, Tornado calls on_connection_close. You can override this method to clean up resources in this specific scenario. However, the cleanup after processing the request must be included in the on_finish method.
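The order of these calls is easy to observe with a handler that records each step. The following sketch uses a hypothetical LifecycleHandler that appends the name of each overridden method to a shared list named calls. Both the list and the /lifecycle route are assumptions made for illustration.

```python
import tornado.web


class LifecycleHandler(tornado.web.RequestHandler):
    def initialize(self, calls):
        # Receives the keyword argument defined in the application configuration.
        self.calls = calls
        self.calls.append("initialize")

    def prepare(self):
        self.calls.append("prepare")

    def get(self):
        self.calls.append("get")
        self.write({'ok': True})

    def on_finish(self):
        self.calls.append("on_finish")


calls = []
application = tornado.web.Application([
    (r"/lifecycle", LifecycleHandler, dict(calls=calls)),
])
```

After a single GET request to /lifecycle, I would expect calls to contain initialize, prepare, get and on_finish, in that order.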

Listing Seven shows a new version of the GetGameByIdHandler class that overrides the initialize method to receive a string specified in the application configuration. The initialize method just saves common_string into a member variable, then the get method uses the string in the response:

Listing Seven: Overriding the initialize() method.

class GetGameByIdHandler(tornado.web.RequestHandler):
    def initialize(self, common_string):
        self.common_string = common_string

    def get(self, id):
        response = { 'id': int(id),
                     'name': 'Crazy Game',
                     'release_date': date.today().isoformat(),
                     'common_string': self.common_string }
        self.write(response)

The following code shows the changes in the arguments passed to the tornado.web.Application constructor to pass a value for common_string in a dictionary for the GetGameByIdHandler request handler:

application = tornado.web.Application([
    (r"/getfullpage", GetFullPageAsyncHandler),
    (r"/getgamebyid/([0-9]+)", GetGameByIdHandler,
     dict(common_string='Value defined in Application')),
    (r"/version", VersionHandler),
])

In this case, I’ve used a simple string as the value passed from the application. However, the most common usage will be to pass one or more common objects (for example, a database object).
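To give an idea of that usage, the following sketch passes a shared object instead of a string. GameStore is a hypothetical in-memory stand-in for a database object, and both its find method and the store keyword argument are names I chose for illustration.

```python
import tornado.web


class GameStore:
    """Hypothetical in-memory stand-in for a database object."""

    def __init__(self):
        self._games = {500: 'Crazy Game'}

    def find(self, game_id):
        name = self._games.get(game_id)
        return None if name is None else {'id': game_id, 'name': name}


class GetGameByIdHandler(tornado.web.RequestHandler):
    def initialize(self, store):
        # Every handler instance receives the same store object.
        self.store = store

    def get(self, id):
        game = self.store.find(int(id))
        if game is None:
            raise tornado.web.HTTPError(404)
        self.write(game)


store = GameStore()
application = tornado.web.Application([
    (r"/getgamebyid/([0-9]+)", GetGameByIdHandler, dict(store=store)),
])
```

Tornado creates a new handler instance for each request, so the shared object is the natural place for state that must survive between requests.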

Returning Errors in a Request Handler

Listing Eight shows the code for the ErrorHandler request handler that demonstrates the simplest use of the three different mechanisms to return errors in a handler method.

Listing Eight: A simple error request handler.

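class ErrorHandler(tornado.web.RequestHandler):
    def get(self, error_code):
        # The captured URL group arrives as a string, so convert it before comparing.
        error_code = int(error_code)
        if error_code == 1:
            self.set_status(500)
        elif error_code == 2:
            self.send_error(500)
        else:
            raise tornado.web.HTTPError(500)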

It is necessary to add the appropriate mapping to the Application constructor parameters with the following line:

(r"/error/([0-9]+)", ErrorHandler),

If error_code is equal to 1, the get method calls self.set_status, which sets the status code for the response. It is also possible to specify a reason string as the second parameter. If error_code is equal to 2, the get method calls self.send_error, which sends the specified HTTP error code to the browser. Any other error_code value will make the get method raise a tornado.web.HTTPError exception with the 500 status code.

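Both self.send_error and a raised tornado.web.HTTPError return the default error page, which you can change by overriding the write_error method in your RequestHandler subclasses. Calling self.set_status on its own only sets the status code; the response body is whatever the handler writes, which is nothing in Listing Eight.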
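For a RESTful API, a JSON error body is usually more convenient than an HTML page. The following sketch overrides write_error in a hypothetical JsonErrorHandler base class. The shape of the JSON document is my own choice, and the message comes from the standard library's http.client.responses table.

```python
import http.client

import tornado.web


class JsonErrorHandler(tornado.web.RequestHandler):
    def write_error(self, status_code, **kwargs):
        # Replace the default HTML error page with a small JSON document.
        message = http.client.responses.get(status_code, "Unknown")
        self.finish({'error': {'code': status_code, 'message': message}})


class ErrorHandler(JsonErrorHandler):
    def get(self, error_code):
        if int(error_code) == 2:
            self.send_error(500)
        else:
            raise tornado.web.HTTPError(500)


application = tornado.web.Application([
    (r"/error/([0-9]+)", ErrorHandler),
])
```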

Conclusion

In this article, I’ve provided many examples to show how easy it is to start developing RESTful APIs with Tornado. I’ve covered basic features, but you will definitely want to dive deeper into Tornado’s additional useful features when building a complete RESTful API. Tornado’s source code comes with good documentation for the different methods and variables, so you can easily build the methods you need to use as you write code.