Posts Marcados mongrel2
Wsgid is a generic handler for the mongrel2 web server. It acts as a bridge between WSGI and mongrel2 and makes it possible to run any WSGI python application using mongrel2 as the front end. To know more about both projects visit their official websites at: http://wsgid.com and http://mongrel2.org
In this post we will see how to configure mongrel2 and wsgid so your application will be able to handle arbitrary sized requests very easily and with a low memory usage.
The problem of receiving big requests is that depending on how your front end server deals with it, you can easily become out of resources and in a worst case scenario your application can stop responding for a while. If your application exposes any POST endpoint, then you should be prepared to handle some big requests along the way.
The big picture
Mongrel2 has a very clever way to handle big requests without consuming all your server’s resources, in fact without consuming almost nothing, only bandwidth. Basically what it does is to dump the entire request to disk, using almost none additional memory. Here is how it works:
When the request comes in, mongrel2 sends a special message to the back-end handler containing only the original headers. This message notifies the start of this request. It’s up to the handler to accept or deny it. To accept the request, your handler just do nothing. If you want to deny the request, your handler must send a 0-length message back to mongrel2. Due to mongrel2 async model, you can send this deny message at any time, does not need to be imediatelly after receiving the request start message.
If you happen to be using wsgid (I hope you are!) this happens without your WSGI application ever knowing. All your application will see (if at all) is the wsgi.input attribute of the WSGI environ (if you happen to write a WSGI framework) or just request.POST (or something similar depending on what framework you’re using). It’s all transparent.
After the request is completely dumped, mongrel2 sends another message notifying the end of the upload and that’s when wsgid will actually call your application. During all the upload process, your application does not even know that a new request has come and was being handled.
All this happens while mongrel2 is already dumping the request content to disk. If your handler happen to deny the request, mongrel2 will close the connection and remove the temporary file. To know more about how mongrel2 notifies the handlers, you can read the on-line manual: http://mongrel2.org/static/mongrel2-manual.html.
Configuring mongrel2 to handle big requests
The configuration of the server is extremely simple. All you have to do is tell it where to dump any big request that may arrive. You may ask: How big a request must be to be dumped to disk? The truth is that you decide this and tell mongrel2 about it.
To tell it where to dump the request you must set the option upload.temp_store to the correct path. This path must include the filename template. An example could be: /uploads/m2.XXXXXX. This must be a mkstemp(3) compatible template. See the manpage for more details.
The size of the request is set with the limits.content_length. This sets the biggest request mongrel2 will handle without dumping to disk. This size is set in bytes.
Configuring wsgid to understand what mongrel2 is saying
Since mongrel is saving requests on disk wsgid must be able to open these files and pass its contents to the application being run. It’s important to note that the path chosen on the upload.temp_store is always relative to where your mongrel2 is chrooted, so somehow we must tell wsgid where this is.
Fortunately this new release of wsgid comes with a new command line option: --mongrel2-chroot. You just have to pass this to your wsgid instance to be able to handle big requests.
Alternatively you can add this options to your wsgid.json configuration file if you want. This can be complished with a simple command:
$ wsgid config --app-path=/path/to/your/wsgid-app --mongrel2-chroot=/path/to/mongrel2/chroot
This will save this new option on your config file. Just restart your wsgid instance and it will re-read the config file.
Working without knowing where mongrel2 is chrooted
There is another way to handle these big requests. In this approach your don’t need to pass --mongrel2-chroot to all your wsgid instances. The trick here is to mount the same device on many different places. The easiest way to do this is to have mongrel2’s temp_store in a separated partition (or logical volume if you are using LVM). Let’s see an example:
Suppose we have mongrel2 chrooted at /var/mongrel2/ and configured this way:
upload.temp_store = '/uploads/m2.XXXXXX'
Since mongrel2 assumes /uploads is relative to its chroot we must mount this device ate the right place.
# mount /dev/vg01/uploads /var/mongrel2/uploads
Now our logical volume is mounted at /var/mongrel2/uploads. There is one last setp so we can start serving big requests with wsgid. When mongrel2 sends the upload started message to wsgid, which contains the path of the temporary file, the path received by wsgid is the same we put on mongrel2’s config. So wsgid will try to read /uploads/m2.384Tdg (for example) and obviously will fail. So we need a way to make /uploas/ also available to wsgid (that normally is not chrooted anywhere) and the trick to do this is to re-mount the same device at a different place. And this is how we do it:
# mount /dev/vg01/uploads /uploads
So now we have the same device mongrel2 writes the temporary files available to all wsgid processes that need to read them. Remember that mongrel2 does not remove any of these file because obviously it does not know when your app is done. So it’s up to you to clean them.
If your /upload directory is part of a bigger volume or is not on a separated one, you still can accomplish this multi-mount approach. Just use the --bind. This way you can re-mount any directory at another place. Read the mount man page for more details.
This is another cool mongrel2 feature that the latest wsgid release (0.5.0) already supports. So now wsgid supports all major features mongrel2 provides and is starting to be more mature, trying to find its way into a production-ready tool.
Thanks for reading and enjoy!
Some time ago when people talked about webapp deployment they were probably talking about the LAMP stack. Since the appearance of the nginx webserver this has changed quite a bit. In this post you will know another stack, that does not use neither apache nor nginx but that is equally interesting and quite scalable too. We will be talking about mongrel2 for the frontend and wsgid for the backend worker processes.