HOWTO setup Trac with Mercurial and Nginx

Development September 7th, 2009

“A goal is a dream with a deadline.” - Napoleon Hill

Besides Mercurial, my preferred DVCS, a bug-tracking and project management tool is needed to manage the timeline. Trac is my all-time favorite, and it supports Mercurial via the TracMercurial plugin.

Before we move on, we need to rebuild python-2.6 to restore some missing modules that rely on C extensions:

# Install the following development package to build python modules
sudo yum install bzip2-devel readline-devel gdbm-devel bsddb-devel ncurses-devel db4-devel sqlite-devel
# Rebuild python-2.6
./configure --prefix=/opt/python
make && sudo make install

Install Trac and TracMercurial

sudo /opt/python/bin/easy_install http://svn.edgewall.org/repos/trac/tags/trac-0.11
PATH=/opt/python/bin:$PATH sudo trac-admin /var/trac/bloggo/ initenv

When prompted for the repository type, enter hg instead of the default svn.

Then we install the TracMercurial plugin globally so that it can be reused by other projects later:

svn co http://svn.edgewall.com/repos/trac/sandbox/mercurial-plugin-0.11
cd mercurial-plugin-0.11
python setup.py bdist_egg
sudo /opt/python/bin/easy_install dist/TracMercurial-0.11.0.7-py2.6.egg

Now it is time to run Trac in TracStandalone mode (tracd) to verify the configuration on the Trac side.

Configure Nginx and FastCGI
The Trac instance is loaded via spawn-fcgi, like the other web applications, using this script. The PATH is overridden to use python-2.6, and TRAC_ENV_PARENT_DIR is exported explicitly. I suspect this may contaminate other Trac instances; I will fix it later if so.
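The spawn script itself is not reproduced here, so what follows is only a rough sketch of what such a launcher does. It is written in Python and builds the command line and environment without starting anything. The path /path/to/trac.fcgi and the helper name build_spawn are placeholders of mine; port 9004 and /var/trac come from the configuration in this post.

```python
import os


def build_spawn(trac_fcgi, parent_dir="/var/trac", port=9004,
                python_bin="/opt/python/bin", base_env=None):
    """Return (argv, env) for launching Trac's FastCGI front end via spawn-fcgi."""
    env = dict(os.environ if base_env is None else base_env)
    # Put the python-2.6 directory first so `python` resolves to that build.
    env["PATH"] = python_bin + os.pathsep + env.get("PATH", "")
    # Trac serves every environment found under this directory.
    env["TRAC_ENV_PARENT_DIR"] = parent_dir
    argv = ["spawn-fcgi", "-a", "127.0.0.1", "-p", str(port), "--", trac_fcgi]
    return argv, env


# /path/to/trac.fcgi is a hypothetical location for the Trac FastCGI script.
argv, env = build_spawn("/path/to/trac.fcgi", base_env={"PATH": "/usr/bin:/bin"})
print(" ".join(argv))
print(env["PATH"])
# To actually launch it: subprocess.check_call(argv, env=env)
```

Since the environment is handed only to the spawned process, the overridden PATH and TRAC_ENV_PARENT_DIR should stay out of services started separately. Every Trac instance served by that same FastCGI process still shares them.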

Multiple Trac instances are served under the /projects/ path, and we need HTTP Basic Auth to protect the login entry point.

location ~ /projects/[^/]+/login {
auth_basic "Trac";
auth_basic_user_file "/var/trac/devpasswd";

fastcgi_split_path_info ^(/projects)(/.*)$;
fastcgi_param PATH_INFO $fastcgi_path_info;
fastcgi_param AUTH_USER $remote_user;
fastcgi_param REMOTE_USER $remote_user;
include fastcgi_params;
fastcgi_pass 127.0.0.1:9004;
}

location /projects/ {
fastcgi_split_path_info ^(/projects)(/.*)$;
fastcgi_param PATH_INFO $fastcgi_path_info;
fastcgi_param AUTH_USER $remote_user;
fastcgi_param REMOTE_USER $remote_user;
include fastcgi_params;
fastcgi_pass 127.0.0.1:9004;
}

Any thoughts on how to refactor the configuration to eliminate the duplication?

HOWTO setup Mercurial with Nginx on CentOS 4

Development September 5th, 2009

As my VPS provider, Advantagecom Networks, graciously honored the lifetime promotion I signed up for a year ago, I decided to stay with this provider. That also means I am stuck with the current configuration: python-2.3.4 on CentOS 4.

Unlike Gentoo, CentOS offers no shortcut to upgrade seamlessly from CentOS 4 to CentOS 5. Some adventurous pilots tried and failed miserably, and customer service advised against this approach. I wish I could reinstall the OS from an SSH session. So we fall back to plan B: update only the essential components.

Update Python

wget http://www.python.org/ftp/python/2.6.2/Python-2.6.2.tar.bz2
tar xvfj Python-2.6.2.tar.bz2
cd Python-2.6.2
./configure --prefix=/opt/python
make && sudo make install

Update setuptools for python-2.6
Temporarily override the PATH to use python-2.6:

wget http://pypi.python.org/packages/2.6/s/setuptools/setuptools-0.6c9-py2.6.egg#md5=ca37b1ff16fa2ede6e19383e7b59245a
PATH=/opt/python/bin:$PATH sudo sh setuptools-0.6c9-py2.6.egg

By default, easy_install is installed into /opt/python/bin.

Install Mercurial and flup

sudo /opt/python/bin/easy_install mercurial
sudo /opt/python/bin/easy_install flup

Setup the Mercurial repository

# Setup the user privilege
sudo /usr/sbin/groupadd hg
sudo /usr/sbin/useradd -g hg -s /bin/false hg
sudo mkdir /var/hg
sudo chown hg:hg /var/hg
sudo chmod g+w /var/hg
# Add myself to hg group
sudo /usr/sbin/usermod -G hg bookstack
hg init /var/hg/bloggo

Serve Mercurial via flup
Copy the fastcgi script hgwebdir.fcgi from /usr/share/doc/mercurial-1.3.1/contrib/, and configure the hgweb.config:

[paths]
/ = /var/hg/**

[web]
style = monoblue
allow_push = *
push_ssl = false
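
Before exposing the repository it helps to spell out what these settings mean. Here is a small pre-flight check in Python, a sketch only. It parses the file with the standard-library configparser, which copes with this simple file but is not Mercurial's own parser, and the helper name check_hgweb is mine.

```python
import configparser

HGWEB_CONFIG = """\
[paths]
/ = /var/hg/**

[web]
style = monoblue
allow_push = *
push_ssl = false
"""


def check_hgweb(text):
    """Return human-readable notes about what the hgweb settings allow."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    notes = []
    for prefix, pattern in parser.items("paths"):
        notes.append("URL prefix %r serves repositories matching %s" % (prefix, pattern))
    if parser.get("web", "allow_push", fallback="") == "*":
        notes.append("allow_push = *: any client may push")
    # Mercurial requires SSL for pushes unless push_ssl is turned off.
    if not parser.getboolean("web", "push_ssl", fallback=True):
        notes.append("push_ssl = false: pushes are accepted over plain HTTP")
    return notes


for note in check_hgweb(HGWEB_CONFIG):
    print(note)
```

With both push settings relaxed like this, hgweb itself does not restrict pushes, so any access control has to come from the web server in front of it.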

Spawn the fastcgi script using this script.
Note: the PATH environment variable needs to be overridden so that /opt/python/bin comes first and python-2.6 is used.

Install Nginx
Just follow this HOWTO.

Now we need to set up Nginx to expose the Mercurial repository in nginx.conf:

# HG
server {
listen 80;
server_name dev.kunxi.org;
root /var/web/$host;
access_log logs/$host.access.log main;
error_log logs/$host.error.log;

location /hg/ {
fastcgi_split_path_info ^(/hg)(/.*)$;
fastcgi_param PATH_INFO $fastcgi_path_info;
include fastcgi_params;
limit_except GET HEAD {
deny all;
}
fastcgi_pass 127.0.0.1:9003;
}
}

The limit_except directive in Nginx does not seem to work as expected. If we add auth_basic inside the limit_except block, Nginx asks for the credentials first, then tries to serve static content from /hg/project_name instead of passing the request to the underlying fastcgi process. I encountered a similar problem in my MoinMoin setup, so I simply deny every method other than GET and HEAD, POST included, and push over SSH instead. This is certainly not a perfect solution, but it works fine for a personal source depot.

Authorize the REST web service

Python, Web April 19th, 2009

Once we step into REST territory, the session-based AuthenticationMiddleware is no longer an option because it violates the stateless principle. Digest authentication seems to be one of the very few options left in this situation. The basic concept is that the client and server share a secret key; the client signs the HTTP request, and the server validates the signature before any further operation.

There is no off-the-shelf digest authentication middleware available yet, so let’s roll up our sleeves and home-brew our own or, more specifically, shamelessly copy the S3 (authentication spec) Python library with slight simplifications:

The fields covered by the signature are cut to five: HTTP verb, Content-Length, Content-Type, Date and the body. The HTTP verb and content type specify the REST request; the date prevents man-in-the-middle replay attacks. The signature is then computed over the stream and placed in the Authorization header.
Update: the URL is also essential; a man-in-the-middle may record a REST operation against one entry point and replay it against another.
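
To make the signing step concrete, here is a minimal Python sketch of the client side. It is not the pattee code. The field order, the newline separator, HMAC-SHA1 with base64 output, and the "DEMO" scheme name are all assumptions loosely modeled on the S3 style, and sign_request is a name of mine.

```python
import base64
import hashlib
import hmac


def sign_request(access_key, verb, url, content_length, content_type, date, body_chunks):
    """Sign the verb, URL, selected headers and the streamed body with HMAC-SHA1."""
    header = "\n".join([verb.upper(), url, str(content_length), content_type, date]) + "\n"
    mac = hmac.new(access_key.encode("utf-8"), header.encode("utf-8"), hashlib.sha1)
    # The body is fed in chunk by chunk, so it never has to sit in memory at once.
    for chunk in body_chunks:
        mac.update(chunk)
    return base64.b64encode(mac.digest()).decode("ascii")


body = b'{"title": "hello"}'
signature = sign_request("demo-key", "PUT", "/entries/1", len(body),
                         "application/json", "Sun, 19 Apr 2009 12:00:00 GMT",
                         body_chunks=[body])
# "DEMO" and "demo-id" are hypothetical; the header carries the access id, never the key.
print("Authorization: DEMO demo-id:" + signature)
```

Since the URL is part of the signed text, a request captured for /entries/1 yields a different signature from one for /entries/2. That is what blocks the cross-entry replay described in the update.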

It is also pretty straightforward to wrap the digest authentication up as a middleware: create a new model named Token with an access_id and access_key pair, plus User as a foreign key. We can then just copy AuthenticationMiddleware and override the get_user method to integrate the digest validation. Check revisions 21 to 22 of pattee for more details if you are interested.
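
The server-side check that such a get_user override would run can be sketched without Django. In the sketch below a plain dict stands in for the Token model. The "DEMO" scheme name, the 15-minute freshness window, and the authenticate helper are assumptions for illustration, not what pattee does.

```python
import base64
import hashlib
import hmac
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

TOKENS = {"demo-id": "demo-key"}   # stand-in for Token rows: access_id -> access_key
MAX_SKEW = timedelta(minutes=15)   # assumed freshness window for the Date header


def authenticate(authorization, date_header, signed_text, now=None):
    """Return the access_id when the signature and date check out, else None."""
    now = now or datetime.now(timezone.utc)
    try:
        scheme, credentials = authorization.split(" ", 1)
        access_id, signature = credentials.rsplit(":", 1)
        sent_at = parsedate_to_datetime(date_header)
    except (ValueError, TypeError):
        return None
    access_key = TOKENS.get(access_id)
    if scheme != "DEMO" or access_key is None or sent_at.tzinfo is None:
        return None
    if abs(now - sent_at) > MAX_SKEW:
        return None  # a stale Date header is treated as a replay
    mac = hmac.new(access_key.encode("utf-8"), signed_text, hashlib.sha1)
    expected = base64.b64encode(mac.digest())
    # Constant-time comparison avoids leaking how many characters matched.
    if hmac.compare_digest(expected, signature.encode("utf-8")):
        return access_id
    return None
```

In a real middleware the returned access_id would be used to look up the Token and hand back its User. signed_text is whatever byte string the client signed, rebuilt on the server from the incoming request.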

There is another issue worthy of our attention: both digest and session-based authentication are required, since the latter is the gatekeeper for the admin interface that manages the tokens used by the former. But they may not play well together: the server-side resources spent on cookie management are totally wasted on REST requests, and furthermore, stolen cookies leave the door open to security exploits. We will discuss this issue later; stay tuned.

Book Review: Django 1.0 Template Development

Development, Python, Web March 12th, 2009

This is not a paid review: I did not receive a dime from the author, the publisher or their affiliates. However, I did get a free copy of the book, so some harsh criticism may be sugar-coated. Read with caution.

Though I am a die-hard RTFM guy, it never hurts to take advantage of the expertise of peers. Django 1.0 Template Development focuses on a relatively narrow topic, the template system of Django. The author puts himself in a dilemma: he expects the target audience to have a basic idea of the Django system, but still has to go through all the hassle (not much, really) of kick-starting a new project to make the book self-contained. I think the author did a great job for beginners, but I still highly recommend the official DjangoBook, tutorial and documentation.

After a two-chapter warm-up, Chapter 3 shows the magic of Context and RequestContext. It is interesting to see how the project evolves from low-level operations to the shortcuts. Chapter 4 introduces the built-in tags in the toolbox. Chapters 5 and 6 demonstrate template inheritance and how multiple templates are served. In Chapter 7, developers can extend their toolbox by creating new filters and tags. Chapter 9 gives a series of examples of admin UI customization. Chapters 8 and 10 are about performance, pagination and caching. Last but not least, L10N in Chapter 11.

The examples in each chapter are atomic for easy understanding, but in my humble opinion, most of them do not impress upon readers the power of Django. The framework shines when solving BIG and complex problems. I just wonder what would happen if the author started a much more ambitious project with a complicated specification, then decomposed it into small tasks and addressed them chapter by chapter to make the point, just like Dive into Python does.

Furthermore, I would appreciate it if the author could share more first-hand experience with readers. Engineering is always about problem-solving. The framework is naturally easy to learn; otherwise, why bother? But it may suck big time if it does not scale. Any real-world case would help build confidence for wider adoption.

The bottom line: a good book for beginners, and some chapters are quite beefy for the topic. As Django is a fast-evolving project, I hope the author will bring more juicy examples in a future edition.

How to PUT a file in Django

Python, Web January 28th, 2009

Once we decide to go for PUT instead of POST, we step out of the comfort zone of Django: there is no mapped form field and no validation, and we have to deal with the raw WSGI interface ourselves. Still, we can use django.core.files.File.

If we dig into the source code, django.core.files.File defines open, close, read, tell, seek, flush and some Django-specific operations such as chunks, readlines and xreadlines. Ticket #8501 glues File and the file object together when the chunks method is missing.

Interestingly, the interface File exposes explicitly requires the underlying file object to support random access, which is most likely overkill for general use. Sometimes, less is more. It also implicitly expects read to return EOF, which is not the case for wsgi.input. So we end up brewing our own:

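from django.core.files import File

class SocketFile(File):
    # Only forward access is allowed
    def __init__(self, socket, size):
        super(SocketFile, self).__init__(socket)
        self._size = int(size)  # CONTENT_LENGTH of the request
        self._pos = 0

    def read(self, num_bytes=None):
        # Never read past CONTENT_LENGTH; wsgi.input does not signal EOF
        remaining = self._size - self._pos
        if num_bytes is None:
            num_bytes = remaining
        else:
            num_bytes = min(num_bytes, remaining)
        data = self.file.read(num_bytes)
        # Advance by what was actually read, in case the stream came up short
        self._pos += len(data)
        return data

    def tell(self):
        return self._pos

    def seek(self, position):
        # Random access is not supported on a socket stream; ignore the call
        pass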

The SocketFile object is initialized with the length of the socket file object, i.e. CONTENT_LENGTH, and the read method guards each call so that it returns EOF once that many bytes have been consumed. seek is inherited from File, so we simply turn it into a no-op. Just wrap the raw wsgi.input with SocketFile and use it as a File. Please check views.py for the usage.
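
The views.py is not reproduced here, so below is a Django-free sketch of the same idea that can be run on its own. BoundedReader is a stand-in name of mine for SocketFile, and the environ dict only imitates what a WSGI server would pass in.

```python
import io


class BoundedReader:
    """Forward-only reader that reports EOF after `size` bytes."""

    def __init__(self, stream, size):
        self.stream = stream
        self.remaining = int(size)

    def read(self, num_bytes=None):
        # Clamp every read to the bytes still owed by CONTENT_LENGTH.
        if num_bytes is None or num_bytes > self.remaining:
            num_bytes = self.remaining
        data = self.stream.read(num_bytes)
        self.remaining -= len(data)
        return data

    def chunks(self, chunk_size=64 * 1024):
        # Yield the body piece by piece until the bounded read returns nothing.
        while True:
            data = self.read(chunk_size)
            if not data:
                break
            yield data


# The stream deliberately holds more bytes than CONTENT_LENGTH declares.
environ = {"CONTENT_LENGTH": "11", "wsgi.input": io.BytesIO(b"hello worldEXTRA")}
upload = BoundedReader(environ["wsgi.input"], environ["CONTENT_LENGTH"])
received = b"".join(upload.chunks(chunk_size=4))
print(received)  # b'hello world'
```

The trailing bytes are never touched. That matters because an unbounded read on a real socket-backed wsgi.input can block waiting for data the client never sends.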