SaltyCrane Blog — Notes on JavaScript and web development

Using Python to write to an Excel / OpenOffice Calc spreadsheet on Ubuntu Linux

Via Matt Harrison's blog post, here is how to write Excel or OpenOffice.org Calc spreadsheet files using Python and the xlwt library. Xlwt is a fork of pyExcelerator that handles only writing spreadsheet files; for reading spreadsheets, see the companion xlrd library. Note: these libraries don't use COM, so they work on non-Windows OSes such as Linux. For more information, see Matt's blog post. He even has a PDF cheat sheet.

  • Install pip
  • Install xlwt
    sudo pip install xlwt
  • Create an example script:
    import xlwt
    
    DATA = (("The Essential Calvin and Hobbes", 1988,),
            ("The Authoritative Calvin and Hobbes", 1990,),
            ("The Indispensable Calvin and Hobbes", 1992,),
            ("Attack of the Deranged Mutant Killer Monster Snow Goons", 1992,),
            ("The Days Are Just Packed", 1993,),
            ("Homicidal Psycho Jungle Cat", 1994,),
            ("There's Treasure Everywhere", 1996,),
            ("It's a Magical World", 1996,),)
    
    wb = xlwt.Workbook()
    ws = wb.add_sheet("My Sheet")
    for i, row in enumerate(DATA):
        for j, col in enumerate(row):
            ws.write(i, j, col)
    ws.col(0).width = 256 * max([len(row[0]) for row in DATA])  # width unit is 1/256 of the '0' character width
    wb.save("myworkbook.xls")
    
  • Results: myworkbook.xls opens in OpenOffice.org Calc with the titles in column A and the years in column B, with column A sized to fit the longest title.
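
To read the file back, the companion xlrd library works similarly. Here is a minimal sketch, assuming xlrd is installed (e.g. via sudo pip install xlrd):

import xlrd

book = xlrd.open_workbook("myworkbook.xls")
sheet = book.sheet_by_name("My Sheet")
for i in range(sheet.nrows):
    # numeric cells come back as floats, so cast the year to int
    print sheet.cell_value(i, 0), int(sheet.cell_value(i, 1))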

How to install pip on Ubuntu

Pip is a better alternative to Easy Install for installing Python packages. It is most "nutritious" when used with its companion virtualenv. For more information on pip and virtualenv see my blog post: Notes on using pip and virtualenv with Django.

Install pip and virtualenv for Ubuntu 10.10 Maverick and newer

$ sudo apt-get install python-pip python-dev build-essential 
$ sudo pip install --upgrade pip 
$ sudo pip install --upgrade virtualenv 

For older versions of Ubuntu

  • Install Easy Install
    $ sudo apt-get install python-setuptools python-dev build-essential 
    
  • Install pip
    $ sudo easy_install pip 
    
  • Install virtualenv
    $ sudo pip install --upgrade virtualenv 
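
Once pip and virtualenv are installed by either method above, a typical workflow looks like this (a sketch; "myenv" and the Markdown package are arbitrary examples):

$ virtualenv --no-site-packages myenv
$ source myenv/bin/activate
(myenv)$ pip install Markdown
(myenv)$ deactivate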
    

Python setdefault example

I always forget how to use Python's setdefault dictionary operation so here is a quick example.

What I want:

DATA_SOURCE = (('key1', 'value1'),
               ('key1', 'value2'),
               ('key2', 'value3'),
               ('key2', 'value4'),
               ('key2', 'value5'),)

newdata = {}
for k, v in DATA_SOURCE:
    if k in newdata:
        newdata[k].append(v)
    else:
        newdata[k] = [v]
print newdata

Results:

{'key2': ['value3', 'value4', 'value5'], 'key1': ['value1', 'value2']}

Better way using setdefault:

newdata = {}
for k, v in DATA_SOURCE:
    newdata.setdefault(k, []).append(v)
print newdata

The results are the same.
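
For grouping like this, collections.defaultdict is another option; it supplies the empty list automatically on first access to a missing key. A minimal sketch:

from collections import defaultdict

newdata = defaultdict(list)  # missing keys get a fresh empty list
for k, v in DATA_SOURCE:
    newdata[k].append(v)
print dict(newdata)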

A hack to copy files between two remote hosts using Python

I sometimes need to copy a file (such as a database dump) between two remote hosts on EC2. Normally this involves a few steps: scp'ing the ssh keyfile to Host 1, ssh'ing to Host 1, looking up the address for Host 2, then scp'ing the desired file from Host 1 to Host 2.

I was excited to read in the man page that scp can copy files between two remote hosts directly. However, it didn't work for me. Apparently, running scp host1:myfile host2: is like running ssh host1 scp myfile host2:, so I still need the address of host2 and my ssh keyfile on host1.

My inability to let go of this small efficiency gain led me to (what else?) write a Python script. I know this is a hack, so if you know of a better way of doing this, let me know.

The script parses my ~/.ssh/config file to find the ssh keyfile and address for host 2, uses scp to copy the ssh keyfile to host 1, then runs the ssh host1 scp ... command with the appropriate options filled in. The script captures all of the ssh options for host 2 and passes them on the command line to scp via the -o option. Note: I have only tested this with the User option; I don't know if all ssh options will work.

Warning: the script disables the StrictHostKeyChecking SSH option, so you are more vulnerable to a man-in-the-middle attack.

Update 2010-02-16: I've found there is already an SSH config file parser in the paramiko library. The source can be viewed on github.

Update 2010-05-04: I modified my code to use the paramiko library and also allow command line options to be passed directly to the scp command. The latest code is available in my github repository remote-tools.
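
For reference, here is a minimal sketch of the paramiko approach (assuming a paramiko version that exposes SSHConfig this way; newer versions may return identityfile as a list):

import os
import paramiko

config = paramiko.SSHConfig()
config.parse(open(os.path.expanduser('~/.ssh/config')))
options = config.lookup('testhost2')
print options  # a dict with keys like 'hostname', 'user', 'identityfile'

The original hand-rolled version of the script is below: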

import itertools
import os
import re
import sys

SSH_CONFIG_FILE = '/home/saltycrane/.ssh/config'

def main():
    host1, path1 = sys.argv[1].split(':', 1)
    host2, path2 = sys.argv[2].split(':', 1)

    o = get_ssh_options(host2)
    keyfile_remote = '/tmp/%s' % os.path.basename(o['identityfile'])
    ssh_options = ' -o'.join(['='.join([k, v]) for k, v in o.iteritems()
                              if k != 'hostname' and k != 'identityfile'])

    run('scp %s %s:%s' % (o['identityfile'], host1, keyfile_remote))
    run('ssh %s scp -p -i %s -oStrictHostKeyChecking=no -o%s %s %s:%s' % (
            host1, keyfile_remote, ssh_options, path1, o['hostname'], path2))

def get_ssh_options(host):
    """Parse ~/.ssh/config file and return a dict of ssh options for host
    Note: dict keys and values are all lowercased
    """
    def remove_comment(line):
        return re.sub(r'#.*$', '', line)
    def get_value(line, key_arg):
        m = re.search(r'^\s*%s\s+(.+)\s*$' % key_arg, line, re.I)
        if m:
            return m.group(1)
        else:
            return ''
    def not_the_host(line):
        return get_value(line, 'Host') != host
    def not_a_host(line):
        return get_value(line, 'Host') == ''

    lines = [line.strip() for line in file(SSH_CONFIG_FILE)]
    comments_removed = [remove_comment(line) for line in lines]
    blanks_removed = [line for line in comments_removed if line]
    top_removed = list(itertools.dropwhile(not_the_host, blanks_removed))[1:]
    goodpart = itertools.takewhile(not_a_host, top_removed)
    return dict([line.lower().split(None, 1) for line in goodpart])

def run(cmd):
    print cmd
    os.system(cmd)

if __name__ == '__main__':
    main()

Here is an example ~/.ssh/config file:

Host testhost1
  User root
  Hostname 48.879.24.567
  IdentityFile /home/saltycrane/.ssh/test_keyfile

Host testhost2
  User root
  Hostname 56.384.58.212
  IdentityFile /home/saltycrane/.ssh/test_keyfile

Here is an example run. It copies /tmp/testfile from testhost1 to the same path on testhost2.

python scp_r2r.py testhost1:/tmp/testfile testhost2:/tmp/testfile

Here is the console output:

scp /home/saltycrane/.ssh/test_keyfile testhost1:/tmp/test_keyfile
test_keyfile                                              100% 1674     1.6KB/s   00:00
ssh testhost1 scp -p -i /tmp/test_keyfile -oStrictHostKeyChecking=no -ouser=root /tmp/testfile 56.384.58.212:/tmp/testfile

One inconvenience is that it doesn't show the progress for the main transfer. If anyone knows how I can fix this, please let me know.

Iterating over lines in multiple Linux log files using Python

I needed to parse through my Nginx log files to debug a problem. However, the logs are separated into many files, most of them are gzipped, and I wanted the ordering within the files reversed. So I abstracted the logic to handle this into a function. Now I can pass a glob pattern such as /var/log/nginx/cache.log* to my function, and iterate over each line in all the files as if they were one file. Here is my function. Let me know if there is a better way to do this.

Update 2010-02-24: To handle multiple log files on a remote host, see my script on github.

import glob
import gzip
import re
 
def get_lines(log_glob):
    """Return an iterator of each line in all files matching log_glob.
    Lines are sorted most recent first.
    Files are sorted by the integer in the suffix of the log filename.
    Suffix may be one of the following:
         .X (where X is an integer)
         .X.gz (where X is an integer)
    If the filename does not end in either suffix, it is treated as if X=0
    """
    def sort_by_suffix(a, b):
        def get_suffix(fname):
            m = re.search(r'.(?:\.(\d+))?(?:\.gz)?$', fname)
            if m.lastindex:
                suf = int(m.group(1))
            else:
                suf = 0
            return suf
        return get_suffix(a) - get_suffix(b)
 
    filelist = glob.glob(log_glob)
    for filename in sorted(filelist, sort_by_suffix):
        if filename.endswith('.gz'):
            fh = gzip.open(filename)
        else:
            fh = open(filename)
        for line in reversed(fh.readlines()):
            yield line
        fh.close()

Here is an example run on my machine. It prints the first 15 characters of every 1000th line of all my syslog files.

for i, line in enumerate(get_lines('/var/log/syslog*')):
    if not i % 1000:
        print line[:15]

File listing:

$ ls -l /var/log/syslog*
-rw-r----- 1 syslog adm 169965 2010 01/23 00:18 /var/log/syslog
-rw-r----- 1 syslog adm 350334 2010 01/22 08:03 /var/log/syslog.1
-rw-r----- 1 syslog adm  18078 2010 01/21 07:49 /var/log/syslog.2.gz
-rw-r----- 1 syslog adm  16700 2010 01/20 07:43 /var/log/syslog.3.gz
-rw-r----- 1 syslog adm  18197 2010 01/19 07:52 /var/log/syslog.4.gz
-rw-r----- 1 syslog adm  15737 2010 01/18 07:45 /var/log/syslog.5.gz
-rw-r----- 1 syslog adm  16157 2010 01/17 07:54 /var/log/syslog.6.gz
-rw-r----- 1 syslog adm  20285 2010 01/16 07:48 /var/log/syslog.7.gz

Results:

Jan 22 23:57:01
Jan 22 14:09:01
Jan 22 03:51:01
Jan 21 17:35:01
Jan 21 14:37:33
Jan 21 08:35:01
Jan 20 22:12:01
Jan 20 11:56:01
Jan 20 01:41:01
Jan 19 15:18:01
Jan 19 04:53:01
Jan 18 18:35:01
Jan 18 08:40:01
Jan 17 22:10:01
Jan 17 11:32:01
Jan 17 01:05:01
Jan 16 14:27:01
Jan 16 04:01:01
Jan 15 17:25:01
Jan 15 08:50:01

Wmii Python script to monitor remote machines

I like to monitor our web servers by ssh'ing into the remote machine and watching "top", tailing log files, etc. Normally, I open a terminal, ssh into the remote machine, run the monitoring command (e.g. "top"), then repeat for the rest of the remote machines. Then I adjust the window sizes so I can see everything at once.

My window manager, wmii, is great for tiling a bunch of windows at once. It is also scriptable with Python, so I wrote a Python script to create my web server monitoring view. Below is my script. I also put a video on YouTube.

#!/usr/bin/env python

import os
import time

NGINX_MONITOR_CMD = "tail --follow=name /var/log/nginx/cache.log | grep --color -E '(HIT|MISS|EXPIRED|STALE|UPDATING|\*\*\*)'"
APACHE_MONITOR_CMD = "top"
MYSQL_MONITOR_CMD = "mysqladmin extended -i10 -r | grep -i 'questions\|aborted_clients\|opened_tables\|slow_queries\|threads_created' "

CMDS_COL1 = ['urxvt -title "Nginx 1" -e ssh -t us-ng1 "%s" &' % NGINX_MONITOR_CMD,
             'urxvt -title "Nginx 2" -e ssh -t us-ng2 "%s" &' % NGINX_MONITOR_CMD,
             ]
CMDS_COL2 = ['urxvt -title "Apache 1" -e ssh -t us-med1 "%s" &' % APACHE_MONITOR_CMD,
             'urxvt -title "Apache 2" -e ssh -t us-med2 "%s" &' % APACHE_MONITOR_CMD,
             'urxvt -title "Apache 3" -e ssh -t us-med3 "%s" &' % APACHE_MONITOR_CMD,
             ]
CMDS_COL3 = ['urxvt -title "MySQL 1" -e ssh -t us-my1 "%s" &' % MYSQL_MONITOR_CMD,
             'urxvt -title "MySQL 2" -e ssh -t us-my2 "%s" &' % MYSQL_MONITOR_CMD,
             ]
COLUMNS = [CMDS_COL1, CMDS_COL2, CMDS_COL3]

def create_windows():
    for i, col in enumerate(COLUMNS):
        cindex = str(i+1)
        for cmd in col:
            os.system(cmd)
            time.sleep(1)
            os.system('wmiir xwrite /tag/sel/ctl send sel %s' % cindex)
        os.system('wmiir xwrite /tag/sel/ctl colmode %s default-max' % cindex)
    os.system('wmii.py 45.5 31.5 23')

if __name__ == '__main__':
    create_windows()

Note 1: The script above uses another script I wrote previously, wmii.py, to set the column widths.

Note 2: The remote server addresses are specified by the nicknames us-ng1, us-ng2, us-med1, etc. configured in my ~/.ssh/config file as described here.

Note 3 (on using ssh and top): I first tried doing ssh host top, but this gave me a "TERM environment variable not set." error. I then tried ssh host "export TERM=rxvt-unicode; top", but this gave me a "top: failed tty get" error. The solution that worked for me was to use the -t option with ssh, e.g. ssh -t host top. This is what I used in the script above.

Note 4 (added 2010-03-05): I used "tail --follow=name" instead of "tail -f" so that tail will follow the log file even after it has been rotated. For more information, see the man page for tail.

Note 5 (added 2010-03-05): To prevent your ssh session from timing out, add the following 2 lines to your ~/.ssh/config file (via):

Host *
  ServerAliveInterval 60

Trying out a Retry decorator in Python

The Python wiki has a Retry decorator example which retries calling a failure-prone function using an exponential backoff algorithm. I modified it slightly to check for exceptions instead of a False return value to indicate failure. Each time the decorated function throws an exception, the decorator waits a period of time and retries calling the function until the maximum number of tries is used up. If the decorated function still fails on the last try, the exception propagates unhandled.

import time
from functools import wraps


def retry(ExceptionToCheck, tries=4, delay=3, backoff=2, logger=None):
    """Retry calling the decorated function using an exponential backoff.

    http://www.saltycrane.com/blog/2009/11/trying-out-retry-decorator-python/
    original from: http://wiki.python.org/moin/PythonDecoratorLibrary#Retry

    :param ExceptionToCheck: the exception to check. may be a tuple of
        exceptions to check
    :type ExceptionToCheck: Exception or tuple
    :param tries: number of times to try (not retry) before giving up
    :type tries: int
    :param delay: initial delay between retries in seconds
    :type delay: int
    :param backoff: backoff multiplier e.g. value of 2 will double the delay
        each retry
    :type backoff: int
    :param logger: logger to use. If None, print
    :type logger: logging.Logger instance
    """
    def deco_retry(f):

        @wraps(f)
        def f_retry(*args, **kwargs):
            mtries, mdelay = tries, delay
            while mtries > 1:
                try:
                    return f(*args, **kwargs)
                except ExceptionToCheck, e:
                    msg = "%s, Retrying in %d seconds..." % (str(e), mdelay)
                    if logger:
                        logger.warning(msg)
                    else:
                        print msg
                    time.sleep(mdelay)
                    mtries -= 1
                    mdelay *= backoff
            return f(*args, **kwargs)

        return f_retry  # true decorator

    return deco_retry

Try an "always fail" case

@retry(Exception, tries=4)
def test_fail(text):
    raise Exception("Fail")

test_fail("it works!")

Results:

Fail, Retrying in 3 seconds...
Fail, Retrying in 6 seconds...
Fail, Retrying in 12 seconds...
Traceback (most recent call last):
  File "retry_decorator.py", line 47, in <module>
    test_fail("it works!")
  File "retry_decorator.py", line 26, in f_retry
    f(*args, **kwargs)
  File "retry_decorator.py", line 33, in test_fail
    raise Exception("Fail")
Exception: Fail

Try a "success" case

@retry(Exception, tries=4)
def test_success(text):
    print "Success: ", text

test_success("it works!")

Results:

Success:  it works!

Try a "random fail" case

import random

@retry(Exception, tries=4)
def test_random(text):
    x = random.random()
    if x < 0.5:
        raise Exception("Fail")
    else:
        print "Success: ", text

test_random("it works!")

Results:

Fail, Retrying in 3 seconds...
Success:  it works!

Try handling multiple exceptions

Added 2010-04-27

import random

@retry((NameError, IOError), tries=20, delay=1, backoff=1)
def test_multiple_exceptions():
    x = random.random()
    if x < 0.40:
        raise NameError("NameError")
    elif x < 0.80:
        raise IOError("IOError")
    else:
        raise KeyError("KeyError")

test_multiple_exceptions()

Results:

IOError, Retrying in 1 seconds...
NameError, Retrying in 1 seconds...
IOError, Retrying in 1 seconds...
IOError, Retrying in 1 seconds...
NameError, Retrying in 1 seconds...
IOError, Retrying in 1 seconds...
NameError, Retrying in 1 seconds...
NameError, Retrying in 1 seconds...
NameError, Retrying in 1 seconds...
IOError, Retrying in 1 seconds...
Traceback (most recent call last):
  File "retry_decorator.py", line 61, in <module>
    test_multiple_exceptions()
  File "retry_decorator.py", line 14, in f_retry
    f(*args, **kwargs)
  File "retry_decorator.py", line 56, in test_multiple_exceptions
    raise KeyError("KeyError")
KeyError: 'KeyError'

Unit tests

Added 2013-01-22. Note: Python 2.7 is required to run the tests.

import logging
import unittest

from decorators import retry


class RetryableError(Exception):
    pass


class AnotherRetryableError(Exception):
    pass


class UnexpectedError(Exception):
    pass


class RetryTestCase(unittest.TestCase):

    def test_no_retry_required(self):
        self.counter = 0

        @retry(RetryableError, tries=4, delay=0.1)
        def succeeds():
            self.counter += 1
            return 'success'

        r = succeeds()

        self.assertEqual(r, 'success')
        self.assertEqual(self.counter, 1)

    def test_retries_once(self):
        self.counter = 0

        @retry(RetryableError, tries=4, delay=0.1)
        def fails_once():
            self.counter += 1
            if self.counter < 2:
                raise RetryableError('failed')
            else:
                return 'success'

        r = fails_once()
        self.assertEqual(r, 'success')
        self.assertEqual(self.counter, 2)

    def test_limit_is_reached(self):
        self.counter = 0

        @retry(RetryableError, tries=4, delay=0.1)
        def always_fails():
            self.counter += 1
            raise RetryableError('failed')

        with self.assertRaises(RetryableError):
            always_fails()
        self.assertEqual(self.counter, 4)

    def test_multiple_exception_types(self):
        self.counter = 0

        @retry((RetryableError, AnotherRetryableError), tries=4, delay=0.1)
        def raise_multiple_exceptions():
            self.counter += 1
            if self.counter == 1:
                raise RetryableError('a retryable error')
            elif self.counter == 2:
                raise AnotherRetryableError('another retryable error')
            else:
                return 'success'

        r = raise_multiple_exceptions()
        self.assertEqual(r, 'success')
        self.assertEqual(self.counter, 3)

    def test_unexpected_exception_does_not_retry(self):

        @retry(RetryableError, tries=4, delay=0.1)
        def raise_unexpected_error():
            raise UnexpectedError('unexpected error')

        with self.assertRaises(UnexpectedError):
            raise_unexpected_error()

    def test_using_a_logger(self):
        self.counter = 0

        sh = logging.StreamHandler()
        logger = logging.getLogger(__name__)
        logger.addHandler(sh)

        @retry(RetryableError, tries=4, delay=0.1, logger=logger)
        def fails_once():
            self.counter += 1
            if self.counter < 2:
                raise RetryableError('failed')
            else:
                return 'success'

        fails_once()


if __name__ == '__main__':
    unittest.main()

Code / License

This code is also on github at: https://github.com/saltycrane/retry-decorator. It is BSD licensed.

Using Nginx as a caching proxy with Wordpress+Apache

We have been evaluating caching reverse proxy servers at work. We looked at Nginx+memcached, Squid, and Varnish. Most recently, we found that Nginx version 0.7 supports caching proxied content using the proxy_cache directive in the NginxHttpProxyModule. This allows us to use Nginx as a caching proxy without having to handle the complication (or flexibility, depending on how you look at it) of setting and invalidating the cache as with the Nginx+memcached setup. Here are my notes for setting it up with an Apache+Wordpress backend.

Update 2010-01-05: Over a couple months, we switched to Nginx 0.8 and we made a few tweaks to our Nginx configuration. Here is our updated conf file: nginx_wordpress_100105.conf.

Install Nginx 0.7

The version of Nginx in the Ubuntu repositories is older, so we used a PPA created by Jeff Waugh: https://launchpad.net/~jdub/+archive/ppa. (He also has a development PPA which contains Nginx 0.8.)

  • Add the following to /etc/apt/sources.list:
    deb http://ppa.launchpad.net/jdub/ppa/ubuntu hardy main 
    deb-src http://ppa.launchpad.net/jdub/ppa/ubuntu hardy main
  • Tell Ubuntu how to authenticate the PPA
    apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E9EEF4A1

    Alternatively, if the keyserver is down, you can follow the instructions for copying the public key from http://forum.nginx.org/read.php?2,5177,11272.

  • Install Nginx from new PPA
    apt-get update
    apt-get install nginx
  • Check the version of Nginx
    nginx -V
    nginx version: nginx/0.7.62
    configure arguments: --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --pid-path=/var/run/nginx.pid --lock-path=/var/lock/nginx.lock --http-log-path=/var/log/nginx/access.log --http-client-body-temp-path=/var/lib/nginx/body --http-proxy-temp-path=/var/lib/nginx/proxy --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --with-debug --with-http_stub_status_module --with-http_flv_module --with-http_ssl_module --with-http_dav_module --with-http_gzip_static_module --with-ipv6 --with-http_realip_module --with-http_xslt_module --with-http_image_filter_module --with-sha1=/usr/include/openssl

Configure Nginx cache logging

Within the http {} block, add:

    log_format cache '***$time_local '
                     '$upstream_cache_status '
                     'Cache-Control: $upstream_http_cache_control '
                     'Expires: $upstream_http_expires '
                     '"$request" ($status) '
                     '"$http_user_agent" ';
    access_log  /var/log/nginx/cache.log cache;

Nginx configuration for backend servers

Within the http {} block, add:

    include /etc/nginx/app-servers.include;

And /etc/nginx/app-servers.include looks like:

upstream backend {
    ip_hash;

    server 10.245.275.88:80;
    server 10.292.150.34:80;
}

Configure cache path/parameters

Within the http {} block, add:

    proxy_cache_path /var/www/nginx_cache levels=1:2
                     keys_zone=one:10m
                     inactive=7d max_size=200m;
    proxy_temp_path /var/www/nginx_temp;

More proxy cache configuration

We added the username from the wordpress_logged_in_* cookie as part of the cache key so that different logged-in users will get the appropriate page from the cache. However, our Wordpress configuration sends HTTP headers disabling the cache when a user is logged in, so this is actually not used. But it does not hurt to include this, in case we change our Wordpress configuration in the future.

Within the server {} block, add:

        location / {
            # capture cookie for use in cache key
            if ($http_cookie ~* "wordpress_logged_in_[^=]*=([^%]+)%7C") {
                set $my_cookie $1;
            }

            proxy_pass http://backend;
            proxy_cache one;
            proxy_cache_key $scheme$proxy_host$uri$is_args$args$my_cookie;
            proxy_cache_valid  200 302 304 10m;
            proxy_cache_valid  301 1h;
            proxy_cache_valid  any 1m;
        }

Configure locations that shouldn't be cached

If WordPress sends the appropriate HTTP Cache-Control headers, this step is not necessary. But we have added it to be on the safe side. Within the server {} block, add:

        location /wp-admin { proxy_pass http://backend; }
        location /wp-login.php { proxy_pass http://backend; }

Restart Nginx

The Nginx reverse proxy cache should work without modification to the Apache configuration. In our case, we had to disable WP Super Cache because we had been using that previously.

/etc/init.d/nginx restart

View the log

Check /var/log/nginx/cache.log to see if everything is working correctly. The log should display HIT, MISS, and EXPIRED appropriately. If the log shows only misses, check the Cache-Control and Expires HTTP headers that are sent from Apache+Wordpress.

Example Apache/Wordpress configuration that disabled the Nginx cache

Part of the WP Super Cache configuration included the following in the .htaccess file. It had to be removed for Nginx to cache the pages. (In particular, the must-revalidate part had to be removed.)

     Header set Cache-Control 'max-age=300, must-revalidate'
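
If the pages should still send cache headers while remaining cacheable by Nginx, a header without the must-revalidate token should work (a sketch; tune max-age as needed):

     Header set Cache-Control 'max-age=300'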

How to make urxvt look like gnome-terminal

My terminal of choice is rxvt-unicode (urxvt) because it is fast and lightweight. However, I recently opened up gnome-terminal and it was so much prettier than my urxvt. Here's how I made my urxvt look like gnome-terminal. The last step involves compiling urxvt from source because the latest source includes a patch to configure horizontal spacing of letters.

Set up colors

Add the following to your ~/.Xdefaults file:

! to match gnome-terminal "Linux console" scheme
! foreground/background
URxvt*background: #000000
URxvt*foreground: #ffffff
! black
URxvt.color0  : #000000
URxvt.color8  : #555555
! red
URxvt.color1  : #AA0000
URxvt.color9  : #FF5555
! green
URxvt.color2  : #00AA00
URxvt.color10 : #55FF55
! yellow
URxvt.color3  : #AA5500
URxvt.color11 : #FFFF55
! blue
URxvt.color4  : #0000AA
URxvt.color12 : #5555FF
! magenta
URxvt.color5  : #AA00AA
URxvt.color13 : #FF55FF
! cyan
URxvt.color6  : #00AAAA
URxvt.color14 : #55FFFF
! white
URxvt.color7  : #AAAAAA
URxvt.color15 : #FFFFFF

Select font

Also add the following to your ~/.Xdefaults file:

URxvt*font: xft:Monospace:pixelsize=11

Don't use a bold font

Also add the following to your ~/.Xdefaults file:

URxvt*boldFont: xft:Monospace:pixelsize=11

Fix urxvt font width

This is the most difficult thing to fix. It requires installing urxvt from CVS source.

  • Install prerequisites:
    apt-get build-dep rxvt-unicode
  • Get CVS source code:
    cvs -z3 -d :pserver:anonymous@cvs.schmorp.de/schmorpforge co rxvt-unicode
  • Configure:
    cd rxvt-unicode
    ./configure --prefix=/home/saltycrane/lib/rxvt-unicode-20091102
  • Make & make install:
    make
    make install
  • Link urxvt executable to your ~/bin directory:
    cd ~/bin
    ln -s ../lib/rxvt-unicode-20091102/bin/urxvt .
  • Edit ~/.Xdefaults once again:
    URxvt*letterSpace: -1

Also cool: Open links in Firefox

Here is another trick (thanks to Zachary Tatlock) to make clicked URLs open in your Firefox browser. Add the following to your ~/.Xdefaults (yes, there's Perl in your urxvt!):

URxvt.perl-ext-common : default,matcher
URxvt.urlLauncher     : firefox
URxvt.matcher.button  : 1


Screenshots

Urxvt (default):

ugly urxvt screenshot

Gnome-terminal:

gnome-terminal screenshot

Urxvt (modified):

pretty urxvt screenshot

If you're interested, here is how I printed the terminal colors:

#!/bin/bash
echo -e "\\e[0mCOLOR_NC (No color)"
echo -e "\\e[1;37mCOLOR_WHITE\\t\\e[0;30mCOLOR_BLACK"
echo -e "\\e[0;34mCOLOR_BLUE\\t\\e[1;34mCOLOR_LIGHT_BLUE"
echo -e "\\e[0;32mCOLOR_GREEN\\t\\e[1;32mCOLOR_LIGHT_GREEN"
echo -e "\\e[0;36mCOLOR_CYAN\\t\\e[1;36mCOLOR_LIGHT_CYAN"
echo -e "\\e[0;31mCOLOR_RED\\t\\e[1;31mCOLOR_LIGHT_RED"
echo -e "\\e[0;35mCOLOR_PURPLE\\t\\e[1;35mCOLOR_LIGHT_PURPLE"
echo -e "\\e[0;33mCOLOR_YELLOW\\t\\e[1;33mCOLOR_LIGHT_YELLOW"
echo -e "\\e[1;30mCOLOR_GRAY\\t\\e[0;37mCOLOR_LIGHT_GRAY"

Notes on switching my Djangos to mod_wsgi

I'm slowly trying to make my Django web servers conform to current best practices. I've set up an Nginx reverse proxy for serving static files, started using virtualenv to isolate my Python environments, and migrated my database to PostgreSQL. I ultimately want to implement memcached+Nginx caching in my reverse proxy, but the next task on my to-do list is switching from mod_python to mod_wsgi.

Within the past year (or maybe before), mod_wsgi has become the preferred method for serving Django applications. I also originally thought switching from mod_python to mod_wsgi would save me some much needed memory on my 256MB VPS. But after trying it out, running with a single Apache process in each case, the memory footprint was about the same. Even switching from mod_wsgi's embedded mode to daemon mode didn't make a significant difference. Likely the performance is better with mod_wsgi, though.

Here are my notes on installing mod_wsgi.

Configuration References

Advice from mod_wsgi author Graham Dumpleton

Install mod_wsgi and apache mpm-worker

I'm not 100% sure about prefork vs. worker mpm, but Graham Dumpleton favors worker mpm.

sudo apt-get install libapache2-mod-wsgi
sudo apt-get install apache2-mpm-worker

Create .wsgi application file

My virtualenv is located at /srv/python-environments/saltycrane. My Django settings file is at /srv/SaltyCrane/iwiwdsmi/settings.py.

/srv/SaltyCrane/saltycrane.wsgi:

import os
import sys
import site

site.addsitedir('/srv/python-environments/saltycrane/lib/python2.5/site-packages')

os.environ['DJANGO_SETTINGS_MODULE'] = 'iwiwdsmi.settings'

sys.path.append('/srv/SaltyCrane')

import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()

Edit Apache's httpd.conf file

I went back and forth between using embedded mode and daemon mode. I've ended up with embedded mode for now since it seems to use a tad less memory and is supposed to be a little bit faster. However, Graham Dumpleton seems to recommend daemon mode for people on VPSs, so I may change my mind again later. To use daemon mode, I just need to uncomment the WSGIDaemonProcess and WSGIProcessGroup lines. I have StartServers set to 1 because I can only afford to have one Apache process running. This assumes Nginx is proxying requests to Apache. For more on my nginx setup, see here.

Edit /etc/apache2/httpd.conf:

<IfModule mpm_worker_module>
    StartServers 1
    ServerLimit 1
    ThreadsPerChild 5
    ThreadLimit 5
    MinSpareThreads 5
    MaxSpareThreads 5
    MaxClients 5
    MaxRequestsPerChild 500
</IfModule>

KeepAlive Off
NameVirtualHost 127.0.0.1:8080
Listen 8080

<VirtualHost 127.0.0.1:8080>
    ServerName www.saltycrane.com
    # WSGIDaemonProcess saltycrane.com processes=1 threads=5 display-name=%{GROUP}
    # WSGIProcessGroup saltycrane.com
    WSGIScriptAlias / /srv/SaltyCrane/saltycrane.wsgi
</VirtualHost>

<VirtualHost 127.0.0.1:8080>
    ServerName supafu.com
    # WSGIDaemonProcess supafu.com processes=1 threads=5 display-name=%{GROUP}
    # WSGIProcessGroup supafu.com
    WSGIScriptAlias / /srv/Supafu/supafu.wsgi
</VirtualHost>

<VirtualHost 127.0.0.1:8080>
    ServerName handsoncards.com
    # WSGIDaemonProcess handsoncards.com processes=1 threads=5 display-name=%{GROUP}
    # WSGIProcessGroup handsoncards.com
    WSGIScriptAlias / /srv/HandsOnCards/handsoncards.wsgi
</VirtualHost>

Restart Apache

sudo /etc/init.d/apache2 restart