Options for listing the files in a directory with Python
I do a lot of sysadmin-type work with Python, so I often need to list the contents of a directory on a filesystem. Here are 4 methods I've used so far to do that. Let me know if you have any good alternatives. The examples were run on my Ubuntu Karmic machine.
OPTION 1 - os.listdir()
This is probably the simplest way to list the contents of a directory in Python.
import os
dirlist = os.listdir("/usr")
from pprint import pprint
pprint(dirlist)
Results:
['lib', 'shareFeisty', 'src', 'bin', 'local', 'X11R6', 'lib64', 'sbin', 'share', 'include', 'lib32', 'man', 'games']
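As the results show, os.listdir() returns bare names (no directory prefix) in no particular order. Here is a minimal sketch of one way to get sorted full paths split into directories and files. The helper name list_entries is made up for illustration and is not part of the os module.

```python
import os

def list_entries(path):
    """Return (dirs, files) as sorted lists of full paths directly under path."""
    dirs, files = [], []
    # os.listdir() returns bare names in arbitrary order, so sort them first
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        # os.path.isdir() follows symlinks, so a link to a directory
        # is counted as a directory here
        if os.path.isdir(full):
            dirs.append(full)
        else:
            files.append(full)
    return dirs, files

if __name__ == "__main__":
    dirs, files = list_entries(".")
    print(dirs)
    print(files)
```

Anything that is not a directory (regular files, broken symlinks, sockets and so on) ends up in the second list, which may or may not be what you want.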
OPTION 2 - glob.glob()
This method allows you to use shell-style wildcards.
import glob
dirlist = glob.glob('/usr/*')
from pprint import pprint
pprint(dirlist)
Results:
['/usr/lib', '/usr/shareFeisty', '/usr/src', '/usr/bin', '/usr/local', '/usr/X11R6', '/usr/lib64', '/usr/sbin', '/usr/share', '/usr/include', '/usr/lib32', '/usr/man', '/usr/games']
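The example above only uses a bare *, so here is a rough sketch of a few other shell-style patterns. The wrapper glob_dir is a hypothetical helper, not something from the glob module, and glob.escape() assumes Python 3.4 or later.

```python
import glob
import os

def glob_dir(directory, pattern):
    """Return sorted paths in directory whose names match a shell-style pattern."""
    # glob.escape() keeps wildcard characters in the directory name itself
    # from being treated as part of the pattern (Python 3.4+)
    full_pattern = os.path.join(glob.escape(directory), pattern)
    # glob.glob() returns matches in arbitrary order; sorted() makes it stable
    return sorted(glob.glob(full_pattern))

if __name__ == "__main__":
    print(glob_dir('/usr', 'lib*'))    # names starting with "lib"
    print(glob_dir('/usr', '[a-m]*'))  # names starting with a letter from a to m
    print(glob_dir('/usr', '???'))     # names exactly three characters long
```

Like the shell, glob skips names that start with a dot unless the pattern itself starts with a dot.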
OPTION 3 - Unix "ls" command using subprocess
This method uses your operating system's "ls" command. It allows you to sort the output based on modification time, file size, etc. by passing these command-line options to the "ls" command. The following example lists the 10 most recently modified files in /var/log:
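from subprocess import Popen, PIPE
from pprint import pprint

def listdir_shell(path, *lsargs):
    # Options go before the path so this also works with a non-GNU "ls"
    cmd = ('ls',) + lsargs + (path,)
    # universal_newlines=True makes stdout return str instead of bytes on Python 3
    p = Popen(cmd, shell=False, stdout=PIPE, close_fds=True,
              universal_newlines=True)
    out, _ = p.communicate()
    return out.splitlines()

dirlist = listdir_shell('/var/log', '-t')[:10]
pprint(dirlist)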
Results:
['auth.log', 'syslog', 'dpkg.log', 'messages', 'user.log', 'daemon.log', 'debug', 'kern.log', 'munin', 'mysql.log']
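If shelling out is not an option (on Windows, for example), the same "newest first" ordering can be approximated in pure Python. This is only a sketch, not a replacement for everything "ls" can do, and the function name listdir_by_mtime is invented for this example.

```python
import os

def listdir_by_mtime(path, limit=None):
    """Return names in path, most recently modified first (roughly 'ls -t')."""
    names = os.listdir(path)

    def mtime(name):
        # lstat() reports a symlink's own timestamp rather than its target's
        return os.lstat(os.path.join(path, name)).st_mtime

    names.sort(key=mtime, reverse=True)
    # A limit of None returns the whole list
    return names[:limit]
```

Calling listdir_by_mtime('/var/log', 10) corresponds to the "ls -t" example, with two differences: os.listdir() also returns dotfiles, which "ls" hides by default, and a file deleted between the listing and the lstat() call will raise OSError.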
OPTION 4 - Unix "find" style using os.walk
This method allows you to list directory contents recursively in a manner similar to the Unix "find" command. It uses Python's os.walk.
import os
from pprint import pprint

def unix_find(pathin):
    """Return results similar to the Unix find command run without options
    i.e. traverse a directory tree and return all the file paths
    """
    return [os.path.join(path, file)
            for (path, dirs, files) in os.walk(pathin)
            for file in files]

pathlist = unix_find('/etc')[-10:]
pprint(pathlist)
Results:
['/etc/fonts/conf.avail/20-lohit-gujarati.conf', '/etc/fonts/conf.avail/69-language-selector-zh-mo.conf', '/etc/fonts/conf.avail/11-lcd-filter-lcddefault.conf', '/etc/cron.weekly/0anacron', '/etc/cron.weekly/cvs', '/etc/cron.weekly/popularity-contest', '/etc/cron.weekly/man-db', '/etc/cron.weekly/apt-xapian-index', '/etc/cron.weekly/sysklogd', '/etc/cron.weekly/.placeholder']
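To get something closer to "find -name", the walk can be combined with the fnmatch module. This is a rough sketch, and find_by_name is a made-up name rather than a standard function. It is written as a generator so a large tree does not have to be held in memory all at once.

```python
import fnmatch
import os

def find_by_name(top, pattern):
    """Yield paths of files under top whose names match a shell-style pattern."""
    for path, dirs, files in os.walk(top):
        # fnmatch.filter() keeps only the names that match the pattern
        for name in fnmatch.filter(files, pattern):
            yield os.path.join(path, name)
```

For example, list(find_by_name('/etc', '*.conf')) would collect every .conf file under /etc. By default os.walk() silently skips directories it cannot read, so the results can be incomplete when run as a non-root user.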
Comments
Adding a regexp to your option #1 is a quick way to get python's re module into play when sh regexps won't cut it:
import os, pprint, re
pat = re.compile(r".+\d.+")
dirlist = list(filter(pat.match, os.listdir("/usr/local")))
pprint.pprint(dirlist)
gives me (on my FreeBSD box)
['diablo-jdk1.6.0',
'netbeans68',
'openoffice.org-3.2.0',
'i386-portbld-freebsd7.3']
Keith: That's a good tip. I will give it a try the next time I get a chance. Thanks!
...and how about an easy way for listing contents of a WEB directory? Could any of the above techniques be used?
I'm just learning python for my job and this has been a really useful reference page for me!! I realise it's only really useful for one thing - but the methods you've shown are perfect for particular types of directory listings in my code ;).
I recently started learning Python and I love your blog. I'm constantly looking for best practices and "solved" problems.
I'm also just learning python for my job and this has been a really useful reference page for me.
I hope you can post more about system administration on both Unix and Windows.
Keep up the good work man ;)
How do I get files from three different directories in reverse order? Please give me an idea.