Migrating Excel data to SQLite using Python
In a previous post, I described how I designed a SQLite relational database from an Excel table. It was a small example, so I hardcoded the data into the Python script. For my actual problem, I need to convert my Excel data into a SQLite database automatically. To do this, I used the win32com
module and the sqlite3
module included in Python 2.5.
Here is the table from my previous post. It shows some variables in in my C program. It shows the variable name, type, the module it belongs to, and a short description. Here is the table from my previous post. It shows some variables in in my C program. It shows the variable name, type, the module it belongs to, and a short description.
id | name | module | type | desc |
1 | foo | ModuleExt | double | Description of foo |
2 | bar | ModuleExt | double | Description of bar |
3 | knark | Module1 | int | Description of knark |
4 | wert | Module1 | double | Description of wert |
5 | jib | Module1 | double | Description of jib |
6 | laz | Module2 | double | Description of laz |
7 | kew | Module2 | double | Description of kew |
After installing the win32com
module from http://sourceforge.net/project/platformdownload.php?group_id=78018, I used the following code.
import os
import sqlite3
from win32com.client import constants, Dispatch
#----------------------------------------
# get data from excel file
#----------------------------------------
XLS_FILE = os.getcwd() + "\\example.xls"
ROW_SPAN = (14, 21)
COL_SPAN = (2, 7)
app = Dispatch("Excel.Application")
app.Visible = True
ws = app.Workbooks.Open(XLS_FILE).Sheets(1)
exceldata = [[ws.Cells(row, col).Value
for col in xrange(COL_SPAN[0], COL_SPAN[1])]
for row in xrange(ROW_SPAN[0], ROW_SPAN[1])]
#----------------------------------------
# create SQL table and fill it with data
#----------------------------------------
conn = sqlite3.connect('example.db')
c = conn.cursor()
c.execute('''CREATE TABLE exceltable (
id INTEGER,
name TEXT,
module TEXT,
type TEXT,
desc TEXT
)''')
for row in exceldata:
c.execute('INSERT INTO exceltable VALUES (?,?,?,?,?)', row)
conn.commit()
#----------------------------------------
# display SQL data
#----------------------------------------
c.execute('SELECT * FROM exceltable')
for row in c:
print row
The Excel filename is example.xls and the table data begins at B14 (2nd column, 14th row) and ends at F20 (6th column, 20th row) in Sheet 1. The script uses a declarative approach to store the data in a Python list of lists. It creates a SQLite database named example.db and creates a connection to it. It then fills the database using the Python list data structure. Finally, it displays the newly created database. The screen output is shown below.
(1, u'foo', u'ModuleExt', u'double', u'Description of foo') (2, u'bar', u'ModuleExt', u'double', u'Description of bar') (3, u'knark', u'Module1', u'int', u'Description of knark') (4, u'wert', u'Module1', u'double', u'Description of wert') (5, u'jib', u'Module1', u'double', u'Description of jib') (6, u'laz', u'Module2', u'double', u'Description of laz') (7, u'kew', u'Module2', u'double', u'Description of kew')
If I want to interact with the newly created database using SQLite directly, I can run sqlite3 example.db from my Cygwin bash command line. (Note the conn.commit()
line in my Python script is very important for this step to work. For some reason, I didn't see this in the Python documentation.) Here is an example session using SQLite directly with the new database.
$ sqlite3 example.db SQLite version 3.5.1 Enter ".help" for instructions sqlite> .schema CREATE TABLE exceltable ( id INTEGER, name TEXT, module TEXT, type TEXT, desc TEXT ); sqlite> select * from exceltable; 1|foo|ModuleExt|double|Description of foo 2|bar|ModuleExt|double|Description of bar 3|knark|Module1|int|Description of knark 4|wert|Module1|double|Description of wert 5|jib|Module1|double|Description of jib 6|laz|Module2|double|Description of laz 7|kew|Module2|double|Description of kew sqlite>
Comments
You just saved me hours of time, and it worked straight off, and having not used Python before I'm very impressed by that too. Took me about 20 minutes to import 140 lines into SQLite which would have been loads of typing. thank you!
nice work, and fine for .xls files where you know the format, or everything is text.
You'd quickly come unstuck with dates/floats etc..
So if you have a more complex spreadsheet, and want to have fine control over the datatypes in your sqlite database, consider using the 3rd party python module xlrd.
http://pypi.python.org/pypi/xlrd
Thanks man.. thanku very much
Could you send me a machine readable copy of this code? I found a few small errors in what is listed on the web page and when I execute the code, it opens the Excel file "example.xls" OK but then leaves it open and the output from the last lines of code all produce 'None' for each piece of data. Either its not reading the data correctly from Excel or not printing it correctly. As you might have guessed, I am somewhat of a newbie with Python.
Thanks,
Thanks a lot! really helped a lot.. :)
Thank you for that tutorial. It helps me a lot
I am leanring Python and this code and explanaiton is very helpful. Thanks a lot!
If you are trying to run this in Python 3 you need to change xrange to range.
The code as written expects the data to start in cell B13. If you want your data to start in A1 you need to change:
ROW_SPAN = (14, 21)
COL_SPAN = (2, 7)
to
ROW_SPAN = (2, 9)
COL_SPAN = (1, 6)
If you want to run the code several times you will need to add the following lines to the end of the code:
c.execute('DROP TABLE exceltable')
conn.close()
And if you want to edit your xls file after running the script you will need to add:
ws = app.Workbooks.Close()
If your file path contains backslashes, preface the string with the letter 'r' to indicate that it is a Raw string.
Nice work, Eliot, saved me a whole bunch of time!
disqus:2439278363