Python data object motivated by a desire for a mutable namedtuple with default values
UPDATE 2016-08-12: Read Glyph's post and use the attrs library instead.
Reasons to use this instead of a namedtuple:
- I want to change fields at a later time (mutability)
- I want to specify a subset of the fields at instantiation and have the rest be set to a default value
Reasons to use this instead of a dict:
- I want to explicitly name the fields in the object
- I want to disallow setting fields that are not explicitly named*
- I want to specify a subset of the fields at instantiation and have the rest be set to a default value
- I want to use attribute style access (dot notation to access fields)
Reasons to use this instead of a regular Python class:
- I don't want to duplicate field names in the __init__() method signature and when setting instance attributes of the same name.
- I want to disallow setting fields that are not explicitly named*
- I want to be able to easily convert the object to a dict or a tuple
- I want to save memory
*Note: This Stack Overflow answer warns against using __slots__ for my goal of disallowing setting fields that are not explicitly named. It says metaclasses or decorators should be abused by us control freaks and static typing weenies instead. To comply with that advice, if you don't care about saving memory, __slots__ could be replaced with a non-special attribute, such as _fields. If that is done, attribute creation would no longer be limited.
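As a rough illustration of that trade-off, here is a minimal sketch of what the _fields variant might look like. The class name FieldsDataObject is hypothetical and not part of the original code; only the attribute name _fields comes from the note above.

```python
class FieldsDataObject(object):
    """Hypothetical variant of DataObject that lists fields in _fields."""
    _fields = ()
    default_value = None

    def __init__(self, *args, **kwargs):
        # Every named field starts out with the default value
        for att in self._fields:
            setattr(self, att, self.default_value)
        # Positional arguments fill the fields in declaration order
        for k, v in zip(self._fields, args):
            setattr(self, k, v)
        for k, v in kwargs.items():
            setattr(self, k, v)


class Jello(FieldsDataObject):
    _fields = ('color', 'year')


j = Jello(color='green')
# 'flavor' is not in _fields, but nothing stops the assignment because
# instances now have a regular __dict__.
j.flavor = 'lime'
```

Defaults and argument handling behave the same as with __slots__, but the typo protection and the memory savings are gone.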
See also:
- recordtype on PyPI
- Why Python does not support record type i.e. mutable namedtuple - Stack Overflow
- Managing Records in Python (Part 1 of 3)
- http://stackoverflow.com/questions/472000/python-slots/472024#472024
- http://stackoverflow.com/questions/1816483/python-how-does-inheritance-of-slots-in-subclasses-actually-work
class DataObject(object):
    """
    An object to hold data. Motivated by a desire for a mutable namedtuple with
    default values. To use, subclass, and define __slots__.

    The default default value is None. To set a default value other than None,
    set the `default_value` class variable.

    Example:
        class Jello(DataObject):
            default_value = 'no data'
            __slots__ = (
                'request_date',
                'source_id',
                'year',
                'group_id',
                'color',
                # ...
            )
    """
    __slots__ = ()
    default_value = None

    def __init__(self, *args, **kwargs):
        # Set default values
        for att in self.__slots__:
            setattr(self, att, self.default_value)

        # Set attributes passed in as arguments
        for k, v in zip(self.__slots__, args):
            setattr(self, k, v)
        for k, v in kwargs.items():
            setattr(self, k, v)

    def asdict(self):
        return dict(
            (att, getattr(self, att)) for att in self.__slots__)

    def astuple(self):
        return tuple(getattr(self, att) for att in self.__slots__)

    def __repr__(self):
        return '{}({})'.format(
            self.__class__.__name__,
            ', '.join('{}={}'.format(
                att, repr(getattr(self, att))) for att in self.__slots__))
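A short usage sketch may help here. It repeats a condensed copy of the class (without __repr__) only so that the snippet runs on its own, and the two-field Jello is a trimmed-down, illustrative version of the docstring example.

```python
class DataObject(object):
    # Condensed copy of the class above so this snippet is self-contained
    __slots__ = ()
    default_value = None

    def __init__(self, *args, **kwargs):
        for att in self.__slots__:
            setattr(self, att, self.default_value)
        for k, v in zip(self.__slots__, args):
            setattr(self, k, v)
        for k, v in kwargs.items():
            setattr(self, k, v)

    def asdict(self):
        return dict((att, getattr(self, att)) for att in self.__slots__)

    def astuple(self):
        return tuple(getattr(self, att) for att in self.__slots__)


class Jello(DataObject):
    default_value = 'no data'
    __slots__ = ('color', 'year')


jello = Jello('green')          # positional args fill __slots__ in order
year_before = jello.year        # unspecified field holds the default, 'no data'
jello.year = 2016               # fields can be changed after instantiation
as_dict = jello.asdict()        # {'color': 'green', 'year': 2016}
as_tuple = jello.astuple()      # ('green', 2016)
```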
Tests:
import unittest


class DataObjectTestCase(unittest.TestCase):
    def test_instantiation_using_args(self):
        class MyData(DataObject):
            __slots__ = ('att1', 'att2')

        md = MyData('my attr 1', 'my attr 2')
        self.assertEqual(md.att1, 'my attr 1')
        self.assertEqual(md.att2, 'my attr 2')

    def test_instantiation_using_kwargs(self):
        class MyData(DataObject):
            __slots__ = ('att1', 'att2')

        md = MyData(att1='my attr 1', att2='my attr 2')
        self.assertEqual(md.att1, 'my attr 1')
        self.assertEqual(md.att2, 'my attr 2')

    def test_default_default_value(self):
        class MyData(DataObject):
            __slots__ = ('att1', 'att2')

        md = MyData(att1='my attr 1')
        self.assertEqual(md.att1, 'my attr 1')
        self.assertEqual(md.att2, None)

    def test_custom_default_value(self):
        class MyData(DataObject):
            default_value = 'custom default value'
            __slots__ = ('att1', 'att2')

        md = MyData(att1='my attr 1')
        self.assertEqual(md.att1, 'my attr 1')
        self.assertEqual(md.att2, 'custom default value')

    def test_set_value_after_instantiation(self):
        class MyData(DataObject):
            __slots__ = ('att1', 'att2')

        md = MyData(att1='my attr 1')
        self.assertEqual(md.att1, 'my attr 1')
        self.assertEqual(md.att2, None)
        md.att1 = 5
        md.att2 = 9
        self.assertEqual(md.att1, 5)
        self.assertEqual(md.att2, 9)

    def test_attribute_not_defined_in__slots__(self):
        class MyData(DataObject):
            __slots__ = ('att1', 'att2')

        with self.assertRaises(AttributeError):
            MyData(att3='my attr 3')
        with self.assertRaises(AttributeError):
            md = MyData()
            md.att3 = 45

    def test_asdict(self):
        class MyData(DataObject):
            __slots__ = ('att1', 'att2')

        md = MyData(att1='my attr 1', att2='my attr 2')
        self.assertEqual(
            md.asdict(), {'att1': 'my attr 1', 'att2': 'my attr 2'})

    def test_tuple(self):
        class MyData(DataObject):
            __slots__ = ('att1', 'att2')

        md = MyData(att1='my attr 1', att2='my attr 2')
        self.assertEqual(md.astuple(), ('my attr 1', 'my attr 2'))

    def test___repr__(self):
        class MyData(DataObject):
            __slots__ = ('att1', 'att2')

        md = MyData(att1='my attr 1', att2='my attr 2')
        self.assertEqual(repr(md), "MyData(att1='my attr 1', att2='my attr 2')")
Note: previously, I included the following method in the class. However, this is not necessary. If __slots__ is defined in DataObject and the subclass, any attribute not in __slots__ will automatically raise an AttributeError.
# def __setattr__(self, name, value):
#     if name not in self.__slots__:
#         raise AttributeError("%s is not a valid attribute in %s" % (
#             name, self.__class__.__name__))
#     super(DataObject, self).__setattr__(name, value)
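The "and the subclass" part matters: if a subclass leaves out __slots__, its instances get a __dict__ again and accept any attribute. A small sketch of the difference, using throwaway class names (Base, WithSlots, WithoutSlots) that are illustrative only:

```python
class Base(object):
    __slots__ = ()


class WithSlots(Base):
    __slots__ = ('att1',)


class WithoutSlots(Base):
    pass  # no __slots__ here, so instances get a __dict__


strict = WithSlots()
try:
    strict.att2 = 1          # not listed in __slots__
except AttributeError:
    rejected = True
else:
    rejected = False

loose = WithoutSlots()
loose.att2 = 1               # accepted, stored in loose.__dict__
```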
Comments
Thanks Eliot. Once again, Google led me to your blog. I used this today :D
Hey Angel! Glad you found this useful. We have been using this in our codebase for a little while, and I like it. Another option for modifying the behavior of a namedtuple is to create a factory function that returns a customized namedtuple. I got this idea from @andrewwatts.
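A rough sketch of what such a factory could look like. This is not @andrewwatts's actual code, and the function name namedtuple_with_defaults is made up for illustration. It only adds default values; the result is still an immutable namedtuple.

```python
from collections import namedtuple


def namedtuple_with_defaults(typename, field_names, default=None):
    """Hypothetical factory: a namedtuple whose fields all default to `default`."""
    cls = namedtuple(typename, field_names)
    # Give every field of the generated __new__ the same default value
    cls.__new__.__defaults__ = (default,) * len(cls._fields)
    return cls


Point = namedtuple_with_defaults('Point', ['x', 'y'])
p = Point(x=1)            # y falls back to the default, None
q = p._replace(y=2)       # "changing" a field returns a new tuple
```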