One model, many db_tables

On July 08, 2007 in database, development, django, python

Foo.objects.get(keywords__contains='bar') will make your database cry when you run it against a 30 million row table. At times it becomes necessary to segment a large table into many smaller tables.

Say you have these models:

class City(models.Model):
  city = models.CharField(maxlength=255)
  state = models.USStateField()

class Listings(models.Model):
  name = models.CharField(maxlength=100)
  street = models.CharField(maxlength=100)
  city = models.ForeignKey(City)
  zip = models.CharField(maxlength=10)
  keywords = models.CharField(maxlength=255)

The listings table has gotten huge. Users only search within a single city, so you want to break the listings table into a table for each city. Instead of yourapp_listing, you now have yourapp_listing_1, yourapp_listing_2, etc. All the listings are in the matching table for their city.

I couldn’t find a documented way to make db_table dynamic in Django. So, how do you get Django to use the right table when you’re querying for listings? Here’s how I did it:

class ListingManager(models.Manager):
  def get_table_for(self, city):
    '''
    someapp_listing if city == None
    someapp_listing_012 if city.id == 12
    '''
    table = '_'.join((self.model._meta.app_label,
                      self.model._meta.module_name))
    if city:
      table += '_%03d' % city.id
    return table

  def in_city(self, city):
    self.city = city
    self.model._meta.db_table = self.get_table_for(city)
    return self

class Listing(models.Model):
  city = models.ForeignKey(City)
  objects = ListingManager()

  def delete(self):
    Listing.objects.in_city(self.city)
    super(Listing, self).delete()

  def save(self):
    Listing.objects.in_city(self.city)
    super(Listing, self).save()

It’s simple to use:

>>> city = City.objects.get(id=2)
>>> Listings.objects.in_city(city).all()

Edit 2007/08: Simplified the code. When I first wrote this I replaced db_table with a subclass of str that called ListingManager.get_table_for. It was overkill, and didn’t always work as expected (e.g., 'foo' + Listing._meta.db_table didn’t work).