Bennyland this server is running in my bedroom – benny’s learning how to run a linux server

17 Mar 2010

Extending Piston’s BaseHandler for ForeignKey support

Recently I started working on an API using Django and Piston for an update to our internal timekeeping stuff (something I've been in charge of for most of the time I've been at VV). I quickly ran into a problem with Piston when trying to create or update models that have ForeignKeys in them: as it turns out, Piston's BaseHandler does not handle ForeignKey fields on create or update. What follows is my attempt at adding that support, along with better REST URLs that allow for searching and such. I went this route after reading through this post's responses about this very problem.

A note of caution: I've only very recently started learning Python, Django, and Piston, so it's likely that this isn't the best approach. It's also pretty dirty, as I'm currently using a combination of rc and {'status':0, 'statusmsg':'message here'} whenever errors happen. It would be better to always use rc.SOMETHING and throw errors that get caught and formatted correctly; I just haven't had time to clean this all up yet.

For now, I'm just going to paste heavily commented code - if you have any questions just post in the comments. Later I'll add some more detailed explanation on what I'm doing and why I went this direction.
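To make the ForeignKey part concrete before the full listing: the heart of the fix is collapsing each nested foreign-key dict in the posted data down to its pk value, so the result can be used as plain key=value lookups. Here is a minimal, Django-free sketch of that one step. The name flatten_foreign_keys and the fk_pk_names mapping are made up for illustration (the handler below reads the same information from model._meta), and the sample data is hypothetical.

```python
def flatten_foreign_keys(attrs, fk_pk_names):
    """Replace nested foreign-key dicts with their primary-key values.

    attrs:       posted data, e.g. {'emp': {'id': 3}, 'entry_type': 'work'}
    fk_pk_names: hypothetical mapping of foreign-key field name -> pk field
                 name, standing in for what the handler reads from _meta
    """
    flat = dict(attrs)  # copy so the caller's dict is left untouched
    for name, pk_name in fk_pk_names.items():
        if name not in flat:
            continue  # this foreign key was not posted at all
        value = flat[name]
        if not isinstance(value, dict) or pk_name not in value:
            raise ValueError('pk field %s missing from foreignkey field %s'
                             % (pk_name, name))
        flat[name] = value[pk_name]
    return flat


# hypothetical posted data for an hours_detail row
posted = {'emp': {'id': 3, 'lname': 'Smith'}, 'proj': {'id': 7}, 'entry_type': 'work'}
flat = flatten_foreign_keys(posted, {'emp': 'id', 'proj': 'id'})
```

After this step every value is a plain scalar, which is what a lookup like get(**attrs) needs.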

handlers.py snippet

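# imports used by this snippet (HoursDetail, Employee and Project
# come from this app's own models.py)
import re
import sys
from datetime import datetime, time

from django.core.exceptions import ObjectDoesNotExist, MultipleObjectsReturned
from django.utils import simplejson
from piston.handler import BaseHandler
from piston.utils import rc

# extend the BaseHandler class so we can add some additional
# features to it (mostly to support ForeignKey and fancy urls)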
class BaseApiHandler(BaseHandler):
    # post_name will serve as the name of the field in our output
    # data.  My output is formatted to contain a status
    # int, status message, and the actual data
    # default names this key 'data'
    post_name = 'data'

    # this method should be overridden in derived classes to process kwargs when reading data
    # This allows for urls to contain all sorts of search stuff rather than just /pkname/number
    def process_kwarg_read(self, key, value, d_post, b_exact):
        pass

    # override 'read' so we can better handle our api's searching capabilities
    def read(self, request, *args, **kwargs):
        # d_post holds the return data along with status and statusmsg
        # i default it here just for the sake of having something to look at
        d_post = {'status':0,'statusmsg':'Nothing Happened'}
        try:
            # setup the named response object
            # select all employees then filter - querysets are lazy in django
            # the actual query is only done once data is needed, so this may
            # seem like some memory hog slow beast, but it's actually not.
            d_post[self.post_name] = self.queryset(request)
            # s_query is used to hold debug information to be shown in the statusmsg
            s_query = ''

            # b_exact allows me to search with __exact or __contains depending on what the
            # person accessing the api wants
            b_exact = False
            if kwargs.get('exact') is not None:
                b_exact = True
                s_query = '\'exact\':True,'

            # iterate over the kwargs passed in through urls to process the search
            for key,value in kwargs.iteritems():
                # the regex url possibilities will push None into the kwargs dictionary
                # if not specified, so just continue looping through if that's the case
                if value is None or key == 'exact':
                    continue

                # write to the s_query string so we have a nice debug message
                s_query = '%s\'%s\':\'%s\',' % (s_query, key, value)

                # now process this key/value kwarg in the derived class
                self.process_kwarg_read(key=key, value=value, d_post=d_post, b_exact=b_exact)

            # for/else: the loop has no break, so this always runs once the loop is done:
            else:
                if d_post[self.post_name].count() == 0:
                    d_post['status'] = 0
                    d_post['statusmsg'] = '%s not found with query: {%s}' % (self.post_name, s_query)
                else:
                    d_post['status'] = 1
                    d_post['statusmsg'] = '%s %s found with query: {%s}' % (d_post[self.post_name].count(), self.post_name, s_query)
        except Exception:
            e = sys.exc_info()[1]
            d_post['status'] = 0
            d_post['statusmsg'] = 'error: %s' % e
            d_post[self.post_name] = []

        return d_post

    # this method is used in derived classes to handle model-specific data
    def process_attr_item_create(self, inst, k, v):
        pass

    def flatten_data_request(self, request):
        # note: also have to support user matching for this method as well as the update method
        attrs = self.flatten_dict(request.data)
        # support posting json data inside a 'data' post variable
        if 'data' in attrs:
            ext_posted_data = simplejson.loads(request.data.get('data'))
            attrs = self.flatten_dict(ext_posted_data)
        return attrs

    # override 'create' to allow for the 'data' POST variable, as well as do some
    # fancy ForeignKey stuff
    def create(self, request, *args, **kwargs):
        if not self.has_model():
            return rc.NOT_IMPLEMENTED

        # get the actual data that wants to be created from the api request
        attrs = self.flatten_data_request(request)

        # we need to fix ForeignKeys because self.model.objects.get(**attrs) breaks
        # when one of the fields is a dictionary.  what happens when you call 'get(...)'
        # is Django creates a MySQL query with key=value.  In cases where the value is
        # a dict, you end up with '`key`={...}' which breaks the MySQL query
        # so here we convert ForeignKeys into their corresponding pk:
        for field in self.model._meta.fields:
            if field.rel:
                # this is a ForeignKey (as far as I can tell... _meta isn't documented)
                # so we need to convert it to its pk equivalent... there's most likely
                # a better way of doing this than hardcoding stuff...
                if field.name in attrs:
                    pkfield = field.rel.to._meta.pk.name
                    if pkfield not in attrs[field.name]:
                        return {'status':0,'statusmsg':'pk field %s missing from foreignkey field %s' % (pkfield, field.name)}
                    else:
                        attrs[field.name] = attrs[field.name][pkfield]

        try:
            inst = self.model.objects.get(**attrs)
            return rc.DUPLICATE_ENTRY
        except self.model.DoesNotExist:
            inst = self.model()
            #inst = self.model(**attrs)
            # piston does not support ForeignKey, so we need to do this by hand :(
            for k,v in attrs.iteritems():
                self.process_attr_item_create(inst, k, v)
            inst.save()
            return inst
        except self.model.MultipleObjectsReturned:
            return rc.DUPLICATE_ENTRY

    # override 'update' to allow for the 'data' POST variable, as well as do some
    # fancy ForeignKey stuff
    def update(self, request, *args, **kwargs):
        if not self.has_model():
            return rc.NOT_IMPLEMENTED

        pkfield = self.model._meta.pk.name

        if pkfield not in kwargs:
            # No pk was specified
            response = rc.BAD_REQUEST
            response.write(", " + "pkfield not in kwargs")
            return response

        try:
            inst = self.queryset(request).get(pk=kwargs[pkfield])
        except ObjectDoesNotExist:
            return rc.NOT_FOUND
        except MultipleObjectsReturned: # should never happen, since we're using a PK
            response = rc.BAD_REQUEST
            response.write(", " + "MultipleObjectsReturned")
            return response

        # now grab the data from the request and update the row accordingly
        attrs = self.flatten_data_request(request)
        for k,v in attrs.iteritems():
            # fix to ForeignKey
            field = self.get_field(k)
            if field.rel:
                # this is a foreignkey, set _id attr instead...
                # note: not sure if it's correct to set _id, or if there's a way to
                # get the real name of the id like there is to get the pkname
                pkfield = field.rel.to._meta.pk.name
                if pkfield not in v:
                    return {'status':0,'statusmsg':'pk field %s missing from foreignkey field %s' % (pkfield, field.name)}
                else:
                    setattr(inst, '%s_id'%k, v[pkfield])
            else:
                setattr( inst, k, v )

        inst.save()
        return rc.ALL_OK

    # delete has to be overridden to remove 'None' url params... then run the base delete
    def delete(self, request, *args, **kwargs):
        # store the keys... you can't remove things while iterating
        del_keys = []
        for key,value in kwargs.iteritems():
            # the regex url possibilities will push None into the kwargs dictionary
            # if not specified, so just continue looping through if that's the case
            if value is None:
                del_keys.append(key)
        # now remove the bad keys from kwargs
        for key in del_keys:
            del kwargs[key]
        # now run the base delete
        return BaseHandler.delete(self, request, *args, **kwargs)

    def get_field(self, f_name):
        return [f for f in self.model._meta.fields if f.name == f_name][0]

# so here's an example handler that uses BaseApiHandler's new functionality to deal with foreign keys and better urls:
class HoursDetailHandler(BaseApiHandler):
    #allowed_methods = ('GET', 'PUT', 'POST', )
    model = HoursDetail
    exclude = ()
    post_name = 'hours_detail'

    def process_kwarg_read(self, key, value, d_post, b_exact):
        # each query is handled slightly differently... when keys are added
        # handle them in here.  python doesn't have switch statements, this
        # could theoretically be performed using a dictionary with lambda
        # expressions, however I was afraid it would mess with the way the
        # filters on the queryset work so I went for the less exciting
        # if/elif block instead

        # querying on a specific row
        if key == 'id':
            d_post[self.post_name] = d_post[self.post_name].filter(pk=value)

        # filter based on employee id - this is guaranteed to happen once
        # per query (see read(...))
        elif key == 'empid':
            d_post[self.post_name] = d_post[self.post_name].filter(emp__id__exact=value)

        # look for a specific project by id
        elif key == 'projid':
            d_post[self.post_name] = d_post[self.post_name].filter(proj__id__exact=value)

        elif key == 'datestamp' or key == 'daterange':
            d_from = None
            d_to = None
            # first, regex out the times in the case of range vs stamp
            if key == 'daterange':
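                # raw string so the regex escapes reach re untouched
                m = re.match(r'(?P<daterangefrom>\d{4,}[-/\.]\d{2,}[-/\.]\d{2,})(?:to|/-)(?P<daterangeto>\d{4,}[-/\.]\d{2,}[-/\.]\d{2,})',
                             value)
                if m is None:
                    raise ValueError('invalid daterange: %s' % value)
                # note: strptime only accepts '-' separators here, even though
                # the regex also lets '/' and '.' through
                d_from = datetime.strptime(m.group('daterangefrom'), '%Y-%m-%d')
                d_to = datetime.strptime(m.group('daterangeto'), '%Y-%m-%d')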
            else:
                d_from = datetime.strptime(value, '%Y-%m-%d')
                d_to = datetime.strptime(value, '%Y-%m-%d')

            # now min/max to get midnight on day1 through just before midnight on day2
            # note: this is a hack because as of the writing of this app,
            # __date doesn't yet exist as a queryable field thus any
            # timestamps not at midnight were incorrectly left out
            d_from = datetime.combine(d_from, time.min)
            d_to = datetime.combine(d_to, time.max)

            d_post[self.post_name] = d_post[self.post_name].filter(clock_time__gte=d_from)
            d_post[self.post_name] = d_post[self.post_name].filter(clock_time__lte=d_to)

        else:
            raise NameError('unsupported search key: %s' % key)

    def read(self, request, *args, **kwargs):
        # empid is required, so make sure it exists before running BaseApiHandler's read method
        # (the url regex only lets digits through, so a present empid is never negative)
        if kwargs.get('empid') is None:
            return {'status':0,'statusmsg':'empid cannot be empty'}
        else:
            return BaseApiHandler.read(self, request, *args, **kwargs)

    # handle foreign keys in the create process by looking them up by id
    # the 'create' method truncates the dictionary of a foreignkey down to its
    # pk, so we need to expand that again by searching for the id in the table
    def process_attr_item_create(self, inst, k, v):
        if k == 'emp':
            setattr(inst, k, Employee.objects.get(pk=v))
        elif k == 'proj':
            setattr(inst, k, Project.objects.get(pk=v))
        else:
            setattr(inst, k, v)
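
As mentioned in the comments of process_kwarg_read, the if/elif chain could in theory be replaced by a dictionary lookup. A rough sketch of that idea, outside of Django, is to translate the url kwargs into keyword arguments for a single filter(**kwargs) call. FILTER_FIELDS and build_filter_kwargs are hypothetical names, the sketch leaves out the date keys, and it is untested against a real queryset, so treat it as an assumption rather than a drop-in replacement.

```python
# hypothetical mapping from url keyword to the lookup it filters on;
# the lookups mirror the ones used in the if/elif block above
FILTER_FIELDS = {
    'id': 'pk',
    'empid': 'emp__id__exact',
    'projid': 'proj__id__exact',
}


def build_filter_kwargs(url_kwargs):
    """Turn url kwargs into a dict that could be passed to filter(**result)."""
    filters = {}
    for key, value in url_kwargs.items():
        # unspecified url groups arrive as None; 'exact' is a flag, not a filter
        if value is None or key == 'exact':
            continue
        if key not in FILTER_FIELDS:
            raise NameError('unsupported search key: %s' % key)
        filters[FILTER_FIELDS[key]] = value
    return filters


# hypothetical kwargs as the url regex would deliver them
filters = build_filter_kwargs({'id': None, 'empid': '3', 'projid': '7', 'exact': 'exact'})
```

Because querysets are lazy, applying all lookups in one filter call should build the same kind of query as chaining them one at a time, but that equivalence is worth checking before relying on it.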

urls.py

    # formatting your url like this might seem crazy, but it allows your end api user to format their api call
    # in any order they want to and you can have a single url handler here and easily add to it as you need more
    # things - note: always add before the /exact line.
    #hours_detail/id/{id}/empid/{empid}/projid/{projid}/datestamp/{datestamp}/daterange/{fromdate}to{todate}/exact
    #empid is required
    url(r'^api/hours_detail/(?:'
        r'(?:[/]?id/(?P<id>\d+))?'
        r'(?:[/]?empid/(?P<empid>\d+))?'
        r'(?:[/]?projid/(?P<projid>\d+))?'
        r'(?:[/]?datestamp/(?P<datestamp>\d{4,}[-/\.]\d{2,}[-/\.]\d{2,}))?'
        r'(?:[/]?daterange/(?P<daterange>(?:\d{4,}[-/\.]\d{2,}[-/\.]\d{2,})(?:to|/-)(?:\d{4,}[-/\.]\d{2,}[-/\.]\d{2,})))?'
        r')+(?:/(?P<exact>exact))?$', hours_detail_resource),
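
To see what this pattern actually hands to the handler, it can be exercised with plain re outside of Django. The sketch below rebuilds the same pattern (without the url(...) wrapper) and checks that parameters can come in any order and that unspecified ones show up as None, which is why read and delete skip None values. The sample paths are made up.

```python
import re

# same pattern as in urls.py, compiled on its own for experimentation
HOURS_DETAIL_RE = re.compile(
    r'^api/hours_detail/(?:'
    r'(?:[/]?id/(?P<id>\d+))?'
    r'(?:[/]?empid/(?P<empid>\d+))?'
    r'(?:[/]?projid/(?P<projid>\d+))?'
    r'(?:[/]?datestamp/(?P<datestamp>\d{4,}[-/\.]\d{2,}[-/\.]\d{2,}))?'
    r'(?:[/]?daterange/(?P<daterange>(?:\d{4,}[-/\.]\d{2,}[-/\.]\d{2,})(?:to|/-)(?:\d{4,}[-/\.]\d{2,}[-/\.]\d{2,})))?'
    r')+(?:/(?P<exact>exact))?$'
)

# parameters in the documented order, with the trailing /exact flag
in_order = HOURS_DETAIL_RE.match('api/hours_detail/empid/3/projid/7/exact').groupdict()

# same parameters in a different order, no /exact
swapped = HOURS_DETAIL_RE.match('api/hours_detail/projid/7/empid/3').groupdict()

# a date range search
ranged = HOURS_DETAIL_RE.match(
    'api/hours_detail/empid/3/daterange/2010-03-01to2010-03-17').groupdict()
```

Every named group is always present in the resulting dict; the ones that were not in the url simply hold None.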

models.py (for reference)


# here's what the HoursDetail model looks like; it has two foreign keys (emp and proj)
class HoursDetail(models.Model):
    #id = models.IntegerField(primary_key=True)
    emp = models.ForeignKey(Employee)
    proj = models.ForeignKey(Project)
    datestamp = models.DateTimeField(null=True, blank=True)
    entry_type = models.CharField(max_length=192)
    hours_per_day = models.FloatField(default=0, blank=True)
    clock_time = models.DateTimeField(null=True, blank=True)
    is_clockin = models.BooleanField(default=False)
    status = models.CharField(max_length=255, null=True, blank=True)
    def __unicode__(self):
        return '%s, %s on %s worked on %s' % (self.emp.lname, self.emp.fname, self.datestamp, self.proj.name)
    class Meta:
        db_table = u'hours_detail'
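
Since clock_time is a DateTimeField, matching "everything on a given day" means filtering from midnight through the last instant of that day, which is what the datestamp/daterange branch of the handler does with time.min and time.max. Here is a standalone sketch of that calculation, assuming only '-' separated dates. day_bounds is a hypothetical helper, not part of the handler.

```python
import re
from datetime import datetime, time

DATE_RE = r'\d{4}-\d{2}-\d{2}'


def day_bounds(value):
    """Return (start, end) datetimes for 'YYYY-MM-DD' or 'YYYY-MM-DDtoYYYY-MM-DD'."""
    m = re.match(r'^(%s)(?:to(%s))?$' % (DATE_RE, DATE_RE), value)
    if m is None:
        raise ValueError('invalid date or range: %s' % value)
    first = datetime.strptime(m.group(1), '%Y-%m-%d').date()
    # a single date is treated as a range that starts and ends on the same day
    last = datetime.strptime(m.group(2) or m.group(1), '%Y-%m-%d').date()
    # midnight on the first day through the last microsecond of the last day
    return datetime.combine(first, time.min), datetime.combine(last, time.max)


single = day_bounds('2010-03-17')
span = day_bounds('2010-03-01to2010-03-17')
```

The two values can then be used as the clock_time__gte and clock_time__lte bounds, as in the handler.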