Accurate Atomic Counters with Amazon DynamoDB

3 minute read

Amazon DynamoDB is a fully managed key-value database at AWS. It’s great in a lot of ways: it’s serverless, it scales out incredibly wide, and it’s incredibly cheap for what it is. Good use cases for Dynamo are player preferences, game status, and low-cost global leaderboards.

But if you’re using Dynamo for an atomic counter under heavy load, that absolutey has to be accurate, it can be tricky to get right. In this post we’ll give an easy solution for incrementing a counter without losing updates.

To show this, we’ll imagine a common use case: adding one to a “likes” counter on an image, say. In this example, we have a DynamoDB table images which has a single primary key of id, an integer. Each item in this table has, among other things, a likes field which is supposed to store the number of likes that images has.

How can we implement an atomic increment to that likes column for any image id?

The Easy But (Sometimes) Wrong Way

Google DynamoDB atomic counters and you’ll immediately get to AWS’s documentation for atomic counters. The advice is to use ADD where you simply add one to the likes field of an images item.

In Python it looks like this:

import boto3
dynamodb = boto3.client('dynamodb')

def add_like(id):
    """Add one like to image `id`"""
    dynamodb.update_item(
        TableName='images',
        Key={'id': {'N': id}},
        UpdateExpression='ADD likes :num',
        ExpressionAttributeValues={':num': {'N': '1'}}
    )

So all that means is, “Add one to likes where id = id”. Simple! And we like simple over here at Kickbox Frownyface.

However, there’s a problem here: this update isn’t actually atomic, in that if you have two Lambda functions (say) calling this function at exactly the same time, one of the ADD functions is effectively dropped.

For example let’s say that the likes for this adorable kitten picture is at 5 and both Bob and Alice click “like” at pretty much the same time.

Cute kitten

If you are unlucky, you may end up with likes just set to 6 instead of 7, because both Alice and Bob were adding one to 5 at precisely the same time. (If you don’t believe me, try it out – fan out 1000 or so Lambdas and you’re pretty likely to end up with likes less than the expected 1000.)

Is that bad? Well, that really depends. If you’re OK with the occasional dropped update, then leave well enough alone and move on – you have bigger fish to fry!

But if your counters just must be correct, let’s make a Conditional Write.

Adding a Conditional Expression

What we want to do is change the function from “just add one” to “just add one to what I think it is right now”. In our example, when Bob clicks, the function will run “Add one to likes so long as the current number of likes is five” and when Alice clicks, the function will also run “Add one to likes so long as the current number of likes is five.” If both calls happen simultaneously, one will go through, and the other will be rejected because the conditional is no longer satisfied.

The code is pretty similar, we just do a fetch then a conditional update:

import boto3
dynamodb = boto3.client('dynamodb')

def add_like(id):
    """Add one like to image `id`"""
    existing = dynamodb.get_item(
            TableName='images',
            Key={'id': {'N': id}}
        )
    dynamodb.update_item(
        TableName=self.table_name,
        Key={'id': {'N': id}},
        UpdateExpression='ADD likes :num',
        ConditionExpression='likes = :current',
        ExpressionAttributeValues={':num': {'N': '1'},
            ':current': {'N': existing['Item']['likes']['N']
        }
    )

This code is fetching the current record for image id and storing that in existing. Then it does an update on that image, adding 1 to likes so long as the current likes value is equal to the value we just fetched.

Now, if Alice and Bob both add their likes at precisely the same time, one of them will go through and the other will have a ConditionalCheckFailedException thrown.

Are we done? Not quite! We still have the number of likes for that image set to six! Either Alice’s or Bob’s like wasn’t recorded. But now at least we know it was dropped, so let’s retry the failed one with code like this:

def add_like_safely(id, tries=10):
    attempts, success = 0, False
    while attempts <= tries and not success:
        try:
            attempts = attempts + 1
            results = add_like(id)
            success = True
       except ClientError as e:
            if e.response['Error']['Code'] == 'ConditionalCheckFailedException': 
                time.sleep(0.01)
            else:
                raise
    return success

This way, if both Alice and Bob call add_like_safely at precisely the same time, one of them will get through first… but the exception will be caught and the other will try again after a short delay to give the data time to propagate out.

Categories:

Updated: