DynamoDB local in Docker

Franck Pachot - Jan 25 '23 - - Dev Community

Here's a quick guide on running DynamoDB locally for testing without needing to connect to the cloud. DynamoDB local stores its table items in an SQLite database. It's quite intriguing to note that a NoSQL database is stored in an SQL database - this really highlights the power of SQL.

I'll run DynamoDB Local in a Docker container and define aliases to access it with AWS CLI and SQLite:

# Start DynamoDB local with a SQLite file (not in memory)
docker run --rm -d --name dynamodb -p 8000:8000 amazon/dynamodb-local -jar DynamoDBLocal.jar -sharedDb -dbPath /home/dynamodblocal

# alias to run `sqlite3` on this file
alias sql='docker exec -it dynamodb \
 sqlite3 /home/dynamodblocal/shared-local-instance.db \
'

# alias to run the AWS CLI in a container linked to the DynamoDB Local endpoint, exposing the current directory as /aws (the container's working directory)
alias aws='docker run --rm -it --link dynamodb:dynamodb -v $PWD:/aws \
 -e AWS_DEFAULT_REGION=xx -e AWS_ACCESS_KEY_ID=xx -e AWS_SECRET_ACCESS_KEY=xx \
 public.ecr.aws/aws-cli/aws-cli --endpoint-url http://dynamodb:8000 \
'
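
Because `docker run -d` returns before the service inside the container is necessarily ready, it can help to wait for the endpoint before running the first command. Here is a minimal sketch: `wait_for` is a hypothetical helper (not part of Docker or the AWS CLI), and the usage line assumes `curl` is installed on the host and that DynamoDB Local answers any HTTP request on the published port 8000 once it is up.

```shell
# wait_for ATTEMPTS COMMAND...: hypothetical helper that retries COMMAND
# about once per second until it succeeds, giving up after ATTEMPTS tries.
wait_for() {
  wf_attempts=$1
  shift
  wf_try=0
  while [ "$wf_try" -lt "$wf_attempts" ]; do
    # Any output of the probed command is discarded; only its exit status matters.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    wf_try=$((wf_try + 1))
    sleep 1
  done
  return 1
}

# Usage (assumption: curl exits 0 as soon as it gets any HTTP response,
# even an error status, because --fail is not used):
#   wait_for 30 curl -s -o /dev/null http://localhost:8000 && echo "DynamoDB Local is up"
```

The helper takes the probe as arguments rather than hard-coding it, so the same loop can wrap another check if `curl` is not available.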

Create table

I create a table from the create-table example, and query the internal SQLite DB:

aws dynamodb create-table \
    --table-name MusicCollection \
    --attribute-definitions AttributeName=Artist,AttributeType=S AttributeName=SongTitle,AttributeType=S \
    --key-schema AttributeName=Artist,KeyType=HASH AttributeName=SongTitle,KeyType=RANGE \
    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5 \
    --tags Key=Owner,Value=blueTeam

sql -line -echo "select * from dm;"



Insert Items

I insert some data from the batch-write-item example:

cat > request-items.json <<'JSON'
{
    "MusicCollection": [
        {
            "PutRequest": {
                "Item": {
                    "Artist": {"S": "No One You Know"},
                    "SongTitle": {"S": "Call Me Today"},
                    "AlbumTitle": {"S": "Somewhat Famous"}
                }
            }
        },
        {
            "PutRequest": {
                "Item": {
                    "Artist": {"S": "Acme Band"},
                    "SongTitle": {"S": "Happy Day"},
                    "AlbumTitle": {"S": "Songs About Life"}
                }
            }
        },
        {
            "PutRequest": {
                "Item": {
                    "Artist": {"S": "No One You Know"},
                    "SongTitle": {"S": "Scared of My Shadow"},
                    "AlbumTitle": {"S": "Blue Sky Blues"}
                }
            }
        }
    ]
}
JSON

aws dynamodb batch-write-item \
    --request-items file://request-items.json \
    --return-consumed-capacity INDEXES \
    --return-item-collection-metrics SIZE

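
A single BatchWriteItem call accepts at most 25 put or delete requests, so when the request file grows beyond this three-item example it can be worth checking its size before sending it. The sketch below uses two hypothetical helpers, `count_requests` and `check_batch_size`, and assumes the file is formatted as above, with each `"PutRequest"` or `"DeleteRequest"` key on its own line (it counts matching lines and does not parse the JSON).

```shell
# count_requests FILE: number of lines containing a PutRequest or DeleteRequest key.
# Assumption: one request key per line, as in request-items.json above.
count_requests() {
  grep -c -E '"(PutRequest|DeleteRequest)"' "$1" || true
}

# check_batch_size FILE: fail if FILE holds more requests than one
# BatchWriteItem call accepts (25), otherwise print the count.
check_batch_size() {
  cb_count=$(count_requests "$1")
  if [ "$cb_count" -gt 25 ]; then
    echo "too many requests: $cb_count (limit is 25)" >&2
    return 1
  fi
  echo "$cb_count requests"
}

# Usage:
#   check_batch_size request-items.json
```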


The items are then visible as rows in the SQLite table (screenshot omitted).

Transact Write

Here is an update of 'Happy Day' and a delete of 'Call Me Today', as in the transact-write-items example:


cat > transact-items.json <<'JSON'
[
    {
        "Update": {
            "Key": {
                "Artist": {"S": "Acme Band"},
                "SongTitle": {"S": "Happy Day"}
            },
            "UpdateExpression": "SET AlbumTitle = :newval",
            "ExpressionAttributeValues": {
                ":newval": {"S": "Updated Album Title"}
            },
            "TableName": "MusicCollection",
            "ConditionExpression": "attribute_not_exists(Rating)"
        }
    },
    {
        "Delete": {
            "Key": {
                "Artist": {"S": "No One You Know"},
                "SongTitle": {"S": "Call Me Today"}
            },
            "TableName": "MusicCollection",
            "ConditionExpression": "attribute_not_exists(Rating)"
        }
    }
]

JSON

aws dynamodb transact-write-items \
 --transact-items file://transact-items.json \
 --return-consumed-capacity TOTAL \
 --return-item-collection-metrics SIZE



Scan

I have two items remaining: 'Scared of My Shadow', which has not been touched, and 'Happy Day', whose album title has been changed to 'Updated Album Title':

aws dynamodb scan --table-name MusicCollection --output table


Testing failures

I put back the initial data with 'Call Me Today', 'Happy Day' and 'Scared of My Shadow':

aws dynamodb batch-write-item --request-items file://request-items.json

I have these three items:

aws dynamodb scan --table-name MusicCollection --output text | grep ALBUMTITLE

ALBUMTITLE      Songs About Life
ALBUMTITLE      Somewhat Famous
ALBUMTITLE      Blue Sky Blues

To simulate something going wrong, I lock the row for 'Call Me Today', which is the one that the Transact Write should delete ('Somewhat Famous'):

sql
 begin transaction;
 select rangeKey from MusicCollection;
 update MusicCollection set rangeKey='x' where rangeKey like '%Today%';
 select rangeKey from MusicCollection;

I try my Transact Write:

aws dynamodb transact-write-items --transact-items file://transact-items.json

It fails with:

An error occurred (InternalFailure) when calling the TransactWriteItems operation (reached max retries: 2): The request processing has failed because of an unknown error, exception, or failure.

and the table is unchanged, with all three items still there:

aws dynamodb scan --table-name MusicCollection --output text | grep ALBUMTITLE

ALBUMTITLE      Songs About Life
ALBUMTITLE      Somewhat Famous
ALBUMTITLE      Blue Sky Blues

This is a simple test of atomicity, but it runs on software that is different from the real DynamoDB service.
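
To make this before/after comparison repeatable instead of reading the scan output by eye, the two scans can be saved to files and compared. This is a minimal sketch: `same_items` is a hypothetical helper, the file names are arbitrary, and the lines are sorted first on the assumption that a scan does not promise any particular item order.

```shell
# same_items FILE1 FILE2: succeed if both files contain the same lines,
# regardless of the order in which the lines appear.
same_items() {
  [ "$(sort "$1")" = "$(sort "$2")" ]
}

# Usage, in the interactive shell where the aws alias is defined:
#   aws dynamodb scan --table-name MusicCollection --output text > before.txt
#   aws dynamodb transact-write-items --transact-items file://transact-items.json
#   aws dynamodb scan --table-name MusicCollection --output text > after.txt
#   same_items before.txt after.txt && echo "table unchanged"
```

If the failed transaction had applied only one of its two operations, the second file would differ and the final message would not be printed.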

Can we test race conditions?

Why did I do that? The Transact API had been the subject of some discussion.

DynamoDB is a closed-source managed service, so it's not possible to look at its internals. The behavior I'm observing seems good, but how can we be certain?

By "certain," I mean:

  • making sure it works as designed and documented without any bugs
  • confirming that my understanding of the documentation is accurate
  • ensuring that a simple test case can be reproduced later

If you follow my blog, you know that's how I learn and explain things. I often revisit a past blog post and copy and paste the simple test to see if anything has changed with a new version.

So, how can you achieve the same with proprietary software that operates on a platform you don't have access to and has no internal documentation?

You can read the documentation and trust it, like Alex.

You can stress test and see what happens, like Rick.

I can try it on DynamoDB Local, as suggested by Pete.

So... that's what I did. But now I have to think about possible ways to reproduce race conditions in a small test case. I can do that with open-source software, like PostgreSQL (or YugabyteDB, the distributed SQL fork of PostgreSQL). I was able to do that with proprietary software like Oracle Database because you can download and run the same software that they use for their managed services. However, for AWS services that are not running an open-source product, this is not possible.

Experience and expertise are essential, and Alex, Rick, and Pete have proven to be reliable in this regard, but their experience isn't limited to troubleshooting live systems. I haven't had much insight into the inner workings when dealing with production problems due to time constraints. On the other hand, I have learned a great deal from reproducing these issues in a controlled environment, creating test cases, preparing demonstrations, leading training sessions, and exploring out of curiosity. You truly understand a topic when you can explain it, and grasp the fundamentals when you can demonstrate it. Experts accumulate experience from dealing with production issues and reproduce them to comprehend all the details. Even if you are using a black-box managed service, having the capability to execute the equivalent in a local environment is crucial.
