Introduction
Over the course of this series we've covered various aspects of the Data API for Amazon Aurora Serverless v2 with the AWS SDK for Java (statements, batch statements and transaction management), including cold and warm start optimization techniques when using AWS Lambda. We also compared its performance to DynamoDB and explored anomaly detection with Amazon DevOps Guru. In this part of the series we'd like to cover the logging and monitoring aspects of the Data API, which can help us troubleshoot our application.
Logging RDS Data API calls with AWS CloudTrail
Data API is integrated with AWS CloudTrail, a service that provides a record of actions taken by a user, role, or an AWS service in Data API. CloudTrail captures all API calls for Data API as events, including calls from the Amazon RDS console and from code calls to Data API operations. If you create a trail, you can enable continuous delivery of CloudTrail events to an Amazon S3 bucket, including events for Data API. Using the data collected by CloudTrail, you can determine the request that was made to Data API, the IP address from which it was made, who made it, when it was made, and additional details.
For Aurora PostgreSQL Serverless v2 and provisioned databases, the following Data API operations are logged to AWS CloudTrail as data events:
- BatchExecuteStatement
- BeginTransaction
- CommitTransaction
- ExecuteStatement
- RollbackTransaction
Data events are high-volume data-plane API operations that CloudTrail doesn't log by default. Additional charges apply for data events. For information about CloudTrail pricing, see AWS CloudTrail Pricing.
You can use the CloudTrail console, AWS CLI, or CloudTrail API operations to log these Data API operations. In the CloudTrail console, choose RDS Data API - DB Cluster for the Data event type.
The following example shows a CloudTrail log entry that demonstrates the ExecuteStatement operation for Aurora PostgreSQL Serverless v2 and provisioned databases. For these databases, all Data API events are data events where the event source is rdsdataapi.amazonaws.com and the event type is Rds Data Service.
{
    "eventVersion": "1.05",
    "userIdentity": {
        "type": "IAMUser",
        "principalId": "AKIAIOSFODNN7EXAMPLE",
        "arn": "arn:aws:iam::123456789012:user/johndoe",
        "accountId": "123456789012",
        "accessKeyId": "AKIAI44QH8DHBEXAMPLE",
        "userName": "johndoe"
    },
    "eventTime": "2019-12-18T00:49:34Z",
    "eventSource": "rdsdataapi.amazonaws.com",
    "eventName": "ExecuteStatement",
    "awsRegion": "us-east-1",
    "sourceIPAddress": "192.0.2.0",
    "userAgent": "aws-cli/1.16.102 Python/3.7.2 Windows/10 botocore/1.12.92",
    "requestParameters": {
        "continueAfterTimeout": false,
        "database": "**********",
        "includeResultMetadata": false,
        "parameters": [],
        "resourceArn": "arn:aws:rds:us-east-1:123456789012:cluster:my-database-1",
        "schema": "**********",
        "secretArn": "arn:aws:secretsmanager:us-east-1:123456789012:secret:dataapisecret-ABC123",
        "sql": "**********"
    },
    "responseElements": null,
    "requestID": "6ba9a36e-b3aa-4ca8-9a2e-15a9eada988e",
    "eventID": "a2c7a357-ee8e-4755-a0d0-aed11ed4253a",
    "eventType": "Rds Data Service",
    "recipientAccountId": "123456789012"
}
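When such log entries are delivered to S3, it can be handy to pick out only the Data API events and print a one-line summary of each. The following is a minimal sketch of that idea using only the Java standard library. The class name DataApiTrailFilter and its methods are hypothetical and not part of any AWS SDK, and the regex-based field extraction is a deliberate simplification (it returns the first occurrence of a string field); a real application would rather use a proper JSON parser.

```java
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the AWS SDK: filters CloudTrail log entries for Data API calls.
class DataApiTrailFilter {
    // Event source of Data API data events, as shown in the log entry above.
    static final String DATA_API_SOURCE = "rdsdataapi.amazonaws.com";

    // The five Data API operations that CloudTrail logs as data events.
    static final Set<String> DATA_API_OPERATIONS = Set.of(
            "BatchExecuteStatement", "BeginTransaction", "CommitTransaction",
            "ExecuteStatement", "RollbackTransaction");

    // Naive lookup: returns the first string value stored under the given field name.
    // Assumption: good enough for a sketch; use a JSON parser in production code.
    static Optional<String> field(String json, String name) {
        Pattern p = Pattern.compile("\"" + Pattern.quote(name) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher m = p.matcher(json);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // True if the entry comes from Data API and names one of its logged operations.
    static boolean isDataApiEvent(String json) {
        return field(json, "eventSource").filter(DATA_API_SOURCE::equals).isPresent()
                && field(json, "eventName").filter(DATA_API_OPERATIONS::contains).isPresent();
    }

    // One-line summary: when, which operation, and from which IP address.
    static String summarize(String json) {
        return String.join(" ",
                field(json, "eventTime").orElse("?"),
                field(json, "eventName").orElse("?"),
                "from " + field(json, "sourceIPAddress").orElse("?"));
    }

    public static void main(String[] args) {
        // Shortened version of the log entry above.
        String event = "{\"eventTime\": \"2019-12-18T00:49:34Z\", "
                + "\"eventSource\": \"rdsdataapi.amazonaws.com\", "
                + "\"eventName\": \"ExecuteStatement\", "
                + "\"sourceIPAddress\": \"192.0.2.0\"}";
        if (isDataApiEvent(event)) {
            System.out.println(summarize(event));
        }
    }
}
```

Note that the sql, database and schema values are masked in the log entry, so a summary like this can tell us who called which operation and when, but not which statement was executed.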
Monitoring RDS Data API queries with Performance Insights
If your Aurora cluster is running Aurora Serverless v2 or provisioned instances, you can use Performance Insights with RDS Data API.
With Data API, your Aurora cluster processes queries based on Data API calls that you submit from your application. Data API also performs some SQL statements as part of its own internal workings, such as canceling queries that exceed the timeout threshold. Both kinds of SQL operations are shown in Performance Insights statistics and charts.
For Data API queries that you submit to an Aurora cluster, the Host field in the Performance Insights dashboard is marked as RDS Data API. For Aurora PostgreSQL, the application_name field has the value rds-data-api. Look for these labels when you analyze database load using Top hosts or Top applications as a dimension.
All internal queries that Data API runs to manage database aspects such as the connection pool and query timeouts are annotated with the prefix RDS Data API, in the form of the SQL comment /* RDS Data API */. Example: /* RDS Data API */ select * from my_table; Look for this prefix when you analyze database load using Top SQL as a dimension.
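If we copy the statement texts from the Top SQL view for further analysis, the annotation described above makes it easy to separate the annotated statements from the rest. Here is a minimal sketch of such a classification in plain Java. The class DataApiSqlClassifier and its methods are hypothetical names chosen for illustration, and the input list is assumed to contain raw SQL texts that we collected ourselves.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: splits SQL texts by the Data API annotation prefix.
class DataApiSqlClassifier {
    // SQL comment that Data API prepends to its internal statements.
    static final String INTERNAL_PREFIX = "/* RDS Data API */";

    // True if the statement starts with the annotation (leading whitespace ignored).
    static boolean isInternal(String sql) {
        return sql.trim().startsWith(INTERNAL_PREFIX);
    }

    // Returns the statement without the annotation, e.g. for grouping identical SQL.
    static String withoutPrefix(String sql) {
        String s = sql.trim();
        return s.startsWith(INTERNAL_PREFIX)
                ? s.substring(INTERNAL_PREFIX.length()).trim()
                : s;
    }

    // Key true: annotated statements; key false: all other statements.
    static Map<Boolean, List<String>> partition(List<String> statements) {
        return statements.stream()
                .collect(Collectors.partitioningBy(DataApiSqlClassifier::isInternal));
    }

    public static void main(String[] args) {
        List<String> topSql = List.of(
                "/* RDS Data API */ select * from my_table;",
                "select id, name from products where id = 1");
        Map<Boolean, List<String>> parts = partition(topSql);
        System.out.println("Annotated: " + parts.get(true));
        System.out.println("Other:     " + parts.get(false));
    }
}
```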
See the example of the "Database Load" bar sliced by SQL statement for our sample application introduced in part 1 of our article series.
Conclusion
In this part of the series we covered logging RDS Data API calls with AWS CloudTrail and monitoring RDS Data API queries with Performance Insights, both of which can help us troubleshoot our application.
Sources: