Introduction
With the prevalence of generative AI (gen AI), I've been keeping abreast of AWS' AI offerings for the past while. My journey started with Amazon Q Business, a fully managed service for building gen AI assistants. While the idea is great, the service seems too basic as it stands today and lacks the advanced features needed to improve the user experience in practice.
I then ventured into more advanced use cases using Amazon Bedrock and went through many workshops such as Building with Amazon Bedrock and LangChain. The challenge I find is that these workshops still tend to be basic, and they don't answer my questions about complex use cases. I came to learn about agents while going through the LangChain literature, but developing a full workflow felt like a daunting task when my full-time job is DevOps, not software development. Things always seem either so simple that they don't provide enough business value, or so complex that they become too costly.
After attending a recent AWS PartnerCast webinar on building intelligent enterprise apps using gen AI on AWS, I learned about Agents for Amazon Bedrock and some new features recently added to the service. The service seems to be within the Goldilocks zone for my current skill set, so I decided to dive in head-first and learn all about it. I decided to build something realistic and figured that I should share my journey with folks in this blog post.
About Agents for Amazon Bedrock
Agents for Amazon Bedrock is a service that enables gen AI applications to execute multi-step tasks across company systems and data sources. It is effectively a managed service for agents and retrieval-augmented generation (RAG), which are common patterns to extend the capabilities of large language models (LLMs).
Agents for Amazon Bedrock assumes the complexity of orchestrating the interactions between different components in such workflows, which must otherwise be programmed into your gen AI application. While you can use frameworks such as LangChain or LlamaIndex to develop these workflows, Agents for Amazon Bedrock makes it much more efficient for common use cases. Agents can also integrate with knowledge bases to enable RAG, as shown in the following diagram from the AWS documentation:
Coming up with a basic but representative use case
To help with brainstorming ideas for an agent, I decided on these principles:
The idea must be practical and with real-life data.
Follow the KISS principle.
For inspiration on what type of agent I should build, I turned to the Public APIs GitHub repository, which has a curated list of free APIs. I narrowed my search to an API that does not require sign-up or an API key and returns useful information. I ultimately decided to use the Free Currency Exchange Rates API, which seemed promising upon some basic testing.
Naturally, the idea steered towards a forex rate assistant that helps users look up rates from the API. The API supports lookup by date; however, to keep it simple I decided to limit the lookup to only the latest rates for now. This also leaves some room for enhancing the agent later.
Requesting model access
Agents for Amazon Bedrock is a relatively new feature, so it is supported only in limited regions with limited model support. At the time of writing this blog post, it is only supported in US East (N. Virginia) (us-east-1) and US West (Oregon) (us-west-2) and only supports Anthropic models. We will use the us-west-2 region for our evaluation.
You should also be aware of the pricing for different Anthropic models. With the recent addition of the Claude 3 model family, Haiku emerges as highly competitive with great price-to-performance balance. Thus we will use Haiku as the model for our agent.
When you first use Amazon Bedrock, you must request access to the models. This can be done in the Amazon Bedrock console using the Model access page, which can be opened from the left menu. On that page, you will see the list of base models by vendor and their access status similar to the following:
To request access, do the following:
Click on the Manage model access button.
On the Request model access page, scroll down to the Anthropic models in the list.
If this is the first time you are requesting access to Anthropic models, you will be required to submit use case details. Click on the Submit use case details button to open the form, then fill it in as appropriate and click Submit.
Check the box next to the models to which you wish to request access. Since we might compare different Anthropic models, let's check the box next to Anthropic to request access to all of them. Lastly, click Request model access at the end of the page.
The access status should now show "In progress" and the request will only take a few minutes to be approved if all goes well. Once available, the access status should change to "Access granted".
Creating the OpenAPI schema for the currency exchange API
In our agent, we will be using an action group that defines an action that the agent can help the user perform by calling APIs via a Lambda function. Consequently, the action group in our agent requires the following:
An OpenAPI schema that provides the specifications of the API
A Lambda function to which the action group makes API requests
That is also to say, the Lambda function is effectively a "proxy" API that calls the actual APIs, which in our case is the free currency exchange rates API. Based on the API documentation, we know the following:
Since we will only support the latest exchange rate, the base URI for our API would be https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1.
We need to use the /currencies.min.json API, which gets the list of available currencies in minified JSON format. This helps minimize the number of tokens (and thus cost and limit) processed by the model.
We also need to use the /currencies/{code}.min.json API, which gets the currency exchange rates with {code} as the base currency.
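To make the two endpoints concrete, here is a minimal sketch of how their URLs could be assembled from the base URI. The helper names `currency_list_url` and `rates_url` are hypothetical and only for illustration; the base URI and paths are the ones listed above.

```python
import urllib.parse

# Base URI for the latest rates, as listed above
BASE_URI = "https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1"


def currency_list_url() -> str:
    """URL of the endpoint that lists all available currencies (hypothetical helper)."""
    return BASE_URI + "/currencies.min.json"


def rates_url(code: str) -> str:
    """URL of the endpoint that lists rates with `code` as the base currency (hypothetical helper)."""
    # The API uses lowercase codes in the path, so normalize and escape the input
    return BASE_URI + "/currencies/" + urllib.parse.quote(code.lower()) + ".min.json"
```

Either URL can then be fetched with `urllib.request.urlopen`, which is the approach the Lambda function later in this post takes.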
Since this API does not provide an OpenAPI schema, we need to create it ourselves. I figured that this might be a regular exercise if I start testing Bedrock agents with different APIs, so I started looking for a tool that can generate an OpenAPI schema, such as those listed in OpenAPI.Tools. One category of tools seems to use network traffic, often in the HAR format, to generate the OpenAPI schema. I tried OpenAPI DevTools, which is a Chrome extension; however, it did not work for the currency exchange rates API.
After wrestling with it for a bit and eventually giving up, I instead turned to ChatGPT to see if it is smart enough for the task. With my free plan, I asked ChatGPT 3.5 the following:
Can you generate the OpenAPI spec YAML from this API GET URL: https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1/currencies.min.json
To my surprise, it did generate a somewhat decent API spec:
While it is not usable as-is because the URL is missing the /v1 part and it is lacking some descriptions, it has almost everything that I need. However, it struck me as odd that the response has uppercase currency codes, which is NOT what the API returns. So I started a new ChatGPT session and asked the same question, only to get a very different spec:
At this point, I was certain that ChatGPT was not calling the API to generate the spec but relying on its own knowledge to generate an answer. It is probably hallucinating, but it is good enough as a starting point 🤷
I did the same for the other API and adjusted the spec using the Swagger Editor. Specifically, I added detailed descriptions that should help the agent understand how to use the APIs. The resulting OpenAPI YAML file is as follows:
openapi: 3.0.0
info:
  title: Currency API
  description: Provides information about different currencies.
  version: 1.0.0
servers:
  - url: https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1
paths:
  /currencies.min.json:
    get:
      description: |
        List all available currencies
      responses:
        '200':
          description: Successful response
          content:
            application/json:
              schema:
                type: object
                description: |
                  A map where the key refers to the three-letter currency code and the value to the currency name in English.
                additionalProperties:
                  type: string
  /currencies/{code}.min.json:
    get:
      description: |
        List the exchange rates of all available currencies with the currency specified by the given currency code in the URL path parameter as the base currency
      parameters:
        - in: path
          name: code
          required: true
          description: The three-letter code of the base currency for which to fetch exchange rates
          schema:
            type: string
      responses:
        '200':
          description: Successful response
          content:
            application/json:
              schema:
                type: object
                description: |
                  A map where the key refers to the three-letter currency code of the target currency and the value to the exchange rate to the target currency.
                additionalProperties:
                  type: number
                  format: float
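Since the descriptions are meant to help the agent understand the APIs, it can be handy to check that no operation is missing one. The following is a minimal sketch, assuming the YAML has already been loaded into a Python dictionary (here a trimmed, hand-written copy of the spec with the responses left out); `operations_without_description` is a hypothetical helper and not part of any AWS tooling.

```python
# Trimmed, hand-written copy of the spec as a Python dictionary (responses omitted)
SPEC = {
    "paths": {
        "/currencies.min.json": {
            "get": {"description": "List all available currencies"},
        },
        "/currencies/{code}.min.json": {
            "get": {
                "description": "List the exchange rates with the given currency as the base currency",
                "parameters": [{"in": "path", "name": "code", "required": True}],
            },
        },
    }
}

# Operation keys allowed in an OpenAPI 3.0 path item
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def operations_without_description(spec: dict) -> list:
    """Return (method, path) pairs whose operation has no non-empty description."""
    missing = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            if not str(operation.get("description", "")).strip():
                missing.append((method, path))
    return missing


print(operations_without_description(SPEC))
```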
Creating the agent
Now let's create the agent in the Amazon Bedrock console following the steps below:
Select Agents in the left menu.
On the Agents page, click Create Agent.
In the Create Agent dialog, enter the following information and click Create:
- Name: ForexAssistant
- Description: An assistant that provides forex rate information.
On the Agent builder page, enter the following information and click Save:
- Agent resource role: Create and use a new service role
- Select model: Anthropic, Claude 3 Haiku
- Instructions for the Agent: You are an assistant that looks up today's currency exchange rates. A user may ask you what the currency exchange rate is for one currency to another. They may provide either the currency name or the three-letter currency code. If they give you a name, you may first need to first look up the currency code by its name.
Note that I try to provide concise instructions for the agent to help it reason up front. Depending on the test results, we might need to adjust it later with more prompt engineering.
Creating the action group
While still in the agent builder, we will create the action group that calls our APIs. Let's perform the following steps:
In the Action groups section, click Add.
On the Create Action group page, enter the following information and click Create:
- Enter Action group name: ForexAPI
- Description: The currency exchange rates API
- Action group type: Define with API schemas
- Action group invocation: Quick create a new Lambda function
- Action group schema: Define via in-line schema editor
- In-line OpenAPI schema: Copy and paste the OpenAPI YAML from the previous section
After 15 seconds or so, you should receive a success message and be returned to the agent builder page. A dummy Lambda function should have been created, so our next step would be to add the logic to call the actual currency exchange rates API.
Updating the Lambda function to call the API
Let's go back into the action group page by clicking on the name of the action group (i.e. ForexAPI) in the list. In the edit page, click on the View button near the Select Lambda function field, which should take you to the function page in the Lambda console.
On the function page, you will see the code template that has been generated for you, which provides some basic processing of the input event and the response event.
After examining the input event format, we can see that the attributes we need to use are:
apiPath, which should provide the path to the API as defined in the OpenAPI YAML (namely /currencies.min.json or /currencies/{code}.min.json).
httpMethod, which should always be get in our case. We thus won't make use of this attribute directly in our example.
parameters, which we need to provide for the rate lookup API, which expects the code URI path parameter to be a three-letter currency code.
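The attributes above can be pulled out of the event with a few lines of Python. This is a minimal sketch, assuming the event has the shape used in the test events later in this post; `extract_request` is a hypothetical helper name.

```python
def extract_request(event: dict) -> tuple:
    """Return (api_path, params) from an action group input event.

    `params` maps each parameter name to its value, e.g. {"code": "usd"}.
    """
    api_path = event["apiPath"]
    # `parameters` is a list of {"name", "type", "value"} objects and may be absent
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}
    return api_path, params


# Sample event trimmed to the attributes discussed above
sample_event = {
    "apiPath": "/currencies/{code}.min.json",
    "httpMethod": "get",
    "parameters": [{"name": "code", "type": "string", "value": "usd"}],
}
api_path, params = extract_request(sample_event)
print(api_path, params)
```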
I will spare you the gory details of writing the Lambda function, so here is the code with some implementation details provided in comments:
import json
import urllib.parse  # urllib is available in Lambda runtime w/o needing a layer
import urllib.request

def lambda_handler(event, context):
    agent = event['agent']
    actionGroup = event['actionGroup']
    apiPath = event['apiPath']
    httpMethod = event['httpMethod']
    parameters = event.get('parameters', [])
    requestBody = event.get('requestBody', {})

    # Read and process input parameters
    code = None
    for parameter in parameters:
        if parameter["name"] == "code":
            # Just in case, convert to lowercase as expected by the API
            code = parameter["value"].lower()

    # Execute your business logic here. For more information, refer to: https://docs.aws.amazon.com/bedrock/latest/userguide/agents-lambda.html
    apiPathWithParam = apiPath
    # Replace URI path parameters
    if code is not None:
        apiPathWithParam = apiPathWithParam.replace("{code}", urllib.parse.quote(code))

    # TODO: Use an environment variable or Parameter Store to set the URL
    url = "https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1{apiPathWithParam}".format(apiPathWithParam=apiPathWithParam)

    # Call the currency exchange rates API based on the provided path and wrap the response
    apiResponse = urllib.request.urlopen(
        urllib.request.Request(
            url=url,
            headers={"Accept": "application/json"},
            method="GET"
        )
    )
    responseBody = {
        "application/json": {
            # Decode the raw bytes so that the body is a JSON-serializable string
            "body": apiResponse.read().decode("utf-8")
        }
    }

    action_response = {
        'actionGroup': actionGroup,
        'apiPath': apiPath,
        'httpMethod': httpMethod,
        'httpStatusCode': 200,
        'responseBody': responseBody
    }

    api_response = {'response': action_response, 'messageVersion': event['messageVersion']}
    print("Response: {}".format(api_response))

    return api_response
You can copy and paste this code into the editor and click Deploy to update it. At this point, we should test the Lambda function before returning to the Amazon Bedrock console. To do this, you can use the following event template to test the /currencies.min.json API (note that some irrelevant fields are omitted):
{
  "messageVersion": "1.0",
  "agent": {
    "name": "TBD",
    "id": "TBD",
    "alias": "TBD",
    "version": "TBD"
  },
  "inputText": "TBD",
  "sessionId": "TBD",
  "actionGroup": "TBD",
  "apiPath": "/currencies.min.json",
  "httpMethod": "get"
}
You should see the success response with the list of currencies:
You can then use the following event template to test the /currencies/{code}.min.json API:
{
  "messageVersion": "1.0",
  "agent": {
    "name": "TBD",
    "id": "TBD",
    "alias": "TBD",
    "version": "TBD"
  },
  "inputText": "TBD",
  "sessionId": "TBD",
  "actionGroup": "TBD",
  "apiPath": "/currencies/{code}.min.json",
  "httpMethod": "get",
  "parameters": [
    {
      "name": "code",
      "type": "string",
      "value": "usd"
    }
  ]
}
You should see the success response with the list of exchange rates from US dollar to other currencies:
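The two test events differ only in the API path and the optional parameter list, so they could also be generated with a small helper. This is a sketch only; `make_test_event` is a hypothetical name, and the "TBD" placeholders are the same ones used in the templates above.

```python
import json
from typing import Optional


def make_test_event(api_path: str, code: Optional[str] = None) -> dict:
    """Build a Lambda test event for the given API path (hypothetical helper)."""
    event = {
        "messageVersion": "1.0",
        "agent": {"name": "TBD", "id": "TBD", "alias": "TBD", "version": "TBD"},
        "inputText": "TBD",
        "sessionId": "TBD",
        "actionGroup": "TBD",
        "apiPath": api_path,
        "httpMethod": "get",
    }
    if code is not None:
        # Only the rate lookup API takes the `code` path parameter
        event["parameters"] = [{"name": "code", "type": "string", "value": code}]
    return event


# JSON text that can be pasted into the Lambda console as a test event
rates_event_json = json.dumps(make_test_event("/currencies/{code}.min.json", "usd"), indent=2)
print(rates_event_json)
```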
With the Lambda function verified, we can close the Lambda console and return to the Bedrock console to test the agent.
Testing the agent
It is imperative that we test the agent thoroughly to ensure that it provides accurate answers. Back in the agent builder, we need to click on the Prepare button to prepare the agent, which is required whenever the agent is changed. We can then test the agent using the built-in chat interface on the right side of the console with the following prompt:
What is the forex rate from US Dollar to Japanese Yen?
Interestingly, I got the following response from the agent:
Sorry, I do not have the capability to look up the current forex rate from US Dollar to Japanese Yen. I can only provide a list of available currencies, but cannot retrieve the specific exchange rate you requested.
⚠ When I was validating the solution from scratch, the agent was able to return the correct answer. This could be caused by the model parameters that affect the variability of responses, among other things - the model is a bit of a black box after all! If you cannot reproduce this problem, try a few prompt sessions and ask the same question.
This seemingly implies that the agent only knows of one API but not the other. So we need to troubleshoot the problem, which is where the ever-important trace feature comes into play. The trace helps you follow the agent's reasoning that leads it to the response it gives at that point in the conversation.
When we show the trace using the link below the agent response, we can see the traces for each orchestration step. There are four traces under the Orchestration and knowledge base tab:
Trace step 1 indicates the agent's rationale of first getting the currency code from the list and then calling the /currencies/{code}.min.json API to get the rate, which seems correct. It is also able to call the /currencies.min.json API to get the list of currencies to look up the code. So far so good.
Trace step 2 indicates that it was able to get the currency code for US Dollar as USD; however, we are not sure why it's in uppercase. It also indicates that get::ForexAPI::/currencies/USD.min.json is not a valid function, which is not true. The logic behind the decision is unclear.
Trace step 3 indicates that it is calling the /currencies.min.json API again for whatever reason. Lastly, trace step 4 indicates that it cannot get the currency exchange rate and therefore gave up with the response we saw in the chat.
Since an LLM is for the most part a black box, we unfortunately won't likely be able to get to the root cause. The only wild guess I could make is that the .min.json part is throwing it off because it doesn't resemble a normal RESTful API, so perhaps we can try to adjust the API specifications to remove that part.
Adjusting the API specs and re-testing
Let's make the adjustment in the OpenAPI YAML by stripping out the .min.json part from both API URLs:
openapi: 3.0.0
info:
  title: Currency API
  description: Provides information about different currencies.
  version: 1.0.0
servers:
  - url: https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1
paths:
  /currencies:
    get:
      description: |
        List all available currencies
      responses:
        '200':
          description: Successful response
          content:
            application/json:
              schema:
                type: object
                description: |
                  A map where the key refers to the three-letter currency code and the value to the currency name in English.
                additionalProperties:
                  type: string
  /currencies/{code}:
    get:
      description: |
        List the exchange rates of all available currencies with the currency specified by the given currency code in the URL path parameter as the base currency
      parameters:
        - in: path
          name: code
          required: true
          description: The three-letter code of the base currency for which to fetch exchange rates
          schema:
            type: string
      responses:
        '200':
          description: Successful response
          content:
            application/json:
              schema:
                type: object
                description: |
                  A map where the key refers to the three-letter currency code of the target currency and the value to the exchange rate to the target currency.
                additionalProperties:
                  type: number
                  format: float
This will cause the agent to pass the API path without the .min.json part to the Lambda function in the event, so we need to append it when building the url variable before calling the currency exchange rates API. The resulting Lambda code is thus:
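The mapping from a path in the adjusted schema to the URL of the real API can be sketched as a small function. `upstream_url` is a hypothetical helper used only to illustrate the idea; the Lambda code in this post does the same thing inline.

```python
import urllib.parse
from typing import Optional

BASE_URI = "https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1"


def upstream_url(api_path: str, code: Optional[str] = None) -> str:
    """Map a schema path such as /currencies/{code} to the real API URL."""
    path = api_path
    if code is not None:
        # Substitute the path parameter, lowercased as expected by the API
        path = path.replace("{code}", urllib.parse.quote(code.lower()))
    # Re-append the suffix that was stripped from the schema paths
    return BASE_URI + path + ".min.json"


print(upstream_url("/currencies"))
print(upstream_url("/currencies/{code}", "USD"))
```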
import json
import urllib.parse  # urllib is available in Lambda runtime w/o needing a layer
import urllib.request

def lambda_handler(event, context):
    agent = event['agent']
    actionGroup = event['actionGroup']
    apiPath = event['apiPath']
    httpMethod = event['httpMethod']
    parameters = event.get('parameters', [])
    requestBody = event.get('requestBody', {})

    # Read and process input parameters
    code = None
    for parameter in parameters:
        if parameter["name"] == "code":
            # Just in case, convert to lowercase as expected by the API
            code = parameter["value"].lower()

    # Execute your business logic here. For more information, refer to: https://docs.aws.amazon.com/bedrock/latest/userguide/agents-lambda.html
    apiPathWithParam = apiPath
    # Replace URI path parameters
    if code is not None:
        apiPathWithParam = apiPathWithParam.replace("{code}", urllib.parse.quote(code))

    # TODO: Use an environment variable or Parameter Store to set the URL
    url = "https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1{apiPathWithParam}.min.json".format(apiPathWithParam=apiPathWithParam)

    # Call the currency exchange rates API based on the provided path and wrap the response
    apiResponse = urllib.request.urlopen(
        urllib.request.Request(
            url=url,
            headers={"Accept": "application/json"},
            method="GET"
        )
    )
    responseBody = {
        "application/json": {
            # Decode the raw bytes so that the body is a JSON-serializable string
            "body": apiResponse.read().decode("utf-8")
        }
    }

    action_response = {
        'actionGroup': actionGroup,
        'apiPath': apiPath,
        'httpMethod': httpMethod,
        'httpStatusCode': 200,
        'responseBody': responseBody
    }

    api_response = {'response': action_response, 'messageVersion': event['messageVersion']}
    print("Response: {}".format(api_response))

    return api_response
Once you have updated both the OpenAPI schema and the Lambda function, you can prepare and test the agent again. Interestingly, we now get a proper response:
The current forex rate from US Dollar (USD) to Japanese Yen (JPY) is 1 USD = 158.3147925 JPY.
To ensure that the number is from the API and not other sources, we can review the agent's rationale from the trace like before. In trace step 2, we can see the right rationale and the invocation of the /currencies/{code} API with USD as the code parameter (again, not sure why it's in uppercase) as expected:
Hurray, we have successfully built a basic forex rate assistant using Agents for Amazon Bedrock! Naturally, you should test the agent extensively since LLMs are sometimes unpredictable and may require adjustments.
💡 In a follow-up blog post How To Manage an Amazon Bedrock Agent Using Terraform, I provide details on how to automate the deployment of this solution using Terraform. Please feel free to read it or refer directly to the configuration in this repository.
Testing reveals more limitations
For the sake of experimentation, let's see what happens when we ask the assistant to do the reverse conversion. We will continue the conversation in the same chat session and enter the following prompt:
What about the other way around?
The agent responded with the following:
The current forex rate from Japanese Yen (JPY) to US Dollar (USD) is 1 JPY = 0.0063163 USD.
However, a quick check shows that the number is a bit off. The response from https://cdn.jsdelivr.net/npm/@fawazahmed0/currency-api@latest/v1/currencies/jpy.json (at the time of writing) shows 0.0063165291 which is also what I got from the calculator for 1 / 158.3147925. Again, we will need to review the trace to see what the agent is up to. The trace revealed that the agent is doing an inverse calculation but the calculation is incorrect for some reason:
My expectation is that the agent should do another lookup from the API to get the right number. If the API were developed for a business and has a spread between the two exchange rates for profit, the agent would have given the wrong information. Putting that aside, the calculation is simply wrong.
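To illustrate the point about a spread, here is a small worked example. The mid-market rate and the 1% spread are made-up numbers for illustration only and do not come from the API.

```python
from decimal import Decimal

# Hypothetical numbers for illustration only
mid_usd_to_jpy = Decimal("158")   # mid-market rate: 1 USD = 158 JPY
spread = Decimal("0.01")          # 1% margin taken in each direction

# Rates a business might quote after applying its margin
quoted_usd_to_jpy = mid_usd_to_jpy * (1 - spread)        # customer sells USD
quoted_jpy_to_usd = (1 / mid_usd_to_jpy) * (1 - spread)  # customer sells JPY

# What the agent would report if it simply inverted the first quote
inverted = 1 / quoted_usd_to_jpy

print(f"Quoted JPY to USD:   {quoted_jpy_to_usd:.10f}")
print(f"Inverted USD to JPY: {inverted:.10f}")
```

With a spread in place, the inverted figure is higher than the rate that would actually be quoted, which is why a second lookup is the safer behavior.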
After doing some reading online, it seems that LLMs in general are bad at math because they are designed to predict words, not to perform computations. So the exchange rate 0.0063163 might just be a prediction by Haiku based on the data that it was trained with.
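The quick check mentioned above is easy to reproduce outside of the model. The sketch below uses Python's `decimal` module with the rate the agent reported earlier in the conversation; nothing here calls the API.

```python
from decimal import Decimal

usd_to_jpy = Decimal("158.3147925")   # rate returned by the agent for USD to JPY
agent_answer = Decimal("0.0063163")   # the agent's answer for JPY to USD

# Exact inverse, rounded to 10 decimal places
inverse = (1 / usd_to_jpy).quantize(Decimal("0.0000000001"))
difference = inverse - agent_answer

print(inverse)      # 0.0063165291
print(difference)
```

The difference is small, but it confirms that the agent's figure is not the result of an actual division.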
Additional thoughts and summary
While we have built a functional forex rate assistant using Agents for Amazon Bedrock, it is certainly not production grade since it is not super accurate and it is a bit slow. Improving its accuracy is where the bulk of the effort for gen AI lies. AWS recommends the following strategies which developers should sequentially employ to improve their gen AI application:
For instance, my next iteration of improvement could start with adjusting the model inference parameters and prompt engineering, perhaps to ensure that the agent always calls the API instead of trying to do calculations. We also ought to look at why the LLM provides uppercase currency codes. Prompt engineering is admittedly more of an art and will require many rounds of trial and error, so be prepared for that.
I hope you have learned something new from this blog post and have a better understanding of the features, potential, and limitations of Agents for Amazon Bedrock. We are only scratching the surface here, so you are encouraged to use this forex agent as a starting point for more improvements or to develop your own agent. You would also need to expose the agent to end-users with a new frontend or an existing application. For me, the next step is to look into how to manage Bedrock agents using Terraform with the hot-off-the-press resources.
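As a starting point for exposing the agent to an application, the sketch below shows how it might be called through the `bedrock-agent-runtime` client in boto3. It assumes boto3 is installed and AWS credentials are configured; the agent ID and alias ID are placeholders that you would replace with your own values, and `collect_completion` and `ask_agent` are hypothetical helper names.

```python
import uuid


def collect_completion(events) -> str:
    """Concatenate the text chunks from an invoke_agent completion stream."""
    parts = []
    for event in events:
        chunk = event.get("chunk")
        if chunk:  # other event types, such as traces, are skipped
            parts.append(chunk["bytes"].decode("utf-8"))
    return "".join(parts)


def ask_agent(agent_id: str, agent_alias_id: str, question: str) -> str:
    """Send one question to the agent in a new session and return its answer."""
    import boto3  # imported lazily so the helper above works without boto3

    client = boto3.client("bedrock-agent-runtime", region_name="us-west-2")
    response = client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId=str(uuid.uuid4()),
        inputText=question,
    )
    return collect_completion(response["completion"])
```

A call such as `ask_agent("<agent ID>", "<alias ID>", "What is the forex rate from US Dollar to Japanese Yen?")` would then return the same kind of answer seen in the console chat.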
If you enjoyed this blog post, please be sure to check out other content related to AWS and DevOps in the Avangards Blog. Thanks for your time and have fun with gen AI!