If you ask ten developers to define an 'AI agent,' you'll get fifteen different answers. And if you ask a senior developer, they will say, "It depends."
The definition problem goes deeper than developer pedantry, though. Our industry shifted rapidly from rule-based systems to machine learning to large language models before we could agree on standard definitions. For example, in my last blog post, I wrote about our struggle to define 'Open Source AI.' We face a similar challenge with autonomous agents.
Many of our AI definitions came from theory. Since the 1950s, academia has been forward looking, defining terms for AI systems that people weren't using yet. Today, these systems are real and practical, but reality doesn't always match the theoretical frameworks. I've noticed the term "agent" applied to everything from workflows to Large Language Models (LLMs), so I asked my community for clarity. Kurt Kemple (@theworstdev), a Senior Director of Developer Relations at Slack, shared a perspective that resonates with me:
"This is a pretty big issue honestly! For me, autonomous agents are those that can take actions (with or without human interaction) as part of their response. Like a genAI app can generate text based on a question, an autonomous agent could also run code, kick off workflows, make API calls, etc. as part of the response to an event or interaction."
I also appreciated my coworker Max Novich's definition, which he describes in the video below.
Max says,
"So to me, autonomous, an agent is some form of software, not specifically AI, that can execute actions on your behalf, from a simple request to a more complex action. So, like, you don't have to hold its hand allough, like, every step of the way. Like, you know, you don't have to go, go open that, click that, go there. Like, okay, I need to log in into GitHub. Just go do right. And like, it can extrapolate what actions need to be taken, and it can actually take those actions."
To maintain clarity throughout this post, I'll define an autonomous agent as a tool that can execute operations without human intervention.
Introducing Goose
I'm experimenting with an AI developer agent called Goose. Many AI programming tools increase development speed, but Goose is unique because it's semi-autonomous. This means it independently executes tasks from start to finish but knows when to ask for human assistance.
You can give Goose a single prompt to develop a web application, perform a migration, or create a test suite, and it will handle everything else, from planning to execution, without needing more input from you.
By default, Goose:
Creates a plan
Shows you the plan
Executes the plan
In this context, a plan is your prompt broken down into a series of concrete steps. I think it’s particularly cool that Goose can retry or update its plan when a step fails.
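For example, a prompt like "add a test suite to my Flask app" might be broken down into a plan along these lines (a hypothetical illustration, not Goose's literal output):

```
1. Inspect the repository to locate the application code
2. Add pytest as a development dependency
3. Create a tests/ directory with shared fixtures
4. Write unit tests covering each route
5. Run the suite and fix any failing tests
```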
Here's a video example of Goose creating and executing a plan:
In a more advanced example, Max prompts Goose to browse the internet for him and do a little online shopping:
The Logic that Makes Goose Semi-Autonomous
Since Goose is open source, we can examine the repository to better understand its planning and execution capabilities. Please note that what I describe below is subject to change as Goose is still in its early stages.
Goose adopts a “Bring Your Own LLM” approach, meaning you can select and connect it to any of the LLM providers below:
Anthropic
Azure
Bedrock
Databricks
Google
Ollama
OpenAI
This flexibility allows developers to experiment with different models. To connect Goose to an LLM, you’ll need your personal API key, and you’ll need to configure a profile.yaml file.
If you wanted to use OpenAI’s GPT-4o mini, you might configure your profile.yaml to look like this:
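```yaml
# A sketch based on Goose's profile format at the time of writing;
# exact field names and options may change as the project evolves.
default:
  provider: openai
  processor: gpt-4o-mini    # model used for planning and heavier reasoning
  accelerator: gpt-4o-mini  # faster model used for lighter tasks
  moderator: passive
  toolkits:
    - name: developer
      requires: {}
```

Goose picks up your API key from the provider's usual environment variable (here, OPENAI_API_KEY).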
Goose connects to the LLM through a class called Exchange, which handles communication with the AI model. When you prompt Goose, it uses the ask_an_ai method to consult the LLM and create a plan. Goose then follows the plan by executing a series of shell commands.
Here's the flow: user writes a prompt → Goose communicates the goal to the LLM → the LLM and Goose work together to create a plan → Goose executes the plan via shell commands.
When a step fails or stalls, Goose may:
Tell the LLM about failures and ask for an updated plan
Let you know if commands run too long
Alert you when it can’t run certain commands because it needs elevated permissions
This is what makes Goose semi-autonomous: it works independently but knows when to ask for help.
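To make that loop concrete, here's a simplified, hypothetical sketch of the pattern in Python. This is not Goose's actual source; names like ask_llm_for_plan are stand-ins for the Exchange-based calls described above:

```python
import subprocess

TIMEOUT_SECONDS = 120  # arbitrary threshold for "running too long"

def run_plan(plan, ask_llm_for_plan):
    """Execute shell steps, replanning on failure and escalating when stuck."""
    while plan:
        step = plan.pop(0)
        try:
            result = subprocess.run(
                step,
                shell=True,
                capture_output=True,
                text=True,
                timeout=TIMEOUT_SECONDS,
            )
        except subprocess.TimeoutExpired:
            # Semi-autonomy: a long-running command is a reason to check in
            print(f"This step is taking a while, checking in with you: {step}")
            return
        if result.returncode != 0:
            if "permission denied" in result.stderr.lower():
                # Escalate instead of attempting to gain elevated permissions
                print(f"I need elevated permissions to run: {step}")
                return
            # Feed the failure back to the LLM and ask for an updated plan
            plan = ask_llm_for_plan(failed_step=step, error=result.stderr)
```

The key design choice is that failures become new context for the LLM rather than aborting the run, while timeouts and permission errors escalate to the human.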
Building Trust and Maintaining Control
As a developer, I initially had a few trust issues with Goose. I’m more accustomed to AI tools that let me check the work at each step of the way, so this was a workflow shift for me. I believe in reviewing an AI's work before committing the code, and I still do that with Goose. I already double-check my code and my coworkers' code, so AI is not an exception.
Creating User-Generated Plans
One option that gave me more control is creating a custom plan for Goose to follow instead of relying on Goose to create one. This works great when I know what needs to happen but don't want to do it all manually.
You can create a plan in a Markdown or YAML file. User-generated plans start with a kickoff message that provides Goose with some initial context.
Here’s an example kickoff message to guide a database migration:
```yaml
kickoff_message: |
  We're initiating a database migration to transfer data from the legacy
  database system to the new architecture. This migration requires us to
  ensure data integrity, minimize downtime, and verify successful migration
  at various stages.
```
Then, you can list the concrete steps of the plan, each delineated by a dash (-):
```yaml
tasks:
  - Backup the current database: Ensure all data is backed up to a secure location before starting the migration process.
  - Set up the new database environment: Deploy the necessary infrastructure and configurations for the new database system.
  - Export data from the legacy database: Use database export tools to create data dump files.
  - Transfer data files to the new system: Securely copy the data dump files to the environment of the new database.
  - Import data into the new database: Utilize database import tools to load the data into the new database structure.
  - Validate data integrity: Run checks to compare and verify that data in the new database matches the legacy database.
  - Update database connections in application: Modify application settings to point to the new database.
  - Monitor performance: Observe the new database's performance and configuration for any anomalies post-migration.
  - Document the migration process: Record detailed steps taken during the migration for future reference.
```
After you save the plan in a file, you can run it using the following command:
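(The CLI is still evolving, so check goose --help for the current syntax; this is the form I've used:)

```sh
goose session start --plan plan.yaml
```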
LLMs have changed the way we work, but they only scratch the surface of what’s possible. Technologists are pushing the boundaries, automating our workflows via agents.
Some people may not like AI, but I appreciate that AI has helped move my programming career forward. As a neurodivergent developer, AI developer tools like Goose help me stay focused and productive despite my ADHD. There's so much more we can do with AI to build tools that make coding more accessible!
I hope the open source nature of tools like Goose empowers developers to make coding more accessible. For example, my coworker Max Novich developed “Goose Talk to Me”, which lets developers use voice commands to work with code. This lowers the barrier for developers with visual impairments or limited mobility.
As we push the boundaries of what agents can do, we need to prioritize ethics and accessibility. And I invite you to get involved in the movement.
Get involved
Goose is still in its infancy, so there are many opportunities for you to help us improve it! (I also think it's cool to get involved with a project at the ground level.)
A voice interaction plugin for your goose. This project leverages a local copy of Whisper for voice interaction and transcription.
Project Description
Goose-Talk-To-Me is a project dedicated to enabling voice interactions using state-of-the-art AI technologies. It uses tools and libraries like goose-ai, openai-whisper, sounddevice, and others to provide seamless voice processing capabilities.
Features
Voice Interaction using goose-ai
Voice to text transcription
Real-time voice processing
Text to speech using pyttsx4
Requirements
Python >= 3.12
goose-ai
openai-whisper
sounddevice
soundfile
numpy
scipy
torch
numba
more-itertools
ffmpeg
pyttsx4
Installation
Install the dependencies and prepare your environment:
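The exact commands depend on your platform; based on the requirements list above, a typical setup might look like this (ffmpeg is a system dependency, so it comes from your OS package manager):

```sh
# System dependency (macOS; on Debian/Ubuntu use: sudo apt-get install ffmpeg)
brew install ffmpeg

# Python dependencies from the requirements list above (requires Python >= 3.12)
pip install goose-ai openai-whisper sounddevice soundfile numpy scipy \
    torch numba more-itertools pyttsx4
```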
block-open-source.github.io/goose-plugins is our community content showcase for Goose. We've created this hub of content creation tasks so the community can share their experiences and help others learn about Goose.
🤝 Pick ONE of the following issues to contribute to this project
❗You must only assign yourself one task at a time to give everyone a chance to participate.❗
You may assign yourself your next task after your current task is reviewed & accepted.
🚫 You must not steal an issue assigned to another person. If you submit a PR for an issue not assigned to you, you will not receive points. 🚫