Agentbrisk
Python, MIT, multi-agent, orchestration

Agency Swarm

Multi-agent framework built on OpenAI Assistants API with role-based agents and structured communication


Agency Swarm is a Python multi-agent framework built on top of the OpenAI Assistants API. You define agents with specific roles and tools, then wire them into an Agency that controls which agents can talk to which. The framework handles thread management, tool execution, and the Assistants API boilerplate so you can focus on defining what each agent does and who it can hand off to.

Most multi-agent frameworks make you choose between flexibility and simplicity. Agency Swarm makes a different bet: go deep on one provider (OpenAI), use the Assistants API's native capabilities aggressively, and give developers a clean way to define who talks to whom. For teams already committed to OpenAI, that bet pays off. For everyone else, the coupling is the ceiling.

I want to be honest about the audience here. Agency Swarm is not trying to compete with CrewAI or AutoGen across all use cases. It's specifically designed for OpenAI Assistants API workflows, and if that's where you're operating, it does that job well.

What Agency Swarm is

Agency Swarm was created by Arsenii Shatokhin (GitHub: VRSEN) in 2023 as a personal project to make building multi-agent systems with OpenAI's Assistants API less painful. The OpenAI Assistants API was released in late 2023 and brought persistent threads, built-in code interpreter, and native file handling to OpenAI's models. The problem was that building multi-agent systems directly on the Assistants API required a lot of boilerplate: thread management, run polling, tool dispatching, and message handling all had to be managed manually.

Agency Swarm abstracts that boilerplate. You define agents as Python classes with instructions and tools. You wire them into an Agency. The framework handles the rest: creating threads, polling run status, routing messages between agents, and dispatching tool calls.

The project sits at roughly 5,200 GitHub stars, which is modest compared to CrewAI or AutoGen but represents a dedicated community building specifically on this tech stack. The framework is MIT-licensed and free to use.

Core concepts

Agents

An agent in Agency Swarm is a Python class that inherits from Agent. You define its name, description, instructions, and tools. Instructions are a string or path to a text file describing the agent's role and behavior. Tools are Python classes that inherit from BaseTool and define what the agent can do.

from agency_swarm import Agent
from agency_swarm.tools import CodeInterpreter

class Analyst(Agent):
    def __init__(self):
        super().__init__(
            name="Analyst",
            description="Analyzes data and produces reports.",
            instructions="You analyze data files and produce clear summaries.",
            tools=[CodeInterpreter],
            model="gpt-4o",
        )

The CodeInterpreter tool is a first-class built-in that maps directly to the OpenAI Assistants API's code interpreter. You don't have to set up a sandbox, manage execution, or handle output parsing. The OpenAI API does it natively.

Tools

Custom tools inherit from BaseTool, which is a Pydantic model. You define the tool's inputs as class fields, then implement a run method that executes the logic.

from agency_swarm.tools import BaseTool
from pydantic import Field

class SearchDatabase(BaseTool):
    query: str = Field(..., description="The search query to run against the database.")

    def run(self):
        # Your search logic goes here; `db` is a placeholder for your own database client.
        results = db.search(self.query)
        return str(results)

The Pydantic model approach has a real advantage: the field descriptions become the parameter descriptions sent to the OpenAI function calling API. The LLM sees well-described parameters and makes better tool calls as a result. It's a cleaner pattern than writing JSON schemas by hand.
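To make that mechanism concrete, here is a stdlib-only sketch of the idea. It is not Agency Swarm's or Pydantic's actual code: it mimics a described field with a dataclass and builds the kind of function-calling schema the model ends up seeing. The helpers `described` and `tool_schema` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field, fields


def described(description: str):
    """Hypothetical stand-in for pydantic.Field(..., description=...)."""
    return field(metadata={"description": description})


@dataclass
class SearchDatabase:
    """Searches the database for matching records."""

    query: str = described("The search query to run against the database.")


# Map Python annotations to JSON Schema type names (only what this sketch needs).
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_schema(tool_cls) -> dict:
    """Build a function-calling style schema from a dataclass's fields."""
    properties = {
        f.name: {
            "type": JSON_TYPES[f.type],
            "description": f.metadata["description"],
        }
        for f in fields(tool_cls)
    }
    return {
        "name": tool_cls.__name__,
        "description": (tool_cls.__doc__ or "").strip(),
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": list(properties),
        },
    }


schema = tool_schema(SearchDatabase)
print(schema["parameters"]["properties"]["query"]["description"])
```

The point is that the description lives next to the field it describes, so the schema and the code cannot drift apart the way a hand-written JSON schema can.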

The Agency

The Agency is the container that puts agents together and defines who can talk to whom.

from agency_swarm import Agency

# CEO and Developer are Agent subclasses defined the same way as Analyst above.
ceo = CEO()
analyst = Analyst()
developer = Developer()

agency = Agency(
    [
        ceo,                    # CEO is the entry point (user talks to CEO)
        [ceo, analyst],         # CEO can message Analyst
        [ceo, developer],       # CEO can message Developer
        [analyst, developer],   # Analyst can message Developer
    ],
    shared_instructions="All agents operate with professional communication."
)

The communication chart is the key design decision in Agency Swarm. The list of pairs defines a directed graph: agent A can send messages to agent B if [A, B] appears in the list. Agents not connected in this chart cannot communicate directly. This explicit topology prevents the chaotic all-to-all communication patterns that can cause infinite loops in less structured multi-agent frameworks.

The first element in the agency list that isn't a pair is the entry point: the agent that receives the initial user message. In the example above, the CEO receives user input and decides whether to handle it or delegate to Analyst or Developer.
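If it helps to see the chart as data, the following sketch parses the same list shape into entry points and directed edges. It uses plain strings in place of Agent instances, and `parse_chart` and `can_message` are hypothetical helpers written for this article; the framework's internal parsing may well differ.

```python
def parse_chart(chart):
    """Split an agency chart into entry points and directed edges.

    Agents are represented by plain strings here; in Agency Swarm they
    would be Agent instances.
    """
    entry_points = []
    edges = set()
    for item in chart:
        if isinstance(item, (list, tuple)):
            sender, recipient = item
            edges.add((sender, recipient))
        else:
            entry_points.append(item)
    return entry_points, edges


def can_message(edges, sender, recipient):
    """True only if the chart contains the directed pair [sender, recipient]."""
    return (sender, recipient) in edges


chart = [
    "ceo",
    ["ceo", "analyst"],
    ["ceo", "developer"],
    ["analyst", "developer"],
]
entry_points, edges = parse_chart(chart)

print(entry_points)                              # ['ceo']
print(can_message(edges, "ceo", "analyst"))      # True
print(can_message(edges, "developer", "ceo"))    # False: edges are one-way
```

Because the pairs are directed, letting the Developer report back to the CEO on its own initiative would need an explicit [developer, ceo] entry.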

How Assistants API integration works

This is the part that differentiates Agency Swarm from frameworks that use chat completions directly.

The Assistants API is stateful. It maintains a thread (a conversation history) that persists between calls. Runs are asynchronous: you submit a run, poll for its status, and collect the result when it completes. Tool calls happen within a run: the API responds with a list of tools it wants to call, you execute them and submit the results, and the run continues.

Agency Swarm manages all of this automatically. When you call agency.run_demo() or agency.get_completion(), the framework:

  1. Creates an OpenAI thread if one doesn't exist for this conversation
  2. Creates a run on the appropriate assistant
  3. Polls for run status with exponential backoff
  4. Dispatches tool calls by instantiating the corresponding BaseTool classes and calling run()
  5. Submits tool outputs back to the run
  6. Returns the completed message when the run finishes
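The steps above can be sketched as a small polling loop. This is an illustration under stated assumptions, not the framework's implementation: `FakeRun` is a hypothetical stand-in that replays scripted statuses instead of calling OpenAI, tools are plain functions rather than BaseTool classes, and the delay values are arbitrary. The status strings are the ones the Assistants API reports for runs.

```python
import time


class FakeRun:
    """Hypothetical stand-in for an Assistants API run; makes no network calls."""

    def __init__(self, statuses, tool_calls):
        self._statuses = iter(statuses)
        self.tool_calls = tool_calls      # list of (tool_name, kwargs)
        self.tool_outputs = None
        self.final_message = "done"

    def refresh(self):
        # Return the next scripted status, as a real client would after polling.
        return next(self._statuses)

    def submit_tool_outputs(self, outputs):
        self.tool_outputs = outputs


def drive_run(run, tools, sleep=time.sleep, base_delay=0.5, max_delay=8.0):
    """Poll a run until it completes, dispatching tool calls along the way."""
    delay = base_delay
    while True:
        status = run.refresh()
        if status == "completed":
            return run.final_message
        if status == "requires_action":
            # Look up each requested tool by name, call it, and send results back.
            outputs = [str(tools[name](**kwargs)) for name, kwargs in run.tool_calls]
            run.submit_tool_outputs(outputs)
            delay = base_delay                # progress was made, so reset the backoff
            continue
        if status in ("failed", "cancelled", "expired"):
            raise RuntimeError(f"run ended with status {status!r}")
        sleep(delay)                          # still queued or in progress
        delay = min(delay * 2, max_delay)     # exponential backoff with a cap


def add(a, b):
    return a + b


delays = []  # record requested sleeps instead of actually waiting
run = FakeRun(
    statuses=["queued", "in_progress", "requires_action", "in_progress", "completed"],
    tool_calls=[("add", {"a": 2, "b": 3})],
)
result = drive_run(run, tools={"add": add}, sleep=delays.append)
print(result, run.tool_outputs, delays)
```

Passing `sleep` in as a parameter keeps the loop testable without real waiting, which is a design choice of this sketch rather than something the source describes.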

When one agent needs to send a message to another, the framework uses a special SendMessage tool that the LLM calls to initiate inter-agent communication. The receiving agent's assistant processes the message in its own thread and returns a response.
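A toy model makes the routing rule easier to picture. In this sketch agents are plain callables, `MiniAgency` is a hypothetical class, and keeping one history per sender and recipient pair is an assumption of the illustration, not a statement about how the framework stores threads.

```python
from collections import defaultdict


class MiniAgency:
    """Toy model of chart-checked messaging; agents are plain callables."""

    def __init__(self, agents, edges):
        self.agents = agents                  # name -> function(message) -> reply
        self.edges = set(edges)               # allowed (sender, recipient) pairs
        self.threads = defaultdict(list)      # one history per (sender, recipient)

    def send_message(self, sender, recipient, message):
        # Refuse any message that the communication chart does not allow.
        if (sender, recipient) not in self.edges:
            raise PermissionError(f"{sender} may not message {recipient}")
        thread = self.threads[(sender, recipient)]
        thread.append((sender, message))
        reply = self.agents[recipient](message)
        thread.append((recipient, reply))
        return reply


agency = MiniAgency(
    agents={"developer": lambda msg: f"Developer received: {msg}"},
    edges=[("ceo", "developer")],
)
reply = agency.send_message("ceo", "developer", "Write a parser.")
print(reply)
```

In the real framework the model itself decides when to call SendMessage; here the call is made by hand to show what the tool has to check and record.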

The thread persistence means that agents accumulate context across a conversation. A CEO agent that delegated to an analyst yesterday can reference that conversation today because the Assistants API thread is still there. This is a genuine capability advantage over stateless chat completion patterns.

The built-in demo UI

Agency Swarm includes a Gradio-based demo interface. Call agency.demo_gradio() and a chat UI opens in your browser where you can talk to the entry point agent, see agent-to-agent communication in a thread view, and test the system interactively.

agency.demo_gradio(server_name="0.0.0.0", server_port=7860)

This is useful for rapid validation and for sharing demos with stakeholders who don't want to run Python. It's not a production UI, and you shouldn't use it as one, but for "show me it works" purposes it delivers immediately.

The pricing reality

Agency Swarm is MIT-licensed and free. What costs money is the OpenAI Assistants API.

Assistants API pricing is per token, same as chat completions. But the Assistants API has some cost considerations that aren't obvious at first:

  • Thread storage: OpenAI charges for storing thread content. Long conversations in persistent threads accumulate storage costs.
  • Run overhead: Each run has a minimum context: the system prompt, the thread history, and the tool definitions. Complex agencies with many tools per agent can have expensive system contexts.
  • Code interpreter: Using the code interpreter tool adds $0.03 per session (as of early 2026). For workflows that frequently invoke it, that adds up.
  • File search: Vector store indexing for file search has separate per-GB and per-query pricing.

In practice, for a CEO agent delegating to two specialist agents handling a few tasks per day, the costs are modest. For high-volume production workflows processing thousands of requests per day, model your actual costs before committing.
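A back-of-the-envelope model is enough to start. The sketch below is a hypothetical helper, and the token prices in the example call are placeholders, not real OpenAI prices; only the $0.03 code interpreter session figure comes from the list above. It ignores storage and file search costs entirely.

```python
def estimate_daily_cost(
    requests_per_day,
    input_tokens_per_request,
    output_tokens_per_request,
    input_price_per_million,
    output_price_per_million,
    code_interpreter_sessions_per_day=0,
    code_interpreter_price_per_session=0.03,
):
    """Rough daily spend in dollars: token charges plus code interpreter sessions."""
    token_cost = requests_per_day * (
        input_tokens_per_request * input_price_per_million
        + output_tokens_per_request * output_price_per_million
    ) / 1_000_000
    session_cost = code_interpreter_sessions_per_day * code_interpreter_price_per_session
    return token_cost + session_cost


cost = estimate_daily_cost(
    requests_per_day=5_000,
    input_tokens_per_request=6_000,   # system prompt + thread history + tool definitions
    output_tokens_per_request=500,
    input_price_per_million=1.00,     # placeholder, not a real price
    output_price_per_million=2.00,    # placeholder, not a real price
    code_interpreter_sessions_per_day=1_000,
)
print(f"Estimated daily cost: ${cost:.2f}")
```

Swap in the current prices for your model and your own measured token counts. The input side usually dominates, because every run resends the instructions, thread history, and tool definitions.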

Limitations

The tight OpenAI coupling is the most significant constraint. Agency Swarm was designed around OpenAI's Assistants API, and that API's specific capabilities (persistent threads, code interpreter, file search) are not available from Anthropic, Google, Mistral, or open-source providers. If you start building on Agency Swarm and later need to switch models for cost, policy, or capability reasons, you're rewriting the core of your system.

The community is smaller than those of the major frameworks. CrewAI has 51,000+ stars and an active enterprise product. AutoGen has 58,000+ stars and years of community contributions. Agency Swarm's roughly 5,200 stars mean you'll find fewer tutorials, fewer Stack Overflow answers, and fewer people who've hit the same edge case you're hitting.

Debugging is less transparent than in frameworks that keep conversation state in your own process. When something goes wrong in a multi-agent flow, the state is distributed across OpenAI threads that you have to inspect through the API or the OpenAI dashboard. The framework doesn't ship an integrated debugger or trace viewer.
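Inspecting a thread through the API takes only a few lines. The helper below is a sketch, not part of Agency Swarm: `dump_thread` is a hypothetical name, it assumes you already know the thread ID you want to look at, and it relies on the `openai` Python SDK's `client.beta.threads.messages.list` call. To keep the example runnable offline, it is exercised against a fake client built from `SimpleNamespace`.

```python
from types import SimpleNamespace as NS


def dump_thread(client, thread_id):
    """Return a thread's messages as 'role: text' lines, oldest first."""
    lines = []
    messages = client.beta.threads.messages.list(thread_id=thread_id, order="asc")
    for message in messages:
        for block in message.content:
            if block.type == "text":          # skip image and file blocks
                lines.append(f"{message.role}: {block.text.value}")
    return lines


# Offline stand-in for an OpenAI client, shaped like the objects the SDK returns.
fake_messages = [
    NS(role="user", content=[NS(type="text", text=NS(value="Summarize sales.csv"))]),
    NS(role="assistant", content=[
        NS(type="image_file"),
        NS(type="text", text=NS(value="Sales rose in Q3.")),
    ]),
]
fake_client = NS(
    beta=NS(threads=NS(messages=NS(list=lambda thread_id, order: fake_messages)))
)

transcript = dump_thread(fake_client, "thread_example")
print("\n".join(transcript))
```

With a real client you would pass `OpenAI()` from the `openai` package and an actual thread ID in place of the fake ones.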

Agency Swarm vs the alternatives

Agency Swarm vs CrewAI

CrewAI is a better default for most teams. It's model-agnostic, has a larger community, and the role-based mental model maps naturally to multi-agent orchestration. Agency Swarm wins on Assistants API-specific features: thread persistence, native code interpreter, and built-in file handling. If you're heavily invested in OpenAI and those specific capabilities matter to your use case, Agency Swarm is worth evaluating. Otherwise, CrewAI is the safer bet.

Agency Swarm vs AutoGen

AutoGen is more flexible and supports async concurrent execution, multiple model providers, and Docker-based code execution. The group chat model is more expressive than Agency Swarm's explicit communication topology, but also harder to reason about. Agency Swarm's topology definition is cleaner for workflows where you know exactly which agents should talk to which. AutoGen is better for research-style workflows where emergent agent interaction is part of the point.

Agency Swarm vs OpenAI Swarm

OpenAI Swarm is a lightweight experimental library from OpenAI for agent handoffs, built on the chat completions API rather than the Assistants API. It's even simpler than Agency Swarm but much more limited: no persistent threads, no built-in tools, minimal orchestration support. Agency Swarm has more structure and production-oriented features. OpenAI Swarm is better treated as reference code than a production framework.

Getting started

pip install agency-swarm

Set your API key:

export OPENAI_API_KEY=sk-...
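Before going further, a quick preflight check can save a confusing first error. This is an optional sketch with a hypothetical `preflight` helper: it only verifies the two prerequisites this article mentions, the OPENAI_API_KEY environment variable and Python 3.10 or later.

```python
import os
import sys


def preflight(env=os.environ, version=sys.version_info):
    """Return a list of setup problems; an empty list means ready to go."""
    problems = []
    if version < (3, 10):
        problems.append("Python 3.10 or later is required")
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


for problem in preflight():
    print("setup problem:", problem)
```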

A minimal two-agent agency:

from agency_swarm import Agency, Agent

ceo = Agent(
    name="CEO",
    description="Manages the team and delegates tasks.",
    instructions="You manage the team. Delegate technical tasks to the Developer.",
    model="gpt-4o-mini",
)

developer = Agent(
    name="Developer",
    description="Writes and reviews code.",
    instructions="You write clean Python code for the tasks delegated to you.",
    model="gpt-4o-mini",
)

agency = Agency([ceo, [ceo, developer]])
agency.demo_gradio()

Run it, open the browser UI, and ask the CEO to build something. Watch it delegate to the Developer and return the result. From here, add custom tools by subclassing BaseTool, extend to more agents, and define more communication edges.

The official docs at vrsen.github.io/agency-swarm/ cover tool creation, thread management, and advanced patterns including shared file access between agents.

Who it's for

Agency Swarm is the right choice for a specific audience: Python developers building multi-agent systems who are committed to the OpenAI ecosystem and want the Assistants API's stateful capabilities without writing the plumbing themselves.

If you need thread persistence across sessions, built-in code execution, and file handling, and you're comfortable with OpenAI's pricing and API stability, Agency Swarm removes a lot of boilerplate and gives you a clean abstraction for multi-agent communication flows.

If you need model flexibility, a larger community, or more sophisticated orchestration patterns, CrewAI or AutoGen will serve you better. The explicit communication topology is one of Agency Swarm's genuinely good ideas, but the rest of the ecosystem around those frameworks is substantially larger.

Key features

  • Role-based agent definitions with instructions, tools, and personality
  • Agency class that manages communication flows between agents
  • Structured agent-to-agent messaging with explicit routing rules
  • OpenAI Assistants API integration with persistent thread management
  • Code interpreter, file search, and custom function tools built in
  • Gradio-based demo UI included out of the box
  • Agent memory via OpenAI thread persistence across sessions

Frequently Asked Questions

What is Agency Swarm?

Agency Swarm is a Python framework for building multi-agent systems on top of the OpenAI Assistants API. You define agents as roles with specific instructions, tools, and capabilities. You then group them into an Agency that specifies which agents can communicate with which. The Agency class manages conversation threads, routes messages between agents, and handles the Assistants API boilerplate. It was created by Arsenii Shatokhin (VRSEN) and has around 5,000 GitHub stars as of 2026.

Does Agency Swarm work with models other than OpenAI?

Agency Swarm is fundamentally built on the OpenAI Assistants API, which is proprietary to OpenAI. The framework uses Assistants-specific features like persistent threads, built-in code interpreter, and file search, which are not available from other providers. Using it with Anthropic, Google, or open-source models would require replacing the core execution layer, at which point you'd be better served by a model-agnostic framework like CrewAI or AutoGen.

How does Agency Swarm compare to CrewAI?

Both use role-based agent definitions, but they diverge in execution. Agency Swarm runs on the OpenAI Assistants API, which gives you persistent threads, built-in file handling, and native code interpreter support. CrewAI uses direct chat completions and works with any LLM provider. Agency Swarm's Assistants API integration is a genuine convenience if you're already on OpenAI and need thread persistence. CrewAI is more flexible across providers and has a larger community. If you might ever want to use a non-OpenAI model, CrewAI is the safer choice.

What is an Agency in Agency Swarm?

An Agency is the top-level container that defines which agents exist and how they can communicate. You create an Agency by passing a list of agents and a communication chart: a list of pairs that specifies which agent can send messages to which. For example, [[ceo, developer], [ceo, analyst]] means the CEO agent can message the Developer and Analyst, but Developer and Analyst cannot message each other directly. This explicit topology prevents chaotic all-to-all communication that can cause loops in less structured frameworks.

Is Agency Swarm production-ready?

It depends on your definition of production. Agency Swarm works reliably for OpenAI-backed workflows, and teams have shipped real systems with it. The limitations to watch: the OpenAI Assistants API has its own reliability and latency characteristics, costs can be higher than equivalent chat completion flows, and the framework's smaller community means fewer answers to edge-case problems. For production systems that need advanced observability, plan to add external logging. If you need multi-provider flexibility, consider whether the Assistants API thread model fits your use case before committing.

How do I install Agency Swarm?

Install it with pip: pip install agency-swarm. You'll also need an OpenAI API key set as the OPENAI_API_KEY environment variable. The framework requires Python 3.10 or later. The quickstart in the docs at vrsen.github.io/agency-swarm/ walks you through creating your first agent and agency in about 30 lines of code.