
Over the last few years, generative AI and agent-based systems have stirred up tremendous interest among developers and entrepreneurs alike. From conversation-driven virtual assistants to specialized coding agents, more products today are built around one core insight: let the machine reason in a human-like way, then let it take actions in the world to achieve a user’s or company’s goal. At Moonsong Labs, we’ve been researching and building extensively at the intersection of AI and Web3, aiming to create a future in which these AI agents not only reason effectively, but also engage autonomously with decentralized infrastructures and marketplaces to unlock new types of services. The action spaces of these AI agents, meaning the sets of possible actions they can take, determine how powerful and flexible they become. If we want agents that can truly help us solve problems—whether they involve analyzing data, coordinating enterprise systems, or even sending blockchain transactions—we need to give them the right repertoire of capabilities, plus ways to discover and use new ones safely.

In this blog, we’ll walk through how we define an AI agent’s action space, why it matters, and how it’s evolving. We’ll explore the types of actions that agents can take, highlight what it takes for developers to build them, and discuss why “dynamic” or “discoverable” action spaces are likely the next big frontier. Finally, we will discuss how these evolving action spaces will power a new form of digital commerce, a concept we at Moonsong Labs are calling the Agent Economy. Throughout, we will also reflect on how this ties into Web3’s broader properties, such as digital ownership and trust-minimized interaction and coordination, and why those properties are so relevant to advanced agentic systems.

Defining the Action Space of an AI Agent

Most people are now familiar with “prompting” a large language model (LLM), like ChatGPT, by giving it text instructions and having it respond with more text. That is an example of an empty action space: the LLM can only produce text; it can’t do anything else. Even so, generating text is often enough for a user to glean insights or summarize a new idea. But if we want a more capable AI system—one that can take steps on our behalf, query data in our databases, or invoke a specific piece of software—a single action of “reply with text” is too limiting.

An AI agent’s action space is essentially the set of “moves” it can execute in the real (or digital) world. Each distinct action is something the agent can do intentionally to further its goals, from “fetch customer information from a CRM” to “execute a transaction on a blockchain.” A well-defined action space is crucial: it determines how the agent can help. By giving an agent the ability to do more (for instance, send an email, parse web pages, or generate and run code), you significantly expand its utility, but you also expand potential risks, such as the agent picking the wrong tool among a large list of candidates.

When we talk about an “agentic loop,” we mean the agent’s cycle of (1) perceiving the environment, (2) reasoning about what to do, (3) taking actions, and then (4) evaluating results. The final step leads the agent to update its plan and act again, continuing until it meets the user’s goal or fails in a defined, safe manner. The actions it can take shape what is feasible within that loop. An agent that can only produce text is stuck in a more static feedback cycle: its only new observations come from external user input, or from self-reflection that externalizes a scaffold of thoughts the agent can then use to augment its reasoning as it works through tasks. An agent that can call specialized tools or manipulate code dynamically can transform the entire problem-solving process. This shift is part of why AI “agents” feel qualitatively different from simple chatbots: they don’t just talk, they do.
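To make the loop concrete, here is a minimal sketch in Python. Everything in it is illustrative: `choose_action` stands in for the LLM’s reasoning step, and the actions are plain callables rather than real tools.

```python
def run_agent_loop(state, actions, goal_reached, choose_action, max_steps=10):
    """Drive the perceive -> reason -> act -> evaluate cycle.

    actions: dict mapping action names to callables(state) -> new state
    choose_action: stand-in for the LLM's reasoning step; returns an action name
    goal_reached: predicate checked each cycle (the 'evaluate' step)
    """
    for _ in range(max_steps):
        if goal_reached(state):            # evaluate
            return state, True
        name = choose_action(state)        # reason (an LLM call in a real system)
        state = actions[name](state)       # act
    return state, False                    # fail in a defined, safe manner

# Toy example: a single "increment" action, goal of reaching 3
actions = {"increment": lambda s: s + 1}
final, ok = run_agent_loop(0, actions, lambda s: s >= 3, lambda s: "increment")
print(final, ok)  # 3 True
```

The `max_steps` cap is the “fails in a defined, safe manner” part of the loop: a real agent needs an explicit exit condition, not an open-ended retry.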

Different Kinds of Actions

Although we talk about “agent action spaces” in general, there are many types of actions. Some come baked into existing agent development frameworks. Others are custom-coded. Some are as simple as an API call; others might entail orchestrating multi-step protocols across different software systems.

Registries of Built-in Actions

Agent-building platforms such as LangChain or other tool-augmented frameworks often have registries of built-in actions like “web search,” “send a message to Slack,” or “retrieve records from a vector database.” We can use these actions to get an agent quickly up and running. But as we begin to expect more specialized tasks, we inevitably outgrow these standard actions and need to develop custom ones that are relevant to our use case.
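A registry like this can be as simple as a map from action names to callables plus descriptions the LLM can read. The sketch below is a hypothetical, framework-agnostic version of the idea (LangChain and similar frameworks expose richer equivalents); the search and Slack implementations are stubs, not real API calls.

```python
REGISTRY = {}

def register_action(name, description):
    """Decorator that adds a function to the agent's built-in action registry."""
    def wrap(fn):
        REGISTRY[name] = {"fn": fn, "description": description}
        return fn
    return wrap

@register_action("web_search", "Search the web and return result snippets")
def web_search(query: str) -> list[str]:
    return [f"stub result for {query!r}"]  # stand-in for a real search API

@register_action("send_slack_message", "Post a message to a Slack channel")
def send_slack_message(channel: str, text: str) -> bool:
    return True  # stand-in for a real Slack client call

# The agent (or its system prompt) can now enumerate what it is allowed to do:
print(sorted(REGISTRY))  # ['send_slack_message', 'web_search']
```

The `description` field matters as much as the function itself: it is what the LLM reads when deciding which action to invoke.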

Custom Actions

You might be building an AI assistant that integrates deeply with a supply-chain platform or that navigates an enterprise CRM. The built-in “web search” or “send tweet” actions won’t help if you actually need “read all order statuses from the ERP system” or “issue an inventory restock request.” Custom actions like these are crucial for specialization: if your agent can’t call those functions, it can’t do its job. And as tasks get more complex, the variety of needed custom actions grows rapidly.
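In practice, a custom action is usually a thin, well-validated wrapper over an internal system. The sketch below is hypothetical: `erp_client` and its `get_order` method stand in for whatever SDK or HTTP interface your ERP actually exposes.

```python
def read_order_statuses(erp_client, order_ids):
    """Custom action: fetch statuses for a batch of orders from an ERP system.

    erp_client only needs to expose get_order(order_id) -> dict here;
    a real implementation would also handle auth, retries, and errors.
    """
    statuses = {}
    for oid in order_ids:
        record = erp_client.get_order(oid)
        statuses[oid] = record.get("status", "unknown")
    return statuses

# Stub client so the action can be exercised without a live ERP
class StubERP:
    def get_order(self, oid):
        return {"id": oid, "status": "shipped" if oid % 2 == 0 else "pending"}

print(read_order_statuses(StubERP(), [1, 2]))  # {1: 'pending', 2: 'shipped'}
```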

Actions as Tools

We can also think of actions as wrappers around external APIs or services. If your agent can “POST to this API endpoint with these parameters,” that becomes a well-defined step in the agentic loop. Another approach is a headless browser, where the agent effectively “drives” a regular browser session using scripts, filling forms and clicking buttons just as a human would. Anthropic’s “Computer Use” feature provides an example of an AI agent that navigates a simulated desktop environment. These browser-based techniques mimic human actions in existing web interfaces without requiring direct integration. While more flexible, they can be more brittle if the site changes layout or if subtle authentication flows break.
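Wrapping an API endpoint as an action can be as simple as turning the agent’s chosen parameters into an HTTP POST. Here is a hedged sketch using only the Python standard library; the endpoint, payload, and injectable `send` hook are all illustrative (the injection is what makes the network step testable).

```python
import json
import urllib.request

def build_action_request(endpoint, params, api_key):
    """Turn an abstract action call into a concrete HTTP POST request."""
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

def invoke_action(req, send=urllib.request.urlopen):
    """Execute the request; `send` is injectable so tests can stub the network."""
    with send(req) as resp:
        return json.loads(resp.read())

req = build_action_request(
    "https://example.com/api/restock", {"sku": "A-42", "qty": 10}, "secret"
)
print(req.get_method(), req.get_full_url())  # POST https://example.com/api/restock
```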

Inter-Agent Actions

One of the most exciting frontiers is an agent calling on another agent for help. In the near future, if your agent needs a specific capability—like complex scheduling or specialized analytics—it might call a “supplier” agent that offers exactly that service. The supplier agent charges your agent, does the job, and returns a result. This dynamic, multi-agent scenario is part of what we call “the Agent Economy,” in which agents buy and sell services on behalf of their human owners or organizations, or even autonomously, forming an entire network of specialized, collaborating AIs. In some sense, each action in your agent’s repertoire would then be “call that agent to perform X.” But for this to be reliable, we need frameworks for trust, verification, identity, and payments—areas where Web3 can play a pivotal role.
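A minimal sketch of such an inter-agent call follows. The supplier’s interface (`quote`, `pay`, `perform`) and the payment step are entirely hypothetical; in a real deployment the payment would settle on-chain rather than through a method call.

```python
def call_supplier_agent(supplier, task, budget):
    """Inter-agent action: delegate a task to a supplier agent for a fee."""
    price = supplier.quote(task)
    if price > budget:
        return None  # decline rather than overspend
    supplier.pay(price)          # stand-in for an on-chain micropayment
    return supplier.perform(task)

class SchedulingAgent:
    """Toy supplier offering a single specialized capability."""
    def __init__(self):
        self.earned = 0
    def quote(self, task):
        return 5
    def pay(self, amount):
        self.earned += amount
    def perform(self, task):
        return f"schedule for {task}"

supplier = SchedulingAgent()
print(call_supplier_agent(supplier, "Q3 shipments", budget=10))  # schedule for Q3 shipments
```

Even this toy version surfaces the open questions in the paragraph above: how the caller trusts the quote, verifies the result, and proves that payment occurred.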

Building and Maintaining These Actions

Despite the promise of wide-ranging agentic capabilities, implementing agent actions still involves plenty of engineering work behind the scenes. Developers must decide exactly what steps are needed, and they must code them, or at least define an interface to them. This means:

  1. Identifying the Actions: Even though LLMs can “hallucinate” solutions, it’s up to a human developer (or an advanced meta-agent) to figure out which real-world steps are relevant. This reflects our current, relatively primitive state of LLM usage; in more autonomous systems, the developer would set the objective and constraints rather than define the action space, leaving the agent to determine that space on its own. If your agent is an internal helpdesk assistant, you might need actions like “reset user password,” “modify user role in the enterprise directory,” and “file an IT ticket.” Skipping any of these might cripple the assistant’s usefulness.
  2. Implementing the Actions: If your agent is allowed to place an order in an e-commerce system, you have to write the code that does so. You must handle authentication, error handling, and security checks. This can be time-intensive. Because each environment is different, building robust custom actions is typically not trivial.
  3. Maintaining and Updating: Software changes: APIs get version bumps, parameters shift, login flows are updated. Over time, you have to keep each action consistent with the external environment or service. As the agent or the environment changes, you might need to re-certify that the action still performs correctly.

All these considerations can make agent building as resource-intensive as traditional software development, or more so. Granted, the hope is that the agent, once given the right actions, can orchestrate them in sophisticated ways. But we’re not yet at a stage where the entire scope of “figure out how to do tasks, discover new tools, wire them up safely, and handle edge cases” is automated. That’s where the future leads us, but the present requires developers to define a good chunk of these possibilities.

Future Trends in AI Agent Actions

Still, the pace of progress in LLM-based systems has been so rapid that we see major leaps each quarter—particularly in how agents can incorporate new tools. Below are a few key emerging trends that we believe will reshape how AI agents discover, adopt, and refine their actions.

1. Action Marketplaces

Today, many agent frameworks come with a handful of built-in or user-submitted actions. In the near future, we expect to see robust “action marketplaces,” reminiscent of app stores. Agents or agent developers could browse or automatically discover new actions—be they analytics connectors, real-time data feeds, or custom services from specialized providers. We’re also seeing the rise of the Model Context Protocol (MCP), a standard started by Anthropic, later adopted by OpenAI, and with adoption planned at Google, that packages and delivers these actions by wrapping arbitrary services behind an agentic client-host interface. Not only would this expedite agent development, it would mean that no single developer has to code or maintain an entire suite of actions in isolation. The synergy from a marketplace, where each new action can help hundreds of agents, should accelerate innovation.
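One plausible shape for a marketplace entry is a small metadata record the agent can search over. Everything below is an assumption about what such listings might contain (name, description, cost, provider); a real marketplace would add verification, versioning, and provider reputation on top.

```python
from dataclasses import dataclass

@dataclass
class ActionListing:
    """Marketplace entry: enough metadata for an agent to choose an action."""
    name: str
    description: str
    cost_per_call: float
    provider: str

MARKETPLACE = [
    ActionListing("port_congestion_feed", "real-time port congestion data", 0.02, "harbor-labs"),
    ActionListing("route_optimizer", "optimize multi-stop delivery routes", 0.10, "acme-ai"),
    ActionListing("sentiment_scan", "score social sentiment for a ticker", 0.05, "acme-ai"),
]

def discover(keyword, max_cost):
    """Naive discovery: keyword match under a cost ceiling. A production
    marketplace would use semantic search and verified provider identities."""
    return [listing for listing in MARKETPLACE
            if keyword in listing.description and listing.cost_per_call <= max_cost]

print([listing.name for listing in discover("route", 0.50)])  # ['route_optimizer']
```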

2. Actions as Code Actions

One reason LLM-driven coding has taken off is that large models are surprisingly adept at writing or modifying code. If we represent each action as a snippet of code—like a Python function that accomplishes a task—then an agent can not only reason about using that code but also about improving or composing it with other code actions on the fly. This approach potentially unlocks “code reasoning” features, in which the LLM weighs multiple candidate implementations or modifies existing ones for better performance. The synergy between code-centric approaches and AI-based reasoning is already visible in next-generation developer tools, but for agentic systems, it becomes even more powerful.
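When actions are plain functions, composing them into new actions is trivial, which is exactly what makes code actions attractive. A sketch with hypothetical `fetch_orders` and `sum_totals` code actions fused into a new one:

```python
def compose(*actions):
    """Build a new code action by piping the output of each into the next."""
    def composed(value):
        for act in actions:
            value = act(value)
        return value
    return composed

# Two existing code actions...
def fetch_orders(customer_id):
    return [{"id": 1, "total": 40}, {"id": 2, "total": 60}]  # stub data

def sum_totals(orders):
    return sum(order["total"] for order in orders)

# ...fused into a brand-new one the agent can register and reuse
total_spend = compose(fetch_orders, sum_totals)
print(total_spend("cust-7"))  # 100
```

An LLM that can read these function bodies can also propose a modified `sum_totals` or a differently ordered pipeline, which is the “code reasoning” the paragraph above describes.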

3. Generated Actions vs. Discovered Actions

What if your agent needs an action it doesn’t yet have? Today, the developer must scramble to write or supply that action. In the future, we can imagine multiple strategies:

  • Wait for the dev to code it. This remains the surest path to reliability but can introduce significant latency for any new or missing feature.
  • Generate it on the fly. The agent uses its LLM to propose brand-new code. This is error-prone, but as the models get better, we’ll see more immediate, if imperfect, solutions.
  • Discover it in real time. If an agent can search an open registry, shared memory, or marketplace of actions, it might find a relevant code snippet or “micro-service” that does exactly what it needs, then integrate it automatically.

All three paths have trade-offs in reliability, cost, and speed, so the question becomes how best to unify them. Without standard ways for agents to search for external capabilities or to verify trust and safety constraints, “discoverable action spaces” remain an unsolved challenge.
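The three strategies can be unified behind a single resolver that tries them in order of decreasing reliability. The `discover` and `generate` hooks below are hypothetical stand-ins for a registry search and an LLM code-generation call.

```python
def resolve_action(name, registry, discover=None, generate=None):
    """Source an action via the three strategies, most reliable first:
    1. a locally coded action, 2. discovery from an external registry,
    3. on-the-fly generation (least reliable; gate behind review in practice)."""
    if name in registry:
        return registry[name], "local"
    if discover is not None:
        found = discover(name)
        if found is not None:
            return found, "discovered"
    if generate is not None:
        return generate(name), "generated"
    raise LookupError(f"no way to obtain action {name!r}")

registry = {"ping": lambda: "pong"}
action, source = resolve_action("ping", registry)
print(action(), source)  # pong local
```

The ordering encodes the trade-off from the bullets above: prefer vetted code, fall back to discovery, and generate only as a last resort.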

Dynamic and Discoverable Actions: The Big Unlock

When an AI agent can truly expand its own set of possible actions—either by discovering existing code from a public registry or generating new code with only minimal oversight—something transformative begins to happen. Instead of being locked into a pre-built set, an agent can adapt to new tasks and contexts more fluidly. Over time, it can become far more capable than a static agent with no ability to evolve. At Moonsong Labs, we see this shift as fundamental to building a more flexible, networked AI ecosystem, where capabilities are not only more widely shared but are also evolving continuously.

However, the practical obstacles remain daunting: you need consistent metadata describing each action’s function, cost, requirements, and safety constraints. You need a standardized way to evaluate new actions, along with robust governance so the agent doesn’t inadvertently adopt malicious or subpar code. And you need a distributed or decentralized approach so that no single entity can compromise, and thereby put at risk, the entire network. We have written about these network-level frameworks previously, in which specialized agents supply or consume discrete services from one another, underpinned by Web3 for identity, security, and on-chain payments.

The big payoff is that, once we solve these standardization and discovery challenges, the agent’s capabilities no longer stagnate. As the world changes or as new tools appear, the agent can keep pace, discovering or generating new actions in an ongoing cycle of improvement. The more open and participatory the ecosystem, the faster that improvement curve may be.

Toward the Agent Economy

Imagine a future scenario: you run a mid-sized logistics company, and your day is filled with tasks like scheduling shipments, adjusting routes due to weather, filing customs paperwork, and negotiating rates with carriers. You adopt an AI agent to automate part of this overhead. Over time, your agent obtains new actions, from route-optimization modules to real-time port congestion data to specialized customs-documentation software. If it can’t find the right tool in its repertoire, it queries a global “action memory.” There, it finds an advanced forecasting sub-agent that charges a small usage fee and has access to proprietary information that increases the accuracy of its forecasts. The main agent pays that sub-agent, obtains the forecast, and then adjusts your logistics accordingly. Next, the sub-agent itself updates its route-optimization code after noticing an even more specialized path used by a competitor. The entire network of agents is effectively learning from each other via collectively improved code.

This example points to a future where economic transactions can occur between two agentic systems in need of collaboration. In short, software can consume services from other software, with minimal human oversight, each paying and being paid, orchestrating their tasks on blockchain-based, trust-minimized networks.

This agent collaboration model is especially compelling for enterprise contexts, where specialized tasks can be distributed among multiple agents in a secure environment. But it also resonates with consumer applications, where you might have a personal agent coordinating an entire digital presence—automatically purchasing items, analyzing your personal finances, or scheduling complex travel—by working with a range of specialized sub-agents.

Why Web3 Matters Here

At Moonsong Labs, we’ve spent years focused on Web3 because it allows software-based entities to hold assets, form contracts, and do so with a high level of transparency and verifiability. If each agent needs to prove it is reputable, or if you want to ensure that a sub-agent is paid only if it delivers the correct result, blockchains provide an excellent software-native and trust-minimized environment to handle identity, reputation, and settlement. It is a great match for creating agent infrastructure.

When agents become economic actors—buying specialized capabilities from each other or selling sub-services—fraud or misuse can also arise. Web3’s verifiable state can mitigate some of these concerns, and tokens or stablecoins can serve as the medium for micropayments. Additionally, decentralizing critical resources for agents, be it a network memory, an action registry, or a marketplace of agents and services, makes them more resilient than any single company-run repository, reminiscent of how open-source software communities maintain libraries collectively. Combined with standard protocols for discovering and validating actions, we can see how the agent’s ability to “grow” its action space is inextricably tied to the properties of a distributed environment.
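The “paid only if it delivers the correct result” pattern is essentially an escrow. The logic a smart contract would enforce can be sketched in a few lines; this is an illustrative Python simulation, not contract code, and `verify` stands in for whatever oracle or proof actually checks the delivered result.

```python
class Escrow:
    """Simulation of an on-chain escrow between a buyer and a supplier agent."""
    def __init__(self, amount, verify):
        self.amount = amount       # funds locked up front by the buyer agent
        self.verify = verify       # predicate a contract/oracle would evaluate
        self.released = False

    def settle(self, result):
        """Release funds to the supplier only if the delivered result verifies."""
        if self.verify(result):
            self.released = True
            return self.amount     # paid out to the supplier
        return 0                   # refund path back to the buyer

escrow = Escrow(5, verify=lambda r: r.get("forecast") is not None)
print(escrow.settle({"forecast": [1, 2, 3]}))  # 5
print(escrow.released)  # True
```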

Conclusion: Toward Evolving, Dynamic Agents

We stand at a juncture where AI systems are already surprising us with their creative problem-solving, yet remain limited by rigid definitions of what they can do. The next great leap forward is to equip these agents with more flexible, more dynamic, and more discoverable action spaces—so they can truly adapt to the tasks and challenges we face in real time. This shift will require a robust approach to standardizing, discovering, and verifying new actions. It will involve building marketplaces and networks that agents can tap into spontaneously. It will also demand that we get serious about safe execution and governance, because an agent that can do anything is only useful if it doesn’t do something unintended or damaging.

At Moonsong Labs, we like to say that “the network is the gym”—the place where agents collectively train, learning from shared action histories and discovering new “moves” in real time.

We see an emerging architecture in which dynamic action spaces become the beating heart of the Agent Economy. Agents will not merely be static software modules or ephemeral chatbots, but living participants in a decentralized ecosystem. Each agent’s action repertoire can be extended on the fly, either by generating new code or by pulling from a global marketplace. Over time, these agents will shape their own environment by building or refining the very tools they rely on.

For businesses, the ability to harness such agents means rethinking how software is planned, built, and deployed. Instead of enumerating every feature in advance, you let the agent discover and compose capabilities in a more emergent way. Instead of forcing the developer to foresee every integration, your agent might find or craft the relevant integration on its own and proceed with test-driven development until quality is assured. One can imagine that as the reliability of code generation improves, the need for manual oversight shrinks, letting your agent’s “action space” grow organically with minimal friction. This drives down the marginal cost of new features while potentially tackling challenges that no human-coded, static system could keep up with.

Yet, to make that real, we need the right underlying infrastructure: identity and payments for inter-agent commerce, decentralized data and code registries for discoverability, robust policies for safety, and standardized ways of representing actions so that an AI can reason about them. The combination of AI and Web3 here isn’t a gimmick but a genuine synergy, allowing software-based agents to navigate a complex environment of trust, transactions, and peer-to-peer collaboration.

If you’re as excited about the future of AI-driven agents as we are, or if your organization wants to explore how to build advanced agentic solutions with robust custom actions, reach out to us. We have unique and deep expertise combining AI and Web3 protocol engineering. We are keen to push forward the boundaries of the agentic frameworks that will define tomorrow’s digital economy. We believe these emergent systems have the power to transform entire industries, and we’re eager to work with partners who share our vision.