
Just How Important Are AI Agents?

An Interview with Yunpu Ma

Yunpu Ma delves into the rapid evolution of AI agents, showing how they are moving beyond simple tools to become sophisticated collaborators that can plan, debate, and carry out independent research. He explains how frameworks such as Model Context Protocol (MCP) and Agent-to-Agent (A2A) are reshaping the way machines interact with the web, and reflects on the far-reaching industrial and societal shifts that their widespread adoption could set in motion.

DAILOGUES: Andrew Ng predicted last year that the development of AI agents would have a greater impact than the new generation of foundation models. Do you share this assessment?

Yunpu Ma: Yes, AI agents will have a great impact. They have memories, can plan tasks, and reflect on their predictions. In this respect, they are different from foundation models. While AI agents are currently used for coding or to enhance chatbots through tool calls, they will, in principle, be able to handle far more complex tasks. For example, several AI labs aim to equip agents with deep research abilities and internet access, enabling them to conduct scientific research independently.

DAILOGUES: Research and experiments with AI agents also suggest that collaboration and structured debate improve performance. For example, BlackRock has explored using groups of AI agents to support stock selection and portfolio management, where the quality of investment decisions improved when agents engaged in thorough debates. This raises an open question: What kind of collaboration framework is most effective for AI agents?

Yunpu Ma: There are two approaches to AI agent collaboration. In the first approach, we assign different roles to the agents and let them handle subtasks within one workflow. The second approach uses multiple agents for the same task. These agents then debate one another, in the hope that their different perspectives lead to a better overall solution. Typically, a final agent summarizes the debate and extracts the most valuable solutions from all contributions. You can also combine the two approaches and iterate over them for several rounds. Rumor has it that xAI’s Grok Heavy incorporates the second approach, which is also referred to as “parallel thinking”.
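
Editor's sketch: the two collaboration patterns described in this answer can be illustrated with a few lines of Python. This is a minimal illustration under stated assumptions, not the setup used by any lab or product mentioned in the interview. The names make_agent, role_workflow, and debate are hypothetical, and each "agent" is a plain function standing in for a language model call.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in: a real agent would wrap a language model call.
Agent = Callable[[str, List[str]], str]


def make_agent(name: str) -> Agent:
    """Build a toy agent that reports how many earlier messages it has read."""
    def agent(task: str, seen: List[str]) -> str:
        return f"{name} on '{task}' after reading {len(seen)} messages"
    return agent


def role_workflow(task: str, stages: List[Callable[[str], str]]) -> str:
    """First approach: each role handles one subtask and passes its result on."""
    result = task
    for stage in stages:
        result = stage(result)
    return result


def debate(task: str, agents: List[Agent], rounds: int = 2) -> Tuple[str, List[str]]:
    """Second approach: all agents work on the same task and debate in rounds."""
    transcript: List[str] = []
    for _ in range(rounds):
        # Every agent sees the contributions from all previous rounds.
        seen = list(transcript)
        transcript.extend(agent(task, seen) for agent in agents)
    # A final agent would summarize the debate; here it is a placeholder string.
    summary = f"Summary of {len(transcript)} contributions on '{task}'"
    return summary, transcript
```

Combining the two patterns, as mentioned in the answer, would amount to using a debate as one stage of the role workflow and repeating the whole procedure for several rounds.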

DAILOGUES: Marvin Minsky coined the term “society of mind” based on the idea that intelligence emerges from the interactions of many simple agents working together, like members of a society. Do you think the collaboration and interaction between agents that perform different cognitive tasks represent the future of creating intelligence?

Yunpu Ma: I believe that scaling agentic systems will create more capable and therefore more intelligent systems. However, this strategy also comes with increasing costs. Every additional exchange between the agents risks losing information and, more practically, requires more token predictions, which are costly as well.
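
Editor's sketch: a rough back-of-the-envelope model shows how quickly token counts grow in a multi-round debate. It rests on simplifying assumptions that do not come from the interview: every message has the same length, each agent rereads all messages from earlier rounds, and system prompts and the final summary are ignored. The function name debate_token_cost is hypothetical.

```python
from typing import Tuple


def debate_token_cost(n_agents: int, rounds: int, tokens_per_message: int) -> Tuple[int, int]:
    """Return (generated_tokens, read_tokens) for a simplified debate."""
    # Each agent writes one message per round.
    generated = n_agents * rounds * tokens_per_message
    # In round k (0-based), each agent reads the k * n_agents earlier messages.
    read = sum(
        n_agents * (k * n_agents * tokens_per_message)
        for k in range(rounds)
    )
    return generated, read


# Three agents, two rounds, 200 tokens per message.
print(debate_token_cost(3, 2, 200))  # (1200, 1800)
```

Under these assumptions, generated tokens grow linearly with the number of agents and rounds, while the tokens that have to be read grow quadratically in both.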

DAILOGUES: AI agents have become more tangible for many users through ChatGPT’s new agentic mode, which is particularly strong in web search and browsing tasks. You developed a similar web-pilot system that was state of the art not long ago. Today, however, companies like OpenAI and IBM dominate benchmarks such as WebArena. In your view, what key improvements or innovations have these companies made that have led to higher-performing web agents?

Yunpu Ma: This is difficult to say because they keep their methods secret. What I can say is that I’m also working on new strategies for web agents. One idea is to further integrate memory modules, where previous experience will help the agent navigate through websites. I want to emulate the procedural experience that humans have. Knowing procedures is different from factual knowledge. The goal is to equip agents with enough procedural know-how for web navigation so that they don’t need to explore each website in fine-grained detail to plan their next steps.
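
Editor's sketch: one way to picture procedural memory for a web agent is a store of action sequences that worked before, keyed by the kind of site and the goal. This is an assumed, simplified design for illustration only, not the method Yunpu Ma's team is developing. The names ProceduralMemory, plan, and explore are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple


class ProceduralMemory:
    """Stores action sequences that worked before, keyed by (site type, goal)."""

    def __init__(self) -> None:
        self._procedures: Dict[Tuple[str, str], List[str]] = {}

    def recall(self, site_type: str, goal: str) -> Optional[List[str]]:
        return self._procedures.get((site_type, goal))

    def store(self, site_type: str, goal: str, steps: List[str]) -> None:
        self._procedures[(site_type, goal)] = list(steps)


def plan(
    memory: ProceduralMemory,
    site_type: str,
    goal: str,
    explore: Callable[[str, str], List[str]],
) -> Tuple[List[str], str]:
    """Reuse a known procedure if one exists; otherwise explore and remember it."""
    steps = memory.recall(site_type, goal)
    if steps is not None:
        return steps, "recalled"
    # Costly fallback: inspect the website in detail to work out the steps.
    steps = explore(site_type, goal)
    memory.store(site_type, goal, steps)
    return steps, "explored"
```

The point of the design is that the expensive exploration step runs only the first time a given kind of task is encountered.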

DAILOGUES: Do you think future users will primarily rely on web agents that can browse and navigate the web on their behalf, or will they instead use specialized agents that interact directly with specific protocols or services, bypassing much of the complexity of web navigation?

Yunpu Ma: Future agents won’t navigate and explore websites on their own. Instead, they will use interfaces such as Model Context Protocol (MCP) servers to interact with the web. This means that the agents don’t need to understand a website and click on buttons. Instead, the website will be turned into a different representation based on the Model Context Protocol, with which the agents can interact. That is exactly the kind of project that my research team and I are bringing to life in collaboration with a company. Interactions with MCP servers are also more reliable. For that reason, I would not use a web agent to search the internet.

DAILOGUES: Could you briefly explain what the Agent-to-Agent (A2A) framework and Model Context Protocol are about? Why are these protocols important for agentic AI?

Yunpu Ma: Both frameworks are interfaces that facilitate interactions between AI agents and other software or services. The Agent-to-Agent framework was introduced by Google and is geared more towards communication between agents. The Model Context Protocol, developed by Anthropic, helps us set up servers that provide agents with tools, data, and convenient prompts. These servers make them accessible to the agents in a standardized way. This means that we don’t need to implement different solutions for each service we would like our agents to interact with. This is very convenient!
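
Editor's sketch: the standardization mentioned in this answer can be made concrete. MCP messages follow JSON-RPC 2.0, and a client discovers and invokes tools through the "tools/list" and "tools/call" methods. The snippet below only builds such request messages with the Python standard library and does not contact a server. The helper jsonrpc_request, the tool name search_products, and its arguments are hypothetical examples, not part of any real service.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

_ids = count(1)  # simple source of increasing request ids


def jsonrpc_request(method: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request of the kind an MCP client sends."""
    message: Dict[str, Any] = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools it offers.
list_tools = jsonrpc_request("tools/list")

# Call one tool by name; the tool and its arguments are made up for illustration.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "search_products", "arguments": {"query": "running shoes"}},
)

print(json.dumps(call_tool, indent=2))
```

Because every service answers the same two kinds of requests, an agent needs only one client implementation rather than a custom integration per website.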

DAILOGUES: In your view, where do companies see the greatest return on investment when implementing agentic AI? Are there particular use cases, industries, or functions where the impact is especially strong?

Yunpu Ma: The possibilities are so vast that I believe every company can profit greatly from deploying AI agents. Think of introducing AI agents as a new wave of digitalization.

DAILOGUES: What ethical challenges do you foresee once the world is populated with many AI agents?

Yunpu Ma: I discuss this question a lot with my friends and students. In my opinion, AI agents are unlike earlier technological revolutions because I currently cannot see how they will lead to the creation of new occupations or jobs for humans. This is, of course, very challenging. We’ll need to consider what will be left to humans and how we handle the outcome if many jobs are replaced by AI agents.

DAILOGUES: While AI agents are currently getting a lot of attention, what other areas of AI do you see as particularly promising in the near future?

Yunpu Ma: I find diffusion language models particularly exciting. In contrast to the autoregressive models that currently power most large language models, diffusion models can generate text holistically. This means that they don’t generate text token by token, but many tokens in parallel. Another area that I find quite interesting is the field of AI for science, in particular, AI for medicine. For example, I believe that we’ll design medicines with AI in the future, which I think will also be very beneficial for humanity.
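
Editor's sketch: the difference between token-by-token and parallel generation can be shown with a toy that only counts decoding steps. It is not a real diffusion language model: a fixed target sentence stands in for the model's predictions, and the function names and the per_step parameter are hypothetical.

```python
import random
from typing import List, Tuple

MASK = "_"


def autoregressive_decode(target: List[str]) -> Tuple[List[str], int]:
    """Emit one token per step, left to right."""
    out: List[str] = []
    steps = 0
    for token in target:
        out.append(token)  # a real model would predict this token here
        steps += 1
    return out, steps


def parallel_unmask_decode(target: List[str], per_step: int, seed: int = 0) -> Tuple[List[str], int]:
    """Start fully masked and fill in several positions per step, in any order."""
    if per_step < 1:
        raise ValueError("per_step must be at least 1")
    rng = random.Random(seed)
    out = [MASK] * len(target)
    masked = list(range(len(target)))
    rng.shuffle(masked)  # positions are not filled left to right
    steps = 0
    while masked:
        chosen, masked = masked[:per_step], masked[per_step:]
        for i in chosen:
            out[i] = target[i]  # a real model would predict these jointly
        steps += 1
    return out, steps
```

For a six-token sentence, the first function needs six steps, while the second needs two steps when it fills three positions at a time.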

We thank Yunpu Ma for the DAILOGUE.

About the Author


Dr. Yunpu Ma

University of Munich