Agentplex Weekly - Issue #9

AI Agents in enterprise: High expectations. AutoGen full course. Deep dive into llama-agents. LangGraph agents and HIL. AI Agent startups for enterprise apps. Evaluating AI agents in the real world.

Jul 11, 2024

AI Agents in enterprise: High expectations. If you talk to CEOs at many companies, you can tell they have very high expectations of what AI agents can and will do. But if you talk to the CTOs, expectations are a bit lower. Beyond the typical enterprise IT concerns such as compliance, governance, security, scalability and robustness, AI Agents add more levels of complexity. Berkeley AI Research recently posted about The Shift from AI Models to Compound AI Systems and its implications for developing AI apps. Here are a few concerns CTOs have about AI Agents:

Agentic workflows and complexity of enterprise workflows. Enterprise workflows are more complex than you think, often involving intricate, disparate, multi-task workflows that span several business processes. An AI agent should know the difference between tasks, workflows and business processes, and the human decision-making aspects associated with all of them. Many VCs are starting to get interested in the rise of agentic workflows. Some startups are starting to develop AI agents capable of handling the complexity of enterprise workflows: Gradient is a startup building custom agents to power enterprise workloads.

Enterprise context and explainable AI agents’ decisions. Talk to any Sr. business manager and they’ll confide that “AI still feels like a black-box that can’t be trusted”. Developing enterprise AI Agents requires explaining how and why the agents execute tasks and make decisions. What context was used to make an agentic decision? Why? Explainability will enable AI agent adoption in enterprise. This blogpost is a nice non-tech intro to The Rise of Explainable AI (XAI) in enterprise. For a more in-depth tech guide, check out: Explainable Deep Learning: A Field Guide for the Uninitiated.

Tapping enterprise knowledge and advanced agentic RAG. Enterprise AI agents will have to make decisions within highly specific business contexts and across many different types of information sources. This will require AI agents that deeply understand company policies, industry regulations, and business client-customer relationships. The AI agents will also have to retrieve information and produce reports accurately, reliably and consistently. Developing complex agentic RAG systems will unlock enterprise knowledge, but this will require new, advanced RAG techniques suited to enterprise production.

AI Agents accessing enterprise data silos. This is a classic: enterprise data is messy, dirty and kept in silos hidden everywhere. The most treasured, valuable enterprise data is in those silos. AI agents will need to access those enterprise data silos, and it won’t be easy. This is a great post on that: Can AI Solve Data Silos Challenge? New Challenges To The Multi-AI Agents Era.

Better human-AI agents collaboration interfaces. Today there are a lot of AI Agent apps, tools and platforms that have very basic UIs. There will be a lot of human-agent collaboration until fully autonomous agents are eventually adopted in enterprise. This means there will be a need to develop new, very intuitive agent-human interfaces that can enable human feedback, and smooth handoffs between the AI agents and the human workers. The latest announcement from Anthropic on UI Artifacts is very interesting. And here’s a nice blog on UX Design for Agentic Systems.

Integration of AI agents with enterprise prod systems. It won’t be just about a bit of “function calling & tools” here and there. CTOs can tell you about their diabolical enterprise IT landscape, plagued by a myriad of systems (e.g. ERP, CRM, SCM), custom dev apps, and legacy systems that are stitched together with safety pins. Developing AI agents that smoothly integrate at scale with enterprise systems will be key. This means that AI agent apps will need to adopt modern enterprise architecture patterns such as event-driven and micro-services architectures. The latest announcement on llama-agents as distributed micro-services in prod looks promising.
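To make the event-driven idea a bit more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how llama-agents or any specific product works: the EventBus class is an in-process stand-in for a real message broker, and the topic names ("invoice.created", "invoice.triaged"), the 1000 threshold and the agent logic are all hypothetical. The point is only that the agent reacts to events and publishes results, instead of being called directly by each enterprise system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    topic: str
    payload: dict


class EventBus:
    """In-process stand-in for a message broker (hypothetical, for illustration)."""

    def __init__(self):
        self._handlers = defaultdict(list)  # topic -> list of subscriber callbacks
        self.log = []                       # every event published, in order

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, event):
        self.log.append(event)
        for handler in list(self._handlers[event.topic]):
            handler(event)


def make_invoice_agent(bus):
    """Return a handler that plays the role of an 'agent' service.

    A real agent would call an LLM here; a fixed rule keeps the sketch runnable.
    """
    def handle(event):
        amount = event.payload["amount"]
        decision = "auto_approve" if amount < 1000 else "needs_review"
        bus.publish(Event("invoice.triaged",
                          {"id": event.payload["id"], "decision": decision}))
    return handle


bus = EventBus()
bus.subscribe("invoice.created", make_invoice_agent(bus))

# A downstream system (e.g. an ERP connector) listens for the agent's output.
triaged = []
bus.subscribe("invoice.triaged", lambda e: triaged.append(e.payload))

bus.publish(Event("invoice.created", {"id": "INV-1", "amount": 250}))
bus.publish(Event("invoice.created", {"id": "INV-2", "amount": 5000}))
```

Because the agent only knows about topics, swapping the in-memory bus for a real broker, or adding a second agent that listens to the same events, would not require changing the systems that publish them.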

Come to our AI Agents meetup in London, July 17. We are hosting our meetup at UCL London on July 17! Come and join us to learn about the Camel & AutoGen AI Agents frameworks, and the AI Agents Global Challenge. If you are interested in giving a talk or doing a demo at our meetups, please contact Carlos here.

Hands-on, tutorials and practical guides

AutoGen full beginner course. This is a free course that covers all the main aspects of AutoGen in a practical way. Two examples are provided: 1) an agents group chat, and 2) a Reddit project. By the end of this course, you’ll understand how AutoGen works, and how to create your own multi-agent workflow.
Deep dive into the new llama-agents as services. llama-agents is a new, open-source, multi-agent framework designed as a distributed micro-services architecture. In this video, Mervin reviews llama-agents, its benefits, how to set it up step-by-step, some examples, and compares it with other frameworks like CrewAI and AutoGen.
Intro to agents and Human-in-the-Loop (HIL) breakpoints. HIL interactions are crucial in agentic systems. In this video, you’ll learn about HIL interactions and how to create them with LangGraph. Breakpoints are a common HIL interaction pattern, allowing the graph to stop at specific steps and seek human approval before proceeding.
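
The breakpoint pattern described in the last item can be sketched in a few lines of plain Python. This is a library-free illustration, not LangGraph code: the step functions, the Run class and the advance helper are hypothetical, and the parameter name interrupt_before is only borrowed to suggest the idea. The workflow runs until it reaches a step marked as a breakpoint, pauses with its state saved, and continues only when a human approves.

```python
from dataclasses import dataclass
from typing import Optional


def draft(state):
    # Hypothetical agent step: prepare an action for review.
    return {**state, "draft": f"Refund {state['amount']} to {state['customer']}"}


def send(state):
    # Hypothetical agent step with side effects, so it sits behind a breakpoint.
    return {**state, "sent": True}


STEPS = [("draft", draft), ("send", send)]


@dataclass
class Run:
    state: dict
    next_index: int = 0
    waiting_on: Optional[str] = None  # name of the step awaiting approval


def advance(run, interrupt_before, approved=False):
    """Execute steps in order, pausing before any step in interrupt_before.

    approved=True only releases the breakpoint the run is currently paused at.
    """
    while run.next_index < len(STEPS):
        name, fn = STEPS[run.next_index]
        if name in interrupt_before:
            resuming = approved and run.waiting_on == name
            if not resuming:
                run.waiting_on = name  # pause here and hand control to a human
                return run
            approved = False  # one approval releases one breakpoint
        run.waiting_on = None
        run.state = fn(run.state)
        run.next_index += 1
    return run


# First call: runs "draft", then pauses before "send".
run = advance(Run({"customer": "ACME", "amount": 120}), {"send"})
paused_on = run.waiting_on
sent_before_approval = "sent" in run.state

# A human reviews run.state["draft"] and approves; the run resumes.
run = advance(run, {"send"}, approved=True)
```

In a real framework the paused state would be persisted by a checkpointer, so the approval can arrive minutes or days later; here it simply lives in the Run object.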

Tools, platforms, and frameworks

New Databricks Mosaic AI Agent Framework. This new framework provides: 1) integration with MLflow for a fast, end-to-end development workflow; 2) a simplified SDK for managing the lifecycle of agentic apps; 3) easy evaluation of agentic apps on accuracy, hallucination, harmfulness, and helpfulness, using AI judges; and 4) a quick and easy way to get human feedback.
Swarms is a production-ready multi-agent collaboration framework for enterprise apps. It enables you to orchestrate many agents to work collaboratively at scale to automate real-world activities, and it promises simple, seamless, and reliable multi-agent collaboration.

AI Agents Startups for Enterprise Applications

a16z: Why We Invested in Hebbia - AI agents for financial services
Sema4.ai - A platform for developing enterprise agents
Enso.bot - Teams of AI agents for running small business tasks
Kalendar.ai - AI agents for driving sales opportunities

Research Papers - Evaluating AI Agents in real-world scenarios

Evaluating agents for business analytics. This paper introduces 1) AgentPoirot, a data analytics agent, and 2) Insight-bench, a benchmark dataset designed to evaluate agents' ability to perform comprehensive data analysis across 31 diverse business use cases. Paper, dataset & repo: Evaluating Business Analytics Agents Through Multi-Step Insight Generation.
Evaluating agents for forecasting events. This paper introduces MIRAI, a new benchmark designed for evaluating LLM agents for temporal forecasting of international events. Paper, code, data and demo video: Evaluating LLM Agents for Event Forecasting.
Evaluating agents for travel planning. This paper introduces TravelPlanner, a new planning benchmark that focuses on travel planning, a common real-world planning scenario. Paper: TravelPlanner: A Benchmark for Real-World Planning with Language Agents.

Are you building an AI agent? Compete in the AI Agents Global Challenge, funded with a $1 million prize pool. It’s easy to apply, the scope is fairly broad, and there is no registration fee. Click here to learn about the challenge and how to apply.

Join our Discord channel to meet other people building AI agents, discuss ideas and collaborate on projects.

Thank you for reading Agentplex newsletter. Have a great day.
