Updated on Apr 28, 2025

Top 20+ Agentic RAG Frameworks in 2025


Agentic RAG enhances traditional RAG by boosting LLM performance and enabling greater specialization. We conducted a benchmark to assess its performance on routing between multiple databases and generating queries.

Explore agentic RAG frameworks and libraries, key differences from standard RAG, benefits, and challenges to unlock their full potential.

Agentic RAG benchmark: multi-database routing and query generation

In many real-world enterprise scenarios, data is often distributed across multiple databases, each containing specialized information relevant to specific domains or tasks. For example, one database might store financial records, while another holds customer data or inventory details.

An effective Agentic RAG system must intelligently route a user’s query to the most relevant database to retrieve accurate information. This process involves analyzing the query, understanding the context, and selecting the appropriate data source from a set of available databases.

Figure 1: Overview of an Agentic RAG system routing a query to one of five distinct databases

We used our agentic RAG benchmark methodology to demonstrate the system’s ability to select the correct database from a set of five distinct databases, each with unique contextual information, and generate semantically accurate SQL queries to retrieve the correct data:

In the agentic RAG benchmark, we used:

  • Agent Framework: Langchain – ReAct
  • Vector database: ChromaDB
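
To make the benchmark setup concrete, here is a minimal sketch of how a Langchain ReAct agent with one Tool per database might be wired up. The tool names, descriptions, and model choice are illustrative assumptions, not the exact benchmark configuration, and the ChromaDB side (storing database descriptions and schemas for semantic lookup) is omitted for brevity:

```python
# Illustrative sketch only: a ReAct agent routing between per-database
# tools. Tool names, descriptions, and the model are assumptions.
from langchain.agents import AgentType, Tool, initialize_agent
from langchain_openai import ChatOpenAI

def query_financial_db(action_input: str) -> str:
    """Placeholder: execute the input against the financial database."""
    return "rows from the financial DB"

def query_customer_db(action_input: str) -> str:
    """Placeholder: execute the input against the customer database."""
    return "rows from the customer DB"

tools = [
    Tool(
        name="financial_db",
        func=query_financial_db,
        description="Financial records: invoices, revenues, budgets.",
    ),
    Tool(
        name="customer_db",
        func=query_customer_db,
        description="Customer data: accounts, contacts, support tickets.",
    ),
]

llm = ChatOpenAI(model="gpt-4o", temperature=0)  # any chat model works
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,  # prints the Thought / Action / Action Input trace
)
agent.run("What was the total invoice amount in 2024?")
```

Each Tool's description is what the agent matches the query against, which is why clear database descriptions matter so much for routing accuracy.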

Agent’s thought process

At the heart of an Agentic RAG system lies the LLM’s ability to autonomously reason and act to achieve a goal. The Langchain ReAct agent used in this benchmark manages this process through a “Thought-Action-Action Input” cycle.

Figure 2: Thought process of the Agentic RAG system.

1. Thought: The agent analyzes the incoming user query (“input”) and any provided evidence. It identifies keywords, entities, and the core information needed. It attempts to match the query against the descriptions of the available tools (databases). It determines which database is most relevant and what specific information (or SQL query) is required. This internal reasoning is visible in logs when `verbose=True` is enabled.

2. Action: Based on the conclusion reached in the Thought step, the agent selects the specific Tool (representing the target database) it intends to use.

3. Action Input: The agent determines the input to send to the selected tool. If it has formulated a specific SQL query, that query becomes the input. If it’s performing a more general lookup or hasn’t yet formed a query, it might be a descriptive phrase.

Figure 3: Thought process of agentic RAG in terminal.

This cycle can repeat, allowing the agent to handle multi-step reasoning, correction, and complex query decomposition, enhancing its capabilities beyond traditional RAG systems.
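
The cycle is, in essence, a plain-text protocol: the LLM emits labeled Thought, Action, and Action Input fields, and the framework parses them to dispatch the tool call. Below is a minimal sketch of that parsing step, with an invented trace for illustration (the exact format Langchain uses may differ):

```python
import re

# One step of a ReAct-style trace, as an LLM might emit it (invented).
step = """Thought: The question is about invoices, so the financial
database is the right tool. I can write the SQL directly.
Action: financial_db
Action Input: SELECT SUM(amount) FROM invoices WHERE year = 2024;"""

# Minimal parser for the Thought / Action / Action Input fields.
pattern = re.compile(
    r"Thought:\s*(?P<thought>.*?)\s*"
    r"Action:\s*(?P<action>.*?)\s*"
    r"Action Input:\s*(?P<action_input>.*)",
    re.DOTALL,
)
m = pattern.search(step)
if m:
    print("Tool selected:", m.group("action"))
    print("Tool input:  ", m.group("action_input"))
```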

Agentic RAG benchmark methodology

This benchmark was designed to measure the ability of Agentic RAG systems to identify the most relevant database for a specific query and then extract the correct information from that source. The methodology involved the following steps:

1. Dataset: The benchmark utilized the BIRD-SQL dataset,1 commonly employed in academic research for text-to-SQL tasks and database querying. This dataset is ideal as it provides natural language questions, the correct database containing the answer, and the accurate SQL query required to retrieve that answer.

2. Database Environment: Five distinct databases were set up, corresponding to different subject matters within the BIRD-SQL dataset. The schema and a brief description of each database were made accessible to the agent. It was designed such that the answer to each question resides in only one specific database. ChromaDB was used as the vector database for efficient storage and semantic retrieval of database descriptions and schemas.

3. Agent Architecture: A Langchain based ReAct (Reasoning and Acting) agent architecture was employed to process queries, select the appropriate database tool, and generate SQL queries when necessary. A separate Langchain “Tool” was defined for each database. These tools encapsulate the description of their respective databases, aiding the agent in its selection process.

4. Evaluation Process: For each question from the BIRD-SQL subset:

– The agent was presented with the question and any accompanying evidence text.

– The Tool the agent selected via its ReAct logic (representing the target database) was recorded.

– The input provided to the selected Tool (“Action Input” – which could be descriptive text or a direct SQL query) was recorded.

– The agent’s chosen database was compared against the ground truth database specified in the BIRD-SQL dataset (Metric: % Correct Database Selection Rate).

– The SQL query generated by the agent (if applicable) was compared semantically against the ground truth SQL query from BIRD-SQL (Metric: % Correct SQL Query Retrieval Rate). SQL normalization (e.g., lowercasing, removing aliases) was applied before comparison to focus on semantic correctness rather than exact string matching.
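
As a rough illustration of the normalization step, the sketch below lowercases, strips simple aliases, and collapses whitespace before comparing two queries; the benchmark's actual normalization rules may be more elaborate:

```python
import re

def normalize_sql(sql: str) -> str:
    """Crude normalization before semantic comparison: lowercase,
    strip trailing semicolons, drop simple 'AS alias' clauses,
    and collapse whitespace."""
    sql = sql.lower().strip().rstrip(";")
    sql = re.sub(r"\s+as\s+\w+", "", sql)  # remove simple aliases
    sql = re.sub(r"\s+", " ", sql)         # collapse whitespace
    return sql

gold = "SELECT SUM(amount) AS total FROM invoices WHERE year = 2024;"
pred = "select sum(amount)  from invoices where year = 2024"
print(normalize_sql(gold) == normalize_sql(pred))  # True
```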

Agentic RAG frameworks & libraries

Agentic RAG frameworks enable AI systems not only to find information but also to reason, make decisions, and take actions. Here are the top tools and libraries that power Agentic RAG:

Last Updated at 01-24-2025

| Tool | Type | GitHub Stars | Tool use | Agent type |
| --- | --- | --- | --- | --- |
| Langflow by Langflow AI | Agentic RAG framework | 38.1k | | Multi-agent |
| DB GPT | Agentic RAG framework | 13.9k | | Multi-agent |
| MetaGPT | Agentic RAG framework | 45.7k | | Multi-agent |
| Ragapp | Agentic RAG framework | 3.9k | | Multi-agent |
| GPT RAG by Azure | Agentic RAG framework | 890 | | Multi-agent |
| Agentic RAG | Agentic RAG framework | 78 | | Multi-agent |
| Qdrant Rag Eval | Agentic RAG framework | 56 | | Single-agent |
| IBM Granite 3.0 | Agentic RAG framework | Not applicable | | Multi-agent |
| AutoGen | Agent library | 35.6k | | Multi-agent |
| Agent GPT | Agent library | 32k | | Multi-agent |
| Botpress | Agent library | 12.9k | | Multi-agent |
| LaVague | Agent library | 5.5k | | Multi-agent |
| Superagent AI | Agent library | 5.4k | | Multi-agent |
| Crew AI | Agent orchestrator | 22.3k | | Multi-agent |
| Brainqub3 | Agent orchestrator | 375 | | Single-agent |
| Transformers | LLMOps framework | 136k | | Multi-agent |
| Anything LLM | LLMOps framework | 28.3k | | Multi-agent |
| Haystack by Deepset AI | LLMOps framework | 18.1k | | Multi-agent |
| NVIDIA NeMo Framework | LLMOps framework | 12.4k | | |
| Langgraph by Langchain | LLMOps framework | 7.2k | | Multi-agent |
| GenerativeAIExamples by NVIDIA | LLMOps framework | 2.5k | | Multi-agent |
| Cortex by Snowflake | LLMOps framework | Not applicable | | |
| Google Vertex AI | LLMOps framework | Not applicable | | |
| Claude 3.5 Sonnet by Anthropic | LLM | Not applicable | | |
| OpenAI GPT-4 | LLM | Not applicable | | |

This list includes tools that meet the following criteria:

  • 50+ stars on GitHub.
  • Common usage in Agentic RAG projects.

Note that in the table:

  • Tool use refers to the native ability of a system to route and call tools within its environment.
  • Tool type refers to the main usage area of the tools, such as:
    • Agentic RAG frameworks are designed specifically for building, deploying, or configuring Agentic RAG systems.
    • Agent libraries enable the creation of intelligent agents that can reason, make decisions, and execute multi-step tasks.
    • LLMOps frameworks manage the lifecycle of LLMs and optimize the deployment and use of LLMs within agent-based systems.
    • LLMs that have built-in capabilities for tool calling and routing, allowing for dynamic decision-making. Other LLMs may require external APIs or integrations to enable agent functionality.
  • Tool use and agent types were verified through public sources.

What is agentic RAG?

Agentic Retrieval-Augmented Generation (RAG) is an AI framework that combines retrieval techniques with generative models to enable dynamic decision-making and knowledge synthesis. This approach integrates the accuracy of traditional RAG with the generative capabilities of advanced AI, aiming to enhance the efficiency and effectiveness of AI-driven tasks.

Worldwide search trends for Agentic RAG until 04/30/2025

Limitations of traditional RAG systems

Agentic RAG aims to overcome the limitations of standard RAG systems, such as:

  • Difficulty in information prioritization: RAG systems often struggle to efficiently manage and prioritize data within large datasets, which can reduce overall performance.
  • Limited integration of expert knowledge: These systems may undervalue specialized, high-quality content, favoring general information instead.
  • Weak contextual understanding: While capable of retrieving data, they frequently fail to fully comprehend its relevance or how it aligns with the specific query.
Figure 4: Agentic RAG architecture diagram in comparison with the traditional RAG2

How to build an agentic RAG

1. Tool use

  • Employ routers: The first step is to use routers that decide whether to retrieve documents, perform calculations, or rewrite the query. This adds decision-making that routes requests across multiple tools, letting large language models (LLMs) select the appropriate pipeline (a router sketch follows Figure 5).
  • Tool-calling integration: This refers to creating an interface for agents to connect with selected tools. Users can leverage LLMs with tool-calling capabilities or build their own to:
    • Pick a function to execute.
    • Infer the necessary arguments for that function.
    • Enhance query understanding beyond traditional RAG pipelines, enabling tasks like database queries or complex reasoning.
Figure 5: How to build Agentic RAG by adding a calling agent3
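
Here is a hedged sketch of such a router, using OpenAI-style tool calling to let the model pick a pipeline and infer its arguments; the tool names and schemas are illustrative assumptions, not a prescribed setup:

```python
# Illustrative router: the model chooses between hypothetical pipelines
# (document retrieval vs. query rewriting) and fills in the arguments.
from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "retrieve_documents",
            "description": "Look up passages in the vector store.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "rewrite_query",
            "description": "Rephrase a vague or underspecified query.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
]

resp = client.chat.completions.create(
    model="gpt-4o",  # assumed model name
    messages=[{"role": "user", "content": "Summarize our Q3 churn numbers."}],
    tools=tools,
)
msg = resp.choices[0].message
if msg.tool_calls:  # the model may also answer directly
    call = msg.tool_calls[0]
    print(call.function.name, call.function.arguments)
```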

2. Agent implementation

  • Single-call agents: A query triggers a single call to the appropriate tool, returning the response. This is effective for straightforward tasks, but may struggle with vague or complex queries.
  • Multi-call agents: This approach divides tasks among specialized agents, each focusing on a specific subtask (sketched after Figure 6). For example:
    • Retriever agent: Optimizes real-time query retrieval.
    • Manager agent: Handles task delegation and orchestration.
Figure 6: Multi-agent RAG architecture4
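
As a toy illustration of this division of labor, the sketch below hard-codes the manager's delegation rule; in a real system the manager would itself be an LLM deciding which specialist to invoke:

```python
# Hypothetical multi-agent sketch: a manager agent delegates to a
# specialized retriever agent. Classes and routing logic are stand-ins
# for framework-provided components.
class RetrieverAgent:
    def run(self, query: str) -> str:
        # A real implementation would hit a vector store or database.
        return f"[retrieved context for: {query}]"

class ManagerAgent:
    def __init__(self, workers: dict):
        self.workers = workers

    def run(self, query: str) -> str:
        # Naive delegation rule; an LLM would make this decision.
        context = self.workers["retriever"].run(query)
        return f"Answer drafted from {context}"

manager = ManagerAgent({"retriever": RetrieverAgent()})
print(manager.run("What drove support ticket volume last month?"))
```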

3. Multi-step reasoning

For complex workflows, agents use reasoning loops to perform iterative, multi-step reasoning while retaining memory of intermediate steps. These loops involve:

  • Calling multiple tools.
  • Retrieving data and validating its relevance.
  • Rewriting queries as needed.

Frameworks often define multiple agents to handle specific subtasks, ensuring efficient execution of the overall process; a stripped-down loop is sketched after Figure 7.

Figure 7: Multi-document RAG5
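
Below is that stripped-down sketch of a reasoning loop, with stub functions standing in for the LLM and tool calls; the retry logic and relevance check are deliberately toy versions:

```python
# Toy reasoning loop: retrieve, validate relevance, rewrite the query
# if needed, and keep memory of intermediate steps. `retrieve`,
# `is_relevant`, and `rewrite` stand in for LLM/tool calls.
def retrieve(query: str) -> str:
    return f"docs for '{query}'"

def is_relevant(docs: str, query: str) -> bool:
    return "refined" in query  # toy check; an LLM judge in practice

def rewrite(query: str) -> str:
    return f"refined {query}"

def answer(query: str, max_steps: int = 3) -> str:
    memory = []  # intermediate steps the agent can look back on
    for _ in range(max_steps):
        docs = retrieve(query)
        memory.append((query, docs))
        if is_relevant(docs, query):
            return f"final answer from {docs} ({len(memory)} steps)"
        query = rewrite(query)  # try again with a better query
    return "gave up after max_steps"

print(answer("quarterly revenue by region"))
```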

4. Hybrid approaches: combining retrieval and execution

A hybrid approach combines retrieval pipelines with dynamic execution strategies (sketched after this list):

  • Embedding and vector-based retrieval strategies for document access.
  • Tool-calling capabilities for dynamic query resolution.
  • Multi-agent collaboration for specialized subtasks.
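
Here is a small sketch of the hybrid idea, using ChromaDB for the retrieval leg and a stubbed tool call as the dynamic-execution fallback; the distance cutoff and collection contents are arbitrary assumptions for illustration:

```python
# Hybrid sketch: try vector retrieval first; if the best match is weak,
# fall back to a tool call (dynamic execution path).
import chromadb

client = chromadb.Client()
col = client.create_collection("docs")
col.add(
    documents=["2024 pricing policy", "Churn postmortem for Q3"],
    ids=["d1", "d2"],
)

def call_database_tool(query: str) -> str:
    return f"[tool executed for: {query}]"  # placeholder tool

def hybrid_answer(query: str) -> str:
    res = col.query(query_texts=[query], n_results=1)
    distance = res["distances"][0][0]  # smaller = closer match
    if distance < 0.5:  # arbitrary cutoff for this sketch
        return f"retrieved: {res['documents'][0][0]}"
    return call_database_tool(query)

print(hybrid_answer("why did churn spike in Q3?"))
```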

What is the difference between RAG and agentic RAG?

Here are the strengths and weaknesses of RAG vs. Agentic RAG based on different aspects:

Last Updated at 12-11-2024

| Aspect | Traditional RAG | Agentic RAG |
| --- | --- | --- |
| Prompt engineering | Manual optimization | Dynamic adjustments |
| Context awareness | Limited; static retrieval | Context-aware; adapts |
| Autonomy | No autonomous actions | Performs real-time actions |
| Reasoning | Needs external models | Built-in multi-step reasoning |
| Data quality | No evaluation mechanism | Ensures accuracy |
| Flexibility | Static rules | Dynamic retrieval |
| Retrieval efficiency | Static; costly | Optimized; cost-efficient |
| Simplicity | Straightforward setup | More complex configuration |
| Predictability | Consistent and rule-based | Dynamic behavior may vary |
| Cost in deployments | Cheaper for basic setups | Higher initial investment |

  • Prompt engineering
    • Traditional RAG: Relies heavily on manual optimization of prompts.
    • Agentic RAG: Dynamically adjusts prompts based on context and goals, reducing the need for manual intervention.
  • Context awareness
    • Traditional RAG: Has limited contextual awareness and relies on static retrieval processes.
    • Agentic RAG: Considers conversation history and adapts retrieval strategies dynamically based on context.
  • Autonomy
    • Traditional RAG: Lacks autonomous actions and cannot adapt to evolving situations.
    • Agentic RAG: Performs real-time actions and adjusts based on feedback and real-time observations.
  • Reasoning
    • Traditional RAG: Requires additional classifiers and models for multi-step reasoning and tool usage.
    • Agentic RAG: Handles multi-step reasoning internally, eliminating the need for external models.
  • Data quality
    • Traditional RAG: Has no built-in mechanism to evaluate data quality or ensure accuracy.
    • Agentic RAG: Evaluates data quality and performs post-generation checks to ensure accurate outputs.
  • Flexibility
    • Traditional RAG: Operates on static rules, limiting adaptability.
    • Agentic RAG: Employs dynamic retrieval strategies and adjusts its approach as needed.
  • Retrieval efficiency
    • Traditional RAG: Retrieval is static and often costly due to inefficiencies.
    • Agentic RAG: Optimizes retrievals to minimize unnecessary operations, reducing costs and improving efficiency.
  • Simplicity
    • Traditional RAG: Features a straightforward setup with fewer configuration complexities.
    • Agentic RAG: Involves more complex configurations to support dynamic and context-aware operations.
  • Predictability
    • Traditional RAG: Consistent and rule-based, but rigid in behavior.
    • Agentic RAG: Behavior can vary dynamically based on real-time context and observations.
  • Cost in deployments
    • Traditional RAG: Cheaper for basic setups, but may incur higher long-term operational costs.
    • Agentic RAG: Requires a higher initial investment due to advanced features and dynamic capabilities.

Different types of Agentic RAG models

Some of the agents that leverage Large Language Models (LLMs) within Retrieval-Augmented Generation (RAG) frameworks include:

  • Routing agent: Uses a Large Language Model (LLM) for agentic reasoning to select the most appropriate Retrieval-Augmented Generation (RAG) pipeline (e.g., summarization or question-answering) for a given query. The agent determines the best fit by analyzing the input query.
  • One-shot query planning agent: Decomposes complex queries into smaller subqueries, executes them across various RAG pipelines with different data sources, and combines the results into a comprehensive response (sketched after this list).
  • Tool use agent: Enhances standard RAG frameworks by incorporating external data sources (e.g., APIs, databases) to provide additional context. This allows for more enriched processing of queries using LLMs.
  • ReAct agent: Integrates reasoning and action for handling sequential, multi-part queries. It maintains an in-memory state and iteratively invokes tools, processes their outputs, and determines the next steps until the query is fully resolved.
  • Dynamic planning & execution agent: Aimed at managing more complex queries, this agent separates high-level planning from execution. It uses an LLM as a planner to design a computational graph of steps needed to answer the query and employs an executor to carry out these steps efficiently. The focus is on reliability, observability, parallelization, and optimization for production environments.
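
As an illustration of the one-shot query planning pattern named above, the sketch below hard-codes the planner's output; in practice both the planner and the final synthesis step would be LLM calls:

```python
# Toy one-shot query planner: decompose a complex question into
# subqueries, run each against a RAG pipeline, combine the results.
# `plan` and `rag_pipeline` are hypothetical stand-ins.
def plan(question: str) -> list[str]:
    # A planner LLM would generate these; hard-coded for illustration.
    return ["What was 2024 revenue?", "What was 2023 revenue?"]

def rag_pipeline(subquery: str) -> str:
    return f"[answer to: {subquery}]"

def one_shot_planner(question: str) -> str:
    subanswers = [rag_pipeline(q) for q in plan(question)]
    return " | ".join(subanswers)  # a synthesizer LLM in practice

print(one_shot_planner("How did revenue change from 2023 to 2024?"))
```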

Agentic RAG benefits

Agentic RAG improves LLMs through:

  • Autonomous & goal-oriented approach: Unlike traditional RAG, Agentic RAG acts like an autonomous agent, making decisions to achieve defined goals and pursue deeper, more meaningful interactions.
  • Improved context awareness & sensitivity: Agentic RAG dynamically considers conversation history, user preferences, prior interactions, and the current context to provide relevant, informed responses and decision-making.
  • Dynamic retrieval & advanced reasoning: It uses intelligent retrieval methods tailored to queries, while evaluating and verifying the accuracy and reliability of retrieved data.
  • Multi-agent orchestration: It coordinates multiple specialized agents, breaking down queries into manageable tasks and ensuring seamless coordination to deliver accurate results.
  • Increased accuracy with post-generation verification: Agentic RAG models perform quality checks on generated content, ensuring the best possible response and combining LLMs with agent-based systems for superior performance.
  • Adaptability & learning: These systems continuously learn and improve over time, enhancing problem-solving abilities, accuracy, and efficiency, and adapting to various domains for specific tasks.
  • Flexible tool utilization: Agents can leverage external tools like search engines, databases, or APIs to enhance data collection, processing, and customization for diverse applications.

Agentic RAG challenges

  • Data quality: Reliable outputs require high-quality, curated data. Challenges arise when integrating and processing diverse datasets, including textual and visual data, to meet user query requirements. Data retrieval processes must also ensure accuracy and consistency.
    • Tip: Implement automated data cleansing tools and AI-driven data validation techniques to ensure consistent and high-quality data integration across textual and visual datasets.
  • Scalability: Efficient management of system resources and retrieval processes is critical as the system grows. As user queries and data volumes increase, handling both real-time and batch data retrieval becomes a significant challenge.
    • Tip: Utilize scalable cloud-based infrastructure and distributed computing frameworks to handle increasing data loads efficiently. Incorporate dynamic load balancing for real-time query handling.
  • Explainability: Ensuring transparency in decision-making builds trust. Providing clear insights into how responses to user queries are generated, particularly when leveraging textual and visual data, remains a persistent challenge.
    • Tip: Leverage AI explainability tools like SHAP or LIME to make model predictions interpretable and integrate visualization dashboards to clarify the reasoning behind responses.
  • Privacy and security: Strong data protection and secure communication protocols are essential. Managing sensitive or confidential data requires robust encryption and compliance mechanisms during storage, retrieval, and processing.
    • Tip: Employ end-to-end encryption and access management solutions, and ensure compliance with data protection regulations such as GDPR or CCPA. Use secure API gateways for data retrieval.
  • Ethical concerns: Addressing bias, fairness, and misuse is crucial for responsible AI deployment. Ensuring unbiased responses to diverse user queries remains a key consideration in ethical AI design.

Future prospects

The latest research on agentic RAG includes improvement areas like:

  • Knowledge graph integration: Enhances reasoning by leveraging complex data relationships.
  • Emerging technologies: Incorporating tools like ontologies and the semantic web to advance system capabilities.
  • Specialized agent collaboration: Agents with expertise in different domains (e.g., sales, marketing, finance) work together in a coordinated workflow to address complex tasks.
  • Quality optimization: Addressing inconsistent output to improve the reliability and precision of multi-agent systems.

Hypothetical scenario

Some believe specialized AI agents might represent various roles within an organization, collectively driving its operations. Consider a scenario where you upload a Request for Proposal (RFP) into an Agentic RAG system. This would trigger a workflow like the following:

  • The pre-sales engineer agent identifies the appropriate services.
  • The marketing agent crafts compelling content.
  • The sales agent determines pricing strategies.
  • The finance and legal agents finalize terms and conditions.

FAQs

What is RAG?

Retrieval-Augmented Generation (RAG) is a technique that combines retrieval-based methods with generative models to enhance information retrieval and response generation.

Explore more on the retrieval-augmented generation technique and common models.

What is an agent?

An agent is a computer program designed to observe its environment, make decisions, and execute actions autonomously to achieve specific objectives without direct human intervention.

Usage in AI Systems
Agents are used to automate tasks, optimize processes, and make intelligent decisions in dynamic environments. Depending on their complexity, agents can range from simple rule-based systems to advanced models using learning techniques.

Types of Agents
  • Reactive agents: Operate based on the current state of the environment and follow predefined rules, without using past experiences.
  • Cognitive agents: Store past experiences and use them to analyze patterns and make decisions, enabling learning from previous interactions.
  • Collaborative agents: Interact with other agents or systems to achieve shared goals, often within multi-agent systems where coordination and information sharing are key.

Is agentic RAG better?

Agentic RAG can be better for tasks requiring more dynamic, context-aware decision-making and iterative interactions, but its effectiveness depends on the specific use case and implementation needs.

What is the difference between vanilla RAG and agentic RAG?

Vanilla RAG passively retrieves and generates answers based on a static query-response model, while agentic RAG incorporates iterative processes, decision-making, and dynamic interactions to refine responses or handle complex tasks.

Further reading

Explore other LLM improvements.

External sources
