Updated on May 6, 2025

Top 20+ Agentic RAG Frameworks in 2025


Agentic RAG extends traditional RAG with autonomous reasoning, boosting LLM performance and enabling greater specialization. We conducted a benchmark to assess its performance on routing between multiple databases and generating queries.

Explore agentic RAG frameworks and libraries, key differences from standard RAG, benefits, and challenges to unlock their full potential.

Agentic RAG benchmark: multi-database routing and query generation

We used our agentic RAG benchmark methodology to demonstrate the system’s ability to select the correct database from a set of five distinct databases, each with unique contextual information, and to generate semantically accurate SQL queries to retrieve the correct data.

In the agentic RAG benchmark, we used:

  • Agent Framework: Langchain – ReAct
  • Vector database: ChromaDB

In many real-world enterprise scenarios, data is often distributed across multiple databases, each containing specialized information relevant to specific domains or tasks. For example, one database might store financial records, while another holds customer data or inventory details.

An effective Agentic RAG system must intelligently route a user’s query to the most relevant database to retrieve accurate information. This process involves analyzing the query, understanding the context, and selecting the appropriate data source from a set of available databases.

Figure 1: Overview of Agentic RAG system routing a query to one of five distinct databases

Agent’s thought process

At the heart of an Agentic RAG system lies the LLM’s ability to autonomously reason and act to achieve a goal. The Langchain ReAct agent used in this benchmark manages this process through a “Thought-Action-Action Input” cycle.

Figure 2: Thought process of the Agentic RAG system.

1. Thought: The agent analyzes the incoming user query (“input”) and any provided evidence. It identifies keywords, entities, and the core information needed. It attempts to match the query against the descriptions of the available tools (databases). It determines which database is most relevant and what specific information (or SQL query) is required. This internal reasoning is visible in logs when `verbose=True` is enabled.

2. Action: Based on the conclusion reached in the Thought step, the agent selects the specific Tool (representing the target database) it intends to use.

3. Action Input: The agent determines the input to send to the selected tool. If it has formulated a specific SQL query, that query becomes the input. If it’s performing a more general lookup or hasn’t yet formed a query, it might be a descriptive phrase.

Figure 3: Thought process of agentic RAG in terminal.

This cycle can repeat, allowing the agent to handle multi-step reasoning, correction, and complex query decomposition, enhancing its capabilities beyond traditional RAG systems.
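
To make the cycle concrete, here is a minimal sketch of a ReAct agent wired up with database tools, in the spirit of the benchmark setup. It assumes the classic Langchain `initialize_agent` API (deprecated in newer releases in favor of LangGraph) plus the `langchain-openai` package; the query functions, tool descriptions, and model name are hypothetical placeholders, and the actual benchmark used five BIRD-SQL databases.

```python
# Minimal sketch: a Langchain ReAct agent that routes between database tools.
# Assumes langchain and langchain-openai are installed; the query functions
# below are hypothetical placeholders for real SQL execution.
from langchain.agents import AgentType, Tool, initialize_agent
from langchain_openai import ChatOpenAI

def query_financial_db(sql: str) -> str:
    # Placeholder: execute `sql` against the financial database and return rows.
    return "financial result rows"

def query_customer_db(sql: str) -> str:
    # Placeholder: execute `sql` against the customer database and return rows.
    return "customer result rows"

tools = [
    Tool(
        name="financial_db",
        func=query_financial_db,
        description="Financial records: transactions, revenue, budgets.",
    ),
    Tool(
        name="customer_db",
        func=query_customer_db,
        description="Customer data: accounts, contacts, demographics.",
    ),
]

agent = initialize_agent(
    tools,
    ChatOpenAI(model="gpt-4o", temperature=0),
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,  # prints the Thought / Action / Action Input trace
)

agent.run("What was total revenue in Q3?")
```

With `verbose=True`, the terminal trace mirrors Figures 2 and 3: a Thought line explaining the database choice, an Action naming the selected tool, and an Action Input carrying the descriptive phrase or SQL.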

Agentic RAG benchmark methodology

This benchmark was designed to measure the ability of Agentic RAG systems to identify the most relevant database for a specific query and then extract the correct information from that source. The methodology involved the following steps:

1. Dataset: The benchmark utilized the BIRD-SQL dataset,1 commonly employed in academic research for text-to-SQL tasks and database querying. This dataset is ideal as it provides natural language questions, the correct database containing the answer, and the accurate SQL query required to retrieve that answer.

2. Database Environment: Five distinct databases were set up, corresponding to different subject matters within the BIRD-SQL dataset. The schema and a brief description of each database were made accessible to the agent. It was designed such that the answer to each question resides in only one specific database. ChromaDB was used as the vector database for efficient storage and semantic retrieval of database descriptions and schemas.

3. Agent Architecture: A Langchain based ReAct (Reasoning and Acting) agent architecture was employed to process queries, select the appropriate database tool, and generate SQL queries when necessary. A separate Langchain “Tool” was defined for each database. These tools encapsulate the description of their respective databases, aiding the agent in its selection process.

4. Evaluation Process: For each question from the BIRD-SQL subset:

– The agent was presented with the question and any accompanying evidence text.

– The Tool the agent selected via its ReAct logic (representing the target database) was recorded.

– The input provided to the selected Tool (“Action Input” – which could be descriptive text or a direct SQL query) was recorded.

– The agent’s chosen database was compared against the ground truth database specified in the BIRD-SQL dataset (Metric: % Correct Database Selection Rate).

– The SQL query generated by the agent (if applicable) was compared semantically against the ground truth SQL query from BIRD-SQL (Metric: % Correct SQL Query Retrieval Rate). SQL normalization (e.g., lowercasing, removing aliases) was applied before comparison to focus on semantic correctness rather than exact string matching.
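
The normalization rules are described only at a high level above, so the sketch below is one illustrative, regex-based way to implement them:

```python
import re

def normalize_sql(sql: str) -> str:
    """Rough SQL normalization before semantic comparison (illustrative only)."""
    sql = sql.strip().rstrip(";").lower()
    # Naive alias removal; note it would also hit constructs like CAST(x AS int),
    # so a production-grade comparison should use a real SQL parser.
    sql = re.sub(r"\bas\s+\w+", "", sql)
    sql = sql.replace("`", "").replace('"', "")  # strip identifier quoting
    sql = re.sub(r"\s+", " ", sql)               # collapse whitespace
    return sql.strip()

# Queries differing only in casing, aliasing, or spacing now compare equal:
assert normalize_sql('SELECT name AS n FROM "users";') == \
       normalize_sql("select  name   from users")
```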

Agentic RAG frameworks & libraries

Agentic RAG frameworks enable AI systems not only to find information but also to reason, make decisions, and take actions. The table below lists the top tools and libraries that power Agentic RAG:

Updated at 01-24-2025
| Tool | Type | GitHub stars | Agent type |
| --- | --- | --- | --- |
| Langflow by Langflow AI | Agentic RAG framework | 38.1k | Multi-agent |
| DB GPT | Agentic RAG framework | 13.9k | Multi-agent |
| MetaGPT | Agentic RAG framework | 45.7k | Multi-agent |
| Ragapp | Agentic RAG framework | 3.9k | Multi-agent |
| GPT RAG by Azure | Agentic RAG framework | 890 | Multi-agent |
| Agentic RAG | Agentic RAG framework | 78 | Multi-agent |
| Qdrant Rag Eval | Agentic RAG framework | 56 | Single-agent |
| IBM Granite 3.0 | Agentic RAG framework | Not applicable | Multi-agent |
| AutoGen | Agent library | 35.6k | Multi-agent |
| Agent GPT | Agent library | 32k | Multi-agent |
| Botpress | Agent library | 12.9k | Multi-agent |
| LaVague | Agent library | 5.5k | Multi-agent |
| Superagent AI | Agent library | 5.4k | Multi-agent |
| Crew AI | Agent orchestrator | 22.3k | Multi-agent |
| Brainqub3 | Agent orchestrator | 375 | Single-agent |
| Transformers | LLMOps framework | 136k | Multi-agent |
| Anything LLM | LLMOps framework | 28.3k | Multi-agent |
| Haystack by Deepset AI | LLMOps framework | 18.1k | Multi-agent |
| NVIDIA NeMo Framework | LLMOps framework | 12.4k | |
| Langgraph by Langchain | LLMOps framework | 7.2k | Multi-agent |
| GenerativeAIExamples by NVIDIA | LLMOps framework | 2.5k | Multi-agent |
| Cortex by Snowflake | LLMOps framework | Not applicable | |
| Google Vertex AI | LLMOps framework | Not applicable | |
| Claude 3.5 Sonnet by Anthropic | LLM | Not applicable | |
| OpenAI GPT-4 | LLM | Not applicable | |

This list includes tools that meet the following criteria:

  • 50+ stars on GitHub.
  • Common usage in Agentic RAG projects.

Note that in the table:

  • Tool use refers to the native ability of a system to route and call tools within its environment.
  • Tool type refers to the main usage area of the tools, such as:
    • Agentic RAG frameworks are designed specifically for building, deploying, or configuring Agentic RAG systems.
    • Agent libraries enable the creation of intelligent agents that can reason, make decisions, and execute multi-step tasks.
    • LLMOps frameworks manage the lifecycle of LLMs and optimize the deployment and use of LLMs within agent-based systems.
    • LLMs with built-in tool-calling and routing capabilities allow dynamic decision-making; other LLMs may require external APIs or integrations to enable agent functionality.
  • Tool use and agent types were verified through public sources.

What is agentic RAG?

Agentic Retrieval-Augmented Generation (RAG) is an AI framework that combines retrieval techniques with generative models to enable dynamic decision-making and knowledge synthesis. This approach integrates the accuracy of traditional RAG with the generative capabilities of advanced AI, aiming to enhance the efficiency and effectiveness of AI-driven tasks.

Worldwide search trends for Agentic RAG until 05/22/2025

Limitations of traditional RAG systems

Agentic RAG aims to overcome the limitations of standard RAG systems, such as:

  • Difficulty in information prioritization: RAG systems often struggle to efficiently manage and prioritize data within large datasets, which can reduce overall performance.
  • Limited integration of expert knowledge: These systems may undervalue specialized, high-quality content, favoring general information instead.
  • Weak contextual understanding: While capable of retrieving data, they frequently fail to fully comprehend its relevance or how it aligns with the specific query.
Figure 4: Agentic RAG architecture diagram in comparison with the traditional RAG2

How to build an agentic RAG

1. Tool use

  • Employ routers: The first step involves employing routers to determine whether to retrieve documents, perform calculations, or rewrite the query. This adds decision-making capabilities that route requests to multiple tools, enabling large language models (LLMs) to select appropriate pipelines (a minimal router sketch follows Figure 5).
  • Tool-calling integration: This refers to creating an interface for agents to connect with selected tools. Users can leverage LLMs with tool-calling capabilities or build their own to:
    • Pick a function to execute.
    • Infer the necessary arguments for that function.
    • Enhance query understanding beyond traditional RAG pipelines, enabling tasks like database queries or complex reasoning.
Figure 5: How to build Agentic RAG by adding a calling agent3
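
To illustrate the routing step described above, here is a bare-bones sketch in which an LLM classifies the query into a route and the label dispatches to a pipeline. `llm_complete` and the three handlers are hypothetical stubs standing in for a real chat-completion call and real pipelines.

```python
# Hypothetical stubs: a real system would use a chat-completion call and
# actual retrieval / calculation / rewriting pipelines here.
def llm_complete(prompt: str) -> str:
    return "retrieve"  # stand-in for an LLM classification call

def run_retrieval(query: str) -> str:
    return f"[documents retrieved for: {query}]"

def run_calculator(query: str) -> str:
    return f"[computed answer for: {query}]"

def rewrite_query(query: str) -> str:
    return f"[rewritten query: {query}]"

ROUTES = {"retrieve": run_retrieval, "calculate": run_calculator, "rewrite": rewrite_query}

def route(query: str) -> str:
    prompt = (
        "Classify the user query into exactly one of: retrieve, calculate, rewrite.\n"
        f"Query: {query}\nAnswer with the label only."
    )
    label = llm_complete(prompt).strip().lower()
    return ROUTES.get(label, run_retrieval)(query)  # fall back to retrieval

print(route("What does our refund policy say about digital goods?"))
```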

2. Agent implementation

  • Single-call agents: A query triggers a single call to the appropriate tool, returning the response. This is effective for straightforward tasks, but may struggle with vague or complex queries.
  • Multi-call agents: This approach involves dividing tasks among specialized agents, with each agent focusing on a specific subtask. For example:
    • Retriever agent: Optimizes real-time query retrieval.
    • Manager agent: Handles task delegation and orchestration (a sketch of this split follows Figure 6).
Figure 6: Multi-agent RAG architecture4
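
Here is a hedged sketch of the manager/retriever split, using hypothetical agent classes rather than any particular framework's API:

```python
class RetrieverAgent:
    """Specialist: answers one focused sub-question from its data source."""
    def run(self, subtask: str) -> str:
        return f"[evidence for: {subtask}]"  # placeholder for real retrieval

class ManagerAgent:
    """Orchestrator: splits the query, delegates, and merges results."""
    def __init__(self, workers: list):
        self.workers = workers

    def run(self, query: str) -> str:
        # Naive decomposition for illustration; a real manager would use an LLM.
        subtasks = [part.strip() for part in query.split(" and ")]
        return " | ".join(
            worker.run(task) for worker, task in zip(self.workers, subtasks)
        )

manager = ManagerAgent([RetrieverAgent(), RetrieverAgent()])
print(manager.run("summarize Q3 revenue and list the top customers"))
```

In practice, the manager's decomposition step would itself be an LLM call, and each worker could be a full RAG pipeline rather than a simple stub.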

3. Multi-step reasoning

For complex workflows, agents use reasoning loops to perform iterative, multi-step reasoning while retaining memory of intermediate steps. These loops involve:

  • Calling multiple tools.
  • Retrieving data and validating its relevance.
  • Rewriting queries as needed.

Frameworks often define multiple agents to handle specific subtasks, ensuring efficient execution of the overall process.
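
Such a loop can be expressed as a simple control structure. In the sketch below, `choose_tool`, `is_relevant`, `rewrite`, and `synthesize_answer` are hypothetical placeholders for LLM-driven steps, while the `memory` list carries intermediate results between iterations:

```python
# Hypothetical placeholders for LLM-driven steps.
def choose_tool(query, memory): return ("search", query)
def is_relevant(observation, query): return "evidence" in observation
def rewrite(query, memory): return query + " (refined)"
def synthesize_answer(query, memory): return f"answer from {len(memory)} step(s)"

def reasoning_loop(query, tools, max_steps=5):
    memory = []  # intermediate results persist across iterations
    for _ in range(max_steps):
        tool_name, tool_input = choose_tool(query, memory)  # pick a tool
        observation = tools[tool_name](tool_input)          # call the tool
        memory.append((tool_name, tool_input, observation))
        if is_relevant(observation, query):                 # validate relevance
            break
        query = rewrite(query, memory)                      # rewrite and retry
    return synthesize_answer(query, memory)

print(reasoning_loop("find the policy", {"search": lambda q: f"evidence for {q}"}))
```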

Figure 7: Multi-document RAG5

4. Hybrid approaches: combining retrieval and execution

A hybrid approach combines retrieval pipelines with dynamic execution strategies:

  • Embedding and vector-based retrieval strategies for document access (illustrated in the sketch after this list).
  • Tool-calling capabilities for dynamic query resolution.
  • Multi-agent collaboration for specialized subtasks.
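
On the retrieval side of such a hybrid, here is a minimal example with ChromaDB (the same vector database used in our benchmark); the collection name and documents are illustrative:

```python
import chromadb

client = chromadb.Client()  # in-memory client; use PersistentClient for disk
collection = client.create_collection("db_descriptions")

# Index short descriptions so the agent can match queries to data sources.
collection.add(
    ids=["financial", "customer"],
    documents=[
        "Financial records: transactions, revenue, budgets.",
        "Customer data: accounts, contacts, demographics.",
    ],
)

# Semantic lookup: which source best matches the user's query?
results = collection.query(query_texts=["total revenue in Q3"], n_results=1)
print(results["ids"][0])  # likely ['financial']
```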

What is the difference between RAG and agentic RAG?

Here are the strengths and weaknesses of RAG vs. Agentic RAG based on different aspects:

Updated at 12-11-2024
| Aspect | Traditional RAG | Agentic RAG |
| --- | --- | --- |
| Prompt engineering | Manual optimization | Dynamic adjustments |
| Context awareness | Limited; static retrieval | Context-aware; adapts |
| Autonomy | No autonomous actions | Performs real-time actions |
| Reasoning | Needs external models | Built-in multi-step reasoning |
| Data quality | No evaluation mechanism | Ensures accuracy |
| Flexibility | Static rules | Dynamic retrieval |
| Retrieval efficiency | Static; costly | Optimized; cost-efficient |
| Simplicity | Straightforward setup | More complex configuration |
| Predictability | Consistent and rule-based | Dynamic behavior may vary |
| Cost in deployments | Cheaper for basic setups | Higher initial investment |

  • Prompt engineering
    • Traditional RAG: Relies heavily on manual optimization of prompts.
    • Agentic RAG: Dynamically adjusts prompts based on context and goals, reducing the need for manual intervention.
  • Context awareness
    • Traditional RAG: Has limited contextual awareness and relies on static retrieval processes.
    • Agentic RAG: Considers conversation history and adapts retrieval strategies dynamically based on context.
  • Autonomy
    • Traditional RAG: Lacks autonomous actions and cannot adapt to evolving situations.
    • Agentic RAG: Performs real-time actions and adjusts based on feedback and real-time observations.
  • Reasoning
    • Traditional RAG: Requires additional classifiers and models for multi-step reasoning and tool usage.
    • Agentic RAG: Handles multi-step reasoning internally, eliminating the need for external models.
  • Data quality
    • Traditional RAG: Has no built-in mechanism to evaluate data quality or ensure accuracy.
    • Agentic RAG: Evaluates data quality and performs post-generation checks to ensure accurate outputs.
  • Flexibility
    • Traditional RAG: Operates on static rules, limiting adaptability.
    • Agentic RAG: Employs dynamic retrieval strategies and adjusts its approach as needed.
  • Retrieval efficiency
    • Traditional RAG: Retrieval is static and often costly due to inefficiencies.
    • Agentic RAG: Optimizes retrievals to minimize unnecessary operations, reducing costs and improving efficiency.
  • Simplicity
    • Traditional RAG: Features a straightforward setup with fewer configuration complexities.
    • Agentic RAG: Involves more complex configurations to support dynamic and context-aware operations.
  • Predictability
    • Traditional RAG: Consistent and rule-based, but rigid in behavior.
    • Agentic RAG: Behavior can vary dynamically based on real-time context and observations.
  • Cost in deployments
    • Traditional RAG: Cheaper for basic setups, but may incur higher long-term operational costs.
    • Agentic RAG: Requires a higher initial investment due to advanced features and dynamic capabilities.

Different types of Agentic RAG models

Some of the agents that leverage Large Language Models (LLMs) within Retrieval-Augmented Generation (RAG) frameworks include:

  • Routing agent: Uses a Large Language Model (LLM) for agentic reasoning to select the most appropriate Retrieval-Augmented Generation (RAG) pipeline (e.g., summarization or question-answering) for a given query. The agent determines the best fit by analyzing the input query.
  • One-shot query planning agent: Decomposes complex queries into smaller subqueries, executes them across various RAG pipelines with different data sources, and combines the results into a comprehensive response (see the sketch after this list).
  • Tool use agent: Enhances standard RAG frameworks by incorporating external data sources (e.g., APIs, databases) to provide additional context. This allows for more enriched processing of queries using LLMs.
  • ReAct agent: Integrates reasoning and action for handling sequential, multi-part queries. It maintains an in-memory state and iteratively invokes tools, processes their outputs, and determines the next steps until the query is fully resolved.
  • Dynamic planning & execution agent: Aimed at managing more complex queries, this agent separates high-level planning from execution. It uses an LLM as a planner to design a computational graph of steps needed to answer the query and employs an executor to carry out these steps efficiently. The focus is on reliability, observability, parallelization, and optimization for production environments.
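
As an illustration of the one-shot query planning pattern, the sketch below decomposes a query once, runs each subquery through its pipeline, and merges the results. `plan_subqueries` and `PIPELINES` are hypothetical stand-ins for an LLM planning call and real RAG pipelines.

```python
# Hypothetical stand-ins for an LLM planning call and real RAG pipelines.
def plan_subqueries(query: str) -> list:
    # A real planner would ask an LLM to decompose the query; fixed here.
    return [("sales_db", "quarterly revenue"), ("docs", "pricing policy")]

PIPELINES = {
    "sales_db": lambda q: f"[sales answer: {q}]",
    "docs": lambda q: f"[doc answer: {q}]",
}

def one_shot_plan_and_run(query: str) -> str:
    subqueries = plan_subqueries(query)           # decompose once, up front
    partials = [PIPELINES[src](sub) for src, sub in subqueries]
    return " ".join(partials)                     # combine into one response

print(one_shot_plan_and_run("How does revenue relate to our pricing policy?"))
```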

Agentic RAG benefits

Agentic RAG improves LLMs through:

  • Autonomous & goal-oriented approach: Unlike traditional RAG, Agentic RAG acts like an autonomous agent, making decisions to achieve defined goals and pursue deeper, more meaningful interactions.
  • Improved context awareness & sensitivity: Agentic RAG dynamically considers conversation history, user preferences, prior interactions, and the current context to provide relevant, informed responses and decision-making.
  • Dynamic retrieval & advanced reasoning: It uses intelligent retrieval methods tailored to queries, while evaluating and verifying the accuracy and reliability of retrieved data.
  • Multi-agent orchestration: It coordinates multiple specialized agents, breaking down queries into manageable tasks and ensuring seamless coordination to deliver accurate results.
  • Increased accuracy with post-generation verification: Agentic RAG models perform quality checks on generated content, ensuring the best possible response and combining LLMs with agent-based systems for superior performance.
  • Adaptability & learning: These systems continuously learn and improve over time, enhancing problem-solving abilities, accuracy, and efficiency, and adapting to various domains for specific tasks.
  • Flexible tool utilization: Agents can leverage external tools like search engines, databases, or APIs to enhance data collection, processing, and customization for diverse applications.

Agentic RAG challenges

  • Data quality: Reliable outputs require high-quality, curated data. Challenges arise when integrating and processing diverse datasets, including textual and visual data, to meet user query requirements. Further data retrieval processes must also ensure accuracy and consistency.
    • Tip: Implement automated data cleansing tools and AI-driven data validation techniques to ensure consistent and high-quality data integration across textual and visual datasets.
  • Scalability: Efficient management of system resources and retrieval processes is critical as the system grows. As user queries and data volumes increase, handling both real-time and batch processing for further data retrieval becomes a significant challenge.
    • Tip: Utilize scalable cloud-based infrastructure and distributed computing frameworks to handle increasing data loads efficiently. Incorporate dynamic load balancing for real-time query handling.
  • Explainability: Ensuring transparency in decision-making builds trust. Providing clear insights into how responses to user queries are generated, particularly when leveraging textual and visual data, remains a persistent challenge.
    • Tip: Leverage AI explainability tools like SHAP or LIME to make model predictions interpretable and integrate visualization dashboards to clarify the reasoning behind responses.
  • Privacy and security: Strong data protection and secure communication protocols are essential. Managing sensitive or confidential data requires robust encryption and compliance mechanisms during storage, further data retrieval, and processing.
    • Tip: Employ end-to-end encryption and access management solutions, and ensure compliance with data protection regulations such as GDPR or CCPA. Use secure API gateways for further data retrieval.
  • Ethical concerns: Addressing bias, fairness, and misuse is crucial for responsible AI deployment. Ensuring unbiased responses to diverse user queries remains a key consideration in ethical AI design.

Future prospects

The latest research on agentic RAG includes improvement areas like:

  • Knowledge graph integration: Enhances reasoning by leveraging complex data relationships.
  • Emerging technologies: Incorporating tools like ontologies and the semantic web to advance system capabilities.
  • Specialized agent collaboration: Agents with expertise in different domains (e.g., sales, marketing, finance) work together in a coordinated workflow to address complex tasks.
  • Quality optimization: Addressing inconsistent output to improve the reliability and precision of multi-agent systems.

Hypothetical scenario

Some believe specialized AI agents might represent various roles within an organization, collectively driving its operations. Imagine uploading a Request for Proposal (RFP) into an Agentic RAG model. This would trigger a workflow like the following:

  • The pre-sales engineer agent identifies the appropriate services.
  • The marketing agent crafts compelling content.
  • The sales agent determines pricing strategies.
  • The finance and legal agents finalize terms and conditions.

FAQs

What is RAG?

Retrieval-Augmented Generation (RAG) is a technique that combines retrieval-based methods with generative models to enhance information retrieval and response generation.

Explore more on the retrieval-augmented generation technique and common models.

What is an agent?

An agent is a computer program designed to observe its environment, make decisions, and execute actions autonomously to achieve specific objectives without direct human intervention.

Usage in AI Systems
Agents are used to automate tasks, optimize processes, and make intelligent decisions in dynamic environments. Depending on their complexity, agents can range from simple rule-based systems to advanced models using learning techniques.

Types of Agents
  • Reactive agents: Operate based on the current state of the environment and follow predefined rules, without using past experiences.
  • Cognitive agents: Store past experiences and use them to analyze patterns and make decisions, enabling learning from previous interactions.
  • Collaborative agents: Interact with other agents or systems to achieve shared goals, often within multi-agent systems where coordination and information sharing are key.

Is agentic RAG better?

Agentic RAG can be better for tasks requiring more dynamic, context-aware decision-making and iterative interactions, but its effectiveness depends on the specific use case and implementation needs.

What is the difference between vanilla RAG and agentic RAG?

Vanilla RAG passively retrieves and generates answers based on a static query-response model, while agentic RAG incorporates iterative processes, decision-making, and dynamic interactions to refine responses or handle complex tasks.

Further reading

Explore other LLM improvements.

