
Following the launch of Rabbit, an AI device that can operate mobile apps, the term large action model (LAM) has gained popularity. These models move beyond conversation by turning LLMs into “agents” that can bridge today's siloed, app-driven world without requiring the user to tap through apps or integrate an API themselves.
The line between hype and reality around LAMs is blurry, but in short: a LAM is a large language model (LLM) specifically trained to take actions (e.g., send API requests).1
What is a large action model (LAM)?
A large action model is an AI model that can reason about complex tasks and carry them out by translating them into concrete actions.
How do large action models (LAM) work?
LAMs interact with applications through their user interfaces or, more commonly, through their APIs. For example, they can process a website's or application's images and code to decide on their next steps and perform actions.
This allows LAMs to navigate both user and application interfaces. For example, if a piece of information already exists or is accessible through another app, a LAM will retrieve it from that app rather than asking the user.
Within LAMs, this degree of autonomy and comprehension turns generative AI into an active assistant that can perform tasks such as:
- administering social media platforms
- retrieving weather information
- making reservations
- processing financial transactions
- connecting to IoT devices and sending commands to them (e.g., calling an Uber)

Source: Salesforce2
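The action-taking behavior described above can be sketched as a dispatcher that maps a structured intent (the kind of output a LAM might emit) to an allowed action. The intent format, action names, and handlers below are hypothetical stand-ins, not any vendor's actual API:

```python
# Minimal sketch: a LAM-style dispatcher mapping parsed user intents to
# actions. All names here are illustrative assumptions.

def get_weather(city):
    # Stand-in for a real weather API request (e.g., via HTTP).
    return f"Forecast for {city}: sunny"

def make_reservation(restaurant, time):
    # Stand-in for a booking API call.
    return f"Reserved a table at {restaurant} for {time}"

# Registry of actions the model is allowed to invoke.
ACTIONS = {
    "get_weather": get_weather,
    "make_reservation": make_reservation,
}

def execute(intent):
    """Dispatch a structured intent to the corresponding action."""
    handler = ACTIONS[intent["action"]]
    return handler(**intent["args"])

print(execute({"action": "get_weather", "args": {"city": "Berlin"}}))
```

Constraining the model to a fixed registry of actions is a common safety choice: the LAM can only invoke operations it has been explicitly granted.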
LAMs and LLMs – understanding the difference
Aspect | Large action models (LAMs) | Large language models (LLMs) |
---|---|---|
Functionality | Understand user intent and execute actions (e.g., API calls, interface interactions) | Reason and generate text responses |
Learning approach | Learn from human interactions and demonstrations of actions | Trained on large data sets to comprehend context and nuance in human language |
Example task: booking a room | May handle the full procedure from a single command, including browsing interfaces and filling out hotel forms | May give directions and links, but cannot finalize the task |
Performance | Adequate for specific tasks with limited scope | High performance across a wide range of tasks |
Adaptability | Require more manual intervention to adapt to new tasks or domains | Can adapt to a wide range of tasks with minimal retraining |
Though LAMs and large language models —LLMs— share some similarities, like their ability to grasp human intentions, their core purposes differ greatly.
LAMs are designed to take action, whereas LLMs excel in processing and generating language. While an LLM might suggest ideas or generate text based on your input, a LAM takes it a step further by autonomously performing tasks like making appointments, ordering products, or filling out forms.
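The distinction can be made concrete in a few lines: on the same request, an LLM returns text while a LAM changes state. The canned reply, the `book_room` backend, and the hard-coded parse are all hypothetical stand-ins for what real models would generate and infer:

```python
# Sketch contrasting LLM and LAM behavior on the same request.

def llm_respond(request):
    # An LLM generates language: advice and links, no side effects.
    return "You can book a room at the hotel's website; here are the steps..."

BOOKINGS = []  # pretend backend state

def book_room(hotel, nights):
    # Stand-in for a real booking API; mutates state.
    BOOKINGS.append((hotel, nights))
    return f"Booked {nights} night(s) at {hotel}"

def lam_handle(request):
    # A LAM parses the request into an action and performs it.
    # (Parsing is hard-coded here; a real LAM would infer it.)
    return book_room(hotel="Grand Hotel", nights=2)

print(llm_respond("Book me a room for 2 nights"))  # text only
print(lam_handle("Book me a room for 2 nights"))   # state actually changes
```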
Large action models (LAMs): hype or real?
While some companies portray LAMs as a new architecture, the functionality attributed to them has been implemented for some time using LLM agents.3
In other words, LLM agents have already been performing the tasks LAMs are described as doing. The two concepts share common functionalities (see figure):
- Context-based analysis
- Prompt engineering
- Leveraging tools
- Reasoning4
Figure: Language-based AI agent workflow

Source: ICLR5
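The reasoning-plus-tool-use loop in the figure follows the ReAct pattern cited above: the agent alternates reasoning ("thoughts") with tool calls ("actions") and observes the results. The toy loop below scripts the trace by hand; in a real agent, each thought and action would come from the LLM, and the tool set here is a hypothetical example:

```python
# A toy ReAct-style loop: alternate thoughts (reasoning) with actions
# (tool calls) until an answer is produced. The trace is scripted; a real
# agent would generate it step by step with an LLM.

TOOLS = {
    "lookup_capital": lambda country: {"France": "Paris"}.get(country, "unknown"),
}

# Stand-in for model output: (thought, action, argument) steps.
TRACE = [
    ("I need the capital of France.", "lookup_capital", "France"),
]

def react_loop(trace):
    observation = None
    for thought, action, arg in trace:
        observation = TOOLS[action](arg)  # act, then observe the result
    return observation

print(react_loop(TRACE))  # Paris
```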
Furthermore, LAMs can be placed among language-based agent designs such as (1) prompt-template-based AI agents, (2) learnable-prompt AI agents, and (3) large action models (LAMs); in this framing, a LAM is an LLM specifically trained to execute human actions learned from data.6
Real-life LAM examples
1. Automatically completing forms or spreadsheets on websites
A LAM can recognize the needed fields on a form, gather the required data (e.g. addresses, names, passwords, and credit card numbers) from a database or user profile, and enter it into the proper fields.
Video: Automatically completing forms or spreadsheets with LAM
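The field-matching step described above can be sketched as a lookup against a stored profile, with unknown fields flagged for the user rather than guessed. The field names and profile data are hypothetical:

```python
# Sketch: matching a form's required fields to known profile data
# before filling it in. All field names and values are illustrative.

PROFILE = {"name": "Ada Lovelace", "email": "ada@example.com", "city": "London"}

def fill_form(required_fields, profile):
    """Return (filled field->value pairs, fields the user must supply)."""
    filled, missing = {}, []
    for field in required_fields:
        if field in profile:
            filled[field] = profile[field]
        else:
            missing.append(field)  # ask the user instead of guessing
    return filled, missing

filled, missing = fill_form(["name", "email", "phone"], PROFILE)
print(filled)   # values recovered from the profile
print(missing)  # ['phone'] -- must be requested from the user
```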
2. Completing online transactions
A LAM can interact with buttons, links, and dropdown menus. It can also enter specific text into text fields and search bars. This is precisely what ordering a pizza online entails: filling out text forms, clicking buttons, and choosing menu options.
Video: HyperWriteAI Assistant Studio using the browser to place an online order
Source: HyperWriteAI8
Technologies in LAMs
A LAM may utilize the following techniques:
- Connections: Connect to several apps and APIs.
- Neuro-symbolic approach: Neuro-symbolic programming allows LAMs to combine neural networks trained on large data sets with built-in symbolic logical reasoning. This enables them to notice patterns while also comprehending the underlying reasoning, making them more adaptive and capable of taking meaningful actions based on the “why” behind user requests.
- Instruction abstraction: Create instructions that provide a modular, hierarchical abstraction for modeling actions via an interface.
- Direct human modeling: Identify a user's intent, habits, and routines across applications to develop a template for acting.
- Task reasoning: Analyze the relationships between tasks, identifying dependencies and determining the optimal order of execution. It ensures that prerequisite tasks are completed before dependent ones begin. This enables the LAM to improve workflows based on past interactions.
- Continuous learning: LAMs not only perform task execution but also improve their performance over time through continuous learning. For example, LAM could manage customer inquiries about orders, returns, and product information. Over time, it would become more adept at resolving issues quickly, even predicting and addressing potential problems before customers reach out.
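The neuro-symbolic point above can be illustrated with a toy split: a "neural" component scores intents from keywords (standing in for a trained network), and a symbolic rule layer decides whether the action is permitted. All names and thresholds are hypothetical:

```python
# Toy neuro-symbolic sketch: pattern matching plus explicit logical rules.

def neural_intent(text):
    # Pattern-recognition stand-in: pick the intent with most keyword hits.
    keywords = {"refund": ["refund", "money back"], "order": ["buy", "order"]}
    scores = {intent: sum(w in text.lower() for w in ws)
              for intent, ws in keywords.items()}
    return max(scores, key=scores.get)

def symbolic_check(intent, amount):
    # Explicit symbolic rule: refunds over 100 need human approval.
    if intent == "refund" and amount > 100:
        return "escalate"
    return "execute"

intent = neural_intent("I want my money back")
print(intent, symbolic_check(intent, amount=250))  # refund escalate
```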
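Task reasoning as described above, identifying dependencies and running prerequisites first, is essentially a topological sort. A minimal sketch with a hypothetical booking workflow:

```python
# Sketch of task reasoning: order tasks so that every prerequisite runs
# before the tasks that depend on it (a topological sort).
from graphlib import TopologicalSorter

# task -> set of tasks it depends on (illustrative workflow)
DEPS = {
    "confirm_booking": {"pay"},
    "pay": {"select_room"},
    "select_room": {"search_hotels"},
    "search_hotels": set(),
}

order = list(TopologicalSorter(DEPS).static_order())
print(order)  # ['search_hotels', 'select_room', 'pay', 'confirm_booking']
```

Python's standard-library `graphlib` handles the ordering; a real LAM would additionally decide which tasks can run in parallel and re-plan when a step fails.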
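The continuous-learning example can be sketched as feedback accumulation: inquiry categories that recur often enough get surfaced as candidates for proactive handling. The categories and threshold below are hypothetical, and real systems would update model weights rather than simple counts:

```python
# Sketch of continuous learning via feedback counts: frequently recurring
# inquiries become candidates for proactive answers.
from collections import Counter

class InquiryAssistant:
    def __init__(self, proactive_threshold=3):
        self.counts = Counter()
        self.threshold = proactive_threshold

    def record(self, category):
        # Each handled inquiry is logged as feedback.
        self.counts[category] += 1

    def proactive_topics(self):
        """Categories frequent enough to address before customers ask."""
        return [c for c, n in self.counts.items() if n >= self.threshold]

bot = InquiryAssistant()
for c in ["returns", "returns", "orders", "returns"]:
    bot.record(c)
print(bot.proactive_topics())  # ['returns']
```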
Further reading
External Links
- 1. “Introducing SAM – A 7B Small Agentic Model that outperforms GPT-3.5 and Orca on reasoning benchmarks.” SuperAGI.
- 2. “Salesforce/xLAM-1b-fc-r.” Hugging Face.
- 3. “Language-based AI Agents and Large Action Models (LAMs).” Juan Carlos Niebles.
- 4. “What Are Large Action Models and How Do They Work?” Trinetix.
- 5. “ReAct: Synergizing Reasoning and Acting in Language Models.” arXiv:2210.03629.
- 6. “AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning.” arXiv:2402.15506.
- 7. “UiPath joins Large Action Model Race.” YouTube.
- 8. Matt Shumer on X: “Today, we’re unveiling Personal Assistant - @HyperWriteAI's groundbreaking AI agent that can use a web browser like a human. One agent to rule them all. It’s time to reimagine the way we interact with the internet. https://t.co/csGjtIU0.