
Top 10 Edge AI Chip Makers with Use Cases

Cem Dilmegani
updated on Nov 17, 2025

The demand for low-latency processing has driven innovation in edge AI chips. These processors are designed to perform AI computations locally on devices rather than relying on cloud-based solutions.

Based on our experience analyzing AI chip makers, we identified the leading solutions for robotics, industrial IoT, computer vision, and embedded systems.

*TOPS = Tera Operations Per Second. These are maximum quoted values by vendors.
**Kria K26 performance varies depending on the FPGA configuration.

Analysis of Edge AI chips

1. NVIDIA Jetson AGX Orin

NVIDIA Jetson AGX Orin delivers 275 TOPS, the highest performance of any module in this comparison. The module is built on NVIDIA’s Ampere architecture and is designed for robotics and autonomous systems that require significant on-device processing capabilities.

Key specifications:

  • Power consumption: 10-60W (configurable based on workload)
  • Memory: Up to 64GB LPDDR5
  • Software: Full CUDA support, compatibility with NVIDIA’s datacenter AI stack

The power consumption range of 10-60W provides flexibility for different deployment scenarios. Lower power modes can extend battery life in mobile robotics applications, while maximum performance mode supports multiple concurrent AI workloads.
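The battery-life implication of the configurable power range can be sketched with a back-of-the-envelope calculation. The 90 Wh battery capacity below is an assumed example value, not a Jetson specification, and the estimate ignores conversion losses and other system loads:

```python
# Idealized runtime estimate for a module at different power modes.
# The 90 Wh battery capacity is an assumed example value, not a Jetson spec.
def runtime_hours(battery_wh: float, power_w: float) -> float:
    """Hours of operation, ignoring conversion losses and other loads."""
    return battery_wh / power_w

BATTERY_WH = 90.0  # hypothetical battery pack
for mode_w in (10, 30, 60):  # Jetson AGX Orin's configurable range is 10-60 W
    print(f"{mode_w:>2} W mode: ~{runtime_hours(BATTERY_WH, mode_w):.1f} h")
```

Under these assumptions, dropping from the 60 W maximum-performance mode to the 10 W mode extends runtime sixfold, which is why mobile robots often reserve peak modes for short bursts.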

NVIDIA’s software ecosystem represents a significant advantage. Models developed for NVIDIA datacenter GPUs can be deployed on Jetson with minimal modification. This compatibility reduces development time for teams already working within the NVIDIA ecosystem.

2. Google Coral Dev Board

Google’s Coral Dev Board features the Edge TPU, a purpose-built ASIC for running TensorFlow Lite models at the edge. The Edge TPU delivers 4 TOPS while consuming only 2W, making it suitable for battery-powered IoT devices and embedded systems.

Key specifications:

  • Power consumption: 2W
  • Software: TensorFlow Lite, supports quantized models
  • Form factors: USB accelerator, M.2 module, SoM, and dev board

The Edge TPU’s architecture prioritizes power efficiency over raw performance. The 4 TOPS performance is achieved through 8-bit integer quantization, which reduces model size and power consumption.
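The size reduction from 8-bit quantization can be illustrated with a minimal affine (scale/zero-point) quantizer in pure Python. This shows the general technique, not the Edge TPU compiler's exact scheme:

```python
# Minimal per-tensor affine int8 quantization: map float weights to 8-bit
# integers with a scale and zero point (illustrative, not TPU-specific).
def quantize(weights, qmin=-128, qmax=127):
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back if all values equal
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return [(qi - zero_point) * scale for qi in q]

w = [-0.51, 0.0, 0.27, 1.3]
q, s, z = quantize(w)
w_hat = dequantize(q, s, z)  # close to w; each value now fits in one byte
```

Each 32-bit float becomes a single byte, a 4x reduction in weight storage, at the cost of a small rounding error bounded by the scale factor.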

The Coral ecosystem includes multiple form factors. The USB accelerator enables adding AI capabilities to existing systems through a single USB connection. The M.2 module provides a more integrated solution for custom hardware designs.

Limitations:

  • Limited to TensorFlow Lite models
  • Requires model quantization to int8
  • Performance decreases significantly for operations not optimized for the TPU

3. Intel Neural Compute Stick 2

Intel’s Neural Compute Stick 2 utilizes the Movidius Myriad X VPU to deliver 4 TOPS in a compact USB form factor. The device enables the addition of AI inference capabilities to existing systems without requiring hardware modifications.

Key specifications:

  • Power consumption: 5W
  • Software: OpenVINO toolkit support
  • Form factor: USB 3.0 stick

Intel’s OpenVINO toolkit provides model optimization and runtime libraries. The toolkit supports models from multiple frameworks, including TensorFlow, PyTorch, and ONNX. Model optimization through OpenVINO can significantly improve inference performance on Myriad X hardware.

Use cases:

  • Drones requiring real-time object detection
  • Smart cameras for retail analytics
  • AR devices with on-device image processing

4. AMD Xilinx Kria K26 SOM

The Kria K26 System-on-Module combines a Zynq UltraScale+ MPSoC with FPGA fabric, enabling adaptive edge AI solutions. The FPGA architecture allows customization of the processing pipeline for specific computer vision and sensor fusion workloads.

Key specifications:

  • Processing: Quad-core Arm Cortex-A53, dual-core Arm Cortex-R5F
  • FPGA: UltraScale+ programmable logic
  • Power consumption: 5-15W
  • Memory: 4GB DDR4

AMD provides pre-built vision AI applications through the Kria KV260 Vision AI Starter Kit. These applications include smart camera implementations with capabilities for object detection, classification, and tracking.

Advantages:

  • Customizable processing pipeline
  • Low-latency sensor interfaces
  • Adaptable to new AI model architectures

Limitations:

  • Requires FPGA development expertise for custom implementations
  • Performance depends on FPGA configuration
  • Higher development complexity compared to fixed-function accelerators

5. Qualcomm Robotics RB5 Platform

Qualcomm’s Robotics RB5 platform integrates 5G connectivity with edge AI processing, delivering approximately 15 TOPS through its Qualcomm AI Engine. The platform targets autonomous robots and drones that require both high-bandwidth connectivity and on-device AI processing.

Key specifications:

  • AI performance: 15 TOPS
  • Connectivity: 5G, Wi-Fi 6, Bluetooth 5.1
  • Processing: Qualcomm Kryo 585 CPU, Adreno 650 GPU, Hexagon 698 DSP
  • Power consumption: 5-15W

The integration of 5G offers high-bandwidth, low-latency connectivity for applications that require real-time cloud communication.

The RB5 platform includes hardware support for up to seven concurrent camera inputs. This multi-camera capability supports 360-degree perception systems for autonomous mobile robots.

Use cases:

  • Autonomous delivery robots
  • Industrial inspection drones
  • Warehouse automation systems
  • Connected vehicles

6. Hailo-8 AI Accelerator

Hailo-8 delivers 26 TOPS while consuming only 2.5-3W, representing one of the highest performance-per-watt ratios among edge AI chips.

Key specifications:

  • Performance: 26 TOPS
  • Power consumption: 2.5-3W
  • Form factors: M.2 module, PCIe card
  • Software: Hailo SDK with model zoo

The chip supports standard neural network layers and can run models developed in TensorFlow, PyTorch, and ONNX. Hailo’s compiler optimizes model graphs for their architecture during deployment.

Performance characteristics:

  • Low latency for real-time applications
  • Consistent performance across batch sizes
  • Efficient handling of multiple concurrent models

7. Rockchip RK3588 SoC

The RK3588 is an 8-core SoC featuring a 6 TOPS neural processing unit. The chip targets single-board computers and edge devices requiring moderate AI performance alongside general-purpose computing capabilities.

Key specifications:

  • CPU: Quad-core Cortex-A76 + Quad-core Cortex-A55
  • NPU: 6 TOPS
  • GPU: Mali-G610 MP4
  • Power consumption: 8-15W
  • Memory: Support for up to 32GB LPDDR4/5

The 6 TOPS NPU handles neural network inference for computer vision, natural language processing, and audio processing tasks.

Use cases:

  • Digital signage with content recognition
  • Edge gateways with AI preprocessing
  • Smart home hubs
  • Industrial HMI panels

The RK3588’s general-purpose computing capabilities make it suitable for applications where AI inference is one component of a larger system. Organizations building edge devices that combine AI with web servers, databases, or other software services have adopted this SoC.

8. NXP i.MX 8M Plus

NXP’s i.MX 8M Plus features a 2.3 TOPS neural processing unit, designed specifically for industrial IoT applications. The processor prioritizes reliability, security, and long-term availability over maximum performance.

Key specifications:

  • NPU: 2.3 TOPS
  • CPU: Quad-core Cortex-A53, Cortex-M7 real-time core
  • Power consumption: 3-8W
  • Security: EdgeLock secure enclave

The inclusion of a Cortex-M7 real-time core enables deterministic processing for time-critical control loops. This architecture supports applications that combine AI-based decision-making with real-time control, such as industrial robots and automated manufacturing equipment.

NXP’s EdgeLock security features provide hardware-based secure boot, encrypted storage, and secure key management.

Use cases:

  • Industrial automation
  • Medical devices
  • Building automation
  • Smart agriculture

9. Axelera Metis AI Platform

Axelera’s Metis AI platform delivers up to 214 TOPS for high-throughput vision inference workloads. The platform uses Digital In-Memory Computing (D-IMC) architecture to improve throughput and efficiency.

Key specifications:

  • Performance: Up to 214 TOPS
  • Power consumption: 20-40W
  • Architecture: Digital In-Memory Computing (D-IMC)
  • Target: Computer vision inference

The D-IMC architecture performs computations directly within memory arrays, reducing data movement between memory and processing units. This approach addresses the memory bandwidth bottleneck that limits performance in traditional architectures.

Axelera targets applications that require the simultaneous processing of multiple video streams. The high throughput enables real-time analysis of dozens of camera feeds from a single device.

Use cases:

  • Multi-camera surveillance systems
  • Smart city infrastructure
  • Retail analytics with dense camera deployments
  • Industrial quality inspection systems

Axelera received €61.6 million in funding from the EuroHPC Joint Undertaking in March 2025, supporting the development of their Titania chiplet for deployment by 2028.

10. SiMa.ai MLSoC

SiMa.ai’s MLSoC (Machine Learning System-on-Chip) delivers over 50 TOPS while maintaining power consumption below 5W. The chip targets embedded vision applications requiring high performance in power-constrained environments.

Key specifications:

  • Performance: 50+ TOPS
  • Power consumption: <5W
  • Software: SiMa Platform SDK
  • Architecture: Optimized for vision transformers and CNNs

SiMa.ai designed the MLSoC specifically for computer vision workloads. The sub-5W power envelope enables deployment in battery-powered devices that require sustained high-performance inference.

Use cases:

  • Autonomous mobile robots
  • Drone-based inspection systems
  • Smart cameras for surveillance and analytics
  • Augmented reality devices

Performance vs. Power Consumption Analysis

Edge AI chips face a tradeoff between performance and power consumption.

High Performance (>50 TOPS):

  • NVIDIA Jetson AGX Orin (275 TOPS, 10-60W)
  • Axelera Metis (214 TOPS, 20-40W)
  • SiMa.ai MLSoC (50+ TOPS, <5W)

These solutions target applications where AI performance is the primary requirement. Use cases include autonomous vehicles, industrial robotics, and multi-camera video analytics systems.

Balanced Performance (15-30 TOPS):

  • Hailo-8 (26 TOPS, 2.5-3W)
  • Qualcomm RB5 (15 TOPS, 5-15W)

Balanced solutions optimize the performance-per-watt ratio. These chips are suitable for applications where both performance and power consumption are constrained, such as battery-powered robots and smart cameras.

Low Power (<10 TOPS):

  • Rockchip RK3588 (6 TOPS, 8-15W)
  • Intel Movidius Myriad X (4 TOPS, 5W)
  • Google Edge TPU (4 TOPS, 2W)
  • NXP i.MX 8M Plus (2.3 TOPS, 3-8W)

Low-power solutions prioritize energy efficiency over raw performance. IoT devices, battery-powered cameras, and embedded systems with limited thermal budgets typically use these chips.
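The efficiency differences across these tiers become clearer when the vendor-quoted figures above are reduced to TOPS per watt. The sketch below uses each chip's peak TOPS and the upper end of its quoted power range, so the ratios are indicative rather than measured:

```python
# TOPS per watt from the vendor-quoted figures above, using peak TOPS and
# the upper end of each quoted power range (indicative, not measured).
chips = {
    "NVIDIA Jetson AGX Orin": (275, 60),
    "Axelera Metis": (214, 40),
    "SiMa.ai MLSoC": (50, 5),
    "Hailo-8": (26, 3),
    "Qualcomm RB5": (15, 15),
    "Rockchip RK3588": (6, 15),
    "Google Edge TPU": (4, 2),
    "NXP i.MX 8M Plus": (2.3, 8),
}
ranked = sorted(chips.items(), key=lambda kv: kv[1][0] / kv[1][1], reverse=True)
for name, (tops, watts) in ranked:
    print(f"{name:<24} {tops / watts:5.1f} TOPS/W")
```

On these numbers, SiMa.ai and Hailo-8 lead on efficiency even though the Jetson AGX Orin and Axelera Metis lead on absolute performance, which matches the tiering above.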

The selection of appropriate hardware depends on:

  1. Required inference throughput (frames per second, inferences per second)
  2. Power budget (battery life requirements, thermal constraints)
  3. Latency requirements (real-time vs. near-real-time processing)
  4. Model complexity (number of parameters, operations per inference)
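The first two criteria can be turned into a coarse shortlist filter. The sketch below approximates achievable throughput as peak TOPS divided by operations per inference, which is an optimistic upper bound that ignores memory bandwidth and utilization limits; the candidate values are taken from the figures quoted in this article:

```python
# A sketch of a hardware shortlist filter based on the criteria above.
# Throughput is approximated as peak TOPS / (GOPs per inference) -- an
# optimistic upper bound that ignores memory and utilization limits.
from dataclasses import dataclass

@dataclass
class Chip:
    name: str
    peak_tops: float
    max_power_w: float

def shortlist(chips, model_gops: float, target_fps: float, power_budget_w: float):
    """Return chips whose optimistic throughput and power fit the requirements."""
    needed_tops = model_gops * target_fps / 1000.0  # GOPs/inf * inf/s -> TOPS
    return [c.name for c in chips
            if c.peak_tops >= needed_tops and c.max_power_w <= power_budget_w]

candidates = [Chip("Hailo-8", 26, 3), Chip("Google Edge TPU", 4, 2),
              Chip("Jetson AGX Orin", 275, 60)]
# e.g. a 10 GOP vision model at 30 FPS under a 5 W budget:
print(shortlist(candidates, model_gops=10, target_fps=30, power_budget_w=5))
```

A real evaluation would replace the peak-TOPS bound with benchmarked throughput for the specific model, since quantization support and operator coverage vary widely between chips.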

Software ecosystem

Software support has a significant impact on the practical performance and development time for edge AI deployments.

NVIDIA Jetson benefits from full CUDA ecosystem compatibility. Models developed for NVIDIA data center GPUs can be deployed with minimal modification. This compatibility reduces development time for teams already using NVIDIA hardware.

Google Edge TPU requires TensorFlow Lite models with int8 quantization. While this limitation ensures optimal performance on the TPU, it requires model conversion and validation steps. Organizations not using TensorFlow may face additional development work.

Intel Movidius integrates with the OpenVINO toolkit, which supports multiple model frameworks. The toolkit’s optimization capabilities can significantly improve inference performance, but require learning Intel-specific tools.

AMD Xilinx Kria demands FPGA development expertise for custom implementations. While pre-built vision AI stacks reduce this requirement, organizations that seek custom processing pipelines require specialized skills.

Qualcomm, Hailo, and other vendors provide their own SDKs and model compilers. Development teams should evaluate these tools during the selection process to understand the required effort for model deployment and optimization.

Form factor options

Edge AI chips are available in multiple form factors to address different integration requirements:

System-on-Module (SoM):

  • NVIDIA Jetson AGX Orin
  • AMD Xilinx Kria K26
  • Qualcomm RB5

A SoM provides a complete computing module that can be integrated into a custom carrier board. This approach reduces hardware design complexity while enabling customization of I/O interfaces.

M.2 and PCIe Cards:

  • Hailo-8
  • Google Coral
  • Intel Movidius (via M.2 adapter)

M.2 and PCIe form factors enable adding AI acceleration to existing systems. This approach is suitable for applications that upgrade existing hardware platforms with AI capabilities.

USB accelerators:

  • Google Coral USB Accelerator
  • Intel Neural Compute Stick 2

USB accelerators provide the simplest integration path. These devices are suitable for prototyping, development, and applications where the host system has available USB ports and sufficient bandwidth.

Integrated SoC:

  • Rockchip RK3588
  • NXP i.MX 8M Plus

Integrated SoCs combine CPU, GPU, and NPU in a single chip. This integration reduces board complexity and cost for products designed around the specific SoC.

Application-specific recommendations

Robotics and autonomous systems: NVIDIA Jetson AGX Orin or Qualcomm RB5 provide the performance required for real-time navigation, object detection, and path planning. The choice depends on whether 5G connectivity is a requirement.

Industrial IoT and factory automation: NXP i.MX 8M Plus or AMD Xilinx Kria K26 address the security and real-time processing requirements common in industrial applications. The Kria platform suits applications requiring custom sensor interfaces or deterministic latency.

Smart cameras and video analytics: Hailo-8 or Axelera Metis deliver the performance-per-watt ratio required for always-on video processing. Hailo-8 suits single or few-camera deployments, while Axelera Metis targets multi-camera systems.

Battery-powered IoT devices: Google Edge TPU provides the lowest power consumption for applications where battery life is the primary constraint. The 2W power consumption enables extended operation from small batteries.

Drones and AR devices: Intel Movidius Myriad X or SiMa.ai MLSoC balance performance with power consumption for airborne and wearable devices. The weight and thermal constraints in these applications favor efficient solutions.

Development and prototyping: Intel Neural Compute Stick 2 or Google Coral USB Accelerator enable quick evaluation of edge AI capabilities without hardware modifications. These USB devices are suitable for proof-of-concept projects and algorithm development.


