Xiaomi's Luo Fuli AI Team Unveils ARL-Tangram: New Agent Efficiency System Cuts Resource Costs by Up to 71.2%

Xiaomi's AI division has developed ARL-Tangram, a unified resource management system for agent reinforcement learning that cuts external resource consumption by up to 71.2% and shortens reinforcement learning training steps by up to 1.5x.

The research was led by Luo Fuli, a former DeepSeek researcher who now heads Xiaomi's MiMo large language model team, in collaboration with Peking University. It addresses inefficiencies in heterogeneous resource allocation for AI agents. Published on arXiv, the paper introduces a unified action-level formulation coupled with an elastic scheduling algorithm designed to minimize Action Completion Time (ACT) while satisfying diverse resource constraints.

Technical Architecture and Performance

ARL-Tangram operates as a customized heterogeneous resource manager that governs how AI agents access and use computational resources during training. Its scheduler allocates resources dynamically according to real-time task requirements instead of reserving fixed amounts up front, which removes the bottlenecks typically associated with static resource allocation in distributed training environments.
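To make the idea of elastic, action-level allocation concrete, here is a minimal Python sketch. It is not the algorithm from the paper: the `Action` fields, the linear-scaling cost model, the greedy marginal-gain rule, and the example workload are all assumptions chosen for illustration. Each action first receives its minimum allocation, and spare units then go one at a time to whichever action's completion time would drop the most.

```python
from dataclasses import dataclass


@dataclass
class Action:
    name: str
    work: float      # hypothetical amount of work, e.g. CPU-seconds
    min_units: int   # smallest allocation the action can run with
    max_units: int   # allocation beyond which it gains nothing more


def completion_time(action, units):
    # Assumed cost model: perfect linear scaling with allocated units.
    return action.work / units


def elastic_allocate(actions, capacity):
    """Greedy sketch: satisfy minimums, then give each spare unit to the
    action whose completion time would shrink the most."""
    alloc = {a.name: a.min_units for a in actions}
    spare = capacity - sum(alloc.values())
    if spare < 0:
        raise ValueError("capacity cannot cover the minimum requirements")
    while spare > 0:
        best, best_gain = None, 0.0
        for a in actions:
            units = alloc[a.name]
            if units >= a.max_units:
                continue  # this action cannot use more resources
            gain = completion_time(a, units) - completion_time(a, units + 1)
            if gain > best_gain:
                best, best_gain = a.name, gain
        if best is None:
            break  # every action is already at its maximum
        alloc[best] += 1
        spare -= 1
    return alloc


def average_act(actions, alloc):
    # Mean completion time across actions under a given allocation.
    return sum(completion_time(a, alloc[a.name]) for a in actions) / len(actions)


# Hypothetical workload: three agent actions sharing 9 resource units.
actions = [
    Action("run_tests", work=120, min_units=1, max_units=8),
    Action("web_search", work=10, min_units=1, max_units=2),
    Action("reward_model", work=60, min_units=1, max_units=4),
]
alloc = elastic_allocate(actions, capacity=9)
# A static even split (3 units each, capped at each action's maximum).
static = {"run_tests": 3, "web_search": 2, "reward_model": 3}

print(alloc, average_act(actions, alloc))
print(static, average_act(actions, static))
```

In this toy workload the elastic allocation shifts units toward the heaviest action and yields a lower average completion time than the even split. A production scheduler would also have to account for heterogeneous resource types, queuing, and actions arriving over time, none of which this sketch models.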

Evaluation on real-world agent reinforcement learning tasks showed substantial improvements: average ACT improved by up to 4.3x, and training step duration was shortened by up to 1.5x. The system also reduced external resource consumption by up to 71.2%, which translates into significant cost savings for large-scale AI training operations.
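For a sense of scale, the reported best-case figures can be converted into remaining fractions. The short Python sketch below does only that arithmetic. The helper names are hypothetical, and the inputs are the "up to" values quoted in the article, not typical results.

```python
def remaining_fraction(reduction_pct):
    # A reduction of p percent leaves (100 - p) percent of the original.
    return 1 - reduction_pct / 100


def time_fraction(speedup):
    # A k-times speedup means each step takes 1/k of its original time.
    return 1 / speedup


resources_left = remaining_fraction(71.2)
step_time_left = time_fraction(1.5)

print(f"external resources still needed: {resources_left:.1%}")
print(f"step duration relative to baseline: {step_time_left:.1%}")
print(f"resource reduction expressed as a factor: {1 / resources_left:.2f}x")
```

In other words, at the best reported case a workload would need a little under a third of the external resources it used before, and each training step would take about two thirds of its previous duration.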

Strategic Context and Research Trajectory

This publication marks Luo's second major technical contribution since joining Xiaomi, following an October 2024 paper on Mixture-of-Experts (MoE) architectures and reinforcement learning. Her recruitment from DeepSeek represented a significant talent acquisition for Xiaomi's artificial general intelligence (AGI) ambitions.

Luo recently made her public debut as Xiaomi's AI lead at the company's 2025 "Human-Car-Home" ecosystem partner conference, where she outlined the team's vision for transitioning intelligence from linguistic models to physical world applications. In a statement concurrent with the paper's release, she emphasized the team's commitment to AGI development, noting they are "fully committed to building the future we envision" through the MiMo model framework.

The work arrives as major technology firms grapple with the escalating computational costs of training increasingly sophisticated AI agents. By reducing the infrastructure investment required, ARL-Tangram's elastic resource scheduling offers a practical way to make high-performance agent training accessible to more teams.

The research paper is available at: https://arxiv.org/pdf/2603.13019
