Xiaomi Releases and Open-Sources Autonomous Driving Model Xiaomi OneVL
[Technology Launch] Xiaomi has introduced and fully open-sourced Xiaomi OneVL, a vision-language framework that performs one-step reasoning in latent space.
Core Development: Unifying Three Technical Approaches to Achieve Both Speed and Accuracy
Xiaomi OneVL is the first framework to integrate vision-language-action (VLA) models, world models, and latent-space reasoning into a single unified system. It uses a dual-supervision mechanism that combines "language-based reasoning" with "visual future prediction," which significantly speeds up inference while maintaining high accuracy. Its key innovation is that it also compresses predictions of future visual frames, rather than relying solely on language reasoning, and so preserves the spatiotemporal causal information that is critical for driving decisions.
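As a rough illustration of what a dual-supervision objective of this kind could look like, the sketch below adds a language-reasoning term to a future-frame prediction term. It is not taken from the OneVL release: the function names, the choice of cross-entropy and mean squared error, and the weights `w_lang` and `w_vis` are all assumptions made for the example.

```python
import math


def cross_entropy(logits, target_index):
    """Cross-entropy for a single token prediction, computed from raw logits."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target_index]


def mean_squared_error(predicted, actual):
    """Mean squared error between two equal-length feature vectors."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def dual_supervision_loss(reason_logits, reason_target,
                          predicted_frame, future_frame,
                          w_lang=1.0, w_vis=1.0):
    """Hypothetical combined objective (an assumption, not OneVL's actual loss).

    reason_logits / reason_target: language-reasoning supervision signal.
    predicted_frame / future_frame: visual future-prediction supervision signal.
    """
    lang_loss = cross_entropy(reason_logits, reason_target)
    vis_loss = mean_squared_error(predicted_frame, future_frame)
    total = w_lang * lang_loss + w_vis * vis_loss
    return total, lang_loss, vis_loss


# Toy inputs: three candidate reasoning tokens and a three-value "frame" feature.
total, lang, vis = dual_supervision_loss(
    reason_logits=[2.0, 0.5, -1.0], reason_target=0,
    predicted_frame=[0.1, 0.4, 0.9], future_frame=[0.0, 0.5, 1.0],
)
print(f"language loss={lang:.4f}, visual loss={vis:.4f}, total={total:.4f}")
```

In a setup like this, both terms would shape the same latent representation during training, which is one plausible way to read the claim that visual prediction keeps information that language supervision alone would lose.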
Key Metrics: Outperforming Existing Methods Across the Board
On multiple mainstream benchmarks, the model surpasses explicit Chain-of-Thought (CoT) approaches in accuracy while matching the inference speed of "answer-only" modes, addressing the long-standing industry challenge of balancing real-time performance with causal reasoning.
Strategic Foundation: Advancing Collaborative Ecosystem Development for Autonomous Driving Foundation Models
Lei Jun announced the full open-sourcing of the model, calling on global developers to jointly explore the potential of large autonomous driving models and accelerate technological iteration and real-world deployment.