
SenseTime Fully Open-Sources SenseNova U1: A Unified Model for Understanding and Generation

2026-04-29

Compact, Efficient, and Capable of High-Quality
Infographics and Continuous Image–Text Creation

 

SenseTime (00020) today announced the release and open-sourcing of SenseNova U1, its native unified multimodal model series. Built on the self-developed NEO-Unify architecture introduced in March, the models in this series unify multimodal understanding, reasoning, and generation within a single monolithic framework. Through efficient synergy between language and vision, SenseNova U1 improves both understanding and content generation while preserving semantic integrity and pixel-level fidelity, supporting complex infographic creation. It is also the industry’s first model to deliver continuous image–text creation within a unified architecture.

In domains such as logical reasoning and spatial intelligence, SenseNova U1 is able to understand complex layouts and fine-grained relationships in the physical world. This capability provides a critical foundation for future embodied AI systems, enabling robots to complete the full cycle of perception, reasoning, and precise task execution within a single model. Such an end-to-end approach represents an important step in advancing both technological development and industrial deployment.

Conventional multimodal models typically adopt a compartmentalized design, bridging a visual encoder (VE) with a language backbone through intermediate adapters. This approach resembles a workflow in which each component operates independently: one processes images, another converts visual content into text, a third interprets language, a fourth performs reasoning, and yet another translates outputs into design instructions before the final image is generated. As information must be transferred across separate components, overhead is incurred and semantic or visual fidelity is often compromised. To offset these structural limitations, such models generally require significantly more parameters, increasing complexity without fully addressing the underlying inefficiencies.

The NEO‑Unify architecture addresses these limitations by moving away from the conventional model design described above. It completely eliminates both the visual encoder (VE) and the variational auto‑encoder (VAE), and instead establishes a unified representation space. On this basis, SenseNova U1 operates as a single unified system capable of handling multiple modalities simultaneously. Images and text are processed within the same cognitive framework rather than being translated and handed off across separate components. By fusing language and vision at a foundational level, the architecture significantly reduces information loss and enables efficient multimodal understanding and generation, even at a relatively compact model scale.
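To make the contrast concrete, the following is a purely illustrative sketch — not SenseTime’s actual code, and every name in it is hypothetical — of the hand-off pattern in a compartmentalized pipeline versus a single model operating over one shared token space:

```python
from dataclasses import dataclass

# Toy stand-in for a multimodal token: in a unified representation
# space, text and image content live in the same sequence.
@dataclass
class Token:
    modality: str  # "text" or "image"
    payload: str

def compartmentalized_pipeline(image: str, prompt: str) -> str:
    # Each boundary is a hand-off between separate components,
    # and each hand-off can lose semantic or visual detail.
    caption = f"caption({image})"            # visual encoder + adapter
    plan = f"reason({caption} | {prompt})"   # language backbone
    return f"decode({plan})"                 # separate image decoder (VAE)

def unified_model(context: list[Token]) -> list[Token]:
    # One model reads and writes interleaved tokens directly;
    # nothing is translated across component boundaries.
    return context + [
        Token("image", "generated image conditioned on full context"),
        Token("text", "caption generated in the same pass"),
    ]

out = unified_model([
    Token("image", "photo of a cat"),
    Token("text", "redraw this in watercolor"),
])
```

The structural point is that the pipeline version accumulates one lossy translation per component boundary, while the unified version keeps both modalities in a single context from input to output.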

The current open‑source release introduces the lightweight SenseNova U1 Lite series, which is available in two configurations:

  • SenseNova U1‑8B‑MoT — built on a dense backbone

  • SenseNova U1‑A3B‑MoT — built on a mixture‑of‑experts (MoE) backbone

 

Small Scale, Big Capability: Compact, Efficient Model with Performance Comparable to Commercial Models

Benchmark results highlight the performance characteristics of the SenseNova U1 Lite series. Across evaluations covering image understanding, image generation and editing, spatial intelligence, and visual reasoning, the models deliver leading results among open‑source models of comparable scale, setting a new benchmark for unified multimodal understanding and generation.


With its compact 8B MoT configuration, SenseNova U1 Lite matches, and in certain cases exceeds, the performance of larger commercial closed‑source models, demonstrating advantages across multiple tasks and application domains. This embodies the principle of “small scale, big capability.”


In general image generation benchmarks, SenseNova U1 Lite achieves commercial‑grade output quality comparable to Qwen‑Image 2.0 Pro and Seedream 4.5, while delivering meaningful gains in inference speed, supporting more efficient deployment in practical applications.


In the more demanding area of complex infographic generation, a task that has historically posed challenges for open‑source models, SenseNova U1 Lite attains commercial‑level performance, demonstrating strong control over layout coherence and text‑rendering accuracy.

 

Industry‑First Continuous Image–Text Creative Generation

Building on the strengths of the NEO‑Unify architecture, SenseNova U1 is the first model in the industry to achieve continuous image–text creative generation. Through native cross-modal understanding and generation, the model retains fused visual and textual signals in its context, ensuring strong stylistic consistency and enabling efficient, coherent reasoning within a unified representation space. As a result, users can generate high-quality interleaved outputs in a single, one-shot model call, delivering significant efficiency gains compared with traditional multimodal approaches.
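As an illustration of this one-shot pattern — a toy sketch only, with hypothetical names rather than the released API — a single call can yield an alternating text–image sequence in which every step is conditioned on everything generated before it:

```python
def one_shot_interleaved(prompt: str, steps: int) -> list[tuple[str, str]]:
    """Toy stand-in for continuous image-text generation: one call
    produces the whole interleaved sequence, with each step conditioned
    on the shared context so style stays consistent throughout."""
    history: list[tuple[str, str]] = []
    for i in range(1, steps + 1):
        # In a real unified model these would be generated tokens;
        # here we just record what each step is conditioned on.
        text = f"Step {i} of '{prompt}', conditioned on {len(history)} prior steps"
        image = f"image {i}, style anchored to the same context"
        history.append((text, image))
    return history

guide = one_shot_interleaved("medium-rare steak preparation", steps=4)
```

Contrast this with the traditional approach, which would issue one model call per step and re-establish style at each call, multiplying both latency and the risk of drift.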

 

The SenseNova U1 Lite series is now fully open source and available for deployment and online use:

Open-Source Deployment:

Online Experience & Access: Available soon via SenseTime’s office AI assistant, “Office Raccoon.”

 

SenseTime plans to continue advancing along this technical pathway, and will release larger‑scale models capable of delivering world‑class performance at significantly lower computational cost. The company believes that native unified multimodal intelligence represents a foundational step towards Artificial General Intelligence (AGI) and will continue to strengthen its open‑source ecosystem. Future iterations of the U1 series will include models with higher parameter counts, and SenseTime welcomes feedback from the global developer and research community to help shape the next generation of intelligent interaction.


Appendix I

Examples of SenseNova U1 Lite’s Capability: Demonstrating Commercial‑Grade Complex Infographic Generation


[Images: complex infographic generation examples]


Appendix II

Examples of SenseNova U1 Lite’s Capability: Delivering Coherent, High‑Fidelity Image–Text Interleaved Reasoning


Task 1: Medium‑Rare Steak Preparation
SenseNova U1 can reason through a complete cooking workflow, generating step‑by‑step instructions accompanied by corresponding images while maintaining a consistent visual style throughout.


[Image: medium‑rare steak preparation sequence]



Task 2: Drawing an Iron Man Pattern
SenseNova U1 is able to iteratively refine a scanned sketch into a fully realized final image. Each stage extends the structure and detail of the previous output, with the unified representation space ensuring continuity, accuracy, and visual fidelity across the entire creation process.

[Image: Iron Man sketch refinement sequence]

 

Appendix III

Superior Benchmark Performance of SenseNova U1 Lite


[Image: general image generation benchmark results]

In general image generation tests, SenseNova U1 Lite delivers commercial‑grade quality comparable to Qwen‑Image 2.0 Pro and Seedream 4.5, while offering significant advantages in inference speed.



[Image: complex infographic generation benchmark results]

Even in the highly demanding area of complex infographic generation, a domain in which open‑source models have long faced limitations, SenseNova U1 Lite achieves commercial‑grade performance, demonstrating strong control over layout structure and text‑rendering accuracy.