Day 1 | SenseTime Launches Seko 2.0 Drama Series Generation Platform, Powered by the LightX2V Inference Framework for Low-Cost, High-Throughput Video Generation

2025-12-15

SenseTime kicked off its "Product Launch Week" today with its first flagship release, Seko 2.0, a multi-episode drama generation agent tailored for short drama and motion comic creators. Seko 2.0 supports the production of series of up to 100 episodes and keeps characters, scenes, and props consistent across episodes. It cuts comic drama production time by 80% to 90% compared with traditional methods.

 

Seko 2.0 is powered by SenseTime's open-source LightX2V framework, the industry's first inference framework capable of real-time video generation. LightX2V enables low-cost, high-throughput video creation and has already been adapted to Chinese AI chips such as those from Cambricon, paving the way for more affordable solutions for creators.


 



Supporting 100-episode series generation  

First launched in July 2025, Seko attracted over 100,000 creators within two months, and its user base surpassed 200,000 in less than half a year. Of these users, 30% were short drama creators and 20% were comic drama creators.

 

The new Seko 2.0 further lowers the creative threshold and enables more efficient video production. Its key upgraded features include: 

•   A revamped user interface (UI) with optimized visual presentation, delivering an immersive creative experience.

•   Multi-episode generation capability with long-context management, supporting the creation of up to 100 episodes.

•   Intelligent agent scheduling, enabling flexible association of characters, scenes, and props across episodes, with character makeup and styling adaptable to different scenarios.

 

Xu Li, Chairman of the Board and CEO of SenseTime, stated: "Seko will open a door for people who have creative imagination but lack professional production capabilities, so they can truly enter the creative ecosystem and unleash their imagination." He added that, as AI capabilities continue to evolve, the quality of generated episodes will gradually improve, allowing refined "high-brow" works and popular "low-brow" content to coexist harmoniously.

 

AI comic dramas and short dramas have experienced explosive growth in China. Data shows that the supply of comic dramas in mainland China surged at a compound annual growth rate (CAGR) of 83% in the first half of this year. Meanwhile, more than 3,000 works were produced, with total revenue growing 12-fold. Many individual creators and small teams have joined this creative trend, some even taking it up as a second career.

 

Enabling multi-person lip-sync & audio-visual synchronization 

Most existing AI video generation products on the market support only single-episode short films or clip production. To meet creators' urgent demand for high-quality, large-scale multi-episode production, Seko 2.0 delivers two core technological breakthroughs:

 

1) SekoIDX: Achieving character consistency across multi-episode and cross-shot scenarios 

Maintaining character consistency across episodes and shots has long been an industry challenge. Traditional generation methods often produce characters that look "copied and pasted" (lacking vividness) or become "unrecognizable" when responding to new motion or expression prompts. By introducing "negative reference images" in the high-noise phase of diffusion models, SekoIDX keeps characters consistent across multi-episode and cross-shot content while avoiding excessive similarity to the reference images. It also remains highly stable when responding to commands for different expressions, poses, and scenes.

 

2) SekoTalk: Realizing multi-person lip-sync and natural audio-visual synchronization 

Traditional digital human technology often suffers from inaccurate lip-sync in complex scenarios involving multi-language and multi-person interactions. As the industry’s first solution supporting lip-sync for more than two people, SekoTalk achieves highly precise audio-visual synchronization from single-person to multi-person interactions through a series of innovative designs, whether for daily conversations, intense arguments, or group counting. 

 

These underlying technology breakthroughs directly translate to a leap in productivity. Under the traditional workflow, a team would typically take over three months to complete 50 episodes. With Seko 2.0, the comic drama production cycle can be reduced by 80% to 90%. Seko’s comprehensive features make it feasible for independent creators to produce videos in a "one-person crew" model. 
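As a rough illustration of the figures above, the short Python sketch below applies the stated 80% to 90% reduction to a baseline cycle. The 90-day baseline is an assumption standing in for "over three months", and the function name is hypothetical, so the results are indicative only.

```python
# Assumption: "over three months" is approximated as a 90-day baseline.
BASELINE_DAYS = 90
EPISODES = 50  # size of the series in the article's example


def reduced_cycle_days(baseline_days: float, reduction: float) -> float:
    """Return the days remaining after cutting the baseline by `reduction`."""
    return baseline_days * (1.0 - reduction)


for reduction in (0.80, 0.90):
    days = reduced_cycle_days(BASELINE_DAYS, reduction)
    print(f"{reduction:.0%} reduction: about {days:.0f} days for {EPISODES} episodes")
```

Under that assumed baseline, a 50-episode series would take roughly 9 to 18 days instead of three months.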


 



Adapting to domestic solutions   

Multi-episode creation of AI short dramas and comic dramas involves a large number of shots. A single 5-second video requires generating nearly 100,000 tokens, and 10 to 20 shots can result in a total token demand of 1 to 2 million. 
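The token arithmetic above can be checked with a few lines of Python. This is a minimal sketch that rounds "nearly 100,000 tokens" up to exactly 100,000 per 5-second shot; the function and constant names are illustrative rather than part of any SenseTime tool.

```python
# Per the article: one 5-second video needs nearly 100,000 tokens.
# Assumption: rounded to exactly 100,000 for this estimate.
TOKENS_PER_SHOT = 100_000
SHOT_SECONDS = 5


def total_tokens(num_shots: int, tokens_per_shot: int = TOKENS_PER_SHOT) -> int:
    """Total tokens needed to generate `num_shots` shots."""
    return num_shots * tokens_per_shot


# Tokens needed for each second of generated video.
tokens_per_video_second = TOKENS_PER_SHOT // SHOT_SECONDS
print(f"about {tokens_per_video_second:,} tokens per second of video")

for shots in (10, 20):
    print(f"{shots} shots -> {total_tokens(shots):,} tokens")
```

Ten to twenty shots therefore land at 1 to 2 million tokens, matching the range quoted above.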

 

To address this challenge, SenseTime has developed Phased DMD Distillation technology, which can significantly reduce the overall cost of multi-episode generation. By combining phased distillation with a Mixture of Experts (MoE) model, it allows different models to specialize in different stages of the generation process, substantially improving overall model capability and efficiency without increasing inference costs. 
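The idea of letting different models handle different stages of generation can be pictured with the toy Python sketch below. It is not SenseTime's implementation: the two "experts", the 0.5 noise boundary, and the scalar sample are all hypothetical stand-ins chosen only to show that exactly one expert runs per step, so the per-step cost stays the same.

```python
from typing import Callable, List, Tuple

# A "denoiser" here is any function mapping (sample, noise_level) to a new sample.
Denoiser = Callable[[float, float], float]


def high_noise_expert(x: float, t: float) -> float:
    """Toy stand-in for a model specialised in early, high-noise steps."""
    return x * 0.5


def low_noise_expert(x: float, t: float) -> float:
    """Toy stand-in for a model specialised in late, low-noise steps."""
    return x * 0.9


def pick_expert(t: float, boundary: float = 0.5) -> Tuple[str, Denoiser]:
    """Route a step to one expert based on its noise level (boundary is assumed)."""
    if t >= boundary:
        return "high_noise", high_noise_expert
    return "low_noise", low_noise_expert


def run_schedule(x: float, noise_levels: List[float]) -> Tuple[float, List[str]]:
    """Run one expert per step and record which expert handled each step."""
    used: List[str] = []
    for t in noise_levels:
        name, expert = pick_expert(t)
        x = expert(x, t)
        used.append(name)
    return x, used


sample, experts_used = run_schedule(1.0, [1.0, 0.75, 0.5, 0.25])
print(experts_used)
```

Because each step calls a single expert, adding a second specialised model increases total parameters but not the compute spent per step, which is the general property the paragraph above describes.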

 

SenseTime has also open-sourced LightX2V, the industry's first inference framework enabling real-time video generation. Through optimizations such as DiT distillation acceleration, a lightweight VAE, and sparse attention, it achieves low-cost real-time video generation. LightX2V can generate a 5-second video on consumer-grade graphics cards in under 5 seconds, significantly outperforming Sora 2, which takes several minutes. To date, it has accumulated over 3.5 million downloads and has gained widespread popularity among creators worldwide. LightX2V has also been adapted to Chinese chips, including those from Cambricon and MetaX, enabling fully domestic deployment of video generation models with near-real-time generation efficiency.

 

On domestic chip platforms, Seko can achieve generation quality comparable to that of international chip platforms in similar timeframes. International chip platforms generate 1.25 seconds of video per second of operation, while domestic chip platforms generate 1.0625 seconds per second. As domestic chip performance improves and the ecosystem matures, this gap is expected to narrow further.
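To make the two throughput figures concrete, the Python sketch below converts them into wall-clock time for a 5-second clip and into a relative speed. The rates come from the article; the clip length and the helper name are illustrative choices.

```python
# Seconds of video generated per second of operation, as quoted in the article.
INTERNATIONAL_RATE = 1.25
DOMESTIC_RATE = 1.0625


def wall_clock_seconds(video_seconds: float, rate: float) -> float:
    """Time needed to generate `video_seconds` of video at the given rate."""
    return video_seconds / rate


clip = 5.0  # illustrative clip length in seconds
for label, rate in (("international", INTERNATIONAL_RATE), ("domestic", DOMESTIC_RATE)):
    print(f"{label}: {wall_clock_seconds(clip, rate):.2f} s for a {clip:.0f} s clip")

# Domestic throughput as a share of international throughput.
relative_speed = DOMESTIC_RATE / INTERNATIONAL_RATE
print(f"domestic platforms run at {relative_speed:.0%} of international throughput")
```

Both rates exceed 1.0, so either platform produces a 5-second clip in under 5 seconds, with the domestic platform at 85% of the international throughput.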

 

In the future, SenseTime will also offer domestic solution options for Seko creators, ushering in a low-cost era of AI video creation with an exceptional cost-performance ratio.

 


Topping Douyin’s AI short drama chart  

In the comic and short drama market, Seko has successfully incubated a series of hit shows, including the live-action short drama “Wan Xin Ji”, which topped Douyin’s AI Short Drama Chart. Works such as “I Built a Doomsday Fortress on the Mountain Top” and “Yin Shen Lu” have also gained widespread attention. 

 


In the high-quality film and television sector, Seko has officially entered into a strategic partnership with Yangtze River Film Group, a leading enterprise in the film and television industry. The two parties will jointly explore the integrated innovation of "Generative AI (AIGC) + Film and Television," and plan to launch a series of short dramas based on Jingchu cultural and historical stories next year. They will also co-incubate theatrical-grade AIGC films, promoting the integration of AI creative tools into professional film and television production processes.

 

Cambricon and SenseTime will work together to advance the domestic AI application ecosystem, refine a more efficient and user-friendly tiered product lineup, and build more open and developer-friendly tools to stimulate innovation in cutting-edge applications.