General Artificial Intelligence

Core Technology: Building on its proprietary technologies and the SenseCore AI infrastructure,
SenseTime has rapidly deployed AI applications across multiple vertical scenarios and is empowering a range of industries.

Key Technology

01 Object Recognition
02 Feature Point Positioning
03 Identity Verification
04 Facial Attributes
05 Portrait Clustering
06 Liveness Detection
07 Portrait Beautification/Make-up
08 Vehicle Type Recognition
09 Scenario Recognition
10 Remote Image Sensing and Interpretation
11 Apparel Attribute Recognition
12 Video Summarization
13 Video Content Structuring
14 Short Video Labelling
15 Text Recognition
16 Speech Recognition
17 Natural Language Processing
18 Robot Sensing and Control

Object Recognition

The world-leading general-purpose object recognition algorithm accurately recognizes a wide range of common objects in photos.

Feature Point Positioning

The technology positions 21, 106 or 240 facial key points (on the eyes, mouth, nose and other features) within milliseconds, at several levels of accuracy. It handles wide-angle side profiles, dramatic changes in facial expression, partially occluded or blurry images, and changes in lighting, as well as images taken in other challenging environments. Developed as the first solution to position 14 body key points on mobile devices, it also uses RGB images to recognize the head, shoulders, limbs and other body parts in real time, and can be deployed to detect a wide range of movements.

Identity Verification

The solution determines whether two images are of the same person with over 99% accuracy.
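The page does not describe how the comparison is made. A common approach, shown here only as a minimal sketch and not as SenseTime's method, is to map each face image to an embedding vector and compare the two vectors against a threshold. The function names, the embeddings and the 0.6 threshold are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_person(emb_a, emb_b, threshold=0.6):
    """Decide whether two face embeddings belong to the same person.

    The threshold is a hypothetical value; a real system would tune it
    on labelled pairs to balance false accepts against false rejects.
    """
    return cosine_similarity(emb_a, emb_b) >= threshold


# Toy embeddings standing in for the output of a face-recognition model.
print(same_person([1.0, 0.0], [0.9, 0.1]))
print(same_person([1.0, 0.0], [0.0, 1.0]))
```

Raising the threshold makes the check stricter (fewer false matches, more missed matches); lowering it does the opposite.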

Facial Attributes

The facial attribute software accurately recognizes more than 10 facial attributes, including gender, facial expressions, accessories and facial motions. It can be deployed in targeted advertising or consumer analysis to better understand audiences and customers.

Portrait Clustering

The portrait clustering technology rapidly groups hundreds of thousands of portrait images and is applicable to intelligent photo album management and the analysis of group photos on social networks. Not only does it make photo management easier, but it also allows social networks to operate more efficiently.
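To illustrate what clustering portraits means in practice, the following is a minimal sketch that assumes each portrait has already been converted to an embedding vector. It uses a simple greedy threshold rule chosen for brevity; it is not the algorithm behind the product, and the function names and the 0.8 threshold are hypothetical.

```python
import math


def _cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cluster_portraits(embeddings, threshold=0.8):
    """Greedily group embeddings; returns a list of index lists.

    Each portrait joins the first cluster whose representative (the
    cluster's first member) is similar enough, otherwise it starts a
    new cluster. The threshold is an illustrative assumption.
    """
    clusters = []  # list of (representative_embedding, member_indices)
    for idx, emb in enumerate(embeddings):
        for representative, members in clusters:
            if _cosine(representative, emb) >= threshold:
                members.append(idx)
                break
        else:
            clusters.append((emb, [idx]))
    return [members for _, members in clusters]


# Four toy embeddings: two near the x-axis, two near the y-axis.
photos = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.95]]
print(cluster_portraits(photos))  # [[0, 1], [2, 3]]
```

A greedy pass like this compares each portrait against every existing cluster, so it is only suitable for small collections; grouping hundreds of thousands of images would call for an indexed or approximate nearest-neighbour search.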

Liveness Detection

The liveness detection technology serves as an additional security measure for user verification and prevents spoofing attacks. Together with identity verification, it determines whether the subject in front of the camera is a live person or a spoof, providing important security services for applications in industries such as finance. It can effectively distinguish live faces from high-definition photos, edited images, 3D models, face swaps and other attempts to bypass the system.

Portrait Beautification/Make-up

Based on image content detection and recognition technology, the beautification product provides a wide range of real-time portrait beautification and make-up effects for smartphones.

Vehicle Type Recognition

The vehicle type recognition system accurately recognizes a large number of vehicle types across different environmental conditions, regardless of lighting and shooting angle.

Scenario Recognition

The system accurately identifies hundreds of natural scenes, thousands of common objects and their properties. Not only does it accelerate the photo search and classification process for smart photo album management, but it also helps produce more eye-catching display advertisements of the scenes or objects.

Remote Image Sensing and Interpretation

Based on high-resolution satellite images, the remote image sensing and interpretation system automatically extracts and interprets geographical features such as clouds, snow, water, buildings and road networks, producing pixel-to-plane detection results for land use classification. In addition to dynamic change detection, the system supports key point detection, accurately recognizing target features including the location, length, width and resting position of the target.

Apparel Attribute Recognition

The system automatically detects and recognizes apparel in photos and videos. It accurately identifies apparel types, patterns, sleeve types, collar types and other details despite interference such as changes in illumination and posture.

Video Summarization

Based on a self-developed deep learning algorithm, the technology analyzes the content and style of each shot in a long video. It discovers the intrinsic relations among the scenes and activities in each shot, and extracts key information to produce a short video summary. The technology has been applied in the television industry, mobile Internet and other sectors.

Video Content Structuring

By automatically analyzing and extracting key elements from the video (such as fashion and apparel, scenes, logos, merchandise and behaviors), the technology provides rich, structured information for efficient video content management and targeted marketing.

Short Video Labelling

Using an industry-leading large-scale multi-label tagging algorithm, the technology automatically interprets video content and generates text labels that improve video search and recommendation. With its comprehensive labelling system, the technology has been applied in industries including mobile Internet, television and advertising.
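In multi-label tagging, a video can receive several labels at once rather than a single class. As a minimal sketch of that final selection step only, assuming a model has already produced an independent score between 0 and 1 for each candidate label, the labels can be chosen by thresholding. The function name, the scores, the threshold and the label cap are all hypothetical.

```python
def select_labels(scores, threshold=0.5, max_labels=5):
    """Return labels scoring at least `threshold`, highest score first.

    `scores` maps a label to a per-label confidence. Unlike single-label
    classification, any number of labels (including none) may pass.
    """
    kept = [(label, score) for label, score in scores.items() if score >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [label for label, _ in kept[:max_labels]]


# Made-up scores for one short video.
video_scores = {"cooking": 0.92, "kitchen": 0.71, "sports": 0.08, "tutorial": 0.55}
print(select_labels(video_scores))  # ['cooking', 'kitchen', 'tutorial']
```

The resulting text labels are what a search or recommendation index would store for the video.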

Text Recognition

a) Natural Scene: The technology automatically extracts text information from complex images of natural scenes.
b) Card: The technology automatically extracts text information from card images captured under different environmental conditions.
c) Receipt: The technology recognizes multiple types of receipts regardless of format, and automatically locates the text information on them.

Speech Recognition

a) Speech Recognition: The technology automatically transcribes spoken audio into text.
b) Spoken Keyword Detection: Detecting a keyword wakes the device and activates speech interaction. In some applications, a sequence of keywords can serve as voice commands to a smart device.
c) Speaker Recognition: The technology verifies and identifies speakers by their unique voice characteristics.

Natural Language Processing

a) Natural Language Understanding and Generation: Text representation learning, knowledge-based semantic understanding and controllable language generation.
b) Dialog System: Task-driven dialog systems and knowledge-based multi-turn question answering systems.

Robot Sensing and Control

a) Robot Simulation Platform: The robot simulation platform lets users flexibly modify the experimental environment, enabling fast data collection that supports the development and evaluation of learning-based autonomous grasping algorithms. It has a modular structure, so key modules can be updated or replaced as requirements change. Key data recorded in the simulation platform can be saved for further use.
b) 3D Vision-Guided Robot Random Bin Picking: By analyzing 3D visual data, the system accurately estimates the 6D pose of stacked objects in a complex environment. Using collision detection and motion planning algorithms, it guides the robot manipulator to grasp stacked objects in a specified way. This technology can be applied to industrial scenarios such as flexible object assembly, machine tending, logistics order picking, palletization and depalletization.
c) Vision-Driven Robot Arm Object Manipulation: Deep learning and reinforcement learning methods allow the robot arm to learn autonomously. Performing multi-object manipulation tasks (such as object placement and parts assembly) with vision sensors effectively reduces hardware and system integration costs. The model can also be trained on samples from the simulation environment and then transferred to the real environment, reducing on-site debugging overhead. The technology significantly enhances the flexibility of robots in industrial scenarios, such as optimizing product assembly lines in manufacturing and upgrading multi-category object sorting systems in logistics.
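For readers unfamiliar with the term, a 6D pose combines a 3D position with a 3D orientation. The sketch below shows, under stated assumptions, how an estimated pose could be used to express a grasp point defined on an object in the camera or robot frame. The roll-pitch-yaw (Z-Y-X) convention, the function names and the numbers are illustrative choices, not details taken from the system described above.

```python
import math


def rotation_from_rpy(roll, pitch, yaw):
    """3x3 rotation matrix R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def transform_point(pose, point):
    """Map a point from the object frame to the reference frame.

    `pose` is (x, y, z, roll, pitch, yaw): three translation and three
    rotation parameters, i.e. the six degrees of freedom.
    """
    x, y, z, roll, pitch, yaw = pose
    rot = rotation_from_rpy(roll, pitch, yaw)
    rotated = [sum(rot[i][j] * point[j] for j in range(3)) for i in range(3)]
    return [rotated[0] + x, rotated[1] + y, rotated[2] + z]


# Hypothetical object pose: shifted by (1, 2, 3) and turned 90 degrees about z.
object_pose = (1.0, 2.0, 3.0, 0.0, 0.0, math.pi / 2)
grasp_in_object_frame = [1.0, 0.0, 0.0]
print(transform_point(object_pose, grasp_in_object_frame))  # approximately [1.0, 3.0, 3.0]
```

A motion planner would then take the transformed grasp point, together with collision checks, as the target for the manipulator.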
