On-device APIs

Take advantage of machine learning features designed for immediate app integration, with no machine learning experience needed.

What’s new

New APIs in the Vision framework provide advanced image segmentation, animal body pose detection, and 3D human body pose detection that uses depth information. Use VisionKit to integrate Visual Look Up and subject-lifting experiences into your app. The Natural Language framework improves understanding of multilingual text with new transformer-based embedding models. The Speech framework lets you add custom vocabulary to speech recognition, so you can personalize your user experiences.

Vision

Build features that can process and analyze images and video using computer vision.

Image Classification

Automatically identify the content in images.

Image Saliency

Quantify and visualize the key part of an image or where in the image people are likely to look.

Image Alignment

Analyze and manage the alignment of images.

Image Similarity

Generate a feature print to compute distance between images.

Object Detection

Find and label objects in images.

Object Tracking

Track moving objects in video.

Trajectory Detection

Detect the trajectory of objects in motion in video.

Contour Detection

Trace the edges of objects and features in images and video.

Text Detection

Detect regions of visible text in images.

Text Recognition

Find, recognize, and extract text from images.

Face Detection

Detect human faces in images.

Face Tracking

Track faces from a camera feed in real time.

Face Landmarks

Find facial features in images by detecting landmarks on faces.

Face Capture Quality

Compare face capture quality in a set of images.

Human Body Detection

Find regions that contain human bodies in images.

Body Pose

Detect landmarks on people in images and video.

Hand Pose

Detect landmarks on human hands in images and video.

Animal Recognition

Find cats and dogs in images.

Barcode Detection

Detect and analyze barcodes in images.

Rectangle Detection

Find rectangular regions in images.

Horizon Detection

Determine the horizon angle in images.

Optical Flow

Analyze the pattern of motion of objects between consecutive video frames.

Person Segmentation (New)

Produce a matte image for a person in an image.

Document Detection (New)

Detect rectangular regions in images that contain text.

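
As an illustration of how a Vision request is typically run, here is a minimal sketch of the Text Recognition feature listed above. The helper name `recognizeText(in:)` is hypothetical and not part of the framework; the sketch assumes an SDK where `VNRecognizeTextRequest.results` is typed (iOS 15 / macOS 12 or later) and that you already have a `CGImage` to analyze.

```swift
import CoreGraphics
import Vision

/// Hypothetical helper: returns the top candidate string for each
/// text region that Vision finds in `image`.
func recognizeText(in image: CGImage) throws -> [String] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate   // favor accuracy over speed

    // The handler performs the request synchronously on the calling thread.
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    // Assumption: `results` is [VNRecognizedTextObservation]? (newer SDKs).
    let observations = request.results ?? []
    return observations.compactMap { $0.topCandidates(1).first?.string }
}
```

Other Vision features follow the same shape: create a request, hand it to a `VNImageRequestHandler`, then read the request's results. Because `perform(_:)` blocks, an app would usually call a helper like this off the main thread.
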

Natural Language

Analyze natural language text and deduce its language-specific metadata.

Tokenization

Enumerate the words in text strings.

Language Identification

Recognize the language of bodies of text.

Named Entity Recognition

Use a linguistic tagger to name entities in a string.

Part of Speech Tagging

Classify nouns, verbs, adjectives, and other parts of speech in a string.

Word Embedding

Get a vector representation for any word and find similarity between two words or nearest neighbors for a word.

Sentence Embedding

Get a vector representation for any string and find similarity between two strings.

Sentiment Analysis

Score text as positive, negative, or neutral based on the sentiment.

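
To make three of the Natural Language features above concrete (Tokenization, Language Identification, and Sentiment Analysis), here is a minimal sketch. The function names `tokenize`, `dominantLanguage(of:)`, and `sentimentScore(of:)` are hypothetical wrappers chosen for illustration; the framework types they call (`NLTokenizer`, `NLLanguageRecognizer`, `NLTagger`) come from the Natural Language framework.

```swift
import NaturalLanguage

/// Hypothetical helper: splits `text` into word tokens.
func tokenize(_ text: String) -> [String] {
    let tokenizer = NLTokenizer(unit: .word)
    tokenizer.string = text
    var tokens: [String] = []
    tokenizer.enumerateTokens(in: text.startIndex..<text.endIndex) { range, _ in
        tokens.append(String(text[range]))
        return true   // keep enumerating
    }
    return tokens
}

/// Hypothetical helper: returns the code of the most likely language
/// (for example "en" or "fr"), or nil if none can be determined.
func dominantLanguage(of text: String) -> String? {
    NLLanguageRecognizer.dominantLanguage(for: text)?.rawValue
}

/// Hypothetical helper: returns a sentiment score between -1.0 (negative)
/// and 1.0 (positive), or nil if no score is available.
func sentimentScore(of text: String) -> Double? {
    guard !text.isEmpty else { return nil }
    let tagger = NLTagger(tagSchemes: [.sentimentScore])
    tagger.string = text
    let (tag, _) = tagger.tag(at: text.startIndex,
                              unit: .paragraph,
                              scheme: .sentimentScore)
    // The score arrives as a string in the tag's raw value.
    return tag.flatMap { Double($0.rawValue) }
}
```

A caller might combine these, for example by checking `dominantLanguage(of:)` first and only scoring sentiment for languages the app cares about. Short or ambiguous inputs can yield `nil` from either helper, so treat both results as optional.
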

Speech

Take advantage of speech recognition and saliency features for a variety of languages.

Speech Recognition

Recognize and analyze speech in audio and get back data like transcripts.

Sound Analysis

Analyze audio and recognize it as a particular type, such as laughter or applause.

Sound Classification

Analyze sounds in audio using the built-in sound classifier or a custom Core ML sound classification model.
