Announcing NVIDIA Maxine

A cloud-native video streaming AI platform for services such as video conferencing. It includes state-of-the-art AI models and optimized pipelines that can run several features in real time in the cloud.

NVIDIA Maxine is a fully accelerated platform for developers to build and deploy AI-powered features in video conferencing services using state-of-the-art models that run in the cloud. By using AI video compression, applications based on Maxine can reduce video bandwidth usage to one-tenth of what H.264 requires, dramatically reducing costs.

Maxine includes the latest innovations from NVIDIA Research, such as face alignment, gaze correction, face re-lighting, and real-time translation, in addition to capabilities such as super-resolution, noise removal, closed captioning, and virtual assistants. These capabilities are fully accelerated on NVIDIA GPUs to run in real-time video streaming applications in the cloud.

Because Maxine-based applications run in the cloud, the same features can be offered to every user on any device, including computers, tablets, and phones. And because NVIDIA Maxine is cloud-native, applications can easily be deployed as microservices that scale to hundreds of thousands of streams in a Kubernetes environment.

Face Re-animation

Using new AI research, developers can identify the key facial points of each person on a video call and then combine these points with a still image to reanimate that person's face on the other side of the call using generative adversarial networks (GANs).

These key points can be used for face alignment, where faces are rotated so that people appear to be facing each other during a call, as well as gaze correction to help simulate eye contact, even if a person’s camera isn’t aligned with their screen.
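To give a rough sense of why key points are a compact representation of a face, the sketch below estimates the size of a per-frame key-point payload. The number of key points, the frame rate, and the encoding (two 32-bit floats per point) are assumptions made for illustration; the announcement does not specify how Maxine represents or transmits key points.

```python
import struct

# Hypothetical values chosen for illustration; not taken from Maxine.
NUM_KEYPOINTS = 68       # assumed number of facial key points per frame
FRAMES_PER_SECOND = 30   # assumed frame rate


def pack_keypoints(points):
    """Serialize (x, y) key points as little-endian 32-bit floats."""
    flat = [coord for point in points for coord in point]
    return struct.pack(f"<{len(flat)}f", *flat)


def keypoint_bitrate_kbps(num_points, fps):
    """Bitrate in kbit/s of sending uncompressed key points for every frame."""
    bytes_per_frame = num_points * 2 * 4  # two float32 coordinates per point
    return bytes_per_frame * 8 * fps / 1000


payload = pack_keypoints([(0.5, 0.5)] * NUM_KEYPOINTS)
print(len(payload))                                              # 544 bytes per frame
print(keypoint_bitrate_kbps(NUM_KEYPOINTS, FRAMES_PER_SECOND))   # 130.56 kbit/s
```

Under these assumptions each frame is described by a few hundred bytes, before any further compression of the key points themselves.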

Developers can also add features that allow call participants to choose their own avatars that are realistically animated in real time by their voice and emotional tone.

Video & Audio Effects

AI-based super-resolution and artifact reduction can convert lower-resolution video to higher-resolution video in real time. This lowers the bandwidth requirements for video conference providers and improves the call experience for users with limited bandwidth. Developers can also add features that filter out common background noise and frame the camera on a user's face for a more personal and engaging conversation.
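The bandwidth argument for super-resolution can be illustrated with simple pixel arithmetic: a stream sent at a lower resolution and upscaled on arrival carries fewer pixels per frame. The resolutions below are illustrative assumptions, and encoded bitrate does not scale exactly with pixel count, so the ratio is only a rough indicator.

```python
def pixel_count(width, height):
    """Number of pixels in a single frame."""
    return width * height


def pixel_ratio(low, high):
    """How many times more pixels the high-resolution frame has."""
    return pixel_count(*high) / pixel_count(*low)


# Assumed example resolutions (not taken from the announcement).
LOW_RES = (640, 360)     # transmitted resolution
HIGH_RES = (1280, 720)   # resolution after super-resolution

print(pixel_count(*LOW_RES))           # 230400
print(pixel_count(*HIGH_RES))          # 921600
print(pixel_ratio(LOW_RES, HIGH_RES))  # 4.0
```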

Additional AI models can help remove the noise caused by low-light conditions, creating a more appealing picture.

Conversational AI

Maxine-based applications can use NVIDIA Jarvis, a fully accelerated conversational AI framework with state-of-the-art models optimized for real-time performance. Using Jarvis, developers can integrate virtual assistants that take notes, set action items, and answer questions in human-like voices.

Additional conversational AI services such as translation, closed captioning, and transcription help ensure everyone can understand what's being discussed on the call.

Reduce Video Bandwidth vs H.264

With AI-based video compression technology running on NVIDIA GPUs, developers can reduce bandwidth use to one-tenth of that needed by the H.264 video compression standard.

This cuts costs for providers and delivers a smoother video conferencing experience for end users, who can enjoy more AI-powered services while streaming less data on their computers, tablets, and phones.
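As a worked example of the one-tenth figure, the sketch below converts a bitrate into data transferred per hour of video. The 1,500 kbit/s H.264 bitrate is an assumed value chosen for illustration, not a number from the announcement; only the factor of ten comes from the text.

```python
def ai_compressed_kbps(h264_kbps, reduction_factor=10):
    """Bitrate after the stated reduction to one-tenth of the H.264 bitrate."""
    return h264_kbps / reduction_factor


def megabytes_per_hour(kbps):
    """Data transferred in one hour at a constant bitrate (1 MB = 10**6 bytes)."""
    bits = kbps * 1000 * 3600
    return bits / 8 / 1e6


H264_KBPS = 1500  # hypothetical H.264 bitrate for one video stream

ai_kbps = ai_compressed_kbps(H264_KBPS)
print(ai_kbps)                        # 150.0 kbit/s
print(megabytes_per_hour(H264_KBPS))  # 675.0 MB per hour
print(megabytes_per_hour(ai_kbps))    # 67.5 MB per hour
```

Under this assumption, an hour-long stream drops from roughly 675 MB to roughly 67.5 MB, which is where the savings for providers and for users on metered connections would come from.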

