Nvidia has announced a new platform for creating virtual agents named Omniverse Avatar. The platform combines a number of discrete technologies — including speech recognition, synthetic speech, facial tracking, and 3D avatar animation — which Nvidia says can be used to power a range of virtual agents.
In a presentation at the company’s annual GTC conference, Nvidia CEO Jensen Huang showed off a few demos using Omniverse Avatar tech. In one, a cute animated character in a digital kiosk talks a couple through the menu at a fast-food restaurant, answering questions like which items are vegetarian. The character uses facial-tracking technology to maintain eye contact with the customers and respond to their facial expressions. “This will be useful for smart retail, drive-throughs, and customer service,” said Huang of the tech.
In another demo, an animated toy version of Huang answered questions about topics including climate change and protein production, and in a third, someone used a realistic animated avatar of themselves as a stand-in during a conference call. The caller was wearing casual clothes in a busy cafe, but their virtual avatar was dressed smartly and spoke without any background noise intruding. This last example builds on Nvidia’s Project Maxine work, which aims to fix common problems with video conferencing (like low-quality streams and difficulty maintaining eye contact) with the help of machine learning.
The Omniverse Avatar announcement is part of Nvidia’s inescapable “omniverse” vision, a grandiose bit of branding for a nebulous collection of technologies. Like the “metaverse,” the “omniverse” is basically about shared virtual worlds that allow for remote collaboration. But compared to the vision put forward by Facebook-owner Meta, Nvidia is less concerned with transporting your office meetings into virtual reality and more concerned with replicating industrial environments with virtual counterparts and, in the case of its avatar work, creating avatars that interact with people in the physical world.
As ever with these presentations, Nvidia’s demos looked fairly slick, but it’s not clear how useful this technology will be in the real world. With the kiosk character, for example, it remains to be seen whether customers will actually prefer this sort of interactive experience to simply selecting the items they want from a menu. Huang noted in the presentation that the avatar has a two-second response time, which is slower than a human and bound to cause frustration if customers are in a rush. Similarly, although the company’s Project Maxine tech looks flashy, we’ve yet to see it make a significant impact in the real world.