UPDATED 13:46 EDT / OCTOBER 20 2023


Nvidia develops AI system for programming robots

Nvidia Corp. today detailed Eureka, an artificial intelligence system that can automatically train robots to perform new tasks.

In an internal evaluation, the chipmaker used Eureka to teach 10 simulated robots 29 different actions. Engineers often create simulated versions of their machines before building them to support development work. Eureka taught Nvidia’s virtual robots to open drawers, perform pen spinning tricks and carry out other relatively complex tasks. 

Many robots are powered by a type of neural network called a reinforcement learning, or RL, model. RL models learn to perform a task through trial and error: they repeat the task numerous times in a simulated environment until they figure out how to carry it out correctly. The simulated learning environment includes a virtual robot that functions as a testbed for the neural network. 

In such projects, the AI training process is supervised by a piece of code known as a reward function. The function “rewards” a robot’s RL model when it takes an action that brings the robot closer to completing the task and penalizes it for mistakes. In this manner, the RL model is guided toward finding the correct way of operating the robot.
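To make the idea concrete, here is a minimal sketch of what a hand-written reward function for a drawer-opening task might look like. It is an illustration only: the `DrawerState` fields, the weights and the 0.3-meter target are assumptions made up for this example, not code from Nvidia or from Eureka.

```python
from dataclasses import dataclass


@dataclass
class DrawerState:
    """Hypothetical snapshot of a simulated drawer-opening task."""
    gripper_to_handle: float  # distance from gripper to handle, in meters
    drawer_opening: float     # how far the drawer has been pulled out, in meters
    collided: bool            # True if the arm hit something it should not


def drawer_reward(state: DrawerState, target_opening: float = 0.3) -> float:
    """Return a scalar score; higher means closer to the desired behavior."""
    # Reward moving the gripper toward the handle (1.0 when touching it).
    reach_term = 1.0 / (1.0 + state.gripper_to_handle)
    # Reward pulling the drawer out, capped once the target opening is reached.
    open_term = min(state.drawer_opening, target_opening) / target_opening
    # Penalize collisions (assumed weight of 1.0).
    penalty = 1.0 if state.collided else 0.0
    return reach_term + 2.0 * open_term - penalty
```

An RL training loop would call a function like this after every simulated step and adjust the model to favor actions that raise the score. Choosing the terms and their weights is the part that usually takes engineers many rounds of trial and error.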

Writing reward functions for RL models has historically been a time-consuming and highly technical task. According to Nvidia, its new Eureka system automates the process. The system can generate reward functions based on natural language instructions such as “teach the robotic arm to play chess.”

Under the hood, Eureka uses OpenAI’s GPT-4 to turn users’ prompts into reward functions. Besides the prompts themselves, the system also takes so-called environment code as input. This is code that describes the simulated robot being trained to perform a new task. 

According to Nvidia, Eureka doesn’t merely generate reward functions but also improves them over time. The system creates multiple versions of a reward function and evaluates how well they work by applying them to a simulated robot. Then, Eureka analyzes the results of the evaluation to identify opportunities for improvement.
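The generate, evaluate and refine cycle described above can be sketched as a simple search loop. This is a toy stand-in under heavy assumptions, not Nvidia's implementation: `propose_candidates` replaces the GPT-4 call, `evaluate` replaces training a robot in simulation, and a "reward function" is reduced to a single number so the loop can run on its own.

```python
import random


def propose_candidates(best_weight, n, rng):
    """Stand-in for the language model: propose variations of the current best.

    In Eureka this step would produce reward-function code; here a candidate
    is a single number so the example stays runnable.
    """
    return [best_weight + rng.uniform(-1.0, 1.0) for _ in range(n)]


def evaluate(weight):
    """Stand-in for training in simulation and measuring task success.

    This toy score is highest (0.0) when weight equals 3.0.
    """
    return -(weight - 3.0) ** 2


def refine(iterations=20, candidates_per_round=8, seed=0):
    """Keep the best-scoring candidate and use it to seed the next round."""
    rng = random.Random(seed)
    best_weight = 0.0
    best_score = evaluate(best_weight)
    for _ in range(iterations):
        for candidate in propose_candidates(best_weight, candidates_per_round, rng):
            score = evaluate(candidate)
            if score > best_score:
                best_weight, best_score = candidate, score
    return best_weight, best_score


if __name__ == "__main__":
    weight, score = refine()
    print(f"best weight: {weight:.3f}, score: {score:.5f}")
```

The point of the sketch is the structure: several candidates per round, a score for each, and the winner carried forward. In the real system, the feedback sent back to the language model is richer than a single score.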

The system can also take into account developer feedback during the process. In particular, Eureka allows engineers to provide suggestions on how it should enhance a robot’s reward function. These suggestions are incorporated into the code optimization process. 

Nvidia says reward functions developed by Eureka outperformed human-written code on more than 80% of the robot actions it tested. As a result, the 10 simulated robots used in the project carried out their assigned tasks more effectively. Nvidia’s researchers logged an average 52% improvement in robot performance.

“Reinforcement learning has enabled impressive wins over the last decade, yet many challenges still exist, such as reward design, which remains a trial-and-error process,” said Anima Anandkumar, a senior director of AI research at Nvidia who participated in Eureka’s development. “Eureka is a first step toward developing new algorithms that integrate generative and reinforcement learning methods to solve hard tasks.”

Nvidia has released key components of Eureka and an academic paper describing how it works on GitHub. Engineers can run the software using the chipmaker’s Isaac Gym program, a simulation tool specifically designed to support the development of AI-powered robots. 

Image: Nvidia
