NVIDIA Patches Critical Bug in High-Performance Servers

NVIDIA said a high-severity information-disclosure bug impacting its DGX A100 server line wouldn’t be patched until early 2021.

NVIDIA released a patch for a critical bug in its high-performance line of DGX servers that could open the door for a remote attacker to take control of and access sensitive data on systems typically operated by governments and Fortune-100 companies.

In all, NVIDIA issued nine patches, each fixing flaws in firmware used by DGX high-performance computing (HPC) systems, which are used for processor-intensive artificial intelligence (AI) tasks, machine learning and data modeling. All of the flaws are tied to NVIDIA's own firmware that runs on its DGX AMI baseboard management controller (BMC), the brains behind the servers' remote monitoring service.

“Attacks can be remote (in case of internet connectivity), or if bad guys can root one of the boxes and get access to the BMC they can use the out of band management network to PWN the entire datacenter,” wrote researcher Sergey Gordeychik, who is credited with finding the bugs. “If you have access to OOB, it is game over for the target.”

Given the high-stakes computing jobs typically running on the HPC systems, the researcher noted that an adversary exploiting the flaws could “poison data and force models to make incorrect predictions or infect an AI model.”

No Patch Until 2021 for One Bug

NVIDIA said a patch fixing one high-severity bug (CVE‑2020‑11487), specifically impacting its DGX A100 server line, would not be available until the second quarter of 2021. The vulnerability is tied to a hard-coded RSA 1024 key with weak ciphers that could lead to information disclosure. A fix for the same bug on other DGX systems (DGX-1, DGX-2) is already available.

“To mitigate the security concerns,” NVIDIA wrote, “limit connectivity to the BMC, including the web user interface, to trusted management networks.”
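
As a rough illustration of that advice, the sketch below checks whether clients reaching a BMC come from a trusted management network. It is not NVIDIA's tooling: the network ranges and the function names (`is_trusted_source`, `audit_bmc_clients`) are hypothetical, and in practice the restriction itself would be enforced by firewalls or switch ACLs rather than by a script.

```python
import ipaddress

# Hypothetical trusted management ranges (assumption, not from NVIDIA's bulletin).
TRUSTED_MGMT_NETWORKS = [
    ipaddress.ip_network("10.20.0.0/24"),
    ipaddress.ip_network("192.168.100.0/24"),
]


def is_trusted_source(source_ip, networks=TRUSTED_MGMT_NETWORKS):
    """Return True if source_ip falls inside any trusted management network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in networks)


def audit_bmc_clients(client_ips, networks=TRUSTED_MGMT_NETWORKS):
    """Return the client addresses that fall outside the trusted networks."""
    return [ip for ip in client_ips if not is_trusted_source(ip, networks)]
```

Feeding `audit_bmc_clients` the source addresses seen in BMC web or IPMI access logs would list any clients connecting from outside the management ranges, which is one way to spot BMCs that are more widely reachable than intended.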

Bugs Highlight Weaknesses in AI and ML Infrastructure

“We found a number of vulnerable servers online, which triggered our research,” the researcher told Threatpost. The bugs were disclosed Wednesday and presented as part of a presentation, “Vulnerabilities of Machine Learning Infrastructure,” at CodeBlue 2020, a security conference in Tokyo, Japan.

During the session, Gordeychik demonstrated how NVIDIA DGX GPU servers used in machine learning frameworks (PyTorch, Keras and TensorFlow), data processing pipelines and applications such as medical imaging and face-recognition-powered CCTV could be tampered with by an adversary.

The researcher noted that other vendors are also likely impacted. “Interesting thing here is the supply chain,” he said. “NVIDIA uses a BMC board by Quanta Computers, which is based on AMI software. So to fix issues [NVIDIA] had to push several vendors to get a fix.”

Those vendors include:

  • IBM (BMC Advanced System Management)
  • Lenovo (ThinkServer Management Module)
  • Hewlett Packard Enterprise (MegaRAC)
  • Mikrobits (Mikrotik)
  • NetApp
  • ASRockRack IPMI
  • DEPO Computers
  • TYAN Motherboard
  • Gigabyte IPMI Motherboards
  • Gooxi BMC

Nine CVEs

As for the actual patches issued by NVIDIA on Wednesday, the most serious is tracked as CVE‑2020‑11483 and is rated critical. “NVIDIA DGX servers contain a vulnerability in the AMI BMC firmware in which the firmware includes hard-coded credentials, which may lead to elevation of privileges or information disclosure,” according to the security bulletin.

Impacted NVIDIA DGX server models include DGX-1, DGX-2 and DGX A100.

Four of the NVIDIA bugs were rated high-severity (CVE‑2020‑11484, CVE‑2020‑11487, CVE‑2020‑11485, CVE‑2020‑11486) with the most serious of the four tracked as CVE‑2020‑11484. “NVIDIA DGX servers contain a vulnerability in the AMI BMC firmware in which an attacker with administrative privileges can obtain the hash of the BMC/IPMI user password, which may lead to information disclosure,” the chipmaker wrote.

Three of the other patched vulnerabilities were rated medium severity and one low.

“Hackers are well aware of AI and ML infrastructure issues and use ML infrastructure in attacks,” Gordeychik said.
