SAN FRANCISCO – As companies quickly adopt machine learning systems, cybercriminals are close behind, scheming to compromise them. That worries legal experts, who say a lack of laws swings open the door for bad guys to attack these systems.
During a panel session at RSA Conference 2020 this week, Cristin Goodwin, assistant general counsel with Microsoft, said the number of machine learning-related U.S. court cases is a mere 52. She noted that most were related to patents, workplace discrimination and even gerrymandering. Few addressed actual cyberattacks on machine learning systems – demonstrating a dangerous dearth of legal precedent around the technology.
Why so few machine learning court cases? Experts point to the fact that staple cybersecurity laws such as the Computer Fraud and Abuse Act and the Electronic Communications Privacy Act don’t specifically spell out how to handle machine learning attacks. This worries Goodwin, who said there are insufficient legal measures to dissuade hackers from compromising machine learning platforms via spam assaults, copyright issues and outright attacks on companies.
This legal landscape is cold comfort to the many companies that rely on machine learning as the backbone of core services – from autonomous vehicles to enterprise security to sales and marketing functions.
“We need a new mechanism for our policies,” Betsy Cooper, director of the Aspen Tech Policy Hub at the Aspen Institute, said. “We need to build into systems the flexibility where when we pass a law, it takes into account tech in a new way. We need to be much more creative in thinking about what our law and policies are [around machine learning].”
Machine Learning Attacks
The panel broke down three main types of attacks on machine learning that policies fail to address. The first, an evasion attack, is performed against a model in production. In this attack, an adversary can change some pixels in a picture before uploading it, so that a facial recognition system fails to classify the image correctly. This type of attack is mostly used by spammers, who can evade detection by obfuscating the content of spam emails, for instance.
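To make the evasion idea concrete, here is a minimal sketch in Python. It is not taken from the panel: the four-value "image", the weights, the bias and the function names are all invented for illustration, and the stand-in classifier is a toy linear model rather than a real facial recognition system. The perturbation is a simple sign-of-gradient step, one common technique among many.

```python
# Toy evasion attack: nudge every input value slightly so that a linear
# classifier changes its decision. All numbers and names are hypothetical.

def score(weights, bias, x):
    """Linear decision score; a positive score means the 'match' class."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def classify(weights, bias, x):
    """Return 1 for 'match' and 0 for 'no match'."""
    return 1 if score(weights, bias, x) > 0 else 0

def evade(weights, x, epsilon):
    """Shift each pixel by at most epsilon in the direction that lowers the score.

    For a linear model the gradient of the score with respect to the input
    is the weight vector, so stepping against sign(weight) is the
    sign-of-gradient step. Pixels are clamped to the valid range [0, 1].
    """
    def sign(v):
        return (v > 0) - (v < 0)
    return [min(1.0, max(0.0, xi - epsilon * sign(w)))
            for w, xi in zip(weights, x)]

WEIGHTS = [0.9, -0.4, 0.7, -0.2]   # assumed model parameters
BIAS = -0.5
image = [0.6, 0.3, 0.5, 0.4]       # assumed pixel intensities in [0, 1]

adversarial = evade(WEIGHTS, image, epsilon=0.1)
original_label = classify(WEIGHTS, BIAS, image)
adversarial_label = classify(WEIGHTS, BIAS, adversarial)
```

In this toy setup no pixel moves by more than 0.1, yet the label flips from 1 to 0, which is the kind of small, hard-to-spot change the panelists were describing.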
Another popular attack, model stealing, allows attackers to query a machine learning model (like Google Translate, for instance), infer how it works, and use that information to create their own model, potentially opening up copyright issues. Adversaries can also use this attack to better understand machine learning models so they can launch further attacks against them, researchers said.
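A rough sketch of how querying alone can leak a model follows. It assumes the simplest possible case, which is far easier than attacking a real translation service: a hypothetical victim that returns raw numeric scores from a hidden linear model. The names `make_victim` and `steal_linear_model` and the hidden parameters are made up for this example.

```python
# Toy model stealing: recover a hidden linear model using only its
# query interface. The victim and its parameters are hypothetical.

def make_victim():
    """Return a query-only function; the caller never sees the parameters."""
    hidden_weights = [2.0, -1.0, 0.5]
    hidden_bias = 0.25

    def predict(x):
        return sum(w * xi for w, xi in zip(hidden_weights, x)) + hidden_bias

    return predict

def steal_linear_model(query, n_features):
    """Recover the bias and weights with n_features + 1 queries.

    Querying the all-zero input reveals the bias; querying each unit
    vector reveals one weight after subtracting that bias.
    """
    zero = [0.0] * n_features
    bias = query(zero)
    weights = []
    for i in range(n_features):
        probe = zero.copy()
        probe[i] = 1.0
        weights.append(query(probe) - bias)
    return weights, bias

victim = make_victim()
stolen_weights, stolen_bias = steal_linear_model(victim, 3)

def surrogate(x):
    """The attacker's copy, built only from query responses."""
    return sum(w * xi for w, xi in zip(stolen_weights, x)) + stolen_bias
```

Each individual query here looks like ordinary use of the service. Only the pattern of queries, and what is done with the answers, separates curiosity from copying.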
Poisoning attacks are another common machine learning threat. This attack type targets machine learning models while they are being trained to recognize certain inputs and respond with outputs. The attacker can inject bad data into the model’s training pool in order to make it learn something it shouldn’t. This type of attack can open up machine learning systems to anything from data manipulation to logic corruption or even backdoor attacks.
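As an illustration only, the sketch below poisons a deliberately simple classifier. It assumes a one-dimensional "spamminess" score and a nearest-centroid model, neither of which comes from the panel. The training data, the labels and the target value are invented.

```python
# Toy poisoning attack: mislabeled training points drag a class centroid
# until a chosen input is misclassified. All data here is hypothetical.

def train_centroids(samples):
    """samples is a list of (value, label) pairs; returns the mean per label."""
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, value):
    """Assign the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))

clean_data = [
    (1.0, "ham"), (2.0, "ham"), (3.0, "ham"),
    (8.0, "spam"), (9.0, "spam"), (10.0, "spam"),
]

# The attacker slips spam-like values into the training pool labeled "ham".
poison = [(9.0, "ham")] * 4

clean_model = train_centroids(clean_data)
poisoned_model = train_centroids(clean_data + poison)

target = 7.0  # a message the attacker wants delivered
label_before = predict(clean_model, target)
label_after = predict(poisoned_model, target)
```

With the clean data the target is closer to the spam centroid. After four poisoned points the "ham" centroid moves from 2.0 to 6.0 and the same message is accepted as ham.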
Because specific policies don’t exist, several challenges come into play when looking at potential legal actions stemming from machine learning attacks. For instance, when addressing a machine learning evasion attack, it’s difficult to understand whether photo manipulation was malicious, versus someone playing around with an image’s pixels.
In fact, it’s difficult to tell from the get-go whether a photo was manipulated at all. “From a technical level, it’s hard to detect if something was perturbed or not,” said Nicholas Carlini, research scientist with Google Brain. “There are papers out there discussing whether [machine learning] has been perturbed or not, and how to check if images have been modified. I don’t know what the law would say, if it’s so hard to even find out whether it’s been perturbed.”
Another issue is understanding whether a hacker is launching a malicious attack or not. For instance, in the model stealing attack, attackers query a machine learning model in order to infer how it works. However, no set rule exists that distinguishes between merely querying a machine learning model in order to learn what it’s doing – and taking that to the next step as part of an actual model stealing attack.
Legal experts say that potential policy needs to break down the intent behind the act and figure out whether it was criminal. This could be done by looking at the context behind the attack and seeing whether, for instance, the person who committed the evasion attack had a history of engaging in malicious behavior, said Cooper.
When thinking about machine learning attacks, companies need to revisit their Terms of Service for machine learning models in a “similar way to any other service,” said Goodwin, in order to create a risk profile. That at least gives companies a clear idea of how their machine learning models can be used and abused – and the technical issues that need to be mitigated in order to prevent abuse.
Ways to Go
The three panelists agreed that policies focused on potential machine learning attacks still have a long way to go – and with the spurt of research around machine learning attacks that has cropped up over the past few years, the government is hard pressed to keep up.
Cooper said that the pace of Congress is “glacial” and will continue that way. She argued that regulatory sandboxes need to exist, similar to ones often used in financial tech, and that policy makers need to become more flexible in how they develop regulations focused on technology.
Regardless, “it was eye opening to see if this is how we’re dealing with machine learning in the law,” said Goodwin. “There’s still so much to do in order to know how to set these baselines.”