LONDON, UK – With the infosec community eyeing artificial intelligence as the next big frontier for cyber defense, experts here at Infosecurity Europe on Tuesday warned that several challenges in how AI processes and interprets data need to be addressed before the technology sees widespread adoption.
AI, in the context of security, holds the promise of letting security applications automate functions at scale, opening the door to more efficient defense tools and processes. For example, AI could aid in container process anomaly detection and enterprise user login analytics, and could take defensive measures when it senses network or system anomalies.
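As a loose illustration of what such login analytics might involve – a minimal sketch, not any vendor's actual implementation, using hypothetical feature names and toy data – an anomaly detector could be trained on historical login behavior and used to flag outliers:

```python
# Minimal sketch of AI-assisted login analytics (hypothetical features and data).
# An IsolationForest flags logins whose pattern deviates from historical activity.
from sklearn.ensemble import IsolationForest
import numpy as np

# Hypothetical per-login features: [hour_of_day, failed_attempts, geo_distance_km]
historical_logins = np.array([
    [9, 0, 5], [10, 1, 5], [14, 0, 12], [17, 0, 8], [11, 0, 3],
    [9, 0, 6], [13, 1, 10], [16, 0, 7], [10, 0, 4], [15, 0, 9],
])

model = IsolationForest(contamination=0.05, random_state=42)
model.fit(historical_logins)

# A 3 a.m. login with many failures from far away should score as anomalous.
new_logins = np.array([[3, 7, 4200], [10, 0, 6]])
for features, label in zip(new_logins, model.predict(new_logins)):
    status = "ANOMALY" if label == -1 else "normal"
    print(features, status)
```

In practice the features, training data and thresholds would have to be tuned to the environment – which is exactly where the data-quality and bias concerns raised below come into play.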
But AI still faces a plethora of challenges that are creating distrust in the technology. Those include human bias embedded in the data these systems ingest, and AI-based applications being leveraged by cybercriminals for various malicious uses.
Particularly when it comes to nation-state cyber tools, “AI isn’t good enough when lives are on the line,” said Nicola Whiting, chief strategy officer with Titania, at Infosecurity Europe on Tuesday. “There’s a big problem with trust when it comes to AI. We can’t always trust that AI is unbiased… and we can’t validate it unless we fix this.”
Her point: We can’t trust AI outcomes if the data used to reach them contains human biases or has been intentionally manipulated.
History of AI Failures
Despite the hype, AI has faced various roadblocks and failures, many of them stemming from inherent human bias and data integrity issues.
One such infamous failure is “Tay,” Microsoft’s experimental artificial-intelligence Twitter chatbot, which the company said was built for “conversational understanding.” The chatbot was meant to become smarter as it engaged with Twitter users. However, Microsoft didn’t account for how Tay would be influenced by the source of its data – Twitter’s users themselves – and the bot was eventually led to post racist and hateful tweets.
The chatbot proved an important point for the AI industry: AI data is inherently biased, said Whiting. “Microsoft learned some things from that – if you don’t account for data integrity and bias, bad things happen,” she said.
It’s not just Tay – Amazon also scrapped four years of work on AI recruitment tools after discovering that they had picked up past unconscious biases and were discriminating against women applying for software developer jobs – downgrading applications from candidates who attended all-women’s schools, for instance.
Beyond these inherent issues, AI is also a boon for cybercriminals, who may use the technology to automate phishing, spoof voice-authentication systems or carry out packet sniffing at scale.
While these failures point to long-standing issues in AI, they aren’t stopping nation states from moving from physical to cyber weaponry in an escalating arms race – and from using lethal AI weapons in doing so.
AI has sparked concerns from many in the tech industry – in fact, in 2018, 3,000 AI and robotics researchers penned their support for a key European Parliament resolution backing an international ban on AI-based lethal weapons.
But “with an escalating [cyber-defense] arms race, automation is the only way we can start leveling the playing fields,” said Whiting. “We can’t just go back on AI.”
How Can We Fix AI?
Tom Cignarella, director of the security coordination center at Adobe, said AI and machine learning do have “great potential to make our defenses stronger,” but that a combination of training, the right people, the right level of data and a “healthy level of skepticism” is needed first.
“I think this is like any other tech, just because we’re using AI to find anomalies doesn’t mean we should trust it,” said Cignarella.
When it comes to information bias, there are several steps to take, Whiting said, starting with understanding AI bias risks from both a data standpoint and a human unconscious-bias standpoint, as well as increasing the AI industry’s diversity to include a wider array of views and ideas.
The industry also needs to increase its use of deterministic data, which allows risks to be assessed from well-defined parameters (such as device configurations). That will enable future systems to adapt to activity and re-configure themselves in line with security best practices and compliance standards.
AI-driven engines also need to reduce their reliance on probabilistic data where possible – data in which risks are extrapolated from how devices understand or respond to attacks (as with legacy scanning technology). Finally, there needs to be a way to validate AI decision processes and data types to ensure integrity.
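To make the deterministic-versus-probabilistic distinction concrete, here is a hedged sketch – with hypothetical baseline parameters, not Titania’s actual method – of a deterministic configuration check, where every risk finding traces back to a well-defined parameter rather than a statistical inference:

```python
# Hypothetical example of a deterministic configuration audit: risk is derived
# from well-defined parameters, so every finding can be traced back to an
# explicit rule rather than an extrapolated probability.
BASELINE = {                      # hypothetical hardening baseline
    "ssh_protocol": "2",
    "telnet_enabled": "false",
    "password_min_length": "12",
}

def audit_config(device_config: dict) -> list[str]:
    """Return findings where the device deviates from the baseline."""
    findings = []
    for key, expected in BASELINE.items():
        actual = device_config.get(key, "<missing>")
        if actual != expected:
            findings.append(f"{key}: expected {expected!r}, found {actual!r}")
    return findings

# Example device config pulled from a (hypothetical) inventory system.
device = {"ssh_protocol": "2", "telnet_enabled": "true", "password_min_length": "8"}
for finding in audit_config(device):
    print("NON-COMPLIANT:", finding)
```

Because each result is checked directly against a stated parameter, a human reviewer can verify it without having to trust an opaque model’s inference.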
According to Whiting, “less emphasis on interpreted data means better info for humans to review.”