InfoSec Insider

Long Tail Analysis: A New Hope in the Cybercrime Battle

Looking for niche anomalies in an automated way with AI and machine learning is the future.

Our hyper-connected world and its ever-faster network speeds have resulted in mountains of diverse data that needs to be processed. It has also resulted in an ever-expanding attack surface, requiring cybersecurity solutions to scale like never before. These days, scale is about more than traffic volume (which can be used for, say, DDoS attacks committed by a botnet of hijacked devices); it’s also about the need to rapidly identify threats and stop them before they can succeed.

A methodology that helps here is long-tail analysis, an approach that looks for very weak signals from attackers who are technologically savvy enough to stay under the radar and remain undetected.

Chasing the Long Tail

The term long tail was popularized in 2004 by WIRED editor-in-chief Chris Anderson to describe “the new marketplace.” His theory is that our culture and economy are increasingly shifting away from a focus on a relatively small number of “hits” (mainstream products and markets) at the head of the demand curve and toward a huge number of niches in the tail.

Here’s how this long-tail concept applies to cybersecurity: You are specifically looking for those least-common events that will be the most useful in understanding anomalous behavior in your environments.

A security analyst uses this basic four-step process for long-tail analysis:

  1. Find events of interest, such as website connections or user authentications. Then determine how to aggregate the events in a way that provides enough meaning for analysis. As an example, you can graph user accounts by the number of authentication events or web domains by the number of connections.
  2. This grouping of data will create a distribution that might be skewed in a particular direction with a long tail, either to the left or right. You might be particularly interested in the objects that fall within that long tail. These are the objects that are extracted, in table format, for further analysis.
  3. You then investigate each object as necessary. In the case of authentications, you would look at the account owner, the number of authentication events and the purpose of the account, all with the intended goal of understanding why that specific behavior is occurring.
  4. Determine what actions, if any, you need to take and proceed to the next object. You might decide to simply ignore the event and repeat step 3 with the next object. Otherwise, the next steps include working with incident responders or your IT team.
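
As a rough illustration of steps 1 and 2, here is a minimal Python sketch. It counts events per object and extracts the rarest objects as a small table for an analyst to work through. The event records, the account names, the `long_tail` helper and the 5 percent cutoff are all hypothetical, chosen only for this example. A real deployment would pull events from its own log sources and tune the cutoff to its environment.

```python
from collections import Counter


def long_tail(events, key, tail_fraction=0.05):
    """Return (object, count) pairs from the rare end of the distribution.

    Objects are taken rarest-first until adding the next one would push
    their combined event count past tail_fraction of all events.
    tail_fraction is an illustrative assumption, not a standard value.
    """
    # Step 1: aggregate events by the chosen object (user, domain, ...).
    counts = Counter(key(event) for event in events)
    budget = sum(counts.values()) * tail_fraction

    # Step 2: walk the distribution from the rarest object upward.
    tail, running = [], 0
    for obj, n in sorted(counts.items(), key=lambda kv: (kv[1], str(kv[0]))):
        if running + n > budget:
            break
        running += n
        tail.append((obj, n))
    return tail


# Hypothetical authentication events; the account names are made up.
events = (
    [{"user": "alice"}] * 500
    + [{"user": "bob"}] * 450
    + [{"user": "svc_backup"}] * 3
    + [{"user": "tmp_admin"}] * 1
)

# The extracted table: the objects an analyst would investigate in step 3.
for user, n in long_tail(events, key=lambda e: e["user"]):
    print(f"{user:<12} {n}")
```

Steps 3 and 4 stay with the analyst: each row in the table is a starting point for asking who owns the account, what it is for and why its activity is so rare.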

The Ideal and the Real

Your varied security sources generate enormous volumes of data. It’s extremely challenging to extract weak signals while avoiding all of the false positives. The standard attempt to resolve this challenge is to provide analysts with banks of monitors displaying different dashboards that they need to be familiar with in order to detect malicious patterns. As you know, this doesn’t scale. It is not reasonable to expect a person to react to these dashboards consistently. Nor should they be expected to “do all the things.”

Rather, people tend to become security analysts because they like digging into the data. They’ll pivot into one of the many approaches used to combat cybersecurity threats – such as log management solutions, packet-analysis platforms and even some endpoint agents – all designed to record and play back a historical record. They break down common behaviors, looking for those outliers. They zero in on these “niche” activities and understand them one at a time. Unfortunately, analysts can’t always get to every outlier, and some are left unresolved.

Hope on the Horizon

Cybersecurity at human speed is no longer tenable. There are new, machine learning-based technologies that use integrated reasoning to automate long-tail analysis. This means organizations can do more of this valuable research more efficiently – specifically, with less manpower and cost. This will improve your team’s ability to find threats and dispatch them before they can do damage. As the market matures and this capability becomes available, long-tail analysis will super-charge your cybersecurity efforts.

Chris Calvert is co-founder of Respond Software.

