Darktrace

Some malware exploits its victims by sending large numbers of spam emails. This behavior is also observed on devices controlled by malicious external actors, and sometimes on corporate devices used by unauthorized internal actors for their own interests. If corporate devices persistently send spam, the corporate domain or public IP addresses may eventually be added to known spam lists such as Spamhaus, which can interfere with corporate email and even cause reputational damage.

It is difficult to detect compromised devices sending spam because network connections related to spam are usually short-lived and can occur randomly, and because legitimate corporate email campaigns or peaks of activity in email servers can be misinterpreted.

We use a supervised machine-learning method to automatically analyze outgoing messages and determine whether they are legitimate or spam. The method analyzes the frequency of outgoing messages from a given device, several properties of outgoing messages, and other variables related to the principle of locality.
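As a rough illustration of the supervised setup described above, the sketch below trains a small logistic-regression classifier on labeled per-device feature vectors. The abstract does not name the classifier or the exact features, so the choice of logistic regression, the two features (a normalized message rate and a normalized recipient count), and all numbers are assumptions made only for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this labeled example.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict_spam(w, b, x):
    """Return True if the feature vector is classified as spam activity."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5

# Hypothetical training data: [normalized message rate, normalized recipient count].
# Label 1 = device sending spam, 0 = legitimate activity. Values are invented.
X_train = [[0.05, 0.0], [0.1, 0.1], [0.08, 0.05],
           [0.9, 0.8], [0.8, 0.9], [0.95, 0.7]]
y_train = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic(X_train, y_train)
```

A device whose recent window looks like `[0.85, 0.85]` would then be scored with `predict_spam(weights, bias, [0.85, 0.85])`. In practice the labels would come from confirmed incidents and known legitimate campaigns, and any model could stand in for the one used here.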

For example, a server performing legitimate activity tends to connect to a predictable number of endpoints, and an anomalous increase might indicate that the device has been compromised. The method also analyzes email subject fields for suspicious terminology, and recipient lists for unusually large numbers of recipients and for addresses at non-corporate or generic free email services.
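The signals mentioned above could be computed per device and time window along the following lines. This is a minimal sketch, not the method's actual feature set: the `OutgoingMessage` record, the term list, the free-mail domain list, and the feature names are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical lists, chosen only for illustration.
SUSPICIOUS_TERMS = {"free", "winner", "urgent", "prize"}
FREE_MAIL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com"}

@dataclass
class OutgoingMessage:
    timestamp: float       # seconds since the start of the window
    subject: str
    recipients: List[str]  # recipient email addresses
    endpoint: str          # destination mail server contacted

def extract_features(messages: List[OutgoingMessage],
                     window_seconds: float,
                     corporate_domain: str) -> Dict[str, float]:
    """Summarize one device's outgoing mail over a single time window."""
    if not messages:
        return {"messages_per_minute": 0.0, "distinct_endpoints": 0.0,
                "max_recipients": 0.0, "suspicious_subject_fraction": 0.0,
                "free_mail_fraction": 0.0, "non_corporate_fraction": 0.0}

    # Domain part of every recipient address across all messages.
    domains = [r.rsplit("@", 1)[-1].lower()
               for m in messages for r in m.recipients]
    # Messages whose subject contains at least one suspicious term.
    suspicious = sum(
        1 for m in messages
        if SUSPICIOUS_TERMS & set(re.findall(r"[a-z]+", m.subject.lower()))
    )
    n_domains = max(len(domains), 1)  # avoid division by zero
    return {
        "messages_per_minute": len(messages) / (window_seconds / 60.0),
        "distinct_endpoints": float(len({m.endpoint for m in messages})),
        "max_recipients": float(max(len(m.recipients) for m in messages)),
        "suspicious_subject_fraction": suspicious / len(messages),
        "free_mail_fraction":
            sum(d in FREE_MAIL_DOMAINS for d in domains) / n_domains,
        "non_corporate_fraction":
            sum(d != corporate_domain.lower() for d in domains) / n_domains,
    }

# Usage with two invented messages in a 60-second window.
msgs = [
    OutgoingMessage(0.0, "You are a WINNER",
                    ["a@gmail.com", "b@yahoo.com"], "mx1.example.net"),
    OutgoingMessage(30.0, "Quarterly report",
                    ["c@corp.example"], "mx2.example.net"),
]
features = extract_features(msgs, 60.0, "corp.example")
```

Comparing `distinct_endpoints` against a device's own history is one way to capture the "predictable number of endpoints" idea; how the baseline is actually modeled is not stated in the abstract.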

Preliminary results demonstrate highly accurate detection of compromised devices sending spam, computational efficiency, and the ability to detect outgoing spam almost immediately.

Researcher

Dr. Andrés Curto Martín

Research Abstracts

Rapid Process-Chain Anomaly Detection Using a Multistage Classifier

Sorting long lists of file names by relevance and sensitive content

A system for analyzing network activity to detect stealthy cryptocurrency mining

Autonomous detection of the intended function of a corporate inbox through meta-scoring

Using epidemiology theory to identify the most damaging network devices

Automatic Identification of Scanned IP Ranges

A real-time, self-correcting similarity classifier for emails

Using graph theory to identify critical nodes within computer networks

Analyzing network activity to detect compromised devices sending spam emails