BlackHat USA 2020

Uncommon Sense: Detecting Exploits with Novel Hardware Performance Counters and ML Magic


In recent years, exploits like speculative execution, Rowhammer, and Return Oriented Programming (ROP) were detected using hardware performance counters (HPCs). But to date, only relatively simple and well-understood counters have been used, representing just a tiny fraction of the information we can glean from the system. What’s worse, using only well-known counters as detectors for these attacks has a huge disadvantage – an attacker can easily bypass known counter-based detection techniques with minimal changes to existing sample exploit code.

If we want a viable future for exploit detection, we need to move beyond just scratching the surface of the HPC iceberg. Uncovering the treasure trove of overlooked and undocumented counters is necessary if we are to both build defenses against these attacks and anticipate how an adversary could bypass our defenses.

We’ll begin our journey in walking through our ML-based solution to more effective exploit detection. Using the entire corpus of performance counters for commonly used baseline programs and behaviorally-similar malicious programs, we zero in on the counters we want to use as features for our supervised classifiers. We will then interpret our model to determine how they can effectively detect various exploits using novel performance counters.

Finally, we’ll showcase the uncommon and previously ignored performance counters that were lurking in the dark, with so much useful information. The results seen here will emphasize the need for documenting these counters, which were highly significant in our models for attack detection.



Open Data Science Conference -East 2019

Machine learning To Detect Cyber Attacks: A Case Study


Machine learning is proving to be an important tool against cyber attacks, especially in finding zero day threats and in behavioral threat detection. Here, we will see how a couple of bugs that exploit critical vulnerabilities in modern computer processors, namely “Meltdown” and “Spectre” that were released in early 2018, took the cyber world by storm. These hardware vulnerabilities allow programs to steal data that is processed on the computer. We will see the Jupyter notebook that demonstrates the entire process of raw cpu data collection, data wrangling, machine learning experiments and final model selection to successfully detect the Spectre and Meltdown attacks when it is happening real time in a Linux system. The final machine learning model is the basis for the actual threat detection strategy that is engineered into the security product.

PyData Boston 2020

Topic Modeling for data discovery: A cybersecurity use case


Topic modeling is a very useful NLP technique to analyze and classify huge corpus of data. It helps us in clustering unstructured text data into meaningful groups. In cybersecurity, filtering huge data logs to fish out leaked credentials / passwords is a huge time consuming task for red teams. A red team consists of security professionals who act as adversaries to overcome cybersecurity issues. Red teams consist of ethical hackers who evaluate system security in an objective manner. In this talk, we will go through the basics of LDA (Latent Dirichlet Allocation) topic modeling. Then by using this technique on a real world example, we’ll go through a hacker’s system logs, and try to filter out useful data like pass codes and credentials which can be used for further security analysis.

Video (From 34:37)