Last week at BSidesLV and DEFCON, MLSec Project released two tools to help organizations make better sense of their threat data. While we recognize and heartily endorse the idea that threat intelligence should comprise far more than just network addresses and context, these sorts of data sets lend themselves to quantifiable analysis.
TIQ-Test is a set of R scripts that carry out a number of statistical tests against IP address threat data. Currently we focus on novelty, overlap, and population: how often do the data change? How much do feeds overlap? And how do they compare with the population of IP addresses allocated to a particular country? We can envision more comparisons, of course, but these provide a starting point for research. And by releasing the tools, not just the results, we want to encourage other researchers to extend the tests and run them against their own data sets. Importantly, we did not release any analyses of private threat data.
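To make the three questions concrete, here is a minimal Python sketch of how novelty, overlap, and population comparisons could be computed over sets of IP addresses. It is not the TIQ-Test code (which is written in R and uses proper statistical tests); the function names `novelty`, `overlap`, and `population_ratio`, and the data shapes they take, are assumptions made purely for illustration.

```python
from typing import Dict, Set, Tuple


def novelty(previous: Set[str], current: Set[str]) -> Dict[str, float]:
    """Share of today's entries that are new, and of yesterday's that dropped out."""
    added = current - previous
    removed = previous - current
    return {
        "added_ratio": len(added) / len(current) if current else 0.0,
        "removed_ratio": len(removed) / len(previous) if previous else 0.0,
    }


def overlap(feeds: Dict[str, Set[str]]) -> Dict[Tuple[str, str], float]:
    """For each ordered pair (a, b): fraction of feed a's entries also found in feed b."""
    result: Dict[Tuple[str, str], float] = {}
    for name_a, ips_a in feeds.items():
        for name_b, ips_b in feeds.items():
            if name_a != name_b:
                shared = len(ips_a & ips_b)
                result[(name_a, name_b)] = shared / len(ips_a) if ips_a else 0.0
    return result


def population_ratio(feed_counts: Dict[str, int],
                     allocated_counts: Dict[str, int]) -> Dict[str, float]:
    """Ratio of a country's share of the feed to its share of allocated addresses.

    Values above 1.0 mean the country is over-represented in the feed.
    Both inputs are hypothetical country -> count mappings.
    """
    feed_total = sum(feed_counts.values())
    alloc_total = sum(allocated_counts.values())
    ratios: Dict[str, float] = {}
    for country, count in feed_counts.items():
        allocated = allocated_counts.get(country, 0)
        if feed_total == 0 or alloc_total == 0 or allocated == 0:
            continue  # no meaningful comparison without a baseline
        ratios[country] = (count / feed_total) / (allocated / alloc_total)
    return ratios
```

A raw ratio like this only hints at over- or under-representation; deciding whether a difference is significant is where the statistical tests come in.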
Alongside TIQ-Test, we also released Combine, a tool designed to read, parse, normalize, and output threat intel data. Current output formats are CSV and JSON, and we will add CIM and CybOX shortly. A set of feeds is included by default, and we hope to add more in the future. Today the process for adding new feeds is slightly more painful than we would like, but anyone with a modicum of Python knowledge should be able to do so even now.
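As a rough illustration of the read, parse, normalize, and output steps, the sketch below takes the lines of a plain-text IP feed, drops comments, invalid entries, and duplicates, and serializes the result as CSV or JSON. This is not Combine's actual code or record layout: the function names, the `indicator` and `source` field names, and the `#` comment convention are all assumptions.

```python
import csv
import io
import ipaddress
import json
from typing import Dict, Iterable, List


def normalize(lines: Iterable[str], source: str) -> List[Dict[str, str]]:
    """Parse raw feed lines into de-duplicated records (hypothetical schema)."""
    records: List[Dict[str, str]] = []
    seen = set()
    for line in lines:
        entry = line.split("#", 1)[0].strip()  # assume '#' starts a comment
        if not entry:
            continue
        try:
            ip = str(ipaddress.ip_address(entry))  # validates and canonicalizes
        except ValueError:
            continue  # skip anything that is not a valid IP address
        if ip in seen:
            continue
        seen.add(ip)
        records.append({"indicator": ip, "source": source})
    return records


def to_csv(records: List[Dict[str, str]]) -> str:
    """Serialize records as CSV with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["indicator", "source"],
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_json(records: List[Dict[str, str]]) -> str:
    """Serialize records as a JSON array."""
    return json.dumps(records, indent=2)
```

Keeping the normalized records as plain dictionaries means each additional output format only needs its own small serializer.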
For more context, you can watch the recording of our BSidesLV talk and check out our slide deck and the R Markdown file that reproduces the charts and data we presented. And of course Alex and I are both available here so you can contact us privately with questions, comments, doubts, and constructive criticism.
For clarity: our feed selection stemmed from what we could find most easily rather than any desire to skew results. If you are the owner or maintainer of one of these lists, please feel free to reach out in private if you would like to discuss our results, especially if you think you might have more or better-quality data to share. Some feed maintainers have already done so, and we're happy to work with everyone to make sure that there are no misunderstandings or any other kinds of problems.
Similarly, if you control a private or semi-private list and would like us to work with you, we can help perform some statistical analyses on your data. And of course anyone can use the TIQ-Test and Combine code to run the tests themselves.
(Also thanks to my employer Verisign for giving me the freedom & opportunity to work on the open source bits of this project!)