STIX, Stones, and Standardization Woes

Written by Alex Pinto

It is with great excitement that I have become a voting member of the new STIX/TAXII/CybOX standards technical committee! The standards are now being transitioned from under the wings of DHS and MITRE to a broader community under OASIS so that they can become actual standards, and benefit from the decades of experience of a great number of professionals from outside the USG. The MLSec Project was invited in recognition of the work we have been doing to analyze and demystify Threat Intelligence feeds.

I have privately made my fair share of jokes at the expense of these proto-standards, such as the ever popular "Why did it have to be XML?", and the revealing "I just wanted an IP address, not all this weird structure". I am also sure some of my closest friends, who are very influential and knowledgeable in the DFIR field, will cast disapproving side-glances.

The fact is that, just like with the usage of the word "cyber", no amount of eye rolling and snark will keep those standards from being implemented in a great number of security products. And, as many of you will agree, I'd rather have a good standard that will actually be useful than the mish-mash of custom data feeds we have today. Even worse would be a bad standard, one that everyone is "forced" to implement, but then creates hidden workarounds for internally to make it work with whatever they are doing.

Trust me, I am a (cyber data) engineer

However, this is not a defense of the current proto-standards or what they bring to the table now. There are at least 3 things that I have heard and experienced that I believe should be addressed by the technical committee. As we are still spinning up the standards-making machine there at OASIS, I'd rather present them to you first:

1) "The whole XML thing" - I honestly do not care that much about this. I'll not be the one doing the parsing, the computer will. An XML format has at least one advantage over JSON, which is the ability to easily validate a document against a schema in all those XML parsing libraries. Yes, there are solutions like JSON Schema, but they are much less mature. XML is verbose and repetitive, but this also makes the data stream very compressible.

2) "But I just wanted to share an IP address" - STIX can be very daunting for someone who is trying to do something quick to fulfill an immediate need. And fulfilling immediate needs is what ALL code-literate incident response professionals do. There is often the need to cook something up, and then this "hack" becomes more and more useful over time, until it evolves into a full-fledged open-source project or even a commercial product. In summary, STIX/TAXII/CybOX need to become more beginner-friendly, either by having tutorials and quick start guides or even a "STIX-light" core set of the most used entities, with guidance on how to glue them together for these simple use cases.

3) The ambiguity problem - But perhaps the most important and difficult challenge to tackle is the ambiguity in the standard. My understanding is that there is usually more than one way to represent something. The entities are too similar, and knowledgeable integrators are often left confused about whether something should be an Observable or an Indicator, or feel tempted to just create an Unknown Entity and stick their information in it. The standard HAS TO BE deterministic (i.e. one input equals one and only one possible output), and leave very little to interpretation. The parsing computers will not be good at interpreting anything (working on it, give me a call in a month or two) and the result will be systems with correct implementations of the standard that cannot interoperate. That will effectively kill the already dwindling trust the current proto-standards have.
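
To make the ambiguity problem in point 3 a bit more concrete, here is a minimal Python sketch. The record shapes, field names, and both functions are hypothetical and heavily simplified for illustration; they are not the real STIX or CybOX schemas. The idea is only to show how two producers can each describe the same IP address in a defensible way and still fail to interoperate with a consumer that knows just one of the encodings.

```python
# Hypothetical, heavily simplified records. These are NOT real STIX/CybOX
# structures; they only mimic "the same fact, two valid-looking encodings".
producer_a = {"type": "Observable", "address": "198.51.100.7"}
producer_b = {
    "type": "Indicator",
    "observable": {"type": "Observable", "address": "198.51.100.7"},
}


def naive_extract(record):
    """Consumer that only understands the bare Observable encoding."""
    if record.get("type") == "Observable":
        return record.get("address")
    return None  # Anything else is silently dropped.


def tolerant_extract(record):
    """Consumer that must know every equivalent encoding in advance."""
    if record.get("type") == "Observable":
        return record.get("address")
    if record.get("type") == "Indicator":
        # Unwrap the nested Observable and try again.
        return tolerant_extract(record.get("observable", {}))
    return None


if __name__ == "__main__":
    for name, record in (("A", producer_a), ("B", producer_b)):
        print(name, naive_extract(record), tolerant_extract(record))
```

In this sketch both producers could claim to be "correct", yet the naive consumer loses the address from producer B. Every extra equivalent encoding the standard allows becomes one more branch that every consumer has to write, which is the cost a deterministic standard would avoid.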

"When it comes to standards, you are either at the table or you are on the menu"

For better or for worse, cyber information sharing will be a defining force in the enterprise in the coming years, and STIX/TAXII/CybOX will be heavily lobbied for in the US and probably the rest of the western world. We should all be supporting the STIX family and making sure it is the best it can possibly be. For our own sake and continuing sanity.

Honestly, all I want to do is parse less weird data formats. I a