I have privately made my fair share of jokes at the expense of these proto-standards, such as the ever popular "Why did it have to be XML?", and the revealing "I just wanted an IP address, not all this weird structure". I am also sure some of my closest friends who are very influential and knowledgeable in the DFIR field will cast disapproving side-glances.
The fact is that, just like with the use of the word "cyber", no amount of eye rolling and snark will keep those standards from being implemented in a great number of security products. And, as many of you will agree, I'd rather have a good standard that will actually be useful than the mish-mash of custom data feeds we have today. Even worse would be a bad standard, one that everyone is "forced" to implement, but then creates hidden workarounds internally to make it work with whatever they are doing.
However, this is not a defense of the current proto-standards or what they bring to the table now. There are at least 3 things that I have heard and experienced that I believe should be addressed by the technical committee. As we are still spinning up the standard-making machine there at OASIS, I'd rather present this to you first:
1) "The whole XML thing" - I honestly do not care that much about this. I'll not be the one doing the parsing, the computer will. An XML format has at least one advantage over JSON, which is that schema validation is readily available in most XML parsing libraries. Yes, there are solutions like JSON Schema, but they are much less mature. XML is verbose and repetitive, but this also makes the data stream very compressible.
3) The ambiguity problem - But perhaps the most important and difficult challenge to tackle is the ambiguity in the standard. My understanding is that there is usually more than one way to represent something. The entities are too similar, and even knowledgeable integrators are often left confused about whether something should be an Observable or an Indicator, or feel tempted to just create an Unknown Entity and stick their information in it. The standard HAS TO BE deterministic (i.e. one input equals one and only one possible output), and leave very little to interpretation. The parsing computers will not be good at interpreting anything (working on it, give me a call in a month or two) and the result will be systems with correct implementations of the standard that cannot interoperate. That will effectively kill the already dwindling trust the current proto-standards have.
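To make "one input equals one and only one possible output" a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Record` type, the `entity_type` function, and above all the rule that decides between Observable and Indicator are hypothetical and are not taken from the STIX specification. The point is only the shape of the thing: every input maps to exactly one entity type, and anything that cannot be mapped is rejected instead of being stuffed into an "Unknown" bucket.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical input: a raw value plus optional analyst context."""
    value: str
    meaning: Optional[str] = None  # e.g. "C2 server"; None when only the raw fact is known


def entity_type(record: Record) -> str:
    """Return exactly one entity type for a record, or refuse it outright."""
    if not record.value:
        # No "Unknown Entity" escape hatch: unrepresentable input is an error.
        raise ValueError("a record without a value cannot be represented")
    # Illustrative rule (an assumption, not from the specification):
    # a bare fact is an Observable, a fact with attached meaning is an Indicator.
    return "Indicator" if record.meaning else "Observable"
```

With a rule like this, two independent implementations handed `Record("198.51.100.7", "C2 server")` have no room to disagree on the entity they emit, which is exactly what the interoperability argument needs. Whether this particular rule is the right one is a question for the committee; what matters is that there is only one.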
For better or for worse, cyber information sharing will be a defining force in the enterprise in the coming years, and STIX/TAXII/CybOX will be heavily lobbied for in the US and probably the rest of the western world. We should all be supporting the STIX family and making sure it is the best it can possibly be. For our own sake and continuing sanity.
Honestly, all I want to do is parse fewer weird data formats. I a