Design of a New 'xbits' Cross-Stream IDS Keyword
22 August, 2013
In the previous blog post a proposal was made for a new Snort and Suricata keyword "xbits" for cross-stream signature matching. This post had little discussion of implementation tradeoffs, and some have reacted to the blog post by saying that it is difficult to properly design and implement the type of cross-stream state tracking that would be necessary for xbits to work. I agree. However, the initial xbits proposal did not assume that such an implementation would be easy or straightforward, and this blog post will attempt to illustrate where some of the pitfalls are likely to be. In the end, I'm confident that is possible to develop something similar to xbits. Victor Julien, lead developer of Suricata, commented on my Google+ posting on the xbits keyword to state that Suricata has been considering implementing something similar to xbits for a while.First, before diving into xbits itself, I would argue that there is already precedent in IDS/IPS engines for detecting an important class of communications that cross multiple transport layer "conversations" (using this term loosely for a moment): port scans and sweeps. Detecting such traffic is mostly about setting thresholds on various things such as the number of ports and IP addresses that are contacted within a given period of time, prioritizing on sets of ports that are usually associated with important services (with sweeps for certain ports sometimes spiking after a new vulnerability is discovered), and differentiating TCP flags that are used (a TCP FIN scan looks a lot different than a connect() scan and indicates some things about the adversary such as privileged OS access). By definition, the raw ability to differentiate scans and sweeps vs. normal traffic requires the capability of keeping some state across transport layer conversations. It just so happens that Snort and Suricata track this state within dedicated preprocessors and do not also expose port scan detection configuration in the signature language itself. By contrast, other preprocessors do offer signature language interfaces such as stream5 with the flow keyword. To be clear, I'm not at all advocating that configuration aspects of the sfPortscan preprocessor actually belong in the signature language (that would be unnatural to say the least) - I'm merely making the point that the idea of maintaining some state across transport layer conversations is something that Snort and Suricata already do. So, in this area at least, such a concept is not foreign.
Traffic Visibility
Now, in terms of factors that would affect an xbits implementation, it should be noted that not every IDS/IPS necessarily has a global view of all traffic on a given network. This can be the result of several different factors that depend not only on the physical hardware, but also how the IDS/IPS itself is developed (multi-threaded or not), and configured. Starting with the hardware, I've seen major IDS deployments that require multiple IDS appliances working together in order to inspect all of the network traffic. Essentially the IDS appliances form a cluster of systems (not in the HPC sense) where portions of the traffic are split across each appliance with a device such as a Gigamon tap. This allows each IDS appliance to handle a fraction of the traffic that it would otherwise have had to inspect, and this in turn enables the cluster as a whole to scale to massive amounts of network traffic. But, this also means portions of the traffic are physically separated from one IDS to the next, and therefore xbits on a single IDS can only apply to the set of transport layer conversations that actually traverse it. So, is there an opportunity for an attacker to evade xbits when deployed in this fashion? Sure, if the attacker knows how the traffic splitting is done, then attacks could most likely be sent against systems in ways that nullify xbits criteria just by using different source networks to force each individual IDS appliance into having only a limited view of a cross-stream attack. There are potential evasions at every level, but many attackers are not going to have access to such traffic splitting details.Other IDS/IPS architectures such as those that rely on network processors or specialized packet acquisition hardware can also result in limited traffic visibility. Some organizations run multiple Snort processes on a single appliance, and have a network processor split packet data based on IP network ranges or transport layer port number across the Snort instances. Once again, each Snort process has a limited few of the traffic. Similarly, in a multi-threaded IDS/IPS such as Suricata, each thread may be tasked with processing a portion of the total network traffic, and this can result in a limited view within software even if there is no hardware enforced traffic splitting mechanism.
The Stream Preprocessor
Modifying the stream preprocessor to handle the xbits keyword could be tricky. By its nature, xbits would force the stream preprocessor to consider information that is not derived from single transport layer conversations, so locking issues against a global xbits tracking data structure in a multi-threaded context would become important. Also, it would be nice to not place onerous restrictions on xbits such as requiring that a connection close before a set xbit can be tested within a different connection. My guess is that stream preprocessor modifications have previously been a barrier to implementing something like xbits.xbits Design
Given all of the above, what would be the ideal xbits design? Stepping back for a moment, when any signature language feature is implemented in an IDS, what should be the primary goal? Better attack detection. Performance is certainly a consideration too, and performance features sometimes bleed into the signature language (see the fast_pattern keyword for example), but usually a new signature language feature is added because it enables better detection of threats at acceptable performance levels. Further, some features of the language are important enough to expend lots of CPU cycles and consume precious memory anyway because the detection accuracy would be significantly harmed without them - see the pcre keyword for example. So, we should strive for the ideal xbits design from a detection perspective and let performance and other tradeoffs take place where they must:- Allow an xbit to be set on one transport layer conversation and inspected in a different conversation before the first is closed.
- Allow an xbit to be set on a conversation that involves one IP protocol, and tested in a conversation that involves a different IP protocol. E.g. set a bit on a UDP flow and test it in a subsequent TCP connection.
- Interface with the current stream preprocessor to allow the setting and testing of xbits to take advantage of existing connection tracking capabilities. This most likely can be implemented as an extension to stream5 without requiring a wholly new preprocessor.
- For multi-threaded intrusion detection engines such as Suricata, some of the same tradeoffs that allow port scan detection to apply across threads could be used for xbits. Ideally, xbits would not be limited to traffic that is seen within a single thread.