Michael Rash, Security Researcher

Handling Escaped Semicolons in Snort Rules with fwsnort

Recently I ran into a situation in which several Snort rules from the Emerging Threats rule sets were not being properly translated into iptables rules by fwsnort. It turned out that fwsnort did not correctly parse Snort content fields that contained escaped semicolons (e.g. "\;"). In the Snort signature language, the argument to every keyword in the body of a Snort rule, such as content, pcre, and flowbits, is terminated with a semicolon, and some keywords also enclose their argument in double quotes. However, Snort supports escaping with a backslash so that these characters can be part of a keyword argument instead of being treated as delimiting syntax. Snort does not allow the argument of a content keyword to contain an embedded semicolon that is not escaped (e.g. content:"distloc=;";), and will generate an error similar to the following if a rule does not conform to this:

Initializing rule chains...
ERROR: /etc/snort/rules/web-cgi.rules(3) => Content data needs to be enclosed in quotation marks (")!
Fatal Error, Quitting..

In this case, changing content:"distloc=;"; to content:"distloc=\;"; makes the error go away. In addition to the escaping mechanism, any double quote or semicolon that is part of a content field can instead be specified in hex notation between pipe ("|") characters, e.g. content:"distloc=|3B|";.
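To make the delimiter rule above concrete, here is a minimal Python sketch that splits the option body of a rule on unescaped semicolons only. It is an illustration under simplifying assumptions, not the parsing code used by fwsnort or Snort; the function name split_options and the sample rule body are hypothetical.

```python
from typing import List


def split_options(rule_body: str) -> List[str]:
    """Split a Snort rule option body on semicolons that are not escaped.

    Simplified sketch: a backslash protects the character that follows it,
    so "\\;" stays inside the current option instead of ending it.
    """
    options = []
    current = []
    i = 0
    while i < len(rule_body):
        ch = rule_body[i]
        if ch == "\\" and i + 1 < len(rule_body):
            # Keep the backslash and the escaped character together.
            current.append(rule_body[i:i + 2])
            i += 2
            continue
        if ch == ";":
            # An unescaped semicolon terminates the current option.
            option = "".join(current).strip()
            if option:
                options.append(option)
            current = []
        else:
            current.append(ch)
        i += 1
    tail = "".join(current).strip()
    if tail:
        options.append(tail)
    return options


# Hypothetical option body containing an escaped semicolon.
body = r'msg:"example"; content:"distloc=\;"; nocase;'
for opt in split_options(body):
    print(opt)
# msg:"example"
# content:"distloc=\;"
# nocase
```

A splitter that ignores the backslash would cut the content option in two at the embedded semicolon, which is the kind of misparse described above.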

So, what are the tradeoffs in using one convention vs. the other?

Using backslashes can complicate the way an argument looks (since the backslashes are not part of the content that is actually searched for in network traffic), but they can also make the argument more intuitive to read than the hex syntax. This can be important when looking at lots of packet traces. For example, in web traffic the semicolon is used as a separator in HTTP request headers and therefore has special significance in HTTP, and the semicolon is also a separator for multiple commands launched from a command shell. So, for those who don't automatically know the hex equivalent of a semicolon (0x3b), it might be better to look at content:"distloc=\;"; instead of content:"distloc=|3B|"; when interpreting signature matches against raw packet traces, since the escaped form keeps the semicolon visible and emphasizes its importance.
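Both notations describe the same bytes on the wire, which can be checked with a short Python sketch that decodes a content argument into the byte string to search for. This is a simplified illustration rather than Snort's or fwsnort's actual decoder; the function name decode_content is hypothetical, and the sketch assumes plain ASCII text outside the hex blocks.

```python
def decode_content(arg: str) -> bytes:
    """Decode a Snort content argument (without the surrounding quotes).

    Simplified sketch: text between pipe characters is hex, and a backslash
    makes the next character literal. Assumes ASCII/Latin-1 input.
    """
    out = bytearray()
    hex_chars = []
    in_hex = False
    i = 0
    while i < len(arg):
        ch = arg[i]
        if in_hex:
            if ch == "|":
                # bytes.fromhex() skips the spaces between hex pairs.
                out += bytes.fromhex("".join(hex_chars))
                hex_chars = []
                in_hex = False
            else:
                hex_chars.append(ch)
        elif ch == "|":
            in_hex = True
        elif ch == "\\" and i + 1 < len(arg):
            # Drop the backslash and keep the escaped character.
            out += arg[i + 1].encode("latin-1")
            i += 1
        else:
            out += ch.encode("latin-1")
        i += 1
    if in_hex:
        raise ValueError("unterminated |hex| block")
    return bytes(out)


print(decode_content(r"distloc=\;"))    # b'distloc=;'
print(decode_content("distloc=|3B|"))   # b'distloc=;'
```

The two arguments differ only in how readable they are in the rule file; the pattern that is matched against traffic is identical.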

Prominent Snort rule sets use each strategy for the arguments to content fields (escaped semicolons vs. the hex equivalent). The complete Emerging Threats rule set contains 58 signatures with escaped semicolons:

$ perl -lwne 'while (/content:"(.*?)"/g) { $tmp = $1; if ($tmp =~ /\x3b/) { print $tmp; }} ' emerging-all.rules |wc -l

Note that the 'while (/content:"(.*?)"/g)' loop is necessary above in order to parse all content fields from each Snort rule; using something like 'if (/content:"(.*?)"/)' would parse only the very first content field in each Snort rule. Here is an example content field from the "ET MALWARE Related Spyware Checkin" signature:

|0d 0a|User-Agent\: Mozilla/3.0 (compatible\; Indy Library)|0d 0a|

By contrast, I've seen a few Sourcefire VRT rule sets, and none of them appear to use escaped semicolons in any of their signatures. They always use the "|3B|" hex notation instead.
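The Perl one-liner above prints one line per matching content field, so a rule with two such fields contributes two lines to the count. In case the distinction matters, here is a rough Python sketch that uses the same regular expression and reports both the number of content fields and the number of rules that contain a semicolon. The function name and the two sample rules are hypothetical and are not taken from any real rule set.

```python
import re

# Same pattern as the Perl one-liner: non-greedy match up to the next quote.
CONTENT_RE = re.compile(r'content:"(.*?)"')


def count_semicolon_contents(lines):
    """Return (content fields containing ';', rules with at least one such field)."""
    fields = 0
    rules = 0
    for line in lines:
        hits = [c for c in CONTENT_RE.findall(line) if ";" in c]
        fields += len(hits)
        if hits:
            rules += 1
    return fields, rules


# Hypothetical rules for illustration only.
sample = [
    r'alert tcp any any -> any any (msg:"hypothetical A"; content:"x=\;"; content:"y\;z"; sid:1000001;)',
    r'alert tcp any any -> any any (msg:"hypothetical B"; content:"x=|3B|"; sid:1000002;)',
]
print(count_semicolon_contents(sample))  # (2, 1)
```

To run it against a real file, the lines can be read with open("emerging-all.rules") and passed to the function in place of the sample list.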

Now, why is this important for fwsnort? The reason is that the current version - fwsnort-1.0.5 - does not properly parse content fields with escaped semicolons. However, this will be corrected in the upcoming fwsnort-1.0.6 release, which will be completed within the next two days or so. In the meantime, here is a link to fwsnort-1.0.6-pre4 that corrects this issue.