Analyzing a Trac SPAM Attempt
05 July, 2008

Even though my Trac deployment does not display user-generated content, there are still places where Trac accepts query strings from users, and spammers seem to try to use these fields for their own ends. Let's see if we can find a few examples.
Trac web logs are informative, but sifting through huge logfiles can be tedious. Fortunately, simply sorting the logfiles by line length (and therefore by web request length) allows many suspicious web requests to bubble up to the top. Below is a simple Perl script that sorts any ASCII text file by line length. It prints only one line for each distinct length (the last one read), preceded by that length and by the number of lines in the file that share it. Not all lines are printed because the script is designed to handle large files: we want the unusually long lines to be printed, but the many shorter lines (which represent the vast majority of legitimate web requests) to be summarized. This is an important feature considering that at this point there is over 2.5 GB of log data from my Trac server alone.
$ cat
#!/usr/bin/perl -w
#
# prints out a file sorted by longest lines
#
# $Id: 1739 2008-07-05 13:44:31Z mbr $
#

use strict;

my %url       = ();
my %len_stats = ();
my $mlen = 0;
my $mnum = 0;

open F, "< $ARGV[0]" or die $!;
while (<F>) {
    my $len = length $_;
    $url{$len} = $_;
    $len_stats{$len}++;
    $mlen = $len if $mlen < $len;
    $mnum = $len_stats{$len}
        if $mnum < $len_stats{$len};
}
close F;

$mlen = length $mlen;
$mnum = length $mnum;

for my $len (sort {$b <=> $a} keys %url) {
    printf "[len: %${mlen}d, tot: %${mnum}d] %s",
        $len, $len_stats{$len}, $url{$len};
}

exit 0;
To illustrate how it works, below is the output of the script run against itself. Note that the more interesting code appears at the top of the output, whereas the least interesting lines (such as blank lines and lines that contain only a closing "}" character) are summarized away at the bottom:
$ ./
[len: 51, tot: 1] # $Id: 1739 2008-07-05 13:44:31Z mbr $
[len: 50, tot: 1] printf "[len: %${mlen}d, tot: %${mnum}d] %s",
[len: 48, tot: 1] $len, $len_stats{$len}, $url{$len};
[len: 44, tot: 1] # prints out a file sorted by longest lines
[len: 43, tot: 1] for my $len (sort {$b <=> $a} keys %url) {
[len: 37, tot: 1] if $mnum < $len_stats{$len};
[len: 34, tot: 1] $mlen = $len if $mlen < $len;
[len: 32, tot: 1] open F, "< $ARGV[0]" or die $!;
[len: 29, tot: 1] $mnum = $len_stats{$len}
[len: 25, tot: 1] my $len = length $_;
[len: 24, tot: 1] $len_stats{$len}++;
[len: 22, tot: 2] $mnum = length $mnum;
[len: 21, tot: 1] $url{$len} = $_;
[len: 20, tot: 2] my %len_stats = ();
[len: 19, tot: 1] #!/usr/bin/perl -w
[len: 14, tot: 3] while (<F>) {
[len: 12, tot: 1] use strict;
[len: 9, tot: 1] close F;
[len: 8, tot: 1] exit 0;
[len: 2, tot: 5] }
[len: 1, tot: 6]
Now, let's execute the script against the trac_access_log file and look at one of the longest web requests. (The script was able to reduce the 12,000,000 web requests in my Trac logs to a total of 610 interesting lines.) This particular request is 888 characters long, but there were other, similar suspicious requests of over 4,000 characters that are omitted here for brevity:
[len: 888, tot: 1] - - [02/Mar/2008:00:30:17
-0500] "GET /trac/fwsnort/anydiff?new_path=%2Ffwsnort%2Ftags%2Ffwsnort
map.html%0A& HTTP/1.1" 200 3683 "-"
"User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1;
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; .NET CLR
1.1.4322; .NET CLR 2.0.50727; InfoPath.2)"
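A request like this is easier to read once its percent-encoded query string is decoded. The following is a minimal sketch in Python (not the Perl used elsewhere in this post) that relies only on the standard urllib.parse module. The helper name decode_query is hypothetical, and the sample string is just the portion of the request that is visible in the excerpt above; the rest of the 888-character request is not reconstructed.

```python
from urllib.parse import urlsplit, parse_qsl

# Only the part of the request target visible in the log excerpt;
# the remainder of the original request is not shown in the post.
request_target = "/trac/fwsnort/anydiff?new_path=%2Ffwsnort%2Ftags%2Ffwsnort"


def decode_query(target):
    """Return (path, [(name, decoded_value), ...]) for a request target.

    Hypothetical helper for illustration; it only decodes, it does not
    validate or sanitize anything.
    """
    parts = urlsplit(target)
    # keep_blank_values=True so parameters with empty values are kept
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    return parts.path, pairs


path, params = decode_query(request_target)
print(path)
for name, value in params:
    # repr() makes embedded control characters such as %0A visible as \n
    print(f"  {name} = {value!r}")
```

Printing the values with repr() is a deliberate choice: sequences such as %0A decode to newlines, which are easy to miss in plain output.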
My guess is that the above request comes from a bot that is trying to do one of two
things: 1) force Trac to accept the content in the request (which contains
a bunch of links to pages like "" - note that
I altered the domain so as to not legitimize the original content) and
display it for other Trac users or to search engines, or 2) force Trac
itself to generate web requests to the provided links (perhaps as a way to increase
hit or referrer counts from domains - like mine - that are not affiliated with the
spammer). Either way, the strategy is flawed because the request is against the
Trac "anydiff" interface which doesn't accept user content other than svn revision
numbers, and (at least in Trac-0.10.4) such requests do not cause Trac to issue any
external DNS or web requests - I verified this with tcpdump on my Trac server after
generating similar requests against it.
Still, in all of my Trac web logs, the most suspicious web requests are against the "anydiff" interface, and specifically against the "web-cgi.rules" file bundled within the fwsnort project. However, the requests never come from the same IP address, the "anydiff" spam attempts never hit any link other than the web-cgi.rules page, and they began arriving regularly in March 2008. This makes a stronger case for the activity coming from a bot that is unable to infer that its activities are not actually working (no surprise there). Finally, I left the original IP address of the web request above intact so that you can look for it in your own web logs. Although this address is not listed in the Spamhaus DNSBL service, a rudimentary Google search indicates that it has been noticed before by other sites as a comment spammer.
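The pattern described above (long "anydiff" requests, each from a different address, starting in a particular month) can be checked with a short tally over the access log. Below is a rough sketch in Python rather than Perl. It assumes the log uses the common/combined log format seen in the excerpt earlier; the function name summarize_anydiff, the regular expression, and the 500-character threshold are all illustrative assumptions, and the sample lines are made up using reserved documentation addresses (192.0.2.0/24), not real log data.

```python
import re
from collections import Counter

# Assumed layout: host ident user [dd/Mon/yyyy:time zone] "METHOD target proto" ...
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<day>\d{2})/(?P<mon>\w{3})/(?P<year>\d{4})'
    r'[^\]]*\] "(?P<method>\S+) (?P<target>\S+)[^"]*"'
)


def summarize_anydiff(lines, min_len=500):
    """Count long 'anydiff' requests per client host and per month.

    min_len is an arbitrary cut-off for "suspiciously long" lines.
    """
    hosts = Counter()
    months = Counter()
    for line in lines:
        if len(line) < min_len:
            continue  # short lines are assumed to be ordinary requests
        m = LOG_RE.match(line)
        if m is None or "/anydiff" not in m.group("target"):
            continue
        hosts[m.group("host")] += 1
        months[f'{m.group("year")}-{m.group("mon")}'] += 1
    return hosts, months


# Made-up sample lines for illustration only (not taken from the real logs).
padding = "A" * 600
sample = [
    f'192.0.2.1 - - [02/Mar/2008:00:30:17 -0500] '
    f'"GET /trac/fwsnort/anydiff?new_path={padding} HTTP/1.1" 200 3683 "-" "-"',
    '192.0.2.2 - - [02/Mar/2008:00:31:00 -0500] '
    '"GET /trac/fwsnort/wiki HTTP/1.1" 200 512 "-" "-"',
]
hosts, months = summarize_anydiff(sample)
print("distinct hosts:", len(hosts))
print("requests per month:", dict(months))
```

Because the function accepts any iterable of lines, an open file handle can be passed in directly, so a multi-gigabyte log is read one line at a time instead of being loaded into memory.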