2004-08-24
As noted in the article "Penetration Testing of Web Applications", the use of web applications to conduct business is increasing. Companies often have custom sites built by in-house developers, and it is almost impossible to find all the vulnerabilities in a web site using automated tools. Simply looking for default installations of different software may turn up nothing, yet the custom-built site may still be vulnerable to many different programming errors. Conducting an assessment of a website can be a major undertaking, and it is much more painful if the assessment is carried out without the proper tools. A manual inspection of the site is almost always required, but when a particular vulnerability is found it can be very handy to have a set of tools to automate certain steps from there.
Why Libwhisker?
Since we are dealing with custom applications, we need a set of custom utilities. There are many different tools out there that can be scripted, but we are going to focus on Libwhisker. Libwhisker is not a tool or application in itself; it is a Perl library which allows for the creation of custom HTTP packets. Since Libwhisker is a Perl module and not an application, it is assumed that the reader has some knowledge of the HTTP protocol and is familiar with writing Perl scripts that use external modules. First let's answer the question: why do we need to look at another Perl module to do what can already be done through existing Perl modules (e.g. LWP, HTTP, URI)? Libwhisker offers us many advantages over other Perl modules:
Using Libwhisker
The main data structure in Libwhisker is the 'whisker' anonymous hash. A hash is a data structure in Perl that is comparable to associative arrays in other programming and scripting languages. This 'whisker' anonymous hash can either define different aspects of an HTTP request or read different parts of the HTTP response. However, determining how to access this information can be a source of confusion.
Prior to using any of the Libwhisker functions, two Perl hashes first need to be defined, one for the HTTP request and one for the HTTP response. Some items will be defined in the 'whisker' hash and some will be either defined directly in the request hash or read directly from the response hash. To determine which portions of the HTTP request/response are part of the 'whisker' hash and which are part of the request/response hash, let's look at the possible options for the 'whisker' hash that relate to an HTTP packet. Note that internal to the 'whisker' hash are the 'hin' and 'hout' hashes, which are directly mapped to the request and response hashes, respectively. For this article we will use the %request, %response and %jar Perl hashes to refer to the HTTP request hash, HTTP response hash and HTTP cookies hash, respectively.
Below is a diagram of an HTTP request packet that shows where the parts of the 'whisker' hash relate to and where the parts of the 'request' hash relate to.
Below is a diagram of an HTTP response packet that shows how to access it with the 'whisker' hash.
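As a rough illustration of the split described above (not a reproduction of the original diagrams), the following plain-Perl sketch builds a request hash and a mock response hash by hand. It does not call Libwhisker at all: the header values and the 200 status are made-up sample data, the helper name headers_in_order is hypothetical, and only 'whisker' keys that appear in this article's scripts are used.

```perl
use strict;
use warnings;

# Request: control values live under 'whisker'; plain HTTP headers
# are ordinary top-level keys of the same hash.
my %request = (
    'whisker' => {
        'method' => 'HEAD',
        'uri'    => '/',
        'host'   => 'www.victim.com',
    },
    'User-Agent' => 'Mozilla (libwhisker/2.0)',
);

# Mock response, filled in by hand for illustration only.
my %response = (
    'whisker' => {
        'version'      => '1.1',
        'code'         => 200,
        'header_order' => [ 'Server', 'Content-Type' ],
        'data'         => '',
    },
    'Server'       => 'ExampleServer/1.0',
    'Content-Type' => 'text/html',
);

# Hypothetical helper: return the response headers as "Name: value"
# strings, in the order recorded in {'whisker'}->{'header_order'}.
sub headers_in_order {
    my ($resp) = @_;
    return map { "$_: $resp->{$_}" } @{ $resp->{'whisker'}->{'header_order'} };
}

print "$request{'whisker'}->{'method'} $request{'whisker'}->{'uri'}\n";
print "HTTP $response{'whisker'}->{'version'}\t$response{'whisker'}->{'code'}\n";
print "$_\n" for headers_in_order(\%response);
```

The point of the sketch is only the access pattern: status information and the header order sit under {'whisker'}, while each header value is read straight from the top level of the hash.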
Getting Started
To get used to using the Libwhisker module we will write a command line tool that allows us to follow the first five steps in "Penetration Testing of Web Applications". This will give us a good example to base our scripts on. For each step we will add to the script. To briefly summarize, the information gathering steps we are going to script are:
There are a couple of different ways to initialize the request and response hashes. One is to initialize the hashes manually [ref1]; the other is to use the Libwhisker functions 'LW2::http_new_request' and 'LW2::http_new_response' [ref2].
I.
use LW2;
my %request = ();
my %response = ();
LW2::http_init_request(\%request);
$request{'whisker'}->{'host'} = "www.victim.com";
II.
use LW2;
my $request = LW2::http_new_request(host=>'www.victim.com', uri=>'/');
my $response = LW2::http_new_response();
To accomplish the first of our five steps, we need to be able to send a HEAD or OPTIONS request to the server and see what type of information is sent back. This means that we have to alter the value in {'whisker'}->{'method'} and print out all the headers that are returned from the server. Instead of hard coding the method, we will allow it to be specified on the command line using the '-m' option. We will only allow the GET, HEAD and OPTIONS methods.
#Define the modules that we intend to use.
use strict;
use LW2;
use Getopt::Std;

#Define hashes for our command line options, request
#information and response information.
my (%opts, %request, %response, $headers_array, $header);
getopts('h:m:', \%opts);

#Initialize all the request variables. Some of these we will overwrite.
LW2::http_init_request(\%request);

if (!(defined($opts{h}))) {
    die "You must specify a host to scan.\n";
}

if (defined($opts{m})) {
    #The pattern is anchored so that only an exact method name matches.
    if ($opts{m} =~ /^(?:OPTIONS|HEAD|GET)$/) {
        $request{'whisker'}->{'method'} = $opts{m};
    } else {
        die "You can only use OPTIONS, HEAD or GET for the method.\n";
    }
}

##start making requests
#Set the host that we want to scan
$request{'whisker'}->{'host'} = $opts{h};

#Make RFC compliant
LW2::http_fixup_request(\%request);

#Do the actual scan. A true return value signals an error.
if (LW2::http_do_request(\%request, \%response)) {
    ##error handling
    print 'ERROR: ', $response{'whisker'}->{'error'}, "\n";
    print $response{'whisker'}->{'data'}, "\n";
} else {
    ##show results
    #Get the information out of the anonymous array.
    #'$headers_array' is a reference.
    $headers_array = $response{'whisker'}->{'header_order'};
    print "HTTP ", $response{'whisker'}->{'version'}, "\t";
    print $response{'whisker'}->{'code'}, "\n";
    foreach $header (@$headers_array) {
        print "$header";
        print "\t$response{$header}\n";
    }
}
The second, third, fourth, and fifth steps in our example can be combined, as they all involve specifying a URI, looking at the actual HTML data returned from the server, or both. To modify the URI from its default '/' we need to change {'whisker'}->{'uri'}. We also want to add an option to print out the HTML data. This way we can see if there is anything interesting in the returned HTML data, such as 404 messages (step 2), files and directories (step 3), source code (step 4), and errors generated by manipulating the GET request through the URI (step 5). For the URI we will use the '-u' switch and for the printing of the HTML code we will use the '-d' switch. Let's take a look at our new code snippet, which takes the previous one and builds on it:
#Define the modules that we intend to use.
use strict;
use LW2;
use Getopt::Std;

#Define hashes for our command line options,
#request information and response information.
my (%opts, %request, %response, $headers_array, $header);

##note the additions of 'd' for data and 'u' as
##the option for the URI, below
getopts('dh:m:u:', \%opts);

#Initialize all the request variables. Some of these we will overwrite.
LW2::http_init_request(\%request);

if (!(defined($opts{h}))) {
    die "You must specify a host to scan.\n";
}

if (defined($opts{m})) {
    #The pattern is anchored so that only an exact method name matches.
    if ($opts{m} =~ /^(?:OPTIONS|HEAD|GET)$/) {
        $request{'whisker'}->{'method'} = $opts{m};
    } else {
        die "You can only use OPTIONS, HEAD or GET for the method.\n";
    }
}

##now set URI if passed on command line
if (defined($opts{u})) {
    $request{'whisker'}->{'uri'} = $opts{u};
}

#Set the host that we want to scan
$request{'whisker'}->{'host'} = $opts{h};

#Make RFC compliant
LW2::http_fixup_request(\%request);

#Do the actual scan. A true return value signals an error.
if (LW2::http_do_request(\%request, \%response)) {
    print 'ERROR: ', $response{'whisker'}->{'error'}, "\n";
    print $response{'whisker'}->{'data'}, "\n";
} else {
    #Get the information out of the anonymous array.
    #'$headers_array' is a reference.
    $headers_array = $response{'whisker'}->{'header_order'};
    print "\n\n";
    print "HTTP ", $response{'whisker'}->{'version'}, "\t";
    print $response{'whisker'}->{'code'}, "\n";
    foreach $header (@$headers_array) {
        print "$header";
        print "\t$response{$header}\n";
    }
    ##if 'd' is passed, print a separator line and then the data
    if (defined($opts{d})) {
        print "\n\n----------------------------------------------------------\n\n";
        print $response{'whisker'}->{'data'}, "\n";
    }
}
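Reading the raw '-d' output by eye gets tedious on larger pages. As one possible aid, here is a minimal sketch of a helper that pulls HTML comments and link targets out of a response body. The function name find_interesting and the sample HTML are hypothetical and not part of Libwhisker; the regular expressions are deliberately simple and will miss unquoted attributes and other edge cases.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Libwhisker): collect HTML comments and
# quoted href/src/action values from a response body.
sub find_interesting {
    my ($html) = @_;
    my %found = ( comments => [], links => [] );

    # HTML comments often hold leftover notes or disabled code.
    while ( $html =~ /<!--(.*?)-->/sg ) {
        my $comment = $1;
        $comment =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        push @{ $found{comments} }, $comment;
    }

    # Attribute values that point at other files or directories.
    while ( $html =~ /\b(?:href|src|action)\s*=\s*(["'])(.*?)\1/sig ) {
        push @{ $found{links} }, $2;
    }
    return \%found;
}

# Sample input, made up for this example.
my $sample = <<'HTML';
<html><!-- TODO: remove /admin/test.php -->
<a href="/files/report.pdf">report</a>
<form action='/cgi-bin/login.pl'></form></html>
HTML

my $found = find_interesting($sample);
print "comment: $_\n" for @{ $found->{comments} };
print "link: $_\n"    for @{ $found->{links} };
```

In the script above, the same helper could be given $response{'whisker'}->{'data'} instead of the sample string.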
Now we will add support for changing the 'User-Agent' header. Developers use this header to determine whether the client is running a browser that they support. By default the 'User-Agent' header used by Libwhisker is 'Mozilla (libwhisker/2.0)'. We will add support for three different browsers: Netscape 7.1, Microsoft IE 6 and Mozilla Firefox 0.9. To change which browser will be spoofed, the command line option '-U' will be added to our script, and it will take either N (Netscape), I (Internet Explorer) or F (Mozilla Firefox). There is more to spoofing a browser than just changing the 'User-Agent' header, however. While this will do the trick most of the time, each browser also sends other headers and values. If you want to fully spoof a browser you will first have to determine which headers are used and what their associated values are, and then set them in your script accordingly.
We also want to add support for the POST method. Some applications require the user to have a session established before allowing them to post data, so we will need to grab the cookies from the response we get and set them for our POST request. One caveat is that the URI specified with the '-u' option should only be used in the POST request. Initially the script will do a simple GET request and get the cookie, then set the cookie in the POST request prior to calling 'LW2::http_do_request'. The '-D' option will be used to specify the data that should be in the POST request. Below is the final incarnation of our example script, which allows us to change the 'User-Agent' header and do POST requests, and which can now achieve all five of the tasks that we initially set out to automate.
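The following is a minimal sketch of the two additions described above, not the original listing. Several things in it are assumptions: the User-Agent strings are illustrative examples of what those browsers sent, the helper names set_user_agent and post_with_session are hypothetical, and the LW2 cookie calls ('LW2::cookie_read' and 'LW2::cookie_write', each taking a jar reference and a request or response reference) should be checked against the documentation of the Libwhisker version you have installed.

```perl
use strict;
use warnings;

# Example User-Agent strings (illustrative; capture the exact string
# from a real browser if it matters for your test).
my %user_agents = (
    N => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4) Gecko/20030624 Netscape/7.1',
    I => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',
    F => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040614 Firefox/0.9',
);

# Hypothetical helper: set the User-Agent header from a '-U' style letter.
# Headers are top-level keys of the request hash.
sub set_user_agent {
    my ($request, $choice) = @_;
    die "Use N, I or F for the browser.\n"
        unless defined $choice && exists $user_agents{$choice};
    $request->{'User-Agent'} = $user_agents{$choice};
    return $request;
}

# Hypothetical helper: GET '/' to pick up session cookies, then POST
# $data to $uri. Returns a reference to the POST response hash, or
# nothing if either request fails.
sub post_with_session {
    my ($host, $uri, $data, $choice) = @_;
    require LW2;    # loaded only when a request is actually made

    my (%jar, %get, %get_resp);
    LW2::http_init_request(\%get);
    $get{'whisker'}->{'host'} = $host;
    set_user_agent(\%get, $choice);
    LW2::http_fixup_request(\%get);
    return if LW2::http_do_request(\%get, \%get_resp);

    # Assumed signature: copy Set-Cookie values from the response into the jar.
    LW2::cookie_read(\%jar, \%get_resp);

    my (%post, %post_resp);
    LW2::http_init_request(\%post);
    $post{'whisker'}->{'host'}   = $host;
    $post{'whisker'}->{'method'} = 'POST';
    $post{'whisker'}->{'uri'}    = $uri;
    $post{'whisker'}->{'data'}   = $data;
    set_user_agent(\%post, $choice);

    # Assumed signature: write the jar's cookies into the request headers.
    LW2::cookie_write(\%jar, \%post);

    # Expected to add Content-Length for the body; set it by hand if
    # your Libwhisker version does not.
    LW2::http_fixup_request(\%post);
    return if LW2::http_do_request(\%post, \%post_resp);
    return \%post_resp;
}
```

A call might look like post_with_session('www.victim.com', '/login.pl', 'user=a&pass=b', 'F'), where the URI and form data are made-up values. Keeping the GET and POST in separate request hashes avoids carrying the POST body or URI into the cookie-gathering request.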
Conclusion
Having your own suite of custom scripts and tools can be very handy when it comes to web application assessments. One of the benefits to this approach is that you know how the tools work and can add new functionality to fit your needs. This also helps in learning how an application works.
Many tools out there will tell a user that an application is vulnerable by doing some basic testing, but sometimes you need more than that. Libwhisker can be used to go beyond the information gathering phase that was shown here. Just as many security tools use libnet and libpcap to create and read specially crafted packets, Libwhisker gives us the same type of functionality for creating and parsing HTTP packets, and it can be very useful to a penetration tester.
Comments
Comments or reprint requests can be sent to the editor. View more articles by Neil Desai on SecurityFocus.