Wednesday, May 28, 2008

Burp Sequencer 101

Talking to a friend the other day, I realised that some smart hackers who use Burp Sequencer for analysing session token randomness do not necessarily realise quite how it works or how to interpret its results.

This is hardly surprising: statistical testing for randomness is a complex business; Burp Sequencer lets you click and go; and all hackers like a tool which they can whip out and use without too much effort.

Anyone who really wants to understand the inner workings of Burp Sequencer can of course read TFM. But for everyone else, here are four key questions that you should know basic answers to if you want to use the tool effectively.

1. What does "bits of effective entropy" mean?

Each token is a collection of individual bits of data. Burp converts each token into a bit-level representation based on the size of the character set that appears at each position within the raw token - for example, if the first character position employs 8 different characters, then this represents 3 bits of data.
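The arithmetic here is just a ceiling-log2 calculation. As a minimal sketch (my own illustration of the idea, not Burp's actual code), mapping a character-set size to a number of bits looks like this:

```java
// Illustration only: one plausible way to derive the number of bits
// of data represented by a character position, based on the size of
// the character set observed at that position.
public class TokenBits {

    // Smallest number of bits needed to represent n distinct values.
    static int bitsNeeded(int distinctChars) {
        if (distinctChars <= 1) return 0; // a constant position carries no data
        return 32 - Integer.numberOfLeadingZeros(distinctChars - 1);
    }

    public static void main(String[] args) {
        // A position using 8 different characters represents 3 bits of data.
        System.out.println(bitsNeeded(8));  // 3
        // A position using 16 characters (e.g. hex digits) represents 4 bits.
        System.out.println(bitsNeeded(16)); // 4
    }
}
```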

Burp tests each bit position within the token separately. If you have captured 1,000 tokens for analysis, then Burp tests the 1,000 values at the first bit position; then the 1,000 values at the second position; and so on. Several different tests are performed, and an overall result is derived for each bit position. Because the different tests look for different statistical features of non-random data, the "score" for each bit position is the lowest score returned by all of the tests. In other words, if a bit position fails any of the statistical tests, then that bit position is deemed to be non-random, and so fails overall.
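To make the "lowest score wins" idea concrete, here is a toy sketch of testing a single bit position with two simple checks. These checks are illustrative stand-ins, not Burp's actual statistical tests:

```java
// Toy illustration of per-bit-position testing: run more than one
// simple check over the captured values at a single bit position,
// and keep the worst (lowest) score. Not Burp's actual algorithm.
public class BitPositionTest {

    // Check 1: the counts of 0s and 1s should be roughly balanced.
    static double frequencyScore(int[] bits) {
        int ones = 0;
        for (int b : bits) ones += b;
        double expected = bits.length / 2.0;
        // 1.0 = perfectly balanced, 0.0 = completely skewed
        return 1.0 - Math.abs(ones - expected) / expected;
    }

    // Check 2: the bit should not simply alternate or stay constant;
    // roughly half of all adjacent pairs should differ.
    static double transitionScore(int[] bits) {
        int transitions = 0;
        for (int i = 1; i < bits.length; i++)
            if (bits[i] != bits[i - 1]) transitions++;
        double expected = (bits.length - 1) / 2.0;
        return 1.0 - Math.abs(transitions - expected) / expected;
    }

    // The overall score is the lowest of the individual checks:
    // failing any one check means the position fails overall.
    static double overallScore(int[] bits) {
        return Math.min(frequencyScore(bits), transitionScore(bits));
    }

    public static void main(String[] args) {
        // A strictly alternating bit: 0,1,0,1,...
        int[] alternating = new int[1000];
        for (int i = 0; i < alternating.length; i++) alternating[i] = i % 2;

        // Balanced 0s and 1s, so the frequency check alone is perfect...
        System.out.println(frequencyScore(alternating)); // 1.0
        // ...but strict alternation fails the transition check badly,
        // so the position fails overall.
        System.out.println(overallScore(alternating));   // 0.0
    }
}
```

This is why a single clever-looking pass is not enough: the position only passes if it passes every test.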

(Burp also tests for correlations between the data at different bit positions, to avoid perverse results in cases where the same data is duplicated at different points within the token.)

2. What does "significance level" mean?

Burp reports an overall result showing the quantity of effective entropy at a significance level of 1%, and also shows a chart of the amount of entropy at other significance levels.

Unless you enjoyed statistics at school, don't dwell on this. The significance level represents how stringent we want our tests to be in identifying non-random data. The lower the significance level, the more obviously non-random a sample needs to be before we reject the conclusion that it is random. 1% is a common, and reasonable, significance level to use in this kind of test - it means roughly that we will wrongly conclude that random data is non-random in only 1% of cases.

So the "Overall result" at the top of the Summary tab is the right one to use in general.

3. How many samples do I need to capture?

This is a very important question. In many situations, capturing a large sample of tokens from a web application may be time-consuming or intrusive. As a convenience, Burp lets you perform the statistical tests based on a sample of only 100 tokens. The only use for this is to allow you to quickly identify tokens that are blatantly non-random. Getting a "pass" result based on this size of sample provides very little assurance of the actual statistical properties of the token generator. The "Reliability" message at the bottom of the Summary tab gives you a rough indication of how effective your sample size is.

In general, unless smaller samples prove obviously non-random, you should aim to capture at least 5,000 tokens from the application, and more if possible. To see why, let's look at the results of testing a defective token generator. Suppose an application uses the following broken linear congruential generator:

long nextToken = lastToken * 13 + 27;

Although broken, this generator produces tokens that look fairly random to the naked eye, as evidenced by WebScarab's graphical view.

After capturing 1,000 tokens using Burp, the statistical tests report 49 bits of effective entropy: a "pass" result in most situations.

However, if we continue and analyse 3,000 tokens, the result is less promising: only 37 bits pass the tests.

And after analysing 6,000 tokens, only 17 bits pass the test: a "fail" result in most situations.

Continuing further, Burp will eventually report that all bits fail the test. So to reiterate: size really matters. The non-random properties of many generators only emerge when large sample sizes are tested. If in doubt, always continue capturing.
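Incidentally, the weakness of this particular generator can be seen directly, without any statistical machinery: modulo 2, nextToken = lastToken + 1, so the least significant bit of every token simply alternates. A short sketch:

```java
// The low-order bits of the broken generator above are glaringly
// non-random: modulo 2, nextToken = lastToken + 1, so the least
// significant bit flips on every single token.
public class WeakLcg {

    static long next(long token) {
        return token * 13 + 27;
    }

    public static void main(String[] args) {
        long token = 1234567L; // arbitrary seed
        for (int i = 0; i < 8; i++) {
            token = next(token);
            System.out.print(token & 1); // a strictly alternating pattern
        }
        System.out.println();
    }
}
```

A pattern like this only becomes statistically unmistakable once the sample is large enough for the per-bit tests to distinguish it from chance.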

4. How does Burp's result relate to predictability?

Statistical tests for randomness cannot in general establish predictability or otherwise. Deterministic data may pass statistical tests for randomness and yet be completely predictable by someone who knows the generation algorithm (for example, a hash of an incrementing number). Conversely, data that fails statistical tests may not actually be predictable in a practical situation, because the patterns that are discernible within the data do not sufficiently narrow down the range of possible future outputs to a range that can be viably tested.
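The hash-of-an-incrementing-number case is easy to sketch (illustrative code, using SHA-256): the output looks statistically random, yet anyone who knows the scheme and the counter can reproduce every token:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// Sketch of "deterministic but statistically random": tokens derived
// by hashing an incrementing counter look random, yet anyone who
// knows the scheme can predict every future token.
public class PredictableTokens {

    static String token(long counter) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] hash = md.digest(
                    Long.toString(counter).getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : hash) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (java.security.NoSuchAlgorithmException e) {
            throw new RuntimeException(e); // SHA-256 is always available
        }
    }

    public static void main(String[] args) {
        // Consecutive tokens share no visible pattern...
        System.out.println(token(1000));
        System.out.println(token(1001));
        // ...but knowing the counter means knowing the token.
    }
}
```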

In general, you should interpret Burp Sequencer's output as follows:

  • If a large sample fails the tests, then the generator is definitely non-random, and its output may or may not be predictable in practice.

  • If a large sample passes the tests, then the generator may or may not be random, and its output may or may not be predictable in practice.

I hope this helps clear up a few uncertainties people may have about using Burp to test for randomness. If you have any other questions, ask away.

Monday, May 5, 2008

Null byte attacks are alive and well

Null byte attacks against web and other applications are nothing new. Later in this post, I will describe two cases that may nonetheless be unfamiliar to some readers.

Before that, a well-known example to illustrate the issue. Consider a Java-based application which displays an image file specified by the user:

String filename = request.getParameter("filename");

if (filename.endsWith(".jpg"))
{
    File f = new File(filename);
    // ... open and return the file's contents
}

Forget about path traversal attacks for the moment. The application seeks to ensure that the function can only be used to access JPEG files by checking that the user-supplied filename ends in ".jpg". If so, the filename is passed to the constructor for java.io.File, and the specified file is opened.

The file extension check can of course be defeated using input of the form (the filename here is illustrative):

secret.txt%00.jpg

where %00 is a URL-encoded null byte. The reason the attack works is because of the different ways in which null bytes are typically handled in native and managed code. In native code, the length of a string is determined by the position of the first null byte from the start of the string - the null byte effectively terminates the string. In managed code, on the other hand, string objects comprise a character array, which may contain null bytes, and a separate record of the string's length.

When the above Java code executes String.endsWith() on our input, it knows that the filename string is 15 characters long, and it checks whether the last four characters (after the null byte) match ".jpg", which they do. However, when the filename is passed to java.io.File and the file is opened, this operation is ultimately implemented within native operating system APIs in which a string's length is determined by the first null byte after the start of the string. So the filename which is ultimately processed is equivalent to:

secret.txt

and so the application's defence is defeated.
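The managed side of this behaviour is easy to verify in Java, using an illustrative 15-character filename with an embedded null byte:

```java
public class NullByteDemo {

    public static void main(String[] args) {
        // "%00" URL-decodes to a literal null character
        String filename = "secret.txt\u0000.jpg";

        // The managed String sees all 15 characters, including the null...
        System.out.println(filename.length());         // 15
        // ...so the extension check passes.
        System.out.println(filename.endsWith(".jpg")); // true

        // Native filesystem APIs, by contrast, stop reading at the null
        // byte, so the file actually opened would be "secret.txt".
    }
}
```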

Interestingly, the corresponding code in ASP.NET:

string Filename = Request.Params["filename"];

if (Filename.EndsWith(".jpg"))
{
    FileStream fs = File.Open(Filename, FileMode.Open);
}

is not vulnerable to the same attack. Our malicious input results in the following exception:

Illegal characters in path.

Before our input is passed to the native filesystem APIs, ASP.NET checks whether the managed string contains any invalid characters, including null bytes, and rejects the input if it does. So ASP.NET helps to protect developers against null byte attacks in this instance.

Null byte attacks against LDAP

LDAP is a protocol for querying directory services, and in the context of web applications is most commonly encountered in functionality for searching personnel directories etc. Suppose an application lets us search by name for employees in the Marketing department only. When we supply the input:

John

the application performs an LDAP query with the following filter (the attribute names here are illustrative):

(&(cn=John)(department=Marketing))
When an LDAP filter combines multiple logical conditions, as here, the boolean operator comes at the start of the list of conditions, and applies to all of the conditions in the list. Hence, there is nothing directly analogous to the "or 1=1" attack in SQL injection. In a conjunctive filter (one employing the boolean AND operator, &) any additional conditions we inject are only able to further restrict the results that will be returned, and cannot undo the restrictions imposed by existing conditions.
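To illustrate, suppose the application builds its filter like this (the cn and department attribute names are my assumptions, not taken from any specific application):

```java
// Sketch of how a conjunctive LDAP filter might be built from user
// input. The attribute names are illustrative assumptions.
public class LdapFilterDemo {

    static String buildFilter(String name) {
        return "(&(cn=" + name + ")(department=Marketing))";
    }

    public static void main(String[] args) {
        // Injecting an extra condition only narrows the results further;
        // the (department=Marketing) condition still applies.
        System.out.println(buildFilter("John)(telephoneNumber=*"));
        // (&(cn=John)(telephoneNumber=*)(department=Marketing))
    }
}
```

Whatever conditions we inject, they land inside the & list, so every existing condition still has to be satisfied.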

On some LDAP platforms, it is possible to supply two filters back-to-back, the second of which is ignored. In this situation, we can use the following crafted input to circumvent the department restriction and view all entries in the directory (assuming a filter of the form (&(cn=<input>)(department=Marketing))):

*))(&(cn=*

This yields the two back-to-back filters (&(cn=*))(&(cn=*)(department=Marketing)) - the first matches every entry, and the second is ignored.
(See this paper for examples of this kind of attack in action.)

However, some LDAP services, notably Microsoft ADAM (Active Directory Application Mode), do not tolerate queries with two filters. Hence, you may hear it said that LDAP injection into conjunctive filters cannot be usefully exploited against ADAM directories.

This overlooks the possibility of null byte attacks, however. If our query is being passed to ADAM, we can use the following input to circumvent the department restriction and view all entries (again assuming a filter of the form (&(cn=<input>)(department=Marketing))):

*))%00
This input does not result in a well-formed LDAP filter, when considered as a managed string. However, because the query is ultimately processed by ADAM in native code, the filter gets truncated to our injected null byte, and so the subsequent conditions in the filter are not processed.

Further, the ASP.NET APIs for querying ADAM, in the System.DirectoryServices namespace, do not block input containing null bytes, in the way that System.IO.File does.

Null byte attacks against XPath

The Web Application Hacker's Handbook includes an example of XPath injection against an application function which retrieves users' credit card numbers based on their username and password (see p316 onwards). If the XML datastore contains entries of the form (the password value here is illustrative):

<address>
    <name>James Hunter</name>
    <password>s3cret</password>
    <ccard>8113 5320 8014 3313</ccard>
</address>

then a legitimate user's request for their credit card number will result in this XPath query:

/addressBook/address[name='James Hunter' and
password='s3cret']/ccard
In the book, we described how an attacker could retrieve a list of credit card numbers for all users with input like:

/addressBook/address[name='James Hunter' and
password='' or 'a'='a']/ccard

and could perform various blind attacks to extract other information from the datastore one bit at a time, in the manner of Absinthe for blind SQL injection.

However, none of these attacks involved subverting the /ccard attribute in the application's query. Hence, the ccard data field was the only item we could retrieve directly, and we could infer other data, such as names and passwords, only by manipulating the query filter to return either a credit card number or otherwise, based on a controllable condition.

Again, what was overlooked here was the possibility of a null byte attack. In fact, we can supply input that effectively replaces the ccard attribute with one of our own (for example, password), resulting in a query of the form:

/addressBook/address[name='James Hunter' and
password='' or 'a'='a']/password%00']/ccard
This technique enables us to fully subvert both the filter and attributes of the query, and directly retrieve data such as names and passwords which we could otherwise infer only using cumbersome blind techniques.

As previously, common managed APIs for performing XPath queries, such as those in the .NET System.Xml.XPath namespace, do not block input containing null bytes, and yet are ultimately implemented in native code in which our input gets terminated at the null byte.
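As a sketch of how such input rewrites the query (the query shape follows the book's example; the injected input is illustrative):

```java
// Sketch: how null-byte input rewrites the XPath query string. The
// surrounding query shape follows the book's example; the injected
// input is illustrative.
public class XpathNullByte {

    static String buildQuery(String name, String password) {
        return "/addressBook/address[name='" + name +
               "' and password='" + password + "']/ccard";
    }

    public static void main(String[] args) {
        String attack = buildQuery("James Hunter",
                "' or 'a'='a']/password\u0000");

        // Native code stops at the null byte, so the query actually
        // processed ends at ]/password - the attacker's own attribute.
        String processed = attack.substring(0, attack.indexOf('\u0000'));
        System.out.println(processed);
        // /addressBook/address[name='James Hunter' and
        //   password='' or 'a'='a']/password
    }
}
```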