Uninformed: Informative Information for the Uninformed

Vol 5» 2006.Sep


SimpleComparison Metric

SimpleCompare is the first of three related metrics, the other two being MediumCompare and ComplexCompare. SimpleCompare is unique in that it compares the input against a print in the database without using any information about the other prints in the database. That means that if a certain duration value is highly distinctive, such as the illegal ones found only in prism2-based implementations, SimpleCompare has no opportunity to take this into consideration.

All the metrics presented in this chapter break the fingerprints up into two different sets of data points. The first set is a set of pairs of the form (duration value, count). The second set is a set of triples of the form (packet type, duration value, count). The diagrams below leave the count component of both tuples out for clarity.
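As a rough sketch of how these two sets of data points could be built, the snippet below counts both from a list of observed packets. The input shape (a list of (packet type, duration) tuples), the function name `build_fingerprint`, and the sample packet types and duration values are all assumptions made for illustration, not part of the original tooling.

```python
from collections import Counter

def build_fingerprint(packets):
    """Return (duration_counts, type_duration_counts) for one capture.

    `packets` is a list of (packet_type, duration) tuples. This input
    format is hypothetical and chosen only for the example.
    """
    packets = list(packets)
    # First set: (duration value, count) pairs.
    duration_counts = Counter(duration for _, duration in packets)
    # Second set: (packet type, duration value, count) triples,
    # stored as a mapping from (packet type, duration) to count.
    type_duration_counts = Counter(packets)
    return duration_counts, type_duration_counts

# Made-up capture; the types and durations are illustrative only.
capture = [
    ("probe_request", 0),
    ("probe_request", 0),
    ("data", 44),
    ("data", 44),
    ("data", 314),
]
pairs, triples = build_fingerprint(capture)
print(sorted(pairs.items()))
print(sorted(triples.items()))
```

Storing the triples as a mapping keyed on (packet type, duration) keeps the count separate from the key, which matches the way the diagrams omit the count component.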

SimpleCompare, like the other metrics, comes in three flavors. It can be computed using just the (duration value, count) pairs, or using just the (packet type, duration value, count) triples. Finally, the results from both analyses can be combined, which is simply a matter of adding the two return values.

SimpleCompare utilizes two functions that are used throughout this chapter. They are used to compute the duration ratios in tables 5.2 and 5.3, and are defined as follows.

\begin{math}
duration\_ratio(p,d) = {\text{\# of packets with packet\_type = p and duration = d} \over \text{\# of total packets with packet\_type = p}}
\end{math}

The SimpleCompare metric is defined below. The input packet capture is denoted by L, while R denotes a print in the capture database for a particular 802.11 implementation.

\begin{math}
duration\_ratio(d) = {\text{\# of packets with duration = d} \over \text{\# of total packets observed}}
\end{math}
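The two ratio functions might be sketched in Python as follows. This is a minimal illustration, assuming a capture is held as a list of (packet type, duration) tuples; the name `typed_duration_ratio` stands in for the two-argument duration_ratio(p,d), and returning 0.0 for an empty denominator is a choice made here, not something the definitions specify.

```python
def duration_ratio(packets, d):
    """Share of all observed packets whose duration equals d."""
    total = len(packets)
    if total == 0:
        return 0.0  # assumption: an empty capture yields a ratio of 0
    return sum(1 for _, duration in packets if duration == d) / total

def typed_duration_ratio(packets, p, d):
    """Share of packets of type p whose duration equals d."""
    durations = [duration for ptype, duration in packets if ptype == p]
    if not durations:
        return 0.0  # assumption: an unseen packet type yields a ratio of 0
    return durations.count(d) / len(durations)

# Made-up capture; the types and durations are illustrative only.
capture = [
    ("probe_request", 0),
    ("probe_request", 0),
    ("data", 44),
    ("data", 44),
    ("data", 314),
]
overall = duration_ratio(capture, 44)              # 2 of 5 packets
within_data = typed_duration_ratio(capture, "data", 44)  # 2 of 3 data packets
print(overall, within_data)
```

Note that the same duration value gets a different ratio depending on whether the denominator is the whole capture or only the packets of one type.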


Figure 4.1: SimpleCompare duration-value only analysis



sum = 0;
for every duration-value d $\in (L \cap R)$
    sum += $1.0 - \vert L.duration\_ratio(d) - R.duration\_ratio(d) \vert$
return sum;
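The pseudocode above could be realized in Python roughly as shown here. This is a sketch under assumptions: captures are lists of (packet type, duration) tuples, and the helper names `duration_ratios` and `simple_compare_durations` are invented for the example.

```python
from collections import Counter

def duration_ratios(packets):
    """Map each duration value to its share of all packets in the capture."""
    counts = Counter(duration for _, duration in packets)
    total = sum(counts.values())
    return {d: c / total for d, c in counts.items()}

def simple_compare_durations(sample, reference):
    """Duration-value only SimpleCompare of an input L against a print R."""
    l_ratios = duration_ratios(sample)
    r_ratios = duration_ratios(reference)
    # Only duration values present in both prints contribute to the sum.
    shared = l_ratios.keys() & r_ratios.keys()
    return sum(1.0 - abs(l_ratios[d] - r_ratios[d]) for d in shared)

# Made-up captures: both use durations 44 and 0, at different rates.
L = [("data", 44)] * 3 + [("data", 0)]
R = [("data", 44)] * 2 + [("data", 0)] * 2
score = simple_compare_durations(L, R)
print(score)
```

In this example each shared duration differs by 0.25 between the two prints, so each contributes 0.75 to the sum. A print compared against itself would contribute 1.0 per duration value.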

The metric weights common durations that appear in their respective prints at roughly the same rate more heavily than ones that do not. However, SimpleCompare pays no attention to duration values that are not in the intersection, as illustrated in Figure 4.1, even though the number of values outside the intersection is clearly a strong indicator of how closely two prints match. It also has no notion of how unique any specific duration value is across the entire database.
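A small worked example may make the blindness to values outside the intersection concrete. The ratio tables below are invented for illustration: both database prints share the same two durations with the sample at the same rates, but the second print spreads its remaining packets over four unshared durations instead of one.

```python
def simple_compare(l_ratios, r_ratios):
    """SimpleCompare over two precomputed {duration: ratio} tables."""
    shared = l_ratios.keys() & r_ratios.keys()
    return sum(1.0 - abs(l_ratios[d] - r_ratios[d]) for d in shared)

# Hypothetical ratio tables; each sums to 1.0.
sample = {0: 0.5, 44: 0.5}
print_a = {0: 0.3, 44: 0.5, 314: 0.2}
print_b = {0: 0.3, 44: 0.5, 100: 0.05, 200: 0.05, 300: 0.05, 400: 0.05}

score_a = simple_compare(sample, print_a)
score_b = simple_compare(sample, print_b)
print(score_a, score_b)
```

Both prints receive the same score, because only durations 0 and 44 enter the sum. The metric cannot tell that the second print has four durations the sample never used.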

At first, this lack of a global perspective on the relative likelihood of seeing particular duration values seemed likely to hinder the algorithm significantly. Consider the case where a prism2 sample is input that uses all the same illegal duration values as the print stored in the database, but at very different rates. SimpleCompare lacks the information to realize that the illegal values identify a prism2 implementation, and could grade this sample incorrectly.

At this point, SimpleCompare is also ignoring the packet type in which the duration values appear. This can cause two problems. One is that two different implementations may use the same duration value, but in consistently different packet types (probe requests versus association responses, for example). The other is that the rate at which a duration value is used across all packet types can fluctuate widely between packet samples, while the rate is much more consistent when confined to a particular packet type. Both of these problems are addressed by considering the packet types when looking at durations.

We can reuse SimpleCompare, except that this time we run it against the (packet type, duration) pairs, as illustrated below.


Figure 4.2: SimpleCompare (packet type, duration) analysis



sum = 0;
for every pair (packet_type p, duration-value d) $\in (L \cap R)$
    sum += $1.0 - \vert L.duration\_ratio(p,d) - R.duration\_ratio(p,d) \vert$
return sum;
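The following sketch puts the three flavors side by side: duration-only, (packet type, duration), and the combined score obtained by adding the two. It assumes captures are lists of (packet type, duration) tuples. The function names, the `flavor` parameter and the sample data are hypothetical.

```python
from collections import Counter

def duration_ratios(packets):
    """Map duration -> share of all packets."""
    counts = Counter(d for _, d in packets)
    total = sum(counts.values())
    return {d: c / total for d, c in counts.items()}

def typed_duration_ratios(packets):
    """Map (packet type, duration) -> share of packets of that type."""
    pair_counts = Counter(packets)
    type_counts = Counter(p for p, _ in packets)
    return {(p, d): c / type_counts[p] for (p, d), c in pair_counts.items()}

def score(l_ratios, r_ratios):
    """Sum 1.0 - |difference| over the keys present in both tables."""
    shared = l_ratios.keys() & r_ratios.keys()
    return sum(1.0 - abs(l_ratios[k] - r_ratios[k]) for k in shared)

def simple_compare(sample, reference, flavor="combined"):
    by_duration = score(duration_ratios(sample), duration_ratios(reference))
    by_type = score(typed_duration_ratios(sample),
                    typed_duration_ratios(reference))
    if flavor == "duration":
        return by_duration
    if flavor == "typed":
        return by_type
    return by_duration + by_type  # combined flavor: add both results

# Made-up captures: duration 0 appears in different packet types.
L = [("probe_request", 0)] * 2 + [("data", 44)] * 2
R = [("assoc_response", 0)] * 2 + [("data", 44)] * 2

duration_score = simple_compare(L, R, "duration")
typed_score = simple_compare(L, R, "typed")
combined_score = simple_compare(L, R)
print(duration_score, typed_score, combined_score)
```

Here the duration-only analysis sees two identical prints, while the typed analysis matches only the ("data", 44) pair. This is the first of the two problems described above: the same duration value used in consistently different packet types.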