Identifying P2P users using traffic analysis
Yiming Gong 2005-07-21

Since the emergence of Napster in the fall of 1999, peer-to-peer (P2P) applications and their user base have grown rapidly in the Internet community. Given the popularity of P2P and the bandwidth it consumes, there is a growing need to identify P2P users within network traffic.

In this paper the author will propose a new method based on traffic behavior that helps identify P2P users, and even helps to distinguish what type of P2P applications are being used.

Current Technology

When it comes to identifying P2P users, currently there are only two choices: port based analysis and protocol analysis. Here is a brief review of both.

Port based analysis

Port based analysis is the most basic and straightforward method to detect P2P users in network traffic. It is based on the simple concept that many P2P applications have default ports on which they function. When these applications run, they use these ports to communicate with the outside. The following is an example list:

Limewire     6346/6347 TCP/UDP
Morpheus     6346/6347 TCP/UDP
BearShare    6346 TCP/UDP
Edonkey      4662 TCP
EMule        4662 TCP, 4672 UDP
Bittorrent   6881-6889 TCP/UDP
WinMx        6699 TCP, 6257 UDP

To perform port based analysis, administrators just need to observe the network traffic and check whether any connection records use these ports. A match may indicate P2P activity. Port based analysis is almost the only choice for network administrators who don't have special software or hardware (such as an IDS) to monitor traffic.
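The check described above can be sketched in a few lines of Python. This is only an illustration: the `Conn` record layout and the function name `find_p2p_candidates` are assumptions made for the example, not the format of any particular firewall or router log. The port table is taken from the list above.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Conn(NamedTuple):
    """Hypothetical connection record; real log formats will differ."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str  # "tcp" or "udp"


def build_port_table() -> dict:
    """Map (protocol, port) to the applications from the example list."""
    table = {}
    for proto in ("tcp", "udp"):
        table[(proto, 6346)] = "Limewire/Morpheus/BearShare"
        table[(proto, 6347)] = "Limewire/Morpheus"
        for port in range(6881, 6890):  # 6881-6889 inclusive
            table[(proto, port)] = "Bittorrent"
    table[("tcp", 4662)] = "Edonkey/EMule"
    table[("udp", 4672)] = "EMule"
    table[("tcp", 6699)] = "WinMx"
    table[("udp", 6257)] = "WinMx"
    return table


P2P_PORTS = build_port_table()


def find_p2p_candidates(records: Iterable[Conn]) -> List[Tuple[Conn, str]]:
    """Return records whose source or destination port is a known default."""
    hits = []
    for rec in records:
        proto = rec.proto.lower()
        app = P2P_PORTS.get((proto, rec.dst_port)) or P2P_PORTS.get((proto, rec.src_port))
        if app is not None:
            hits.append((rec, app))
    return hits


if __name__ == "__main__":
    sample = [
        Conn("10.0.0.5", 51234, "203.0.113.9", 6881, "tcp"),
        Conn("10.0.0.6", 40000, "198.51.100.2", 80, "tcp"),
    ]
    for rec, app in find_p2p_candidates(sample):
        print(rec.src_ip, "->", rec.dst_ip, "possible", app)
```

A hit here is only a hint that the host may be running the listed application; any other service bound to the same port number would be flagged as well.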

Port matching is very simple in practice, but its limitations are obvious. Most P2P applications allow users to change the default port numbers by manually selecting whatever port(s) they like. Additionally, many newer P2P applications are more inclined to use random ports, making the ports unpredictable. There is also a trend for P2P applications to masquerade by running on well-known application ports such as port 80. All these issues make port based analysis less effective.

Protocol analysis

Given the poor results of simple port matching, an administrator has another choice: application layer protocol analysis.

With this approach, an application or piece of equipment monitors traffic passing through the network and inspects the data payload of the packets against previously defined P2P application signatures. Many of today's commercial and open source P2P application identification solutions are based on this approach, including the L7-filter, Cisco's PDML, Juniper's NetScreen-IDP, Alteon Application Switches, Microsoft common application signatures, and NetScout. They each do their detection work by running regular expression matches on the application layer data to determine whether a specific P2P application is being used.
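As a rough illustration of signature matching, the sketch below tests the start of a payload against two regular expressions. The patterns are simplified assumptions based on the well-known opening bytes of the BitTorrent and Gnutella handshakes; they are not the actual signatures shipped by any of the products named above, and the function name `classify_payload` is made up for this example.

```python
import re
from typing import Optional

# Simplified, illustrative signatures (assumptions, not vendor rules).
SIGNATURES = {
    # BitTorrent peer handshake: length byte 0x13 followed by the protocol string.
    "Bittorrent": re.compile(rb"^\x13BitTorrent protocol"),
    # Gnutella connection handshake, e.g. "GNUTELLA CONNECT/0.6".
    "Gnutella": re.compile(rb"^GNUTELLA CONNECT/\d+\.\d+"),
}


def classify_payload(payload: bytes) -> Optional[str]:
    """Return the first application whose signature matches, else None."""
    for app, pattern in SIGNATURES.items():
        if pattern.search(payload):
            return app
    return None


if __name__ == "__main__":
    print(classify_payload(b"\x13BitTorrent protocol" + b"\x00" * 8))
    print(classify_payload(b"GET / HTTP/1.1\r\n"))
```

Because the decision depends only on payload bytes, the port a client chooses does not matter, but the same property means an encrypted payload will match nothing.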

Because protocol analysis focuses on the packet payload and raises alerts only on a definite match, client-side tricks such as using non-default or dynamic ports will not help a P2P application avoid detection. The results of this approach are normally more accurate and believable, but it still has some shortcomings. Here are some points to remember with protocol analysis of P2P networks:

  • P2P applications are evolving continuously, and therefore signatures can change. Static signature based matching requires new signatures to be effective when these changes occur.
  • With more and more P2P identification and control products on the market, P2P developers tend to tunnel around any controls placed in their way. They could easily achieve this by encrypting the traffic, such as by using SSL, making protocol analysis much more difficult.
  • Signature-based identification means that the product must read and process all network traffic, which raises the issue of how to maintain network stability in a large network. The product may burden network equipment heavily or even cause network failures. If it works inline, what will you do when the product fails?
  • Signature-based identification at the application level (L7) is also highly resource-intensive. The higher the bandwidth of the network, the more money and resources you need to inspect it. If you have to inspect a 1 Gbit/s or even a 10 Gbit/s link, how much must you invest to get an appropriate product?
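To put the resource question in numbers, the short calculation below estimates the worst-case packet rate on a saturated link and the time available to inspect each packet. It assumes Ethernet with minimum-size 64-byte frames plus 20 bytes of preamble and inter-frame gap per frame; real traffic has larger average packets, so this is an upper bound rather than a typical figure. The function name is made up for the example.

```python
MIN_FRAME_BYTES = 64      # minimum Ethernet frame size
WIRE_OVERHEAD_BYTES = 20  # preamble + start delimiter (8) + inter-frame gap (12)


def worst_case_budget(link_bps: float) -> tuple:
    """Return (packets per second, nanoseconds per packet) at line rate."""
    bits_per_frame = (MIN_FRAME_BYTES + WIRE_OVERHEAD_BYTES) * 8
    pps = link_bps / bits_per_frame
    ns_per_packet = 1e9 / pps
    return pps, ns_per_packet


if __name__ == "__main__":
    for name, bps in (("1 Gbit/s", 1e9), ("10 Gbit/s", 10e9)):
        pps, ns = worst_case_budget(bps)
        print(f"{name}: {pps:,.0f} packets/s, {ns:.1f} ns per packet")
```

Under these assumptions a 1 Gbit/s link can carry roughly 1.49 million packets per second, leaving well under a microsecond to run every signature against each payload.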

Most importantly, if your organization cannot afford the special appliances or applications that perform protocol analysis, is port matching your only alternative? Fortunately, the answer is no. An approach based on traffic behavior patterns proves to be both functional and cost-effective.

Traffic behavior

Network traffic information can usually be retrieved easily from various network devices without significantly affecting network performance or service availability. For small or medium networks, administrators can rely on their gateway or perimeter equipment logs. For larger networks and ISPs, administrators can enable the NetFlow function on their routers or switches to export network traffic records.



SecurityFocus accepts Infocus article submissions from members of the security community. Articles are published based on outstanding merit and level of technical detail. Full submission guidelines can be found at http://www.securityfocus.com/static/submissions.html.
Copyright 2008, SecurityFocus