Classifying ethernet data packets based on raw bit patterns
Conference paper   Open access

W.D. Kenworthy
2010 Third International Conference on Knowledge Discovery and Data Mining, pp.490-493
IEEE
3rd International Conference on Knowledge Discovery and Data Mining (Phuket, 09/01/2010–10/01/2010)
2010
Published (Version of Record) Open Access

Abstract

Currently, most operations on network data packets are governed by the applicable protocols, such as TCP/IP. However, there is scope to examine and classify the data without processing it through a protocol stack. To do this, use can be made of the sophisticated algorithms developed for the analysis of biological and genomic data, exploiting similarities in the way information is stored in biological structures and in network data traffic. It can be shown that network data flows have many of the same structural characteristics as biological DNA: areas of conservation (an area of data that has the same composition as an area in another packet will often have similar functionality), "motifs" with particular functions, and the equivalent of "junk DNA", areas where seemingly random changes occur. This paper looks at the novel application of algorithms designed to process DNA data to analyse and classify Ethernet network data packets based on the patterns discernible in the data, rather than the more traditional method of matching fixed fields within the data based on protocol specifications. We show that these algorithms can successfully and accurately classify packets into groups whose members have similar characteristics, based on actual content rather than metadata. This provides a unique and useful method of grouping and classifying packets that could be of use in diverse applications, such as intrusion detection systems (IDS) and the search for, and identification of, specific types of data.
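
The abstract does not name the specific DNA algorithms used, so the following is only a minimal Python sketch of the general idea, under stated assumptions. It assumes a hypothetical two-bits-per-letter mapping of raw packet bytes onto a four-letter nucleotide alphabet, and it substitutes a simple k-mer Jaccard similarity for whichever sequence-analysis method the paper actually applies. The function names, the value of k and the grouping threshold are all illustrative choices, not taken from the paper.

```python
# Hypothetical mapping of bit pairs to letters: 00->A, 01->C, 10->G, 11->T.
# This encoding is an assumption for illustration; the paper may use another.
NUCLEOTIDES = "ACGT"


def bytes_to_dna(payload: bytes) -> str:
    """Encode each byte as four letters, two bits per letter, high bits first."""
    return "".join(
        NUCLEOTIDES[(byte >> shift) & 0b11]
        for byte in payload
        for shift in (6, 4, 2, 0)
    )


def kmers(seq: str, k: int = 8) -> set:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a: bytes, b: bytes, k: int = 8) -> float:
    """Jaccard similarity of the k-mer sets of two packets (0.0 to 1.0)."""
    ka, kb = kmers(bytes_to_dna(a), k), kmers(bytes_to_dna(b), k)
    if not ka and not kb:
        return 1.0  # two packets too short to yield k-mers count as identical
    return len(ka & kb) / len(ka | kb)


def group_packets(packets, threshold: float = 0.3, k: int = 8):
    """Greedy grouping: a packet joins the first group whose first member
    is at least `threshold` similar to it, otherwise it starts a new group.
    Returns a list of groups, each a list of packet indices."""
    groups = []
    for idx, pkt in enumerate(packets):
        for group in groups:
            if similarity(packets[group[0]], pkt, k) >= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups


if __name__ == "__main__":
    sample = [
        b"GET /index.html HTTP/1.1",
        b"GET /about.html HTTP/1.1",
        bytes(range(200, 224)),  # unrelated binary content
    ]
    print(group_packets(sample))
```

Here the two packets that share long conserved regions fall into one group and the unrelated payload into another, without any protocol field being parsed. A fuller treatment would likely replace the k-mer comparison with an alignment or motif-finding method from the bioinformatics literature.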
