56IEEECOMMUNICATIONSSURVEYS&TUTORIALS,VOL.10,NO.4,FOURTHQUARTER2008 ASurveyofTechniquesforInternetTraffic ClassificationusingMachineLearning ThuyT.T.NguyenandGrenvilleArmitage Abstract—TheresearchcommunityhasbegunlookingforIP trafficclassificationtechniquesthatdonotrelyon‘wellknown’ TCPorUDPportnumbers,orinterpretingthecontentsofpacket payloads.Newworkisemergingontheuseofstatisticaltraffic characteristicstoassistintheidentificationandclassification process.Thissurveypaperlooksatemergingresearchintothe applicationofMachineLearning(ML)techniquestoIPtraffic classification-aninter-disciplinaryblendofIPnetworkingand dataminingtechniques.Weprovidecontextandmotivationfor theapplicationofMLtechniquestoIPtrafficclassification, andreview18significantworksthatcoverthedominantperiod from2004toearly2007.Theseworksarecategorizedand reviewedaccordingtotheirchoiceofMLstrategiesandprimary contributionstotheliterature.Wealsodiscussanumberofkey requirementsfortheemploymentofML-basedtrafficclassifiers inoperationalIPnetworks,andqualitativelycritiquetheextent towhichthereviewedworksmeettheserequirements.Open issuesandchallengesinthefieldarealsodiscussed. IndexTerms—Trafficclassification,InternetProtocol,Machine Learning,RealTime,Payloadinspection,Flowclustering,Sta- tisticaltrafficproperties. I.INTRODUCTION REAL-TIMEtrafficclassificationhasthepotentialto solvedifficultnetworkmanagementproblemsforIn- ternetserviceproviders(ISPs)andtheirequipmentvendors. Networkoperatorsneedtoknowwhatisflowingovertheir networkspromptlysotheycanreactquicklyinsupportof theirvariousbusinessgoals.Trafficclassificationmaybea corepartofautomatedintrusiondetectionsystems[1][2] [3],usedtodetectpatternsindicativeofdenialofservice attacks,triggerautomatedre-allocationofnetworkresources forprioritycustomers[4],oridentifycustomeruseofnetwork resourcesthatinsomewaycontravenestheoperator’sterms ofservice.Morerecently,governmentsarealsoclarifying ISPobligationswithrespectto‘lawfulinterception’(LI)of IPdatatraffic[5].Justastelephonecompaniesmustsupport interceptionoftelephoneusage,ISPsareincreasinglysubject togovernmentrequestsforinformationonnetworkuseby particularindividualsatparticularpointsintime.IPtraffic classificationisanintegralpartofISP-basedLIsolutions. CommonlydeployedIPtrafficclassificationtechniqueshave beenbasedarounddirectinspectionofeachpacket’scontents atsomepointonthenetwork.SuccessiveIPpacketshaving thesame5-tupleofprotocoltype,sourceaddress:portand destinationaddress:portareconsideredtobelongtoa flow ManuscriptreceivedApril23,2007;revisedSeptember5,2007. TheauthorsarewiththeCentreforAdvancedInternetArchitec- tures,SwinburneUniversityofTechnology,Melbourne,Australia(e-mail: [email protected],[email protected]). DigitalObjectIdentifier10.1109/SURV.2008.080406. whosecontrollingapplicationwewishtodetermine.Simple classificationinfersthecontrollingapplication’sidentityby assumingthatmostapplicationsconsistentlyuse‘wellknown’ TCPorUDPportnumbers(visibleintheTCPorUDPhead- ers).However,manyapplicationsareincreasinglyusingunpre- dictable(oratleastobscure)portnumbers[6].Consequently, moresophisticatedclassificationtechniquesinferapplication typebylookingforapplication-specificdata(orwell-known protocolbehavior)withintheTCPorUDPpayloads[7]. Unfortunately,theeffectivenessofsuch‘deeppacketin- spection’techniquesisdiminishing.Suchpacketinspection reliesontworelatedassumptions: • Thirdpartiesunaffiliatedwitheithersourceorrecipient areabletoinspecteachIPpacket’spayload(i.e.isthe payloadvisible) • Theclassifierknowsthesyntaxofeachapplication’s packetpayloads(i.e.canthepayloadbeinterpreted) Twoemergingchallengesunderminethefirstassumption- customersmayuseencryptiontoobfuscatepacketcontents (includingTCPorUDPportnumbers),andgovernments mayimposeprivacyregulationsconstrainingtheabilityof thirdpartiestolawfullyinspectpayloadsatall.Thesecond assumptionimposesaheavyoperationalload-commercial deviceswillneedrepeatedupdatestostayaheadofregular (orsimplygratuitous)changesineveryapplication’spacket payloadformats. Theresearchcommunityhasrespondedbyinvestigating classificationschemescapableof inferringapplication-level usagepatternswithoutdeepinspectionofpacketpayloads. Newerapproachesclassifytrafficbyrecognisingstatistical patternsinexternallyobservableattributesofthetraffic(such astypicalpacketlengthsandinter-packetarrivaltimes).Their ultimategoaliseitherclusteringIPtrafficflowsintogroups thathavesimilartrafficpatterns,orclassifyingoneormore applicationsofinterest. Anumberofresearchersarelookingparticularlycloselyat theapplicationofMachineLearning(ML)techniques(asub- setofthewiderArtificialIntelligencediscipline)toIPtraffic classification.TheapplicationofMLtechniquesinvolvesa numberofsteps.First, features aredefinedbywhichfutureun- knownIPtrafficmaybeidentifiedanddifferentiated.Features areattributesofflowscalculatedovermultiplepackets(such asmaximumorminimumpacketlengthsineachdirection, flowdurationsorinter-packetarrivaltimes).ThentheML classifieristrainedtoassociatesetsoffeatureswithknown trafficclasses(creatingrules),andapplytheMLalgorithm toclassifyunknowntrafficusingpreviouslylearnedrules. EveryMLalgorithmhasadifferentapproachtosortingand 1553-877X/08/$25.00 c  2008IEEENGUYENandARMITAGE:ASURVEYOFTECHNIQUESFORINTERNETTRAFFICCLASSIFICATIONUSINGMACHINELEARNING57 prioritisingsetsoffeatures,whichleadstodifferentdynamic behaviorsduringtrainingandclassification.Inthispaperwe providetherationaleforIPtrafficclassificationinIPnetworks, reviewthestate-of-the-artapproachestotrafficclassification, andthenreviewandcritiqueemergingML-basedtechniques forIPtrafficclassification. Therestofthispaperisorganisedasfollows.SectionII outlinestheimportanceofIPtrafficclassificationinopera- tionalnetworks,introducesanumberofmetricsforassess- ingclassificationaccuracy,anddiscussesthelimitationsof traditionalport-andpayload-basedclassification.SectionIII providesbackgroundinformationaboutMLandhowitcanbe appliedinIPtrafficclassification.Thesectionalsodiscussesa numberofkeyrequirementsfortheemploymentofML-based classifiersinoperationalIPnetworks.SectionIVreviewsthe significantworksinthisfield(predominantlyfrom2004to early2007).Thesectioniswrappedupwithaqualitative discussionoftheextenttowhichthereviewedworksmeet therequirementsspecifiedinsectionIII.SectionVconcludes thepaperwithsomefinalremarksandsuggestionsofpossible futurework. II.APPLICATION CONTEXTFOR MACHINE LEARNING BASED IPTRAFFIC CLASSIFICATION A.TheimportanceofIPtrafficclassification TheimportanceofIPtrafficclassificationmaybeillustrated bybrieflyreviewingtoimportantareas-IPqualityofservice (QoS)schemes,andlawfulinterception(LI). Inrespondingtothenetworkcongestionproblem,acom- monstrategyfornetworkprovidersisunder-utilising(over- provisioning)thelinkcapacity.However,thisisnotnecessar- ilyaneconomicsolutionformostISPs.Ontheotherhand, thedevelopmentofotherQoSsolutionssuchasIntServ[8] orDiffServ[9]hasbeenstymiedinpartduetothelack ofQoSsignalingandofaneffectiveservicepricingmech- anism(assuggestedin[10]and[11]).Signalingallowsthe communicationofspecificQoSrequirementsbetweenInternet applicationsandthenetwork.Apricingmechanismisneeded todifferentiatecustomerswith differentneedsandchargefor theQoSthattheyreceive.Italsoactsasacostrecovery mechanismandprovidesrevenuegenerationfortheISPsto compensatefortheireffortsinprovidingQoSandmanaging resourceallocation. AllQoSschemeshavesomedegreeofIPtrafficclassifi- cationimplicitintheirdesign.DiffServassumesthatedge routerscanrecogniseanddifferentiatebetweenaggregate classesoftrafficinordertosettheDiffServcodepoint(DSCP) onpacketsenteringthenetworkcore.IntServpresumesthat routersalongapathareabletodifferentiatebetweenfinely grainedtrafficclasses(andhistoricallyhaspresumedtheuse ofpacketheaderinspectionto achievethisgoal).Trafficclas- sificationalsohasthepotentialtosupportclass-basedInternet QoScharging.Furthermore,real-timetrafficclassificationis thecorecomponentofemergingQoS-enabledproducts[12] andautomatedQoSarchitectures[4][13]. Trafficclassificationisalsoanimportantsolutionforthe emergingrequirementthatISPnetworkshavetoprovideLI capabilities.GovernmentstypicallyimplementLIatvarious Fig.1.EvaluationMetrics levelsofabstraction.Inthetelephonyworldalawenforcement agencymaynominatea‘personofinterest’andissueawarrant forthecollectionofinterceptinformation.Theinterceptmay behigh-levelcallrecords(whocalledwhoandwhen)or low-level‘tapping’oftheaudiofromactualphonecallsin progress.IntheISPspace,trafficclassificationtechniques offerthepossibilityofidentifyingtrafficpatterns(which endpointsareexchangingpacketsandwhen),andidentifying whatclassesofapplicationsarebeingusedbya‘person ofinterest’atanygivenpointintime.Dependingonthe particulartrafficclassificationscheme,thisinformationmay potentiallybeobtainedwithoutviolatinganyprivacylaws coveringtheTCPorUDPpayloadsoftheISPcustomer’s traffic. B.Trafficclassificationmetrics Akeycriteriononwhichtodifferentiatebetweenclassifi- cationtechniquesispredictiveaccuracy(i.e.,howaccurately thetechniqueormodelmakesdecisionswhenpresentedwith previouslyunseendata).Anumberofmetricsexistwithwhich toexpresspredictiveaccuracy. 1)Positives,negatives,accuracy,precisionandrecall: AssumethereisatrafficclassXinwhichweareinterested, mixedinwithabroadersetofIPtraffic.Atrafficclassifieris beingusedtoidentify(classify)packets(orflowsofpackets) belongingtoclassXwhenpresentedwithamixtureof previouslyunseentraffic.Theclassifierispresumedtogive oneoftwooutputs-aflow(orpacket)isbelievedtobea memberofclassX,oritisnot. Acommonwaytocharacterizeaclassifier’saccuracyis throughmetricsknownas FalsePositives, FalseNegatives, TruePositives and TrueNegatives.Thesemetricsaredefined asfollows: • FalseNegatives (FN):PercentageofmembersofclassX incorrectlyclassifiedasnotbelongingtoclassX. • FalsePositives (FP):Percentageofmembersofother classesincorrectlyclassifiedasbelongingtoclassX. • TruePositives (TP):PercentageofmembersofclassX correctlyclassifiedasbelongingtoclassX(equivalentto 100%-FN). • TrueNegatives (TN):Percentageofmembersofother classescorrectlyclassifiedasnotbelongingtoclassX (equivalentto100%-FP). Figure1illustratestherelationshipsbetweenFN,FP,TP andTN.AgoodtrafficclassifieraimstominimisetheFalse NegativesandFalsePositives. Someworksmakeuseof Accuracy asanevaluationmetric. Itisgenerallydefinedasthepercentageofcorrectlyclassified58IEEECOMMUNICATIONSSURVEYS&TUTORIALS,VOL.10,NO.4,FOURTHQUARTER2008 instancesamongthetotalnumberofinstances.Thisdefinition isusedthroughoutthepaperunlessotherwisestated. MLliteratureoftenutilisestwoadditionalmetricsknown as Recall and Precision.Thesemetricsaredefinedasfollows: • Recall:PercentageofmembersofclassXcorrectly classifiedasbelongingtoclassX. • Precision:Percentageofthoseinstancesthattrulyhave classX,amongallthoseclassifiedasclassX. Ifallmetricsareconsideredtorangefrom0(bad)to100% (optimal)itcanbeseenthatRecallisequivalenttoTP. 2)ByteandFlowaccuracy: Whencomparingliteratureon differentclassificationtechniquesitisalsoimportanttonote theunitoftheauthor’schosenmetric.Recall,Precision,FN andFPmayallbereportedaspercentagesofbytesorflows relativetothetrafficbeingclassified.Anauthor’schoicehere cansignificantlyalterthemeaningoftheirreportedaccuracy results. Mostrecentlypublishedtrafficclassificationstudieshave focusedon flowaccuracy -measuringtheaccuracywithwhich flowsarecorrectlyclassified,relativetothenumberofother flowsintheauthor’stestand/ortrainingdataset(s).However, somerecentworkhasalsochosentoexpresstheiraccuracy calculationsintermsof byteaccuracy -focusingmoreonhow manybytesarecarriedbythepacketsofcorrectlyclassified flows,relativetothetotalnumberofbytesintheauthor’stest and/ortrainingdataset(s)(e.g.[14][15]). Ermanetal.in[16]arguethatbyteaccuracyiscrucialwhen evaluatingtheaccuracyoftrafficclassificationalgorithms. TheynotethatthemajorityofflowsontheInternetaresmall andaccountforonlyasmallportionoftotalbytesandpackets inthenetwork(mice flows).Ontheotherhand,themajority ofthetrafficbytesaregeneratedbyasmallnumberoflarge flows(elephant flows).Theygiveanexamplefroma6-month datatracewherethetop(largest)1%offlowsaccountfor over73%ofthetrafficintermsofbytes.Withathreshold todifferentiateelephantandmiceflowsof3.7MB,thetop 0.1%offlowswouldaccountfor46%ofthetraffic(inbytes). Presentedwithsuchadataset,aclassifieroptimisedtoidentify allbutthetop0.1%oftheflowscouldattaina99.9%flow accuracybutstillresultin46%ofthebytesinthedatasetto bemisclassified. Whetherflowaccuracyorbyteaccuracyismoreimportant willgenerallydependontheclassifier’sintendeduse.For example,whenclassifyingtrafficforIPQoSpurposesitis plausiblethatidentifyingeveryinstanceofashortlivedflow needingQoS(suchasa5minute,32Kbit/secphonecalls)isas importantasidentifyinglonglivedflowsneedingQoS(such asa30minute,256Kbit/secvideoconference)withbothbeing farmoreimportanttocorrectlyidentifythanthefewflowsthat representmulti-hour(and/orhundredsofmegabytes)peerto peerfilesharingsessions.Conversely,anISPdoinganalysis ofloadpatternsontheirnetworkmaywellbesignificantly interestedincorrectlyclassifyingtheapplicationsdrivingthe elephantflowsthatcontributeadisproportionatenumberof packetsacrosstheirnetwork. C.Limitationsofpacketinspectionfortrafficclassification TraditionalIPtrafficclassificationreliesontheinspection ofapacket’sTCPorUDPportnumbers(portbasedclas- sification),orthereconstructionofprotocolsignaturesinits payload(payloadbasedclassification).Eachapproachsuffers fromanumberoflimitations. 1)PortbasedIPtrafficclassification: TCPandUDPpro- videforthemultiplexingofmultipleflowsbetweencommon IPendpointsthroughtheuseofportnumbers.Historically manyapplicationsutilisea‘wellknown’portontheirlocal hostasarendezvouspointtowhichotherhostsmayinitiate communication.Aclassifiersittinginthemiddleofanetwork needonlylookforTCPSYNpackets(thefirststepinTCP’s three-wayhandshakeduringsessionestablishment)toknow theserversideofanewclient-serverTCPconnection.The applicationistheninferredbylookinguptheTCPSYN packet’stargetportnumberintheInternetAssignedNumbers Authority(IANA)’slistofregisteredports[17].UDPuses portssimilarly(thoughwithoutconnectionestablishmentnor themaintenanceofconnectionstate). However,thisapproachhaslimitations.Firstly,someap- plicationsmaynothavetheirportsregisteredwithIANA (forexample,peertopeerapplicationssuchasNapsterand Kazaa)[18].Anapplicationmayuseportsotherthanits well-knownportstoavoidoperatingsystemaccesscontrol restrictions(forexample,non-privilegedusersonunix-like systemsmaybeforcedtorunHTTPserversonportsother thanport80.)Also,insomecasesserverportsaredynamically allocatedasneeded.Forexample,theRealVideostreamer allowsthedynamicnegotiationoftheserverportusedfor thedatatransfer.Thisserverportisnegotiatedonaninitial TCPconnection,whichisestablishedusingthewell-known RealVideocontrolport[19]. MooreandPapagiannaki[20]observednobetterthan70% byteaccuracyforport-basedclassificationusingtheofficial IANAlist.MadhukarandWilliamson[21]showedthatport- basedanalysisisunabletoidentify30-70%ofInternettraffic flowstheyinvestigated.Senetal.[7]reportedthatthedefault portaccountedforonly30%ofthetotaltraffic(inbytes)for theKazaaP2Pprotocol. InsomecircumstancesIPlayerencryptionmayalsoobfus- catetheTCPorUDPheader,makingitimpossibletoknow theactualportnumbers. 2)PayloadbasedIPtrafficclassification: Toavoidtotal relianceonthesemanticsofportnumbers,manycurrent industryproductsutilisestatefulreconstructionofsessionand applicationinformation fromeachpacket’scontent. Senetal.[7]showedthatpayloadbasedclassification ofP2Ptraffic(byexaminingthesignaturesofthetrafficat theapplicationlevel)couldreducefalsepositivesandfalse negativesto5%oftotalbytesformostP2Pprotocolsstudied. MooreandPapagiannaki[20]useacombinationofport andpayloadbasedtechniquestoidentifynetworkapplications. Theclassificationprocedurestartswiththeexaminationofa flow’sportnumber.Ifnowell-knownportisused,theflow ispassedthroughtothenextstage.Inthesecondstage,the firstpacketisexaminedtoseewhetheritcontainsaknown signature.Ifoneisnotfound,thenthepacketisexaminedto seewhetheritcontainsawell-knownprotocol.Ifthesetests fail,theprotocolsignaturesinthefirstKByteoftheflow arestudied.Flowsremainingunclassifiedafterthatstagewill requireinspectionoftheentireflowpayload.TheirresultsNGUYENandARMITAGE:ASURVEYOFTECHNIQUESFORINTERNETTRAFFICCLASSIFICATIONUSINGMACHINELEARNING59 showthatportinformationby itselfiscapableofcorrectly classifying69%ofthetotalbytes.Includingtheinformation observedinthefirstKByteofeachflowincreasestheaccuracy toalmost79%.Higheraccuracy(uptonearly100%)canonly beachievedbyinvestigatingtheremainingunclassifiedflows’ entirepayload. Althoughpayloadbasedinspectionavoidsrelianceonfixed portnumbers,itimposessignificantcomplexityandprocessing loadonthetrafficidentificationdevice.Itmustbekeptup- to-datewithextensiveknowledgeofapplicationprotocolse- mantics,andmustbepowerfulenoughtoperformconcurrent analysisofapotentiallylargenumberofflows.Thisapproach canbedifficultorimpossiblewhendealingwithproprietary protocolsorencryptedtraffic.Furthermoredirectanalysisof sessionandapplicationlayercontentmayrepresentabreachof organisationalprivacypoliciesorviolationofrelevantprivacy legislation. D.Classificationbasedonstatisticaltrafficproperties Theprecedingtechniquesarelimitedbytheirdependence ontheinferredsemanticsoftheinformationgatheredthrough deepinspectionofpacketcontent(payloadandportnumbers). Newerapproachesrelyontraffic’sstatisticalcharacteristics toidentifytheapplication.Anassumptionunderlyingsuch methodsisthattrafficatthenetworklayerhasstatistical properties(suchasthedistributionofflowduration,flowidle time,packetinter-arrivaltimeandpacketlengths)thatare uniqueforcertainclassesofapplicationsandenabledifferent sourceapplicationstobedistinguishedfromeachother. Therelationshipbetweentheclassoftrafficanditsobserved statisticalpropertieshasbeennotedin[22](wheretheauthors analysedandconstructedempiricalmodelsofconnection characteristics-suchasbytes,duration,arrivalperiodicity -foranumberofspecificTCPapplications),andin[23] (wheretheauthorsanalysedInternetchatsystemsbyfocusing onthecharacteristicsofthetrafficintermsofflowduration, packetinter-arrivaltimeand packetsizeandbyteprofile). Laterwork(forexample[24][25]and[26])alsoobserved distinctivetrafficcharacteristics,suchasthedistributionsof packetlengthsandpacketinter-arrivaltimes,foranumber ofInternetapplications.Theresultsoftheseworkshave stimulatednewclassificationtechniquesbasedontrafficflow statisticalproperties.Theneedtodealwithtrafficpatterns, largedatasetsandmulti-dimensionalspacesofflowandpacket attributesisoneofthereasonsfortheintroductionofML techniquesinthisfield. III.BACKGROUNDON MACHINE LEARNINGANDTHE APPLICATIONOF MACHINE LEARNINGIN IPTRAFFIC CLASSIFICATION ThissectionsummariesthebasicconceptsofMachine Learning(ML)andoutlineshowMLcanbeappliedtoIP trafficclassification. A.AreviewofclassificationwithMachineLearning MLhashistoricallybeenknownasacollectionofpowerful techniquesfordataminingandknowledgediscovery,which searchforanddescribeusefulstructuralpatternsindata. In1992Shi[27]noted ‘Oneofthedefiningfeaturesof intelligenceistheabilitytolearn. [...].Machinelearningis thestudyofmakingmachinesacquirenewknowledge,new skills,andreorganiseexistingknowledge.’ Alearningmachine hastheabilitytolearnautomaticallyfromexperienceand refineandimproveitsknowledgebase.In1983Simonnoted ‘Learningdenoteschangesinthesystemthatareadaptivein thesensethattheyenablethesystemtodothesametaskor tasksdrawnfromthesamepopulationmoreefficientlyand moreeffectivelythenexttime’ [28]andin2000Wittenand Frankobserved ‘Thingslearnwhentheychangetheirbehavior inawaythatmakesthemperformbetterinthefuture’ [29]. MLhasawiderangeofapplications,includingsearch engines,medicaldiagnosis,textandhandwritingrecognition, imagescreening,loadforecasting,marketingandsalesdi- agnosis,andsoon.AnetworktrafficcontrollerusingML techniqueswasproposedin1990,aimingtomaximisecall completioninacircuit-switchedtelecommunicationsnetwork [30];thiswasoneoftheworksthatmarkedthepointat whichMLtechniquesexpandedtheirapplicationspaceinto thetelecommunicationsnetworkingfield.In1994MLwas firstutilisedforInternetflowclassificationinthecontextof intrusiondetection[31].Itisthestartingpointformuchof theworkusingMLtechniquesinInternettrafficclassification thatfollows. 1)InputandoutputofaMLprocess: Broadlyspeaking, MListheprocessoffindinganddescribingstructuralpatterns inasupplieddataset. MLtakesinputintheformofa dataset of instances (also knownas examples).Aninstancereferstoanindividual,inde- pendentexampleofthedataset.Eachinstanceischaracterised bythevaluesofits features (alsoknownas attributes or discriminators)thatmeasuredifferentaspectsoftheinstance. (Inthenetworkingfieldconsecutivepacketsfromthesame flowmightformaninstance,whilethesetoffeaturesmight includemedianinter-packetarrivaltimesorstandarddeviation ofpacketlengthsoveranumberofconsecutivepacketsin aflow.)Thedatasetisultimatelypresentedasamatrixof instancesversusfeatures[29]. Theoutputisthedescriptionoftheknowledgethathas beenlearnt.Howthespecificoutcomeofthelearningprocess isrepresented(thesyntaxandsemantics)dependslargelyon theparticularMLapproachbeingused. 2)Differenttypesoflearning: WittenandFrank[29]define fourbasictypesoflearning: • Classification(or supervisedlearning) • Clustering(or unsupervisedlearning) • Association • Numericprediction Classificationlearninginvolvesamachinelearningfroma setofpre-classified(alsocalledpre-labeled)examples,from whichitbuildsasetofclassificationrules(a model)toclassify unseenexamples.Clusteringisthegroupingofinstancesthat havesimilarcharacteristicsintoclusters,withoutanyprior guidance.Inassociationlearning,anyassociationbetween featuresissought.Innumericprediction,theoutcometobe predictedisnotadiscreteclassbutanumericquantity.60IEEECOMMUNICATIONSSURVEYS&TUTORIALS,VOL.10,NO.4,FOURTHQUARTER2008 MostMLtechniquesusedforIPtrafficclassificationfocus ontheuseofsupervisedandunsupervisedlearning. 3)SupervisedLearning: Supervisedlearningcreates knowledgestructuresthatsupportthetaskofclassifyingnew instancesintopre-definedclasses[32].Thelearningmachine isprovidedwithacollectionofsampleinstances,pre-classified intoclasses.Outputofthelearningprocessisaclassification modelthatisconstructedbyexaminingandgeneralisingfrom theprovidedinstances. Ineffect,supervisedlearningfocusesonmodelingthe input/outputrelationships.Itsgoalistoidentifyamapping frominputfeaturestoanoutputclass.Theknowledgelearnt (e.g.commonalitiesamongmembersofthesameclassand differencesbetweencompetingones)canbepresentedasa flowchart,adecisiontree,classificationrules,etc.,thatcanbe usedlatertoclassifyanewunseeninstance. Therearetwomajorphases(steps)insupervisedlearning: • Training:Thelearningphasethatexaminestheprovided data(calledthetrainingdataset)andconstructs(builds) aclassificationmodel. • Testing (alsoknownasclassifying):Themodelthathas beenbuiltinthetrainingphaseisusedtoclassifynew unseeninstances. Forexample,letTSbeatrainingdataset,thatisasetof input/outputpairs, TS= {,,...,} wherexi isthevectorofvaluesoftheinputfeatures correspondingtothei th instance,andyi isitsoutputclass value.(Thename supervisedlearning comesfromthefact thattheoutputclassesarepre-definedinthetrainingdataset.) Thegoalofclassificationcanbeformulatedasfollows:From atrainingdatasetTS,findafunctionf(x)oftheinputfeatures thatbestpredictstheoutcomeoftheoutputclassyforany newunseenvaluesofx.Theoutputtakesitsvalueinadiscrete set {y1,y2,...,yM} thatconsistsofallthepre-definedclass values.Thefunctionf(x)isthecoreoftheclassificationmodel. Themodelcreatedduringtrainingisimprovedifwesi- multaneouslyprovideexamplesofinstancesthatbelongto class(es)ofinterestandinstancesknownto not bemembers oftheclass(es)ofinterest.Thiswillenhancethemodel’slater abilitytoidentifyinstancesbelongingtoclass(es)ofinterest. Thereexistanumberofsupervisedlearningclassification algorithms,eachdifferingmainlyinthewaytheclassification modelisconstructedandwhatoptimizationalgorithmisused tosearchforagoodmodel.(Examplesincludethesupervised DecisionTreeandNaiveBayesclassificationalgorithms[33] [29].) 4)Clustering: Classificationtechniquesusepre-defined classesoftraininginstances.Incontrast,clusteringmethods arenotprovidedwiththisguidance;instead,theydiscovernat- uralclusters(groups)inthedatausinginternalisedheuristics [33]. Clusteringfocusesonfindingpatternsintheinputdata.It clustersinstanceswithsimilarproperties(definedbyaspecific distancemeasuringapproach,suchasEuclideanspace)into groups.Thegroupsthataresoidentifiedmaybeexclusive,so thatanyinstancebelongsinonlyonegroup;ortheymaybe overlapping,whenoneinstancemayfallintoseveralgroups; theymayalsobeprobabilistic,thatisaninstancebelongsto agroupwithacertainprobability.Theymaybehierarchical, wherethereisadivisionofinstancesintogroupsatthetop level,andtheneachofthesegroupsisrefinedfurther-even downtothelevelofindividualinstances[29]. Therearethreebasicclusteringmethods:theclassick- meansalgorithm,incrementalclustering,andtheprobability- basedclusteringmethod.Theclassick-meansalgorithmforms clustersinnumericdomains,partitioninginstancesintodis- jointclusters,whileincrementalclusteringgeneratesahierar- chicalgroupingofinstances.Theprobability-basedmethods assigninstancestoclassesprobabilistically,notdeterministi- cally[29]. 5)Evaluatingsupervisedlearningalgorithms: AgoodML classifierwouldoptimiseRecallandPrecision.However,there maybetrade-offsbetweenthesemetrics.Todecidewhich oneismoreimportantorshouldbegivenhigherpriorityone needstotakeintoaccountthecostofmakingwrongdecisions orwrongclassifications.Thedecisiondependsonaspecific applicationcontextandonescommercialand/oroperational priorities. Varioustoolsexisttosupportthistrade-offprocess.The receiveroperatingcharacteristic (ROC)curveprovidesaway tovisualizethetrade-offsbetweenTPandFPbyplotting thenumberofTPasafunctionofthenumberofFP(both expressedaspercentageofthetotalTPandFPrespectively). Ithasbeenfoundusefulinanalysinghowclassifiersperform overarangeofthresholdsettings[29].AnotheristheNeyman- Pearsoncriterion[34],whichattemptstomaximizeTPgiven afixedthresholdonFP[35]. MostofIPclassificationworkreviewedlaterinthissurvey donotaddressthecostsoftradingbetweenRecalland Precision. Achallengewhenusingsupervisedlearningalgorithmsis thatboththetrainingandtestingphasesmustbeperformed usingdatasetsthathavebeenpreviouslyclassified(labeled) 1 .Ideallyonewouldhavealargetrainingset(foroptimal learningandcreationofmodels)andalarge,yetindependent, testingdatasettoproperlyassessthealgorithm’sperformance. (Testingonthetrainingdatasetisbroadlymisleading.Such testingwillusuallyonlyshowthattheconstructedmodel isgoodatrecognisingtheinstancesfromwhichitwas constructed.) Intherealworldweareoftenfacedwithalimitedquantity ofpre-labeleddatasets.Asimpleprocedure(sometimesknown as holdout [29])involvessettingasidesomepart(e.g.two thirds)ofthepre-labeleddatasetfortrainingandtherest(e.g. onethird)fortesting. Inpracticewhenonlysmallorlimiteddatasetsareavailable avariantofholdout,called N-foldcross-validation,ismost commonlyused.ThedatasetisfirstsplitintoNapproximately equalpartitions(orfolds).Eachpartition(1/N)inturnisthen usedfortesting,whiletheremainder((N − 1)/N)areused fortraining.TheprocedureisrepeatedNtimessothatinthe end,everyinstancehasbeenusedexactlyoncefortesting.The overallRecallandPrecisionarecalculatedfromtheaverage 1Inthiscontext’labeling’istheprocessofclassifyingthemembersofa datasetusingmanual(human)inspectionoranirrefutableautomatedprocess. Incontrasttoacontrolledtrainingandtestingenvironment,operational classifiersdonothaveaccesstopreviouslylabeledexampleflows.NGUYENandARMITAGE:ASURVEYOFTECHNIQUESFORINTERNETTRAFFICCLASSIFICATIONUSINGMACHINELEARNING61 oftheRecallsandPrecisionsmeasuredfromallNtests.Ithas beenclaimedthatN=10(tenfoldcross-validation)provides agoodestimateofclassificationperformance[29]. SimplypartitioningthefulldatasetNwaysdoesnotguaran- teethatthesameproportionisusedforanygivenclasswithin thedataset.Afurtherstep,knownas stratification,isusually applied-randomlysamplingthedatasetinsuchawaythat eachclassisproperlyrepresentedinbothtrainingandtesting datasets.Whenstratificationisusedincombinationwithcross- validation,itiscalled stratifiedcross-validation.Itiscommon tousestratifiedten-foldcross-validationwhenonlylimited pre-labeleddatasetsareavailable. 6)Evaluatingunsupervisedlearningalgorithms: While RecallandPrecisionarecommonmetricstoevaluateclas- sificationalgorithms,evaluatingclusteringalgorithmsismore complicated.Thereareintermediatestepsinevaluatingthe resultingclustersbeforelabelingthemorgeneratingrulesfor futureclassification.Givenadataset,aclusteringalgorithm canalwaysgenerateadivision,withitsownfindingof structurewithinthedata.Differentapproachescanleadto differentclusters,andevenfor thesamealgorithm,different parametersordifferentorderofinputpatternsmightalterthe finalresults[36][37]. Therefore,itisimportanttohaveeffectiveevaluationstan- dardsandcriteriatoprovidetheuserswithacertainlevelof confidenceinresultsgeneratedbyaparticularalgorithm,or comparisonsofdifferentalgorithms[38].Criteriashouldhelp toanswerusefulquestionssuchashowmanyclustersare hiddeninthedata,whataretheoptimalnumberofclusters [37],whethertheresultedclustersaremeaningfulorjustan artifactofthealgorithms[38],howonealgorithmperforms comparedtoanother:howeasytheyaretouse,howfastit istobeemployed[36],whatistheintra-clusterquality,how goodisinter-clusterseparation,whatisthecostoflabelingthe clustersandwhataretherequirementsintermsofcomputer computationandstorage. Halkidietal.[37]identifythreeapproachestoinvestigate clustervalidity:externalcriteria,internalcriteriaandrelative criteria.Thefirsttwoapproachesarebasedonstatistical hypothesistesting.Externalcriteriaarebasedonsomepre- specifiedstructure,whichisknownaspriorinformationon thedata,andusedasastandardtocompareandvalidate theclusteringresults[38].Internalcriteriaapproachevaluates clusteringresultofanalgorithmbasedonexaminingthe internalstructureinheritedfromthedataset.Relativecriteria emphasisesfindingthebestclusteringschemethataclustering algorithmcandefineundercertainassumptionsandparame- ters.Thebasicideaistoevaluateaclusteringstructureby comparingittotheonesusingthesamealgorithmbutwith differentparametervalues[39].(Moredetailscanbefoundin [37][38][36][29].) 7)Featureselectionalgorithms: KeytobuildingaML classifierisidentificationofthesmallestnecessarysetof featuresrequiredtoachieveone’saccuracygoals-aprocess knownas featureselection. Thequalityofthefeaturesetiscrucialtotheperformance ofaMLalgorithm.Usingirrelevantorredundantfeatures oftenleadstonegativeimpactsontheaccuracyofmostML algorithms.Itcanalsomakethesystemmorecomputationally expensive,astheamountofinformationstoredandprocessed riseswiththedimensionalityofafeaturesetusedtodescribing thedatainstances.Consequentlyitisdesirabletoselecta subsetoffeaturesthatissmallinsizeyetretainsessential andusefulinformationabouttheclassesofinterest. Featureselectionalgorithmscanbebroadlyclassifiedinto filtermethodorwrappermethod.Filtermethodalgorithms makeindependentassessmentbasedongeneralcharacteristics ofthedata.Theyrelyonacertainmetrictorateandselect thebestsubsetbeforethelearningcommences.Theresults providedthereforeshouldnotbebiasedtowardaparticular MLalgorithm.Wrappermethodalgorithms,ontheotherhand, evaluatetheperformanceofdifferentsubsetsusingtheML algorithmthatwillultimatelybeemployedforlearning.Its resultsarethereforebiasedtowardtheMLalgorithmused. Anumberofsubsetsearchtechniquescanbeused,e.g. Correlation-basedFeatureSelection(CFS)filtertechniques withGreedy,Best-FirstorGeneticsearch.(Additionaldetails canbefoundin[29][40][41][42][43].) B.TheapplicationofMLinIPtrafficclassification AnumberofgeneralMLconceptstakeaspecificmeaning whenappliedtoIPtrafficclassification.Forthepurposeof subsequentdiscussionwedefinethefollowingthreeterms relatingtoflows: • Flow or Uni-directionalflow:Aseriesofpacketssharing thesamefive-tuple:sourceanddestinationIPaddresses, sourceanddestinationIP portsandprotocolnumber. • Bi-directionalflow:Abi-directionalflowisapairofuni- directionalflowsgoingintheoppositedirectionsbetween thesamesourceanddestinationIPaddressesandports. • Full-flow:Abi-directionalflowcapturedoveritsentire lifetime,fromtheestablishmenttotheendofthecom- municationconnection. AclassusuallyindicatesIPtrafficcausedby(orbelonging to)anapplicationorgroupofapplications.Instancesareusu- allymultiplepacketsbelonging tothesameflow.Featuresare typicallynumericalattributescalculatedovermultiplepackets belongingtoindividualflows.Examplesincludemeanpacket lengths,standarddeviationofinter-packetarrivaltimes,total flowlengths(inbytesand/orpackets),Fouriertransformof packetinter-arrivaltime,andsoon.Aspreviouslynotednotall featuresareequallyuseful,sopracticalMLclassifierschoose thesmallestsetoffeaturesthatleadtoefficientdifferentiation betweenmembersofaclassandothertrafficoutsidetheclass. 1)TrainingandtestingasupervisedMLtrafficclassifier: Figures2,3and4illustratethestepsinvolvedinbuildinga trafficclassifierusingasupervisedlearning(or supervisedML) algorithm.Inthisexample,thetrafficclassifierisintendedto recogniseaparticularclassofapplications(real-timeonline gametraffic)inamongsttheusualmixoftrafficseenonan IPnetwork. Figure2capturestheoveralltrainingandtestingprocess thatresultsinaclassificationmodel.Asnotedearlier,the optimalapproachtotrainingasupervisedMLalgorithmis toprovidepreviouslyclassifiedexamplesoftwotypesofIP traffic:trafficmatchingtheclassoftrafficthatonewisheslater toidentifyinthenetwork(inthiscaseonlinegametraffic),62IEEECOMMUNICATIONSSURVEYS&TUTORIALS,VOL.10,NO.4,FOURTHQUARTER2008 Fig.2.Trainingandtestingforatwo-classsupervisedMLtrafficclassifier Fig.3.TrainingthesupervisedMLtrafficclassifier andrepresentativetrafficofentirelydifferentapplicationsone wouldexpecttoseeinfuture(oftenreferredtoas interfering traffic). Figure3expandsonthesequenceofeventsinvolvedin trainingasupervisedMLtrafficclassifier.Firstamixof‘traffic traces’arecollectedthatcontainbothinstancesoftheapplica- tionofinterest(e.g.onlinegametraffic)andinstancesofother interferingapplications(suchasHTTP,DNS,SSHand/or peer2peerfilesharing).The‘flowstatisticsprocessing’step involvescalculatingthestatisticalpropertiesoftheseflows (suchasmeanpacketinter-arrivaltime,medianpacketlength and/orflowduration)asapreludetogeneratingfeatures. Anoptionalnextstepis‘datasampling’,designedtonarrow downthesearchspacefortheMLalgorithmwhenfacedwith extremelylargetrainingdatasets(traffictraces).Thesampling stepextractsstatisticsfromasubsetofinstancesofvarious applicationclasses,andpassesthesealongtotheclassifierto beusedinthetrainingprocess. AsnotedinsectionIII-A7,afeaturefiltering/selectionstep isdesirabletolimitthenumberoffeaturesactuallyused totrainthesupervisedMLclassifierandthuscreatethe classificationmodel.TheoutputofFigure3isaclassification model. Cross-validation(orstratifiedcross-validation)maybeused togenerateaccuracyevaluationresultsduringthetraining phase.However,ifthesourcedatasetconsistsofIPpackets collectedatthesametimeandthesamenetworkmeasurement point,thecross-validationresultsarelikelytoover-estimate theclassifier’saccuracy.(Ideallythesourcedatasetwould Fig.4.DataflowwithinanoperationalsupervisedMLtrafficclassifier containamixoftrafficcollectedatdifferenttimesand measurementpoints,oruseentirelyindependentlycollected trainingandtestingdatasets.) Figure4illustratesdataflowwithinanoperationaltraffic classifierusingthemodelbuiltinFigure3.Trafficcaptured inreal-timeisusedtoconstructflowstatisticsfromwhich featuresaredeterminedandthenfedintotheclassification model.(Herewepresumethatthesetoffeaturescalculated fromcapturedtrafficislimitedtotheoptimalfeatureset determinedduringtraining.)Theclassifier’soutputindicates whichflowsaredeemedtobemembersoftheclassof interest(asdefinedbythemodel).Certainimplementations mayoptionallyallowthemodeltobeupdatedinreal-time (performingasimilardatasamplingandtrainingprocessto thatshowninFigure3).(Forcontrolledtestingandevaluation purposesofflinetraffictracescanbeusedinsteadoflivetraffic capture.) 2)Supervisedversusunsupervisedlearning: Aspreviously noted,IPtrafficclassificationisusuallyaboutidentifyingtraf- ficbelongingtoknownapplications(classesofinterest)within previouslyunseenstreamsofIPpackets.Thekeychallengeis todeterminetherelationship(s)betweenclassesofIPtraffic (asdifferentiatedbyMLfeatures)andtheapplicationscausing theIPtraffic. SupervisedMLschemesrequireatrainingphasetocement thelinkbetweenclassesandapplications.Trainingrequires a-prioriclassification(or labeling)oftheflowswithinthe trainingdatasets.Forthisreason,supervisedMLmaybe attractivefortheidentificationofaparticular(orgroupsof) application(s)ofinterest.However,asnotedinsectionIII-A3, thesupervisedMLclassifierworksbestwhentrainedon examplesofalltheclassesitexpectstoseeinpractice. Consequently,itsperformancemaybedegradedorskewed ifnottrainedonarepresentativemixoftrafficorthenetwork link(s)beingmonitoredstartseeingtrafficofpreviouslyun- knownapplications.(Forexample,Parketal.[44]showedthat accuracyissensitivetosite-dependenttrainingdatasets,while Ermanetal.[45]showeddifferentaccuracyresultsbetween thetwodatatracesstudiedforthesameMLalgorithms.) WhenevaluatingsupervisedMLschemesinanoperational contextitisworthwhileconsideringhowtheclassifierwillbeNGUYENandARMITAGE:ASURVEYOFTECHNIQUESFORINTERNETTRAFFICCLASSIFICATIONUSINGMACHINELEARNING63 suppliedwithadequatesupervisedtrainingexamples,whenit willbenecessarytore-train,andhowtheuserwilldetecta newtypeofapplications. ItmightappearthatoneadvantageofunsupervisedML schemesistheautomaticdiscoveryofclassesthroughthe recognitionof‘natural’patterns(clusters)inthedataset. However,resultingclustersstillneedtobelabeled(forex- ample,throughdirectinspectionbyahumanexpert)inorder thatnewinstancesmaybeproperlymappedtoapplications. (Arelatedbenefitisthattrafficfrompreviouslyunknown applicationsmaybedetectedbynotingwhennewclusters emerge-sometimestheemergenceofnewapplicationflows isnoteworthyevenbeforetheactualidentityoftheapplication hasbeendetermined.) AnotherissueforunsupervisedMLschemesisthatclusters donotnecessarilymap1:1toapplications.Itwouldbeideal ifthenumberofclustersformedisequaltothenumber ofapplicationclassestobeidentified,andeachapplication dominatedoneclustergroup.However,inpractice,thenumber ofclustersisoftengreaterthanthenumberofapplication classes[46][47].Oneapplicationmightspreadoverand dominateanumberofclusters,or converselyanapplication mightalsospreadoverbutnotdominateanyoftheclusters. Mappingbackfromaclustertoasourceapplicationcan becomeagreatchallenge. WhenevaluatingunsupervisedMLschemesinanopera- tionalcontextitisworthwhileconsideringhowclusterswill belabeled(mappedtospecificapplications),howlabelswill beupdatedasnewapplicationsaredetected,andtheoptimal numberofclusters(balancingaccuracy,costoflabelingand labellookup,andcomputationalcomplexity). C.Challengesforoperationaldeployment SectionII-AnotedsomeimportantIPtrafficclassification scenarioswhereclassificationmustnormallyoccur asthetraf- ficisflowing (orwithinsomefairlyshortperiodoftimeafter thetrafficoccurred).Thiscreatessomeparticularrequirements fortimelyclassificationastrafficisintransitacrossanetwork. 1)Timelyandcontinuousclassification: A timely classifier shouldreachitsdecisionusingasfewpacketsaspossible fromeachflow(ratherthanwaitinguntileachflowcompletes beforereachingadecision).Reducingthenumberofpackets requiredforclassificationalsoreducesthememoryrequiredto bufferpacketsduringfeaturecalculations.Thisisanimportant considerationforsituationswheretheclassifieriscalculating featuresfor(tensof)thousandsofconcurrentflows.Depend- ingonthebusinessreasonforperformingclassification,it maybeunacceptabletosampletheavailableflowsinorder toreducethememoryconsumption.Instead,oneaimstouse lesspacketsfromeachflow. However,itisnotsufficienttosimplyclassifybasingon thefirstfewpacketsofaflow.Forexample,maliciousattacks mightdisguisethemselveswiththestatisticalpropertiesof atrustedapplicationearlyintheirflow’slifetime.Orthe classifieritselfmighthavebeenstarted(orrestarted)whilst hundredsorthousandsofflowswerealreadyactivethrougha networkmonitoringpoint(therebymissingthestartsofthese activeflows).Consequentlytheclassifiershouldideallyper- form continuous classification-recomputingitsclassification decisionthroughoutthelifetimeofeveryflow. TimelyandcontinuousMLclassificationmustalsoaddress thefactthatmanyapplicationschangetheirstatisticalproper- tiesovertime,yetaflowshouldideallybecorrectlyclassified asbeingthesameapplicationthroughouttheflow’slifetime. 2)Directionalneutrality: Applicationflowsareoftenas- sumedtobebi-directional,andtheapplication’sstatistical featuresarecalculatedseparatelyintheforwardandreverse directions.Manyapplications(suchasmultiplayeronline gamesorstreamingmedia)exhibitdifferent(asymmetric) statisticalpropertiesintheclient-to-serverandserver-to-client directions.Consequently,theclassifiermusteither‘know’the directionofapreviouslyunseenflow(forexample,which endistheserverandwhichistheclient)orbetrainedto recogniseanapplicationofinterestwithoutrelyingonexternal indicationsofdirectionality. Inferringtheserverandclientendsofaflowisfraught withpracticaldifficulties.Asareal-worldclassifiershould notpresumethatithasseenthefirstpacketofeveryflow currentlybeingevaluated,itcannotbesurewhetherthefirst packetitsees(ofanynewbi-directionalflowofpackets)is headinginthe‘forward’or‘reverse’direction.Furthermore, thesemanticsoftheTCPorUDPportfieldsshouldbe consideredunreliable(eitherduetoencryptionobscuringthe realvalue,ortheapplicationusingunpredictableports),so itbecomesdifficulttojustifyusing‘wellknown’server-side portnumberstoinferaflow’sdirection. 3)Efficientuseofmemoryandprocessors: Anotherim- portantcriteriaforoperationaldeploymentistheclassification system’suseofcomputationalresources(suchasCPUtime andmemoryconsumption).Theclassifier’sefficiencyimpacts onthefinancialcosttobuild,purchaseandoperatelargescale trafficclassificationsystems.Aninefficientclassifiermaybe inappropriateforoperationaluseregardlessofhowquicklyit canbetrainedandhowaccuratelyitidentifiesflows. MinimisingCPUcyclesandmemoryconsumptionisadvan- tageouswhethertheclassifierisexpectedtositinthemiddle ofanISPnetwork(whereasmallnumberoflarge,powerful devicesmayseehundredsofthousandsofconcurrentflowsat multi-gigabitrates)orouttowardtheedges(wherethetraffic loadissubstantiallysmaller,buttheCPUpowerandmemory resourcesofindividualdevicesarealsodiminished). 4)PortabilityandRobustness: Amodelmaybeconsidered portable ifitcanbeusedinavarietyofnetworklocations,and robust ifitprovidesconsistentaccuracyinthefaceofnetwork layerperturbationssuchaspacketloss,trafficshaping,packet fragmentation,andjitter.Aclassifieralsoisrobustifitcan efficientlyidentifytheemergenceofnewtrafficapplications. IV.AREVIEWOF MACHINE LEARNINGBASED IP TRAFFIC CLASSIFICATION TECHNIQUES Inthissectionwecreatefourbroadcategoriestoreviewsig- nificantworkspublishedonML-basedIPtrafficclassification todatein: • ClusteringApproaches:Workswhosemainapproach centersaroundunsupervisedlearningtechniques. • SupervisedLearningApproaches:Workswhosemain approachcentersaroundsupervisedlearningtechniques.64IEEECOMMUNICATIONSSURVEYS&TUTORIALS,VOL.10,NO.4,FOURTHQUARTER2008 • HybridApproaches:Workswhoseapproachcombine supervisedandunsupervisedlearningtechniques. • ComparisonsandRelatedWork:Worksthatcompareand contrastdifferentMLalgorithms,orconsidernon-ML approachesthatcouldbeconsideredinconjunctionwith MLapproaches. Thekeypointsofeachworkarediscussedinthefollowing subsectionsandsummarisedinTableI,IIandIII. A.ClusteringApproaches 1)FlowclusteringusingExpectationMaximization: In 2004McGregoretal.[48]publishedoneoftheearliest workthatappliedMLinIPtrafficclassificationusingthe ExpectationMaximizationalgorithm[49].Theapproach clusterstrafficwithsimilarobservablepropertiesintodifferent applicationtypes. TheworkstudiesHTTP,FTP,SMTP,IMAP,NTPandDNS traffic.Packetsina6-hourAuckland-VItracearedivided intobi-directionalflows.Flowfeatures(listedinTableI)are calculatedonafull-flowbasis.Flowsarenottimedout,except whentheyexceedthelengthofthetraffictrace. Basedonthesefeatures,theEMalgorithmisusedtogroup thetrafficflowsintoasmallnumberofclustersandthen createclassificationrulesfromtheclusters.Fromtheserules, featuresthatdonothavealargeimpactontheclassificationare identifiedandremovedfromtheinputtothelearningmachine andtheprocessisrepeated.Thework’simplementationof EMhasanoptiontoallowthenumberofclusterstobefound automaticallyviacross-validation.Theresultingestimationof performancewasusedtoselectthebestcompetingmodel (hencethenumberofclusters). Thealgorithmwasfoundtoseparatetrafficintoanumberof classesbasedontraffictype(bulktransfer,smalltransactions, multipletransactionsetc.).However,currentresultsarelimited inidentifyingindividualapplicationsofinterest.Nonetheless, itmaybesuitabletoapplythisapproachasthefirststepof classificationwherethetrafficiscompletelyunknown,and possiblygivesahintonthegroupofapplicationsthathave similartrafficcharacteristics. 2)AutomatedapplicationidentificationusingAutoClass: TheworkofZanderetal.[46],proposedin2005,uses AutoClass[50],whichisanunsupervisedBayesianclassifier, usingtheEMalgorithmtodeterminethebestclusterssetfrom thetrainingdata.EMisguaranteedtoconvergetoalocal maximum.Tofindtheglobalmaximum,autoclassrepeatsEM searchesstartingfrompseudo-randompointsintheparameter space.Themodelwiththeparametersethavingthehighest probabilityisconsideredthebest. Autoclasscanbepreconfiguredwiththenumberofclasses (ifknown)oritcantrytoestimatethenumberofclassesitself. Firstlypacketsareclassifiedintobi-directionalflowsandflow characteristicsarecomputedusingNetMate[51].Anumberof featuresarecalculatedforeachflow,ineachdirection(listed inTableI).Featurevaluesarecalculatedonafull-flowbasis. Aflowtimeoutof60secondsisused. Samplingisusedtoselectasubsetoftheflowdatafor thelearningprocess.Oncetheclasses(clusters)havebeen learnt,newflowsareclassified.Theresultsofthelearning andclassificationareexportedforevaluation.Theapproach isevaluatedbasedonrandomsamplesofflowsobtained fromthree24-hourtraffictraces(Auckland-VI,NZIX-IIand Leipzig-IItracesfromNLANR[52]). Takingafurtherstepfrom[48],theworkproposedamethod forclusterevaluation.Ametriccalledintra-classhomogeneity, H,forassessingthequalityoftheresultingclassesand classificationisintroduced.Hofaclassisdefinedasthe largestfractionofflowsononeapplicationintheclass.The overallhomogeneityHofasetofclassesisthemeanofthe classhomogeneities.ThegoalistomaximiseHtoachievea goodseparationbetweendifferentapplications. Theresultshaveshownthatsomeseparationbetweenthe differentapplicationscanbeachieved,especiallyforcertain particularapplications(suchasHalf-Lifeonlinegametraffic) incomparisonwiththeothers. Withdifferentsetsoffeatures used,theauthorsshowthatHincreaseswiththeincreasein numberoffeaturesused.Hreachesamaximumvalueof between85%and89%,dependingonthetrace.However, theworkhasnotaddressedthetrade-offsbetweennumber offeaturesusedandtheirconsequencesofcomputational overhead. Tocomputetheaccuracyforeachapplicationtheauthors mapeachclasstotheapplicationthatisdominatingtheclass (byhavingthelargestfractionofflowsinthatclass).The authorsusedaccuracy(Recall)asanevaluationmetric.Median accuracyis ≥ 80% forallapplicationsacrossalltraces. However,therearesomeexceptionalcases,forexample,for theNapsterapplicationthereisonetracewhereitisnot dominatinganyoftheclasses(hencetheaccuracyis0%). TheresultsalsoshowthatFTP,WebandTelnetseemtohave themostdiversetrafficcharacteristicsandarespreadacross manyclasses. Ingeneral,althoughthemappingofclasstoapplication showspromisingresultsinseparatingthedifferentappli- cations,thenumberofclassesresultedfromtheclustering algorithmishigh(approximately50classesfor8selected applications).Forclassandapplicationmapping,itisa challengetoidentifyapplicationsthatdonotdominateany oftheclasses. 3)TCP-basedapplicationidentificationusingSimpleK- Means: In2006Bernailleetal.[53]proposedatechnique usinganunsupervisedML(SimpleK-Means)algorithmthat classifieddifferenttypesofTCP-basedapplicationsusingthe firstfewpacketsofthetrafficflow. Incontrasttothepreviouslypublishedwork,themethod proposedinthispaperallowedearlydetectionoftrafficflow bylookingatonlythefirstfewpacketsofaTCPflow. Theintuitionbehindthemethodisthatthefirstfewpackets capturetheapplication’snegotiationphase,whichisusually apre-definedsequenceofmessagesandisdistinctamong applications. Thetrainingphaseisperformedoffline.Theinputisaone- hourpackettraceofTCPflowsfromamixofapplications. Flowsaregroupedintoclustersbasedonthevaluesoftheir firstPpackets.FlowsarerepresentedbypointsinaP- dimensionalspace,whereeachpacketisassociatedwitha dimension;thecoordinateondimensionpisthesizeofpacket pintheflow.Bi-directionalflowsareused.PacketssentbytheNGUYENandARMITAGE:ASURVEYOFTECHNIQUESFORINTERNETTRAFFICCLASSIFICATIONUSINGMACHINELEARNING65 TCP-serveraredistinguishedfrompacketssentbytheTCP- clientbyhavinganegativecoordinate. SimilaritybetweenflowsismeasuredbytheEuclidean distancebetweentheirassociated spatialrepresentations.After naturalclustersareformed,themodelingstepdefinesarule toassignanewflowtoacluster.(Thenumberofclusters waschosenbytrialwithdifferentnumberofclustersfor theK-meansalgorithm).Theclassificationruleissimple: theEuclideandistancebetween thenewflowandthecentre ofeachpre-definedclusteriscomputed,andthenewflow belongstotheclusterforwhichthedistanceisaminimum. Thetrainingsetalsoconsistsofpayload,sothatflowsin eachclustercanbelabeledwithitssourceapplication.The learningoutputconsistsoftwosets:onewiththedescription ofeachcluster(thecentreofthecluster)andtheotherwiththe compositionofitsapplications.Bothsetsareusedtoclassify flowsonline. Intheclassificationphase,packetsareformedintoabi- directionalflow.ThesizesofthefirstPpacketsofthe connectionarecapturedandusedtomapthenewflowto aspatialrepresentation.Aftertheclusterisdefined,theflow isassociatedwiththeapplicationthatisthemostprevalentin thecluster. Theresultsshowthatmorethan80%oftotalflowsare correctlyidentifiedforanumberofapplicationsbyusingthe firstfivepacketsofeachTCPflow.Oneexceptionalcaseis thePOP3application.Theclassifierlabels86%ofPOP3flows asNNTPand12.6%asSMTP,becausePOP3flowsalways belongtoclusterswherePOP3isnotthedominantapplication. Theresultsofthisworkareinspiringforearlydetection ofthetrafficflow.However,itassumesthattheclassifiercan alwayscapturethestartofeachflow.Theeffectivenessofthe approachwhentheclassifiermissesthefirstfewpacketsofthe trafficflowhasnotbeendiscussedoraddressed.Also,withthe useofunsupervisedalgorithmanditsclassificationtechnique, theproposalfacesthechallengeofclassifyinganapplication whenitdoesnotdominateanyoftheclustersfound. 4)IdentifyingWebandP2Ptrafficinthenetworkcore: TheworkofErmanetal.[47]inearly2007addressedthe challengeoftrafficclassificationatthecoreofthenetwork, wheretheavailableinformationabouttheflowsandtheir contributorsmightbelimited.Theworkproposedtoclassify aflowusingonlyuni-directionalflowinformation.While showingthatforaTCPconnection,server-to-clientdirection mightprovidemoreusefulstatisticsandbetteraccuracythan thereversedirection,itmaynotalwaysbefeasibletoobtain trafficinthisdirection.Theyalsodevelopedandevaluatedan algorithmthatcouldestimatemissingstatisticsfromauni- directionalpackettrace. Theapproachproposedmakesuseofclusteringmachine learningtechniqueswithademonstrationofusingtheK- Meansalgorithm.Similartootherclusteringapproaches,Eu- clideandistanceisusedtomeasurethesimilaritybetweentwo flowvectors. Uni-directionaltrafficflowsaredescribedbyafull-flow basedfeaturesset(listedinTableII).Possibletrafficclasses includeWeb,P2P,FTP...Forthetrainingphase,itisassumed thatlabelsforalltrainingflowsareavailable(manually classifiedbasedonpayloadcontentandprotocolsignatures), andaclusterismappedbacktoatrafficclassthatmakes upthemajorityofflowsinthatcluster.Anunseenflowwill bemappedtothenearestclusterbasedonitsdistancetothe clusters’centroids. Theapproachisevaluatedwithflowaccuracyandbyte accuracyasperformancemetrics.Threedatasetsareconsid- ered:datasetscontainingonlyclient-to-serverpackets,data setscontainingonlyserver-to-clientpacket,anddatasets containingarandom mixtureofeachdirection.K-Means algorithmrequiresthenumberofclustersasaninput,ithas beenshownthatbothflowandbyteaccuraciesimprovedask increasedfrom25to400.Overall,theserver-to-clientdatasets consistentlygivethebestaccuracy(95%and79%intermsof flowsandbytesrespectively).Withtherandomdatasets,the averageflowandbyteaccuracyis91%and67%respectively. Fortheclient-to-serverdatasets,94%oftheflowsand57% ofthebytesarecorrectlyclassified. Thealgorithmtoestimatethemissingflowstatisticsisbased onthesyntaxandsemanticsoftheTCPprotocol.Soitonly workswithTCP,notothertransportprotocoltraffic.Theflow statisticsaredividedintothreegeneralcategories:duration, numberofbytes,andnumberofpackets.Theflowdurationin themissingdirectionisestimatedasthedurationcalculated withthefirstandthelastpacketseenintheobserveddirec- tion.Thenumberofbytestransmittedisestimatedaccording toinformationcontainedinACKspackets.Thenumberof packetssentisestimatedwiththetrackingofthelastsequence numberandacknowledgementnumberseenintheflow,with regardstotheMSS.Anumberofassumptionshavebeen made.Forexample,MSSisusedasacommonvalueof 1460bytes,simpleacknowledgmentstrategyofanACK(40- bytedataheaderwithnopayload)foreverydatapacket,and assumingthatnopacketlossandretransmissionoccurred.An evaluationoftheestimationalgorithmisreported,theresults werepromisingforflowdurationandbytesestimation,with relativelylargererrorrangefornumberofpacketsestimation. Theworkaddressedaninterestingissueofthepossibility ofusinguni-directionalflowstatisticsfortrafficclassification andproposedamethodtoestimatethemissingstatistics.A relatedissueofdirectionalityintheuseofbi-directionaltraffic flowswasaddressedintheworkof[54]. B.SupervisedLearningApproaches 1)Statisticalsignature-basedapproachusingNN,LDAand QDAalgorithms: In2004Roughanetal.[18]proposedto usethenearestneighbours(NN),lineardiscriminateanalysis (LDA)andQuadraticDiscriminantAnalysis(QDA)MLalgo- rithmstomapdifferentnetworkapplicationstopredetermined QoStrafficclasses. Theauthorslistanumberofpossiblefeatures,andclassify themintofivecategories: • PacketLevel:e.g.packetlength(meanandvariance,root meansquare) • FlowLevel:flowduration,datavolumeperflow,number ofpacketsperflow(allwithmeanandvariancevalues) etc.Uni-directionalflowisused. • ConnectionLevel:e.g.advertisedTCPwindowsizes, throughputdistributionandthesymmetryoftheconnec- tion.66IEEECOMMUNICATIONSSURVEYS&TUTORIALS,VOL.10,NO.4,FOURTHQUARTER2008 • Intra-flow/connectionfeatures:e.g.packetinter-arrival timesbetweenpacketsinflows. • Multi-flow:e.g.multipleconcurrentconnectionsbetween thesamesetofend-systems. Ofthefeaturesconsidered,thepairofmostvaluewerethe averagepacketlengthandflowduration.Thesefeaturesare computedperfull-flow,thenperaggregateofflowswithin24- hourperiods(anaggregateisacollectionofstatisticsindexed byserverportandserverIPaddress). Threecasesofclassificationareconsidered.Thethree-class classificationlooksatthreetypesofapplication:Bulkdata (FTP-data),Interactive(Telnet),andStreaming(RealMedia); thefour-classclassificationlooksatfourtypesofapplica- tions:Interactive(Telnet),Bulkdata(FTP-data),Streaming (RealMedia)andTransactional(DNS);andtheseven-class classificationlooksatsevenapplications:DNS,FTP-data, HTTPS,Kazaa,RealMedia,TelnetandWWW. Theclassificationprocessisevaluatedusing10-timescross validation.Theclassificationerrorratesareshowntovary basedonthenumberofclassesittriedtoidentify.Thethree- classclassificationhasthelowesterrorrate,varyingfrom 2.5%to3.4%fordifferentalgorithms,whilethefour-class classificationhadtheerrorrateintherangeof5.1%to7.9%, andtheseven-classonehadthehighesterrorrateof9.4%to 12.6%. 2)ClassificationusingBayesiananalysistechniques: In 2005MooreandZuev[14]proposedtoapplythesupervised MLNaiveBayestechniquetocategoriseInternettrafficby application.Trafficflowsinthedatasetusedaremanuallyclas- sified(baseduponflowcontent)allowingaccurateevaluation. 248full-flowbasedfeatureswereusedtotraintheclassifier (asummaryislistedinTableI).SelectedtrafficforInternet applicationswasgroupedintodifferentcategoriesforclas- sification,e.g.bulkdatatransfer,database,interactive,mail, services,www,p2p,attack,gamesandmultimedia. Toevaluatetheclassifier’sperformance,theauthorsused AccuracyandTrust(equivalenttoRecall)asevaluationmet- rics.TheresultsshowedthatwiththesimpleNaiveBayes technique,usingthewholepopulationofflowfeatures,ap- proximately65%flowaccuracycouldbeachievedinclassifi- cation.Tworefinementsfortheclassifierwereperformed,with theuseofNaiveBayesKernelEstimation(NBKE)andFast Correlation-BasedFilter(FCBF)methods 2 .Theserefinements helpedtoreducethefeaturespaceandimprovedtheclassifier performancetoaflowaccuracybetterthan95%overall.With thebestcombinationtechnique,theTrustvalueforindividual classofapplicationranged,forinstance,from98%forwww, to90%forbulkdatatransfer,toapproximately44%for servicestrafficand55%forP2P. TheworkisextendedwiththeapplicationofBayesian neuralnetworkapproachin[55].Ithasbeendemonstrated thataccuracyisfurtherimprovedcomparetoNaiveBayes 2TheNBKEmethodisageneralisationofNaiveBayes.Itaddressesthe problemofapproximatingeveryfeaturebyanormaldistribution.Instead ofusinganormaldistributionwithparametersestimatedfromthedata,it useskernelestimationmethods.FCBFisafeatureselectionandredundancy reductiontechnique.InFCBF,goodnessofafeatureismeasuredbyits correlationwiththeclassandothergood features.Thatfeaturebecomesgood ifitishighlycorrelatedwiththeclass,yetisnotcorrelatedwithanyother goodfeatures[14] technique.Bayesiantrainedneuralnetworkapproachisable toclassifyflowswithupto99%accuracyfordatatrainedand testedonthesameday,and95%accuracyfordatatrained andtestedeightmonthsapart.Thepaperalsopresentsa listoffeatureswiththeirdescriptionsandrankingintheir importance. 3)Real-timetrafficclassificationusingMultipleSub-Flows features: AsnotedinsectionIII-Ctimelyandcontinuous classificationisanimportantconstraintforthepracticalem- ploymentofatrafficclassifier.In2006NguyenandArmitage [56]proposedamethodtoaddresstheissuebyproposing classificationbasedononlythemostrecentNpacketsofa flow-calledaclassificationslidingwindow.Theuseofasmall numberofpacketsforclassificationensuresthetimeliness ofclassificationandreducesthebufferspacerequiredto storepackets’informationfortheclassificationprocess.The approachdoesnotrequiretheclassifiertocapturethestartof eachtrafficflow(asrequiredin[53]and[57]).Thisapproach allowsclassificationtobeinitiatedatanypointintimewhen trafficflowsarealreadyinprogress.Itoffersapotentialof monitoringtrafficflowduringitslifetimeinatimelymanner withtheconstraintsofphysicalresources. TheworkproposestrainingMLclassifiersonmultiplesub- flowsfeatures.First,extracttwoormoresub-flows(ofN packets)fromeveryflowthatrepresentstheclassoftrafficone wishestoidentifyinthefuture.Eachsub-flowshouldbetaken fromplacesintheoriginalflow havingnoticeablydifferent statisticalproperties(forexample,thestartandmiddleofthe flow).Eachsub-flowwouldresultinasetofinstanceswith featurevaluesderivedfromitsNpackets.ThentraintheML classifierwiththecombinationofthesesub-flowsratherthan theoriginalfullflows. ThisoptimisationisdemonstratedusingtheNaiveBayes algorithm.Bi-directionalflowswereused.Differenttraining andtestingdatasetswereconstructedfromthetwoseparate month-longtracescollectedduringMayandSeptember2005 atapubliconlinegameserverinAustraliaandtwo24-hour periodscollectedbytheUniversityofTwente,Netherland [58].Withthefeaturesetused(listedinTableI),classifier builtbasedonfull-flowfeaturesisdemonstratedtoperform poorlywhentheclassifiermissedthestartofatrafficflow. However,withtheapplicationoftheproposedmethod,results showtheclassifiermaintainsmorethan95%Recalland98% Precision(flowaccuracy)evenwhenclassificationisinitiated mid-waythroughaflowusingonlyatotalof25packetsin bothdirections. However,theworkhasonlybeendemonstratedwithan exampleofidentifyinganonlinegameapplication(UDP- basedFirstPersonShootergame-EnemyTerritory[59]). Interferencetrafficincludeda rangeofotherInternetappli- cations(Web,DNS,NTP,SMTP,SSH,Telnet,P2P...).The authorsalsosuggestedthepotentialbenefitsofusingclustering algorithmsinautomatingthesub-flowsselectionprocess. 4)Real-timetrafficclassificationusingMultipleSynthetic Sub-FlowsPairs: AsnotedinsectionIII-Cdirectionalneu- tralityanimportantconstraintforthepracticalemploymentof atrafficclassifier.In2006NguyenandArmitage[54]further extendedtheirworkin[56]toovercomethisproblem.The authorsproposetrainingtheMLclassifierusingstatisticalNGUYENandARMITAGE:ASURVEYOFTECHNIQUESFORINTERNETTRAFFICCLASSIFICATIONUSINGMACHINELEARNING67 featurescalculatedovermultipleshortsub-flowsextracted fromfull-flowgeneratedby thetargetapplication and their mirror-imagedreplicasasiftheflowisinthereversedirection. TheoptimisationisdemonstratedwhenappliedtotheNaive BayesandDecisionTreealgorithmswithanexampleofiden- tifyingaUDP-basedFirstPersonShootergame-EnemyTer- ritory[59]traffic.Usingthesamedatasetsasspecifiedin[56] theydemonstratethatbothclassifiersperformpoorlywhen theclassifiersaretrainedwithbi-directionalflowfeaturesthat makeassumptionsabouttheforwardandbackwarddirection. However,trainingonSyntheticSub-FlowsPairsresultsin significantimprovementtoclassificationperformance(with upto99%Recalland98%Precision(flowaccuracy)forthe exampleapplication)evenwhenclassificationisinitiatedmid- waythroughaflow,withoutpriorknowledgeoftheflow’s directionandusingwindowsassmallas25packetslong. 5)GA-basedclassificationtechniques: In2006Parketal. [60]madeuseoffeatureselectiontechniquebasedonGenetic Algorithm(GA).Usingthesamefeaturesetspecifiedin[44] (listedinII),threeclassifiersweretestedandcompared:the NaiveBayesianclassifierwithKernelEstimation(NBKE), DecisionTreeJ48andtheReducedErrorPruningTree (REPTree)classifier.Theirresultssuggesttwodecisiontree classifiersprovidemoreaccurate classificationresultsthanthe NBKEclassifier.Theworkalsosuggeststheimpactofusing trainingandtestingdatafromdifferentmeasurementpoints. Earlyflowclassificationisalsobrieflymentioned.Accuracy asafunctionofthenumberofpacketsusedforclassificationis presentedforJ48andREPTreeclassifiers.Thefirst10packets usedforclassificationseemstoprovidethemostaccurate result.However,theaccuracyresultisprovidedasoverall result.Itisnotclearhowitwouldbedifferentfordifferent typesofInternetapplications. 6)Simplestatisticalprotocolfingerprintmethod: Crottiet al.[61]inearly2007proposedaflowclassificationmechanism basedonthreepropertiesofthecapturedIPpackets:packet length,inter-arrivaltimeandpacketarrivalorder.Theydefined astructurecalled protocolfingerprints whichexpressthethree trafficpropertiesinacompactwayandusedanalgorithm basedon normalisedthresholds forflowclassification. Therearetwophasesintheclassificationprocess:training andclassifying.Inthetrainingphase,pre-labeledflowsfrom theapplicationtobeclassified(thetrainingdataset)areanal- ysedtobuildtheprotocolfingerprints.Aprotocolfingerprint isaPDFvector,estimatedfromasetofflowsofthesame protocolfromthetrainingdataset.ThePDFi isbuiltonallthe i th pairsofPi (Pi = {si, Δti})wheresi representsthesize ofpacketiand Δti representstheinter-arrivaltimebetween packetiandpacket(i-1).Inordertoclassifyanunknown trafficflowgivenasetofdifferentPDFs,theauthorscheck whetherthebehaviouroftheflowisstatisticallycompatible withthedescriptiongivenbyatleastoneofthePDFs,and choosewhichPDFdescribesitbetter.An anomalyscore thatgivesavaluebetween0and1isusedtoindicatehow ‘statisticallydistant’anunknownflowisfromagivenprotocol PDF.Itshowsthecorrelationbetweentheunknownflow’s i th packetandtheapplicationlayerprotocoldescribedby thespecificPDFused;thehigherthevalue,thehigherthe probabilitythattheflowwasgeneratedbythatprotocol. Theirresultsshowflowaccuracyofmorethan91%for classifyingthreeapplications:HTTP,SMTPandPOP3,using thefirstfewpacketsofeachapplication’strafficflow. InasimilarwaytotheworkofBernailleetal.[53] reviewedabove,thisapproachdemonstratesadvancedresults fortimelinessoftheclassification.However,ithasthesame limitationinassumingthattheclassifiercanalwayscapturethe startofeachflow,andisawareofthelocationsofclientand server(forconstructingthePDFofclient-serverandserver- clientdirections).Theeffectivenessoftheapproachwhen theclassifiermissesthefirstfewpacketsofthetrafficflow (assumedtocarrytheprotocolfingerprint),orsuffersfrom packetlossandpacketre-orderinghasnotbeenaddressed. C.HybridApproaches Ermanetal.in[62]inearly2007proposeda semi- supervised trafficclassificationapproachwhichcombinesun- supervisedandsupervisedmethods.Motivationstothepro- posalareduetotwomainreasons:Firstlylabeledexamples arescarceanddifficulttoobtain,whilesupervisedlearning methodsdonotgeneralisewellwhenbeingtrainedwith fewexamplesinthedataset.Secondly,newapplicationsmay appearovertime,andnotallofthemareknownasapriori, traditionalsupervisedmethodsmapunseenflowinstancesinto oneoftheknownclasses,withouttheabilitytodetectnew typesofflows[62]. Toovercomethechallenges,theproposedclassification methodconsistsoftwosteps.First,atrainingdatasetconsist- ingoflabeledflowscombined withunlabeledflowsarefed intoaclusteringalgorithm.Second,theavailablelabeledflows areusedtoobtainamappingfromtheclusterstothedifferent knownclasses.Thisstepsallowssomeclusterstoberemained. Tomapaclusterwithlabeledflowsbacktoanapplication type,aprobabilisticassignmentisused.Theprobabilityis estimatedbythemaximumlikelihoodestimate, njk nk where njk isthenumberofflowsthatwereassignedtoclusterk withlabelj,and nk isthetotalnumberoflabeledflowsthat wereassignedtoclusterk.Clusterswithoutanylabeledflows assignedtothemarelabeled‘Unknown’asapplicationtype. Finallyanewunseenflowwillbeassignedtothenearest clusterwiththedistancemetricchosenintheclusteringstep. Thisnewproposedapproachhaspromisingresults.Prelim- inaryresultshavebeenshownin[62]withtheemploymentof K-Meansclusteringalgorithm.Theclassifierisprovidedwith 64,000unlabeledflows.Oncetheflowsareclustered,afixed numberofrandomflowsineachclusterarelabeled.Results showthatwithtwolabeledflowsperclusterandK=400, theapproachresultsin94%flowaccuracy.Theincreasein classificationaccuracyismarginalwhenfiveormoreflows arelabeledpercluster.Moredetailsresultscanbefoundin [63]. Asclaimedbytheauthors[63]theproposalhasadvantages intermsoffastertrainingtimewithsmallnumberoflabeled flowsmixedwithalargenumberofunlabeledflows,being abletohandlepreviouslyunseenapplicationsandthevariation ofexistingapplication’scharacteristics,andthepossibilityof enhancingtheclassifier’sperformancebyaddingunlabeled flowsforiterativeclassifiertraining.However,anevaluation68IEEECOMMUNICATIONSSURVEYS&TUTORIALS,VOL.10,NO.4,FOURTHQUARTER2008 oftheseadvantageshasnotbeendemonstratedinthecurrent paper. D.ComparisonsandRelatedWork 1)Comparisonofdifferentclusteringalgorithms: In2006 Ermanetal.[45]comparedthreeunsupervisedclusteringalgo- rithms:K-Means,DBSCANandAutoClass.Thecomparison isperformedontwoempiricaldatatraces:onepublictrace fromtheUniversityofAuckland andoneself-collectedtrace fromtheUniversityofCalgary. Theeffectivenessofeachalgorithmisevaluatedusing over- allaccuracy andthenumberofclustersitproduces. Overall accuracy measurementdetermineshowwelltheclustering algorithmisabletocreateclustersthatcontainonlyasingle trafficcategory.Aclusterislabeledbythetrafficclassthat makesupthemajorityofitstotalconnections(bi-directional trafficflows).Anyconnectionthathasnotbeenassigned toaclusterislabeledasnoise.Thenoverallaccuracyis determinedbytheportionofthetotalTPforallclustersout oftotalnumberofconnectionstobeclassified.Likeanyother clusteringalgorithms,thenumberofclustersproducedbya clusteringalgorithmisanimportantevaluationfactorasit affectstheperformanceofthealgorithminclassificationstage. TheirresultsshowthattheAutoClassalgorithmproduces thebestoverallaccuracy.Onaverage,AutoClassis92.4% and88.7%accurateintheAucklandandCalgarydatasets respectively.Itproducesonaverageof167clustersforthe Aucklanddataset(forlessthan10groupsofapplications) and247clustersfortheCalgarydataset(for4groupsof applications).ForK-Means,thenumberofclusterscanbe set,theoverallaccuracysteadilyimprovesasthenumberof clusters(K)increases.WhenKisaround100,overallaccuracy is79%and84%onaveragefortheAucklandandCalgary datasetsrespectively.Accuracyisimprovedonlyslightlywith greatervalueofK.DBSCANalgorithmproducesloweroverall accuracy(upto75.6%fortheAucklandand72%forthe Calgarydatasets);however,itplacesthemajorityofthe connectionsinasmallsubsetoftheclusters.Lookingatthe accuracyforparticulartrafficclasscategories,theDBSCAN algorithmhasthehighestprecisionvalueforP2P,POP3and SMTP(lowerthanAutoclassforHTTPtraffic). Theworkmentionsbrieflyaboutthecomparisonofmodel buildtime,andhasnotlookedatotherperformanceevaluation measurements,suchasprocessingspeed,CPUandmemory usage,orthetimelinessofclassification. 2)Comparisonofclusteringvs.supervisedtechniques: Ermanetal.[64]evaluatetheeffectivenessofsupervised NaiveBayesandclusteringAutoClassalgorithm.Threeac- curacymetricswereusedforevaluation:recall,precisionand overallaccuracy(overallaccuracyisdefinedthesameas[45] reviewedintheprevioussections). ClassificationmethodusingthesupervisedNaiveBayes algorithmisstraightforward.ForclassificationusingAuto- Class,onceAutoClasscomesupwiththemostprobableset ofclustersfromthetrainingdata,theclusteringistransformed intoaclassifier.Aclusterislabeledwiththemostcommon trafficcategoryoftheflowsinit.Iftwoormorecategories aretied,thenalabelischosenrandomlyamongstthetied categorylabels.Anewflowisthenclassifiedwiththetraffic classlabeloftheclusteritismostsimilarto[64]. Theevaluationwasperformedontwo72-hourdatatraces providedbytheUniversityofAuckland(NLANR).Acon- nectionisdefinedasabi-directionalflow.Thefeaturesetis showninTableIII. Thepapershowsthatwiththedatasetusedandnine applicationclasses(HTTP,SMTP,DNS,SOCKS,IRC,FTP control,FTPdata,POP3andLIMEWIRE),AutoClasshasan averageoverallaccuracyof91.2%whereastheNaiveBayes classifierhasanoveralaccuracyof82.5%.AutoClassalso performsbetterintermsofprecisionandrecallforindividual trafficclasses.Onaverage,forNaiveBayes,theprecisionand recallforsixoutofnineclasseswereabove80%;whereas forAutoClass,allclasseshaveprecisionandrecallvalues above80%,sixoutofthenineclasseshaveaverageprecision valuesabove90%,andsevenhaveaveragerecallvaluesabove 90%.However,intermsoftimetakentobuildclassification model,AutoClasstakesmuchlongertimethanNaiveBayes algorithm(2070secondsvs.0.06secondsforthealgorithm implementation,dataandequipmentused). TheconclusionthattheunsupervisedAutoClassoutper- formsthesupervisedNaiveBayesintermsofoverallaccuracy mightbecounterintuitivetosomereadersonthesurface. Whilethetestingmethodologyofthepaperissound,the resultsmightbeimpactedbythesizeofthetrainingdata sets(thecurrentworkuses1000samplesperapplication),the specificdatasetused,howtheNaiveBayesclassifierisbeing trained(singleapplicationclassificationatatime,ormultiple applicationsclassificationata time),andthespecificfeature setused. Anissueofclusteringapproachesisthereal-timeclassifica- tionspeed,asthenumberofclustersresultedfromthetraining phaseistypicallylargerthanthenumberofapplicationclasses. However,thishasnotbeenevaluatedinthepaper. 3)ComparisonofdifferentsupervisedMLalgorithms: Williamsetal.[65]providesinsightsintotheperformance aspectofMLtrafficclassification.Theworkslookatanumber ofsupervisedMLalgorithms:NaiveBayeswithDiscretisation (NBD),NaiveBayeswithKernelDensityEstimation(NBK), C4.5DecisionTree,BayesianNetwork,andNaiveBayesTree. Thesealgorithms’computationalperformanceisevaluatedin termsofclassificationspeed(numberofclassificationspersec- ond)andthetimetakentobuildtheassociatedclassification model. Resultsarecollectedbyexperimentsonthreepublic NLANRtraces.Featuresusedforanalysisincludethefull setof22features,andtwobestreducedfeaturesetsselected bycorrelation-basedfeatureselection(CFS)andconsistency- basedfeatureselection(CON)algorithms.Thefeaturessetis showninTableIII. Theresultsshowthatmostalgorithmsachievehighflow accuracywiththefullsetof22features(onlyNBKalgorithm achieves > 80%accuracyandtherestofthealgorithms achievegreaterthan95%accuracy).Withthereducedsetsof 8(CFS)and9(CON)features,theresultsachievedbycross- validationshowonlyslightchangesintheoverallaccuracy comparedtotheuseoffullfeatureset.ThelargestreductionNGUYENandARMITAGE:ASURVEYOFTECHNIQUESFORINTERNETTRAFFICCLASSIFICATIONUSINGMACHINELEARNING69 inaccuracywere2-2.5%forNBDandNBKwiththeuseof CONreducedfeatureset. Despitethesimilarityinclassificationaccuracy,thepaper showssignificantdifferencesin classificationcomputational performance.C4.5algorithmwas seenasthefastestalgorithm whenusinganyofthefeatureset(withmaximumof54,700 classificationspersecondona3.4GHzPentium4workstation runningSUSELinux9.3withWEKAimplementation).Al- gorithmsrankedindescendingorderintermsofclassification speedsare:C4.5,NBD,BayesianNetwork,NaiveBayesTree, NBK. Intermsofthemodelbuildtime,NaiveBayesTreetakes significantlongertimethantheremainingalgorithms.Algo- rithmsrankedindescendingorderintermsofmodelbuild timeare:NaiveBayesTree,C4.5,BayesianNetwork,NBD, NBK. Resultsofthepaperalsoshowfeaturereductiongreatly improvesperformanceofthealgorithmsintermsofmodel buildtimeandclassificationspeedsformostalgorithms. 4)ACAS:Classificationusingmachinelearningtechniques onapplicationsignatures: Haffneretal.[57]in2005pro- posedanapproachforautomatedconstructionofapplication signaturesusingmachinelearningtechniques.Differentfrom theotherworks,thisworkmakesuseofthefirstn-Bytesof adatastreamasfeatures.Thoughithasthesamelimitation withthoseworksthatrequireaccessingtopacketpayload, weincludeitinthesurveyasitisalsomachinelearning based,anditsinterestingresultsmaybeusefulinacomposite machinelearningbasedapproachthatcombinesdifferent informationsuchasstatisticalcharacteristics,contents,and communicationpatterns. Threelearningalgorithms:NaiveBayes,AdaBoostand MaximumEntropyhavebeeninvestigatedinconstructing applicationsignaturesforavariousrangeofnetworkappli- cations:ftpcontrol,smtp,pop3,imap,https,httpandssh. Aflowinstanceischaracterised withn-Bytesrepresentedin binaryvalue,andorderedbythepositionoftheByteinthe flowstream.Collectionofflowinstanceswithbinaryfeature isusedasinputbythemachinelearningalgorithms. Usingofthefirst64bytesofeachTCPunidirectional flowtheoverallerrorrateisbelow0.51%forallapplications considered.AdaboostandMaximumEntropyprovidebest resultswithmorethan99%ofallflowsclassifiedcorrectly. Precisionisabove99%forallapplicationsandRecallrateis above94%forallapplicationexceptssh(86.6%)(Thepoor performanceonsshapplicationwassuspectedduetothesmall amountofsampleinstancesinthetrainingdataset). 5)Unsupervisedapproachforprotocolinferenceusingflow content: CloselytoHaffneretal.[57]’swork,in2006,Ma etal.[66]introducedandanalysedalternativemechanisms forautomaticidentificationoftraffic,basedsolelyonflow content.Unsupervisedlearningwasappliedinthreedifferent modelingtechniquesforcapturingstatisticalandstructural aspectsofmessagesexchangedinaprotocol,namely product distribution, Markovprocesses,and commonsubstringgraphs (CSG). Differentfromotherworkthatmadeuseofflowclassi- fication,theworkfocusedonprotocolinference,inwhicha protocol wasdefinedas‘apairofdistributionsonflows’-one wasabytesequencefromtheinitiatortotheresponderandone wasabytesequencefromtherespondertotheinitiator(which doesnotincludepacket-levelinformationsuchasinter-arrival time,framesizeorheaderfields). Productdistributionmodeltreatseachn-byteflowdistri- butionasaproductofnindependentbytedistributions.Each byteoffsetinaflowisrepresentedbyitsownbytedistribution thatdescribesthedistributionofbytesatthatoffsetinthe flow.TheMarkovprocessisdescribedasarandomwalk onaweighteddirectedgraph.Thenodesofthegraphsare labeledwithuniquebytevalues.Eachedgeisweightedwith atransitionprobabilitysuchthat,foranynode,thesumofall itsout-edgeweightsis1.Thenextnodeischosenaccording toweightoftheedgefromthecurrentnodetoitsneighbors. Andcommonsubstringgraphscapturethecommonstructural informationabouttheflowsfromwhichitisbuilt. Detailedmodeldescriptions,howtoconstructeachmodel, howtomerge,compareandclassifynewinstancesarede- scribedin[66].Overall,theproductdistributionresultedin thelowesttotalmisclassificationerror(1.68%-4.15%),while Markovprocesseshadthehighest(3.33-9.97%)andCSGsin themiddle(2.08-6.19%). 6)BLINC:Multileveltrafficclassificationinthedark: Karagiannisetal.[15]developedanapplicationclassification methodbasedonthebehavioursofthesourcehostatthe transportlayer,dividedintothreedifferentlevels.Thesocial levelcapturesandanalysestheinteractionsoftheexamined hostwithotherhosts,intermsofthenumbersofthemit communicateswith.Thehost’spopularityandthatofother hostsinitscommunity’scircleareconsidered.Theroleofthe host,inactingasaproviderortheconsumerofaservice, isclassifiedatthefunctionallevel.Finally,transportlayer informationisused,suchasthe4-tupleofthetraffic(source anddestinationIPaddresses,andsourceanddestinationports), flowcharacteristicssuchasthetransportprotocol,andthe averagepacketsize. Arangeofapplicationtypeswasstudiedinthiswork, includingweb,p2p,datatransfer,networkmanagementtraffic, mail,chat,mediastreaming, andgaming.Byanalysingthe socialactivitiesofthehost,theauthorsconcludethatamong thehost’scommunities,neighbouringIPsmayofferthesame service(aserverfarm)iftheyusethesameserviceport,exact communitiesmightindicateattacks,whilepartialcommunities maysignifyp2porgamingapplications.Inaddition,most IPsactingasclientshaveaminimumnumberofdestination IPs.Thus,focusingontheidentificationofthatsmallnumber ofserverscanhelpclientidentification,leadingtotheclas- sificationofalargeamountoftraffic.Classificationatthe functionallevelshowsthatahostislikelytobeprovidinga serviceifduringadurationoftimeitusesasmallnumberof sourceports,normallylessthanorequaltotwoforalloftheir flows.Typicalclientbehaviourisnormallyrepresentedwhen thenumberofsourceportsisequaltothenumberofdistinct flows.Theconsistencyofaveragepacketsizeperflowacross allflowsattheapplicationlevelissuggestedtobeagood propertyforidentifyingcertainapplications,suchasgaming andmalware. Completenessandaccuracyarethetwometricsusedforthe classificationapproach.Completenessisdefinedastheratio70IEEECOMMUNICATIONSSURVEYS&TUTORIALS,VOL.10,NO.4,FOURTHQUARTER2008 ofthenumberofflows(bytes)classifiedbyBLINCoverthe totalnumberofflows(bytes),indicatedbypayloadanalysis. TheresultsshowthatBLINCcanclassify80%to90%traffic flowswithmorethan95%flowaccuracy(70%to90%for byteaccuracy). Thismethodhastogatherinformationfromseveralflows foreachhostbeforeitcandecideontheroleofonehost.Such requirementsmightpreventtheemploymentofthismethodin real-timeoperationalnetworks. 7)Pearson’sChi-SquaretestandNaiveBayesclassifier: Bonfiglioetal.[67]recentlyproposedaframeworkbasedon twotechniquestoidentifySkypetrafficinrealtime.Thefirst, basedonPearsonsChi-Squaretest,detectsSkypesfingerprint throughanalysisofthemessagecontentrandomnessintro- ducedbytheencryptionprocess.Thesecond,basedonthe NaiveBayestheorem,detectsSkypestrafficfrommessage sizeandarrivalratecharacteristics. Usingtwospecifictestdatasets,theauthors’compared theperformanceofeachtechniquerelativetoclassification usingdeep-packetinspection.TheyshowedtheirNaiveBayes techniquetobeeffectiveinidentifyingvoicetrafficoverIP regardlessofsourceapplication.TheirPearsonsChi-Square testeffectivelyidentifiedSkypetraffic(includingSkype voice/video/chat/datatraffic)overUDPandallencryptedor compressedtrafficforTCPflows.Whenusedincombination thetwotechniquesdetectedSkypevoicetraffic(UDPflows) with0%falsepositivesand9.82%falsenegativesforonetest dataset,and0.11%falsepositivesand2.40%falsenegatives fortheother.Thesefalsepositivesratesareanimprovement comparedtoeachtechniquebeingusedindividually.However, thefalsenegativesratesareslightlyworse.Alsoitisimportant tonotethegreatimbalancebetweentheamountofSkype trafficcomparedtoothertrafficinthetestdatasets.Theresults shouldalsobeevaluatedintermsofprecisionandrecall,to reflecttheclassifiers’performancepertrafficclass,insteadof onlytheoverallfalsepositivesandfalsenegatives. Bothtechniquesofferreal-timetrafficclassification.The Chi-Squaretechniquelooksatthefirstfewbytesofthe message.TheNaiveBayestechniquelooksatthestatistical characteristicsforeachwindowof30consecutivepackets. E.Challengesforoperationaldeployment Wewrapupoursurveywithaqualitativelookattheextent towhichthereviewedworksoverlapSectionIII-C’sadditional constraintsandrequirementsforusingMLtechniquesinside real-timeIPtrafficclassifiers. 1)Timelyandcontinuousclassification: Mostofthere- viewedworkhasevaluatedtheefficacyofdifferentML algorithmswhenappliedtoentiredatasetsofIPtraffic,trained andtestedoverfull-flowsconsistingofthousandsofpackets (suchas[14][18][46][48][64]and[65]). Some([53]and[61])haveexploredtheperformanceofML classifiersthatutiliseonlythefirstfewpacketsofaflow,but theycannotcopewithmissingtheflow’sinitialpackets.Others ([56])haveexploredtechniquesforcontinuousclassification offlowsusingasmallslidingwindowacrosstime,without needingtoseetheinitialpacketsofaflow. 2)Directionalneutrality: Theassumptionthatapplication flowsarebi-directional,andtheapplication’sdirectionmay beinferredpriortoclassification,permeatesmanyofthe workspublishedtodate([14][48][53][46][68]).Most workhasassumedthattheywillseethefirstpacketofeach bi-directionalflow,thatthisinitialpacketisfromaclient toaserver.Theclassificationmodelistrainedusingthis assumption,andsubsequentevaluationshavepresumedthe MLclassifiercancalculatefeatureswiththecorrectsenseof forwardandreversedirection. Asgettingthedirectionwrongwilldegradeclassification accuracy,[54]exploresthecreationofclassifiermodelsthat donotrelyonexternalindicationsofdirectionality. 3)Efficientuseofmemoryandprocessors: Thereare definitetrade-offstobemadebetweentheclassificationper- formanceofaclassifierandtheresourceconsumptionofthe actualimplementation.Forexample,[14]and[55]revealex- cellentpotentialforclassificationaccuracy.However,theyuse alargenumberoffeatures,manyofwhicharecomputationally challenging.Theoverheadofcomputingcomplexfeatures (suchaseffectivebandwidthbaseduponentropy,orFourier Transformofthepacketinter-arrivaltime)mustbeconsidered againstthepotentiallossofaccuracyifonesimplydidwithout thosefeatures. Williamsetal.[65]providesomepertinentwarningsabout thetrade-offbetweentraining timeandclassificationspeed. (Forexample,amongfiveMLalgorithmsstudied,NaiveBayes withKernelEstimationtooktheshortesttimetobuildclassifi- cationmodels,yetperformedslowestintermsofclassification speed.) Techniquesfortimelyandcontinuousclassificationhave tendedtosuggestaslidingwindowoverwhichfeaturesare calculated.Increasingthelengthofthiswindow([56][54] and[57])mightincreaseclassificationaccuracy.However, dependingontheparticularimplementation(opportunitiesfor pipelining,stepsizewithwhichthewindowslidesacross theincomingpacketstreams,etc.)thismaydecreasethe timelinesswithwhichclassificationdecisionsaremade(and increasethememoryrequiredtobufferpacketsduringfeature calculations).Mostofthereviewedworkhasnot,todate, closelyinvestigatedthisaspect. 4)PortabilityandRobustness: Noneofthereviewedworks seriouslyconsideredoraddressedtheissueofclassification modelportabilitymentionedinsectionIII-C. Noneofthereviewedworkshasaddressedandevaluate theirmodel’srobustnessintermsofclassificationperformance withtheintroductionofpacketloss,packetfragmentation, delayandjitter.Unsupervisedapproacheshavethepotential todetecttheemergenceofnewtypesoftraffic.However,this issuehasnotbeenevaluatedinmostoftheworks.Itwas brieflymentionedin[62]. 5)Qualitativesummary: TableIVprovidesaqualitative summaryofthereviewedworksagainstthefollowingcriteria: • Real-timeClassification – No: Theworkmakesuseoffeaturesthatrequireflow completiontocompute(e.g.Flowduration,totalflow bytescount) – Yes: Theworkrequiresthecaptureofasmallnumber ofpackets/bytesofaflowtodoclassificationNGUYENandARMITAGE:ASURVEYOFTECHNIQUESFORINTERNETTRAFFICCLASSIFICATIONUSINGMACHINELEARNING71 TABLEI ASUMMARYOF RESEARCH REVIEWEDIN SECTION IV Work MLAlgorithms Features DataTraces TrafficConsid- ered Classification Level McGregoretal.[48] Expectation Maximization • Packetlengthstatistics(min,max,quar- tiles, ...) • Inter-arrivalstatistics • Bytecounts • Connectionduration • Numberoftransitionsbetweentransac- tionmodeandbulktransfermode • Idletime Calculatedonfullflows NLANRand Waikatotrace Amixtureof HTTP,SMTP, FTP(control), NTP,IMAP, DNS... Coarse grained(bulk transfer,small transactions, multiple transactions ...) Zanderetal.[46] AutoClass • Packetlengthstatistics(meanandvari- anceinforwardandbackwarddirections) • Inter-arrivaltimestatistics(meanand varianceinforwardandbackwarddirec- tions) • Flowsize(bytes) • Flowduration Calculatedonfull-flows Auckland-VI, NZIX-IIand Leipzig-IIfrom NLANR Half-Life, Napster,AOL, HTTP,DNS, SMTP,Telnet, FTP(data) Finegrained (8applications studied) Roughanetal.[18] Nearest Neighbour, Linear Discriminate Analysisand Quadratic Discriminant Analysis • PacketLevel • FlowLevel • ConnectionLevel • Intra-flow/Connectionfeatures • Muli-flowfeatures Calculatedonfullflows Waikatotrace andsection logsfroma commercial streaming services Telnet,FTP (data),Kazaa, RealMedia Streaming, DNS,HTTPS Finegrained (three,fourand sevenclasses ofindividual applications) MooreandZuev[14] Baysian Techniques (NaiveBayes andNaive Bayeswith Kernel Estimation andFast Correlation- BasedFilter method) Totalof248features,amongthemare • Flowduration • TCPport • Packetinter-arrivaltimestatistics • Payloadsizestatistics • Effectivebandwidthbaseduponentropy • Fouriertransformofpacketinter-arrival time Calculatedonfullflows Proprietary HandClassified Traces Alargerange ofDatabase, P2P,Buck, Mail,Services, ...traffic Coarsegrained Barnailleetal.[53] SimpleK- Means Packetlengthsofthefirstfewpacketsofbi- directionaltrafficflows Proprietary traces eDonkey, FTP,HTTP, Kazaa,NTP, POP3,SMTP, SSH,HTTPS, POP3S Finegrained (10applications studied) Parketal.[44][44] NaiveBayes withKernel Estimation, Decision TreeJ48and ReducedError PrunningTree • Flowduration • InitialAdvertisedWindowbytes • Numberofactualdatapackets • Numberofpacketswiththeoptionof PUSH • Packetlengths • Advertisedwindowbytes • Packetinter-arrivaltime • Sizeoftotalburstpackets NLANR, USC/ISI, CAIDA WWW, Telnet,Chat (Messenger), FTP,P2P (Kazaa, Gnutella), Multimedia, SMTP,POP, IMAP,NDS, Oracle,X11 N/A(compari- sonwork) NguyenandArmitage [56] Supervised NaiveBayes • Packetlengths(min,max,mean,standard deviation) • Inter-Packetlengthsstatistics(min,max, mean,standarddeviation) • PacketInter-arrivaltimesstatistics(min, max,mean,stddev.) • Calculatedoverasmallnumber(e.g.25 packets)ofconsecutivepackets(classifi- cationwindows)takenatvariouspoints oftheflowlifetime-wherethechanges inflow’scharacteristicsaresignificant Traces collectedat anonline gameserver inAustralia andprovided byUniversity ofTwente, Netherland OnlineGame (Enemy Territory) traffic,Others (HTTP, HTTPS,DNS, NTP,SMTP, Telnet,SSH, P2P...) Application specific(Online Game,UDP based,First PersonShooter, Enemy Territory traffic)72IEEECOMMUNICATIONSSURVEYS&TUTORIALS,VOL.10,NO.4,FOURTHQUARTER2008 TABLEII ASUMMARYOF RESEARCH REVIEWEDIN SECTION IV(CONTINUED) Work MLAlgorithms Features DataTraces TrafficConsid- ered Classification Level NguyenandArmitage [54] NaiveBayes andDecision Treein combination withClustering algorithms forautomated sub-flows selection • Packetlengthsstatistics(min,max,mean, stddev.) • Inter-Packetlengthsstatistics(min,max, mean,stddev.) • PacketInter-arrivaltimesstatistics(min, max,mean,stddev.) • Calculatedoverasmallnumber(e.g.25 packets)ofconsecutivepackets(classifi- cationwindows)takenatvariouspoints oftheflowlifetime-wherethechanges inflow’scharacteristicsaresignificant. • Furtherextensionwithsyntheticmirror- ingfeatures. Traces collectedat anonline gameserver inAustralia andprovided byUniversity ofTwente, Netherland OnlineGame (Enemy Territory) traffic,Others (HTTP, HTTPS,DNS, NTP,SMTP, Telnet,SSH, P2P...) Application specific(Online Game,UDP based,First PersonShooter, Enemy Territory traffic) Ermanetal.[47] K-Means • Totalnumberofpackets • Meanpacketlength • meanpayloadlengthexcludingheaders • Numberofbytestransferred • Flowduration • Meaninter-arrivaltime Self-collected8 1-hourcampus tracesbetween April6-9,2006 Web,P2P,FTP, Others Coarsegrained (29different protocols groupedinto anumberof application categoriesfor studies) Crottietal.[61] Protocol fingerprints (Probability Density Function vectors)and Anomaly score(from protocolPDFs toprotocol fingerprints) • Packetlengths • Inter-arrivaltime • Packetarrivalorder 6-monthself- collectedtraces attheedge gatewayofthe Universityof Bresciadata centrenetwork TCP applications (HTTP,SMTP, POP3,SSH) Finegrained (fourTCP protocols) Haffneretal.[57] NaiveBayes, AdaBoost, Regularized Maximum Entropy Discretebyteencodingof thefirstn-bytespay- loadofaTCPunidirectionalflow Proprietary FTP(control), SMTP,POP3, IMAP,HTTPS, HTTP,SSH Finegrained Maetal.[66] Unsupervised learning (product distribution, Markov processes, and common substring graphs) Discretebyteencodingof thefirstn-bytespay- loadofaTCPunidirectionalflow Proprietary FTP(control), SMTP,POP3, IMAP,HTTPS, HTTP,SSH Finegrained Auldetal.[55] BayesianNeu- ralNetwork 246featuresintotal,including: • Flowmetrics(duration,packet-count,to- talbytes) • Packetinter-arrivaltimestatistics • SizeofTCP/IPcontrolfields • Totalpacketsineachdirectionandtotal forbi-directionalflow • Payloadsize • Effectivebandwidthbaseduponentropy • Top-tenFouriertransformcomponentsof packetinter-arrivaltimesforeachdirec- tion • NumerousTCP-specificvaluesderived fromtcptrace(e.g.totalpayloadbytes transmitted,totalnumberofPUSHED packets,totalnumberofACKpackets carryingSACKinformationetc.) Proprietary handclassified traces Alargerange ofDatabase, P2P,Buck, Mail,Services, Multimedia, Web...traffic CoarsegrainedNGUYENandARMITAGE:ASURVEYOFTECHNIQUESFORINTERNETTRAFFICCLASSIFICATIONUSINGMACHINELEARNING73 TABLEIII ASUMMARYOF RESEARCH REVIEWEDIN SECTION IV(CONTINUED) Work MLAlgorithms Features DataTraces TrafficConsid- ered Classification Level Williamsetal.[65] Naive Bayeswith Discretisation, NaiveBayes withKernel Estimation, C4.5Decision Tree,Bayesian Networkand NaiveBayes Tree • Protocol • Flowduration • Flowvolumeinbytesandpackets • Packetlength(minimum,mean,maxi- mumandstandarddeviation) • Inter-arrivaltimebetweenpackets(mini- mum,mean,maximumandstandardde- viation) NLANR FTP(data), Telnet,SMTP, DNS,HTTP N/A(Compari- sonwork) Ermanetal.[45] K-Means,DB- SCANandAu- toClass • Totalnumberofpackets • Meanpacketlength • Meanpayloadlengthexcludingheaders • Numberofbytestransfered(ineachdi- rectionandcombined) • Meanpacketinter-arrivaltime NLANRanda self-collected 1-hourtrace fromthe Universityof Calgary HTTP,P2P, SMTP,IMAP, POP3,MSSQL, Other N/A(Compari- sonwork) Ermanetal.[64] NaiveBayes andAutoClass • Totalnumberofpackets • Meanpacketlength(ineachdirectionand combined) • Flowduration • Meandatapacketlength • Meanpacketinter-arrivaltime NLANR HTTP,SMTP, DNS,SOCKS, FTP(control), FTP(data), POP3, Limewire N/A(Compari- sonwork) Bonfiglioetal.[67] NaiveBayes andPearson’s Chi-Squatetest • Messagesize(thelengthofthemessage encapsulatedintothetransportlayerpro- tocolsegment) • Averageinterpacketgap Twoselfcol- lecteddatasets Skypetraffic Application specific TABLEIV REVIEWEDWORKINLIGHTOFCONSIDERATIONSFOROPERATIONALTRAFFICCLASSIFICATION Work Real-timeClassification FeatureComputation Overhead ClassifyFlowsIn Progress Directionalneutrality McGregoretal.[48] No Average Notaddressed No Zanderetal.[46] No Average Notaddressed No Roughanetal.[18] No Average Notaddressed N/A MooreandZuev[14] No High Notaddressed No Barnailleetal.[53] Yes Low Notaddressed No Parketal.[44] No Average Notaddressed Notclear NguyenandArmitage[56] Yes Average Yes Yes NguyenandArmitage[54] Yes Average Yes Yes Ermanetal.[47] No Average Notaddressed No Crottietal.[61] Yes Average Notaddressed No Haffneretal.[57] Yes Average Notaddressed N/A Maetal.[66] No Average Notaddressed No Auldetal.[55] No High Notaddressed No Williamsetal.[65] N/A Average N/A N/A Ermanetal.[45] N/A Average N/A N/A Ermanetal.[64] N/A Average N/A N/A Bonfiglioetal.[67] Yes Average Notaddressed Notclear74IEEECOMMUNICATIONSSURVEYS&TUTORIALS,VOL.10,NO.4,FOURTHQUARTER2008 • FeatureComputationOverhead Low: Theworkmakesuseofasmallnumberof features(e.gsizesofthefirstfewpackets,binary encodedofthefirstfewbytesofaunidirectional flow) – Average: Theworkmakesuseofanaveragesetof features(suchaspacketlengthandinter-arrivaltimes statistics,flowduration,bytescount) – High: Theworkmakesuseofalarge(comparatively withotherworkinthearea)includingcomputa- tionalcomplexfeatures(suchasFouriertransform ofpacketinter-arrivaltime) • ContinuousClassification – Notaddressed: Theissuewasnotconsideredinthe work – Yes: Theissuewasconsideredandsolvedinthework • DirectionalNeutrality – No: Theworkmakesuseofbi-directinalflowand featurescalculations,butdidnotconsidertheissue – Yes: Theworkmakesuseofbi-directionalflow andfeaturecalculations,addressedtheissuesand proposedsolution – N/A: Theworkmakesuseofuni-directionalflowand theissueisnotapplicable – Notclear:Notclearlystatedinthepaper V.CONCLUSION Thispapersurveyssignificantworksinthefieldofmachine learningbasedIPtrafficclassificationduringthepeakperiod of2004toearly2007.Motivatedbyadesiretomoveaway fromport-basedorpayload-basedtrafficclassification,itis clearthatMLcanbeappliedwellinthetask.Theuseof anumberofdifferentMLalgorithmsforofflineanalysis, suchasAutoClass,ExpectationMaximisation,DecisionTree, NaiveBayesetc.hasdemonstratedhighaccuracy(upto99%) foravariousrangeofInternetapplicationstraffic.Early MLtechniquesreliedonstatic,offlineanalysisofpreviously capturedtraffic.Morerecentworkisbeginningtoaddress therequirementsforpractical,ML-basedreal-timeIPtraffic classificationinoperationalnetworks.Inthissurveypaper,we haveoutlinedanumberofcriticaloperationalrequirementsfor real-timeclassifiersandqualitativelycritiquedthereviewed worksagainsttheserequirements. Thereisstillalotofroomforfurtherresearchinthefield. Whilemostoftheapproachesbuildtheirclassificationmodels basedonsampledatacollectedat certainpointsoftheInternet, thosemodels’usabilityneedstobecarefullyevaluated.The accuracyevaluatedonthetestdatasetcollectedatthesame pointofmeasurementmightnotbetruewhenbeingappliedin differentpointofmeasurement.Therearestillopenquestions astohowwelltheycanmaintain theirperformanceinthepres- enceofpacketloss,latencyjitter,andpacketfragmentation. EachMLalgorithmmayperformdifferentlytowarddifferent Internetapplications,andmayrequiredifferentparameter configurations.Theuseofacombinationofclassification modelsisworthinvestigating.Parallelprocessingforreal- timeclassificationattrafficaggregationpointsinthenetwork maybeusefulwhentheclassifiersneedtocopewithmillions ofconcurrentflowssimultaneously.Andtheapplicationof MLalgorithmsfornewerapplications(suchasSkype,video streaming,voiceoverIPandpeertopeerfile-sharing)isstill aninterestingopenfield. Nevertheless,thepromisingresultsofML-basedIPtraffic classificationmayopenmanynewavenuesforrelatedresearch areas,suchastheapplicationofMLinintrusiondetections, anomalydetectioninuserdataandcontrol,routingtraffic, andbuildingnetworkprofilesfor proactivenetworkreal-time monitoringandmanagement.Theclassificationoftrafficin greynetordarknetnetworksisalsoaninterestingpossible extensionoftheresearchfield. ACKNOWLEDGMENTS Wewouldliketothanktheanonymousreviewersfor theirveryhelpfulcommentsandfeedbackstoimprovethe manuscript. REFERENCES [1]Snort-Thedefactostandardforintrusiondetection/prevention, http://www.snort.org,asofAugust14,2007. [2]Brointrusiondetectionsystem-Brooverview,http://bro-ids.org,asof August14,2007. [3]V.Paxson,“Bro:Asystemfordetectingnetworkintrudersinreal-time,” ComputerNetworks,no.31(23-24),pp.2435–2463,1999. [4]L.Stewart,G.Armitage,P.Branch, andS.Zander,“Anarchitecturefor automatednetworkcontrolofQoSoverconsumerbroadbandlinks,”in IEEEInternationalRegion10Conference(TENCON05),Melbourne, Australia,November2005. [5]F.Baker,B.Foster,andC.Sharp,“Ciscoarchitectureforlawfulintercept inIPnetworks,”InternetEngineeringTaskForce,RFC3924,2004. [6]T.Karagiannis,A.Broido,N.Brownlee,andK.Claffy,“IsP2Pdying orjusthiding?”in Proc.47thannualIEEEGlobalTelecommuni- cationsConference(Globecom2004),Dallas,Texas,USA,Novem- ber/December2004. [7]S.Sen,O.Spatscheck,andD.Wang,“Accurate,scalableinnetwork identificationofP2Ptrafficusingapplicationsignatures,”in WWW2004, NewYork,NY,USA,May2004. [8]R.Braden,D.Clark,andS.Shenker,“IntegratedservicesintheInternet architecture:anoverview,”RFC1633,IETF,1994. [9]S.Blake,D.Black,M.Carlson,E.Davies,Z.Wang,andW.Weiss,“An architecturefordifferentiatedservices,”RFC2475,IETF,1998. [10]G.Armitage, QualityofServiceInIPNetworks:Foundationsfora Multi-ServiceInternet.MacmillanTechnicalPublishing,April2000. [11]L.Burgstahler,K.Dolzer,C.Hauser,J.Jahnert,S.Junghans,C.Ma- cian,andW.Payer,“Beyondtechnology:themissingpiecesforQoS success,”in RIPQoS’03:Proc.ACMSpecialInterestGrouponData Communication(SIGCOMM)workshoponRevisitingIPQoS.New York,NY,USA:ACMPress,August,2003,pp.121–130. [12]UbicomInc., SolvingPerformanceProblemswithInteractive ApplicationsinaBroadbandEnvironmentusingStreamEngine Technology,http://www.ubicom.com/pdfs/whitepapers/StreamEngine- WP-20041031.pdf,asofAugust14,2007. [13]J.But,N.Williams,S.Zander,L.Stewart,andG.Armitage,“ANGEL- AutomatedNetworkGamesEnhancementLayer,”in Proc.of5thAnnual WorkshoponNetworkandSystemsSupportforGames(Netgames)2006, Singapore,October2006. [14]A.MooreandD.Zuev,“InternettrafficclassificationusingBayesian analysistechniques,”in ACMInternationalConferenceonMeasure- mentandModelingofComputerSystems(SIGMETRICS)2005,Banff, Alberta,Canada,June2005. [15]T.Karagiannis,K.Papagiannaki,andM.Faloutsos,“Blinc:Multilevel trafficclassificationinthedark,”in Proc.oftheSpecialInterestGroup onDataCommunicationconference(SIGCOMM)2005,Philadelphia, PA,USA,August2005. [16]J.Erman,A.Mahanti,andM.Arlitt,“Byteme:acaseforbyte accuracyintrafficclassification,”in MineNet’07:Proc.3rdannual ACMworkshoponMiningnetworkdata.NewYork,NY,USA:ACM Press,June2007,pp.35–38. [17] InternetAssignedNumbersAuthority(IANA), http://www.iana.org/assignments/port-numbers,asofAugust14,2007.NGUYENandARMITAGE:ASURVEYOFTECHNIQUESFORINTERNETTRAFFICCLASSIFICATIONUSINGMACHINELEARNING75 [18]M.Roughan,S.Sen,O.Spatscheck,andN.Duffield,“Class-of-service mappingforQoS:Astatisticalsignature-basedapproachtoIPtraffic classification,”in Proc.ACM/SIGCOMMInternetMeasurementCon- ference(IMC)2004,Taormina,Sicily,Italy,October2004. [19]H.Schulzrinne,S.Casner,R. Frederick,andV.Jacobson,“RTP:A TransportProtocolforReal-TimeApplications,”RFC1889,IETF,1996. [20]A.MooreandK.Papagiannaki,“Towardtheaccurateidentification ofnetworkapplications,”in Proc.PassiveandActiveMeasurement Workshop(PAM2005),Boston,MA,USA,March/April2005. [21]A.MadhukarandC.Williamson,“AlongitudinalstudyofP2Ptraffic classification,”in 14thIEEEInternationalSymposiumonModeling, Analysis,andSimulationofComputerandTelecommunicationSystems, September2006. [22]V.Paxson,“Empiricallyderivedanalyticmodelsofwide-areaTCP connections,” IEEE/ACMTrans.Networking,vol.2,no.4,pp.316–336, 1994. [23]C.Dewes,A.Wichmann,andA.Feldmann,“AnanalysisofInternet chatsystems,”in ACM/SIGCOMMInternetMeasurementConference 2003,Miami,Florida,USA,October2003. [24]K.Claffy,“Internettrafficcharacterisation,”PhDThesis,Universityof California,SanDiego,1994. [25]T.Lang,G.Armitage,P.Branch, andH.-Y.Choo,“Asynthetictraffic modelforHalf-life,”in Proc.AustralianTelecommunicationsNetworks andApplicationsConference2003ATNAC2003,Melbourne,Australia, December2003. [26]T.Lang,P.Branch,andG.Armitage,“Asynthetictrafficmodelfor Quake3,”in Proc.ACMSIGCHIInternationalConferenceonAdvances incomputerentertainmenttechnology(ACE2004),Singapore,June 2004. [27]Z.Shi, PrinciplesofMachineLearning.InternationalAcademic Publishers,1992. [28]H.Simon,“Whyshouldmachineslearn?”in R.S.Michalski,J.G. Carbonell,andT.M.Mitchell(editors)MachineLearning:AnArtificial IntelligenceApproach. MorganKaufmann,1983. [29]I.WittenandE.Frank, DataMining:PracticalMachineLearningTools andTechniqueswithJavaImplementations(SecondEdition).Morgan KaufmannPublishers,2005. [30]B.Silver,“Netman:Alearningnetworktrafficcontroller,”in Proc.Third InternationalConferenceonIndustrialandEngineeringApplicationsof ArtificialIntelligenceandExpertSystems,AssociationforComputing Machinery,1990. [31]J.Frank,“Machinelearningandintrusiondetection:Currentandfuture directions,”in Proc.National17thComputerSecurityConference, Washington,D.C.,October1994. [32]Y.ReichandJ.S.Fenves,“Theformationanduseofabstractconcepts indesign,”in Fisher,D.H.andPazzani,M.J.(editors),ConceptForma- tion:KnowledgeandExperienceinUnsupervisedLearning.Morgan Kaufmann,1991. [33]H.D.Fisher,J.M.Pazzani,andP.Langley, ConceptFormation:Knowl- edgeandExperienceinUnsupervisedLearning.MorganKaufmann, 1991. [34]R.Duda,P.Hart,andD.Stork, PatternClassification(2ndedition). JWiley-Interscience,2001. [35]O.CarmichaelandM.Hebert,“Shape-basedrecognitionofwiryob- jects,” IEEETrans.PatternAnal.MachineIntell.,vol.26,no.12,pp. 1537–1552,2004. [36]W.Rand,“Objectivecriteriafortheevaluationofclusteringmethods,” J.AmericanStatisticalAssociation,vol.66,no.336,pp.846–850,1971. [37]M.Halkidi,Y.Batistakis,andM.Vazirgiannis,“Clustervaliditymeth- ods:partI,” SIGMODRec.,vol.31,no.2,pp.40–45,2002. [38]R.XuandD.Wunsch,“Surveyofclusteringalgorithms,” IEEETrans. NeuralNetworks,no.Vol.16,Issue3,pp.645–678,May2005. [39]M.Halkidi,Y.Batistakis,andM.Vazirgiannis,“Clusteringvalidity checkingmethods:partII,” SIGMODRec.,vol.31,no.3,pp.19–27, 2002. [40]M.HallandG.Holmes,“Benchmarkingattributeselectiontechniques fordiscreteclassdatamining,” IEEETrans.KnowledgeDataEng., vol.15,no.6,pp.1437–1447,2003. [41]D.Goldberg, GeneticAlgorithmsinSearch,OptimizationandMachine Learning.Boston,MA,USA:Addison-WesleyLongmanPublishing Co.,Inc.,1989. [42]R.KohaviandG.H.John,“Wrappersforfeaturesubsetselection,” ArtificialIntelligent,vol.97,no.1-2,pp.273–324,1997. [43]P.H.Winston, Artificialintelligence(2nded.).Boston,MA,USA: Addison-WesleyLongmanPublishingCo.,Inc.,1984. [44]J.Park,H.-R.Tyan,andK.C.-C.J.,“Internettrafficclassification forscalableQoSprovision,”in IEEEInternationalConferenceon MultimediaandExpo,Toronto,Ontario, Canada,July2006. [45]J.Erman,M.Arlitt,andA.Mahanti,“Trafficclassificationusingclus- teringalgorithms,”in MineNet’06:Proc.2006SIGCOMMworkshop onMiningnetworkdata.NewYork,NY,USA:ACMPress,2006,pp. 281–286. [46]S.Zander,T.Nguyen,andG.Armitage,“Automatedtrafficclassifi- cationandapplicationidentificationusingmachinelearning,”in IEEE 30thConferenceonLocalComputerNetworks(LCN2005),Sydney, Australia,November2005. [47]J.Erman,A.Mahanti,M.Arlitt,andC.Williamson,“Identifyingand discriminatingbetweenwebandpeer-to-peertrafficinthenetworkcore,” in WWW’07:Proc.16thinternationalconferenceonWorldWideWeb. Banff,Alberta,Canada:ACMPress,May2007,pp.883–892. [48]A.McGregor,M.Hall,P.Lorier,andJ.Brunskill,“Flowclusteringusing machinelearningtechniques,”in Proc.PassiveandActiveMeasurement Workshop(PAM2004),AntibesJuan-les-Pins,France,April2004. [49]A.Dempster,N.Laird,andD.Rubin,“Maximumlikelihoodfrom incompletedataviatheEMalgorithm,” J.RoyalStatisticalSociety, vol.30,no.1,1997. [50]P.CheesemanandJ.Stutz,“Bayesianclassification(AutoClass):Theory andresults,”in AdvancesinKnowledgeDiscoveryandDataMining, 1996. [51]NetMate,http://sourceforge.net/projects/netmate-meter/,asofAugust 14,2007. [52]TheNationalLaboratoryforAppliedNetworkResearch(NLANR), TrafficMeasurementDataRepository,http://pma.nlanr.net/Special/,as ofAugust14,2007. [53]L.Bernaille,R.Teixeira,I.Akodkenou,A.Soule,andK.Salamatian, “Trafficclassificationonthefly,” ACMSpecialInterestGroupon DataCommunication(SIGCOMM)ComputerCommunicationReview, vol.36,no.2,2006. [54]T.NguyenandG.Armitage,“Syntheticsub-flowpairsfortimelyand stableIPtrafficidentification,”in Proc.AustralianTelecommunication NetworksandApplicationConference,Melbourne,Australia,December 2006. [55]T.Auld,A.W.Moore,andS.F.Gull,“Bayesianneuralnetworksfor Internettrafficclassification,” IEEETrans.NeuralNetworks,no.1,pp. 223–239,January2007. [56]T.NguyenandG.Armitage,“Trainingonmultiplesub-flowstooptimise theuseofMachineLearningclassifiersinreal-worldIPnetworks,” in Proc.IEEE31stConferenceonLocalComputerNetworks,Tampa, Florida,USA,November2006. [57]P.Haffner,S.Sen,O.Spatscheck,andD.Wang,“ACAS:automated constructionofapplicationsignatures,”in MineNet’05:Proceeding ofthe2005ACMSIGCOMMworkshoponMiningnetworkdata. Philadelphia,Pennsylvania,USA:ACMPress,August2005,pp.197– 202. [58]TheUniversityofTwente, TrafficMeasurementDataRepository, http://arch.cs.utwente.nl/projects/m2c/m2c-D15.pdf,asof17thAugust 2007. [59] WolfensteinEnemyTerritory,http://www.enemyterritory.com/,asof17th August2007. [60]J.Park,H.-R.Tyan,andK.C.-C.J.,“GA-BasedInternetTrafficClassi- ficationTechniqueforQoSProvisioning,”in Proc.2006International ConferenceonIntelligentInformationHidingandMultimediaSignal Processing,Pasadena,California,December2006. [61]M.Crotti,M.Dusi,F.Gringoli,andL.Salgarelli,“Trafficclassification throughsimplestatisticalfingerprinting,” SIGCOMMComput.Commun. Rev.,vol.37,no.1,pp.5–16,2007. [62]J.Erman,A.Mahanti,M.Arlitt,I.Cohen,andC.Williamson,“Semi- supervisednetworktrafficclassification,” ACMInternationalConference onMeasurementandModelingofComputerSystems(SIGMETRICS) PerformanceEvaluationReview,vol.35,no.1,pp.369–370,2007. [63]——,“Offline/realtimenetworktrafficclassificatioinusingsemi- supervisedlearning,”DepartmentofComputerScience,Universityof Calgary,Tech.Rep.,February2007. [64]J.Erman,A.Mahanti,andM.Arlitt,“Internettrafficidentification usingmachinelearningtechniques,”in Proc.of49thIEEEGlobal TelecommunicationsConference(GLOBECOM2006),SanFrancisco, USA,December2006. [65]N.Williams,S.Zander,andG.Armitage,“Apreliminaryperformance comparisonoffivemachinelearningalgorithmsforpracticalIPtraffic flowclassification,” SpecialInterestGrouponDataCommunication (SIGCOMM)ComputerCommunicationReview,vol.36,no.5,pp.5– 16,2006. [66]J.Ma,K.Levchenko,C.Kreibich,S.Savage,andG.Voelker,“Un- expectedmeansofprotocolinference,”in IMC’06:Proc.6thACM SIGCOMMonInternetmeasurement.RiodeJaneriro,Brazil:ACM Press,October2006,pp.313–326.76IEEECOMMUNICATIONSSURVEYS&TUTORIALS,VOL.10,NO.4,FOURTHQUARTER2008 [67]D.Bonfiglio,M.Mellia,M.Meo,D.Rossi,andP.Tofanelli,“Revealing Skypetraffic:whenrandomnessplayswithyou,”in SIGCOMM’07: Proc.2007conferenceonApplications,technologies,architectures,and protocolsforcomputercommunications.NewYork,NY,USA:ACM, August2007,pp.37–48. [68]N.Williams,S.Zander,andG.Armitage,“Evaluatingmachinelearning methodsforonlinegametrafficidentification,”CentreforAdvancedIn- ternetArchitectures,http://caia.swin.edu.au/reports/060410C/CAIA-TR- 060410C.pdf,Tech.Rep.060410C,asofAugust14,2007. ThuyT.T.Nguyen ([email protected])receivedtheB.Eng.Dip.Prac.in telecommunicationsengineeringfromtheUniversityofTechnology,Sydney in2002withFirstClassHonour.SheisnowaPhDcandidateattheCentre forAdvancedInternetArchitectures,SwinburneUniversityofTechnology, Melbourne,Australia.HerresearchinterestsincludeInternettrafficcharac- terizationandclassification,Internetpricingandchargingsystems,QoSand performanceevaluationofbroadbandandwirelessnetworks.Sheisamember ofIEEECommunicationsSociety. GrenvilleArmitage [M03]([email protected])earnedaB.Eng (Elec)(Hons)in1988andaPhDinelectronicengineeringin1994,bothfrom theUniversityofMelbourne,Australia.Since2002hehasbeenAssociate ProfessorofTelecommunicationsEngineeringandDirectoroftheCentre forAdvancedInternetArchitecturesat SwinburneUniversityofTechnology, Melbourne,Australia.Heauthored QualityofServiceInIPNetworks: FoundationsforaMulti-ServiceInternet(MacmillanTechnicalPublishing, April2000)andco-authoredNetworkingandOnlineGames-Understanding andEngineeringMultiplayerInternetGames(JohnWiley&Sons,UK,April 2006).AssociateProfessorArmitageisalsoamemberofACMandACM SIGCOMM.