
Clustering using firefly algorithm: Performance study (3)

Published: 2021-06-07   Source: unknown

Firefly algorithm

166 J. Senthilnath et al. / Swarm and Evolutionary Computation 1 (2011) 164–171

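where $K$ is the number of clusters, $x_i$ ($i = 1, \ldots, n$) is the location of the $i$th of the $n$ given patterns, and $c_k$ ($k = 1, \ldots, K$) is the $k$th cluster center, to be found by Eq. (6):

$$c_k = \sum_{i \in C_k} \frac{x_i}{n_k} \qquad (6)$$

where $n_k$ is the number of patterns in the $k$th cluster.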
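
Eq. (6) can be illustrated with a minimal sketch, assuming the patterns are plain lists of numbers and each pattern already carries a cluster label. The function name `cluster_centers` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def cluster_centers(patterns, assignments):
    """Return {k: c_k}, where c_k is the mean of the patterns assigned to cluster k (Eq. 6)."""
    sums = {}
    counts = defaultdict(int)
    for x, k in zip(patterns, assignments):
        if k not in sums:
            sums[k] = [0.0] * len(x)
        for dim, value in enumerate(x):
            sums[k][dim] += value
        counts[k] += 1  # n_k, the number of patterns in cluster k
    return {k: [s / counts[k] for s in sums[k]] for k in sums}


# Toy data (hypothetical): three 2-D patterns in two clusters.
patterns = [[1.0, 2.0], [3.0, 4.0], [10.0, 10.0]]
assignments = [0, 0, 1]
centers = cluster_centers(patterns, assignments)
```

Here cluster 0 holds two patterns, so its center is their component-wise average, while cluster 1 holds a single pattern and its center coincides with that pattern.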

Cluster analysis assigns the data set to clusters so that patterns are grouped into the same cluster based on some similarity measure [23]. Distance measurement is most widely used for evaluating similarities between patterns. The cluster centers are the decision variables, which are obtained by minimizing the sum of Euclidean distances, over all training set instances in the $d$-dimensional space, between a generic instance $x_i$ and the center of the cluster $c_k$. The cost (objective) function for the pattern $i$ is given by Eq. (7), as in [9,14]:

$$f_i = \frac{1}{D_{Train}} \sum_{j=1}^{D_{Train}} d\left(x_j, p_i^{CL_{known}(x_j)}\right) \qquad (7)$$

where $D_{Train}$ is the number of training data set instances, which is used to normalize the sum so that any distance ranges within [0.0, 1.0], and $p_i^{CL_{known}(x_j)}$ defines the class that the instance belongs to according to the database.

Note that in our FA algorithm, the decision variables are the cluster centers. The objective function in our FA algorithm is given by Eq. (7). In our study, we consider the standard 13 benchmark problems given in [14]. For a given data set, let $n$ be the number of data points, $d$ be the dimension, and $c$ be the number of classes. A given data point belongs to only one of these $c$ classes. Of the given data set, 75% is randomly selected to obtain the cluster centers using Eq. (7). In this way we obtain the cluster centers for all the $c$ classes. The remaining 25% of the data set (called the test data set) is used to obtain the classification error percentage (CEP). An illustrative example of this FA algorithm and its performance measure is given in the next section.
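
The objective in Eq. (7) can be sketched as follows, assuming a candidate solution stores one center per class and the distance $d$ is the Euclidean distance. The names `objective`, `candidate`, and the toy data are hypothetical. The sketch does not rescale the features, so the result lies in [0.0, 1.0] only if the data have been normalized beforehand.

```python
import math


def objective(centers, train_x, train_labels):
    """Mean Euclidean distance from each training instance to the center of its known class (Eq. 7)."""
    total = 0.0
    for x, label in zip(train_x, train_labels):
        total += math.dist(x, centers[label])  # distance to the center of the known class
    return total / len(train_x)  # normalize by D_Train


# Toy data (hypothetical): three 2-D training instances in two classes.
train_x = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0]]
train_labels = [0, 0, 1]
candidate = {0: [0.0, 1.0], 1: [4.0, 0.0]}  # one candidate set of cluster centers
cost = objective(candidate, train_x, train_labels)
```

A lower `cost` means the candidate centers sit closer to the training instances of their own classes, which is what the minimization seeks.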

4. Performance measures and an illustrative example

As discussed in the earlier section, the cluster centers are obtained from the training data set. Using these cluster centers, the testing data set is classified and the classification performance is analyzed.

4.1. Performance evaluation

The performance of the extracted knowledge in the form of cluster centers by the FA is evaluated using Classification Error Percentage (CEP) and classification efficiency. CEP depends only on test data, and the classification efficiency depends on both training and testing data.

4.1.1. Classification Error Percentage (CEP)

CEP is obtained using only the test data [9]. For each problem, we report the CEP, which is the percentage of incorrectly classified patterns of the test data sets as given in [9], to make a reliable comparison.

The classification of each pattern is done by assigning it to the class whose cluster center is closest to it. Then, the classified output is compared with the desired output and, if they are not exactly the same, the pattern is marked as misclassified [9]. This procedure is applied to all test data, and the total number of misclassified patterns is expressed as a percentage of the size of the test data set, which is given by

$$\mathrm{CEP} = \frac{\text{number of misclassified samples}}{\text{total size of test data set}} \times 100. \qquad (8)$$
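
The nearest-center rule and Eq. (8) can be sketched as below, assuming Euclidean distance and one center per class. The function names `classify` and `cep`, the class centers, and the test points are hypothetical values chosen for illustration, not results from the paper.

```python
import math


def classify(x, centers):
    """Assign x to the class whose cluster center is nearest in Euclidean distance."""
    return min(centers, key=lambda label: math.dist(x, centers[label]))


def cep(test_x, test_labels, centers):
    """Classification Error Percentage over the test data set (Eq. 8)."""
    misclassified = sum(
        1 for x, label in zip(test_x, test_labels) if classify(x, centers) != label
    )
    return 100.0 * misclassified / len(test_x)


# Toy data (hypothetical): two class centers and four labeled test points.
centers = {1: [8.0, 8.0], 2: [16.0, 16.0]}
test_x = [[7.0, 9.0], [15.0, 17.0], [13.0, 13.0], [9.0, 8.0]]
test_labels = [1, 2, 1, 1]
error = cep(test_x, test_labels, centers)
```

In this toy case the point `[13.0, 13.0]` is labeled class 1 but lies closer to the class 2 center, so one of the four test points is misclassified.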

Fig. 1. Data distribution (training and testing data for Class 1 and Class 2, plotted as y against x).

4.1.2. Classification efficiency

Classification efficiency is obtained using both the training and test data. The classification matrix is used to obtain the statistical measures for the class-level performance (individual efficiency) and the global performance (average and overall efficiency) of the classifier [24]. The individual efficiency is indicated by the percentage classification, which tells us how many samples belonging to a particular class have been correctly classified. The percentage classification ($\eta_i$) for the class $c_i$ is given by Eq. (9).

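$$\eta_i = \frac{q_{ii}}{\sum_{j=1}^{n} q_{ji}} \qquad (9)$$

where $q_{ii}$ is the number of correctly classified samples and $n$ is the number of samples for the class $c_i$ in the data set. The global performance measures are the average ($\eta_a$) and overall ($\eta_o$) classification, which are defined as

$$\eta_a = \frac{1}{n_c} \sum_{i=1}^{n_c} \eta_i \qquad (10)$$

$$\eta_o = \frac{1}{N} \sum_{i=1}^{n_c} q_{ii} \qquad (11)$$

where $n_c$ is the total number of classes and $N$ is the number of patterns.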
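
Eqs. (9)–(11) can be sketched from a classification matrix as follows. This sketch assumes that `q[j][i]` counts the samples of true class $i$ that were assigned to class $j$, so that the denominator of Eq. (9) is the total number of samples belonging to class $c_i$; that indexing convention, the function name `efficiencies`, and the example matrix are assumptions made for illustration. The values are returned as fractions; multiply by 100 for percentages.

```python
def efficiencies(q):
    """Return (individual, average, overall) efficiency from a square classification matrix.

    Assumed convention: q[j][i] = number of samples of true class i assigned to class j.
    """
    n_c = len(q)                         # total number of classes
    total = sum(sum(row) for row in q)   # N, the number of patterns
    individual = []
    for i in range(n_c):
        class_size = sum(q[j][i] for j in range(n_c))  # all samples of class i
        individual.append(q[i][i] / class_size)        # Eq. (9)
    average = sum(individual) / n_c                    # Eq. (10)
    overall = sum(q[i][i] for i in range(n_c)) / total  # Eq. (11)
    return individual, average, overall


# Hypothetical two-class matrix with 50 samples per class.
q = [[45, 10],
     [5, 40]]
eta_i, eta_a, eta_o = efficiencies(q)
```

With equal class sizes, as in this example, the average and overall efficiencies coincide; they differ when the classes are unbalanced, because the overall measure weights each class by its size.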

4.2. Illustrative example

We illustrate how the Firefly Algorithm (FA) is used for clustering with the following synthetic data. Although the proposed algorithm can be used for any type of mixture model, we focus on a Gaussian mixture. Let us consider two Gaussian mixtures that have two input features, namely $x$ and $y$. Here, the mean values $\mu_1 = [8, 8]^T$ and $\mu_2 = [16, 16]^T$ and the covariance matrix of $(x, y)$, $\begin{pmatrix} 6 & 3 \\ 3 & 2 \end{pmatrix}$, are assumed, and each class has an equal number of samples. In our experimentation, 100 samples are generated randomly for each class. Of these, 75 data points are used for training and the remaining 25 are used for testing in each class. The synthetic data generated is shown in Fig. 1.
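
A data set of this kind can be generated with a short sketch like the one below, which uses only the Python standard library and draws correlated samples through the Cholesky factor of the covariance matrix. It assumes that both classes share the stated covariance matrix. The seed, the function name `sample_gaussian_2d`, and the choice of taking the first 75 samples of each class for training are arbitrary assumptions, so the points will not reproduce those in Fig. 1.

```python
import math
import random


def sample_gaussian_2d(mean, cov, count, rng):
    """Draw `count` points from a 2-D Gaussian using the Cholesky factor of `cov`."""
    a = math.sqrt(cov[0][0])
    b = cov[1][0] / a
    c = math.sqrt(cov[1][1] - b * b)
    points = []
    for _ in range(count):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)  # independent standard normals
        points.append((mean[0] + a * z1, mean[1] + b * z1 + c * z2))
    return points


rng = random.Random(0)  # arbitrary seed, for repeatability only
cov = [[6.0, 3.0], [3.0, 2.0]]
means = {1: (8.0, 8.0), 2: (16.0, 16.0)}

train, test = [], []
for label, mean in means.items():
    points = sample_gaussian_2d(mean, cov, 100, rng)
    train += [(p, label) for p in points[:75]]  # 75 training points per class
    test += [(p, label) for p in points[75:]]   # 25 testing points per class
```

Because the samples are independent draws, taking the first 75 of each class for training is equivalent to a random 75/25 split within that class.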

We use the firefly algorithm on the training data to obtain the cluster centers. Let $x_i$ be one of the solutions (cluster centers) and $J_i$ be the objective function value for this cluster center.

We consider a population size of 5 fireflies at locations $x_1$, $x_2$, $x_3$, $x_4$ and $x_5$ within a 2d-dimensional search space. Now evaluate the fitness of the population, $J_1$, $J_2$, $J_3$, $J_4$ and $J_5$, using Eq. (7), which is directly proportional to the light intensities $I_1$, $I_2$, $I_3$, $I_4$ and $I_5$. Now compare the intensity values of the fireflies: if $I_2 < I_1$, then move firefly 2 toward firefly 1 using Eq. (4); similarly compare all the agents
