
Recommender Systems: The Netflix Prize-Winning Algorithm (9)

Published: 2021-06-07   Source: unknown

The algorithm that won the Netflix Prize for recommender systems

representation). The movie representation is still based on the timeSVD++ model. The resulting RMSE is also 0.8661.

Finally, we added k-NN features on top of the timeSVD++ features. That is, for each u-i pair, we found the top 20 movies most similar to i, which were rated by u. We added the movie scores, each multiplied by their respective similarities as additional features. Similarities here were shrunk Pearson correlations [1]. This slightly reduces the RMSE to 0.8660.

Another usage of GBDT is for solving a regression problem per movie. For each user we computed a 50-D characteristic vector formed by the values of the 50 hidden units of a respective RBM. Then, for each movie we used GBDT for solving the regression problem of linking the 50-D user vectors to the true user ratings of the movie. The result, with RMSE=0.9248, will be denoted as [PQ7] in the following description.
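The k-NN features described above can be illustrated with a minimal sketch. It is not the authors' implementation: the shrinkage form n/(n+λ)·ρ is one common way to shrink a Pearson correlation, the value of `shrink`, the data layout, and the function names are assumptions, and the "movie score" is assumed here to be the user's rating of the neighboring movie.

```python
import math

def shrunk_pearson(ratings_i, ratings_j, shrink=100.0):
    """Pearson correlation of two movies over their common raters,
    shrunk toward zero when few users rated both.
    ratings_*: dict user -> rating. `shrink` is an assumed constant."""
    common = sorted(set(ratings_i) & set(ratings_j))
    n = len(common)
    if n < 2:
        return 0.0
    xs = [ratings_i[u] for u in common]
    ys = [ratings_j[u] for u in common]
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    rho = cov / math.sqrt(vx * vy)
    return n / (n + shrink) * rho

def knn_features(user, movie, by_movie, k=20, shrink=100.0):
    """Return k features for a (user, movie) pair: for the k movies most
    similar to `movie` among those rated by `user`, similarity * score.
    by_movie: dict movie -> {user: rating}. Padded with zeros to length k."""
    neighbors = []
    for other, raters in by_movie.items():
        if other != movie and user in raters:
            sim = shrunk_pearson(by_movie[movie], raters, shrink)
            neighbors.append((sim, other))
    neighbors.sort(reverse=True)  # most similar first
    feats = [sim * by_movie[other][user] for sim, other in neighbors[:k]]
    return feats + [0.0] * (k - len(feats))

# Toy data (hypothetical): three movies rated by four users.
toy = {
    "A": {1: 5, 2: 3, 3: 4, 4: 1},
    "B": {1: 4, 2: 2, 3: 5, 4: 1},
    "C": {1: 1, 2: 4, 3: 2, 4: 5},
}
features = knn_features(user=1, movie="A", by_movie=toy, k=2)
```

In this toy case movie B correlates positively with A and movie C negatively, so the first feature is positive and the second negative. With only four common raters, both are shrunk strongly toward zero.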

B. List of BellKor's Probe-Qualifying pairs

We list the 24 BellKor predictors which participated in the GBDT blending. Notice that many more of our predictors are in the final blend of Qualifying results (as mentioned earlier in this article). However, only for those listed below we possess corresponding Probe results, which require extra computational resources to fully re-train the model while excluding the Probe set from the training set.

Post Progress Prize 2008 predictors

Those were mentioned earlier in this document:
1) PQ1
2) PQ2
3) PQ3
4) PQ4
5) PQ5
6) PQ6
7) PQ7

Progress Prize 2008 predictors

The following is based on our notation in [3]:
8) SVD++(1) (f=200)
9) Integrated (f=100, k=300)
10) SVD++(3) (f=500)
11) First neighborhood model of Sec. 2.2 of [3] (RMSE=0.9002)
12) A neighborhood model mentioned towards the end of Sec. 2.2 of [3] (RMSE=0.8914)

Progress Prize 2007 predictors

The following is based on our notation in [2]:
13) Predictor #40
14) Predictor #35
15) Predictor #67
16) Predictor #75
17) NNMF (60 factors) with adaptive user factors
18) Predictor #81
19) Predictor #73
20) 100 neighbors User-kNN on residuals of all global effects but the last 4
21) Predictor #85
22) Predictor #45
23) Predictor #83
24) Predictor #106

One last predictor with RMSE=0.8713 is in the final blend. It is based on the blending technique described in page 12 of [3]. The technique was applied to the four predictors indexed above by: 2, 9, 12, and 13.
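The general idea of GBDT blending over Probe predictions, in which the 24 listed predictors took part, can be illustrated with a toy sketch. This is not the authors' blender: it uses depth-1 trees (stumps) under squared loss, and the number of trees, the learning rate, the function names, and the data are all assumptions chosen for illustration.

```python
import math

def fit_stump(X, residuals):
    """Find the single split (feature, threshold, left mean, right mean)
    that minimises squared error on the residuals; None if no split exists."""
    best = None
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[f] <= thr]
            right = [r for row, r in zip(X, residuals) if row[f] > thr]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, f, thr, lm, rm)
    return None if best is None else best[1:]

def fit_gbdt(X, y, n_trees=50, learning_rate=0.1):
    """Gradient boosting for squared loss: each stump fits the current
    residuals and is added with a small step size (assumed values)."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [t - p for t, p in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        f, thr, lm, rm = stump
        stumps.append(stump)
        preds = [p + learning_rate * (lm if row[f] <= thr else rm)
                 for row, p in zip(X, preds)]
    return base, stumps, learning_rate

def predict(model, row):
    """Blend one row of predictor outputs with the fitted model."""
    base, stumps, lr = model
    return base + sum(lr * (lm if row[f] <= thr else rm)
                      for f, thr, lm, rm in stumps)

def rmse(pred, truth):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))

# Hypothetical Probe data: each row holds two predictors' outputs for one rating.
probe_preds = [[3.1, 3.4], [4.2, 3.9], [1.8, 2.5], [4.8, 4.4], [2.9, 2.6], [3.6, 4.1]]
probe_truth = [3, 4, 2, 5, 3, 4]
model = fit_gbdt(probe_preds, probe_truth)
blended = [predict(model, row) for row in probe_preds]
mean_only = [model[0]] * len(probe_truth)
```

In this sketch the blender is fit on Probe predictions with known ratings and would then be applied to the matching Qualifying predictions, which is why each predictor needs a Probe-Qualifying pair.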

VIII. CONCLUDING REMARKS

Granting the grand prize celebrates the conclusion of the Netflix Prize competition. Wide participation, extensive press coverage and many publications all reflect the immense success of the competition. Dealing with movies, a subject close to the hearts of many, was definitely a good start. Yet, much could go wrong, but did not, thanks to several enabling factors.

The first success factor is on the organizational side – Netflix. They did a great service to the field by releasing a precious dataset, an act which is so rare, yet courageous and important to the progress of science. Beyond this, both design and conduct of the competition were flawless and non-trivial. For example, the size of the data was right on target. Much larger and more representative than comparable datasets, yet small enough to make the competition accessible to anyone with a commodity PC. As another example, I would mention the split of the test set into three parts: Probe, Quiz, and Test, which was essential to ensure the fairness of the competition. Despite being planned well ahead, it proved to be a decisive factor at the very last minute of the competition, three years later.

The second success factor is the wide engagement of many competitors. This created positive buzz, leading to further enrollment of many more. Much was said and written on the collaborative spirit of the competitors, which openly published and discussed their innovations on the web forum and through scientific publications. The feeling was of a big community progressing together, making the experience more enjoyable and efficient to all participants. In fact, this facilitated the nature of the competition, which proceeded like a long marathon, rather than a series of short sprints.

Another helpful factor was some touch of luck. The most prominent one is the choice of the 10% improvement goal. Any small deviation from this number would have made the competition either too easy or impossibly difficult. In addition, the goddess of luck ensured most suspenseful finish lines in both the 2007 Progress Prize and the 2009 Grand Prize, matching the best sports events.

The science of recommender systems is a prime beneficiary of the contest. Many new people became involved in the field and made their contributions. There is a clear spike in related publications, and the Netflix dataset is the direct catalyst to developing some of the better algorithms known in the field. Out of the numerous new algorithmic contributions, I would like to highlight one – those humble baseline predictors (or biases), which capture main effects in the data. While the literature mostly concentrates on the more sophisticated algorithmic aspects, we have learned that an accurate treatment of main effects is probably at least as significant as coming up with modeling breakthroughs.
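To make the notion of baseline predictors concrete, here is a minimal sketch of the simplest form, a global mean plus a per-item and a per-user offset. The regularization constants, the function names, and the toy ratings are assumptions for illustration. The time-dependent biases used in the actual solution are considerably richer than this.

```python
from collections import defaultdict

def fit_baselines(ratings, reg_item=25.0, reg_user=10.0):
    """Estimate mu, item biases, then user biases from (user, item, rating)
    triples. Each bias is a shrunk average of residuals; the regularization
    constants are assumed values, not taken from the text."""
    mu = sum(r for _, _, r in ratings) / len(ratings)

    item_sum, item_cnt = defaultdict(float), defaultdict(int)
    for _, i, r in ratings:
        item_sum[i] += r - mu
        item_cnt[i] += 1
    b_i = {i: item_sum[i] / (reg_item + item_cnt[i]) for i in item_cnt}

    user_sum, user_cnt = defaultdict(float), defaultdict(int)
    for u, i, r in ratings:
        user_sum[u] += r - mu - b_i[i]  # residual after the item effect
        user_cnt[u] += 1
    b_u = {u: user_sum[u] / (reg_user + user_cnt[u]) for u in user_cnt}
    return mu, b_u, b_i

def predict_baseline(model, user, item):
    """Baseline estimate mu + b_u + b_i; unseen users or items get bias 0."""
    mu, b_u, b_i = model
    return mu + b_u.get(user, 0.0) + b_i.get(item, 0.0)

# Hypothetical ratings: movie m1 is rated higher than m2 by every user.
toy_ratings = [("u1", "m1", 5), ("u1", "m2", 3), ("u2", "m1", 4),
               ("u2", "m2", 2), ("u3", "m1", 5)]
baseline = fit_baselines(toy_ratings)
```

More sophisticated models are then typically fit to what these main effects leave unexplained.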

Finally, we were lucky to win this competition, but recognize the important contributions of the many other contestants,
