Partial reference solutions to exercises from selected chapters of The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Second Edition).
The Elements of Statistical Learning – Chapter 4
oxstar@SJTU, January 6, 2011
Ex. 4.1 Show how to solve the generalized eigenvalue problem $\max_a a^T B a$ subject to $a^T W a = 1$ by transforming it to a standard eigenvalue problem.
Answer. $W$ is the common covariance matrix; assume it is positive definite (so that $W^{1/2}$ exists and is invertible) and define
\[
b = W^{1/2}a, \qquad a = W^{-1/2}b, \qquad a^T = b^T W^{-1/2}.
\]
Hence the generalized eigenvalue problem becomes
\[
\max_a\, a^T B a = \max_b\, b^T W^{-1/2} B W^{-1/2} b
\]
subject to
\[
a^T W a = b^T W^{-1/2}\, W\, W^{-1/2} b = b^T b = 1.
\]
So the problem is transformed into a standard eigenvalue problem: the maximizer is the leading eigenvector $b$ of $W^{-1/2} B W^{-1/2}$, and $a = W^{-1/2} b$.
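A quick numerical sanity check of this reduction (a minimal NumPy sketch; the matrices $B$ and $W$ below are random stand-ins for the between- and within-class covariance matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
p = 5
# Random symmetric B and positive definite W, for illustration only.
A = rng.standard_normal((p, p))
B = (A + A.T) / 2
C = rng.standard_normal((p, p))
W = C @ C.T + p * np.eye(p)

# W^{-1/2} via the eigendecomposition of W.
w, V = np.linalg.eigh(W)
W_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T

# Standard eigenvalue problem for W^{-1/2} B W^{-1/2}.
M = W_inv_sqrt @ B @ W_inv_sqrt
vals, vecs = np.linalg.eigh(M)
b = vecs[:, -1]              # eigenvector of the largest eigenvalue
a = W_inv_sqrt @ b           # map back: a = W^{-1/2} b

print(a @ W @ a)             # constraint a^T W a = 1 (up to rounding)
print(a @ B @ a, vals[-1])   # attained maximum equals the top eigenvalue
```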
Ex. 4.2 Suppose we have features $x \in \mathbb{R}^p$, a two-class response, with class sizes $N_1, N_2$, and the target coded as $-N/N_1$, $N/N_2$.

1. Show that the LDA rule classifies to class 2 if
\[
x^T\hat\Sigma^{-1}(\hat\mu_2-\hat\mu_1) > \frac12\hat\mu_2^T\hat\Sigma^{-1}\hat\mu_2 - \frac12\hat\mu_1^T\hat\Sigma^{-1}\hat\mu_1 + \log\frac{N_1}{N} - \log\frac{N_2}{N},
\]
and class 1 otherwise.

2. Consider minimization of the least squares criterion
\[
\sum_{i=1}^N (y_i - \beta_0 - \beta^T x_i)^2.
\]
Show that the solution $\hat\beta$ satisfies
\[
\Big[(N-2)\hat\Sigma + \frac{N_1N_2}{N}\hat\Sigma_B\Big]\beta = N(\hat\mu_2-\hat\mu_1)
\]
(after simplification), where $\hat\Sigma_B = (\hat\mu_2-\hat\mu_1)(\hat\mu_2-\hat\mu_1)^T$.

3. Hence show that $\hat\Sigma_B\beta$ is in the direction $(\hat\mu_2-\hat\mu_1)$, and thus
\[
\hat\beta \propto \hat\Sigma^{-1}(\hat\mu_2-\hat\mu_1).
\]
Therefore the least squares regression coefficient is identical to the LDA coefficient, up to a scalar multiple.

4. Show that this result holds for any (distinct) coding of the two classes.
5. Find the solution $\hat\beta_0$, and hence the predicted values $\hat f = \hat\beta_0 + \hat\beta^T x$. Consider the following rule: classify to class 2 if $\hat f_i > 0$ and to class 1 otherwise. Show that this is not the same as the LDA rule unless the classes have equal numbers of observations. (Fisher, 1936; Ripley, 1996)

Proof
1. Consider the log-ratio of the class posteriors (equation (4.9) in the textbook):
\[
\log\frac{\Pr(G=2\mid X=x)}{\Pr(G=1\mid X=x)} = \log\frac{\pi_2}{\pi_1} - \frac12(\mu_2+\mu_1)^T\Sigma^{-1}(\mu_2-\mu_1) + x^T\Sigma^{-1}(\mu_2-\mu_1).
\]
When it is positive, the LDA rule classifies $x$ to class 2. Estimating the parameters of the Gaussian distributions from the training data ($\hat\pi_1 = N_1/N$, $\hat\pi_2 = N_2/N$), the rule becomes
\[
x^T\hat\Sigma^{-1}(\hat\mu_2-\hat\mu_1) > \frac12(\hat\mu_2+\hat\mu_1)^T\hat\Sigma^{-1}(\hat\mu_2-\hat\mu_1) + \log\hat\pi_1 - \log\hat\pi_2
\]
\[
= \frac12\hat\mu_2^T\hat\Sigma^{-1}\hat\mu_2 - \frac12\hat\mu_1^T\hat\Sigma^{-1}\hat\mu_1 + \log\frac{N_1}{N} - \log\frac{N_2}{N},
\]
where the cross terms cancel because $\hat\mu_1^T\hat\Sigma^{-1}\hat\mu_2 = \hat\mu_2^T\hat\Sigma^{-1}\hat\mu_1$.
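As a sanity check of this equivalence, the sketch below (synthetic two-class data; seed, dimensions, and class sizes are arbitrary) verifies that the plug-in log-posterior-ratio equals the difference of the two sides of the rearranged rule:

```python
import numpy as np

rng = np.random.default_rng(1)
p, N1, N2 = 3, 30, 70
N = N1 + N2
X1 = rng.standard_normal((N1, p))
X2 = rng.standard_normal((N2, p)) + 1.0

mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
# Pooled within-class covariance estimate (divisor N - 2).
S = ((X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)) / (N - 2)
Sinv = np.linalg.inv(S)

x = rng.standard_normal(p)   # an arbitrary query point
# Estimated log Pr(G=2|x) / Pr(G=1|x), equation (4.9) with plug-in estimates.
log_ratio = (np.log(N2 / N1)
             - 0.5 * (mu2 + mu1) @ Sinv @ (mu2 - mu1)
             + x @ Sinv @ (mu2 - mu1))
# Rearranged rule from part 1.
lhs = x @ Sinv @ (mu2 - mu1)
rhs = (0.5 * mu2 @ Sinv @ mu2 - 0.5 * mu1 @ Sinv @ mu1
       + np.log(N1 / N) - np.log(N2 / N))
print(np.isclose(log_ratio, lhs - rhs))   # True: same classification rule
```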
2. Let $\tilde\beta$ denote $(\beta_0, \beta)$ and set the partial derivatives of $\mathrm{RSS}(\tilde\beta)$ to zero:
\[
\frac{\partial\,\mathrm{RSS}(\tilde\beta)}{\partial\beta_0} = -2\sum_{i=1}^N (y_i - \beta_0 - \beta^T x_i) = 0, \tag{1}
\]
\[
\frac{\partial\,\mathrm{RSS}(\tilde\beta)}{\partial\beta} = -2\sum_{i=1}^N x_i (y_i - \beta_0 - \beta^T x_i) = 0. \tag{2}
\]
From these we can derive
\[
\beta_0 = \frac1N\sum_{i=1}^N (y_i - \beta^T x_i), \quad\text{// from (1)} \tag{3}
\]
\[
\sum_{i=1}^N x_i\big[\beta^T(x_i-\bar x)\big] = \sum_{i=1}^N x_i y_i - \frac1N\sum_{i=1}^N x_i \sum_{j=1}^N y_j, \quad\text{// substituting (3) into (2)} \tag{4}
\]
For the right-hand side of (4),
\[
\sum_{i=1}^N x_i y_i - \frac1N\sum_{i=1}^N x_i \sum_{j=1}^N y_j
= \sum_{k=1}^2 \sum_{g_i=k} x_i\Big(y_k - \frac{N_1 y_1 + N_2 y_2}{N}\Big) \tag{5}
\]
\[
= \sum_{k=1}^2 N_k\hat\mu_k\,\frac{N y_k - (N_1 y_1 + N_2 y_2)}{N} \quad\text{// } \textstyle\sum_{g_i=k} x_i = N_k\hat\mu_k \tag{6}
\]
\[
= \frac{N_1 N_2}{N}(y_2 - y_1)(\hat\mu_2 - \hat\mu_1) \tag{7}
\]
\[
= N(\hat\mu_2 - \hat\mu_1). \quad\text{// } y_1 = -N/N_1,\ y_2 = N/N_2 \tag{8}
\]
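A quick numerical check of (5)–(8) (a minimal NumPy sketch with synthetic data; the coding $y_1 = -N/N_1$, $y_2 = N/N_2$ is the one given in the exercise):

```python
import numpy as np

rng = np.random.default_rng(2)
p, N1, N2 = 4, 20, 50
N = N1 + N2
X = rng.standard_normal((N, p))
g = np.array([1] * N1 + [2] * N2)
y = np.where(g == 1, -N / N1, N / N2)

mu1 = X[g == 1].mean(axis=0)
mu2 = X[g == 2].mean(axis=0)

rhs = X.T @ y - X.sum(axis=0) * y.sum() / N   # right-hand side of (4)
print(np.allclose(rhs, N * (mu2 - mu1)))      # True, as in (8)
```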
We also have
\[
(N-2)\hat\Sigma = \sum_{k=1}^2 \sum_{g_i=k} (x_i-\hat\mu_k)(x_i-\hat\mu_k)^T \tag{9}
\]
\[
= \sum_{k=1}^2 \sum_{g_i=k} \big(x_i x_i^T - 2 x_i\hat\mu_k^T + \hat\mu_k\hat\mu_k^T\big) \tag{10}
\]
\[
= \sum_{k=1}^2 \Big(\sum_{g_i=k} x_i x_i^T - N_k\hat\mu_k\hat\mu_k^T\Big) \quad\text{// } \textstyle\sum_{g_i=k} x_i = N_k\hat\mu_k \tag{11}
\]
\[
= \sum_{i=1}^N x_i x_i^T - \big(N_1\hat\mu_1\hat\mu_1^T + N_2\hat\mu_2\hat\mu_2^T\big), \tag{12}
\]
and
\[
\sum_{i=1}^N x_i\bar x^T = \Big(\sum_{i=1}^N x_i\Big)\bar x^T \tag{13}
\]
\[
= \frac1N(N_1\hat\mu_1 + N_2\hat\mu_2)(N_1\hat\mu_1 + N_2\hat\mu_2)^T. \tag{14}
\]
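The identity (12) is easy to verify numerically (synthetic data; class sizes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
p, N1, N2 = 3, 15, 25
X = rng.standard_normal((N1 + N2, p))
g = np.array([1] * N1 + [2] * N2)
mu1, mu2 = X[g == 1].mean(axis=0), X[g == 2].mean(axis=0)

# Pooled within-class scatter (N - 2) * Sigma-hat, equation (9).
scatter = sum((X[g == k] - m).T @ (X[g == k] - m)
              for k, m in [(1, mu1), (2, mu2)])
# Right-hand side of (12).
rhs = X.T @ X - N1 * np.outer(mu1, mu1) - N2 * np.outer(mu2, mu2)
print(np.allclose(scatter, rhs))   # True
```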
Meanwhile,
\[
\sum_{i=1}^N x_i\big[\beta^T(x_i-\bar x)\big] = \sum_{i=1}^N x_i\big[(x_i-\bar x)^T\beta\big] = \Big(\sum_{i=1}^N x_i x_i^T - \sum_{i=1}^N x_i\bar x^T\Big)\beta \tag{15}
\]
\[
= \Big[(N-2)\hat\Sigma + N_1\hat\mu_1\hat\mu_1^T + N_2\hat\mu_2\hat\mu_2^T - \frac1N(N_1\hat\mu_1+N_2\hat\mu_2)(N_1\hat\mu_1+N_2\hat\mu_2)^T\Big]\beta \quad\text{// from (12) and (14)} \tag{16}
\]
\[
= \Big[(N-2)\hat\Sigma + \frac{N_1N_2}{N}\big(\hat\mu_2\hat\mu_2^T - \hat\mu_1\hat\mu_2^T - \hat\mu_2\hat\mu_1^T + \hat\mu_1\hat\mu_1^T\big)\Big]\beta. \tag{17}
\]
Combining this with (4) and (8),
\[
\Big[(N-2)\hat\Sigma + \frac{N_1N_2}{N}\hat\Sigma_B\Big]\beta = N(\hat\mu_2-\hat\mu_1). \quad\text{// from (8)} \tag{18}
\]
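The normal-equation form (18) can be checked end to end by fitting the least squares problem with NumPy (a minimal sketch; sizes and seed are arbitrary, and the targets use the exercise's coding):

```python
import numpy as np

rng = np.random.default_rng(4)
p, N1, N2 = 3, 20, 30
N = N1 + N2
X = rng.standard_normal((N, p))
g = np.array([1] * N1 + [2] * N2)
y = np.where(g == 1, -N / N1, N / N2)

# Least squares fit with intercept: y ~ beta0 + X beta.
X1 = np.hstack([np.ones((N, 1)), X])
coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
beta = coef[1:]

mu1, mu2 = X[g == 1].mean(axis=0), X[g == 2].mean(axis=0)
S = sum((X[g == k] - m).T @ (X[g == k] - m)
        for k, m in [(1, mu1), (2, mu2)]) / (N - 2)
SB = np.outer(mu2 - mu1, mu2 - mu1)

lhs = ((N - 2) * S + (N1 * N2 / N) * SB) @ beta
print(np.allclose(lhs, N * (mu2 - mu1)))   # True: equation (18)
```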
3. $\hat\Sigma_B\beta = (\hat\mu_2-\hat\mu_1)(\hat\mu_2-\hat\mu_1)^T\beta$, and $(\hat\mu_2-\hat\mu_1)^T\beta$ is a scalar, therefore $\hat\Sigma_B\beta$ is in the direction $(\hat\mu_2-\hat\mu_1)$, and
\[
\hat\beta = \frac{1}{N-2}\,\hat\Sigma^{-1}\Big[N(\hat\mu_2-\hat\mu_1) - \frac{N_1N_2}{N}\hat\Sigma_B\beta\Big] \quad\text{// from (18)} \tag{19}
\]
\[
= \frac{1}{N-2}\Big[N - \frac{N_1N_2}{N}(\hat\mu_2-\hat\mu_1)^T\beta\Big]\hat\Sigma^{-1}(\hat\mu_2-\hat\mu_1) \tag{20}
\]
\[
\propto \hat\Sigma^{-1}(\hat\mu_2-\hat\mu_1). \tag{21}
\]

4. For an arbitrary (distinct) coding, the right-hand side of (18) is $\frac{N_1N_2}{N}(y_2-y_1)(\hat\mu_2-\hat\mu_1)$ by (7). Replacing $N$ with $\frac{N_1N_2}{N}(y_2-y_1)$ (from (7) and (8)) in (20), we still have
\[
\hat\beta = \frac{N_1N_2}{N(N-2)}\Big[(y_2-y_1) - (\hat\mu_2-\hat\mu_1)^T\beta\Big]\hat\Sigma^{-1}(\hat\mu_2-\hat\mu_1) \propto \hat\Sigma^{-1}(\hat\mu_2-\hat\mu_1).
\]
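The following sketch checks parts 3 and 4 together: for several arbitrary distinct codings (the codings listed are illustrative choices, not from the exercise), the fitted slope vector stays parallel to $\hat\Sigma^{-1}(\hat\mu_2-\hat\mu_1)$:

```python
import numpy as np

rng = np.random.default_rng(5)
p, N1, N2 = 3, 20, 30
N = N1 + N2
X = rng.standard_normal((N, p))
g = np.array([1] * N1 + [2] * N2)
mu1, mu2 = X[g == 1].mean(axis=0), X[g == 2].mean(axis=0)
S = sum((X[g == k] - m).T @ (X[g == k] - m)
        for k, m in [(1, mu1), (2, mu2)]) / (N - 2)
d = np.linalg.solve(S, mu2 - mu1)      # LDA direction
v = d / np.linalg.norm(d)

X1 = np.hstack([np.ones((N, 1)), X])
for c1, c2 in [(-1.0, 1.0), (0.0, 1.0), (3.0, -7.0)]:  # arbitrary distinct codings
    y = np.where(g == 1, c1, c2)
    beta = np.linalg.lstsq(X1, y, rcond=None)[0][1:]
    u = beta / np.linalg.norm(beta)
    print(np.allclose(u, v) or np.allclose(u, -v))     # True: beta is parallel to d
```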
5. The decision boundary is $\hat f = 0$. With the given coding, $\sum_{i=1}^N y_i = 0$, so from (3)
\[
\hat\beta_0 = \frac1N\sum_{i=1}^N (y_i - \hat\beta^T x_i) = -\frac1N\sum_{i=1}^N \hat\beta^T x_i = -\hat\mu^T\hat\beta,
\quad\text{where } \hat\mu = \bar x = \frac1N(N_1\hat\mu_1 + N_2\hat\mu_2).
\]
When the classes have equal numbers of observations, i.e. $N_1 = N_2 = N/2$,
\[
\hat\mu = \frac12(\hat\mu_1 + \hat\mu_2). \tag{22}
\]
Then the predicted values satisfy
\[
\hat f = \hat\beta_0 + \hat\beta^T x = (x - \hat\mu)^T\hat\beta \propto (x-\hat\mu)^T\hat\Sigma^{-1}(\hat\mu_2-\hat\mu_1) \tag{23}
\]
\[
= x^T\hat\Sigma^{-1}(\hat\mu_2-\hat\mu_1) - \frac12(\hat\mu_1+\hat\mu_2)^T\hat\Sigma^{-1}(\hat\mu_2-\hat\mu_1). \quad\text{// from (22)} \tag{24}
\]
Meanwhile, the LDA rule takes the zero of the log-ratio of the class posteriors as its boundary ($N_1 = N_2$, so $\hat\pi_1 = \hat\pi_2$):
\[
\log\frac{\Pr(G=2\mid X=x)}{\Pr(G=1\mid X=x)} = -\frac12(\hat\mu_2+\hat\mu_1)^T\hat\Sigma^{-1}(\hat\mu_2-\hat\mu_1) + x^T\hat\Sigma^{-1}(\hat\mu_2-\hat\mu_1). \tag{25}
\]
Comparing (24) with (25), we find that they are the same rule. But when $N_1 \neq N_2$, the two rules are obviously different.
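Finally, a sketch comparing the two decision rules on random query points (the Gaussian class model, shift, and seed are arbitrary illustrative choices): with equal class sizes the rules agree everywhere; with unequal sizes they generally disagree on some points:

```python
import numpy as np

rng = np.random.default_rng(6)
p = 2

def rules_agree(N1, N2):
    N = N1 + N2
    X = np.vstack([rng.standard_normal((N1, p)),
                   rng.standard_normal((N2, p)) + 1.5])
    g = np.array([1] * N1 + [2] * N2)
    y = np.where(g == 1, -N / N1, N / N2)
    mu1, mu2 = X[g == 1].mean(axis=0), X[g == 2].mean(axis=0)
    S = sum((X[g == k] - m).T @ (X[g == k] - m)
            for k, m in [(1, mu1), (2, mu2)]) / (N - 2)
    Sinv = np.linalg.inv(S)
    # Least squares rule: classify to class 2 if f(x) > 0.
    X1 = np.hstack([np.ones((N, 1)), X])
    b0, *b = np.linalg.lstsq(X1, y, rcond=None)[0]
    xs = rng.standard_normal((2000, p))   # random query points
    f = b0 + xs @ np.array(b)
    # LDA rule: classify to class 2 if the log posterior ratio > 0.
    lda = (np.log(N2 / N1) - 0.5 * (mu2 + mu1) @ Sinv @ (mu2 - mu1)
           + xs @ Sinv @ (mu2 - mu1))
    return np.all((f > 0) == (lda > 0))

print(rules_agree(50, 50))   # True: equal class sizes
print(rules_agree(20, 80))   # generally False: the rules differ
```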
Ex. 4.3 Suppose we transform the original predictors $X$ to $\hat Y$ via linear regression. In detail, let $\hat Y = X(X^TX)^{-1}X^TY = X\hat B$, where $Y$ is the indicator response matrix. Similarly, for any input $x \in \mathbb{R}^p$, we get a transformed vector $\hat y = \hat B^T x \in \mathbb{R}^K$. Show that LDA using $\hat Y$ is identical to LDA in the original space.
Proof. The transformed vector is $\hat y = \hat B^T x$, so the class means and the pooled covariance in the transformed space satisfy
\[
\hat\mu_k^y = \frac{1}{N_k}\sum_{g_i=k}\hat y_i = \frac{1}{N_k}\sum_{g_i=k}\hat B^T x_i = \hat B^T\hat\mu_k^x, \tag{26}
\]
\[
\hat\mu_\ell^y = \hat B^T\hat\mu_\ell^x \quad\text{(likewise for any class } \ell\text{)}, \tag{27}
\]
\[
\hat\Sigma_y = \frac{1}{N-K}\sum_{k=1}^K\sum_{g_i=k}(\hat y_i-\hat\mu_k^y)(\hat y_i-\hat\mu_k^y)^T \tag{28}
\]
\[
= \frac{1}{N-K}\sum_{k=1}^K\sum_{g_i=k}\hat B^T(x_i-\hat\mu_k^x)(x_i-\hat\mu_k^x)^T\hat B \tag{29}
\]
\[
= \hat B^T\hat\Sigma_x\hat B. \tag{30}
\]
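Relations (26)–(30) are straightforward to confirm numerically (a minimal sketch; the data, balanced class structure, and seed are synthetic choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(9)
p, K, N = 4, 3, 60
X = rng.standard_normal((N, p))
g = np.repeat(np.arange(K), N // K)    # 20 observations per class
Y = np.eye(K)[g]                       # indicator response matrix

B = np.linalg.solve(X.T @ X, X.T @ Y)  # B-hat = (X^T X)^{-1} X^T Y
Yhat = X @ B                           # transformed training data

mux = np.array([X[g == k].mean(axis=0) for k in range(K)])
muy = np.array([Yhat[g == k].mean(axis=0) for k in range(K)])
Sx = sum((X[g == k] - mux[k]).T @ (X[g == k] - mux[k])
         for k in range(K)) / (N - K)
Sy = sum((Yhat[g == k] - muy[k]).T @ (Yhat[g == k] - muy[k])
         for k in range(K)) / (N - K)

print(np.allclose(muy, mux @ B))       # (26)-(27): mu^y_k = B^T mu^x_k
print(np.allclose(Sy, B.T @ Sx @ B))   # (30): Sigma_y = B^T Sigma_x B
```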
Substituting (26), (27), and (30) for $\hat\mu$ and $\hat\Sigma$ in equation (4.9):
\[
\log\frac{\Pr(G=k\mid \hat Y=\hat y)}{\Pr(G=\ell\mid \hat Y=\hat y)}
= \log\frac{\pi_k}{\pi_\ell} - \frac12(\hat\mu_k^y+\hat\mu_\ell^y)^T\hat\Sigma_y^{-1}(\hat\mu_k^y-\hat\mu_\ell^y) + \hat y^T\hat\Sigma_y^{-1}(\hat\mu_k^y-\hat\mu_\ell^y)
\]
\[
= \log\frac{\pi_k}{\pi_\ell} - \frac12(\hat\mu_k^x+\hat\mu_\ell^x)^T\hat B(\hat B^T\hat\Sigma_x\hat B)^{-1}\hat B^T(\hat\mu_k^x-\hat\mu_\ell^x) + x^T\hat B(\hat B^T\hat\Sigma_x\hat B)^{-1}\hat B^T(\hat\mu_k^x-\hat\mu_\ell^x).
\]
In fact, if
\[
\hat B(\hat B^T\hat\Sigma_x\hat B)^{-1}\hat B^T(\hat\mu_k^x-\hat\mu_\ell^x) = \hat\Sigma_x^{-1}(\hat\mu_k^x-\hat\mu_\ell^x),
\]
then LDA using $\hat Y$ is identical to LDA in the original space. So we will prove it below.
$Y$ is an indicator response matrix, so $Y^TY = \Omega = \operatorname{diag}(N_1,\dots,N_K)$ and, writing $y_k$ for its $k$-th column,
\[
N_k\hat\mu_k^x = \sum_{g_i=k} x_i = X^T y_k, \tag{31}
\]
\[
\hat\Sigma_x = \frac{1}{N-K}\Big(\sum_{i=1}^N x_i x_i^T - \sum_{k=1}^K N_k\hat\mu_k^x(\hat\mu_k^x)^T\Big) \quad\text{// from (12)} \tag{32}
\]
\[
= \frac{1}{N-K}\Big(X^TX - \sum_{k=1}^K \frac{1}{N_k}(X^Ty_k)(X^Ty_k)^T\Big) \quad\text{// from (31)} \tag{33}
\]
\[
= \frac{1}{N-K}\big(X^TX - X^TY\Omega^{-1}Y^TX\big). \tag{34}
\]
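Equation (34) can be verified numerically (a minimal NumPy sketch; $X$, the labels, and the balanced class sizes are synthetic choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
p, K, N = 4, 3, 60
X = rng.standard_normal((N, p))
g = np.repeat(np.arange(K), N // K)   # 20 observations per class
Y = np.eye(K)[g]                      # indicator response matrix
Nk = Y.sum(axis=0)

# Pooled within-class covariance estimate (divisor N - K), as in (32).
mus = np.array([X[g == k].mean(axis=0) for k in range(K)])
S = sum((X[g == k] - mus[k]).T @ (X[g == k] - mus[k])
        for k in range(K)) / (N - K)

Omega_inv = np.diag(1.0 / Nk)         # (Y^T Y)^{-1}
S34 = (X.T @ X - X.T @ Y @ Omega_inv @ Y.T @ X) / (N - K)
print(np.allclose(S, S34))            # True: matches (34)
```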
We need the projection matrix
\[
H = \hat\Sigma_x\hat B(\hat B^T\hat\Sigma_x\hat B)^{-1}\hat B^T,
\]
which satisfies $HH = H$. We use $X^TY$ for assistance. By the definition of $\hat B$,
\[
\hat B^T = \big((X^TX)^{-1}X^TY\big)^T = Y^TX(X^TX)^{-1},
\]
and, using (34),
\[
\hat\Sigma_x\hat B = \frac{1}{N-K}\big(X^TX - X^TY\Omega^{-1}Y^TX\big)(X^TX)^{-1}X^TY
= \frac{1}{N-K}X^TY\big(I - \Omega^{-1}Y^TX(X^TX)^{-1}X^TY\big).
\]
Let $P = \hat B^TX^TY = Y^TX(X^TX)^{-1}X^TY$; then
\[
\hat\Sigma_x\hat B = \frac{1}{N-K}X^TY(I - \Omega^{-1}P), \qquad
\hat B^T\hat\Sigma_x\hat B = \frac{1}{N-K}P(I - \Omega^{-1}P).
\]
$P(I-\Omega^{-1}P)$ is invertible (which holds generically), so $P$ and $I-\Omega^{-1}P$ are both invertible, hence
\[
(\hat B^T\hat\Sigma_x\hat B)^{-1} = (N-K)\big(P(I-\Omega^{-1}P)\big)^{-1} = (N-K)(I-\Omega^{-1}P)^{-1}P^{-1}
\]
and
\[
HX^TY = \frac{1}{N-K}X^TY(I-\Omega^{-1}P)\cdot(N-K)(I-\Omega^{-1}P)^{-1}P^{-1}\cdot P = X^TY.
\]
Here we have proved $HX^TY = X^TY$, i.e. $HX^Ty_k = X^Ty_k$ for every class $k$. Therefore
\[
HN_k\hat\mu_k^x = N_k\hat\mu_k^x \quad\text{// from (31)}
\]
\[
\hat\Sigma_x\hat B(\hat B^T\hat\Sigma_x\hat B)^{-1}\hat B^T\hat\mu_k^x = \hat\mu_k^x \quad\text{// definition of } H
\]
\[
\hat B(\hat B^T\hat\Sigma_x\hat B)^{-1}\hat B^T\hat\mu_k^x = \hat\Sigma_x^{-1}\hat\mu_k^x
\]
\[
\hat B(\hat B^T\hat\Sigma_x\hat B)^{-1}\hat B^T(\hat\mu_k^x - \hat\mu_\ell^x) = \hat\Sigma_x^{-1}(\hat\mu_k^x - \hat\mu_\ell^x),
\]
which is exactly the required identity.
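The identity just proved can also be checked directly (a minimal sketch with synthetic data; the choice $k=2$, $\ell=1$ is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(8)
p, K, N = 4, 3, 60
X = rng.standard_normal((N, p)) + 0.5          # raw predictors, nonzero mean
g = np.repeat(np.arange(K), N // K)
Y = np.eye(K)[g]

B = np.linalg.solve(X.T @ X, X.T @ Y)          # B-hat = (X^T X)^{-1} X^T Y
mus = np.array([X[g == k].mean(axis=0) for k in range(K)])
S = sum((X[g == k] - mus[k]).T @ (X[g == k] - mus[k])
        for k in range(K)) / (N - K)

d = mus[1] - mus[0]                            # mu_k - mu_l for k = 2, l = 1
lhs = B @ np.linalg.solve(B.T @ S @ B, B.T @ d)
rhs = np.linalg.solve(S, d)
print(np.allclose(lhs, rhs))                   # True: the two LDA rules coincide
```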