手机版

MIT CLRS NOTES 2 mit 算法导论课堂笔记2

发布时间:2024-11-10   来源:未知    
字号:

MIT CLRS NOTES mit 算法导论课堂笔记

Lecture 7

Table T holding n records.

record is a hunk of data broken into what we called fields. dynamic set Operations Insert(T,x) Delete(T,x) Search(T,k)

Sup. the set of keys is K 0,1,...,m 1 and keys are distinct. Set up array T[0,...,m-1] or “direct-access” table x if x“dynamic set”

nil otherwise

x can be a pointer or just record itself P223 figure 11.1 ops take (1)time

practical application can be bit vector to sort keys, just like counting sort(count once or not)

Problem: usu. the range of keys is large 264 nearly = 16*1018 too huge

MIT CLRS NOTES mit 算法导论课堂笔记

…,m-1} P255 figure 11.2

We have Universe U of keys, a much smaller set we care about is K. what the hash function is going to do is map keys into a table T[0..m-1] occurs. Perfert hashing with no collision will be talked next lecture. Records in same slot are linked into a list. Ex. P225 figure 11.3

Worst case: into the same slot.

Ave. case: Each key in K is equally likely to be hashed to any slot in T indep. of where other keys are hashed.

n key, m slots

load factor =n/m=ave. # keys per slot

Exp. Search cost= (1 ) 1 represents hash & access slot ; search list. = (1) if = (1), i.e. if n= (m)

Should distribute keys uniformly into slots

Regualriy in the key distribution should not affect this uniformity

MIT CLRS NOTES mit 算法导论课堂笔记

integer not floating division H(k)=k mod m

Don’t want m to have a small divisor d Worse: m=2r - does not depend on all bits of k

Pick m to be prime ????? not too close to power of 2 or 10

faster than division

Faster because multiplicaion software implementation of our computer is the inner loop of division impl. m=2r , computer has w-bit words h(k)=(A*k mod 2w)rsh(w-r)

A is odd integer 2w-1 <A<2w A’s first(not sigh) and last bit is on; rsh is bit wise right shift op Don’t pick A too close to 2w P232 Ex

Modular wheel ????

Another method to collision resolution is

MIT CLRS NOTES mit 算法导论课堂笔记

no storage outside of hash table

probe systematically until an empty slot is found

h: U univers of keys {0,1,.., m-1} prob #-> {0,1,.., m-1} slot probe sequence for a given key should be a permutation table may fill up( chaining may cause long list); del. is diffcult Open address is used where with no deletions, Although deletion is possible in this method. eg: compiler Ex. Insert K=496

Search- same probe seq. -successful –find record -unsuccessful –find nil

MIT CLRS NOTES mit 算法导论课堂笔记

Probing strategies

Linear probing : look seq. through table

“primary clustering” –long runs of filled slots

Double hashing h(k,i)=(h1(k)+ih2(k)) mod m (excellent) usu. pick m=2r and h2 (k) odd

– asumme ea. key equally likely to have any one of the m! perms. as its prob seq. indep. of other keys P241 formal one Proof.

1 probe always necessary with prob n/m collision -> 2nd probe nec. 2 prob. n-1/m-1 collision -> 3nd probe nec.

n in

< = for i=1,2,…,n-1 m im

nn 1n 2

E[# probes]=1+(1+(1+(…

mm 1m 2

1

if 1 (i.e., n<m) 1

Note:

1 (1 (.... 1 2....

1 1

When this is constant time

MIT CLRS NOTES mit 算法导论课堂笔记

Lecture 8

For any choice of hash function, bad set of keys that all hash to same slot.

Let U be a universe of keys, and let be H a finite collection of hash function mapping U to {0,1, ... , m-1} U, where x y, h H:h(x) h(y)}

Hm

the chance of collision betw. x and y is 1/m if h is chosen randomly from H

{hH

n keys into m slots in table T. Then, for a given key x, E[#

Proof. Let Cx be random variable denote total # collisions of keys in T with x, and let

MIT CLRS NOTES mit 算法导论课堂笔记

1 if h(x)=h(y)

cxy = 0 otherwise

Note: E[cxy]=1/m (because H is universal) and Cx =therefore E[Cx ]=E[

y T {x}

c

xy

y T {x}

c

xy]=(n-1)/m <

Let m be prime. Decompose key k U into r+1 “digits” k=<k0 , k1 , ... , kr > where 0 ki m

the following random a will determine which hash function to use. Pick a=< a0 , a1 , ... , ar >, each ai randomly from{0,1, ... ,m-1} Define ha(k)=( aiki) mod m (Dot product modulo m)

i 0r

How big is H={ha}? mr+1

Let x=< x0 , x1 , ... , xr >

y=< y0 , y1 , ... , yr > be distinct keys

They differ in at least one digit, wlog(without loss of generality) pos. 0. For how many ha H do x and y collide? Must have ha(x)= ha(y)

MIT CLRS NOTES mit 算法导论课堂笔记

-> aixi aiyi(mod m) { means congruent}

i 0r

i 0

rr

-> ai(xi yi) 0 (mod m)

i 0

->a0(x0 y0) ai(xi yi) 0(mod m)

i 1

r

->a0(x0 y0) ai(xi yi)(mod m)

i 1

r

Number theory fact

Let m be prime, for any z m (integers mod m) $(such that) z 0, unique z-1 m $ z z-1 1(mod m) z z-1

Since x0 y0, ( x0-y0)-1

->a0 ( ai(xi yi))(x0 y0) 1(mod m)

i 1r

1 2 3 4 5 6 1 4 5 2 3 6

Thus, for any choices of a1, a2, ... ar , exactly one of m choices for a0 cause x and y to collide.

# ha’s choices that cause x, y collide = m (choices for a

1) m(choices for a2) ...m(choices for ar)

1(choices for a0, the above one)= mr =H

m

MIT CLRS NOTES mit 算法导论课堂笔记

Given n keys, construct a static hash table of size m=O(n) $ search tales

(1)time in the worst case.

Idea: 2 level scheme with universal hashing in both levels No collisions at level 2!

2 slots using random h in universal H E[#collisions]<1/2 verse of birthday paradox Proof Prob 2 given keys collide under h is 1/m=1/n2

n 2 pairs of keys -> E[#collisions]=

n 1 2 n2<1/2

1/2 Pf. Markov’s inequality for r.v. X 0

E[X]

t

E[#collisions]

Pr{ 1 collision} <1/2

1

Pr{X t}

MIT CLRS NOTES mit 算法导论课堂笔记

For level 1, choose m=n, and let ni be r.v. for # keys that hash to slot i in T. Use ni2 slots for each level 2 table Si.

E[total storage]=E[ (ni2)]= (n)

i 0n 1

Lecture 9

T for i 1 to n

do Insert(T,A[i])

Perform inorder tree walk of T

Tree-walk time= (n)

How long to build BST? n2 worst-case Observation: BST sort quicksort -Same comparisons, different order

MIT CLRS NOTES mit 算法导论课堂笔记

BST

Quicksort

comparison is the same, but have different comparison order. Depth=# comparisons during insert Ave. node depth=( (#comparisons/node))/n

nodes

=

1

O(nlgn) (QS analysis) n

=O(lgn)

Ave. BST height=O(lgn) but this argument doesn’t show it. Ex:

MIT CLRS NOTES mit 算法导论课堂笔记

Ave. node depth to choose

nlgn

n

n n=O(nlgn)/n=O(lgn)

n n

large (not dominate the nlgn) will make the BST 2

height increase: the thresh h will be nlgn

1. Prove Jensen’s inequality, which says f(E[X]) E[f(X)] for any convex function f and r.v. X

2. Analyze “exponential height” of randomly built BST on n nodes: Yn=2X where Xn is r.v. for height.

n

3. Prove that

2

E[Xn]

Xn3

E[Y]E[2]O[n] =n=

therefore E[Xn] O[lgn]

A function f: R->R is convex if , 0 $ 1, we have

f( x y) f(x) f(y) x,y R

f

MIT CLRS NOTES mit 算法导论课堂笔记

Let f be convex and let { 1, 2, ... , n} be nonnegtive constants $

nk 1

x

1. Then, for any set {x1,x2, ... ,xn}, we have

n

f( kxk) kf(xk)

k 1

here { 1, 2, ... , n}will be probability for us to use Proof: Induction on n.

For n=1, 1 1 then f( 1x1) 1f(x1) trivial

Let f be convex, and x be a r.v. Then f(E[X]) E[f(X)]. Proof. f(E[X])=f( kPr{X k})

k

f( kxk) f( nxn (1 n)

k 1

n

k

xk) 1 k 1n

n 1

nf(xn) (1 n)f( nf(xn) (1 n)

n 1

k

xk) convexity 1 k 1n

n 1

k

f(xk) I.H.

k 11 n

= kf(xk)

k 1

n

k

f(k)Pr{X k}=E[f(X)]

Xn=r.v. denoting height of BST( all perms equally likely)

Xn2Yn=

MIT CLRS NOTES mit 算法导论课堂笔记

Root has rank k Xn=1+max{Xk-1,Xn-k} Yn=2 max{Yk-1,Yn-k} Def. Z

nk

1 0

if root has rank k

otherwise

random v. indicator , like quick sort Pr{Znk=1}=E[Znk]=1/n Yn= Znk(2max(Yk-1,Yn-k))

k 1n

E[Yn] =E[ Znk(2max(Yk-1,Yn-k))]

k 1

n

= E[Znk(2max(Yk-1,Yn-k))]

k 1n

n

= E[Znk]E[(2max(Yk-1,Yn-k))] independence

k 1

2n

E[Yk-1 Yn-k] nk 1

4n 1

= E[Yi] ni 0

Solve Use subst. to show E[Yn] Cn3 for const C large enough for initial condition.

4n 13

E[Yn] Ck {substitution}

nk 0

MIT CLRS NOTES mit 算法导论课堂笔记

4Cn3

xdx n0

4Cn43

()=Cn

n4

2E[Xn] E[2Xn]

=E[Yn]

Cn

3

Jensen

therefore E[Xn] 3lgn+O(1)

max {a, b} a+b

when a b gets big, a+b approaches max {a, b} very slowly max{2a, 2b} 2a+2b

when a b gets big, 2a+2b approaches max {2a, 2b} very quickly, really tight bound.

sum of exponential like2a+2bis not easy to deal with, but product of exponential is easy, that is where Jensen comes in using convexity of exponentals.

MIT CLRS NOTES mit 算法导论课堂笔记

Lecture 10

Binary search tree data structure with guaranteed O(lg n) height supporting dynamic set operations

VL trees, B-treses/ 2-3 trees, red-black trees

these five loop invariance will maintain the balance 1. every node is either red or black 2. root is black

3. leaves (nil’s) are black

4. if a node is red, then its parent is black = no red adjacent we don’t want too many red, you can have lots of black nodes, black will be good. 5. for any node x

all simple paths from x to a descendent leaf have same # of black nodes. we do not count red nodes because there are not too many red nodes , so this will do us well.

nil’s)

MIT CLRS NOTES 2 mit 算法导论课堂笔记2.doc 将本文的Word文档下载到电脑,方便复制、编辑、收藏和打印
    ×
    二维码
    × 游客快捷下载通道(下载后可以自由复制和排版)
    VIP包月下载
    特价:29 元/月 原价:99元
    低至 0.3 元/份 每月下载150
    全站内容免费自由复制
    VIP包月下载
    特价:29 元/月 原价:99元
    低至 0.3 元/份 每月下载150
    全站内容免费自由复制
    注:下载文档有可能出现无法下载或内容有问题,请联系客服协助您处理。
    × 常见问题(客服时间:周一到周五 9:30-18:00)