
Estimating the quality of data in relational databases(5)

Date: 2025-04-29   Source: unknown

A simple approach to goodness is to consider the intersection of the extensions; that is, the tuples that appear in both $v$ and $v_0$. Let $|v|$ denote the number of tuples in $v$. Then

$$\frac{|v \cap v_0|}{|v|}$$

expresses the proportion of the database extension that appears in the true extension. Hence, it is a measure of the soundness of $v$. Similarly,

$$\frac{|v \cap v_0|}{|v_0|}$$

expresses the proportion of the true extension that appears in the database extension. Hence, it is a measure of the completeness of $v$.
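The two ratios above can be sketched in a few lines of Python. This is only an illustration, assuming extensions are modelled as Python sets of tuples; the function names and the sample employee data are hypothetical and not taken from the paper. The empty-extension cases follow the convention stated in the paper's footnote on 0/0.

```python
from fractions import Fraction


def soundness(v: set, v0: set) -> Fraction:
    """Proportion of the stored extension v that also appears in the true extension v0."""
    if not v:
        # 0/0 case, per the footnote: 1 if v0 is also empty, otherwise 0.
        return Fraction(1) if not v0 else Fraction(0)
    return Fraction(len(v & v0), len(v))


def completeness(v: set, v0: set) -> Fraction:
    """Proportion of the true extension v0 that also appears in the stored extension v."""
    if not v0:
        # Symmetric 0/0 case: 1 if v is also empty, otherwise 0.
        return Fraction(1) if not v else Fraction(0)
    return Fraction(len(v & v0), len(v0))


# Hypothetical data: (key, name, department).
v = {(1, "Ann", "Sales"), (2, "Bob", "IT"), (3, "Cy", "HR"), (4, "Dee", "Ops")}
v0 = {(1, "Ann", "Sales"), (2, "Bob", "IT"), (3, "Cy", "Legal"),
      (5, "Eve", "IT"), (6, "Fay", "HR")}

print(soundness(v, v0))     # 2 of the 4 stored tuples are true
print(completeness(v, v0))  # 2 of the 5 true tuples are stored
```

Using `Fraction` rather than floating point keeps the ratios exact, which makes small examples easier to check by hand.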

It is easy to verify that soundness and completeness satisfy all the requirements of a goodness measure.¹ Soundness and completeness are similar to precision and recall in information retrieval [15].

A disadvantage of these measures is that a database tuple is assumed to be sound (and to contribute to the soundness measure) only if it is identical to a tuple of the ideal database (similarly in the case of completeness). Thus, a tuple that is correct in all but one attribute and a tuple that is incorrect in all its attributes are treated identically. An essential refinement of these measures is to consider the goodness of individual attributes.

Assume a view $V$ has attributes $A_0, A_1, \ldots, A_n$, where $A_0$ is the key.² We decompose $V$ into $n$ key-attribute pairs $(A_0, A_i)$ $(i = 1, \ldots, n)$, […]ing decomposed extensions in the previously defined measures improves their usefulness considerably, and we shall assume decomposed extensions throughout.
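A small sketch may help show why decomposition matters. It assumes each tuple is stored as `(key, a_1, ..., a_n)` and tags every pair with its attribute index `i` so that pairs from different attributes stay distinct inside one set; that tagging, the helper names, and the sample data are assumptions made for illustration, not the paper's notation. Both extensions are assumed non-empty here.

```python
from fractions import Fraction


def decompose(extension: set) -> set:
    """Split each tuple (key, a_1, ..., a_n) into tagged pairs (key, i, a_i)."""
    return {
        (row[0], i, value)
        for row in extension
        for i, value in enumerate(row[1:], start=1)
    }


def soundness(v: set, v0: set) -> Fraction:
    """|v ∩ v0| / |v| for a non-empty v."""
    return Fraction(len(v & v0), len(v))


# Hypothetical data: the stored tuple for key 2 is wrong in one attribute only.
v = {(1, "Ann", "Sales"), (2, "Bob", "IT")}
v0 = {(1, "Ann", "Sales"), (2, "Bob", "HR")}

whole_tuple = soundness(v, v0)                       # tuple 2 counts as fully wrong
decomposed = soundness(decompose(v), decompose(v0))  # only the wrong pair is penalized

print(whole_tuple, decomposed)
```

With whole tuples, the partly correct tuple contributes nothing to soundness; after decomposition, three of its four key-attribute pairs across the extension still count as sound.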

Soundness and completeness can also be approached by means of probability theory [11]. For example, the definition of soundness can be interpreted as the probability of drawing a correct pair from a given extension. Probabilistic interpretations give new insight into the notions of soundness and completeness and also help us to connect this research with a large body of work on uncertainty management in information systems [8].
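The probabilistic reading can be illustrated with a simple simulation: draw pairs uniformly at random from the stored extension and count how often the drawn pair is correct. This is a sketch under assumptions, not the paper's procedure; the function name, the number of draws, the fixed seed, and the sample pairs are all hypothetical.

```python
import random


def estimate_soundness(v: set, v0: set, draws: int = 10_000, seed: int = 0) -> float:
    """Estimate soundness as the observed frequency of drawing a correct pair from v."""
    rng = random.Random(seed)
    pairs = sorted(v)  # fixed order so that a given seed draws the same pairs every run
    hits = sum(rng.choice(pairs) in v0 for _ in range(draws))
    return hits / draws


# Hypothetical decomposed extensions: (key, attribute index, value).
v = {(1, 1, "Ann"), (1, 2, "Sales"), (2, 1, "Bob"), (2, 2, "IT")}
v0 = {(1, 1, "Ann"), (1, 2, "Sales"), (2, 1, "Bob"), (2, 2, "HR")}

exact = len(v & v0) / len(v)        # 3 of the 4 stored pairs are correct
estimate = estimate_soundness(v, v0)

print(exact, estimate)
```

With enough draws the estimate should settle close to the exact ratio, which is the sense in which soundness is the probability of drawing a correct pair.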

The data quality measures that have been mentioned most frequently as essential are accuracy, completeness, currentness, and consistency [5, 18]. It is possible to relate these quality measures to our own goodness measures [11].

¹ When $v$ is empty, soundness is 0/0. If $v_0$ is also empty then soundness is defined to be 1; otherwise it is defined to be 0. Similarly for completeness, when $v_0$ is empty.

² We consider a tuple as a representation of the real-world entity identified by a key attribute; the nonkey attributes then capture the properties of this entity. For simplicity, we assume that keys consist of a single attribute.
