DOI: 10.1145/1718487.1718504
research-article

Corroborating information from disagreeing views

Published: 04 February 2010

ABSTRACT

We consider a set of views stating possibly conflicting facts. Negative facts in the views may come, e.g., from functional dependencies in the underlying database schema. We want to predict the truth values of the facts. Beyond simple methods such as voting (typically rather accurate), we explore techniques based on "corroboration", i.e., taking into account trust in the views. We introduce three fixpoint algorithms corresponding to different levels of complexity of an underlying probabilistic model. They all estimate both the truth values of facts and the trust in the views. We present experimental studies on synthetic and real-world data. This analysis illustrates how and in which contexts these methods improve corroboration results over baseline methods. We believe that corroboration can serve in a wide range of applications such as source selection in the semantic Web, data quality assessment, or semantic annotation cleaning in social networks. This work lays the groundwork for a wide range of techniques for solving these more complex problems.
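
To make the contrast between voting and trust-based corroboration concrete, here is a minimal sketch in Python. It is not one of the three algorithms from the paper, whose details are not given in the abstract; it is a generic fixpoint iteration under stated assumptions. The input format (`votes[view][fact]` is +1 or -1), the function names, the initial trust of 0.8, and the toy data are all hypothetical choices made for illustration.

```python
from typing import Dict, Tuple

# Hypothetical input format: votes[view][fact] is +1 if the view states the
# fact is true, -1 if it states the fact is false; unmentioned facts are absent.
Votes = Dict[str, Dict[str, int]]


def majority_vote(votes: Votes) -> Dict[str, bool]:
    """Baseline: predict a fact true unless negative votes outnumber positive ones."""
    totals: Dict[str, int] = {}
    for statements in votes.values():
        for fact, vote in statements.items():
            totals[fact] = totals.get(fact, 0) + vote
    return {fact: total >= 0 for fact, total in totals.items()}


def corroborate(votes: Votes, iterations: int = 20) -> Tuple[Dict[str, bool], Dict[str, float]]:
    """Alternate between trust-weighted truth estimates and trust updates."""
    trust = {view: 0.8 for view in votes}  # assumed uniform initial trust
    truth: Dict[str, bool] = {}
    for _ in range(iterations):
        # Step 1: score each fact by votes weighted with (2 * trust - 1),
        # so a view trusted below 0.5 counts against what it states.
        scores: Dict[str, float] = {}
        for view, statements in votes.items():
            weight = 2.0 * trust[view] - 1.0
            for fact, vote in statements.items():
                scores[fact] = scores.get(fact, 0.0) + weight * vote
        new_truth = {fact: score >= 0 for fact, score in scores.items()}
        # Step 2: trust in a view is the fraction of its statements that
        # agree with the current truth estimates.
        for view, statements in votes.items():
            if statements:
                agree = sum((vote > 0) == new_truth[fact] for fact, vote in statements.items())
                trust[view] = agree / len(statements)
        if new_truth == truth:  # truth estimates stopped changing: fixpoint
            break
        truth = new_truth
    return truth, trust


# Toy data (illustrative only): views C and D contradict A, B and E.
example_votes: Votes = {
    "A": {"f1": 1, "f2": 1, "f3": 1, "f4": 1},
    "B": {"f1": 1, "f2": 1, "f3": 1},
    "C": {"f1": -1, "f2": -1, "f3": -1, "f4": -1},
    "D": {"f1": -1, "f4": -1},
    "E": {"f1": 1, "f2": 1, "f3": 1},
}

if __name__ == "__main__":
    print(majority_vote(example_votes))
    print(corroborate(example_votes))
```

On this toy data, majority voting predicts `f4` false because two of the three views mentioning it vote against it. The iteration instead lowers the trust in C and D, since they disagree with the majority on the other facts, and ends up predicting `f4` true. Whether such a flip is desirable depends on the data; the sketch only shows how truth and trust estimates can feed into each other.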


Published in

WSDM '10: Proceedings of the third ACM international conference on Web search and data mining
February 2010
468 pages
ISBN: 9781605588896
DOI: 10.1145/1718487

Copyright © 2010 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 498 of 2,863 submissions, 17%
