The Robustness of Knowledge: Analysis of Certainty in Narrative Knowledge Bases
2010 (English). Other (popular science, discussion, etc.). Text.
Knowledge constitutes the lifeblood of many organizations. Efforts to manage it systematically necessitate mapping and appraising the existing body of knowledge, not least in order to identify gaps that need to be bridged. The extent to which knowledge can be relied upon bears heavily on organizational decision-making. Monitoring the state of organizational knowledge is greatly facilitated by codifying it in a highly formalized form, which typically incorporates a certainty factor. Despite ongoing knowledge management endeavors, most knowledge has yet to be formalized: valuable knowledge is retained in scores of documents, which are loosely structured at best. Means of appraising the reliability of such knowledge are ostensibly lacking. In recent years, uncertainty in text has increasingly come into the research spotlight. This line of research is here extended to a slightly different domain in an attempt to discover ways of gauging the reliability of knowledge based on the subjective perspective of the author, who, through a particular choice of words and expressions, implicitly or explicitly judges the certainty of the knowledge he or she wishes to communicate. A holistic perspective on certainty, in which not only uncertainty but also signs of increased certainty are considered, requires a classification of statements into various certainty levels. Moreover, a differentiation between types of statement is important because they differ in the degree to which they claim to constitute knowledge. Based on previous approaches and an extensive literature review, a set of guidelines for the annotation of certainty in knowledge-intensive text is proposed. The feasibility of the proposed approach is evaluated on a small corpus comprising documents from The World Bank, which are annotated in two sets. The resulting annotations are analyzed by comparing the level of agreement between different annotators.
Difficulties in distinguishing between types of statement (statements that give an account of something and statements that express clear knowledge claims) and in determining the level of certainty (a four-level classification from very certain to very uncertain) lead to significant inconsistencies (F1-scores of 0.28 for exact matches and 0.41 for partial matches). Further refinement of the guidelines is therefore necessary before automatic classification can be attempted.
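The agreement figures above are reported as F1-scores for exact and partial matches. As a rough illustration only, the sketch below shows one common way such pairwise agreement could be computed between two annotators. The span representation, the label names, the overlap rule for partial matches, and the requirement that labels agree are all assumptions made for this example, not the procedure documented in the thesis.

```python
from typing import List, Tuple

# Hypothetical representation: (start offset, end offset, label), end exclusive.
Span = Tuple[int, int, str]


def overlaps(x: Span, y: Span) -> bool:
    """True if the two character ranges share at least one position."""
    return x[0] < y[1] and y[0] < x[1]


def is_match(x: Span, y: Span, partial: bool) -> bool:
    """Assumed rule: labels must agree; spans must be identical (exact)
    or merely overlap (partial)."""
    if x[2] != y[2]:
        return False
    if partial:
        return overlaps(x, y)
    return (x[0], x[1]) == (y[0], y[1])


def f1_agreement(a: List[Span], b: List[Span], partial: bool = False) -> float:
    """Pairwise F1 between two annotation sets, treating `a` as the
    reference and `b` as the response. Swapping them swaps precision
    and recall, so the F1 value itself is unchanged."""
    if not a or not b:
        return 0.0
    recall = sum(any(is_match(x, y, partial) for y in b) for x in a) / len(a)
    precision = sum(any(is_match(y, x, partial) for x in a) for y in b) / len(b)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented toy annotations, not data from the study.
annotator_a = [(0, 10, "certain"), (20, 30, "uncertain"), (40, 50, "very_certain")]
annotator_b = [(0, 10, "certain"), (22, 30, "uncertain"), (60, 70, "certain")]

exact_f1 = f1_agreement(annotator_a, annotator_b)
partial_f1 = f1_agreement(annotator_a, annotator_b, partial=True)
```

In this toy case the second pair of spans overlaps without sharing boundaries, so it counts only under partial matching, which is why the partial score comes out higher than the exact one.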
Keywords: certainty, uncertainty, hedging, annotation, knowledge management
Research subject: Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-132855
OAI: oai:DiVA.org:su-132855
DiVA: diva2:955409