Slide 1: D. Knowledge Representation

Slide 2: Principles of knowledge representation

Davis, Shrobe, and Szolovits (1993) discussed five principles for a knowledge representation. A knowledge representation is
1. a surrogate,
2. a set of ontological commitments,
3. a fragmentary theory of intelligent reasoning,
4. a medium for pragmatically efficient computation, and
5. a medium of human expression.

• The way a knowledge representation focuses on these principles characterizes its spirit.
• Each knowledge representation is a trade-off between these principles.

Slide 3: Knowledge representation as surrogate

“Knowledge representation is most fundamentally a surrogate, a substitute for the thing itself, used to enable an entity to determine consequences by thinking rather than acting. [...] Reasoning is a process that goes on internally [of a person or program], while most things it wishes to reason about exist only externally.”

Slide 4: Knowledge representation as a set of ontological commitments

A knowledge representation “is a set of ontological commitments, i.e., an answer to the following question: ‘In what terms should we think about the world?’ [...] In selecting any representation, we are [...] making a set of decisions about how and what to see in the world. [...] We (and our reasoning machines) need guidance in deciding what in the world to attend to, and what to ignore.”

Slide 5: Knowledge representation as theory of intelligent reasoning

“The initial conception of a knowledge representation is typically motivated by some insight indicating how people reason intelligently, or by some belief about what it means to reason intelligently at all. [...]”

It is a fragmentary theory of intelligent reasoning, expressed in terms of three components:
(i) the representation’s fundamental conception of intelligent reasoning;
(ii) the set of inferences the representation sanctions;
(iii) and the set of inferences it recommends.

The authors consider five fields which have provided notions of intelligent reasoning:
• Mathematical Logic (e.g., Prolog)
• Psychology (e.g., frames)
• Biology (e.g., neural networks)
• Statistics (e.g., Bayesian networks)
• Economics (e.g., rational agents)
(See three slides ahead.)

Slide 6: Knowledge representation as medium for efficient computation

Knowledge representation “is a medium for pragmatically efficient computation, i.e., the computational environment in which thinking is accomplished. One contribution to this pragmatic efficiency is supplied by the guidance a representation provides for organizing information so as to facilitate making the recommended inferences.”

Slide 7: Knowledge representation as medium of human expression

“Knowledge representations are [...] the medium of expression and communication in which we tell the machine (and perhaps one another) about the world. [...] Knowledge representation is thus a medium of expression and communication for the use by us. [...] A representation is the language in which we communicate, hence we must be able to speak it without heroic effort.”

Slide 8: Knowledge

Knowledge processing has several roots, which lead to different models of knowledge:
• Biology: interconnection → neural networks
• Mathematical logic: deduction → logic calculi, Prolog
• Statistics: uncertainty → fuzzy logic, Bayesian networks
• Psychology: concepts → knowledge-based systems
• Economics: goals, valuations → agents

Slide 9: Neural networks

• In this lecture we will focus on conceptual knowledge representation.
• First, however, a short excursion into neural networks, to illustrate how different the approaches are.
• Neural networks are covered in detail in the lecture “Knowledge Discovery”.

Slide 10: Connectionist knowledge representation

(Artificial) neural networks are massively parallel interconnected networks of simple (usually adaptive) elements in a hierarchical arrangement or organization, which are intended to interact with the world in the same way as biological nervous systems. [Kohonen 1984]

Slide 11: Neural networks: the biological model

In the past, researchers from a wide range of fields were motivated by the biological model.

The brain consists of approx. 10^11 neurons, which are connected to approx. 10^4 other neurons via approx. 10^13 synapses.

[Figure labels: brain; mesh of neurons]

Slide 12: Neural networks: the biological model

[Figure labels: neurons; synapse]

Slide 13: Neural networks

[Figure: network diagram with inputs x1 ... xm, units y1 ... yn, weights v11 ... vnm and w1 ... wn, and output o]

Slide 14: Neural networks

Figure 1: Neural Network learning to steer an autonomous vehicle (Mitchell 1997)

Slide 15: Neural networks: what are they used for?

• Research:
  – Modelling & simulation of biological neural networks
  – Function approximation
  – Storage of information
  – ...
• Applications (e.g.):
  – Interpretation of sensor data
  – Process control
  – Medicine
  – Electronic nose
  – Character recognition
  – Risk management
  – Time series analysis and forecasting
  – Robot control

Slide 16: Conceptual knowledge

• In this lecture we will focus on conceptual knowledge representation.
• The psychological perspective leads to conceptual models of knowledge:
  – Semantic networks
  – Frames
  – Conceptual graphs
  – Formal concept analysis
  – Description logics
  – Ontologies/Semantic Web

Slide 17: Conceptual knowledge representation

What is a concept?

Concept (Begriff). A unit of thought formed by abstraction from a set of objects, by determining the properties these objects have in common.
Designation (Benennung). A label consisting of one or more words.
Definition. The determination of a concept by linguistic means.
[DIN 2342 Teil 1, Begriffe der Terminologielehre - Grundbegriffe, 10.92]

The set of characteristics that come together as a unit to form the concept is called the intension. The objects viewed as a set and conceptualized into a concept are known as the extension. The two, extension and intension, are interdependent. For example, the characteristics making up the intension of ‘lead pencil’ determine the extension, those objects that qualify as lead pencils, and vice versa.
[ISO 704: Terminology Work: Principles and Methods, ISO 2000]

Slide 18: Conceptual knowledge representation

DIN 2330: Begriffe und Benennungen. Allgemeine Grundsätze

[Diagram: a definition and a designation refer to a concept; the concept abstracts from objects]
Concept level: concept with characteristics a, b, c
Object level:
  Object 1: properties A, B, C, D
  Object 2: properties A, B, C, E
  Object 3: properties A, B, C, F

Slide 19: Conceptual knowledge representation

We now briefly present a few formalisms for representing conceptual knowledge:
• Natural Language
• Concept hierarchies
• Existential Graphs
• Semantic Networks
• Frames
• Conceptual Graphs
• Description Logics

We will meet the last of these again later in the lecture, as the formal foundation for knowledge representation on the WWW.

(See, among others, J. Sowa: Semantic Networks. http://www.jfsowa.com/pubs/semnet.htm)

Slide 20: Knowledge Representation by Natural Language

Advantages of natural language:
• Very expressive.
• Easy to understand for humans.

Problems with natural language (for automatic processing):
• Natural language is often ambiguous.
• Semantics are not formally defined.
• There is little uniformity in the structure of sentences.

Slide 21: Knowledge Representation by Natural Language

Excursion: Workaround for Information Retrieval and Text Mining:
• “Bag of Words” model (see lecture on Information Retrieval)
• (It is not considered ‘Knowledge Representation’ in the narrow sense.)
• Advantages:
  – easy computation in a vector space
  – “similarity” between documents can be measured
• Disadvantage:
  – the internal structure and meaning of the text get lost

Slide 22: Concept hierarchies

• Going back to Aristotle/Porphyry
• Advantages:
  – easy to understand
  – widely used
• Disadvantage:
  – not very expressive (but sufficient for many applications, e.g., the file system on your PC)

Slide 23: Existential Graphs

• Introduced by Charles S. Peirce in 1897(!).
• Advantages:
  – intuitive reading
  – intuitive calculus
  – formal semantics (but hard to extract from Peirce’s manuscripts)
• Disadvantage:
  – non-linear notation, difficult to represent in the computer

Slide 24: Semantic Networks

• “Semantic Nets” were first invented for computers by Richard H. Richens of the Cambridge Language Research Unit in 1956 as an “interlingua” for machine translation of natural languages.
• They were developed by Robert F. Simmons in the early 1960s and later featured prominently in the work of M. Ross Quillian in 1966.
• They did not provide clear, formal semantics. → Different implementations could treat the same data differently!

Slide 25: Semantic Networks

The example above shows default reasoning: usually, elephants are grey.
• Problem: How are royal elephants treated? → The formalism is ambiguous. Different implementations may treat the same data differently. A formal semantics would answer this question.

Later, Description Logics were introduced to overcome this problem by providing such a semantics.

Slide 26: Frames

• Much like a semantic network, except that each node represents a prototypical concept and/or situation.
• Each node has several property slots whose values may be specified or inherited by default.
• Frames are similar to object-oriented modelling.
• Advantages:
  – intuitive for many users
• Disadvantages:
  – formal semantics are missing

Slide 27: Conceptual graphs

• Based on Peirce’s existential graphs
• Aim at providing a formal semantics by mapping to predicate logic.
• Advantages:
  – intuitive reading
  – expressive (beyond first-order logic, see example above)
  – provide a linear notation for easier computation
• Disadvantages:
  – the mapping to predicate logic is not (yet) well defined → formal semantics are still missing

Slide 28: Description Logics

• Description logics are a family of decidable fragments of predicate logic.
• In contrast to frames and semantic networks, they provide a formal semantics.
• First representative: KL-ONE (1985)
• Today they are the basis for the Semantic Web (→ Chapters E & F)

Slide 29: Conclusion

• There are many formalisms for representing conceptual knowledge.
• Depending on the purpose, a different formalism might be the most suitable.
• In this light, we will discuss in more detail (in the next section) three formalisms that were established for the (Semantic) Web: XML, RDF, OWL.
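
Illustration (not part of the original slides): the ambiguity noted for semantic networks (how are royal elephants treated?) can be made concrete with a minimal Python sketch. Everything named here is an assumption chosen for illustration: the individual clyde, the is-a chain, and in particular the exception that royal elephants are white. The two lookup functions stand for two hypothetical implementations that read the same network differently; neither is claimed to be how any particular system works.

```python
from typing import List, Optional

# Hypothetical is-a links of a tiny semantic network (child -> parent).
IS_A = {
    "clyde": "royal_elephant",
    "royal_elephant": "elephant",
    "elephant": None,
}

# Default colours attached to nodes. "grey" follows the slide's example;
# "white" for royal elephants is an assumed exception for illustration.
COLOUR = {
    "elephant": "grey",
    "royal_elephant": "white",
}


def ancestors(node: str) -> List[str]:
    """Return the node followed by its is-a ancestors, most specific first."""
    chain = []
    current: Optional[str] = node
    while current is not None:
        chain.append(current)
        current = IS_A[current]
    return chain


def colour_nearest_wins(node: str) -> Optional[str]:
    """Implementation A: the most specific node that states a colour wins."""
    for n in ancestors(node):
        if n in COLOUR:
            return COLOUR[n]
    return None


def colour_general_wins(node: str) -> Optional[str]:
    """Implementation B: the most general node that states a colour wins."""
    for n in reversed(ancestors(node)):
        if n in COLOUR:
            return COLOUR[n]
    return None


print(colour_nearest_wins("clyde"))  # white
print(colour_general_wins("clyde"))  # grey
```

Both functions read exactly the same data, yet they disagree about Clyde's colour. Without a formal semantics, nothing in the network itself says which reading is the intended one.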