

Network Morphology  

Andrew Hippisley

The morphological machinery of a language is at the service of syntax, but the service can be poor. A request may result in the wrong item (deponency), or in an item the syntax already has (syncretism), or in an abundance of choices (inflectional classes or morphological allomorphy). Network Morphology regulates the service by recreating the morphosyntactic space as a network of information sharing nodes, where sharing is through inheritance, and inheritance can be overridden to allow for the regular, irregular, and, crucially, the semiregular. The network expresses the system; the way the network can be accessed expresses possible deviations from the systematic. And so Network Morphology captures the semi-systematic nature of morphology. The key data used to illustrate Network Morphology are noun inflections in the West Slavonic language Lower Sorbian, which has three genders, a rich case system and three numbers. These data allow us to observe how Network Morphology handles inflectional allomorphy, syncretism, feature neutralization, and irregularity. Latin deponent verbs are used to illustrate a Network Morphology account of morphological mismatch, where morphosyntactic features used in the syntax are expressed by morphology regularly used for different features. The analysis points to a separation of syntax and morphology in the architecture of the grammar. An account is given of Russian nominal derivation which assumes such a separation, and is based on viewing derivational morphology as lexical relatedness. Areas of the framework receiving special focus include default inheritance, global and local inheritance, default inference, and orthogonal multiple inheritance. The various accounts presented are expressed in the lexical knowledge representation language DATR, due to Roger Evans and Gerald Gazdar.
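The idea of default inheritance with overriding can be illustrated with a small sketch. This is not DATR, and it is not the analysis given in the article: it is a simplified Python approximation, and every node name (`NOUN`, `CLASS_I`, `CLASS_II`, `LEXEME_X`) and every exponent (`-P`, `-A`, and so on) is an invented placeholder rather than Lower Sorbian, Latin, or Russian data. A query for a feature path is answered by the most specific local statement, with a statement about a shorter path standing in for all of its extensions; only when a node says nothing relevant is the query passed up to its parent.

```python
class Node:
    """A node in a toy inheritance network (hypothetical, not DATR syntax)."""

    def __init__(self, name, parent=None, facts=None):
        self.name = name
        self.parent = parent            # single parent; None for the root
        self.facts = dict(facts or {})  # maps feature paths (tuples) to values

    def lookup(self, path):
        """Return the value for a feature path such as ("pl", "nom").

        At each node the full path is tried first, then ever shorter
        prefixes, so a fact stated for ("sg",) covers every singular cell
        unless a longer path is stated. Local facts win over inherited ones.
        """
        node = self
        while node is not None:
            for end in range(len(path), -1, -1):
                prefix = path[:end]
                if prefix in node.facts:
                    return node.facts[prefix]
            node = node.parent          # nothing local: inherit by default
        raise KeyError(f"{self.name}: no value for {path}")


# Root node: a general statement that holds unless overridden.
NOUN = Node("NOUN", facts={("pl",): "-P"})

# Two inflectional classes sharing the plural default from NOUN.
CLASS_I = Node("CLASS_I", NOUN, {("sg", "nom"): "-A", ("sg", "acc"): "-B"})
CLASS_II = Node("CLASS_II", NOUN, {("sg",): "-C", ("pl", "nom"): "-D"})

# A lexeme that is regular everywhere except one cell.
LEXEME_X = Node("LEXEME_X", CLASS_I, {("pl", "nom"): "-Z"})

for node in (CLASS_I, CLASS_II, LEXEME_X):
    cells = [("sg", "nom"), ("sg", "acc"), ("pl", "nom"), ("pl", "acc")]
    print(node.name, {cell: node.lookup(cell) for cell in cells})
```

In this sketch `CLASS_II` states a single value for the whole singular, so its nominative and accusative singular come out identical, which is one way a network can express syncretism. `LEXEME_X` overrides only the nominative plural and inherits everything else, which is the kind of partial regularity the abstract calls semiregular.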


Computational Semantics  

Katrin Erk

Computational semantics performs automatic meaning analysis of natural language. Research in computational semantics designs meaning representations and develops mechanisms for automatically assigning those representations and reasoning over them. Computational semantics is not a single monolithic task but consists of many subtasks, including word sense disambiguation, multi-word expression analysis, semantic role labeling, the construction of sentence semantic structure, coreference resolution, and the automatic induction of semantic information from data. The development of manually constructed resources has been vastly important in driving the field forward. Examples include WordNet, PropBank, FrameNet, VerbNet, and TimeBank. These resources specify the linguistic structures to be targeted in automatic analysis, and they provide high-quality human-generated data that can be used to train machine learning systems. Supervised machine learning based on manually constructed resources is a widely used technique. A second core strand has been the induction of lexical knowledge from text data. For example, words can be represented through the contexts in which they appear (called distributional vectors or embeddings), such that semantically similar words have similar representations. Or semantic relations between words can be inferred from patterns of words that link them. Wide-coverage semantic analysis always needs more data, both lexical knowledge and world knowledge, and automatic induction at least alleviates the problem. Compositionality is a third core theme: the systematic construction of structural meaning representations of larger expressions from the meaning representations of their parts. The representations typically use logics of varying expressivity, which makes them well suited to performing automatic inferences with theorem provers.
Manual specification and automatic acquisition of knowledge are closely intertwined. Manually created resources are automatically extended or merged. The automatic induction of semantic information is guided and constrained by manually specified information, which is much more reliable. And for restricted domains, the construction of logical representations is learned from data. It is at the intersection of manual specification and machine learning that some of the current larger questions of computational semantics are located. For instance, should we build general-purpose semantic representations, or is lexical knowledge simply too domain-specific, and would we be better off learning task-specific representations every time? When performing inference, is it more beneficial to have the solid ground of a human-generated ontology, or is it better to reason directly with text snippets for more fine-grained and gradual inference? Do we obtain a better and deeper semantic analysis as we use better and deeper manually specified linguistic knowledge, or is the future in powerful learning paradigms that learn to carry out an entire task from natural language input and output alone, without pre-specified linguistic knowledge?
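The notion of representing words through the contexts in which they appear can be made concrete with a minimal sketch. It assumes the simplest possible setup: raw co-occurrence counts within a fixed window, compared by cosine similarity. The five-sentence corpus, the function names `cooccurrence_vectors` and `cosine`, and the window size are all invented for illustration; practical systems use far larger corpora and usually weighted or learned vectors.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus, invented purely for illustration.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the cat",
    "the car needs fuel",
]

vecs = cooccurrence_vectors(corpus)
for other in ("dog", "car", "fuel"):
    print(f"cat ~ {other}: {cosine(vecs['cat'], vecs[other]):.3f}")
```

On this toy corpus "cat" comes out closer to "dog" than to "car", because the two animal words share contexts such as "drinks" and "chases". Much of the remaining similarity to "car" comes from the function word "the", which is one reason raw counts are normally reweighted in practice.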