Unlike morphology (the study of the internal formal structure of words) and semantics (the study of the meaning of words and sentences), morphosemantics is concerned with the link between marker and meaning. Traditional approaches to morphosemantics, such as semiotics and construction grammar, hold that the relationship between image acoustique and concept is symbolic. This works well if the links are known (the “proficiency mode”). In this talk I argue that there is a statistical alternative, which is particularly useful if the links are not known (the “discovery mode”). Meanings and markers form collocations in texts, and the strength of these collocations can be quantified with collocation measures. However, there is considerable non-isomorphism between marker and meaning. As is well known, a marker can have many different meanings (polysemy). Somewhat less well known is that a meaning is often expressed by many different markers, both paradigmatically and syntagmatically (polymorphy).
To make meanings and markers commensurable, they must be converted into units of the same kind: the set of contexts in a text or corpus where a marker or meaning occurs. If the distribution of a meaning in a corpus is known, its corresponding marker complex can be determined. A marker complex consists of a paradigmatically and syntagmatically ordered set of simple markers. The markers considered here are surface markers of two types: word forms and morphs (continuous character strings within word forms). More abstract marker types such as lexemes, grammatical categories and word classes might often be better markers than surface markers, but they are not available in the discovery mode.
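The idea of comparing a meaning and a marker through their context sets can be sketched in a few lines. The sketch below is an illustration only: the Dice coefficient is assumed as one possible collocation measure (the abstract does not name a specific one), the toy English sentences and the context ids are invented, and the function names `marker_contexts`, `dice` and `rank_markers` are hypothetical.

```python
from collections import defaultdict


def marker_contexts(verses):
    """Map each word form to the set of context ids in which it occurs."""
    contexts = defaultdict(set)
    for ctx_id, text in verses.items():
        for form in text.lower().split():
            contexts[form].add(ctx_id)
    return dict(contexts)


def dice(a, b):
    """Dice coefficient of two context sets (assumed measure, not the author's)."""
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def rank_markers(meaning_contexts, verses):
    """Rank word forms by how well their contexts match the meaning's contexts."""
    scored = [
        (form, dice(meaning_contexts, ctxs))
        for form, ctxs in marker_contexts(verses).items()
    ]
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Invented toy corpus: context id -> text.
toy_verses = {
    1: "he is greater than john",
    2: "she is taller than the tree",
    3: "he went to the house",
    4: "the tree is tall",
    5: "this is more than enough",
}
# Assumed to be known in advance: the contexts where comparison is expressed.
comparison_contexts = {1, 2, 5}

ranking = rank_markers(comparison_contexts, toy_verses)
print(ranking[:3])
```

In this toy corpus the word form "than" occurs in exactly the three comparison contexts and therefore ranks first with a score of 1.0, while the frequent but unspecific form "is" comes second. Nothing in the ranking step depends on knowing in advance which forms are relevant.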
Marker complexes are a simple construction type. A procedural approach to construction grammar is adopted here, in which marker complexes are viewed as an intermediate stage in a processing chain of increasingly complex construction types, from simple markers via marker complexes to syntactic constructions. Marker complexes have the advantage that they can be extracted automatically from massively parallel texts, i.e. translations of the same text into many languages, such as the New Testament used here. In parallel texts the same meanings are (with certain restrictions) expressed across different languages. This means that a functional domain can be defined as a set of contexts where a certain meaning occurs.
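How a marker complex might be assembled from simple markers can be illustrated with a greedy sketch. This is not the extraction procedure used in the talk, which the abstract does not spell out. It assumes that a paradigmatic combination can be modelled as a union of context sets ("or"), a syntagmatic combination as an intersection ("and"), and that the Dice coefficient serves as the collocation measure. The markers "ka", "mi" and "lo" and all context ids are invented, and `greedy_complex` is a hypothetical name.

```python
def dice(a, b):
    """Dice coefficient of two context sets (assumed collocation measure)."""
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def greedy_complex(domain, marker_ctx, max_parts=3):
    """Greedily combine simple markers while the fit to the domain improves.

    Returns (parts, score), where parts is a list of (operator, marker) pairs.
    "and" models a syntagmatic combination, "or" a paradigmatic one.
    """
    names = sorted(marker_ctx)
    best = max(names, key=lambda m: dice(domain, marker_ctx[m]))
    parts = [("start", best)]
    covered = set(marker_ctx[best])
    score = dice(domain, covered)
    while len(parts) < max_parts:
        used = {marker for _, marker in parts}
        candidates = []
        for m in names:
            if m in used:
                continue
            candidates.append((dice(domain, covered | marker_ctx[m]), "or", m))
            candidates.append((dice(domain, covered & marker_ctx[m]), "and", m))
        if not candidates:
            break
        new_score, op, m = max(candidates, key=lambda c: c[0])
        if new_score <= score:
            break  # no combination improves the fit
        covered = covered | marker_ctx[m] if op == "or" else covered & marker_ctx[m]
        parts.append((op, m))
        score = new_score
    return parts, score


# Invented functional domain and invented simple markers with their contexts.
domain = {1, 2, 3, 4, 5, 6}
marker_ctx = {
    "ka": {1, 2, 3, 4, 7, 8, 9, 10},
    "mi": {1, 2, 3, 4, 11, 12, 13, 14},
    "lo": {5, 6},
}

parts, score = greedy_complex(domain, marker_ctx)
print(parts, score)
```

On this toy input the sketch first narrows "ka" by requiring "mi" in the same context, then adds "lo" as an alternative, giving a complex that covers the domain exactly. The flat left-to-right combination is a simplification; a real marker complex may need a nested structure.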
The same procedure is applied to cross-linguistically comparable material in every language, and because the procedure is fully explicit, it is replicable. It can be implemented in a computer program and run without the intervention of a typologist (algorithmic typology). The underlying idea is that the procedure of extraction is invariant (procedural universal), whereas the extracted structures can be highly variable depending on the texts and languages to which the procedure is applied.
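The contrast between an invariant procedure and variable results can be made concrete with a small sketch. It rests on several assumptions: candidate morphs are taken to be all continuous substrings of at least two characters, the Dice coefficient is assumed as the collocation measure, and ties are resolved in favour of the longer string. The parallel sentences are invented, the third "language" is a made-up toy language with a negative prefix, and the names `surface_markers` and `best_marker` are hypothetical.

```python
def dice(a, b):
    """Dice coefficient of two context sets (assumed collocation measure)."""
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def surface_markers(word, min_len=2):
    """All continuous substrings of a word form, including the form itself."""
    return {
        word[i:j]
        for i in range(len(word))
        for j in range(i + min_len, len(word) + 1)
    }


def best_marker(domain, verses):
    """Return the surface marker whose contexts best match the domain."""
    ctx = {}
    for verse_id, text in verses.items():
        for word in text.lower().split():
            for marker in surface_markers(word):
                ctx.setdefault(marker, set()).add(verse_id)
    # Prefer the higher score, then the longer string; sorting fixes the order.
    return max(sorted(ctx), key=lambda m: (dice(domain, ctx[m]), len(m)))


# Invented parallel "texts": the same four contexts in each language.
parallel = {
    "english": {1: "he is here", 2: "he is not here",
                3: "they are good", 4: "they are not good"},
    "german": {1: "er ist hier", 2: "er ist nicht hier",
               3: "sie sind gut", 4: "sie sind nicht gut"},
    # Made-up toy language with a negative prefix, for illustration only.
    "toy_prefixing": {1: "tala kon", 2: "matala kon",
                      3: "siro bem", 4: "masiro bem"},
}
negation_domain = {2, 4}  # contexts assumed to express negation

results = {lang: best_marker(negation_domain, verses)
           for lang, verses in parallel.items()}
print(results)
```

The function `best_marker` is identical for all three inputs, yet it returns a free word form for English and German and a bound string for the toy language. Which kind of marker is found depends on the language, not on the procedure.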
The talk considers to what extent surface markers are sufficient as input for the identification of constructions in a range of grammatical and lexical domains in a world-wide convenience sample of somewhat more than 50 languages. One of the domains considered in more detail is comparison of inequality. In most languages of the sample, comparison of inequality is expressed by a marker complex with at least two parts: a standard marker (‘than’) and a predicate intensifier (‘more’, ‘-er’). I will argue that both parts are intrinsic to the comparative construction. These findings are not fully in accordance with Leon Stassen’s typology of comparison – a classic study in functional domain typology – which is based exclusively on the encoding of the standard NP. Other domains considered in the talk include negation, ‘want’, future, and predicative possession.
Berner Zirkel für Sprachwissenschaft, Universität Bern, Institut für Sprachwissenschaft