The Coll Corpus: Towards a corpus of web-based college student newspapers
2002 (English). In: New Frontiers of Corpus Research: Papers from the 21st International Conference on English Language Research on Computerized Corpora, Amsterdam: Rodopi, 2002, 71-90 p. Chapter in book (Other academic)
Unlike major English-language corpora hitherto released, on-line college student newspapers provide an unexplored record from much younger writers. In these newspapers, 20-year-olds address their peers in a situation that largely parallels standard newspaper writing as regards formal correctness and time pressure. Nearly unconstrained by outside intervention or house style sheets, they deal with a range of university student interests, including creative writing.
This preliminary version of the Coll Corpus consists of one issue each of nearly all 300-plus college and university newspapers available on the Web as of spring 1999, with a total of 3.88 million words. Although AmE dominates, the resultant geographical distribution is relatively well matched to actual population ratios. In its present form, the corpus already allows exploration of numerous lexical and semantic features along temporal and geographic dimensions. Given its on-line accessibility, future versions should be easily expandable by several orders of magnitude.
Place, publisher, year, edition, pages
Amsterdam: Rodopi, 2002. 71-90 p.
Keywords
corpus linguistics, corpora, electronic newspapers, Internet, web newspapers
Identifiers
URN: urn:nbn:se:su:diva-131850
ISBN: 90-420-1237-4
OAI: oai:DiVA.org:su-131850
DiVA: diva2:184551