Bayesian Word Alignment for Massively Parallel Texts
Stockholm University, Faculty of Humanities, Department of Linguistics, Computational Linguistics. ORCID iD: 0000-0002-6027-4156
2014 (English). In: Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers, Association for Computational Linguistics, 2014, 123-127 p. Conference paper, Published paper (Refereed)
Abstract [en]

There has been a great amount of work done in the field of bitext alignment, but the problem of aligning words in massively parallel texts with hundreds or thousands of languages is largely unexplored. While the basic task is similar, there are also important differences in purpose, method and evaluation between the problems. In this work, I present a non-parametric Bayesian model that can be used for simultaneous word alignment in massively parallel corpora. This method is evaluated on a corpus containing 1144 translations of the New Testament.

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2014. 123-127 p.
Keyword [en]
word alignment, Bayesian models, nonparametric models, Gibbs sampling, parallel corpora, massively parallel corpora
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:su:diva-103011
OAI: oai:DiVA.org:su-103011
DiVA: diva2:714290
Conference
14th Conference of the European Chapter of the Association for Computational Linguistics
Available from: 2014-04-26. Created: 2014-04-26. Last updated: 2014-04-28. Bibliographically approved.

Open Access in DiVA

eacl2014.pdf (208 kB), 96 downloads
File information
File name: FULLTEXT01.pdf
File size: 208 kB
Checksum SHA-512:
2cc880d73b6b0cf45ed4f8ab0aeba983863aa8d0c33b8a22e3d752b6be5fb40024ca69eb66211681542c82adf5cae7baf457247f56b46d9369d480a5909d9f56
Type: fulltext
Mimetype: application/pdf

By author/editor
Östling, Robert