Re: Using MTurk for markup tasks (was Cost of part of speech tagging)
science.linguistics.corpora, msg#00160
Alexandre Rafalovitch wrote:
> An interesting approach would be to use Amazon Mechanical Turk for

Don't know what languages you're interested in. I have thought about "wikifying" other sorts of projects for "low density" languages, such as finding and keeping track of on-line computational resources, or building bilingual text collections.

I have never actually tried this, but it may be instructive to look at the languages for which there are substantial Wikipedia and Wiktionary resources. Last time I looked, the usual suspects (the major and some "minor" European languages, plus Japanese) had at least 100k Wikipedia articles. A slightly wider variety of languages had at least 10k Wikipedia articles, including Arabic (= MSA), Persian, Hebrew, Bahasa Indonesia, Korean, Malay, Thai, Turkish and Chinese. For comparison, the English Wikipedia has 1.5 million articles.

My guess is that "wikification" (counting Amazon Mechanical Turk under that heading) will work best for languages with a substantial number of speakers who have idle time, sufficient income to afford a computer and network connection, and sufficient education for the specific annotation task.

-- 
Mike Maxwell
maxwell@xxxxxxxxxxxxxx