SenseClusters version 0.85 released! (msg#00049, science.linguistics.corpora)
It has been a year or two since I last sent a SenseClusters announcement to this list, so I thought it was time to update you on the current state of the project.

SenseClusters is a free software package that clusters similar contexts using a variety of lexical features and representation methods. It includes support for SVD and a range of clustering algorithms. It also provides several methods for automatically determining the number of clusters in your input contexts. It is language independent and, we hope, easy to use!

We have mostly applied SenseClusters to word sense and name discrimination, but it is really much more general than that. For example, we have run some experiments on clustering email, and the results have been quite promising.

You can download and install SenseClusters on your own Linux or Unix system. If you would prefer not to install it, or you do not have access to a Linux or Unix system, you can use our web interface, or you can run it from a Knoppix CD we have created with SenseClusters already installed.

You can find SenseClusters at the following site, which includes a link to the web interface and a link to download the system.

http://senseclusters.sourceforge.net/

If you would like a Knoppix CD, please visit our demo at NAACL this June in New York City, or write to us and we can either send you a CD or make the ISO image available to you.

The most current version of SenseClusters is 0.85. It features our adaptation of the Gap Statistic, a state-of-the-art method for automatically finding the number of clusters in a data set.

In addition to clustering contexts, SenseClusters provides some support for finding word clusters, and one of the things we will be working on this summer is adding support for Latent Semantic Analysis.

So, there is a lot already included in SenseClusters, and more is planned. Please check it out, and let us know if you have any questions, comments, or suggestions!

Cordially,
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse