
RE: Request for help concerning a LSA problem: msg#00034

science.linguistics.corpora

Subject: RE: Request for help concerning a LSA problem

Dear Cecilie D. Widsteen,

I'm not familiar with the Jama Matrix Package, but recently
I conducted a search for existing implementations of LSA,
so I thought you might find these useful:

1) Text to Matrix Generator (TMG) - Matlab toolbox
http://scgroup.hpclab.ceid.upatras.gr/scgroup/Projects/TMG/
2) A package for the R Project for Statistical Computing
http://cran.r-project.org/src/contrib/Descriptions/lsa.html
3) General Text Parser (GTP) - C++ code
http://www.cs.utk.edu/~lsi/gtp-request.html
4) Links to additional LSA-related software are available at
http://www.cs.utk.edu/~lsi/soft.html

Regards,

Evgeniy.

--
Evgeniy Gabrilovich
Ph.D. student in Computer Science
Department of Computer Science, Technion - Israel Institute of Technology
Technion City, Haifa 32000, Israel
Email: gabr@xxxxxxxxxxxxxxxxx WWW: http://www.cs.technion.ac.il/~gabr
Phone: +972-4-8294948


> -----Original Message-----
> From: owner-corpora@xxxxxxxxxxxx
> [mailto:owner-corpora@xxxxxxxxxxxx] On Behalf Of Cecilie
> Desiree Widsteen
> Sent: Thursday, May 04, 2006 10:29
> To: corpora@xxxxxx
> Subject: [Corpora-List] Request for help concerning a LSA problem
>
> Hello all,
>
> I'm currently trying to implement Latent Semantic Analysis, as part of
> an automatic classification system. I'm programming in Java, and using
> the Jama Matrix package for the matrix stuff. I have stumbled over some
> strange problems, and would be grateful if anyone on this list could
> offer some help.
>
> My problem is: I have implemented a class which takes care of building
> a matrix representation of a corpus, and performs SVD over the
> term-by-document matrix. Most of the operations are done by the Jama
> class "Matrix". This works fine, except for the fact that when I ran
> the program over various small test corpora (like, for instance, the
> one from Chapter 15 in Schütze and Manning's book Foundations of
> Statistical NLP) most of the right and left singular vectors contained
> the correct values but with wrong/reversed sign?! E.g. a vector that
> should have the values [-0.75,-0.28,-0.20, ...] is assigned the values
> [0.75,0.28, ...]. Unfortunately, I have limited experience with linear
> algebra and the like, so now I find myself completely at a loss in
> debugging this...
> As far as I can understand, this means that my vectors are pointing in
> the opposite direction from the one they should, but why this is so
> escapes my understanding :)
>
> Any help, hints, tricks and the like are extremely welcome! I can also
> send over the source code on request.
>
> Regards,
> --
> Cecilie D. Widsteen
> Department of Linguistics
> University of Oslo
>
>
>
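
The sign reversal described in the quoted message is consistent with a
general property of the SVD: singular vectors are determined only up to
sign. If a left singular vector and its paired right singular vector are
both negated, the product is unchanged, so two correct implementations
may report vectors with opposite signs. The following is a minimal
pure-Python sketch of that property, not Jama code. The matrices U, s, V
and the function names are hypothetical and chosen only for illustration.

```python
def reconstruct(U, s, V):
    """Rebuild A = sum_i s[i] * u_i * v_i^T, where u_i and v_i are column i of U and V."""
    k = len(s)
    return [[sum(s[i] * u_row[i] * v_row[i] for i in range(k))
             for v_row in V]
            for u_row in U]


def flip_pair(U, V, i):
    """Return copies of U and V with column i negated in both."""
    U2 = [[-x if j == i else x for j, x in enumerate(row)] for row in U]
    V2 = [[-x if j == i else x for j, x in enumerate(row)] for row in V]
    return U2, V2


def normalize_signs(U, V):
    """One possible convention (an assumption, not a standard):
    flip each (u_i, v_i) pair so that the largest-magnitude entry
    of u_i is positive."""
    for i in range(len(U[0])):
        pivot = max((row[i] for row in U), key=abs)
        if pivot < 0:
            U, V = flip_pair(U, V, i)
    return U, V


# Hypothetical 2x2 example: columns of U and V are orthonormal.
U = [[-0.6, 0.8],
     [-0.8, -0.6]]
s = [2.0, 1.0]
V = [[-1.0, 0.0],
     [0.0, 1.0]]

A = reconstruct(U, s, V)
U_flipped, V_flipped = flip_pair(U, V, 0)
B = reconstruct(U_flipped, s, V_flipped)  # same matrix as A
```

Comparing A and B entry by entry shows that they agree, even though the
first column of U_flipped has the opposite sign from the first column of
U. When checking results against a textbook, it may therefore help to
compare vectors only after applying one sign convention to both sides,
for example with normalize_signs above.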




