Entities.encode is not UTF-8 compliant: msg#00210

nutch-user.lucene.apache.org

Subject: Entities.encode is not UTF-8 compliant

I'm testing Nutch in a French setup, and I just came across a problem with
accented characters when searching.

While debugging I found:
+ search.jsp works in UTF-8 throughout, so the query string is UTF-8 encoded
+ yet it calls Entities.encode, which seems to assume an 8-bit string encoding,
probably ISO-8859-1 (Latin-1)
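
To illustrate the kind of mismatch suspected above, here is a minimal sketch in plain Java. It does not call Nutch or Entities.encode; the class EncodingMismatch and the helper misreadAsLatin1 are hypothetical names used only for this example. It shows what happens when the UTF-8 bytes of an accented query are treated as one character per byte, which is the general symptom one would expect if some step assumes ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {

    // Hypothetical helper (not part of Nutch): take the UTF-8 bytes of a
    // string and decode them as ISO-8859-1, i.e. one character per byte.
    public static String misreadAsLatin1(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String query = "caf\u00e9"; // "cafe" with an acute accent on the e

        String garbled = misreadAsLatin1(query);

        // The accented character takes two bytes in UTF-8, so the
        // misread string has one extra character.
        System.out.println("original length: " + query.length());
        System.out.println("misread length:  " + garbled.length());

        // Print code points so the result does not depend on the
        // console's own encoding.
        garbled.chars().forEach(c -> System.out.printf("U+%04X%n", c));
    }
}
```

Pure ASCII queries pass through such a step unchanged, which would explain why the problem only shows up with accented search terms.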

Has anyone come across this issue, or does anyone have a patch for it?

Just to make sure, I replaced every utf-8 declaration in search.jsp with
iso-8859-1, and search then works perfectly.

Looks like a bug to me.

--
-MilleBii-