Re: a simple algorithm problem: msg#00003 (search.snowball)
On Thu, Jan 06, 2005 at 09:12:34AM +0000, Martin Porter wrote:

> So one idea is to declare 'utf8' in the Snowball script, allowing character
> defs in the range 0-64K, as in the 2-byte character version. Characters
> could be written with their Unicode values.

Presumably this still restricts Snowball to code points in the BMP? Or does it just restrict it to recognising and doing things with characters at code points in the BMP, passing through any others? There's not a huge amount outside it yet, so this may not matter at all.

> and encoded in utf-8 form in strings.

What's the character encoding of snowball scripts at the moment? It isn't touched upon in the manual, so I'm guessing at present it's expected to be ASCII or similar.

Cheers,
James

-- 
/--------------------------------------------------------------------------\
  James Aylett                                                  xapian.org
  james@xxxxxxxxxxxx                               uncertaintydivision.org
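
As a rough illustration of the BMP question raised in the message: the "0-64K" range corresponds to code points U+0000 to U+FFFF, which UTF-8 encodes in one to three bytes, while anything outside the BMP takes four. The sketch below is plain Python, not Snowball code; the helper names `in_bmp` and `utf8_length` are made up for this example and say nothing about how Snowball itself handles such characters.

```python
def in_bmp(code_point: int) -> bool:
    """True if the code point lies in the Basic Multilingual Plane (0-64K)."""
    return 0 <= code_point <= 0xFFFF


def utf8_length(code_point: int) -> int:
    """Number of bytes UTF-8 uses to encode a single code point."""
    return len(chr(code_point).encode("utf-8"))


# Sample code points: 'A', e-acute, euro sign, and a musical symbol
# (U+1D11E) that sits outside the BMP.
for cp in (0x41, 0xE9, 0x20AC, 0x1D11E):
    print(f"U+{cp:04X}: in BMP = {in_bmp(cp)}, UTF-8 bytes = {utf8_length(cp)}")
```

A stemmer limited to 0-64K character definitions could, in principle, still copy four-byte sequences through untouched, which is the "passing through any others" case asked about above.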