Re: does calcite support chinese?


Thanks for taking this up, Ted.

One change that I think would be useful is to deprecate SaffronProperties.defaultCharset, .defaultNationalCharset, .defaultCollationName, and .defaultCollationStrength, and make them regular connection properties (i.e. in CalciteConnectionProperty), probably dropping "default" from their names. Each connection could then set them without modifying a .properties file.
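
To illustrate why the charset setting matters here, the following is a minimal sketch that uses only the JDK's java.nio.charset classes, not Calcite's own API. The class name CharsetCheck and the helper canEncode are hypothetical names chosen for this example. It checks whether a string literal can be represented in a given charset, which is roughly the constraint a non-ASCII literal runs into under an ISO-8859-1 default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetCheck {
    // Hypothetical helper: true if every character of the literal
    // can be encoded in the given charset.
    static boolean canEncode(String literal, Charset charset) {
        // A fresh encoder per call; CharsetEncoder instances are not thread-safe.
        return charset.newEncoder().canEncode(literal);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this file independent of the source encoding.
        String russian = "\u043f\u0440\u0438\u0432\u0435\u0442"; // "privet" in Cyrillic
        String chinese = "\u4f60\u597d";                         // "ni hao" in Chinese

        System.out.println(canEncode("hello", StandardCharsets.ISO_8859_1)); // true
        System.out.println(canEncode(russian, StandardCharsets.ISO_8859_1)); // false
        System.out.println(canEncode(chinese, StandardCharsets.ISO_8859_1)); // false
        System.out.println(canEncode(russian, StandardCharsets.UTF_8));      // true
        System.out.println(canEncode(chinese, StandardCharsets.UTF_8));      // true
    }
}
```

A per-connection charset property would let an application choose a charset in which such literals are encodable, rather than relying on a process-wide default.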

Julian



> On Nov 9, 2018, at 11:29 PM, Ted Xu <frankxus@xxxxxxxxx> wrote:
> 
> +1 for full charset support.
> 
> There may be multiple changes to make, not only related to literals but
> also to type system, casts, (jdbc) binary protocols etc.
> 
> I'd like to propose some design ideas since I'm from a CJK country and
> we've already tackled most of Calcite's charset issues.
> 
> On Thu, Nov 8, 2018 at 3:22 AM Julian Hyde <jhyde@xxxxxxxxxx> wrote:
> 
>>> I can't remember another database that allows just ISO-8859-1 in simple
>>> string literals.
>>> That makes it very surprising.
>> 
>> Try SQL Server on http://rextester.com/. You need to
>> prefix literals with 'N', like this: N'привет'
>> 
>>> What is the reason for Calcite to enforce ISO-8859-1 by default?
>>> 
>>> In other words, I can't imagine a project that would pick ISO-8859-1 as a
>>> default string literal encoding if they HAVE to make an explicit choice
>>> (e.g. no default within Calcite).
>> 
>> I really don’t know.
>> 
>> I would very much appreciate it if someone took ownership of this issue, took
>> the time to understand what Calcite does today, documented it, understood
>> what the SQL standard says, and made improvements.
>> 
>> Julian
>> 
>> 
>>> On Nov 7, 2018, at 11:08 AM, Vladimir Sitnikov <sitnikov.vladimir@xxxxxxxxx> wrote:
>>> 
>>>> The issue is not the encoding of our Java code. The issue is the encoding
>>>> of the SQL we process. That SQL may or may not come from Java source files.
>>> 
>>> I can't remember another database that allows just ISO-8859-1 in simple
>>> string literals.
>>> That makes it very surprising.
>>> 
>>> For instance, https://rextester.com/l/postgresql_online_compiler allows
>>> <<select 'привет'='привет' >>
>>> 
>>> What is the reason for Calcite to enforce ISO-8859-1 by default?
>>> 
>>> In other words, I can't imagine a project that would pick ISO-8859-1 as a
>>> default string literal encoding if they HAVE to make an explicit choice
>>> (e.g. no default within Calcite).
>>> 
>>> Vladimir
>> 
>>