Re: Continuing encoding fun....: msg#00195

db.postgresql.odbc

Subject: Re: Continuing encoding fun....



> -----Original Message-----
> From: pgsql-odbc-owner@xxxxxxxxxxxxxx
> [mailto:pgsql-odbc-owner@xxxxxxxxxxxxxx] On Behalf Of Marc Herbert
> Sent: 22 November 2005 09:33
> To: pgsql-odbc@xxxxxxxxxxxxxx
> Subject: Re: [ODBC] Continuing encoding fun....
>
> "Dave Page" <dpage@xxxxxxxxxxxxxxxxxx> writes:
>
> >> I agree that 4) can never work, because ODBC does not seem compatible
> >> with multibyte apps by design. ODBC caters for "ANSI" and "Unicode"
> >> strings, that's all.
> >> <http://blogs.msdn.com/oldnewthing/archive/2004/05/31/144893.aspx>
> >
>
> > Actually our ANSI driver works quite nicely in various non-Unicode
> > multibyte encodings such as Shift-JIS, EUC_CN, JOHAB and more. It'll
> > even work with pure UTF-8 in multibyte mode using the ANSI API.
>
> Great.
>
> Out of curiosity, is this because all the ODBC code has a "don't
> touch" attitude in this full-ANSI case, leaving all string data as is?
> Or is there something more clever? Who performs the conversion if the
> database is in UTF-8 for instance? Multibyte cases seem to fall outside
> the scope of the ODBC spec, which refers only to "ANSI" and "Unicode".

No, multibyte support was intentionally added by Eiji Tokuya in 2001. Don't ask
me how it works, though, as I really don't know. Much of the code for it is in
multibyte.c if you want to take a peek.


> Very interesting. Maybe the driver manager does so only because it
> cannot/fails to get the active codepage, falling back on CP-1252?
> (CP1252 ~= latin1,
> <http://czyborra.com/charsets/codepages.html#CP1252>)

The docs are somewhat fuzzy on this point, simply stating that

"If the driver is a Unicode driver, the Driver Manager makes function calls as
follows:" ... "Converts an ANSI function (with the A suffix) to a Unicode
function (with the W suffix) by converting the string arguments into Unicode
characters and passes the Unicode function to the driver."


(http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcunicode_applications.asp)
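
As a rough illustration of what the quoted conversion step implies, here is a
minimal Python sketch. It is not Driver Manager code: ansi_to_wide is a
hypothetical helper, and which code page a real Driver Manager assumes is
exactly the open question in this thread. The sketch only shows that the result
of an A-to-W conversion depends on the code page the converter assumes.

```python
def ansi_to_wide(ansi_bytes: bytes, assumed_codepage: str) -> bytes:
    """Hypothetical stand-in for an A-to-W string conversion.

    Decodes the "ANSI" bytes with the code page the converter assumes,
    then re-encodes them as UTF-16LE, the wide-character form used by
    the W functions on Windows.
    """
    text = ansi_bytes.decode(assumed_codepage, errors="replace")
    return text.encode("utf-16-le")


# An application working in Shift-JIS hands these bytes to an A function.
original = "日本"
app_bytes = original.encode("shift_jis")

# Converter that knows the real encoding: the text survives.
right = ansi_to_wide(app_bytes, "shift_jis").decode("utf-16-le")

# Converter that assumes CP1252 (an assumption made for this example):
# every byte is treated as a separate Western character.
wrong = ansi_to_wide(app_bytes, "cp1252").decode("utf-16-le")

print(right == original)  # the round trip is lossless
print(wrong == original)  # the text has been corrupted
```

Pure ASCII data comes out the same either way, which may be why this kind of
problem tends to go unnoticed until extended characters are involved.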

My assertion that the driver does the conversion comes from the SQL Server
driver which allows you to turn conversion on or off:

"Perform translation for character data check box

When selected, the SQL Server ODBC driver converts ANSI strings sent between
the client computer and SQL Server by using Unicode. The SQL Server ODBC driver
sometimes converts between the SQL Server code page and Unicode on the client
computer. This requires that the code page used by SQL Server be one of the
code pages available on the client computer.

When cleared, no translation of extended characters in ANSI character strings
is done when they are sent between the client application and the server. If
the client computer is using an ANSI code page (ACP) different from the SQL
Server code page, extended characters in ANSI character strings may be
misinterpreted. If the client computer is using the same code page for its ACP
that SQL Server is using, the extended characters are interpreted correctly."

If Microsoft intended the DM to do the conversion when they wrote the spec, why
would they then add the same functionality to their driver?
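
The effect described in that help text can be sketched in a few lines of
Python. This is only a model of the behaviour, not of the SQL Server driver
itself: client_sees is a hypothetical function, and the two code pages (CP850
on the server, CP1252 as the client ACP) are assumptions picked for the
example.

```python
SERVER_CODEPAGE = "cp850"   # assumed server code page (example only)
CLIENT_ACP = "cp1252"       # assumed client ANSI code page (example only)


def client_sees(server_bytes: bytes, translate: bool) -> str:
    """Model what a client application ends up displaying.

    translate=True  : bytes go server code page -> Unicode -> client ACP.
    translate=False : bytes are handed over untouched.
    Either way the application interprets what it receives in its own ACP.
    """
    if translate:
        text = server_bytes.decode(SERVER_CODEPAGE)
        client_bytes = text.encode(CLIENT_ACP, errors="replace")
    else:
        client_bytes = server_bytes
    return client_bytes.decode(CLIENT_ACP, errors="replace")


stored = "café".encode(SERVER_CODEPAGE)  # bytes as held on the server

print(client_sees(stored, translate=True))    # extended character preserved
print(client_sees(stored, translate=False))   # extended character misread
print(client_sees(b"plain", translate=False)) # ASCII is unaffected
```

If SERVER_CODEPAGE and CLIENT_ACP are set to the same value, both settings give
the same result, which matches the last sentence of the quoted text.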

> >> Is this "bug" true for every driver manager out there?
>
> > It's not really a bug, but I believe so, yes.
>
> including unixodbc and iodbc for instance?

If they follow the parts of the spec I quoted above, and interpret them in the
same way, then yes. However, I'm not overly familiar with either DM, so I can't
say for sure.


> > It gets corrected by
> > the more advanced drivers though - for example, the SQL server
> > driver might see a 'Š' character (8A). It knows the local charset is
> > LATIN4, so it can then rewrite that character to 0160, the Unicode
> > equivalent.
>
> Are you saying that the SQL server driver is fixing the flawed
> conversion job of the driver manager, finally taking the codepage into
> account? Surprising to say the least!
>
> By the way 0x8A is not in the range of latin4
> <http://czyborra.com/charsets/iso8859.html#ISO-8859-4>

http://www.gar.no/home/mats/8859-4.htm says differently. However, I can't claim
to know enough about encoding issues to refute either source. I've been forced
to learn what I can about the subject to help maintain this driver, and I
certainly may have got the wrong end of the stick on one or more points!
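
One way to cross-check the byte values is to ask an independent codec table.
The short Python sketch below uses Python's built-in cp1252 and iso8859_4
codecs. It reports what those tables contain, nothing more, and it says
nothing about what any ODBC driver or Driver Manager actually does.

```python
s_caron = "\u0160"  # LATIN CAPITAL LETTER S WITH CARON, i.e. 'Š'

# In Windows CP1252, byte 0x8A is 'Š'.
print(bytes([0x8A]).decode("cp1252") == s_caron)

# In Python's ISO-8859-4 (Latin-4) table, 'Š' is encoded as byte 0xA9.
print(s_caron.encode("iso8859_4"))

# Python's ISO-8859-4 table maps 0x80-0x9F to C1 control characters,
# so 0x8A decodes to U+008A rather than to a printable letter.
print(repr(bytes([0x8A]).decode("iso8859_4")))
```

By these tables, a 'Š' arriving as byte 8A points to CP1252 rather than
Latin-4, although the conversion to Unicode 0160 comes out the same.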

Regards, Dave.

