Re: How do you handle MS "smart quotes"?: msg#00018

lang.perl.modules.cgi-appplication

Subject: Re: How do you handle MS "smart quotes"?

Graham TerMarsch wrote:
> I've run into an issue on one of the projects that I'm working on and thought
> that I'd ping the list to see how others are handling this...

Lucky you, I just spent a few weeks fighting with this at $work and on Krang :)

> The app accepts form data from the user, runs it through Data::FormValidator
> to validate it, then stuffs it into our PostgreSQL database. We're expecting
> users are going to cut/paste from MS-Word and as a result we're going to have
> to deal with MS "smart quotes".
>
> My issue started with a DB error from DBD::Pg telling me that the input had
> an invalid byte sequence for UTF-8 (the tables in Pg are all encoded as
> UTF-8). Googling around brought me several possible solutions, but I can't
> say that I've found one yet that actually -works-.

The only thing that will really work is to go with one character set all the way
through. I'd recommend UTF-8, because then you'll never have to change when
users want something that ISO-8859-1 or CP-1252 can't represent; UTF-8 can
encode any Unicode character. I will warn you, though, that if you go down the
UTF-8 route there's no magic switch to press. Because UTF-8 characters can be
multibyte, every layer of your application has to know about UTF-8.
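
To make the mismatch concrete, here is a minimal sketch using only the core
Encode module. It assumes the pasted text reaches the server as CP-1252 bytes
(one plausible cause of the DBD::Pg error, not a diagnosis of the original
app); the sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: CP-1252 "smart" double quotes (0x93, 0x94) around "hi".
my $cp1252_bytes = "\x93hi\x94";

# Strict UTF-8 decoding fails: 0x93 and 0x94 are continuation bytes and
# cannot start a UTF-8 sequence. Work on a copy, since decode() with a
# CHECK argument may modify its input.
my $is_valid_utf8 = eval {
    my $copy = $cp1252_bytes;
    decode('UTF-8', $copy, Encode::FB_CROAK);
    1;
} ? 1 : 0;

# Decoding with the character set the bytes are really in gives characters.
my $text = decode('cp1252', $cp1252_bytes);

# Encoding those characters as UTF-8 yields multibyte sequences.
my $utf8_bytes = encode('UTF-8', $text);

printf "valid UTF-8 as submitted: %s\n", $is_valid_utf8 ? 'yes' : 'no';
printf "characters: %d, UTF-8 bytes: %d\n",
    length($text), length($utf8_bytes);
```

The same four characters take four bytes in CP-1252 and eight in UTF-8, which
is why every layer has to agree on which one it is handling.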

You need to do all of the following:

+ Tell the browser that the forms/pages are UTF-8 (using HTTP headers and <meta>
tags).
+ When the form data comes in, decode_utf8() it. If you're using CGI.pm you'll
need 3.30, which hasn't been released yet (you can find it on RT), because it
has some UTF-8 fixes.
+ When reading from or writing to the DB, you'll need to tell the database that
the data is UTF-8. In MySQL that's done with the 'mysql_enable_utf8' flag on
the database handle.
+ If you're doing any file IO which may produce or read UTF-8, you'll need to
make sure that your open() calls use the IO layer syntax.
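
A rough sketch of the decode and file IO steps above, using only core modules.
The header and database steps appear as comments because they need a running
CGI environment and server; $dsn, $user and $pass are placeholders, and the
sample parameter value is invented. For PostgreSQL, DBD::Pg has a similarly
named pg_enable_utf8 attribute, but check its documentation for your version.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);
use File::Temp qw(tempfile);

# Headers: a CGI script would announce the charset along these lines:
#   print "Content-Type: text/html; charset=utf-8\n\n";

# Form data arrives as raw bytes. Hypothetical value: the UTF-8 bytes for
# left and right "smart" double quotes around the word "hi".
my $raw_param = "\xE2\x80\x9Chi\xE2\x80\x9D";
my $text      = decode_utf8($raw_param);    # now a character string

# Database: only hinted at here, since it needs a live server:
#   my $dbh = DBI->connect($dsn, $user, $pass, { mysql_enable_utf8 => 1 });

# File IO: an encoding layer on open() encodes characters on write and
# decodes them on read.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

open my $out, '>:encoding(UTF-8)', $path or die "write $path: $!";
print {$out} $text;
close $out;

open my $in, '<:encoding(UTF-8)', $path or die "read $path: $!";
my $round_trip = do { local $/; <$in> };
close $in;

printf "characters: %d, bytes on disk: %d, round trip ok: %s\n",
    length($text), (-s $path), $round_trip eq $text ? 'yes' : 'no';
```

Without the encoding layer, Perl would warn about wide characters on print
and hand back undecoded bytes on read, so the round trip would not compare
equal to the decoded string.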

The biggest help for me was reading the perluniintro and perlunicode perldoc
pages.

--
Michael Peters
Developer
Plus Three, LP





