Re: How do you handle MS "smart quotes"? (msg#00018, lang.perl.modules.cgi-appplication)
Graham TerMarsch wrote:
> I've run into an issue on one of the projects that I'm working on and thought
> that I'd ping the list to see how others are handling this...

Lucky you, I just spent a few weeks fighting with this at $work and on Krang :)

> The app accepts form data from the user, runs it through Data::FormValidator
> to validate it, then stuffs it into our PostgreSQL database. We're expecting
> users are going to cut/paste from MS-Word and as a result we're going to have
> to deal with MS "smart quotes".
>
> My issue started with a DB error from DBD::Pg telling me that the input had
> an invalid byte sequence for UTF-8 (the tables in Pg are all encoded as UTF-8).
> Googling around brought me several possible solutions, but I can't say that
> I've found one yet that actually -works-.

The only thing that will really work is to go with one character set all the way through. I'd recommend UTF-8, because if you do, you'll never have to change when users want to do something that ISO-8859-1 or CP-1252 can't do. UTF-8 can represent any Unicode character.

I will warn you that if you go down the UTF-8 route, there's no magic switch to press, because UTF-8 can have multibyte characters. You have to make your application aware of UTF-8 all the way through. You need to do all of the following:

+ Tell the browser that the forms/pages are UTF-8 (using HTTP headers and <meta> tags).
+ When the form data comes in, decode_utf8() it. If you're using CGI.pm you'll need to use 3.30, which hasn't been released (you can find it on RT), because it has some UTF-8 fixes.
+ When doing DB pull/push you'll need to tell the database that the data is in UTF-8. In MySQL it's done with the 'mysql_enable_utf8' flag on the database handle.
+ If you're doing any file IO which may produce or read UTF-8, then you'll need to make sure that your open() and binmode() calls use the IO layer syntax.

The biggest help for me was reading the perluniintro and perlunicode perldoc pages.
--
Michael Peters
Developer
Plus Three, LP
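
The steps listed in the reply can be sketched roughly as follows. This is a minimal illustration, not the poster's actual code: `$raw_bytes` is a hypothetical stand-in for a form value as it arrives from the browser, the temp file stands in for any application file, and the header and database lines appear only as strings or comments so the sketch runs without a web server or database. The `pg_enable_utf8` attribute for DBD::Pg is not mentioned in the original message and is noted here only because the question concerns PostgreSQL.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);
use File::Temp qw(tempfile);

# Step 1: tell the browser the page is UTF-8 (shown as a string only;
# a real CGI script would print this before the page body).
my $header = "Content-Type: text/html; charset=utf-8\r\n\r\n";

# Hypothetical raw form value: the UTF-8 bytes for a left and right
# "smart" double quote (U+201C, U+201D) around the words smart quotes.
my $raw_bytes = "\xE2\x80\x9Csmart quotes\xE2\x80\x9D";

# Step 2: decode the incoming bytes into a Perl character string.
my $text = decode_utf8($raw_bytes);

# Step 3 (comment only, needs a live database):
#   DBI->connect($dsn, $user, $pass, { mysql_enable_utf8 => 1 });  # DBD::mysql
#   DBI->connect($dsn, $user, $pass, { pg_enable_utf8    => 1 });  # DBD::Pg (assumption)

# Step 4: use an encoding layer for file IO so characters are
# written as UTF-8 bytes and read back as characters.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

open(my $out, '>:encoding(UTF-8)', $path) or die "Cannot write $path: $!";
print {$out} $text;
close $out;

open(my $in, '<:encoding(UTF-8)', $path) or die "Cannot read $path: $!";
my $round_trip = do { local $/; <$in> };
close $in;

printf "bytes in: %d, characters after decode: %d\n",
    length($raw_bytes), length($text);
```

After decoding, `length()` counts characters rather than bytes, which is a quick way to check that the form data really was decoded before it reaches validation or the database.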