Re: how to move to UTF-8 ? (was: An encoding problem): msg#00190 (debian-www)
On Thu, Jul 30, 2009 at 01:35:25PM +0200, Jens Seidel wrote:
> On Thu, Jul 30, 2009 at 01:05:40PM +0200, Simon Paillard wrote:
> > On Wed, Jul 29, 2009 at 06:27:02PM +0200, Frans Pop wrote:
> >
> > Moving the website to UTF-8 would allow to get rid of such issues.
> > Could you please describe the steps you have performed and how ?
> >
> > For what we have identified:
> > - recode wml files (using recode from recode package)
> >   find . -type d -exec recode latin1..utf8 {} \;
>
> "-type d" ? This works for directories?

Obviously "-type f".

> I would restrict this to *.wml files.

Indeed, that's better: converting the po files from latin1 to UTF-8 twice
is not a good idea.

> Some files such as *.inc files need to be handled as well, same for
> text files (some describe mailing lists purposes, ...).

.src files as well (vote results, l10n stats):
  ./MailingLists/desc/
  ./devel/debian-jr/

> Let's avoid converting *.png, *.pdf files, OK?

They are ignored by recode (and most of them are in the english directory).

> Or use iconv ...

It has the disadvantage of actually emptying the file if the output file is
the same as the input file (I know, I could use sponge or some temporary
file...).

> > - update the .wmlrc file
> >   -D CUR_LOCALE=fr_FR.UTF-8
> >   -D CHARSET=utf-8
> >
> > - convert charset of po files
> >   cd po ; for file in *po ; do msgconv -t UTF-8 -o $file $file ; done
>
> That should be optional ... (but the strings need to be convertible into
> UTF-8).

msgconv *does* convert the strings to UTF-8, it's not only about the header.

> > - some references to ISO-8859-15 (or old coding) in webpages about
> >   website.
> >   * devel/website/examples.wml et
>
> s/pour le/for/ ???

(yes :-)

> >     international/french/web.wml
> >   * pour la traduction, international/french/traduire.wml
> > - *.UTF-8 locale on www-master -> OK, checked
> > - redirections pages with specified charset
> >   (devel/debian-installer/gtk-frontend.wml and distrib/cd.wml)
> >
> > Do you see something else ?
>
> This should be all except:
> Warn all users that the working copy should be clean before an update, as
> otherwise there will be many conflicts.

Frans did change some HTML entities to proper Unicode, but I don't know which
method was used.

--
Simon Paillard

--
To UNSUBSCRIBE, email to debian-www-REQUEST@xxxxxxxxxxxxxxxx
with a subject of "unsubscribe". Trouble? Contact listmaster@xxxxxxxxxxxxxxxx
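
For reference, the file-conversion step discussed in this message could be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure that was actually run on the website: it uses iconv with a temporary file (one of the workarounds mentioned for the truncation problem), it restricts find to regular *.wml, *.inc and *.src files, and the function names to_utf8 and convert_tree are made up for this example. It also assumes every matched file really is ISO-8859-1 and has no newline in its name.

```shell
#!/bin/sh
# Sketch only: convert Latin-1 text files to UTF-8 in place.
# to_utf8 and convert_tree are hypothetical helper names.

# to_utf8 FILE: convert FILE from ISO-8859-1 to UTF-8.
# iconv writes to a temporary file first, because redirecting its output
# straight onto the input file would truncate the input before it is read.
# Assumption: FILE is ISO-8859-1; running this twice double-encodes it.
to_utf8() {
    tmp=$(mktemp) || return 1
    if iconv -f ISO-8859-1 -t UTF-8 "$1" > "$tmp"; then
        # Copy the contents back so the original file keeps its permissions.
        cat "$tmp" > "$1"
        rm -f "$tmp"
    else
        # Conversion failed: leave the original file untouched.
        rm -f "$tmp"
        return 1
    fi
}

# convert_tree DIR: convert regular *.wml, *.inc and *.src files under DIR.
# "-type f" skips directories; the name filters leave po files, images
# and PDFs alone.
convert_tree() {
    find "$1" -type f \( -name '*.wml' -o -name '*.inc' -o -name '*.src' \) -print |
    while IFS= read -r f; do
        to_utf8 "$f" || echo "failed: $f" >&2
    done
}
```

Calling convert_tree on a clean checkout of one language directory would be the intended use. Because the conversion is not idempotent, it should be run exactly once per file, which is the same reason the po files are left to msgconv alone.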