Convert a list with wrong encoding to utf8

vergos.nikolas at gmail.com wrote:
> I just tried:
> names = tuple( [s.encode('latin1').decode('utf8') for s in names] )
> but i get
> UnicodeEncodeError('latin-1', '???? ???????', 0, 4, 'ordinal not in range(256)')

The error says the string contains characters outside the Latin-1
range (code points above 255). This suggests that the string you're
getting from the database *has* already been correctly decoded, and
there is no need to go through the latin1 re-coding step.
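
To illustrate the difference, here is a minimal sketch. The sample
value is made up and is not taken from the poster's database; it only
shows why the latin1 round trip works on mojibake but fails on text
that is already correct.

```python
# Hypothetical sample value, used only for illustration.
good = "Νίκος"  # Greek text that has already been decoded correctly

# Simulate mojibake: UTF-8 bytes wrongly decoded as Latin-1.
mojibake = good.encode("utf8").decode("latin1")

# Mojibake contains only code points below 256, so the round trip works.
repaired = mojibake.encode("latin1").decode("utf8")

# Correct Greek text has code points above 255, so encoding to Latin-1 fails.
try:
    good.encode("latin1")
    error_name = None
except UnicodeEncodeError as exc:
    error_name = type(exc).__name__
    print(exc)
```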

What do you get if you print the names immediately *before* trying to
re-code them?
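
One possible way to do that inspection, as a sketch only: the list
below is a hypothetical stand-in for whatever the database returns, and
describe() is a helper invented for this example. Using ascii() avoids
depending on how the terminal renders non-ASCII characters.

```python
def describe(s):
    """Return the type name, length, and an ASCII-escaped form of s."""
    return type(s).__name__, len(s), ascii(s)

# Hypothetical stand-in for the list fetched from the database:
# one correctly decoded name and one simulated mojibake name.
names = ["Νίκος", "Νίκος".encode("utf8").decode("latin1")]

for s in names:
    # Correct Greek text shows \u03xx escapes; mojibake shows \xNN escapes
    # and Latin-1 characters, and is longer than the real name.
    print(*describe(s))
```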

What *may* be happening is that most of your data is stored in the
database encoded as utf-8, but some of it is actually using a different
encoding, and you're getting confused by the resulting inconsistencies.

I suggest you look carefully at *all* the names in the list, straight
after getting them from the database. If some of them look okay and
some of them look like mojibake, then you have bad data in the database
in the form of inconsistent encodings.
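
A rough way to sort the names into "okay" and "suspect", assuming the
stray encoding really is latin1 as in the attempt quoted above. The
function name and sample data are hypothetical, and this is a heuristic
sketch, not a guaranteed detector.

```python
def looks_like_mojibake(s):
    """Heuristic: True if s looks like UTF-8 bytes decoded as Latin-1."""
    try:
        fixed = s.encode("latin1").decode("utf8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either s has characters outside Latin-1 (already proper text),
        # or its Latin-1 bytes are not valid UTF-8 (genuine Latin-1 text).
        return False
    # Pure-ASCII strings round-trip unchanged and need no repair.
    return fixed != s

# Hypothetical sample data: correct Greek, simulated mojibake, plain ASCII.
names = ["Νίκος", "Νίκος".encode("utf8").decode("latin1"), "Nikos"]

suspect = [s for s in names if looks_like_mojibake(s)]
okay = [s for s in names if not looks_like_mojibake(s)]
print(len(okay), "look okay;", len(suspect), "look like mojibake")
```

If both lists come back non-empty for real data, that points to the
inconsistent-encoding situation described above.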