
Putting Unicode characters in JSON

On 03/22/2018 01:09 PM, Chris Angelico wrote:
> On Fri, Mar 23, 2018 at 6:46 AM, Tobiah <toby at tobiah.org> wrote:
>> I have some mailing information in a MySQL database that has
>> characters from various other countries.  The table says that
>> it's using latin-1 encoding.  I want to send this data out
>> as JSON.
>> So I'm just taking each datum and doing 'name'.decode('latin-1')
>> and adding the resulting Unicode value right into my JSON structure
>> before doing .dumps() on it.  This seems to work, and I can consume
>> the JSON with another program and when I print values, they look nice
>> with the special characters and all.
>> I was reading though, that JSON files must be encoded with UTF-8.  So
>> should I be doing string.decode('latin-1').encode('utf-8')?  Or does
>> the json module do that for me when I give it a unicode object?
> Reconfigure your MySQL database to use UTF-8. There is no reason to
> use Latin-1 in the database.
> If that isn't an option, make sure your JSON files are pure ASCII,
> which is the common subset of UTF-8 and Latin-1.
> ChrisA

It works the way I'm doing it.  I checked, and it turns out that
whether I do datum.decode('latin-1') or datum.decode('latin-1').encode('utf8'),
I get identical JSON files after .dumps().
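
My guess at why the two come out identical: by default, json.dumps()
escapes every non-ASCII character as \uXXXX (ensure_ascii=True), so the
output is pure ASCII either way.  In Python 2, dumps() also decodes any
byte str it is given as UTF-8 by default (its encoding parameter), so the
.encode('utf8') step just gets undone.  Here is a small sketch of the
escaping behaviour, written for Python 3; the sample bytes and the "name"
key are made up for illustration, not taken from my real data.

```python
import json

# Hypothetical sample: bytes as they might come out of a latin-1 column.
raw = b"Jos\xe9 Mu\xf1oz"

# Decode once, at the boundary, into a text (unicode) string.
name = raw.decode("latin-1")

# Default ensure_ascii=True: non-ASCII characters are written as \uXXXX
# escapes, so the resulting JSON text contains only ASCII characters.
ascii_json = json.dumps({"name": name})

# ensure_ascii=False keeps the characters as-is; the caller then has to
# encode the text as UTF-8 when writing it out.
utf8_json = json.dumps({"name": name}, ensure_ascii=False)
utf8_bytes = utf8_json.encode("utf-8")

print(ascii_json)
print(utf8_bytes)
```

Since the default output is all ASCII, it is the same bytes whether the
file is later treated as UTF-8 or as latin-1, and a consumer that parses
the JSON gets the original characters back from the escapes.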