json_normalize() can't deal with non-ASCII characters in unicode keys #13213

Example code:

``` python
import pandas as pd
import json

testjson = u'''
[{"Ünicøde":0,"sub":{"A":1, "B":2}},
 {"Ünicøde":1,"sub":{"A":3, "B":4}}]
 '''.encode('utf8')
pd.io.json.json_normalize(json.loads(testjson))
```

Output:

```
Traceback (most recent call last):
  File "...lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2885, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-12-f866f9c7ec7c>", line 5, in <module>
    pd.io.json.json_normalize(json.loads(testjson))
  File ".../lib/python2.7/site-packages/pandas/io/json.py", line 715, in json_normalize
    data = nested_to_record(data)
  File ".../lib/python2.7/site-packages/pandas/io/json.py", line 617, in nested_to_record
    newkey = str(k)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xdc' in position 0: ordinal not in range(128)
```

Expected output:

```
   sub.A  sub.B  Ünicøde
0      1      2        0
1      3      4        1
```

The cause is probably these two lines:
https://github.com/pydata/pandas/blob/master/pandas/io/json.py#L618
and https://github.com/pydata/pandas/blob/master/pandas/io/json.py#L620

Those lines seem to have been introduced to handle numeric keys, but under Python 2 `str(k)` fails when `k` is a `unicode` object containing non-ASCII characters, because it tries to encode the key with the ASCII codec.
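One possible direction, as a rough sketch only: convert a key with `str()` only when it is not already a string, so text keys pass through untouched. The names `safe_key` and `flatten` below are hypothetical and written just for illustration; `flatten` is a simplified stand-in for `nested_to_record`, not the actual pandas code or a proposed patch.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

try:
    string_types = (basestring,)  # Python 2: covers both str and unicode
except NameError:
    string_types = (str,)  # Python 3: str is already Unicode text


def safe_key(k):
    """Hypothetical helper: coerce only non-string keys (e.g. ints) to text."""
    if isinstance(k, string_types):
        return k  # leave text keys as they are, no implicit ASCII encoding
    return str(k)


def flatten(record, prefix="", sep="."):
    """Simplified stand-in for nested_to_record (assumption, not pandas code)."""
    flat = {}
    for k, v in record.items():
        key = prefix + sep + safe_key(k) if prefix else safe_key(k)
        if isinstance(v, dict):
            # Recurse into nested dicts, carrying the joined key as the prefix
            flat.update(flatten(v, key, sep))
        else:
            flat[key] = v
    return flat


print(flatten({"Ünicøde": 0, "sub": {"A": 1, "B": 2}}))
```

This keeps the numeric-key handling those lines were apparently meant for, while avoiding the `str()` call on keys that are already text.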

It seems to be the same bug in principle as https://github.com/pydata/pandas/issues/13101

