Series constructor skips dtype=str conversion for list data #16605

Closed
@kbg

Description

Code Example

from pandas import Series, DataFrame

int_list = [1, 2, 3]

s1 = Series(int_list)
s2 = Series(int_list, dtype=float)
s3 = Series(int_list, dtype=str)
s4 = Series(int_list, dtype='U')
s5 = Series(Series(int_list), dtype=str)

print('Series element type:')
print('  s1:', type(s1[0]))
print('  s2:', type(s2[0]))
print('  s3:', type(s3[0]))
print('  s4:', type(s4[0]))
print('  s5:', type(s5[0]))

f1 = DataFrame(int_list)
f2 = DataFrame(int_list, dtype=float)
f3 = DataFrame(int_list, dtype=str)
f4 = DataFrame(int_list, dtype='U')
f5 = DataFrame(DataFrame(int_list), dtype=str)

print('\nDataFrame element type:')
print('  f1:', type(f1.iloc[0, 0]))
print('  f2:', type(f2.iloc[0, 0]))
print('  f3:', type(f3.iloc[0, 0]))
print('  f4:', type(f4.iloc[0, 0]))
print('  f5:', type(f5.iloc[0, 0]))

Output:

Series element type:
  s1: <class 'numpy.int64'>
  s2: <class 'numpy.float64'>
  s3: <class 'int'>
  s4: <class 'int'>
  s5: <class 'str'>

DataFrame element type:
  f1: <class 'numpy.int64'>
  f2: <class 'numpy.float64'>
  f3: <class 'str'>
  f4: <class 'str'>
  f5: <class 'str'>

Problem description

When creating a Series from a list using dtype=str, the data elements are not converted to strings. The Series instance apparently just keeps the original (Python) data type in this case.

This problem does not occur when, instead of a list, another Series is used as input data (s5 in the example above). It also does not happen when creating DataFrame instances from list data.
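On affected versions, one possible workaround is to convert explicitly rather than rely on dtype=str in the constructor: either build the list of strings up front, or call astype(str) on the constructed Series. A minimal sketch, reusing int_list from the example above; the helper name to_str_series is made up for illustration and is not part of pandas:

```python
from pandas import Series


def to_str_series(values):
    """Hypothetical helper: build a Series whose elements are Python str."""
    # Convert each element before it reaches the constructor, so the result
    # does not depend on how Series handles dtype=str for list input.
    return Series([str(v) for v in values])


int_list = [1, 2, 3]

s_explicit = to_str_series(int_list)
# Alternative: construct first, then convert with astype.
s_astype = Series(int_list).astype(str)

print('  s_explicit:', type(s_explicit[0]))
print('  s_astype:', type(s_astype[0]))
```

Both variants sidestep the constructor's dtype handling entirely, so they should behave the same whether or not the installed version has this bug.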

Expected Output

Series element type:
  s1: <class 'numpy.int64'>
  s2: <class 'numpy.float64'>
  s3: <class 'str'>
  s4: <class 'str'>
  s5: <class 'str'>

DataFrame element type:
  f1: <class 'numpy.int64'>
  f2: <class 'numpy.float64'>
  f3: <class 'str'>
  f4: <class 'str'>
  f5: <class 'str'>
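The Series half of this expectation could be turned into a small check that reports the element type for each construction path. This is only a sketch: series_element_types is a hypothetical helper, not a pandas function, and the dtype='U' case (s4) is left out to keep it short.

```python
from pandas import Series


def series_element_types(values):
    """Hypothetical helper: map each label to its Series' element type name."""
    candidates = {
        "s1": Series(values),
        "s2": Series(values, dtype=float),
        "s3": Series(values, dtype=str),
        "s5": Series(Series(values), dtype=str),
    }
    # iloc[0] is positional, so the lookup does not depend on index labels.
    return {name: type(s.iloc[0]).__name__ for name, s in candidates.items()}


types = series_element_types([1, 2, 3])
for name, type_name in types.items():
    print(f"  {name}: {type_name}")

# Expected once fixed: types["s3"] == types["s5"] == "str".
# On the version reported in this issue (pandas 0.20.1), s3 reports int instead.
```

Comparing s3 against s5 is the interesting part, since both request dtype=str and differ only in whether the input is a list or another Series.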

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.11.3-1-ARCH
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.1
pytest: 3.1.1
pip: 9.0.1
setuptools: 36.0.1
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 5.3.0
sphinx: 1.6.1
patsy: None
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: 1.1.10
pymysql: None
psycopg2: 2.7.1 (dt dec pq3 ext lo64)
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: 0.4.0

Labels: Bug; Compat (pandas objects compatibility with NumPy or Python functions); Dtype Conversions (unexpected or buggy dtype conversions)
