Negative NaN (-nan) in CSV input treated as object #5952

Closed
@Bklyn

The value "-nan" (yes, this is a valid type of NaN!) in a CSV input file causes that column to be treated as 'object' instead of float64.

pd.read_csv(StringIO.StringIO('a,b\n1,2.0\n2,nan\n3,-nan')).b
Out[15]: 
0     2.0
1     NaN
2    -nan
Name: b, dtype: object

pd.read_csv(StringIO.StringIO('a,b\n1,2.0\n2,nan\n')).b
Out[16]: 
0     2
1   NaN
Name: b, dtype: float64

When the file is sufficiently large, the following warning is generated:

In [3]: pd.read_csv('big.bad.csv')
/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py:1033: DtypeWarning: Columns (58,64) have mixed types. Specify dtype option on import or set low_memory=False.
  data = self._reader.read(nrows)

If the string "-nan" is replaced with "nan", all is well. I don't really need to distinguish negative NaN from NaN, but I would like to be able to read my data files without having to pre-process them to scrub all the "-nan" values.
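
A possible workaround, as a minimal sketch: tell `read_csv` explicitly that "-nan" is a missing-value marker through its `na_values` parameter, so the column parses as float64 without scrubbing the file first. This assumes the installed pandas does not already recognize "-nan" on its own. The sample data is the same as in the report; `csv_text` is only an illustrative name, and Python 3's `io.StringIO` stands in for the Python 2 `StringIO` module used above.

```python
from io import StringIO

import pandas as pd

# Same sample data as in the report: column b contains both "nan" and "-nan".
csv_text = "a,b\n1,2.0\n2,nan\n3,-nan\n"

# na_values adds extra strings to the set that read_csv treats as missing.
# The sign of the NaN is discarded: "-nan" simply becomes NaN.
df = pd.read_csv(StringIO(csv_text), na_values=["-nan"])

print(df["b"].dtype)
print(df["b"].isna().sum())
```

Note that this loses the distinction between NaN and negative NaN, which is acceptable for the use case described here.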
