Description
I am trying to read a CSV file where some lines are longer than the rest. The pandas C engine throws an error on these lines, but I do not want to skip them, as discussed in a comment on a similar issue. I would rather "cut" the bad columns off using usecols, but I get the following error:
import pandas as pd

df = pd.read_csv(file_path, sep=',', skiprows=1, header=None,
                 usecols=range(0, 23), nrows=None,
                 engine='python')
Throws:
ValueError: Expected 10 fields in line 100, saw 20
Why is the ValueError raised even though I explicitly defined which columns to use via usecols?
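For reference, this is a minimal sketch of the kind of pre-processing workaround I would like to avoid having to write by hand (the file path and column count below are placeholders): the csv module truncates every row to the first 23 fields before the data reaches pandas, so ragged lines can no longer trip the tokenizer.

import csv
import pandas as pd

file_path = "data.csv"   # placeholder path
n_cols = 23              # placeholder column count

with open(file_path, newline="") as fh:
    reader = csv.reader(fh)
    next(reader)  # skip the header line (the skiprows=1 above)
    # Pad short rows with None and cut long rows down to n_cols fields.
    rows = [(row + [None] * n_cols)[:n_cols] for row in reader]

df = pd.DataFrame(rows, columns=range(n_cols))

Note that, unlike read_csv, this leaves every value as a string, so dtypes would still have to be converted afterwards.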
References:
- read_csv() & extra trailing comma(s) cause parsing issues. #2886
- Dealing with bad lines -- does not apply
- Dealing with bad lines II
- Handling Variable Number of Columns with Pandas - Python
- pandas read_csv and filter columns with usecols, especially: http://stackoverflow.com/a/27791362