Raise an error in read_csv when names and prefix both are not None #39123

Closed
@malinkallen

Problem description

As a result of the discussion in issue #27394, read_csv was changed so that it raises an error when both header and prefix are different from None. A user had misunderstood how to (not) use header and prefix together. I think that the usage of names and prefix can be misunderstood in a similar way.

It could also be that a user accidentally provides both arguments and expects prefix to take effect. Right now, it seems that prefix is silently ignored when names is provided.

Describe the solution you'd like

Raise a ValueError when the read_csv arguments prefix and names both differ from None, in accordance with issue #27394 and pull request #31383.
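
A minimal sketch of the proposed check, written as a standalone function rather than as a patch to pandas. The function name check_prefix_arguments and the wording of the second error message are hypothetical. Only the first message is taken from the traceback in this issue.

```python
def check_prefix_arguments(header=None, names=None, prefix=None):
    """Hypothetical helper: reject combinations where prefix would not be used."""
    if prefix is None:
        return
    # Existing behaviour (issue #27394): header and prefix are mutually exclusive.
    if header is not None:
        raise ValueError("Argument prefix must be None if argument header is not None")
    # Proposed behaviour: treat names the same way.
    # The message wording below is an assumption, not pandas' actual text.
    if names is not None:
        raise ValueError("Argument prefix must be None if argument names is not None")


check_prefix_arguments(prefix="XZ")            # accepted: prefix alone
check_prefix_arguments(names=["a", "b", "c"])  # accepted: names alone

try:
    check_prefix_arguments(prefix="XZ", names=["a", "b", "c"])
except ValueError as err:
    print(err)
```

Checking prefix first keeps the common case (no prefix given) free of any extra work, and mirrors the existing header check so both conflicts are reported the same way.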

API breaking implications

This will "break" code that passes non-None values for both prefix and names, but since it was an accepted solution for issue #27394, I think it could be used here as well.

Describe alternatives you've considered

Another possibility is to issue a warning instead of an error, but that would be inconsistent with the behavior when prefix and header are both not None.
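
For comparison, the warning alternative could look roughly like this. It is only a sketch using the standard warnings module. The function name warn_if_prefix_ignored, the message text, and the choice of UserWarning are all assumptions made for illustration.

```python
import warnings


def warn_if_prefix_ignored(names=None, prefix=None):
    """Hypothetical helper: warn, rather than raise, when prefix would be ignored."""
    if prefix is not None and names is not None:
        warnings.warn(
            "prefix is ignored because names was provided",  # assumed wording
            UserWarning,
            stacklevel=2,  # point the warning at the caller, not this helper
        )


# Capture warnings so the effect can be inspected programmatically.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_if_prefix_ignored(names=["a", "b", "c"], prefix="XZ")

for w in caught:
    print(w.category.__name__, w.message)
```

With this variant the call still succeeds and prefix is still ignored, so the result differs from the header case, where the same kind of conflict raises.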

Additional context

Examples below are run in pandas version 1.3.0.dev0+210.g9f1a41dee.

When I run

pandas.read_csv("my_data.csv", prefix="XZ", header=0)

I get the following output:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/malin/workspace/py/pandas/pandas/io/parsers.py", line 605, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/home/malin/workspace/py/pandas/pandas/io/parsers.py", line 457, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/home/malin/workspace/py/pandas/pandas/io/parsers.py", line 814, in __init__
    self._engine = self._make_engine(self.engine)
  File "/home/malin/workspace/py/pandas/pandas/io/parsers.py", line 1045, in _make_engine
    return mapping[engine](self.f, **self.options)  # type: ignore[call-arg]
  File "/home/malin/workspace/py/pandas/pandas/io/parsers.py", line 1853, in __init__
    ParserBase.__init__(self, kwds)
  File "/home/malin/workspace/py/pandas/pandas/io/parsers.py", line 1334, in __init__
    raise ValueError(
ValueError: Argument prefix must be None if argument header is not None

but running

pandas.read_csv("my_data.csv", prefix="XZ", names=["a", "b", "c"])

works fine, except that prefix is silently ignored.
