#### Description

- [x] I have checked that this issue has not already been reported. There are some historic "to_csv is slow" reports, but none mention the specific behaviour I've seen with regard to indices.
- [x] I have confirmed this bug exists on the latest version of pandas.
- [ ] (optional) I have confirmed this bug exists on the master branch of pandas.
#### Code Sample, a copy-pastable example

```python
import cProfile
import pstats
import time
from io import BytesIO, StringIO

import numpy as np
import pandas as pd

NUM_INDEX_COLS = 3
NUM_COLS = 50
NUM_ROWS = 1000000

# Generate a large-ish dummy dataframe
data = {}
idx_cols = []
for idx_id in range(NUM_INDEX_COLS):
    idx_name = f"Index_{idx_id}"
    idx_cols.append(idx_name)
    if idx_id == 0:
        data[idx_name] = np.random.randint(0, NUM_ROWS, NUM_ROWS)
    else:
        data[idx_name] = np.full(NUM_ROWS, 1, dtype=np.int64)
for col_id in range(NUM_COLS):
    col_name = f"Column_{col_id}"
    data[col_name] = np.random.uniform(0, 100000.0, NUM_ROWS)
source_df = pd.DataFrame(data)
source_df_indexed = source_df.set_index(idx_cols)

# Default index - fast
df1 = source_df.head(10000)
# MultiIndex from idx_cols, then take head - slow
df2 = source_df_indexed.head(10000)
# Take head, then set MultiIndex from idx_cols - fast
df3 = source_df.head(10000).set_index(idx_cols)
# Deep copy of df2 - slow (docs say that the Index is not copied, so not surprising)
df4 = df2.copy(deep=True)

# Will run each df twice, and take timings from the second run.
dataframes = [df1, df1, df2, df2, df3, df3, df4, df4]

#### WRITE TO CSV WITH EXISTING INDEX
#
# df1: 0.53s, with no index set
# df2: 2.35s, with index set before calling .head()
# df3: 0.54s, with index set after calling .head()
# df4: 2.12s, deep copy of df2
#
for i, df in enumerate(dataframes):
    out = StringIO()
    t1 = time.time()
    df.to_csv(out)
    t2 = time.time()
    if (i % 2) == 1:
        print(t2 - t1)

#### WRITE TO CSV WITH INCLUDED CALL TO RESET INDEX
#
# df1: 0.55s, with no index set
# df2: 0.53s, with index set before calling .head()
# df3: 0.54s, with index set after calling .head()
# df4: 0.54s, deep copy of df2
for i, df in enumerate(dataframes):
    out = StringIO()
    t1 = time.time()
    df.reset_index().to_csv(out)
    t2 = time.time()
    if (i % 2) == 1:
        print(t2 - t1)
```
#### Problem description

The speed of `to_csv` varies significantly depending on whether an Index was set, and when/how that Index was set. This is not the case with e.g. `to_parquet`. The original case where I noticed this speed difference is not so trivial, but it involves a MultiIndexed dataframe that has been manipulated before being output to CSV, taking approximately 50 minutes to output a ~30GB file. (Yes, CSV is slow and horrible for many other reasons even at the best of times, and we avoid it as much as possible.)
In the example above, I create a million-row dataframe `source_df`, with three integer columns (two have the value `1` throughout), and 50 float columns. I then create four subsets of the first 10000 rows, doing different things with the index:

- df1: Just the source dataframe, no index modification (so it'll have the default RangeIndex)
- df2: Set the three integer columns as a MultiIndex, and then take `.head(10000)`
- df3: Take `.head(10000)`, and then set the three integer columns as a MultiIndex
- df4: Deep copy of df2. (The pandas documentation suggests that at least the index won't actually be copied.)
I then time how long it takes to write each out to CSV (using a `StringIO`, to avoid I/O speed concerns). df1 (no index) and df3 (index set after taking `.head`) are fast; df2 (index set before taking `.head`) and df4 (deep copy of df2) are much slower. It appears that an index that was sliced from a larger dataframe's index significantly slows down the writing.

I then do a separate test, whereby I reset the index of each dataframe immediately before writing. In this case, all four write out as fast as df1/df3 in the original test.
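If the slowdown really does come from `.head()` leaving the full million-row level arrays attached to the sliced MultiIndex (I haven't confirmed this is the mechanism), then trimming the levels with `MultiIndex.remove_unused_levels` may be a less drastic workaround than a full `reset_index`. A small sketch of what I mean:

```python
import numpy as np
import pandas as pd

# Small-scale illustration: after .head(), the MultiIndex still carries
# the parent's full level arrays, even though only a few codes are used.
df = pd.DataFrame({
    "Index_0": np.arange(1000),
    "Index_1": np.full(1000, 1),
    "Column_0": np.random.uniform(0, 1, 1000),
}).set_index(["Index_0", "Index_1"])

sub = df.head(10)
print(len(sub.index.levels[0]))  # 1000 - the full parent level survives

# remove_unused_levels() rebuilds the levels from only the codes in use
trimmed = sub.copy()
trimmed.index = trimmed.index.remove_unused_levels()
print(len(trimmed.index.levels[0]))  # 10
```

I haven't benchmarked whether writing `trimmed` to CSV recovers the df3 timings, but it would be a cheap experiment.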
I have also tested with `to_parquet`, by changing `to_csv` to `to_parquet` and providing a `BytesIO`. All four dataframes are output in 0.08s, regardless of index origin and whether the index is reset or not.
Surprisingly (for me), profiling indicates that `{method 'astype' of 'numpy.ndarray' objects}` is where the extra time originates. There aren't many extra calls - the calls just take much, much longer. If you're lucky, this might even be an upstream issue!

Summarised profiling output for df1, df2, and df3 in the don't-reset-index case is below.
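For reference, the summaries below were gathered with something along these lines (the helper name is just for illustration; it uses only the stdlib `cProfile`/`pstats` imported in the code sample above):

```python
import cProfile
import pstats
from io import StringIO


def profile_to_csv(df, top=20):
    """Profile a single df.to_csv call and print the top functions by tottime."""
    out = StringIO()
    profiler = cProfile.Profile()
    profiler.enable()
    df.to_csv(out)
    profiler.disable()
    stats = pstats.Stats(profiler).sort_stats("tottime")
    stats.print_stats(top)
    return stats
```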
**df1 - Default index, fast**

```
1928 function calls (1910 primitive calls) in 0.530 seconds

   Ordered by: internal time
   List reduced from 126 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       12    0.285    0.024    0.285    0.024 {method 'astype' of 'numpy.ndarray' objects}
        6    0.242    0.040    0.242    0.040 {pandas._libs.writers.write_csv_rows}
       19    0.001    0.000    0.001    0.000 missing.py:193(_isna_ndarraylike)
        6    0.000    0.000    0.530    0.088 csvs.py:330(_save_chunk)
    20/13    0.000    0.000    0.000    0.000 {built-in method numpy.array}
  384/382    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
      278    0.000    0.000    0.000    0.000 {built-in method builtins.getattr}
        6    0.000    0.000    0.282    0.047 blocks.py:1938(to_native_types)
        1    0.000    0.000    0.530    0.530 csvs.py:313(_save)
      200    0.000    0.000    0.000    0.000 generic.py:10(_check)
       12    0.000    0.000    0.000    0.000 blocks.py:292(getitem_block)
       19    0.000    0.000    0.001    0.000 missing.py:130(_isna)
        6    0.000    0.000    0.000    0.000 generic.py:3551(_slice)
       12    0.000    0.000    0.000    0.000 range.py:697(__getitem__)
        6    0.000    0.000    0.000    0.000 managers.py:757(get_slice)
       19    0.000    0.000    0.000    0.000 dtypes.py:903(is_dtype)
       12    0.000    0.000    0.000    0.000 range.py:153(_data)
        7    0.000    0.000    0.004    0.001 base.py:984(_format_native_types)
        6    0.000    0.000    0.000    0.000 {built-in method numpy.arange}
        6    0.000    0.000    0.001    0.000 blocks.py:656(to_native_types)
```
**df2 - MultiIndex set before head, slow**

```
10629 function calls (10518 primitive calls) in 2.135 seconds

   Ordered by: internal time
   List reduced from 182 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       20    1.630    0.082    1.630    0.082 {method 'astype' of 'numpy.ndarray' objects}
        5    0.247    0.049    0.247    0.049 {pandas._libs.writers.write_csv_rows}
   122/91    0.163    0.001    0.163    0.002 {built-in method numpy.array}
        6    0.042    0.007    1.588    0.265 base.py:954(to_native_types)
       15    0.027    0.002    0.027    0.002 {pandas._libs.lib.infer_dtype}
        1    0.016    0.016    2.135    2.135 csvs.py:313(_save)
       15    0.001    0.000    0.001    0.000 {pandas._libs.algos.take_1d_object_object}
       21    0.001    0.000    0.002    0.000 missing.py:193(_isna_ndarraylike)
2323/2321    0.001    0.000    0.001    0.000 {built-in method builtins.isinstance}
        5    0.000    0.000    0.001    0.000 {pandas._libs.lib.fast_zip}
    30/15    0.000    0.000    0.192    0.013 base.py:293(__new__)
     1386    0.000    0.000    0.000    0.000 {built-in method builtins.getattr}
       16    0.000    0.000    1.350    0.084 base.py:984(_format_native_types)
        5    0.000    0.000    2.119    0.424 csvs.py:330(_save_chunk)
        5    0.000    0.000    1.546    0.309 multi.py:1201(_format_native_types)
      910    0.000    0.000    0.000    0.000 generic.py:10(_check)
       90    0.000    0.000    0.000    0.000 base.py:498(_shallow_copy)
      105    0.000    0.000    0.000    0.000 base.py:463(_simple_new)
      171    0.000    0.000    0.000    0.000 common.py:1460(is_extension_array_dtype)
      217    0.000    0.000    0.000    0.000 {built-in method builtins.hasattr}
```
**df3 - MultiIndex set after head, fast**

```
10629 function calls (10518 primitive calls) in 0.554 seconds

   Ordered by: internal time
   List reduced from 182 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       20    0.301    0.015    0.301    0.015 {method 'astype' of 'numpy.ndarray' objects}
        5    0.243    0.049    0.243    0.049 {pandas._libs.writers.write_csv_rows}
   122/91    0.002    0.000    0.002    0.000 {built-in method numpy.array}
       15    0.001    0.000    0.001    0.000 {pandas._libs.lib.infer_dtype}
       21    0.001    0.000    0.001    0.000 missing.py:193(_isna_ndarraylike)
2323/2321    0.000    0.000    0.001    0.000 {built-in method builtins.isinstance}
     1386    0.000    0.000    0.000    0.000 {built-in method builtins.getattr}
        6    0.000    0.000    0.028    0.005 base.py:954(to_native_types)
    30/15    0.000    0.000    0.004    0.000 base.py:293(__new__)
        5    0.000    0.000    0.000    0.000 {pandas._libs.lib.fast_zip}
        1    0.000    0.000    0.554    0.554 csvs.py:313(_save)
        5    0.000    0.000    0.554    0.111 csvs.py:330(_save_chunk)
      910    0.000    0.000    0.000    0.000 generic.py:10(_check)
        5    0.000    0.000    0.027    0.005 multi.py:1201(_format_native_types)
       15    0.000    0.000    0.000    0.000 {pandas._libs.algos.take_1d_object_object}
      171    0.000    0.000    0.000    0.000 common.py:1460(is_extension_array_dtype)
       90    0.000    0.000    0.000    0.000 base.py:498(_shallow_copy)
      105    0.000    0.000    0.000    0.000 base.py:463(_simple_new)
      171    0.000    0.000    0.000    0.000 base.py:413(find)
       15    0.000    0.000    0.001    0.000 algorithms.py:1616(take_nd)
```
The simplest (but silliest) suggestion would just be to call `.reset_index()` within `df.to_csv()` if `index=True` is set. However, I suspect this indicates something more fundamental that can be improved.
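Until the underlying cause is fixed, the measurements above suggest a caller-side workaround along these lines (the helper name `fast_to_csv` is hypothetical, purely for illustration):

```python
import pandas as pd


def fast_to_csv(df, path_or_buf=None, **kwargs):
    """Workaround sketch: materialise the index as ordinary columns before
    writing, so to_csv never formats the (possibly oversized) Index itself.

    The timings above show plain data columns are written at full speed,
    and that the CSV text is identical either way.
    """
    write_index = kwargs.pop("index", True)
    if write_index and not isinstance(df.index, pd.RangeIndex):
        # reset_index turns the index levels into columns; index=False then
        # skips the slow index-formatting path entirely
        return df.reset_index().to_csv(path_or_buf, index=False, **kwargs)
    return df.to_csv(path_or_buf, index=write_index, **kwargs)
```

Note this only matches `to_csv`'s default output for unnamed-free, fully-named index levels; a general fix inside pandas would of course be preferable.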
#### Expected Output

Speed of outputting to CSV does not vary significantly depending on how the index was created, as is the case when outputting to Parquet, etc.
#### Output of pd.show_versions()

```
INSTALLED VERSIONS
------------------
commit           : db08276
python           : 3.6.8.final.0
python-bits      : 64
OS               : Linux
OS-release       : 3.10.0-1127.19.1.el7.x86_64
Version          : #1 SMP Thu Aug 20 14:39:03 CDT 2020
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_GB.UTF-8
LOCALE           : en_GB.UTF-8

pandas           : 1.1.3
numpy            : 1.18.1
pytz             : 2020.1
dateutil         : 2.8.1
pip              : 20.2.4
setuptools       : 45.2.0
Cython           : 0.29.21
pytest           : 5.4.3
hypothesis       : 5.16.0
sphinx           : 2.4.0
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : 4.5.2
html5lib         : None
pymysql          : None
psycopg2         : 2.8.4 (dt dec pq3 ext)
jinja2           : 2.11.2
IPython          : None
pandas_datareader: None
bs4              : 4.9.3
bottleneck       : 1.3.2
fsspec           : 0.6.2
fastparquet      : None
gcsfs            : None
matplotlib       : 3.1.2
numexpr          : 2.7.1
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : 0.15.1
pytables         : None
pyxlsb           : None
s3fs             : None
scipy            : 1.2.1
sqlalchemy       : None
tables           : 3.5.2
tabulate         : 0.8.6
xarray           : None
xlrd             : 1.2.0
xlwt             : None
numba            : 0.51.0
```