PERF: allow even more flexible ISO 8601 datetime parsing

Hello,

I noticed a huge speed difference in `to_datetime` depending on whether `format` is given or not.

I wonder whether there is some room for improvement here.

```
In [1]: %time df=pd.read_csv("AUDUSD-2014-01.csv", names=['Symbol', 'Date', 'Bid', 'Ask'])
CPU times: user 3.31 s, sys: 481 ms, total: 3.79 s
Wall time: 4.13 s

In [2]: df
Out[274]:
          Symbol                   Date      Bid      Ask
0        AUD/USD  20140101 21:55:34.404  0.88796  0.88922
1        AUD/USD  20140101 21:55:34.444  0.88805  0.88914
2        AUD/USD  20140101 21:55:34.475  0.88809  0.88910
3        AUD/USD  20140101 21:55:48.962  0.88811  0.88908
4        AUD/USD  20140101 21:56:38.293  0.88808  0.88887
...          ...                    ...      ...      ...
1947101  AUD/USD  20140131 21:59:48.048  0.87525  0.87589
1947102  AUD/USD  20140131 21:59:54.599  0.87527  0.87589
1947103  AUD/USD  20140131 21:59:56.927  0.87531  0.87588
1947104  AUD/USD  20140131 21:59:59.365  0.87531  0.87574
1947105  AUD/USD  20140131 22:00:00.038  0.87531  0.87574

[1947106 rows x 4 columns]

In [3]: %time pd.to_datetime(df['Date'])
CPU times: user 11min 44s, sys: 19.4 s, total: 12min 4s
Wall time: 12min 46s
Out[3]:
0         2014-01-01 21:55:34.404
1         2014-01-01 21:55:34.444
2         2014-01-01 21:55:34.475
3         2014-01-01 21:55:48.962
4         2014-01-01 21:56:38.293
                    ...
1947101   2014-01-31 21:59:48.048
1947102   2014-01-31 21:59:54.599
1947103   2014-01-31 21:59:56.927
1947104   2014-01-31 21:59:59.365
1947105   2014-01-31 22:00:00.038
Name: Date, dtype: datetime64[ns]

In [4]: fmt='%Y%m%d %H:%M:%S.%f'

In [5]: %time pd.to_datetime(df['Date'], format=fmt)
CPU times: user 37.3 s, sys: 1.31 s, total: 38.6 s
Wall time: 40 s

In [6]: timedelta(minutes=12, seconds=46) / timedelta(seconds=40)
Out[6]: 19.15
```

That is a factor of about 19 (12 min 46 s without `format` versus 40 s with it).
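For anyone who wants to reproduce the comparison without downloading the full file, here is a minimal sketch. It assumes a small synthetic series in the same layout as the `Date` column; the `timed` helper and the row values are made up for illustration, and the timings will differ by machine and pandas version.

```python
import time

import pandas as pd

# Hypothetical sample rows in the same layout as the 'Date' column above.
dates = pd.Series(
    [
        "20140101 21:55:34.404",
        "20140101 21:55:34.444",
        "20140131 22:00:00.038",
    ]
    * 1000
)

fmt = "%Y%m%d %H:%M:%S.%f"


def timed(func):
    """Call a zero-argument callable and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func()
    return result, time.perf_counter() - start


# Parse once with the explicit format and once letting pandas work it out.
explicit, t_explicit = timed(lambda: pd.to_datetime(dates, format=fmt))
inferred, t_inferred = timed(lambda: pd.to_datetime(dates))

# Both paths should produce the same timestamps; only the cost may differ.
same = bool((explicit == inferred).all())
print(f"same result: {same}")
print(f"explicit format: {t_explicit:.4f} s, no format: {t_inferred:.4f} s")
```

Depending on the pandas version, the call without `format` may emit a warning about falling back to element-by-element parsing.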

Sample data can be found here:
https://drive.google.com/file/d/0B8iUtWjZOTqla3ZZTC1FS0pkZXc/view?usp=sharing

See also https://github.com/pydata/pandas-datareader/issues/153


PERF: allow even more flexible ISO 8601 datetime parsing #11899
