read_fwf 'infer' where first hundred lines differ from other lines

#### Code Sample, a copy-pastable example if possible

```
  1     1   -13.120080   0.229   0.484  -0.378  -0.872
  1     2    -1.902843  -0.090   0.256   1.791   0.967
  1     3   -22.050698  -0.176  -0.394   0.922  -0.454
  1     4   -30.349928   0.081  -0.194  -0.327  -0.981
  1     5   -22.204160  -0.168  -0.197   0.984  -0.266
  1     6   -28.001753  -0.065   0.597  -0.203  -0.802
  1     7   -17.247524   0.108   0.194   0.474   0.774
  1     8   -28.014811   0.017   0.994   0.493   0.112
  1     9   -13.325491   0.259   0.189  -1.275   0.149
  1    10   -10.063621   0.327   0.108  -1.784   0.061
...
115    18     5.697000   0.391  -0.027   0.252   1.000
115    19     8.324000  -0.283   0.132   0.227  -0.216
115    20    48.451000   0.070  -0.041   0.379  -0.082
115    21     0.146000   0.677   0.031  -0.561  -0.149
115    22     1.443000  -0.706  -0.033  -0.222   0.035
115    23     4.595000   0.654  -0.081   0.774   0.997
115    24     0.146000  -0.677   0.031   0.561  -0.149
115    25     4.595000   0.654  -0.081   0.774   0.997
115    26     6.769000  -0.363  -0.093  -0.298   0.996
115    27    24.157000  -0.280  -0.324  -0.142  -0.946
```
#### Problem description

I have a long [fixed-width file](https://paste.ubuntu.com/23806889/) (>100k lines) whose head and tail are shown above. I want to read this file with pandas, and `pd.read_fwf` seems like the way to do it. The issue comes up because the column inference reads the first hundred lines, which all start with `'  1'`, and concludes "let's start reading at `[2]`". The last hundred lines start with `115`, so it skips the initial `11` and reads the first field as `5`, and I lose data.

A couple of approaches to solving this issue come to mind, though I'm sure there are others:

- Don't infer until all lines are scanned
- Take as an argument the number of lines to be scanned before concluding the format, including the option to scan all (e.g. `infer_from_all`)
- Take as an argument which direction to scan -- top to bottom or bottom to top
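
To make the difference concrete, here is a minimal pure-Python sketch of the first two suggestions. It is not the pandas implementation: `infer_colspecs` and its `nrows` argument are hypothetical names, and the sample rows are synthetic stand-ins for the real file. It marks every character position that holds a non-space character in the scanned lines and turns the runs of marked positions into `(start, end)` spans.

```python
import re


def infer_colspecs(lines, nrows=None):
    """Return (start, end) spans of columns that contain non-space text.

    Hypothetical helper, not part of pandas. nrows=None scans every line;
    an integer limits the scan to the first `nrows` lines, which
    approximates the behaviour described in this issue.
    """
    sample = lines if nrows is None else lines[:nrows]
    width = max(len(line) for line in sample)
    mask = [False] * width
    for line in sample:
        for match in re.finditer(r"\S+", line):
            for i in range(match.start(), match.end()):
                mask[i] = True

    specs = []
    start = None
    # The trailing False closes a span that runs to the end of the line.
    for i, filled in enumerate(mask + [False]):
        if filled and start is None:
            start = i
        elif not filled and start is not None:
            specs.append((start, i))
            start = None
    return specs


# Synthetic data: 100 rows whose first field is "  1", then rows with "115".
head = ["%3d %5d %12.6f" % (1, i, -13.12008) for i in range(1, 101)]
tail = ["%3d %5d %12.6f" % (115, i, 5.697) for i in range(18, 28)]
lines = head + tail

first_100 = infer_colspecs(lines, nrows=100)  # first column inferred as (2, 3)
all_rows = infer_colspecs(lines)              # first column inferred as (0, 3)

start, end = first_100[0]
truncated = lines[-1][start:end]  # first field of the last row, cut short
start, end = all_rows[0]
complete = lines[-1][start:end]   # first field of the last row, intact
```

Spans computed from the whole file could then be passed explicitly, e.g. `pd.read_fwf(path, colspecs=all_rows, header=None)`, which bypasses inference entirely. Recent pandas releases also document an `infer_nrows` argument for `read_fwf`; whether it is available depends on the installed version.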


#### Output of ``pd.show_versions()``



<details>

commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-107-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None


</details>


read_fwf 'infer' where first hundred lines differ from other lines #15138
