Scrape HTML tables into Dataframes #3369

Closed
@ghost

Description

From the mailing list: https://groups.google.com/forum/?fromgroups=#!topic/pydata/q7VVD8YeSLk

The user provides an HTML string from whatever source they like, or a URL.
Optionally, they specify a table id, or a regex to match against cell
contents, to quickly single out one or more tables when several exist on the page.

Pseudocode:

DataFrame.from_html('http://foo.com/tickers?sym=GOOG', match="high")
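
To make the proposal concrete, here is a rough stdlib-only sketch of the table selection described above. It is not an implementation of the proposed `DataFrame.from_html`; the names `TableCollector`, `scrape_tables`, `table_id`, and the sample `PAGE` with its values are all made up for illustration. It assumes reasonably well-formed HTML without nested tables, and it leaves out fetching a URL.

```python
import re
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect every <table> as (attrs, rows). Nested tables are not handled."""

    def __init__(self):
        super().__init__()
        self.tables = []   # list of (attribute dict, list of rows)
        self._rows = None  # rows of the table currently being parsed
        self._row = None   # cells of the row currently being parsed
        self._cell = None  # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._rows = []
            self.tables.append((dict(attrs), self._rows))
        elif tag == "tr" and self._rows is not None:
            self._row = []
            self._rows.append(self._row)
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr":
            self._row = None
            self._cell = None
        elif tag == "table":
            self._rows = self._row = self._cell = None

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)


def scrape_tables(html, table_id=None, match=None):
    """Return the rows of each table that passes the optional filters."""
    collector = TableCollector()
    collector.feed(html)
    collector.close()
    pattern = re.compile(match) if match else None
    selected = []
    for attrs, rows in collector.tables:
        if table_id is not None and attrs.get("id") != table_id:
            continue
        if pattern and not any(pattern.search(c) for r in rows for c in r):
            continue
        selected.append(rows)
    return selected


# Made-up page with two tables; only the second has a cell matching "high".
PAGE = """
<table id="nav"><tr><td>Home</td><td>About</td></tr></table>
<table id="quotes">
  <tr><th>date</th><th>high</th><th>low</th></tr>
  <tr><td>2013-01-01</td><td>10.5</td><td>9.8</td></tr>
</table>
"""

quotes = scrape_tables(PAGE, match="high")
```

Each entry in `quotes` is a list of rows, so the first row could serve as column names when building a DataFrame from the rest. pandas' own `read_html` function accepts `match` and `attrs` arguments that cover the same selection.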

Aside: Perhaps not widely known, but Excel and similar spreadsheet tools can
import tables directly from web pages, a cheap "no code" way to get the data
into a form directly readable by pandas.

Labels: Ideas (Long-Term Enhancement Discussions)
