scraping tables from web using pandas
Summary The pandas read_html() function is useful for quickly parsing HTML tables in pages - especially in Wikipedia pages. By the nature of HTML, the data is frequently not going to be as clean as you might need and cleaning up all the stray unicode characters can be time consuming. This article showed several techniques you can use to clean the data and convert it to the proper numeric format.