The program can map out the shortest path between two Wikipedia pages.
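A minimal sketch of the breadth-first search such a tool typically performs, assuming a hypothetical `get_links(title)` helper that returns the titles linked from an article (e.g. via the MediaWiki links API):

```python
from collections import deque

def shortest_path(start, goal, get_links):
    """Breadth-first search from one page title to another.

    `get_links(title)` is a hypothetical helper returning the
    titles linked from a page. BFS guarantees the first path
    found uses the fewest link hops.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for link in get_links(path[-1]):
            if link not in seen:
                seen.add(link)
                queue.append(path + [link])
    return None  # no path found
```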
Python web crawler to test the theory that repeatedly clicking the first link on ~97% of Wikipedia pages eventually leads to the Wikipedia page for Knowledge 📡
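A rough sketch of that first-link walk, assuming requests and BeautifulSoup; real implementations also skip links inside parentheses, italics, and infoboxes, which this simplified selector does not handle:

```python
import requests
from bs4 import BeautifulSoup

def first_link_walk(title, target="Knowledge", max_hops=100):
    """Repeatedly follow the first body link of a Wikipedia article.

    Titles are in underscore form (e.g. "Web_crawler"). The walk
    stops at the target, on a loop, or after max_hops pages.
    """
    visited = []
    while title not in visited and len(visited) < max_hops:
        visited.append(title)
        if title == target:
            return visited
        html = requests.get(f"https://en.wikipedia.org/wiki/{title}").text
        soup = BeautifulSoup(html, "html.parser")
        for a in soup.select("div.mw-parser-output > p a[href^='/wiki/']"):
            href = a["href"]
            if ":" not in href:  # skip File:, Help:, etc. namespaces
                title = href.split("/wiki/", 1)[1]
                break
    return visited
```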
Wikipedia Web Crawler written in Python and Scrapy. The ETL process extracts specific data from multiple Wikipedia pages with Scrapy, organizes it into a structured format using Scrapy items, and saves the extracted data as JSON for further analysis and integration into MySQL Workbench.
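A minimal sketch of that Scrapy pattern; the item fields and spider name below are illustrative, not the project's actual ones:

```python
import scrapy

class PageItem(scrapy.Item):
    # Illustrative fields; the real project defines its own schema.
    title = scrapy.Field()
    summary = scrapy.Field()

class WikiSpider(scrapy.Spider):
    name = "wiki"
    start_urls = ["https://en.wikipedia.org/wiki/Web_scraping"]

    def parse(self, response):
        # Extract fields into a structured item instead of raw dicts.
        yield PageItem(
            title=response.css("h1#firstHeading ::text").get(),
            summary=response.css("div.mw-parser-output > p ::text").get(),
        )
```

Running `scrapy runspider spider.py -O pages.json` writes the items as a JSON feed; loading that file into MySQL Workbench is a separate import step.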
A Wikipedia crawler that finds the worst-translated page reachable from an English starting page by following hypertext links.
Web scraping is a data scraping technique used for extracting data from websites.
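In its simplest form, e.g. with requests and BeautifulSoup:

```python
import requests
from bs4 import BeautifulSoup

# Fetch a page and pull out every heading -- the core scraping loop.
html = requests.get("https://en.wikipedia.org/wiki/Web_scraping").text
soup = BeautifulSoup(html, "html.parser")
for heading in soup.find_all(["h1", "h2"]):
    print(heading.get_text(strip=True))
```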
[READ-ONLY] A word extractor for Wikipedia articles.
A search engine that takes keyword queries as input and returns a ranked list of relevant results. It scrapes a few thousand pages starting from one of the seed Wiki pages and uses Elasticsearch as the full-text search engine.
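A minimal sketch of that index-then-query flow with the official elasticsearch Python client; the index name, fields, and local node address are assumptions for illustration:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumes a local node

# Index a scraped page (index name and fields are illustrative).
es.index(index="wiki", document={"title": "Web scraping",
                                 "body": "Web scraping is ..."})
es.indices.refresh(index="wiki")  # make the document searchable now

# Keyword query: match on the body field, ranked by relevance (BM25).
hits = es.search(index="wiki", query={"match": {"body": "scraping"}})
for hit in hits["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])
```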
Custom implementation of Chord DHT and analysis of its operations
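A toy sketch of Chord's core idea, consistent hashing onto an identifier ring where each key is owned by its successor node; a real implementation adds finger tables to make lookups O(log N):

```python
import hashlib

M = 16  # identifier ring of size 2**M (toy value; Chord uses 160)

def chord_id(key: str) -> int:
    """Hash a key onto the identifier ring, as Chord does with SHA-1."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % (2 ** M)

def successor(node_ids, key_id):
    """The node responsible for key_id: first node clockwise from it."""
    candidates = [n for n in sorted(node_ids) if n >= key_id]
    return candidates[0] if candidates else min(node_ids)  # wrap around

nodes = {chord_id(f"node-{i}") for i in range(8)}
print(successor(nodes, chord_id("some-key")))
```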
IR system component of the Innopolis IR 2016 course semester project.
A crawler for Wikipedia (currently English pages only).
Python wrapper for the MediaWiki API to access and parse data from Wikipedia
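Under the hood, such wrappers issue requests against the MediaWiki `action=query` API; the parameters below are real API parameters, while the page title is just an example:

```python
import requests

# Query the MediaWiki API directly for a plain-text page extract.
resp = requests.get(
    "https://en.wikipedia.org/w/api.php",
    params={
        "action": "query",
        "titles": "Web crawler",
        "prop": "extracts",
        "explaintext": 1,  # plain text instead of HTML
        "format": "json",
    },
).json()
page = next(iter(resp["query"]["pages"].values()))
print(page["extract"][:200])
```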
Document Search Engine Tool