jnz8086/dumpwikicat

This script dumps the entries of a category on a MediaWiki-based website into a text file, one entry per line.

Usage

python ./dwc.py url file [filter] [-r]
   where: filter is a regular expression; entries that match it are discarded (pass "" to disable filtering)
          -r makes the script recurse into subcategories as well

e.g.

python ./dwc.py "https://en.wiktionary.org/wiki/Category:English_nouns" ./en-nouns.txt "^(Appendix\\:|User\\:|[*/\\!&$%-@()_+.]|\d)"
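
The filter and the -r flag can be pictured with a minimal sketch. This is not the code in dwc.py; it only illustrates the behaviour described above under some assumptions: the names `filter_entries`, `collect`, `members_of` and the in-memory `WIKI` table are hypothetical, the pattern is assumed to be applied with `re.search`, and an empty pattern is assumed to be special-cased so that nothing is discarded. The pattern is written as it would reach Python after a POSIX shell has collapsed the doubled backslashes in the example above.

```python
import re


def filter_entries(entries, pattern=""):
    """Discard every entry the pattern matches; an empty pattern keeps everything."""
    if not pattern:
        return list(entries)
    rx = re.compile(pattern)
    return [entry for entry in entries if not rx.search(entry)]


def collect(category, members_of, recursive=False, seen=None):
    """Gather the page entries of a category, optionally descending into subcategories.

    members_of(category) is a hypothetical callable returning (pages, subcategories);
    in a real script it would fetch and parse the category page.
    """
    seen = set() if seen is None else seen
    if category in seen:  # categories can reference each other, so guard against cycles
        return []
    seen.add(category)
    pages, subcategories = members_of(category)
    result = list(pages)
    if recursive:
        for sub in subcategories:
            result.extend(collect(sub, members_of, True, seen))
    return result


# Hypothetical stand-in for a wiki: category -> (pages, subcategories).
WIKI = {
    "Category:English_nouns": (
        ["apple", "Appendix:Glossary", "3D"],
        ["Category:English_proper_nouns"],
    ),
    "Category:English_proper_nouns": (
        ["London", "User:Example"],
        ["Category:English_nouns"],  # deliberate cycle back to the parent
    ),
}

# The README's example filter after shell unescaping.
PATTERN = r"^(Appendix\:|User\:|[*/\!&$%-@()_+.]|\d)"

flat = collect("Category:English_nouns", WIKI.__getitem__)
deep = collect("Category:English_nouns", WIKI.__getitem__, recursive=True)

print(filter_entries(flat, PATTERN))
print(filter_entries(deep, PATTERN))
```

Note that `%-@` inside the character class is a range (from `%` to `@`), so it also covers digits and characters such as `:`, `;`, `<`, `=`, `>` and `?` at the start of an entry.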
