Steve edited this page Nov 28, 2017 · 32 revisions

Import wptools to get started:

>>> import wptools

Page usage

Use wptools.page() to get a page object:

>>> flannery = wptools.page("Flannery O'Connor")

Omitting the title fetches a random page from English Wikipedia:

>>> page = wptools.page()
en.wikipedia.org (random) 🍜
Sylvia_Rivera (en) data
{
  pageid: 3296309
  title: Sylvia_Rivera
}

The default language is 'en' (English), regardless of the script the title is written in:

>>> toshiko = wptools.page('穐吉敏子')
穐吉敏子 (en) data
{
  lang: en
  title: 穐吉敏子
}

If you specify only a language, you get a random Wikipedia page in that language:

>>> page = wptools.page(lang='zh')
zh.wikipedia.org (random) 🍰
哈莉特·塔布曼 (zh) data
{
  pageid: 211070
  title: 哈莉特·塔布曼
}

If you specify only a wiki site, you get a random page from that site:

>>> page = wptools.page(wiki='en.wikiquote.org')
en.wikiquote.org (random) 🍪
Malala_Yousafzai (en)
{
  pageid: 146817
  title: Malala_Yousafzai
}

You can also start with a Wikidata item:

>>> malcolmx = wptools.page(wikibase='Q43303')
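
When identifiers come from mixed input (some page titles, some Wikidata item IDs), it can help to decide up front which form to pass. The sketch below is one possible approach, not part of wptools: page_args is a hypothetical helper, and it assumes that any string shaped like "Q" followed by digits is meant as a Wikidata item. A page whose title really is something like "Q1" would be misrouted.

```python
import re

# Assumption: Wikidata item IDs look like "Q" followed by digits, e.g. "Q43303".
_ITEM_ID = re.compile(r"Q[1-9]\d*")


def page_args(identifier):
    """Return (args, kwargs) suitable for wptools.page(*args, **kwargs).

    Hypothetical helper: item-like identifiers are passed as the
    wikibase keyword, everything else as a positional title.
    """
    if _ITEM_ID.fullmatch(identifier):
        return (), {"wikibase": identifier}
    return (identifier,), {}


# Usage (requires wptools and network access, so it is left commented out):
#   args, kwargs = page_args("Q43303")
#   malcolmx = wptools.page(*args, **kwargs)
```

The helper returns positional and keyword arguments separately so that a title is passed the same way as in the examples above.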

Object data is echoed automatically. You can turn that off with silent=True:

>>> page = wptools.page(silent=True)

With verbose=True, HTTP request/response details are echoed to stderr:

>>> page = wptools.page(verbose=True)

All request actions support setting proxy and timeout (in seconds):

>>> page.get(proxy='http://example.com:80', timeout=5)

You can skip specific request actions by passing a list of their names as skip:

>>> page = wptools.page(skip=['claims', 'imageinfo'])

See help(wptools.page) for more details.

Category usage

Use wptools.category() to get a category object:

>>> cat = wptools.category('Category:Humanities')

Omitting the title fetches a random category (namespace 14) from English Wikipedia:

>>> cat = wptools.category()
en.wikipedia.org (random:14) 🍕
Category:Jazz Messengers (en) data
{
  pageid: 44375025
  title: Category:Jazz Messengers
}

Get category members with get_members():

>>> cat.get_members()
en.wikipedia.org (categorymembers) Category:Jazz Messengers
Category:Jazz Messengers (en) data
{
  members: <list(56)> {u'ns': 0, u'pageid': 43686772, u'title': u'...
}
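
The echoed members value suggests a list of dicts with ns, pageid, and title keys. As a minimal sketch of working with that shape, the hypothetical helper below pulls out sorted titles for one namespace. The sample entries are made up for illustration, and reading the list from cat.data['members'] is an assumption based on the "data" label in the output above.

```python
def member_titles(members, ns=0):
    """Return the sorted titles of members in the given namespace.

    Assumes each member is a dict with 'ns' and 'title' keys, as the
    echoed output above suggests.
    """
    return sorted(m["title"] for m in members if m.get("ns") == ns)


# Made-up sample in the same shape as the echoed members list.
sample = [
    {"ns": 0, "pageid": 2, "title": "Example article B"},
    {"ns": 0, "pageid": 1, "title": "Example article A"},
    {"ns": 14, "pageid": 3, "title": "Category:Example subcategory"},
]

for title in member_titles(sample):
    print(title)

# With a real category object (assumption: members live in cat.data):
#   titles = member_titles(cat.data['members'])
```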

See help(wptools.category) for more details.

Site usage

Use wptools.site() to get a site object:

>>> site = wptools.site('de.wikisource.org')

Get a list of Wikimedia sites with get_sites():

>>> site.get_sites()
commons.wikimedia.org (sitematrix) all
commons.wikimedia.org (en) data
{
  random: https://en.wikiquote.org
  sites: <list(741)> https://pt.wikipedia.org, https://pt.wiktiona...
}

Use get_info() to get info about a site:

>>> site.get_info('de.wikisource.org')
de.wikisource.org (query) siteinfo|siteviews|mostviewed
de.wikisource.org (query) siteviews:uniques
Wikisource (de) data
{
  activeusers: 302
  admins: 19
  articles: 407,500
  edits: 2,937,877
  images: 5,128
  info: <dict(50)> invalidusernamechars, phpversion, imagewhitelis...
  jobs: 0
  mostviewed: <list(500)> {u'count': 1367, u'ns': 0, u'title': u'H...
  pages: 443,871
  queued-massmessages: 0
  site: dewikisource
  siteviews: 31,628
  users: 45,188
  visitors: 10,450
}

Omitting the wiki name returns info about English Wikipedia:

>>> site.get_info()
Wikipedia (en) data
{
  activeusers: 127,990
  admins: 1,248
  articles: 5,481,140
  edits: 911,127,618
  images: 854,957
  info: <dict(49)> invalidusernamechars, phpversion, imagewhitelis...
  jobs: 170,080
  mostviewed: <list(500)> {u'count': 17670586, u'ns': 0, u'title':...
  pages: 43,145,240
  queued-massmessages: 0
  random: https://en.wikiquote.org
  site: enwiki
  sites: <list(741)> https://pt.wikipedia.org, https://pt.wiktiona...
  siteviews: 239,924,113
  users: 31,818,405
  visitors: 64,252,695
}

Use top() to show a site's most viewed pages:

>>> site.top('ja.wikipedia.org')
ja.wikipedia.org (query) siteinfo|siteviews|mostviewed
ja.wikipedia.org (query) siteviews:uniques
jawiki mostviewed articles:
1. メインページ (378,862)
2. 安室奈美恵 (354,964)
3. 安室奈美恵実母殺害事件 (168,767)
4. ペニーオークション詐欺事件 (155,104)
5. 泰葉 (35,576)
6. 石川知裕 (32,693)
7. 白石美帆 (32,542)
8. コードブルー -ドクターヘリ緊急救命- (29,159)
9. 小森純 (26,888)
10. 鈴木京香 (26,779)
11. SAM (ダンサー) (25,439)
12. 江草仁貴 (23,385)
13. 谷垣禎一 (23,244)
14. 石原さとみ (22,993)
15. フジタコーポレーション (群馬県) (22,939)
16. 校閲ガール (22,258)
17. ロヒンギャ (22,242)
18. 前澤友作 (19,253)
19. 阪中香織 (19,050)
20. 植田佳奈 (18,802)
21. 竹内愛紗 (18,773)
22. SHE (阪神タイガース) (18,491)
23. ペニーオークション (18,443)
24. 七つの大罪 (漫画) (18,378)
25. 豊田真由子 (17,459)
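
The mostviewed value echoed by get_info() appears to be a list of dicts with count, ns, and title keys. If you want a ranked listing in your own code rather than on the console, a rough equivalent can be built from that data. This is a sketch under that assumption: rank_mostviewed is a hypothetical helper, and it does not claim to reproduce whatever filtering top() applies internally.

```python
def rank_mostviewed(entries, limit=25):
    """Format the most viewed entries as numbered lines.

    Assumes each entry is a dict with 'title' and 'count' keys, as the
    echoed mostviewed value suggests.
    """
    ranked = sorted(entries, key=lambda e: e["count"], reverse=True)[:limit]
    return [
        "{}. {} ({:,})".format(position, entry["title"], entry["count"])
        for position, entry in enumerate(ranked, start=1)
    ]


# Two entries taken from the listing above, deliberately out of order.
sample = [
    {"count": 354964, "ns": 0, "title": "安室奈美恵"},
    {"count": 378862, "ns": 0, "title": "メインページ"},
]

for line in rank_mostviewed(sample):
    print(line)
```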

See help(wptools.site) for more details.
