Steve edited this page Nov 21, 2017 · 21 revisions

We get Wikidata from the Wikidata API via wbgetentities. Item statements go into data['claims'], and every entity identifier found there (a "P" property or a "Q" item) goes into data['labels']. We then resolve each identifier to its label with a series of Wikidata API calls and use the results to populate data['wikidata'].

You can start with an unambiguous title, as shown here, or with a Wikidata item ID, e.g. wptools.page(wikibase='Q311715'):

>>> page = wptools.page('Art Blakey')

>>> page.get_wikidata()
www.wikidata.org (wikidata) Art Blakey
www.wikidata.org (labels) P646|Q5|P535|P4491|P358|Q15981151|Q8227...
www.wikidata.org (labels) P3430|P1417|P569|P1977|Q432|P2163|P1266...
en.wikipedia.org (imageinfo) File:Art Blakey08.JPG
Art Blakey (en) data
{
  aliases: <list(2)> Abdullah Ibn Buhaina, Arthur Blakey
  claims: <dict(53)> P646, P535, P4491, P358, P434, P166, P2605, P...
  description: American jazz drummer and bandleader
  image: <list(1)> {'kind': 'wikidata-image', u'descriptionshortur...
  label: Art Blakey
  labels: <dict(77)> P646, Q5, P535, P4491, P358, Q15981151, Q8227...
  modified: <dict(1)> wikidata
  pageid: 299895
  requests: <list(4)> wikidata, labels, labels, imageinfo
  title: Art_Blakey
  what: human
  wikibase: Q311715
  wikidata: <dict(52)> Discogs artist ID (P1953), award received (...
  wikidata_url: https://www.wikidata.org/wiki/Q311715
}

The Wikidata page claims:

>>> page.data['claims']
{u'P1006': [u'07522058X'],
 u'P106': [u'Q386854', u'Q806349', u'Q158852', u'Q36834', u'Q15981151'],
 u'P1196': [u'Q3739104'],
 u'P1263': [u'544/000048400'],
 u'P1266': [u'573917'],
 u'P1303': [u'Q128309'],
 u'P1315': [u'1067262'],
 u'P136': [u'Q8341'],
 u'P140': [u'Q432'],
 u'P1412': [u'Q1860'],
 u'P1417': [u'biography/Art-Blakey'],
 u'P166': [u'Q935843', u'Q865039'],
 u'P172': [u'Q49085'],
 u'P1728': [u'mn0000928942'],
 u'P1741': [u'82472'],
 u'P18': [u'Art Blakey08.JPG'],
 u'P19': [u'Q1342'],
 u'P1953': [u'29977'],
 u'P1977': [u'82592'],
 u'P20': [u'Q60'],
 u'P2019': [u'p6629'],
 u'P21': [u'Q6581097'],
 u'P213': [u'0000 0003 6854 5392', u'0000 0001 0869 405X'],
 u'P214': [u'10032579'],
 u'P2163': [u'68759'],
 u'P2168': [u'240740'],
 u'P227': [u'119053462'],
 u'P244': [u'n81023040'],
 u'P2604': [u'532504'],
 u'P2605': [u'119790'],
 u'P2639': [u'd8cecd47a76948088a952bb8282c517b'],
 u'P264': [u'Q885833', u'Q287177', u'Q183387'],
 u'P268': [u'13891566c'],
 u'P269': [u'130152145'],
 u'P27': [u'Q30'],
 u'P31': [u'Q5'],
 u'P3192': [u'Art+Blakey'],
 u'P3430': [u'w6474h5f'],
 u'P345': [u'nm0086845'],
 u'P3569': [u'jazz/art-blakey'],
 u'P358': [u'Q3029867'],
 u'P373': [u'Art Blakey'],
 u'P4104': [u'26442'],
 u'P434': [u'601e7466-eaf5-4a91-9909-ffd770b7e04a'],
 u'P4491': [u'blakey_art_batterie'],
 u'P535': [u'6449072'],
 u'P569': [u'+1919-10-11T00:00:00Z'],
 u'P570': [u'+1990-10-16T00:00:00Z'],
 u'P646': [u'/m/01k9wl'],
 u'P735': [u'Q8227356'],
 u'P856': [u'http://www.artblakey.com'],
 u'P910': [u'Q9048913'],
 u'P950': [u'XX854145']}
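The identifiers that need labels can be gathered from a claims dict like the one above: every property key, plus every value that looks like an entity ID. The following is a minimal sketch of that idea, written for Python 3. The function name collect_entity_ids and the regular-expression test are assumptions made for illustration; this is not wptools' own implementation, and a plain string value that happens to look like an ID (say, a Commons category named "Q5") would be picked up by mistake.

```python
import re

# Hypothetical pattern: an uppercase P or Q followed by digits only.
ENTITY_ID = re.compile(r"^[PQ]\d+$")


def collect_entity_ids(claims):
    """Return property keys and entity-like values, in first-seen order.

    Illustrative only; not taken from wptools.
    """
    ids = []
    seen = set()
    for pid, values in claims.items():
        candidates = [pid]
        # Keep only string values that look like P/Q identifiers.
        candidates += [v for v in values
                       if isinstance(v, str) and ENTITY_ID.match(v)]
        for candidate in candidates:
            if candidate not in seen:
                seen.add(candidate)
                ids.append(candidate)
    return ids
```

External identifiers such as u'p6629' (lowercase) or u'XX854145' do not match the pattern, so only their property keys are collected.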

Labels are resolved automatically by get_labels() (see the status lines above). The API accepts at most fifty (50) IDs per request, so resolving more than fifty labels takes additional calls. You can minimize the number of requests by specifying just the labels you want with wanted_labels().
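The fifty-ID limit means a long list of identifiers has to be split into groups before it is sent. A minimal sketch of that batching, assuming the IDs are joined with "|" as in the status lines above; batch_ids is a hypothetical helper, not a wptools function:

```python
def batch_ids(ids, size=50):
    """Yield pipe-joined groups of at most `size` IDs.

    Illustrative only; `size=50` mirrors the API limit mentioned above.
    """
    ids = list(ids)
    for start in range(0, len(ids), size):
        yield "|".join(ids[start:start + size])
```

With the 77 labels of the Art Blakey example this yields two groups (50 and 27 IDs), which is consistent with the two "(labels)" requests in the status lines.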

>>> page.data['labels']
{u'P1006': u'National Thesaurus for Author Names ID',
 u'P106': u'occupation',
 u'P1196': u'manner of death',
 u'P1263': u'NNDB people ID',
 u'P1266': u'AlloCin\xe9 person ID',
 u'P1303': u'instrument',
 u'P1315': u'People Australia ID',
 u'P136': u'genre',
 u'P140': u'religion',
 u'P1412': u'languages spoken, written or signed',
 u'P1417': u'Encyclop\xe6dia Britannica Online ID',
 u'P166': u'award received',
 u'P172': u'ethnic group',
 u'P1728': u'AllMusic artist ID',
 u'P1741': u'GTAA ID',
 u'P18': u'image',
 u'P19': u'place of birth',
 u'P1953': u'Discogs artist ID',
 u'P1977': u'Les Archives du Spectacle ID',
 u'P20': u'place of death',
 u'P2019': u'AllMovie artist ID',
 u'P21': u'sex or gender',
 u'P213': u'ISNI',
 u'P214': u'VIAF ID',
 u'P2163': u'FAST ID',
 u'P2168': u'SFDb person ID',
 u'P227': u'GND ID',
 u'P244': u'Library of Congress authority ID',
 u'P2604': u'Kinopoisk person ID',
 u'P2605': u'\u010cSFD person ID',
 u'P2639': u'Filmportal ID',
 u'P264': u'record label',
 u'P268': u'BnF ID',
 u'P269': u'SUDOC authorities',
 u'P27': u'country of citizenship',
 u'P31': u'instance of',
 u'P3192': u'Last.fm music ID',
 u'P3430': u'SNAC Ark ID',
 u'P345': u'IMDb ID',
 u'P3569': u'Cultureel Woordenboek identifier',
 u'P358': u'discography',
 u'P373': u'Commons category',
 u'P4104': u'Carnegie Hall agent ID',
 u'P434': u'MusicBrainz artist ID',
 u'P4491': u'Isidore ID',
 u'P535': u'Find a Grave grave ID',
 u'P569': u'date of birth',
 u'P570': u'date of death',
 u'P646': u'Freebase ID',
 u'P735': u'given name',
 u'P856': u'official website',
 u'P910': u"topic's main category",
 u'P950': u'BNE ID',
 u'Q128309': u'drum kit',
 u'Q1342': u'Pittsburgh',
 u'Q158852': u'conductor',
 u'Q15981151': u'jazz musician',
 u'Q183387': u'Columbia Records',
 u'Q1860': u'English',
 u'Q287177': u'ABC Records',
 u'Q30': u'United States of America',
 u'Q3029867': u'Art Blakey discography',
 u'Q36834': u'composer',
 u'Q3739104': u'natural causes',
 u'Q386854': u'drummer',
 u'Q432': u'Islam',
 u'Q49085': u'African Americans',
 u'Q5': u'human',
 u'Q60': u'New York City',
 u'Q6581097': u'male',
 u'Q806349': u'bandleader',
 u'Q8227356': u'Art',
 u'Q8341': u'jazz',
 u'Q865039': u'Bird Award',
 u'Q885833': u'Blue Note Records',
 u'Q9048913': None,
 u'Q935843': u'Grammy Lifetime Achievement Award'}

Finally, we rewrite all of the Wikidata page statements (claims) with the labels we've found, and put it all together in data['wikidata']:

>>> page.data['wikidata']
{u'AllMovie artist ID (P2019)': u'p6629',
 u'AllMusic artist ID (P1728)': u'mn0000928942',
 u'AlloCin\xe9 person ID (P1266)': u'573917',
 u'BNE ID (P950)': u'XX854145',
 u'BnF ID (P268)': u'13891566c',
 u'Carnegie Hall agent ID (P4104)': u'26442',
 u'Commons category (P373)': u'Art Blakey',
 u'Cultureel Woordenboek identifier (P3569)': u'jazz/art-blakey',
 u'Discogs artist ID (P1953)': u'29977',
 u'Encyclop\xe6dia Britannica Online ID (P1417)': u'biography/Art-Blakey',
 u'FAST ID (P2163)': u'68759',
 u'Filmportal ID (P2639)': u'd8cecd47a76948088a952bb8282c517b',
 u'Find a Grave grave ID (P535)': u'6449072',
 u'Freebase ID (P646)': u'/m/01k9wl',
 u'GND ID (P227)': u'119053462',
 u'GTAA ID (P1741)': u'82472',
 u'IMDb ID (P345)': u'nm0086845',
 u'ISNI (P213)': [u'0000 0003 6854 5392', u'0000 0001 0869 405X'],
 u'Isidore ID (P4491)': u'blakey_art_batterie',
 u'Kinopoisk person ID (P2604)': u'532504',
 u'Last.fm music ID (P3192)': u'Art+Blakey',
 u'Les Archives du Spectacle ID (P1977)': u'82592',
 u'Library of Congress authority ID (P244)': u'n81023040',
 u'MusicBrainz artist ID (P434)': u'601e7466-eaf5-4a91-9909-ffd770b7e04a',
 u'NNDB people ID (P1263)': u'544/000048400',
 u'National Thesaurus for Author Names ID (P1006)': u'07522058X',
 u'People Australia ID (P1315)': u'1067262',
 u'SFDb person ID (P2168)': u'240740',
 u'SNAC Ark ID (P3430)': u'w6474h5f',
 u'SUDOC authorities (P269)': u'130152145',
 u'VIAF ID (P214)': u'10032579',
 u'award received (P166)': [u'Grammy Lifetime Achievement Award (Q935843)',
  u'Bird Award (Q865039)'],
 u'country of citizenship (P27)': u'United States of America (Q30)',
 u'date of birth (P569)': u'+1919-10-11T00:00:00Z',
 u'date of death (P570)': u'+1990-10-16T00:00:00Z',
 u'discography (P358)': u'Art Blakey discography (Q3029867)',
 u'ethnic group (P172)': u'African Americans (Q49085)',
 u'genre (P136)': u'jazz (Q8341)',
 u'given name (P735)': u'Art (Q8227356)',
 u'image (P18)': u'Art Blakey08.JPG',
 u'instance of (P31)': u'human (Q5)',
 u'instrument (P1303)': u'drum kit (Q128309)',
 u'languages spoken, written or signed (P1412)': u'English (Q1860)',
 u'manner of death (P1196)': u'natural causes (Q3739104)',
 u'occupation (P106)': [u'drummer (Q386854)',
  u'bandleader (Q806349)',
  u'conductor (Q158852)',
  u'composer (Q36834)',
  u'jazz musician (Q15981151)'],
 u'official website (P856)': u'http://www.artblakey.com',
 u'place of birth (P19)': u'Pittsburgh (Q1342)',
 u'place of death (P20)': u'New York City (Q60)',
 u'record label (P264)': [u'Blue Note Records (Q885833)',
  u'ABC Records (Q287177)',
  u'Columbia Records (Q183387)'],
 u'religion (P140)': u'Islam (Q432)',
 u'sex or gender (P21)': u'male (Q6581097)',
 u'\u010cSFD person ID (P2605)': u'119790'}
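The rewrite step can be sketched as a small function that combines a claims dict with a labels dict. This is an illustration written for Python 3, not wptools' actual code, and label_claims is a hypothetical name. Two behaviours are inferred from the output above and should be read as assumptions: single-value lists are collapsed to a bare value, and values whose label is None are dropped (P910, whose only value Q9048913 has no label, does not appear in data['wikidata']).

```python
def label_claims(claims, labels):
    """Rewrite {pid: [values]} as {'label (pid)': value or list of values}.

    Illustrative only; not taken from wptools.
    """
    result = {}
    for pid, values in claims.items():
        prop_label = labels.get(pid)
        if prop_label is None:
            continue  # assumption: skip properties without a resolved label
        resolved = []
        for value in values:
            if isinstance(value, str) and value in labels:
                if labels[value] is None:
                    continue  # assumption: drop entities without a label
                resolved.append("%s (%s)" % (labels[value], value))
            else:
                resolved.append(value)  # plain value, kept unchanged
        if not resolved:
            continue  # nothing left to report for this property
        key = "%s (%s)" % (prop_label, pid)
        # Collapse single-item lists, as in the output above.
        result[key] = resolved[0] if len(resolved) == 1 else resolved
    return result
```

Called with the claims and labels shown on this page, a function like this would produce keys such as 'instance of (P31)' with the value 'human (Q5)'.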

Further reading

Properties

You can also view the old version of this page, from before we fetched ALL Wikidata.
