-
Notifications
You must be signed in to change notification settings - Fork 2
Automated web browsing with javascript / cookies support.
License
picklingjar/spynner
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
= Introduction = Spynner is a stateful programmatic web browser module for Python. It is based upon [http://www.qtsoftware.com/ Qt] and [http://webkit.org/ WebKit], so it supports Javascript, AJAX, and every other technology that !WebKit is able to handle (Flash, SVG, ...). Spynner takes advantage of [http://jquery.com jQuery], a powerful Javascript library that makes the interaction with pages and event simulation really easy. Using Spynner you would able to simulate a web browser with no GUI (though a browsing window can be opened for debugging purposes), so it may be used to implement crawlers or acceptance testing tools. = Dependencies = * [http://www.python.org Python] (>=2.5) * [http://www.riverbankcomputing.co.uk/software/pyqt/download PyQt] (>=4.4.3): Python wrappers for the [http://www.qtsoftware.com/ Qt] framework. = Install = * A [http://code.google.com/p/spynner/downloads/list release] version (recommended): {{{ $ wget http://spynner.googlecode.com/files/spynner-VERSION.tgz $ tar xvzf spynner-VERSION.tgz $ cd spynner-VERSION $ sudo python setup.py install }}} or {{{ $ sudo easy_install spynner }}} * The bleeding edge version: {{{ $ svn checkout http://spynner.googlecode.com/svn/trunk/ spynner-trunk $ cd spynner-trunk $ sudo python setup.py install }}} = API = http://tokland.freehostia.com/googlecode/spynner/api/ You can generate the API locally (will create docs/api directory): $ python setup.py gen_doc = Usage = A basic example: {{{ import spynner browser = spynner.Browser() browser.load("http://www.wordreference.com") browser.runjs("console.log('I can run Javascript')") browser.runjs("console.log('I can run jQuery: ' + jQuery('a:first').attr('href'))") browser.select("#esen") browser.fill("input[name=enit]", "hola") browser.click("input[name=b]") browser.wait_page_load() print browser.url, browser.html browser.close() }}} Sometimes you'll want to see what is going on: {{{ browser = spynner.Browser() browser.debug_level = spynner.DEBUG browser.create_webview() browser.show() ... }}} See more examples in the repository: http://code.google.com/p/spynner/source/browse/#svn/trunk/examples = Running Javascript = Spynner uses jQuery to make Javascript interface easier. By default, two modules are injected to every loaded page: * [http://docs.jquery.com/Downloading_jQuery jQuery core]: Amongst other things, it adds the powerful [http://docs.jquery.com/Selectors jQuery selectors], which are used internally by some Spynner methods (_fill_, _select_, _click_, _check_, ...). Of course you can also use jQuery when you inject your own code into a page. * [http://code.google.com/p/jqueryjs/source/browse/trunk/plugins/simulate jQuery _simulate_ plugin]: Makes it possible to simulate mouse and keyboard events (for now spynner uses it only in the _click_ action). Look up the library code to see which kind of events you can fire. Note that you must use __jQuery(...)_ instead of _jQuery(...)_ or the common shortcut _$(...)_. That prevents name clashing with the jQuery library used by the page. = Cook your soup: parsing the HTML = You can parse the HTML of a webpage with your favorite parsing library ([http://www.crummy.com/software/BeautifulSoup BeautifulSoup], [http://codespeak.net/lxml/ lxml], ...). Since we are already using Jquery for Javascript, it feels just natural to work with [http://pypi.python.org/pypi/pyquery pyquery], its Python counterpart: {{{ import spynner import pyquery browser = spynner.Browser() ... d = pyquery.Pyquery(browser.html) d.make_links_absolute(browser.get_url()) href = d("#somelink").attr("href") browser.download(href, open("/path/outputfile", "w")) }}} = Running Spynner without X11 == Spynner needs a X11 server to run. If you are running it in a server without X11 you must install the virtual [http://en.wikipedia.org/wiki/Xvfb Xvfb server]. Debian users can use the small wrapper (xvfb-run). If you are not using Debian, you can download it here: http://www.mail-archive.com/debian-x@lists.debian.org/msg69632/x-run {{{ $ xvfb-run python myscript_using_spynner.py }}} = Feedback = Open an [http://code.google.com/p/spynner/issues/list issue] to report a bug or request a new feature. Other comments and suggestions can be directly emailed to me: [mailto://tokland@gmail.com tokland@gmail.com].
About
Automated web browsing with javascript / cookies support.
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published