rrwen/search_google

search_google

Richard Wen

A command line tool and module for Google API web and image search.

Install

  1. Install Python
  2. Install search_google via pip
pip install search_google

For the latest developer version, see Developer Install.

Usage

For help in the console:

search_google -h

Ensure that a CSE ID and a Google API developer key are set:

search_google -s cx="your_cse_id"
search_google -s build_developerKey="your_dev_key"

Search the web for keyword "cat":

search_google "cat"
search_google "cat" --save_links=cat.txt
search_google "cat" --save_downloads=downloads

Search for "cat" images:

search_google "cat" --searchType=image
search_google "cat" --searchType=image --save_links=cat_images.txt
search_google "cat" --searchType=image --save_downloads=downloads

Use as a Python module:

# Import the api module for the results class
import search_google.api

# Define buildargs for cse api
buildargs = {
  'serviceName': 'customsearch',
  'version': 'v1',
  'developerKey': 'your_api_key'
}

# Define cseargs for search
cseargs = {
  'q': 'keyword query',
  'cx': 'your_cse_id',
  'num': 3
}

# Create a results object
results = search_google.api.results(buildargs, cseargs)

# Download the search results to a directory
results.download_links('downloads')
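
The cseargs dictionary can also be assembled by a small helper, which keeps web and image searches consistent. The sketch below is only an illustration: the function name make_cseargs and its parameters are hypothetical and not part of search_google, and it assumes that the searchType key mirrors the --searchType=image option shown for the command line.

```python
def make_cseargs(query, cse_id, num=3, image=False):
    """Build a cseargs dictionary for a web or image search.

    Hypothetical helper for illustration; not part of search_google.
    """
    cseargs = {'q': query, 'cx': cse_id, 'num': num}
    if image:
        # Assumed to mirror the --searchType=image command line option
        cseargs['searchType'] = 'image'
    return cseargs


# Arguments for a web search and an image search for "cat"
web_args = make_cseargs('cat', 'your_cse_id')
image_args = make_cseargs('cat', 'your_cse_id', image=True)
```

Either dictionary could then be passed as the second argument to search_google.api.results in the example above.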

For more usage details, see the Documentation.

Contributions

Report Contributions

Reports for issues and suggestions can be made using the issue submission interface.

When possible, ensure that your submission is:

  • Descriptive: has an informative title, explanations, and screenshots
  • Specific: has details of environment (such as operating system and hardware) and software used
  • Reproducible: has steps, code, and examples to reproduce the issue

Code Contributions

Code contributions are submitted via pull requests:

  1. Ensure that you pass the Tests
  2. Create a new pull request
  3. Provide an explanation of the changes

A template of the code contribution explanation is provided below:

## Purpose

The purpose can mention goals such as bug fixes, new features, and other improvements.

## Description

The description is a short summary of the changes made, such as improved speed or implementation.

## Changes

The changes are a list of general edits made to the files and their respective components.
* `file_path1`:
    * `function_module_etc`: changed loop to map
    * `function_module_etc`: changed variable value
* `file_path2`:
    * `function_module_etc`: changed loop to map
    * `function_module_etc`: changed variable value

## Notes

The notes provide any additional text that does not fit into the above sections.

For more information, see Developer Install and Implementation.

Developer Notes

Developer Install

Install the latest developer version with pip from GitHub:

pip install git+https://github.com/rrwen/search_google

Install from git cloned source:

  1. Ensure git is installed
  2. Clone into current path
  3. Install via pip
git clone https://github.com/rrwen/search_google
cd search_google
pip install . -I

Tests

  1. Clone into current path: git clone https://github.com/rrwen/search_google
  2. Enter the folder: cd search_google
  3. Ensure unittest is available
  4. Set your CSE ID and Google API developer key
  5. Run tests
  6. Reset config file to defaults
  7. Note that this will use up 7 requests from your quota
pip install . -I
python -m search_google -s cx="your_cse_id"
python -m search_google -s build_developerKey="your_dev_key"
python -m unittest
python -m search_google -d

Documentation Maintenance

  1. Ensure sphinx is installed: pip install -U sphinx
  2. Update the documentation in docs/
pip install . -I
sphinx-build -b html docs/source docs

Upload to GitHub

  1. Ensure git is installed
  2. Add all files and commit changes
  3. Push to GitHub
git add .
git commit -a -m "Generic update"
git push

Upload to PyPI

  1. Ensure twine is installed: pip install twine
  2. Ensure sphinx is installed: pip install -U sphinx
  3. Run tests and check for OK status
  4. Delete dist directory
  5. Update the version in search_google/__init__.py
  6. Update the documentation in docs/
  7. Create source distribution
  8. Upload to PyPI
pip install . -I
python -m search_google -s cx="your_cse_id"
python -m search_google -s build_developerKey="your_dev_key"
python -m unittest
python -m search_google -d
sphinx-build -b html docs/source docs
python setup.py sdist
twine upload dist/*

Implementation

This command line tool uses the Google Custom Search Engine (CSE) to perform web and image searches. It relies on googleapiclient.build and cse.list, where build creates a Google API object and cse.list performs the searches.

The class search_google.api.results simply passes a dictionary of arguments into build and cse.list, and exposes the returned results through properties and methods. search_google.cli then provides a command line interface for search_google.api.

To use build and cse, a Google Developer API key and a Google CSE ID need to be created for API access (see search_google Setup). Creating these keys also requires a Gmail account for login access.

googleapiclient.build  <-- Google API
          |
       cse.list        <-- Google CSE
          |
   search_google.api   <-- search results
          |
   search_google.cli   <-- command line
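
The layering in the diagram can be sketched as a small class that forwards two dictionaries to a build function and to cse().list(). This is a minimal illustration under stated assumptions, not the package's actual code: ResultsSketch, its metadata attribute, the build_fn parameter, and the Fake* classes are hypothetical names, and the fake build function stands in for the Google API so that the sketch runs without network access or credentials.

```python
class ResultsSketch:
    """Hypothetical stand-in for a results class; not the package's code."""

    def __init__(self, buildargs, cseargs, build_fn):
        # build_fn is expected to accept the buildargs keys as keyword arguments
        service = build_fn(**buildargs)
        # cse().list(...).execute() returns the search response as a dictionary
        self.metadata = service.cse().list(**cseargs).execute()


# Fakes that imitate the call chain service.cse().list(...).execute()
class FakeRequest:
    def __init__(self, cseargs):
        self.cseargs = cseargs

    def execute(self):
        # Made-up response containing a single item
        return {'items': [{'link': 'https://example.com/' + self.cseargs['q']}]}


class FakeCse:
    def list(self, **cseargs):
        return FakeRequest(cseargs)


class FakeService:
    def cse(self):
        return FakeCse()


def fake_build(serviceName, version, developerKey=None):
    return FakeService()


buildargs = {
    'serviceName': 'customsearch',
    'version': 'v1',
    'developerKey': 'your_api_key'
}
cseargs = {'q': 'cat', 'cx': 'your_cse_id'}
sketch = ResultsSketch(buildargs, cseargs, build_fn=fake_build)
```

Passing the build function in as an argument is a design choice made only for this sketch; it lets the call chain be exercised with a fake before any quota is spent on real requests.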

A rough example, adapted from the customsearch example from Google, is provided below:

from googleapiclient.discovery import build

# Set developer key and CSE ID
dev_key = 'a_developer_key'
cse_id = 'a_cse_id'

# Obtain search results from Google CSE
service = build("customsearch", "v1", developerKey=dev_key)
results = service.cse().list(q='cat', cx=cse_id).execute()

# Manipulate search results after ...
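
One way to manipulate the results is to pull the title and link out of each returned item. The sketch below assumes the response is a dictionary whose items key holds a list of entries with title and link fields, and that items may be missing when a query returns nothing. The function name summarize_items and the sample response are made up for illustration.

```python
def summarize_items(response):
    """Return (title, link) pairs from a search response dictionary."""
    # Default to an empty list in case the response has no 'items' key
    items = response.get('items', [])
    return [(item.get('title'), item.get('link')) for item in items]


# Made-up response for illustration; a real one comes from .execute()
sample_response = {
    'items': [
        {'title': 'Cat', 'link': 'https://example.com/cat'},
        {'title': 'Kitten', 'link': 'https://example.com/kitten'},
    ]
}
pairs = summarize_items(sample_response)
empty = summarize_items({})
```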