Spare Cores Data


SC Data is a Python package and a set of related tools that use sparecores-crawler to pull and standardize data on cloud compute resources. This repository runs the crawler every 5 minutes to update spot prices, and every hour to refresh all cloud resources in an internal SCD table and a public SQLite snapshot.

Installation

Stable version from PyPI:

pip install sparecores-data

Most recent version from GitHub:

pip install "sparecores-data @ git+https://git@github.com/SpareCores/sc-data.git"

Usage

For easy access to the most recent version of the SQLite database file, import the db object from the sc_data Python package; it runs a background updater thread to keep the SQLite file up to date:

from sc_data import db
print(db.path)
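Since db.path points at an ordinary SQLite file, it can be queried with the standard sqlite3 module. A minimal sketch, listing the tables in the snapshot (the path is a hypothetical placeholder here; with sc_data installed you would pass db.path instead):

```python
import sqlite3

# In real use: `from sc_data import db; path = db.path`.
# ":memory:" stands in for db.path so the sketch is self-contained.
path = ":memory:"

con = sqlite3.connect(path)
# List all table names in the snapshot.
tables = [
    row[0]
    for row in con.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)
con.close()
```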

By default, the SQLite file is updated every 600 seconds. This interval can be overridden via the sc_data_db_refresh_seconds builtins attribute or the SC_DATA_DB_REFRESH_SECONDS environment variable.
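The builtins override can be sketched as below; the assumption (per the note above) is that sc_data reads this attribute at import time, so it must be set before the package is imported:

```python
import builtins

# Set the refresh interval to one hour instead of the 600-second default.
# This must happen before `import sc_data` for the override to take effect.
builtins.sc_data_db_refresh_seconds = 3600

# from sc_data import db  # the updater thread would now refresh hourly
print(builtins.sc_data_db_refresh_seconds)
```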

Similarly, you can set the following environment variables:

  • SC_DATA_NO_UPDATE
  • SC_DATA_DB_PATH
  • SC_DATA_DB_URL
  • SC_DATA_DB_REFRESH_SECONDS
  • SC_DATA_HTTP_TIMEOUT
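A hedged sketch of setting these variables from Python before importing the package; the interpretations in the comments are inferred from the variable names, not confirmed by the docs, and the path value is illustrative:

```python
import os

# Assumed semantics (from the names): disable the background updater,
# choose where the snapshot is stored, and set the refresh interval.
# Set these before `import sc_data` so they are picked up at import time.
os.environ["SC_DATA_NO_UPDATE"] = "1"
os.environ["SC_DATA_DB_PATH"] = "/tmp/sc-data.db"
os.environ["SC_DATA_DB_REFRESH_SECONDS"] = "3600"
```

The same effect can be achieved by exporting the variables in the shell before launching Python.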

References