Carbon-aware batch job scheduler extension for Slurm.

Weitspringer/squirrel-hpc


Squirrel - Carbon-aware HPC Scheduling

Squirrel is a carbon-aware scheduler for Slurm batch jobs. It acts as a bridge between the user and Slurm.

We propose three scheduling algorithms considering the grid carbon intensity (GCI) and servers' thermal design power (TDP) values.
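To make the role of GCI and TDP concrete, here is a minimal sketch of how the two values can be combined into a rough emissions estimate. It is not taken from Squirrel's code: the function name and the example numbers are hypothetical, and it assumes the hardware draws its full TDP for the whole runtime, which gives an upper bound rather than a measurement.

```python
def estimate_emissions_g(tdp_watts: float, runtime_hours: float, gci_g_per_kwh: float) -> float:
    """Rough upper bound on job emissions in gCO2eq.

    Assumes the node draws its full TDP for the entire runtime
    (hypothetical helper, not part of Squirrel).
    """
    energy_kwh = tdp_watts / 1000.0 * runtime_hours  # W -> kW, times hours
    return energy_kwh * gci_g_per_kwh


# Hypothetical example: a 300 W node running for 2 hours at 400 gCO2eq/kWh.
print(estimate_emissions_g(300, 2, 400))
```

Under this estimate, emissions drop either when the job runs at a time with lower GCI or on hardware with lower TDP, which is what the two kinds of shifting described in this README exploit.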

Key Features

  • Temporal shifting based on the energy zone of the data center.
  • Intra-datacenter spatial shifting based on the CPU and GPU TDP values you provide for each node.
  • Built-in, configurable forecasting based on historical data.
  • Integration of any forecasting data.
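The two kinds of shifting listed above can be illustrated with a small sketch. This is not Squirrel's scheduler: the function names, the hourly forecast format, and the node names are assumptions made for illustration. Temporal shifting is shown as picking the start hour whose window has the lowest total forecast GCI, and spatial shifting as picking the node with the lowest TDP.

```python
from typing import Sequence


def best_start_hour(gci_forecast: Sequence[float], job_hours: int) -> int:
    """Return the start index whose window of `job_hours` has the lowest total GCI."""
    if job_hours <= 0 or job_hours > len(gci_forecast):
        raise ValueError("job_hours must be between 1 and the forecast length")
    window = sum(gci_forecast[:job_hours])
    best_sum, best_start = window, 0
    # Slide the window one hour at a time, updating the sum incrementally.
    for start in range(1, len(gci_forecast) - job_hours + 1):
        window += gci_forecast[start + job_hours - 1] - gci_forecast[start - 1]
        if window < best_sum:
            best_sum, best_start = window, start
    return best_start


def lowest_tdp_node(node_tdp_watts: dict[str, float]) -> str:
    """Return the name of the node with the lowest TDP."""
    return min(node_tdp_watts, key=node_tdp_watts.get)


# Hypothetical hourly forecast (gCO2eq/kWh) and node TDPs (W).
forecast = [450, 420, 300, 280, 310, 400]
print(best_start_hour(forecast, job_hours=2))
print(lowest_tdp_node({"node-a": 250, "node-b": 180}))
```

A real scheduler also has to respect node availability and already reserved time slots, which this sketch leaves out.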

Prerequisites

Simulation

  • Python 3.11 or above.
  • A Docker installation.

Production

  • All of the above.
  • Access to the Slurm commands sbatch and scontrol.

Setup

  • Set up an InfluxDB instance. See our setup tutorial.
  • Create a virtual environment which has the requirements from requirements.txt installed.
  • Make sure InfluxDB is populated with data for your intended use.

Configuration

See the wiki page for details on configuring Squirrel.

Usage

Squirrel is built with Typer, so you can interact with it via a command-line interface (CLI). You can use Squirrel to submit batch jobs similar to Slurm's sbatch, manage historical/forecast GCI data, and run simulations.

Submit a Batch Job

To submit an sbatch job, use

python -m cli submit "<rest-of-the-slurm-arguments>" --time=<hours> --partition=<partition_names> --gpus-per-node=[type:]<number>
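If you call Squirrel from a script, the command above can be assembled as an argument list. The sketch below only builds and prints the command. The helper name and all example values (script name, partition, GPU type) are hypothetical, and it assumes the quoted first argument carries the remaining sbatch arguments as a single string.

```python
import shlex


def build_submit_command(rest_of_slurm_args: str, hours: int,
                         partition: str, gpus_per_node: str) -> list[str]:
    """Assemble the argument list for `python -m cli submit` (hypothetical helper)."""
    return [
        "python", "-m", "cli", "submit",
        rest_of_slurm_args,              # passed as ONE argument, hence the quotes on the shell
        f"--time={hours}",
        f"--partition={partition}",
        f"--gpus-per-node={gpus_per_node}",
    ]


# Hypothetical values for illustration only.
command = build_submit_command("--job-name=demo job.sh", hours=2,
                               partition="gpu", gpus_per_node="a100:1")
print(shlex.join(command))  # shell-safe rendering; the first argument stays quoted
```

Passing the list to `subprocess.run(command, check=True)` would execute it without going through a shell, so no manual quoting is needed.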

If you want to run Squirrel in simulation mode, use python -m cli simulate-submit. It also has the optional parameter --submit_date=<isoformat-datestring>.
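An ISO-format date string for --submit_date can be produced with Python's standard library. The date below is an arbitrary example, and whether Squirrel expects a timezone offset is not stated here, so this sketch uses a naive timestamp.

```python
from datetime import datetime

# Arbitrary example date: 15 January 2024, 08:00.
submit_date = datetime(2024, 1, 15, 8, 0).isoformat()
print(submit_date)  # 2024-01-15T08:00:00

# The string round-trips through the standard parser.
parsed = datetime.fromisoformat(submit_date)
```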

Import Historical GCI Data from Electricity Maps

To import historical GCI data from the data portal, use:

python -m cli electricitymaps ingest-history --help

Forecast GCI Data

Forecast based on the current configuration and store the results in InfluxDB:

python -m cli forecast to-influx --help

Run a configurable forecast based on a range of historical data and store the results in InfluxDB:

python -m cli forecast range-to-influx --help

Check out other possibilities:

python -m cli forecast --help
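This README does not describe how the built-in forecaster works. As a generic illustration of forecasting from historical data, the sketch below implements a seasonal-naive baseline: each future slot is predicted as the mean of the same slot in previous periods. The function name and data are hypothetical, and this is not Squirrel's forecasting method.

```python
from statistics import mean


def seasonal_naive_forecast(history: list[float], period: int = 24) -> list[float]:
    """Forecast the next `period` values as the per-slot mean over all full past periods."""
    full_periods = len(history) // period
    if full_periods == 0:
        raise ValueError("need at least one full period of history")
    # Keep only whole periods ending at the most recent observation,
    # so slot 0 lines up with the first value to be forecast.
    recent = history[-full_periods * period:]
    return [mean(recent[slot::period]) for slot in range(period)]


# Hypothetical history with a period of 3 slots (two full periods).
print(seasonal_naive_forecast([100, 200, 300, 120, 220, 320], period=3))
```

With hourly GCI data, `period=24` captures the daily cycle of the grid.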

Shorten Command

You can also shorten the command by setting an alias.

alias squirrel="python -m cli"

squirrel submit [...]

Project Structure

.
├── assets                          # Assets
├── cli                             # Typer CLI
├── config                          # Configuration files
│   ├── cluster_info_template.cfg   # Template for metainformation
│   ├── cluster_info.cfg            # Your metainformation
│   ├── squirrel_template.cfg       # Template for Squirrel configuration
│   └── squirrel.cfg                # Your Squirrel configuration
├── scripts                         # Thesis remnants
├── src                             # Main logic
│   ├── cluster                     # Slurm cluster functionality
│   ├── config                      # Configuration logic
│   ├── data                        # Data adapters (GCI, Timetable)
│   ├── errors                      # Custom errors
│   ├── forecasting                 # Builtin forecasting
│   ├── sched                       # Timeslots, timetable, scheduler
│   ├── sim                         # Simulation logic, scenarios
│   └── submit                      # Adapters for job submissions
├── viz                             # Directory for simulation results etc.
├── .gitignore                      # Specify files for git to ignore
├── LICENSE                         # License
├── README.md                       # Readme
├── requirements.txt                # Python requirements
└── schedule.csv                    # By default, created when jobs are submitted

Testing

python -m unittest