# spice

Simple client for extracting data from the Dune Analytics API.

Goals of `spice`:
- use as a python library or as a CLI tool
- simple, no OOP, entire API is just one function
- support both sync and async workflows
- tight integration with polars
To discuss `spice`, head to the Paradigm Data Tools Telegram channel.
## Install

```bash
pip install dune_spice
```
## Usage

You can use either the sync workflow or the async workflow. Each workflow has only one function.

See the [API Reference](#api-reference) below for the full list of query function arguments.
```python
import spice

# get most recent query results using query id
df = spice.query(21693)

# get most recent query results using query url
df = spice.query('https://dune.com/queries/21693')

# get most recent query results using raw sql
df = spice.query('SELECT * FROM ethereum.blocks LIMIT 5')

# perform new query execution and get results
df = spice.query(query, refresh=True)

# get query results for input parameters
df = spice.query(query, parameters={'network': 'ethereum'})

# perform new query execution, but do not wait for result
execution = spice.query(query, poll=False)

# get results of previous execution
df = spice.query(execution)
```
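Since results come back as polars DataFrames, they can be manipulated directly with the polars API. A minimal sketch, assuming a valid API key is configured (the `gas_used` column name is used only for illustration):

```python
import polars as pl
import spice

# query results are returned as a polars DataFrame
df = spice.query('SELECT * FROM ethereum.blocks LIMIT 5')

# inspect the inferred schema of the result
print(df.schema)

# standard polars operations apply directly;
# 'gas_used' is a hypothetical column name for illustration
high_gas = df.filter(pl.col('gas_used') > 1_000_000)
```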
The async API is identical to the sync API above, just add an `async_` prefix.
```python
df = await spice.async_query(21693)
df = await spice.async_query('https://dune.com/queries/21693')
df = await spice.async_query('SELECT * FROM ethereum.blocks LIMIT 5')
df = await spice.async_query(query, refresh=True)
df = await spice.async_query(query, parameters={'network': 'ethereum'})
execution = await spice.async_query(query, poll=False)
df = await spice.async_query(execution)
```
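Because `async_query` is a coroutine, multiple queries can also be run concurrently with standard asyncio tooling. A minimal sketch reusing the example queries from above:

```python
import asyncio
import spice

async def main() -> None:
    # run several queries concurrently instead of one at a time
    dfs = await asyncio.gather(
        spice.async_query(21693),
        spice.async_query('SELECT * FROM ethereum.blocks LIMIT 5'),
    )
    for df in dfs:
        print(df.shape)

asyncio.run(main())
```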
Running the `spice` CLI will 1) extract the given query, 2) preview the contents, and 3) save it to a file. Each step of this process can be customized using the CLI options.
```bash
spice 21693
spice https://dune.com/queries/21693
spice "SELECT * FROM ethereum.blocks LIMIT 5"
spice $QUERY --refresh
spice $QUERY --parameters network=ethereum
```
`spice` contains additional quality-of-life features such as:
- automatically cache results locally to save time and credits for repeated queries (see the sketch after this list)
- automatically handle pagination of multi-page results
- automatically execute queries that have no existing executions, especially when using new parameter values
- allow type overrides using the `types` parameter
- support raw dynamic querying using parameter templates in order to 1) minimize roundtrips and 2) not require a paid API plan
- auto-retry with exponential backoff when encountering HTTP 429 rate limit errors
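A minimal sketch of some of these features in use; the cache directory, the column type override, and the use of Dune's `{{parameter}}` template syntax in raw sql are assumptions for illustration:

```python
import polars as pl
import spice

# cache results in a chosen directory; repeated calls load from disk
df = spice.query(21693, cache=True, cache_dir='/tmp/dune_cache')

# override inferred column types in the output dataframe
# (assumes the query's first column should be parsed as UInt64)
df = spice.query(21693, types=[pl.UInt64])

# raw sql with a parameter template
df = spice.query(
    'SELECT * FROM {{network}}.blocks LIMIT 5',
    parameters={'network': 'ethereum'},
)
```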
## API Reference

These python functions are accessed as `spice.query()` and `spice.async_query()`.
```python
def query(
    query_or_execution: Query | Execution,
    *,
    verbose: bool = True,
    refresh: bool = False,
    max_age: int | float | None = None,
    parameters: Mapping[str, Any] | None = None,
    api_key: str | None = None,
    performance: Performance = 'medium',
    poll: bool = True,
    poll_interval: float = 1.0,
    limit: int | None = None,
    offset: int | None = None,
    sample_count: int | None = None,
    sort_by: str | None = None,
    columns: Sequence[str] | None = None,
    extras: Mapping[str, Any] | None = None,
    types: Sequence[pl.DataType] | None = None,
    cache: bool = True,
    cache_dir: str | None = None,
    save_to_cache: bool = True,
    load_from_cache: bool = True,
    include_execution: bool = False,
) -> pl.DataFrame | Execution:
    """get results of query as dataframe

    # Parameters
    - query_or_execution: query or execution to retrieve results of
    - verbose: whether to print verbose info
    - refresh: trigger a new execution, or just use most recent execution
    - max_age: max age of last execution in seconds, or trigger a new execution
    - parameters: dict of query parameters
    - api_key: dune api key, otherwise use DUNE_API_KEY env var
    - performance: performance level
    - poll: wait for result as DataFrame, or just return Execution handle
    - poll_interval: polling interval in seconds
    - limit: number of rows to query in result
    - offset: row number to start returning results from
    - sample_count: number of random samples from query to return
    - sort_by: an ORDER BY clause to sort data by
    - columns: columns to retrieve, by default retrieve all columns
    - extras: extra parameters used for fetching execution result
        - examples: ignore_max_datapoints_per_request, allow_partial_results
    - types: column types to use in output polars dataframe
    - cache: whether to use cache for saving or loading
    - cache_dir: directory to use for cached data (create tmp_dir if None)
    - save_to_cache: whether to save to cache, set false to load only
    - load_from_cache: whether to load from cache, set false to save only
    - include_execution: return Execution metadata alongside query result
    """
    ...
```
```python
async def async_query(
    # all the same parameters as query()
    ...
) -> pl.DataFrame | Execution:
    """get results of query as dataframe, asynchronously

    ## Parameters
    [see query()]
    """
    ...
```
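For illustration, a hedged sketch combining several of these arguments; the column names and values are arbitrary choices, not defaults:

```python
import spice

# fetch at most 100 rows of two specific columns, sorted server-side,
# accepting a cached execution up to one hour old
df = spice.query(
    21693,
    limit=100,
    columns=['block_number', 'block_hash'],  # hypothetical column names
    sort_by='block_number DESC',
    max_age=3600,
)
```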
## FAQ

### How do I set my API key?

Set your API key using one of these approaches:
- specify key for all sessions: set the `DUNE_API_KEY` environment variable (e.g. write `export DUNE_API_KEY=your-api-key` in your `~/.bash_profile`)
- specify key for a single session: set the `DUNE_API_KEY` entry in python's `os.environ` (e.g. `os.environ['DUNE_API_KEY'] = 'your-api-key'`)
- specify key for a single query: in python use the `api_key` argument (`df = query(api_key='your-api-key', ...)`) or for the cli use the `--api-key` argument (`spice --api-key your-api-key ...`)
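For example, a minimal sketch combining the session-level and per-query approaches (the key value is a placeholder):

```python
import os
import spice

# set the key for this python session only
os.environ['DUNE_API_KEY'] = 'your-api-key'
df = spice.query(21693)

# or pass the key explicitly for a single query
df = spice.query(21693, api_key='your-api-key')
```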
### How do I get a Dune API key?

- Create a Dune account
- Go to Settings --> API
- Click "Create new API key"
### Does `spice` work for large query results?

Yes. But to fetch the result of large queries, you may need to remove the "250k datapoints per request" limitation in your account settings on the Dune website.
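A hedged sketch of fetching a large result, using the `extras` options listed in the API reference above:

```python
import spice

# ask dune to ignore the per-request datapoint limit and to
# return partial results rather than failing on very large outputs
df = spice.query(
    21693,
    extras={
        'ignore_max_datapoints_per_request': True,
        'allow_partial_results': True,
    },
)
```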
### Which Dune endpoints does `spice` use?

`spice` interacts only with Dune's SQL-related API endpoints, documented in Dune's API reference.