SIP Number: 006
Title: Clarity Execution Cost Assessment
Author: Aaron Blankstein aaron@blockstack.com, Reed Rosenbluth reed@blockstack.com
Consideration: Technical
Type: Consensus
Status: Ratified
Created: 19 October 2019
License: BSD 2-Clause
Sign-off: Jude Nelson jude@stacks.org, Technical Steering Committee Chair
Discussions-To: https://github.com/stacksgov/sips
This document describes the measured costs and asymptotic costs assessed for the execution of Clarity code. This will not specify the constants associated with those asymptotic cost functions. Those constants will necessarily be measured via benchmark harnesses and regression analyses. Furthermore, the analysis cost associated with this code will not be covered by this proposal.
The asymptotic cost functions for Clarity functions are modifiable via an on-chain voting mechanism. This enables new native functions to be added to the language over time.
This document also describes the memory limit imposed during contract execution, and the memory model for enforcing that limit.
Execution cost of a block of Clarity code is broken into 5 categories:
- Runtime cost: captures the number of cycles that a single processor would require to process the Clarity block. This is a unitless metric, so it will not correspond directly to cycles, but rather is meant to provide a basis for comparison between different Clarity code blocks.
- Data write count: captures the number of independent writes performed on the underlying data store (see SIP-004).
- Data read count: captures the number of independent reads performed on the underlying data store.
- Data write length: the number of bytes written to the underlying data store.
- Data read length: the number of bytes read from the underlying data store.
Importantly, these costs are used to set a block limit for each block. When it comes to selecting transactions for inclusion in a block, miners are free to make their own choices based on transaction fees, however, blocks may not exceed the block limit. If they do so, the block is considered invalid by the network --- none of the block's transactions will be materialized and the leader forfeits all rewards from the block.
Tracking the execution cost of a contract may be done either dynamically or statically. Dynamic cost assessment involves tracking, at the VM level, the various metrics as a contract is executed. Static cost assessment is performed via analysis of the contract source code, and is inherently a more pessimistic accounting of the execution cost: list operations are charged according to the maximum size of the list (per the type annotations and inferences from the source code) and branching statements are charged according to the maximum branch cost (per metric tracked, i.e., if one branch performs 1 write and has a runtime cost of 1, and another branch performs 0 writes and has a runtime cost of 2, the whole statement will be assessed as having a maximum of 1 write and runtime cost of 2).
Looking up variables in Clarity incurs a non-constant cost -- the stack depth and the length of the variable name affect this cost. However, variable names in Clarity have bounded length -- 128 characters. Therefore, the cost assessed for variable lookups may safely be constant with respect to name length.
The stack depth affects the lookup cost because the variable must be checked for in each context on the stack.
The cost model of Clarity depends on a copy-on-read semantic for objects. This allows operations like appends, matches, wrapping/unwrapping, to be constant cost, but it requires that variable lookups be charged for copies.
Cost Function:
a*X+b*Y+c
a, b, and c are constants
X := stack depth
Y := variable size
Looking up a function in Clarity incurs a constant cost with respect to name length (for the same reason as variable lookup). However, because functions may only be defined in the top-level contract context, stack depth does not affect function lookup.
Cost Function:
a
a is a constant.
The cost of binding a name in Clarity -- in either a local or the contract context is constant with respect to the length of the name:
binding_cost = a
a is a constant
Function application in Clarity incurs a cost in addition to the cost of executing the function's body. This cost is the cost of binding the arguments to their passed values, and the cost of ensuring that those arguments are of the correct type. Type checks and argument binding are linear in the size of the arguments.
The cost of applying a function is:
(a*X+b) + costEval(body)
a and b are constants
X := the cumulative size of the argument types
costEval(body)
:= the cost of executing the body of the function
User-signed transactions for contract-calls are charged for the application of the function, as well as the loading of the contract data. This charge is the same as a normal contract-call. However, contract principals that are supplied as trait arguments must be checked by the runtime system to ensure that they validly implement the trait. The cost of this check is:
read_count = 2
read_length = trait_size + contract_size
runtime_cost = a*(contract_size) + b*(trait_size) + c
This check needs to read the trait, and then validate that the supplied contract fulfills that trait by reading the contract in, and checking the method signatures. This check must be performed for each such trait parameter.
Parsing a type in Clarity incurs a linear cost in the size of the AST describing the type:
type_parsing_cost(X) = (a*X+b)
a, b, are constants
X := the number of elements in the type description AST
The type description AST is the tree of Clarity language elements used for describing the type, e.g.:
(list 1 uint)
- this AST has four elements:list
,1
,uint
and the parentheses containing them.(response bool int)
- this AST has four elements:response
,bool
,int
and the parentheses containing them.int
- this AST is just one component.
Defining a function in Clarity incurs an execution cost at the time of contract publishing (unrelated to any analysis). This is the cost of parsing the function's signature, which is linear in the length of the type signatures, and linear in the length of the function name and argument names.
binding_cost + sum(a + type_parsing_cost(Y) for Y in ARG_TYPES)
type_parsing_cost(Y)
:= the cost of parsing argument Y
ARG_TYPES := the function definition's argument type signatures
a is a constant associated with the binding of argument types
Storing a contract incurs both a runtime cost as well as storage costs. Both of these are linear the size of the contract AST.
WRITE_LENGTH = a*X+b
RUNTIME_COST = c*X+d
a, b, c, and d, are constants.
These are the initial set values for native function costs, however, these can be changed as described below in the Cost Upgrades section of this document.
Fetching data from the datastore requires hashing the key to be looked up. That cost is linear in the key size:
data_hash_cost(X) = a*X+b
X := size of the key
Fetching data from the datastore incurs a runtime cost, in addition to any costs associated with MARF accesses (which are simply counted as the integer number of times the MARF is accessed). That runtime cost is linear in the size of the fetched value (due to parsing).
read_data_cost = a*X+b
X := size of the fetched value.
Writing data to the datastore incurs a runtime cost, in addition to any costs associated with MARF writes (which are simply counted as the integer number of times the MARF is written). That runtime cost is linear in the size of the written value (due to data serialization).
write_data_cost = a*X+b
X := size of the stored value.
Contract calls incur the cost of a normal function lookup and application, plus the cost of loading that contract into memory from the data store (which is linear in the size of the called contract).
RUNTIME_COST: (a*Y+b) + func_lookup_apply_eval(X)
READ_LENGTH: Y
a and b are constants
Y := called contract size
func_lookup_apply_eval(X)
:= the cost of looking up, applying, and
evaluating the body of the function
Note that contract-calls that use trait definitions for dynamic dispatch are not charged at a different cost rate. Instead, there is a cost for looking up the trait variable (assessed as a variable lookup), and the cost of validating any supplied trait implementors is assessed during a transaction's argument validation.
RUNTIME_COST: data_hash_cost(X+Y) + read_data_cost(Z)
READ_LENGTH: Z
X := size of the map's key tuple
Y := the length of the map's name
Z := the size of the map's value tuple
RUNTIME_COST: data_hash_cost(X) + read_data_cost(Z)
READ_LENGTH: Z
X := size of the map's key tuple
Z := the size of the map's value tuple
RUNTIME_COST: data_hash_cost(X+Y) + write_data_cost(Z)
WRITE_LENGTH: Z
X := size of the map's key tuple
Y := the length of the map's name
Z := the size of the map's value tuple
RUNTIME_COST: data_hash_cost(X+Y) + write_data_cost(Z)
WRITE_LENGTH: Z
X := size of the map's key tuple
Y := the length of the map's name
Z := the size of the map's value tuple
RUNTIME_COST: data_hash_cost(X+Y) + write_data_cost(1)
WRITE_LENGTH: 1
X := size of the map's key tuple
Y := the length of the map's name
Y := the length of the map's name
RUNTIME_COST: data_hash_cost(1) + read_data_cost(Y)
READ_LENGTH: Y
Y := the size of the variable's value type
RUNTIME_COST: data_hash_cost(1) + write_data_cost(Y)
WRITE_LENGTH: Y
Y := the size of the variable's value type
RUNTIME_COST: data_hash_cost(Y) + write_data_cost(a) + b
WRITE_LENGTH: a
Y := size of the NFT type
a is a constant: the size of a token owner
b is a constant cost (for tracking the asset in the assetmap)
RUNTIME_COST: data_hash_cost(Y) + read_data_cost(a)
READ_LENGTH: a
Y := size of the NFT type
a is a constant: the size of a token owner
RUNTIME_COST: data_hash_cost(Y) + write_data_cost(a) + write_data_cost(a) + b
READ_LENGTH: a
WRITE_LENGTH: a
Y := size of the NFT type
a is a constant: the size of a token owner
b is a constant cost (for tracking the asset in the assetmap)
Minting a token is a constant-time operation that performs a constant number of reads and writes (to check the total supply of tokens and incremement).
RUNTIME: a
READ_LENGTH: b
WRITE_LENGTH: c
a, b, and c are all constants.
Transfering a token is a constant-time operation that performs a constant number of reads and writes (to check the token balances).
RUNTIME: a
READ_LENGTH: b
WRITE_LENGTH: c
a, b, and c are all constants.
Getting a token balance is a constant-time operation that performs a constant number of reads.
RUNTIME: a
READ_LENGTH: b
a and b are constants.
RUNTIME: a
READ_LENGTH: b
a and b are constants.
In addition to the cost of evaluating the body expressions of a let
,
the cost of a let
expression has a constant cost, plus
the cost of binding each variable in the new context (similar
to the cost of function evaluation, without the cost of type checks).
a + b * Y + costEval(body) + costEval(bindings)
a and b are constants
Y := the number of let arguments
costEval(body)
:= the cost of executing the body of the let
costEval(bindings)
:= the cost of evaluating the value of each let binding
a + costEval(condition) + costEval(chosenBranch)
a is a constant
costEval(condition)
:= the cost of evaluating the if condition
costEval(chosenBranch)
:= the cost of evaluating the chosen branch
if computed during static analysis, the chosen branch cost is the
max
of the two possible branches.
a + costEval(condition) + costEval(throwBranch)
a is a constant
costEval(condition)
:= the cost of evaluating the asserts condition
costEval(throwBranch)
:= the cost of evaluating the throw branch in
the event that condition is false
if computed during static analysis, the thrown branch cost is always included.
The cost of appending an item to a list is the cost of checking the type of the added item, plus some fixed cost.
a + b * X
a and b is a constant
X := the size of the list entry type
The cost of concatting two lists or buffers is linear in the size of the two sequences:
a + b * (X+Y)
a and b are constants
X := the size of the right-hand iterable
Y := the size of the left-hand iterable
The cost of evaluating an as-max-len?
function is constant (the function
is performing a constant-time length check)
The cost of mapping a list is the cost of the function lookup, and the cost of each iterated function application
a + func_lookup_cost(F) + L * apply_eval_cost(F, i)
a is a constant
func_lookup_cost(F)
:= the cost of looking up the function name F
apply_eval_cost(F, i)
:= the cost of applying and evaluating the body of F on type i
i
:= the list item type
L
:= the list length
If computed during static analysis, L is the maximum length of the list as specified by it's type.
The cost of filtering a list is the cost of the function lookup, and the cost of each iterated filter application
a + func_lookup_cost(F) + L * apply_eval_cost(F, i)
a is a constant
func_lookup_cost(F)
:= the cost of looking up the function name F
apply_eval_cost(F, i)
:= the cost of applying and evaluating the body of F on type i
i
:= the list item type
L
:= the list length
If computed during static analysis, L is the maximum length of the list as specified by it's type.
The cost of folding a list is the cost of the function lookup, and the cost of each iterated application
a + func_lookup_cost(F) + (L) * apply_eval_cost(F, i, j)
a is a constant
func_lookup_cost(F)
:= the cost of looking up the function name F
apply_eval_cost(F, i, j)
:= the cost of applying and evaluating the body of F on types i, j
j
:= the accumulator type
i
:= the list item type
L
:= the list length
If computed during static analysis, L is the maximum length of the list as specified by it's type.
The cost of getting a list length is constant, because Clarity lists store their lengths.
The cost of constructing a new list is linear -- Clarity ensures that each item in the list is of a matching type.
a*X+b
a and b are constants
X := the total size of all arguments to the list constructor
The cost of constructing a new tuple is O(nlogn)
with respect to the number of
keys in the tuple (because tuples are represented as BTrees).
a*(X*log(X)) + b
a and b are constants
X := the number of keys in the tuple
Reading from a tuple is O(nlogn)
with respect to the number of
keys in the tuple (because tuples are represented as BTrees).
a*(X*log(X)) + b
a and b are constants
X := the number of keys in the tuple
Match imposes a constant cost for evaluating the match, a cost for checking that the match-bound name does not shadow a previous variable. The total cost of execution is:
a + evalCost(chosenBranch) + cost(lookupVariable)
where a is a constant, and chosenBranch
is whichever branch
is chosen by the match. In static analysis, this will be:
max(branch1, branch2)
These check functions all have constant cost.
These functions all have constant cost.
The variadic operators (+
,-
,/
,*
, and
, or
) all have costs linear
in the number of arguments supplied
(a*X+b)
X := the number of arguments
The binary and unary operators:
>
>=
<
<=
mod
pow
xor
not
to-int
to-uint
all have constant cost, because their inputs are all of fixed sizes.
The hashing functions have linear runtime costs: the larger the value being hashed, the longer the hashing function takes.
(a*X+b)
X := the size of the input.
Clarity contract execution imposes a maximum memory usage limit for applications. For any given Clarity value, the memory usage of that value is counted using the size of the Clarity value.
Memory is consumed by the following variable bindings:
let
- each value bound in thelet
consumes that amount of memory during the execution of thelet
block.match
- the bound value in amatch
statement consumes memory during the execution of thematch
branch.- function arguments - each bound value consumes memory during the execution of the function. this includes user-defined functions as well as native functions.
Additionally, functions that perform context changes also consume memory, though they consume a constant amount:
as-contract
at-block
Types in Clarity may be described using type signatures. For example,
(tuple (a int) (b int))
describes a tuple with two keys a
and b
of type int
. These type descriptions are used by the Clarity analysis
passes to check the type correctness of Clarity code. Clarity type signatures
have varying size, e.g., the signature int
is smaller than the signature for a
list of integers.
The size of a Clarity value is defined as follows:
type_size(x) :=
if x =
int => 16
uint => 16
bool => 1
principal => 148
(buff y) => 4 + y
(some y) => 1 + size(y)
(ok y) => 1 + size(y)
(err y) => 1 + size(y)
(list ...) => 4 + sum(size(z) for z in list)
(tuple ...) => 1 + 2*(count(entries))
+ sum(size(z) for each value z in tuple)
Contract execution requires loading the contract's program state in
memory. That program state counts towards the memory limit when
executed via a programmatic contract-call!
or invoked by a
contract-call transaction.
The memory consumed by a contract is equal to:
a + b*contract_length + sum(size(x) for each constant x defined in the contract)
That is, a contract consumes memory which is linear in the contract's
length plus the amount of memory consumed by any constants defined
using define-constant
.
While data stored in the database itself does not count against the memory limit, supporting public function abort/commit behavior requires holding a write log in memory during the processing of a transaction.
Operations that write data to the data store therefore consume memory until the transaction completes, and the write log is written to the database. The amount of memory consumed by operations on persisted data types is defined as:
data-var
: the size of the stored data var's value.map
: the size of stored key + the size of the stored value.nft
: the size of the NFT keyft
: the size of a Clarity uint value.
In order to enable the addition of new native functions to the Clarity language, there is a mechanism for voting on and changing a function's cost-assessment function (the asymptotic function that describes its complexity and is used to calculate its cost).
New functions can be introduced to Clarity by first being implemented in Clarity and published in a contract. This is necessary to avoid a hard fork of the blockchain and ensure the availability of the function across the network.
If it is decided that a published function should be a part of the Clarity
language, it can then be re-implemented as a native function in Rust and
included in a future release of Clarity. Nodes running this upgraded VM
would instead use this new native implementation of the function when they
encounter Clarity code that references it via contract-call?
.
The new native function is likely faster and more efficient than the prior Clarity implementation, so a new cost-assessment function may need to be chosen for its evaluation. Until a new cost-assessment function is agreed upon, the cost of the Clarity implementation will continue to be used.
New and more accurate cost-assessment functions can be agreed upon as follows:
- A user formulates a cost-assessment function and publishes it in a
Clarity contract as a
define-read-only
function. - The user proposes the cost-assessment function by calling the
submit-proposal
function in the Clarity Cost Voting Contract. - Voting on the proposed function ensues via the voting functions in the Clarity Cost Voting Contract, and the function is either selected as a replacement for the existing cost-assessment function, or ignored.
Cost-assessment functions to be included in proposals can be published in Clarity contracts. They must be written precisely as follows:
- The function must be
read-only
. - The function must return a tuple with the keys
runtime
,write_length
,write_count
,read_count
, andread_length
(the execution cost measurement categories). - The values of the returned tuple must be Clarity expressions representing the asymptotic cost functions. These expressions must be limited to arithmetic operations.
- Variables used in these expressions must be listed as arguments to the Clarity function.
- Constant factors can be set directly in the code.
For example, suppose we have a function that implements a sorting algorithm:
(define-public (quick-sort (input (list 1024 int))) ... )
The cost-assessment function should have one argument, corresponding to the size
of the input
field, e.g.
(define-read-only (quick-sort-cost (n uint))
{
runtime: (* n (log n)),
write_length: 0,
write_count: 0,
read_count: 0,
read_length: 0,
})
Here's another example where the cost function contains constant factors:
(define-read-only (quick-sort-cost (n uint))
{
runtime: (+ 30 (* 2 n (log n))),
write_length: 0,
write_count: 0,
read_count: 0,
read_length: 0
})
Stacks holders can propose cost-assessment functions by calling the
submit-proposal
function in the Clarity Cost Voting Contract.
(define-public (submit-proposal
(function-contract principal)
(function-name (string-ascii 128))
(cost-function-contract principal)
(cost-function-name (string-ascii 128))
...
)
Description of submit-proposal
arguments:
function-contract
: the principal of the contract that defines the function for which a cost is being proposedfunction-name
: the name of the function for which a cost is being proposedcost-function-contract-principal
: the principal of the contract that defines the cost function being proposedcost-function-name
: the name of the cost-function being proposed
If submitting a proposal for a native function included in Clarity at boot,
provide the principal of the boot costs contract for the function-contract
argument, and the name of the corresponding cost function for the function-name
argument.
This function will return a response containing the proposal ID, if successful, and an error otherwise.
Usage:
(contract-call?
.cost-voting-contract
"submit-proposal"
.function-contract
"function-name"
.new-cost-function-contract
"new-cost-function-name"
)
To view cost-assessment function proposals, you can use the
get-proposal
function in the Clarity Cost Voting Contract:
(define-read-only (get-proposal (proposal-id uint)) ... )
This function takes a proposal-id
and returns a response containing the proposal
data tuple, if a proposal with the supplied ID exists, or an error code. Proposal
data tuples contain information about the state of the proposal, and take the
following form:
{
cost-function-contract: principal,
cost-function-name: (string-ascii 128),
function-contract: principal,
function-name: (string-ascii 128),
expiration-block-height: uint
}
Usage:
(contract-call? .cost-voting-contract "get-proposal" 123)
Stacks holders can vote for cost-assessment function proposals by calling the
Clarity Cost Voting Contract's vote-proposal
function. The vote-proposal
function takes two arguments, proposal-id
and amount
. proposal-id
is the ID
of the proposal being voted for. amount
is the amount of STX the voter would like
to vote with. The amount of STX you include is the number of votes you are casting.
Calling the vote
function authorizes the contract to transfer STX out
of the caller's address, and into the address of the contract. The equivalent
amount of cost-vote-tokens
will be distributed to the voter, representing the
amount of STX they have staked in the voting contract.
STX staked for voting can be withdrawn from the voting contract by the voter with
the withdraw-votes
function. If staked STX are withdrawn prior to confirmation,
they will not be counted as votes.
Upon withdrawal, the voter permits the contract to reclaim allocated CFV
tokens,
and will receive the equivalent amount of their staked STX tokens.
Voting example
(contract-call? .cost-voting-contract "vote-proposal" 123 10000)
The vote-proposal
function will return a successful response if the STX were staked
for voting, or an error if the staking failed.
Withdrawal example
(contract-call? .cost-voting-contract "withdraw-votes" 123 10000)
Like the vote-proposal
function, the withdraw-votes
function expects a proposal-id
and
an amount
, letting the voter withdraw some or all of their staked STX. This function
will return a successful response if the STX were withdrawn, and an error otherwise.
Miners can vote against (veto) cost-function proposals by creating a transaction that
calls the Clarity Cost Voting Contract's veto
function and mining
a block that includes this transaction. The veto
function won't count
the veto if the block wasn't mined by the node that signed that transaction.
In other words, miners must commit their veto with their mining power.
Usage:
(contract-call? .cost-voting-contract "veto" 123)
This function will return a successful response if the veto was counted, or an error if the veto failed.
In order for a cost-function proposal to get successfully voted in, it must be
confirmed. Confirmation is a two step process, involving calling the confirm-votes
function before the proposal has expired to confirm the vote threshold was met,
and calling the confirm-miners
function after to confirm that the proposal wasn't vetoed
by miners.
Any stacks holder can call the confirm-votes
function in the Clarity
Cost Voting Contract to attempt confirmation. confirm-votes
will return a
success response and become vote confirmed if the following criteria are met.
- The proposal must receive votes representing 20% of the liquid supply of STX. This is calculated like such:
(>= (/ (* votes u100) stx-liquid-supply) u20)
- The proposal must not be expired, meaning its
expiration-height
must not have been reached.
Usage:
(contract-call? .cost-voting-contract "confirm-votes" 123)
Like confirm-votes
, any stacks holder can call the confirm-miners
function.
confirm-miners
will return a success response and the proposal will become
miner confirmed if the following criteria are met:
- The number of vetos is less than 500.
- There have been less than 10 proposals already confirmed in the current block.
- The proposal has expired.
Usage:
(contract-call? .cost-voting-contract "confirm-miners" 123)
This section will be expanded upon after this SIP is ratified.
All Stacks accounts from Stacks 1.0 will be imported into Stacks 2.0 when it launches. The state of the Stacks 1.0 chain will be snapshotted at Bitcoin block 665750.
At least 20 miners must register a name in the .miner
namespace in Stacks 1.0.
Once the 20th miner has registered, the state of Stacks 1.0 will be snapshotted.
300 Bitcoin blocks later (Bitcoin block 666050), the Stacks 2.0 blockchain will launch. Stacks 2.0
implements this SIP.
Implemented in Rust. See https://github.com/blockstack/stacks-blockchain.