Replies: 4 comments 1 reply
-
Thanks for the report, I fixed it in Cubane (it did infinite loop) Maybe there is an issue there a.map((i) => BigInt(a.length - i + 1) * 32n) It should be |
Beta Was this translation helpful? Give feedback.
-
I haven't been able to get headlong to misbehave. I added a decoded array length limit in August 2020 and currently I decode array lengths as uint21 (with validation, largest accepted array is length 2^21 -1). So I either get something like
or
or
The last one is generated by calculating the minimum possible byte length for the array: /**
* Abort early if the input is obviously too short. Best effort to fail fast before allocating memory for the array.
*/
private void checkNoDecodePossible(final int remaining, final int arrayLen) {
final int minByteLen = !dynamic
? headLength
: elementType.dynamic
? arrayLen * OFFSET_LENGTH_BYTES
: !(elementType instanceof ByteType)
? arrayLen * elementType.headLength()
: (flags & ABIType.FLAG_LEGACY_DECODE) != 0
? arrayLen
: Integers.roundLengthUp(arrayLen, UNIT_LENGTH_BYTES);
if (remaining < minByteLen) {
throw new IllegalArgumentException("not enough bytes remaining: " + remaining + " < " + minByteLen);
}
} Maybe this will help somebody. Also, headlong supports backwards offset jumps and maybe |
Beta Was this translation helpful? Give feedback.
-
ZST is a new attack on ETH ABI parsers. It's low priority since it only causes Denial of Service - there is no Remote Code Execution.
Unfortunately, micro-eth-signer was partially affected. Unexepectedly, ethers.js was fine.
Our parser was broken. How about we take the offensive and break all other parsers?
What's ABI?
ABI (Application Binary Interface) is computer-friendly "documentation" for Ethereum smart contracts. An account wishing to use a smart contract's function uses the ABI to hash the function definition, so it can create the EVM bytecode required to call the function. micro-eth-signer is a tiny JS library, which interacts with transactions and smart contracts, so, ABI parsing is present. Underneath, it uses micro-packed, a very cool lib for handling binary data.
As an example, here's a part of Uniswap ABI:
Recursion
Ethereum ABI has pointers, which means we can cause cycles by creating pointer which refers to itself: e.g. 32-byte
0000000000000000000000000000000000000000000000000000000000000000
ABI pointers are unsigned 256-bit integers. So much efficiency!
There are no pointers in ABI definition, they are auto-created for dynamic structures
You can't define recursive structure in ABI definition. Meaning, a long chain of pointers will require long definition:
O(N)
complexityHowever, we have dynamic arrays, which is by default, under pointers
If we can create an array which consists of itself, we can cause DoS by combinatorial explosion
Suppose, we have the array of
[[], []]
. Now if we replace all internal arrays with global array, we can create circular reference:which outputs:
JS is smart enough to catch this. But what about ABI parsers?
Tricking a parser
If we can create an array of two elements which references itself, we can add
[][][][][]
(depth=5) to definition, and it will create2**5
arrays.An array starts with a pointer. The structure looks like
[u256(ptr), u256(arrayLen), ...u256(elements)]
. Which means an actual array starts at '0x20' (or 32 bytes, same as ptr in first 32 bytes).To act as a pointer, all array values should be equal to this ptr. Let's look at the result:
The result is unexpected:
32n
, there is0n
in nested arraysThe reason this happens is the fact ETH ABI has "nested" pointers, which feels like a poor man's attempt to fight recursive pointers. All pointers have absolute position inside of a scope, but whenever pointer is dereferenced, it creates a new scope:
Why it is bad? (most things in ETH spec are bad, just look at those 256-bit pointers)
You can easily disable loops by enforcing that ptr points to data after pointer only (or before, but should be same for all pointers), this will mean that no ptr can point to any previous data and next pointer in array cannot use same data as previous. Most libraries already construct stuff in a way the check works. There can be some compat issues with something obscure.
Scopes try to do the same, but in significantly more complicated way. Scopes can be better for compression, at the cost of DoS attacks.
This also means pointers (0x20) will point at first element instead of 'count'. Scopes -> every element in array of pointers can point to same data.
Which means, we can create array of N elements by encoding only single element. Furthermore, we can create nested elements which will "spend" only single pointer on depth level. ethers check tries to fix this bug, but it only requires to add 32 more pointers to input.
Proof of concept
Now, all zero users of our library can feel secure!
Results
uint32[1][]
): eats all memory (LOL!)More bugs
We've crashed all popular libraries. Let's try to crash micro-eth-signer now. Payload will be surprisingly easy:
Create an array of pointers. Each of them would point to an array after the current one. Add simple array of values in the end:
Payload will be larger than previous one, but it will bypass our checks.
Pointers are hard! You can encode very complex structures without pointers: look at micro-packed and ed25519-keygen's PGP implementation.
It was fixed by prohibiting de-referencing pointer to same address more than once. Check out micro-packed source code.
Interleave
We are secure now. Right? Did I tell you that pointers are hard? Basically, if one encoded byte can be used twice in output, DoS can happen.
Let's look at arrays. 10 elements array, will be represented as len(10)[9, 8, 7, ...] By construction when parsed as uint256[][] will create arrays:
This should create size explosion
N -> N**2/2
.Elements are pointers, so they should point on specific element of array.
[32, 64, 96, 128, ...] -> last element
is big and will require a lot of padding[128, 96, 64, 32] -> elem
will need roughly half of biggest array as paddingThe result:
This is in hex, byte size is 2x smaller. Check out code in abi.test.js.
Mitigation
How can the errors be fixed?
Stealing keys with junk data
If there are any pointers in encoding, libraries won't catch additional data in it. When looking into all transactions, there was a real uniswap tx with injected data. micro-eth-signer catched it, other libraries didn't.
On a first glance, this could be used for fingerprinting. A bad client will be able to inject IP or UUID to all transactions. Bad, but not too bad.
However, suppose, we have a cold wallet which is airgapped and can't access a network. Transactions are transferred by manually copying / typing every tx byte. With junk data, wallet could inject private keys into transactions with ABI calls. Moreover, the keys could be encrypted. All ABI decoding libraries won't see it, so, a user would assume the tx is safe. An attacker, however, would be able to parse the blockchain, find TXs.
How can this be mitigated? One way is doing a round-trip of
encode(decode(data))
. Differences would be shown, however, a spec doesn't force anything related to pointers and value positions. Any significantly different parser would create a lot of false-positives. Another way is using micro-eth-signer. Whenever it's necessary to parse an unsafe tx,ptrArr.decode(p, { allowUnreadBytes: true })
could be used. For PoC, take a glance at our abi.test.jsConclusion
ETH nodes are safe: ETH ABI is not stored on-chain, and nodes don't parse ABIs.
Block explorers and wallets are affected, whenever they touch ABIs.
As for bad events:
Status as per Jan 18, 2024:
Thanks to Viem for offering a reward for the finding.
Beta Was this translation helpful? Give feedback.
All reactions