Implement error correction #205
Conversation
There are two parameterizations of the bech32 checksum (see the "roots" unit test in src/primitives/polynomial.rs for what they are). In rust-bitcoin#203 we mixed them up, using the generator from one but the exponents from the other. We made the same mistake with codex32 apparently. When we implement error correction this will cause failures. Fix it.
Adds a CHARACTERISTIC constant to the Field trait, so this is yet another breaking change (though in practice I don't think anybody is implementing Field on their own types).
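For readers unfamiliar with the trait, here is a minimal sketch of what such a constant looks like. This is not the crate's actual `Field` trait; the `ZERO`/`ONE` items and the GF(2) implementor are illustrative assumptions.

```rust
// Minimal sketch, NOT the crate's actual trait definition: just showing
// the shape of an associated CHARACTERISTIC constant on a field trait.
pub trait Field: Sized {
    /// The field's characteristic: the smallest n > 0 such that adding
    /// the multiplicative identity to itself n times gives zero. The
    /// fields used by bech32-style checksums are binary, so it is 2.
    const CHARACTERISTIC: usize;
    /// Additive identity.
    const ZERO: Self;
    /// Multiplicative identity.
    const ONE: Self;
}

/// GF(2), the simplest binary field, as an illustrative implementor.
#[derive(Copy, Clone, PartialEq, Eq)]
pub struct Gf2(pub bool);

impl Field for Gf2 {
    const CHARACTERISTIC: usize = 2;
    const ZERO: Gf2 = Gf2(false);
    const ONE: Gf2 = Gf2(true);
}
```

One place the characteristic matters is the formal derivative used later in this PR: the derivative coefficient i·cᵢ vanishes whenever the characteristic divides i.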
Force-pushed from c2d0ac8 to 4a10a86
…near shift registers

This provides a general-purpose implementation of the Berlekamp-Massey algorithm for finding a linear shift register that generates a given sequence prefix. If compiled without an allocator it will run less efficiently (and be limited to a maximum size), but it will work. Also introduces a fuzz test to check that it works properly and does not crash.
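To make the algorithm concrete, here is a minimal sketch of Berlekamp-Massey over GF(2). The crate's version is generic over fields and has a no-allocator mode, which this sketch does not attempt; the function name is mine.

```rust
/// Sketch of Berlekamp-Massey over GF(2): returns the connection
/// polynomial c (lowest degree first, c[0] == 1) of the shortest LFSR
/// satisfying s[i] = c[1]&s[i-1] ^ ... ^ c[L]&s[i-L] for all valid i.
fn berlekamp_massey_gf2(s: &[u8]) -> Vec<u8> {
    let n = s.len();
    let mut c = vec![0u8; n + 1]; // current connection polynomial C(x)
    let mut b = vec![0u8; n + 1]; // copy of C(x) from last length change
    c[0] = 1;
    b[0] = 1;
    let mut l = 0; // current LFSR length
    let mut m = 1; // steps since the last length change
    for i in 0..n {
        // Discrepancy: does the current LFSR predict s[i]?
        let mut d = s[i] & 1;
        for j in 1..=l {
            d ^= c[j] & s[i - j];
        }
        if d == 1 {
            // Cancel the discrepancy using the shifted previous polynomial.
            let prev = c.clone();
            for j in 0..=(n - m) {
                c[j + m] ^= b[j];
            }
            if 2 * l <= i {
                // The LFSR must grow; remember the old polynomial.
                l = i + 1 - l;
                b = prev;
                m = 0;
            }
        }
        m += 1;
    }
    c.truncate(l + 1);
    c
}
```

Over GF(2) the correction step is a plain XOR; over a larger field the shifted polynomial must additionally be scaled by the ratio of the current and previous discrepancies.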
This commit pulls everything together. The actual error correction code isn't too big: we interpret a residue as a polynomial, evaluate it at various powers of alpha to get a syndrome polynomial, call Berlekamp-Massey on this to get a "connection polynomial", then use Forney's algorithm to get the actual error values. Each step in the above is encapsulated separately -- the "big" stuff, in particular Berlekamp-Massey and obtaining the relevant constants from the checksum definition, was in previous commits. This PR does need to add some more functionality to Polynomial: specifically, the ability to evaluate polynomials, take their formal derivatives, and multiply them modulo x^d for a given d. These operations are the bulk of this PR; a sketch of them follows below. The next commit will introduce a fuzz test which hammers on the correction logic to ensure that it's not crashing.
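Here is a sketch of those three operations, written over a toy prime field GF(31) with lowest-degree-first coefficients rather than the crate's GF(32); all names are illustrative, not the crate's Polynomial API.

```rust
// Toy prime modulus for illustration; the crate actually works over
// binary fields, but the shape of the operations is the same.
const P: u64 = 31;

/// Evaluate the polynomial (lowest-degree coefficient first) at x,
/// using Horner's method.
fn eval(poly: &[u64], x: u64) -> u64 {
    poly.iter().rev().fold(0, |acc, &c| (acc * x + c) % P)
}

/// Formal derivative: d/dx of sum(c_i x^i) is sum(i * c_i * x^(i-1)).
/// Note the multiplication by i: in a field of characteristic 2 this
/// zeroes every even-degree term, which is why the Field trait needs
/// to know the characteristic.
fn formal_derivative(poly: &[u64]) -> Vec<u64> {
    poly.iter()
        .enumerate()
        .skip(1)
        .map(|(i, &c)| (i as u64 % P) * c % P)
        .collect()
}

/// Multiply two polynomials, discarding every term of degree >= d
/// (i.e. multiplication modulo x^d).
fn mul_mod_xd(a: &[u64], b: &[u64], d: usize) -> Vec<u64> {
    if a.is_empty() || b.is_empty() || d == 0 {
        return Vec::new();
    }
    let mut out = vec![0u64; d.min(a.len() + b.len() - 1)];
    for (i, &ai) in a.iter().enumerate() {
        for (j, &bj) in b.iter().enumerate() {
            if i + j < d {
                out[i + j] = (out[i + j] + ai * bj) % P;
            }
        }
    }
    out
}
```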
The codex32 test will more thoroughly exercise the algebra, since there we can correct up to 4 errors. The bech32 test, on the other hand, should work without an allocator (though to exercise this you need to manually edit fuzz/Cargo.toml to disable the alloc feature -- this is rust-lang/cargo#2980, which has been open for 10 years and counting…).
Force-pushed from 4a10a86 to 76d0dae
cc @BenWestgate in case you want to look at this API (this is the same as the branch I posted on your discussion topic, but it's cleaned up so CI passes).
What's the priority on this, bro, and what sort of review do you need to be comfortable merging? (I assume the next PR will add a bunch of unit tests that prove correctness of the algo here.)
The fuzz tests exhaustively prove correctness. I can extract some fuzz vectors into a unit test if you think there's value in that.
The snippet from the docs doesn't compile, because it references items that I can't find in the docs.

```rust
#![cfg(feature = "alloc")]
use bech32::Checksum;

/// The codex32 checksum algorithm, defined in BIP-93.
#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub enum Codex32 {}

impl Checksum for Codex32 {
    type MidstateRepr = u128;
    const CHECKSUM_LENGTH: usize = 13;
    const CODE_LENGTH: usize = 93;
    // Copied from BIP-93
    const GENERATOR_SH: [u128; 5] = [
        0x19dc500ce73fde210,
        0x1bfae00def77fe529,
        0x1fbd920fffe7bee52,
        0x1739640bdeee3fdad,
        0x07729a039cfc75f5a,
    ];
    const TARGET_RESIDUE: u128 = 0x10ce0795c2fd1e62a;
}
```
Don't return "corrections" that don't validate the checksum?

All non-bech32 characters after the HRP should be treated as erasures.

Filling erasures takes precedence over detecting or correcting errors. Document that as more erasures are marked, fewer errors can be corrected and detected, and that at the maximum number of corrected erasures there will be no error detection.

By far the easiest would be: I ask it to decode a bech32 string or a list of ints [0-31], with '?' or -1 marking erasures, and it returns me a tuple with a boolean of checksum validity, a correction if one exists, and a list of error locations; or (False, None, []) when no correction is possible. It should also throw an error explaining when too many erasures have been marked for the HD of the checksum at this code length.
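Rendered as a hypothetical Rust signature, that proposal might look roughly like this. None of these names exist in the crate; this is purely an illustration of the shape.

```rust
/// Hypothetical outcome type for the decode-and-correct call proposed
/// above -- the (validity, correction, locations) tuple as a struct.
pub struct CorrectionOutcome {
    /// Did the string validate as-is?
    pub valid: bool,
    /// The corrected symbol values, if a correction exists.
    pub correction: Option<Vec<u8>>,
    /// Positions at which errors were located.
    pub error_locations: Vec<usize>,
}

/// Returned when more erasures are marked than the checksum's distance
/// allows at this code length.
pub struct TooManyErasures;

/// Decode a string of symbol values in [0, 31], with `None` marking an
/// erasure (the '?' / -1 of the proposal). Sketch only; unimplemented.
pub fn decode_with_erasures(
    _symbols: &[Option<u8>],
) -> Result<CorrectionOutcome, TooManyErasures> {
    unimplemented!()
}
```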
@BenWestgate what commit are you using to get those docs? The missing fields are present in this PR (and I test that all doc comments compile).
I can do this, though there's a simpler check I can do instead. But this doesn't address the API question, which is that we don't know whether the set of corrections is "good" until after they're all yielded. So do we waste memory accumulating them all, waste time generating them twice, or tell the user that they might get an error even after receiving some corrections?
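One way to make that third option concrete (hypothetical types; this just illustrates "errors may arrive after some corrections have been yielded"):

```rust
/// Hypothetical error discovered only after corrections started flowing.
pub struct CorrectionError;

/// Hypothetical iterator over corrections.
pub struct Corrections {
    // decoder state would live here
}

impl Iterator for Corrections {
    // Each item is a (position, corrected value) pair -- or an error
    // that was only detected after earlier items were already yielded.
    type Item = Result<(usize, u8), CorrectionError>;

    fn next(&mut self) -> Option<Self::Item> {
        unimplemented!() // sketch only
    }
}
```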
Yeah, this seems reasonable. I'll add a parsing API that does this.
I think this guidance is just not applicable to Rust, for a few reasons:
Ah, that's my problem: I was reading the docs from the website, i.e. from master.
How much memory are we talking? And how common is this situation for randomly generated strings? I may have an opinion based on the brute force insert/delete-correcting desktop application. It turned out best not to keep candidates in RAM, or solving more than 2 inserts isn't possible; with a generator, 2 inserts plus a few deletes is possible, perhaps 3 now that no substitutions must be checked and inserts are erasures. So wasting less time would yield better corrections. The memory use is one thread per CPU checking checksums, or with this, checking the substitution error quantity and keeping track of the lowest-distance-score valid candidate found. If it returns corrections that don't validate, brute force must…
Some of the others could be treated as erasures as well: HRP, minor mixed case, wrong witver. If it's correctable, that may be useful to return. So perhaps errors have a "trait" boolean called "correctable", and then a method that actually attempts the ECC. Correctable will be false when there are too many erasures or the length is wrong.

```python
if ErrorName.correctable:
    candidate = ErrorName.suggest_correction()
```
Probably "one plus the maximum length of checksum we support without allocator, times 16, plus overhead". With compaction we should be able to reduce the size to 9, and by adding restrictions on the size of allowable strings we can probably reduce that to 4 or 5, but we'd still be using more memory than a correctly-parsed string would, which is not reasonable for an error type. (We'd have to silence lints to do it and we'd get user complaints that it was blowing up the size of all of their error types.) And as for "how common", it would take this much stack space on every single call to the library no matter whether there were errors or whether somebody was even using the maximum-length checksum or not.
Yes, that's essentially what's in this PR.
Any more ideas from @BenWestgate? Should I start reviewing this?
@clarkmoody yes please. I think that @BenWestgate's suggestions are not actionable in this PR -- though I will keep them in mind when I put together the next one.
Tests running locally on 76d0dae
ACK 76d0dae
This implements the core algorithms for error correction. In principle this exposes an API which is sufficient for somebody to implement error correction (of both substitutions and erasures). In practice the API is unlikely to be super usable because:

- it is unclear how failures should be reported (whether as a `Result` or what)
- error locations need to be mapped back to the user's `?`s or something

There is also some missing functionality:
The next PR will be an "error correction API" PR. I would like some guidance from users on what this API should look like.