Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add Cryptid codecs #345

Merged
merged 11 commits into from
Sep 24, 2024
Merged

add Cryptid codecs #345

merged 11 commits into from
Sep 24, 2024

Conversation

dhuseby
Copy link
Contributor

@dhuseby dhuseby commented Apr 7, 2024

This adds a number of multicodec values for various Cryptid projects ahead of the public release of the code. Most of the code is already available:

Signed-off-by: Dave Grantham <dwg@linuxprogrammer.org>
@dhuseby dhuseby requested review from rvagg and vmx as code owners April 7, 2024 22:26
@dhuseby
Copy link
Contributor Author

dhuseby commented Apr 7, 2024

The new Multisig and Multikey specs and implementations strictly follow:

  • They MUST be in-band (with the value); not out-of-band (in context).
  • They MUST avoid lock-in and promote extensibility.
  • They MUST be compact and have a binary-packed representation.
  • They MUST have a human-readable representation.

This cannot be said for the existing Varsig spec. It relies heavily on context to know how to decode the bytes of the digital signature. More importantly, if your implementation doesn't support a specific algorithm, there is no way to know how many bytes to skip over in the stream to skip over the Varsig. Multisig fixes this as an explicit goal.

@rvagg
Copy link
Member

rvagg commented May 1, 2024

Apologies for our slowness on this @dhuseby, both Volker and I are a bit stretched and you've given us a lot to review with your PRs and the cost of context switching to try and get our heads around all the new stuff in here isn't small.

Firstly, the one we tend to be most concerned about is anything that adds ipld - i.e. anything that might qualify for going into the "codec" portion of a CID.

  1. If it's not intended to live in a CID then ipld is probably the wrong choice.
  2. If it follows the same decoding and encoding process as an existing codec, then adding a new entry is probably not appropriate. It should be a new encoding format that's not already covered. We discourage using codec codes as a signalling mechanism for anything other than "how do I turn these bytes into the IPLD data model and back again".
  3. If decoding bytes using this codec doesn't yield any "links" (concretely: CIDs), such that it can contribute to a larger graph, then serialization may be a better choice.

The new multi* entries will also require some additional thought and consideration. And due to the changes of the tags for existing entries we may need to pull in some folks who rely on those.

@dhuseby
Copy link
Contributor Author

dhuseby commented May 8, 2024

@rvagg

If decoding bytes using this codec doesn't yield any "links" (concretely: CIDs), such that it can contribute to a larger graph, then serialization may be a better choice.

Ah...ok. These are a new IPLD data structures but I think you're right, serialization is probably correct here.

The new multi* entries will also require some additional thought and consideration. And due to the changes of the tags for existing entries we may need to pull in some folks who rely on those.

This is a fair concern. I think to be safe, I'll just duplicate the overlap and assign new numbers.

Let me fix this.

Dave Grantham added 3 commits May 7, 2024 20:38
Signed-off-by: Dave Grantham <dwg@linuxprogrammer.org>
Signed-off-by: Dave Grantham <dwg@linuxprogrammer.org>
@dhuseby
Copy link
Contributor Author

dhuseby commented May 8, 2024

Thank you for taking time to review this. I changed the provenance-log and provenance-log-entry to be serialization. I changed a VLAD from cid to vlad since it is an identifier but not a CID. I restored the varsig entries to their original state and moved all of the multisig signature codecs to unique names and values to avoid collision. This should streamline the review and eliminate any push-back from repurposing.

Signed-off-by: Dave Grantham <dwg@linuxprogrammer.org>
@dhuseby
Copy link
Contributor Author

dhuseby commented May 8, 2024

I also just corrected the es384-msig name error. I filed an issue for the same name error for varsig: #348

Signed-off-by: Dave Grantham <dwg@linuxprogrammer.org>
@dhuseby
Copy link
Contributor Author

dhuseby commented May 8, 2024

Also just cleaned up the es521-msig name error. I filed an issue for the same name error for varsig: #349

@vmx
Copy link
Member

vmx commented May 8, 2024

I only had a quick look. We are generally pretty cautious adding anything into the number space which is encoded into a single byte as it is so limited (also some things that are already there shouldn't really, but that's hard to change). So if you would move your additions to a 2 bytes or higher section, that would be great.

@dhuseby
Copy link
Contributor Author

dhuseby commented Jun 7, 2024

So if you would move your additions to a 2 bytes or higher section, that would be great.

I can do that, but what about the multikey, multisig, nonce, and the chacha20-poly1305? I put all of these in the single byte space because they are fundamental types (or soon will be). I put chacha20-poly1305 to be next to the existing chacha codecs.

I'm happy to move things elsewhere but I had reasons for putting these in the single byte space.

@vmx
Copy link
Member

vmx commented Jun 10, 2024

(or soon will be).

That's the point. At the moment we hardly add anything to the range of single bytes, the idea is to put things there that have wide adoption/are standardized. I know that it's annoying if it really gets used, that later on you don't really change to a different code, but that space is just so limited.

@burdiyan
Copy link

FWIW, the word Multikey is currently being used in some of the upcoming W3C specs related to DID and Verifiable Credentials: https://www.w3.org/TR/vc-data-integrity/#multikey.

@dhuseby
Copy link
Contributor Author

dhuseby commented Aug 27, 2024

My multikey predates the W3C use. There's no mention of multikey in this file yet. Plus the verifiable credentials spec has failed to gain adoption in key political and business areas.

Dave Grantham added 2 commits August 26, 2024 20:40
Signed-off-by: Dave Grantham <dwg@linuxprogrammer.org>
Signed-off-by: Dave Grantham <dwg@linuxprogrammer.org>
table.csv Outdated Show resolved Hide resolved
Signed-off-by: Dave Grantham <dwg@linuxprogrammer.org>
Copy link
Member

@vmx vmx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are lot of codecs, but as all of them are at least in the 2 byte range I don't see a reason not to accept them. We surely want people to use the multicodec table and experiment with things. Hence I approve it.

But before merging I'd like to see another approval from someone else, as it isn't just me making decisions.

table.csv Outdated Show resolved Hide resolved
@rvagg
Copy link
Member

rvagg commented Sep 4, 2024

I think I'm OK with this, it's just a lot of new tags being added, including single-entry ones: vlag, nonce, multikey. I'd be more comfortable if we didn't encourage more expansion of tags. But I don't have a hard reason to object.

@dhuseby
Copy link
Contributor Author

dhuseby commented Sep 6, 2024

including single-entry ones: vlag, nonce, multikey

At the risk of bikeshedding... vlad and cid tags should just be "identifier" or something generic like that. multikey and key could be combine into "key-like" or something generic like that. But that would require re-tagging a bunch of values which I'm sure is way worse of an outcome. The tags are all fundamental types that fit with the establish pattern IMO.

Thanks for the time you spent reviewing these.

@rvagg
Copy link
Member

rvagg commented Sep 6, 2024

I see no reason to block with further bikeshedding

@dhuseby
Copy link
Contributor Author

dhuseby commented Sep 20, 2024

Thanks! Can we land this?

Signed-off-by: Dave Grantham <dwg@linuxprogrammer.org>
@dhuseby
Copy link
Contributor Author

dhuseby commented Sep 20, 2024

I think all concerns have been addressed.

@rvagg
Copy link
Member

rvagg commented Sep 24, 2024

Sorry for the delays Dave, we're not well set up for dealing with complex requests like this.

@rvagg rvagg merged commit badcfe5 into multiformats:master Sep 24, 2024
1 check passed
@vmx
Copy link
Member

vmx commented Sep 25, 2024

@dhuseby Thanks for your patience and all the changes compared to the original PR!

@bumblefudge
Copy link

bumblefudge commented Oct 3, 2024

@rvagg @vmx Sorry for the late appearance, this crossed my inbox last week but I was at TPAC on a borrowed laptop and couldn't get here in time.

My multikey predates the W3C use

This kinda feels like a "trust me bro" assertion. The "W3C use" was based on early versions of juan benet's eternally-unfinished/unpublished spec from 2016 which seems consistent with references in the original commits in this org. In fact, I myself removed multikey from the top-level readme in a 2023 PR exactly because I found discrepancies in various proposed multikey variants in the history of this org but no spec more canonical than JB's original issue (which matches earlier versions of W3C usage, fwiw) to discriminate between the variants. The new spec seems perfectly defensible but I would consider giving it a suffix or renaming it to avoid confusion with the older multikey and its variants. If it doesn't get renamed here in this org, it might just cause headaches if this ever goes to IANA.

FWIW W3C DIDWG and VCWG do not use the term "multikey" in surface-level schemas but do under the hood, for RDF-expansion purposes but not as a CID-like multiformat/composite containing keytypes; what they actually use (and what recent drafts specify internally rather than normrefing) is a multibased raw-codec ULEB128 of public key bytes with the w3c-namespaced RDF/JSON-LD semantic publicKeyMultibase, with key type information passed separately rather than inline. The Data Integrity vocabulary, on the other hand, groups publicKeyMultibase and other terms under the umbrella property of multikey, tho, so maybe there is some kind of confusion potential here, I'm not really an ontology expert. AFAIK none of the representatives from Maersk, GS1, or the Singaporean Trade Commission were lying about using publicKeyMultibase in prod last week at W3C TPAC? Also the DNSP social media ecosystem uses the older, simpler "multikey" family of terms for its signature verification stuff, and technically UCAN does as well, since it uses did:key throughout...

Similarly, I believe there is some multisig work ongoing elsewhere in the PL cinematic universe that I referred OP to when they asked about this a year ago, and never heard back about. That one goes by varsig rather than multisig and I'm not sure it's going to production or not, but could also be a source of avoidable confusion. Probably nothing but odd it wasn't mentioned in the PR.

No particular action needed at this time but I figured this thread should have more information in it for future reference.

@dhuseby
Copy link
Contributor Author

dhuseby commented Oct 8, 2024

@bumblefudge I don't care what it is called. If "multikey" is a hotly contested name then let's go with "polykey" or "unikey" or "whateverkey" or "skeletonkey". I officially do not care. I just need these changes to get landed. So in my best effort to end the yak shaving and to celebrate Halloween/Diwali and our triumph of light over darkness/enlightenment over ignorance, I propose the use of "skeletonkey" to alleviate the concern.

@dhuseby
Copy link
Contributor Author

dhuseby commented Oct 8, 2024

If Rust allowed for unicode type names, I'd even go so far as to call it "💀 key"

@dhuseby
Copy link
Contributor Author

dhuseby commented Oct 8, 2024

@dhuseby Thanks for your patience and all the changes compared to the original PR!

No worries! This is Thunderdome open source. 🪓

@bumblefudge
Copy link

oh totally the sigil is yours! sorry it didn't land sooner.

@dhuseby
Copy link
Contributor Author

dhuseby commented Oct 24, 2024

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants