From 4ba11cb8204e67f3e0449b44fbbd418f73ac2a5e Mon Sep 17 00:00:00 2001 From: Kim Ebert Date: Wed, 7 Feb 2024 08:47:31 -0700 Subject: [PATCH 1/3] feat: add compressing DIDcomm messages using dictionaries in zstd Signed-off-by: Kim Ebert --- concepts/compression-dictionary/README.md | 110 ++++++++++++++++++++ concepts/compression-dictionary/YlU3M52.png | Bin 0 -> 1361 bytes concepts/compression-dictionary/b9y8VTC.png | Bin 0 -> 2458 bytes 3 files changed, 110 insertions(+) create mode 100644 concepts/compression-dictionary/README.md create mode 100644 concepts/compression-dictionary/YlU3M52.png create mode 100644 concepts/compression-dictionary/b9y8VTC.png diff --git a/concepts/compression-dictionary/README.md b/concepts/compression-dictionary/README.md new file mode 100644 index 000000000..3345b3bce --- /dev/null +++ b/concepts/compression-dictionary/README.md @@ -0,0 +1,110 @@ +# 0000: Compressing DIDComm messages using dictionaries (Ex. 0000: RFC Topic) +- Authors: [Kim Ebert](kim@indicio.tech) +- Status: [PROPOSED](/README.md#proposed) +- Since: 2022- +- Status Note: Compression theory +- Supersedes: +- Start Date: 2022-03-10 +- Tags: [concept](/tags.md#concept) + +## Summary + +Using dictionary compression, higher compression ratios can be achieved for small messages with known entries. + +## Motivation + +DIDComm messages contain well-known values and are often short. Using dictionary-based compression may reduce the overall size of messages that are transmitted or stored. + +## Tutorial + +### Training + +The first step is to determine the type of data that needs to be provided for training, and then to generate a number of requests that meet those criteria. 
+ +An example of creating such an invite using Aca-py and curl + +``` +curl -X POST "http://127.0.0.1:8150/out-of-band/create-invitation" -H "accept: application/json" -H "Content-Type: application/json" -d "{ \"alias\": \"\", \"attachments\": [ ], \"handshake_protocols\": [ \"did:sov:BzCbsNYhMrjHiqZDTUASHg;spec/didexchange/1.0\" ], \"metadata\": {}, \"my_label\": \"\", \"use_public_did\": false}" +``` + +Result: + +``` +{"invitation_url": "https://localhost:443?oob=eyJAdHlwZSI6ICJkaWQ6c292OkJ6Q2JzTlloTXJqSGlxWkRUVUFTSGc7c3BlYy9vdXQtb2YtYmFuZC8xLjAvaW52aXRhdGlvbiIsICJAaWQiOiAiYTYwZDhlYTAtZDg1Zi00NDJkLTk0NTktZTk2NWEyYjg3Nzg1IiwgInNlcnZpY2VzIjogW3siaWQiOiAiI2lubGluZSIsICJ0eXBlIjogImRpZC1jb21tdW5pY2F0aW9uIiwgInJlY2lwaWVudEtleXMiOiBbImRpZDprZXk6ejZNa296SGNjNzI0ajlGOFJBR214bTFOY3hpVlhtOE10c0NMQ0paWktacWRwd0Z3Il0sICJyb3V0aW5nS2V5cyI6IFsiZGlkOmtleTp6Nk1rcTNycDg1cm1qTjRwdnN5WUpWTlZoVXZBNUJwTWFlNkd5MlBUUzVZaHdVelIiLCAiZGlkOmtleTp6Nk1rbnZwTmEzQXdWOHl6SHJaM0s3WXVDdU1adXBiSEt0ZDJwVDN4U3NzODRqenEiXSwgInNlcnZpY2VFbmRwb2ludCI6ICJodHRwczovL21lZGlhdG9yNC50ZXN0LmluZGljaW90ZWNoLmlvOjQ0MyJ9XSwgImhhbmRzaGFrZV9wcm90b2NvbHMiOiBbImRpZDpzb3Y6QnpDYnNOWWhNcmpIaXFaRFRVQVNIZztzcGVjL2RpZGV4Y2hhbmdlLzEuMCJdLCAibGFiZWwiOiAiTGFiIn0=", "invitation": {"@type": "did:sov:BzCbsNYhMrjHiqZDTUASHg;spec/out-of-band/1.0/invitation", "@id": "a60d8ea0-d85f-442d-9459-e965a2b87785", "services": [{"id": "#inline", "type": "did-communication", "recipientKeys": ["did:key:z6MkozHcc724j9F8RAGmxm1NcxiVXm8MtsCLCJZZKZqdpwFw"], "routingKeys": ["did:key:z6Mkq3rp85rmjN4pvsyYJVNVhUvA5BpMae6Gy2PTS5YhwUzR", "did:key:z6MknvpNa3AwV8yzHrZ3K7YuCuMZupbHKtd2pT3xSss84jzq"], "serviceEndpoint": "https://localhost:443"}], "handshake_protocols": ["did:sov:BzCbsNYhMrjHiqZDTUASHg;spec/didexchange/1.0"], "label": "Lab"}, "state": "initial", "trace": false, "invi_msg_id": "a60d8ea0-d85f-442d-9459-e965a2b87785"} +``` + +We then extract the data required for the invitation. 
+ +``` +{"@type": "did:sov:BzCbsNYhMrjHiqZDTUASHg;spec/out-of-band/1.0/invitation", "@id": "2dbf6f36-8dc0-4b35-9558-dab26e3ae3c3", "services": [{"id": "#inline", "type": "did-communication", "recipientKeys": ["did:key:z6MkqfRyf4ycr6HFpo4XyhQp8gBwdBW51Z2yXnxg11AuFZT6"], "routingKeys": ["did:key:z6Mkq3rp85rmjN4pvsyYJVNVhUvA5BpMae6Gy2PTS5YhwUzR", "did:key:z6MknvpNa3AwV8yzHrZ3K7YuCuMZupbHKtd2pT3xSss84jzq"], "serviceEndpoint": "https://localhost:443"}], "handshake_protocols": ["did:sov:BzCbsNYhMrjHiqZDTUASHg;spec/didexchange/1.0"], "label": "Lab"} +``` + +Finally, we strip out the values that are specific to the local agent, leaving content that can easily be compressed. + +``` +{"@type": "did:sov:BzCbsNYhMrjHiqZDTUASHg;spec/out-of-band/1.0/invitation", "@id": "", "services": [{"id": "#inline", "type": "did-communication", "recipientKeys": ["did:key:"], "routingKeys": ["did:key:", "did:key:"], "serviceEndpoint": "https://:443"}], "handshake_protocols": ["did:sov:BzCbsNYhMrjHiqZDTUASHg;spec/didexchange/1.0"], "label": "Lab"} +``` + +We repeat this a hundred or so times, varying other configuration options of interest. (Research into what should be included here could provide some value.) + +We then create the dictionary. + +``` +zstd --train ./data/* -o dict +``` + +This dictionary can now be used to compress the data before it is base64-encoded into the URL. + +### The Compressed Out of Band Message + +A unique URL parameter for compressed out-of-band messages lets the client determine which alternative behavior to follow. + +The compressed out-of-band (coob) message includes the following binary data. The first 4 bytes indicate the dictionary to be used, perhaps as an unsigned long. Alternatively, a d= parameter could carry the dictionary ID. + +Dictionary IDs would be used to indicate which dictionary the client should use. Occasionally, ARIES may release a new dictionary. 
This new dictionary should not be used for a limited time, to allow all clients to get the latest dictionaries. These dictionaries could be auto-retrieved by the clients when an internet connection is available. + +The rest of the coob data is the compressed zstd binary output. After the binary data is combined, it is base64url-encoded. + +``` +https://localhost:443?c=zstd&d=1&oob=KLUv_Wc3PnoBMAG1BgBijCwjEIfWAzs-1Bd8YPpweoDAqvElxVlFB2t_B0mLRHdVVVVVwQ1ZRjAL7yxb-TIysjm8Ed-yTeWLF1qo8MlxiEaMtHI3fSrdFbppodFuTwhO6WsiVbU3ECY-bHpEdFBAg8QUpAG-8RKYVKWACeQ87VWx2H7qLWqW-QNtLAt11M6HIEmkxwYucGqk1akI2O1ABcPSONJHGQaQDJnr8mtWyfL4Ho4t6nhZ-XGX-8dUCIn_JQ8CCgCVXyJ1RAnO_AEwww7QY1FQCCwIETfkSRDzzwJ-R-kV6uOdbQ== +``` + +The URL is reduced from 794 bytes to 370 bytes (46.6 % of original size). The difference in the QR codes can be seen below. + +![](./b9y8VTC.png) +![](./YlU3M52.png) + +### Redirect on failure + +If the client cannot decode the coob message, or does not have the appropriate dictionary, the client can visit the URL and will be redirected to the decompressed URL. + +## Dictionary Storage + +Dictionaries could be stored alongside the RFCs, or an alternative method for transferring dictionaries between clients could be devised. + +## Drawbacks + +Dictionaries may need to be regularly rebuilt to adjust to new protocols. Some dictionaries may not provide any compression benefits depending upon the message. + +## Rationale and alternatives + + +## Prior art + +[zstd] (http://facebook.github.io/zstd/) +[zstd manual](https://github.com/facebook/zstd/blob/dev/programs/zstd.1.md) + +## Unresolved questions + +- Where are dictionaries stored? +- How do we specify that compression will be used for DIDComm messages? +- What should happen when a client doesn't support compression? + +## Implementations + +*Implementation Notes* [may need to include a link to test results](/README.md#accepted). 
+ +Name / Link | Implementation Notes +--- | --- + | + diff --git a/concepts/compression-dictionary/YlU3M52.png b/concepts/compression-dictionary/YlU3M52.png new file mode 100644 index 0000000000000000000000000000000000000000..9bbc91b0c90a33a2b89c4372fc440b6ec48f80d2 GIT binary patch literal 1361 zcmV-X1+MyuP)eh}bWvp4pXFnnOn)r_^epdbEPb_<8TqqJR3xU-ZY&O=K*~GfSV& zCdDnZO{aTqMWPek^?Vj-f>h-kq<$Z(mUZjSQQc^6hfW%z=jCxMk0OIbKl|_l=<_(6 zKsxIyw1dd=Sl`tRt|lob3CFKEm4nFI^DuTM&r>|sg%BSHHLc%ey*fIdRlpi2H7}}%PMHN5x1{f5f!7|n-bl)gUiSR$Xl%an z&796BxhOe$(Jj?;*EUJ#&W7kEaqAeX`1YeBu#6V=c58HiEXl@|P4c3uHtFaPgwxeA zs;%5~|@S;~*s>9X@2~=2?{o1h(bm!#dycdfd*W4jona~q4??gGMxx;0k%5aM? zJFEL3{=2jyfbh}*!-Hry{OiZ%L7t<%7d_8Eei?7@*NMmI_<>p+(3EXdU5 zLFXTF`|=v!M>%|2aSkT)SmT%3F68U_af==l-9FtVh8Iw$7P>=0A^)Nm74zU#X8bL+ zNl)wZV%xxpvY_DcfwMxOa?e46=4Gl+6&$D@98P_)QefWWwA(NUNWtJm3AdZcH4ThK z+U=C^x#Z|Zi`Z()%Sg~vZ}u*W7DZ;O7ez0LEx_3f7Y>&a*!`a>xKQ(<=N(IK#?h(V z+VV>Se;?(u1rw;RKrFbJo`lrmp1r8ObJJUs7?Ba$e9hMVas&=^*KK_M#gpg-$E>e(Y^@=0*Sf|AoKk&!GPRe}hk- TbL$Jh00000NkvXXu0mjf2P3t8 literal 0 HcmV?d00001 diff --git a/concepts/compression-dictionary/b9y8VTC.png b/concepts/compression-dictionary/b9y8VTC.png new file mode 100644 index 0000000000000000000000000000000000000000..db516c4475cf3956a8654beffa73b187dd550ac2 GIT binary patch literal 2458 zcmV;L31#+)P)@))z&QbJj(qj-&Q=mcyKZG+S>LGoG~G=TYOs1 zZnl?YeaNngCq%=5w^f!9=;zBeKX~qgXMBz`T2;Ple(rvh(_bOH8Av`hOY#5y{FS&j z1LN1(9pzYWq-0Js>Uc&gUG=d(|6Z=Q0?C4huMPD_DVQ#)uY5FnEPho*(oO$5-setu)aO%zH@*OCJY@imYhBUx;mHAR$ zrm(aXY{w0qkHaUo=UhCNP5hS#SR1rT{5u%;a7Cu(q$iZ{B2~TU#EVIQ?i4+zqY#U> z&MqsjYHhfug-bf)seMWVe22mm7A_+r9e92blaTesm;1aDv?dJ@cDX{4SMIuxjN59h zoaW3A?ul*#>u-A+;M>-Rb2sbbD7m5BAU_{#@59-^g@LjtnlwP4Ny+u0r0}uIs6vt= zRv|QhzN&K4|0e-n24vG^NIsNfrJCatWJ+!sh}e6R0N-JoW=K8T4|*x_KRZ#Kb;l;+ zsh#@R69Em~M2V{kIV0-H!giS=(;b<_&hiW4-N1M;_LxtObJbJ;l(2SGhEN5$*&E-c z0jiBt2iq9zRVfsZ?sR0+QBaqYnqE1i0iqClx|~`Z@9?L(A?RFGgp_OM^O**ybUWZV>4J=}+`ivB}ncK^^`0w5J|WS<)vXmm4fxM(FMxT;c$WBEkpejrZh 
zUBfPdxn3(zex78efVa`19W1LQxGY!un|rWwFaO`siXXn>~2Z_0)=KxFcOf?G}= zLIaov5r9%q{jIIVMpGK#KBFD*XMBf3w>%)+U?5FCxTFzHU7t-FAT1rbeLC`O8+93U z^&+qN(5A%X@9>Sx?LerV){2O5B|gzmR~e&Od{wkNsj49j@CvGO7`KyxQ_)i#vcj|6 zaF0I@A76Ms5boh_P?0UNUHSscqM;Yb*azg5W^Y3p;9H%9Rwl*j8tu^D<41$#byQM6 zBYrz@7sq7p56c=MA`T~EX=22#sGt)8IoF^op`l5!Dirma6Dt|w=+f$6;@%B>Yuo|> z@Ex)-8>>}aop5Q+B*(zNU`PYpH8X}~?Ijgsgd*m}Dh)*y>V|%E+`EC~W3GoEi$+1L z#`OfamYDDp9}XYV0OP2mbPZF>j~gB}Of57|l|ROul~WoZDc^CBg-k4^C8PfaNM2|T zH0y?jG(eADAyO=c>}+c;4Cc$|Y*UGz+_8rE66#(THj<2d z=-1+I2ND^p&gUGJ6jn?!Hf}0hM!D63=nw7P4wUPCKHzx3QI#x97y$2XQaIxnw{@ohjc%X?kdNx z*S&FaKms< zNlIu#WgJ|S&h?Q9h}*8@L3#KxSYM0o4wqJ88nW1PzLSsJfvDzOheEAOH<2J?AT=O> zRjXj}{dXL8JMgVH8fa*EBUDu!HAzWnG1X}O&PRPn1Ed0;2N15GEZBC6o__14-m0Lr z28jPR(Yzao>A!IQN`W)(v`GX)1jlRsM7RG-y50^9+B7~ETv4f^i)o^Xj?NK}dbPoS zBJO^mA0&IW&r(u@Rx}!f(8(u$vfxwW{5}nkyP=uL2qIgwZRQCb4@-(OR%P@!iB!P2 zm=dkh;#79H1o*}({es8L@A!rAZXj)~m56f@yoUEtHB$j~e9=H#+27FRZlDB(sk12s zLP?Z*ZJTv)bw3=^{Nu;FfpSPlb-lWRo?7D?5G{FrLWEV#Lph`YYI>?G*>HqWzv#uN zIO);L1t^_3)Jy`@9QV^q<}&VJPa{1EQH@DeWkj!*1gP$LG_z*-3`0~>Uh&DOK67~6Re4HgPvNjry8HP z=AH()&v=xruka;AhKs>2$RW9kLlei=&Mpz~kpJ74Or!dPzey9y4wh9dViz9Y9sB)2 z3J$g2>3y{LKHx%F@Cx$Z7S>lN^7@bDx*51HDzb?)($D8OYpHMynwrZd)>`0r8sJ+p z>N7Tus`;}zP?Bb{jFOYzce!o{@}+l9YqG=lX!>Jh;34h#HR5&a(f~EltjWP`gl>@) zhwSO4?YgxZfNlGqt<>GXU96)m0RoDn(Qu}tF!3&xA+>py z*84?V)ZojDCLJ(rV+HGI{iOwkpIDZC0nQn<5lsh#@H9TG<6ATG Date: Wed, 7 Feb 2024 08:53:17 -0700 Subject: [PATCH 2/3] chore: add rfc number Signed-off-by: Kim Ebert --- .../README.md | 2 +- .../YlU3M52.png | Bin .../b9y8VTC.png | Bin 3 files changed, 1 insertion(+), 1 deletion(-) rename concepts/{compression-dictionary => 0812-compression-dictionary}/README.md (99%) rename concepts/{compression-dictionary => 0812-compression-dictionary}/YlU3M52.png (100%) rename concepts/{compression-dictionary => 0812-compression-dictionary}/b9y8VTC.png (100%) diff --git a/concepts/compression-dictionary/README.md b/concepts/0812-compression-dictionary/README.md similarity index 99% 
rename from concepts/compression-dictionary/README.md rename to concepts/0812-compression-dictionary/README.md index 3345b3bce..5bc822678 100644 --- a/concepts/compression-dictionary/README.md +++ b/concepts/0812-compression-dictionary/README.md @@ -1,4 +1,4 @@ -# 0000: Compressing DIDComm messages using dictionaries (Ex. 0000: RFC Topic) +# 0812: Compressing DIDComm messages using dictionaries - Authors: [Kim Ebert](kim@indicio.tech) - Status: [PROPOSED](/README.md#proposed) - Since: 2022- diff --git a/concepts/compression-dictionary/YlU3M52.png b/concepts/0812-compression-dictionary/YlU3M52.png similarity index 100% rename from concepts/compression-dictionary/YlU3M52.png rename to concepts/0812-compression-dictionary/YlU3M52.png diff --git a/concepts/compression-dictionary/b9y8VTC.png b/concepts/0812-compression-dictionary/b9y8VTC.png similarity index 100% rename from concepts/compression-dictionary/b9y8VTC.png rename to concepts/0812-compression-dictionary/b9y8VTC.png From 7556739136d91f1955f31af70a648a6a8e849f90 Mon Sep 17 00:00:00 2001 From: Kim Ebert Date: Wed, 20 Mar 2024 15:22:25 -0600 Subject: [PATCH 3/3] feat: add additional details as discussed in WG call Signed-off-by: Kim Ebert --- .../0812-compression-dictionary/README.md | 66 +++++++++++++++++++ 1 file changed, 66 insertions(+) diff --git a/concepts/0812-compression-dictionary/README.md b/concepts/0812-compression-dictionary/README.md index 5bc822678..2f776977f 100644 --- a/concepts/0812-compression-dictionary/README.md +++ b/concepts/0812-compression-dictionary/README.md @@ -88,11 +88,77 @@ Dictionaries may need to be regularly rebuilt to adjust to new protocols. Some d ## Rationale and alternatives +### QR Code Quality + +Reducing the size of QR codes makes them more manageable, particularly for offline use or in cases where URL redirects are not available. 
+ +### Reducing need for redirect support + +URL shortening services may introduce privacy concerns. + +### Binary-based format + +Instead of using compression, a binary file format would reduce overall message size. + +### Standard Compression without dictionaries + +#### gzip + +Using + +``` +gzip -9 +``` + +We can reduce the size of the Out of Band invitation. + +``` +https://localhost:443?c=gzip&oob=H4sICODx-mUCA3RtcC50eHQAhZFdb5swFIbv9ysqdjtCykdC2E3TbC1qCpqaDy1M02Ts03CSYDvYkEDV_z6cTVsvJu3a5338nPO-WDe6lWBFVxZDFinRRLfdLFfppkiqXYzH7NNyNV3E249KAnVErW3xbOeEM-d6MHSQN6iJRsGtD1fWDTIDIqMhC4EMbRYGz7bvu8ye-MHEhskoIG4ejsdhYMYVVA1SUH3m24v1K_se-QE5mOc3XjYVZVlzpH--qoCiROB6Du0FcNHfQxt1o2QvupjSsevvJnfh0_S-PJfXKT3j-msZJlrNHmcPWTbPjkye7k7Wd4PrF0O-_Sfs6FUyDKpyl_qyUe3mYZ2ui1UzDW5lQmB037pflotgU5xW3ZNRe5vljUyJNz2tw7aLq8ybjzf1rE6yWubxXDNXLr3zQqnQ33XHi8jvm3zmTArk2uxfaC1V5DglMCRaVP5Ag9ID5AwpCg20GKCIfN-zXg2h6LtRBdnDD1kJLag4_F3pf_X2M3CmPWELpt6L0YHkcDAejyS3Xt_9BC2pH8MxAgAA +``` + +Using gzip, we can reduce the size of the Out of Band invitation from 775 bytes to 590 bytes. (76.13 % of original size) + +#### ZSTD without a dictionary + +Using + +``` +zstd -9 +``` + +We can reduce the size of the Out of Band invitation. 
+ +``` +https://localhost:443?c=zstd&oob=KLUv_WQxAW0MAPbYVCjgzMwDaJPAFkV01uPxaHNM71pq1L8QfLwg044FcGs2cBMMwwwjDC8ESQBJAE4AEMhKUkIT1LRQg_q0DQe1XxAXAvHi33T_dz9qMAYie6kGTWBoeziqnjVBxqJELJWDrPOWprs2DKY1SWhQfVhX9m1cVNEzuAsiRQsSBone1-QeLv-p2AD34RQiDjMBopHSo1YrxJvLEVagb0Cf7Oufv3pX_3ochyvk9zn3AyIFxKKK2ut_nWtqPh5TkyjlgDgEoWSa8pkqWbJ7YmnZyy5P4mEekynrrBCcF09kolj-rkNo8WwYilJpkEAFh9MrG1CRz7JdsMsCBtRflpiG66iZu6mTub89HBE9CZ5sO9kkywIAFVGjDBzXKMczl2714_YtB1e8cZclfEM3bS_dKuIkZYWfsVXf-Ovb-4ytfTNkhzd961wUVTrvve37ONuLEBE7oUZOHRQgMMKgpQeLQytzAcvFxYE1CCvGeoAVfTFYf7zoaWmohyZtK6wtoqCuRTa3JQrbbxSl8RATCDPtnYPf +``` + +Using zstd without a dictionary, we can reduce the Out of Band invitation from 775 bytes to 582 bytes. 
(75.10 % of original size) + +### DIDComm Compression + +It would be possible to use compression in DIDComm communications. Each message would be compressed individually, as DIDComm doesn't guarantee the order in which messages are delivered. + +Things to consider: + +* Compression should probably not be used until Discover Features information has been shared +* It may be possible to share custom dictionaries via a separate protocol + +### Process of creating new dictionaries + +To be defined. + +### Distribution of dictionaries + +If dictionaries are used, they should be included in DIDComm libraries. +The dictionaries may be a dependency of a DIDComm library. ## Prior art [zstd] (http://facebook.github.io/zstd/) [zstd manual](https://github.com/facebook/zstd/blob/dev/programs/zstd.1.md) +[brotli](https://datatracker.ietf.org/doc/html/rfc7932) +[zlib](https://en.wikipedia.org/wiki/Zlib) +[DEFLATE](https://datatracker.ietf.org/doc/html/rfc1951) ## Unresolved questions