-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Redo FSE Table Description section #3810
Conversation
Looking into this now. While adding a more detailed explanation of the decoding process is welcome and a great idea, Also: this PR is now in conflict with |
I will add the examples back and fix the merge conflict this weekend. |
…tion of the algorithm
da4a607
to
01c97ab
Compare
I've updated this to add back the examples, but rephrased them to use the nomenclature in the new description. I've also updated the upper bound description for offset codes: Rejecting out-of-bounds values is well-defined for Huffman weights, literal lengths, and match lengths, since they have fixed upper bounds, but offset code upper bounds are decoder-defined, so I've added some verbiage stating that decoders are allowed to reject any streams encoding a non-zero probability for a value larger than their limit, and indicated that encoders should not include non-zero probabilities for an offset code larger than the largest offset code present in the stream. |
I've read the newly proposed version, and while I have no error to report, I unfortunately also can't state that the new version is an obvious improvement (i.e. clearer) compared to the older one. They both look difficult to read to me. It's unfortunate that the FSE table allocation process is so complex to describe. I understand that the new version also contains some clarifications which are mildly different from the original version. There might be some merit here, but I can't really separate this part from the rest in the PR. If you believe some clarifications are worthwhile on their own, it might be easier to review them one by one. |
I'm not sure what this part is in reference to, but I have been trying to avoid bringing implementation details into this. Normally the expectation is that if a spec describes an algorithm, an implementation is free to implement any equivalent algorithm. So, tracking cumulative probability vs. remaining probability for instance isn't really important, and a lot of the intermediates could be described in different ways, it doesn't really matter as long as it produces the same results. What I am mainly trying to improve for example is this type of description:
That requires some deduction to tell that the 255 is supposed to be (1<<8)-1, and the 8 is supposed to come from the Anyway, I will split this up and close out this PR and open some new ones. |
Several specification updates have been integrated since this PR was first submitted. |
This rewrites the FSE Table Description section in a more descriptive and exhaustive style that I think will be clearer to implementers.
Main changes:
(1 << Accuracy_Log)
being exceeded. I believe this condition is impossible because the largest possible decodable probability value is the value that will cause the cumulative probability to equal(1 << Accuracy_Log)
at which point the loop is terminated.