Validation of Items on local construction #98
Oh, I just saw that the serialize functions would catch such an error. But that might be too late in the process to recover from. Imagine serializing a list of 100 tokens, where one of those tokens is invalid...
Hello @LPardue
No problem! I discussed this with a few people, and we thought one approach could be to avoid using primitive types directly and instead make it more like
That implements `FromStr` and `Deref<Target = str>`, so the validation could be done in these on object construction.
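A minimal, std-only sketch of that newtype idea, assuming the RFC 8941 token grammar (`ALPHA / "*"` for the first character, then `tchar / ":" / "/"`). The names `Token` and `TokenError` and the exact variant set here are illustrative, not the crate's actual API:

```rust
use std::ops::Deref;
use std::str::FromStr;

// Hypothetical validated newtype; not sfv's real type.
#[derive(Debug, Clone, PartialEq)]
pub struct Token(String);

#[derive(Debug, PartialEq)]
pub enum TokenError {
    EmptyValue,
    InvalidLeadingCharacter,
    InvalidCharacter,
}

// sf-token = ( ALPHA / "*" ) *( tchar / ":" / "/" ), per RFC 8941 §3.3.4
fn is_token_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || "!#$%&'*+-.^_`|~:/".contains(c)
}

impl FromStr for Token {
    type Err = TokenError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let mut chars = s.chars();
        // First character must be ALPHA or '*'
        match chars.next() {
            None => return Err(TokenError::EmptyValue),
            Some(c) if !(c.is_ascii_alphabetic() || c == '*') => {
                return Err(TokenError::InvalidLeadingCharacter);
            }
            Some(_) => {}
        }
        // Remaining characters must be valid token characters
        if chars.all(is_token_char) {
            Ok(Token(s.to_owned()))
        } else {
            Err(TokenError::InvalidCharacter)
        }
    }
}

impl Deref for Token {
    type Target = str;
    fn deref(&self) -> &str {
        &self.0
    }
}
```

With this shape, `"abc".parse::<Token>()` validates at construction time, and `Deref` lets the token be used wherever a `&str` is expected.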
Thanks for the idea.
Hej folks, I stumbled across this crate a bit ago and want to use it to programmatically construct structured field values to be returned as a header from an HTTP API. I found a So I tried giving this a stab, as I liked the idea and had never fiddled around with the It turns out it would be a bigger change than initially anticipated, so I wanted to share my approach before continuing, so I don't sink too much time into a potentially unwanted change:
Examples for point two, just rough drafts from playing around. I do understand that this is technically a different issue, so feel free to move this to its own issue if you see fit.

```rust
// These are all the known failures when validating a BareItem::Token
#[derive(Error, Debug, PartialEq)]
pub enum TokenError {
    #[error("first character is not ALPHA or '*'")]
    InvalidLeadingCharacter,
    #[error("disallowed character")]
    InvalidCharacter,
    #[error("empty input string")]
    EmptyValue,
    #[error("end of the string")]
    UnexpectedEOF,
}

// `ParserError` contains all the different things that can go wrong during parsing,
// plus an `Other` variant holding the current `&'static str` messages, to allow for an easier transition
#[derive(Error, Debug, PartialEq)]
pub enum ParserError {
    #[error("parse_token: {0}")]
    TokenError(#[from] TokenError),
    #[error("{0}")]
    Other(&'static str),
}

// Same, but for serializing. Note how the prefix changes for the TokenError,
// while the actual error type stays the same
#[derive(Error, Debug, PartialEq)]
pub enum SerializerError {
    #[error("serialise_token: {0}")]
    TokenError(#[from] TokenError),
    #[error("{0}")]
    Other(&'static str),
}

// Needs a better name, but this is basically the internal representation and could change as needed
#[derive(Error, Debug)]
pub enum ErrorRepr {
    #[error(transparent)]
    ParserError(#[from] ParserError),
    #[error(transparent)]
    SerializerError(#[from] SerializerError),
}

// If we want to, we can wrap all of that "transparently" into a public error.
// This would be the type the user works with; it would currently only contain
// the same strings the library already has. Could be enhanced afterwards.
#[derive(Error, Debug)]
#[error(transparent)]
pub struct PublicError(#[from] ErrorRepr);
```

This means that within the actual codebase, all errors could first be moved to the wrapped `Other` variant:

```diff
- Err("...")
+ Err(ParserError::Other("..."))
```

This would also change the types in the test codebase for all non-migrated errors:

```diff
 assert_eq!(
-    Err("serialize_integer: integer is out of range"),
+    Err(SerializerError::Other(
+        "serialize_integer: integer is out of range"
+    )),
     Serializer::serialize_integer(1_000_000_000_000_000, &mut buf)
 );
```

From there the actual errors could be migrated.
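For readers unfamiliar with what `thiserror` generates here, a hand-written, std-only equivalent of the `#[from]` and `#[error("parse_token: {0}")]` attributes might look roughly like the sketch below. The `check_leading`/`parse_token` helpers are hypothetical, included only to show how `?` lifts a `TokenError` into a `ParserError`; the real proposal would keep using the derive macro:

```rust
use std::fmt;

#[derive(Debug, PartialEq)]
pub enum TokenError {
    InvalidLeadingCharacter,
}

#[derive(Debug, PartialEq)]
pub enum ParserError {
    TokenError(TokenError),
    Other(&'static str),
}

// `#[from]` expands to an impl like this; it is what makes `?` conversions work
impl From<TokenError> for ParserError {
    fn from(e: TokenError) -> Self {
        ParserError::TokenError(e)
    }
}

impl fmt::Display for TokenError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // mirrors #[error("first character is not ALPHA or '*'")]
        write!(f, "first character is not ALPHA or '*'")
    }
}

impl fmt::Display for ParserError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // mirrors #[error("parse_token: {0}")] — the prefix lives here
            ParserError::TokenError(e) => write!(f, "parse_token: {e}"),
            ParserError::Other(msg) => write!(f, "{msg}"),
        }
    }
}

// Hypothetical helper returning the narrow error type
fn check_leading(input: &str) -> Result<(), TokenError> {
    match input.chars().next() {
        Some(c) if c.is_ascii_alphabetic() || c == '*' => Ok(()),
        _ => Err(TokenError::InvalidLeadingCharacter),
    }
}

fn parse_token(input: &str) -> Result<&str, ParserError> {
    check_leading(input)?; // TokenError -> ParserError via the From impl
    Ok(input)
}
```

This is also why the same `TokenError` can render with a `parse_token:` prefix in `ParserError` but a `serialise_token:` prefix in `SerializerError`: the prefix belongs to the wrapping enum's `Display`, not to the inner error.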
@zcei I think that's the correct approach.
I think this is the easiest way forward. I'm not sure we really need to Deref the Token into a string, but it's a nice-to-have. In any case, this would be an API-breaking change.
What we could do is start with having parse & serialize logic still duplicated, that way the
Cool, I'll give it a shot (most likely towards the weekend) and see where I land with this approach.
From how I understood it,
sfv implements parsing validation, per the rules in RFC 8941. For example, Tokens are checked for valid character usage in https://github.com/undef1nd/sfv/blob/master/src/parser.rs#L290
However (I might have missed it), there doesn't appear to be equivalent validation in the library when constructing such items. It seems like it would be helpful to support this, without having to resort to construction, serialization, and parsing.
For instance, the backing type to Token is String. And the docs suggest that I could create an Item of type Token by using
This would strictly be an invalid token according to https://www.rfc-editor.org/rfc/rfc8941.html#section-3.3.4-7 -
the first character is required to be either ALPHA or "*"
Of course, we could require that implementations sanitize their own values before passing into sfv. But then, sfv already implements the checks, just they aren't exposed in an easy way. It would be great to expose them for reuse.
The use case here is applications that pass through values from other sources. If the receiving endpoint fails to parse such a value, it will cause interop issues. It would be much better if the sender had an opportunity to validate things before sending them, so that it might take mitigating action. There are probably still cases where a sender doesn't care, so perhaps checked and unchecked variants are required.
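The checked/unchecked split could look roughly like the sketch below. This is a hypothetical API surface, not sfv's actual one: the names `try_new`/`new_unchecked` are assumptions, and the validation is trimmed down to the leading-character and empty-input rules for brevity:

```rust
#[derive(Debug, PartialEq)]
pub struct Token(String);

impl Token {
    /// Checked variant: enforces the RFC 8941 leading-character rule
    /// (a full implementation would also validate every remaining character).
    pub fn try_new(s: &str) -> Result<Token, &'static str> {
        match s.chars().next() {
            Some(c) if c.is_ascii_alphabetic() || c == '*' => Ok(Token(s.to_owned())),
            Some(_) => Err("first character is not ALPHA or '*'"),
            None => Err("empty input string"),
        }
    }

    /// Unchecked variant for senders that don't care about validity,
    /// e.g. pass-through of values already validated elsewhere.
    pub fn new_unchecked(s: &str) -> Token {
        Token(s.to_owned())
    }
}
```

The checked constructor gives the sender its chance to take mitigating action before the value goes on the wire, while the unchecked one preserves today's "construct anything" behavior.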