Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improper location decoding #493

Open
tec27 opened this issue Nov 1, 2024 · 1 comment
Open

Improper location decoding #493

tec27 opened this issue Nov 1, 2024 · 1 comment

Comments

@tec27
Copy link

tec27 commented Nov 1, 2024

As of wouter 3.x, the library now calls decodeURI on locations during route matching, meaning route parameters have had this called on them as well. I'm not quite sure why this was added, as it doesn't match how routing would work in a typical HTTP server. That is, if I hit:

GET /path/escaped%2Fslashes

the intent is that this %2F is not treated like a slash, so if we were serving e.g. directories of files, this would look for a file named escaped%2Fslashes in the directory path, not a directory named escaped containing a file named slashes.

This presents a problem that is not solvable for clients of this library if any of their route components can contain a certain collection of characters. You can end up with a parameter value that can no longer be decoded unambiguously.

For instance, if you allow parameters that contain a % and #, you must use encodeURIComponent on this parameter or the resulting location will not work properly, but decodeURI will partially decode this string and you can no longer determine what the original string was without parsing (and effectively doing the routing) yourself:

const url = '/' + encodeURIComponent(`foo/bar#baz%`) // 'foo%2Fbar%23baz%25'
const afterWouter = decodeURI(url) // '/foo/bar%23baz%'
const laterDecode = decodeURIComponent(afterWouter) // throws "URIError: URI malformed"

Since the original string could also contain e.g. %25 at the end, the string at afterWouter is now ambiguous as well: we cannot tell whether the original string contained % or %25 unless we go back and look at the location value, and decoding multiple times may actually "work" without throwing an error but give you a value that is not the same as the original.

Calling decodeURI on the entire URL string multiple times makes issues like this impossible to fix, and wouter at this point seems to do it quite often.

@tec27
Copy link
Author

tec27 commented Nov 1, 2024

I'll note that I do see the issues with non-ASCII characters in URLs, but I think trying to auto-decode these in this library is a mistake (as is developers relying on the browser to auto-encode them). For example, if we rely on the browser to auto-encode the non-ASCII string from wouter's tests, but add a % at the front:

history.pushState(null, '', '/%пользователи/показать все/#101/げんきです?')
// location is now /%%D0%BF%D0%BE%D0%BB%D1%8C%D0%B7%D0%BE%D0%B2%D0%B0%D1%82%D0%B5%D0%BB%D0%B8/%D0%BF%D0%BE%D0%BA%D0%B0%D0%B7%D0%B0%D1%82%D1%8C%20%D0%B2%D1%81%D0%B5/
decodeURI(location.pathname) // throws "URIError: URI malformed", so wouter returns the string undecoded

This seems like an unfortunate quirk of the browser APIs, and I don't see a way to not have some aspect feel "broken" here. I think the approaches that maximize the capabilities of the library are either:

  1. Don't decode anything, present all route parameters, etc. to library clients undecoded, or
  2. Only decode parameters/search values (not keys)/etc. after parsing + routing (and this decoding may fail if the library client is not properly encoding their URL contents)

The current approach makes it impossible to ever accept %s in parameters/values, even if you encode them properly, which feels worse.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant