Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add verify client config for embedded DERP #2260

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

seiuneko
Copy link

  • have read the CONTRIBUTING.md file
  • raised a GitHub issue or discussed it on the projects chat beforehand
  • added unit tests
  • added integration tests
  • updated documentation if needed
  • updated CHANGELOG.md

Added verify_client_url and verify_client_url_fail_open configurations to support client verification in the embedded DERP server.

@kradalby
Copy link
Collaborator

Without having had a look at this, it seems very unnecessary to roundtrip this both via the config and via the http endpoint.

Can it not be passed internally as a function? if anything, I dont see why it should be configurable.

@seiuneko
Copy link
Author

@kradalby
The DERP server at tailscale.com/derp only supports hardcoded tailscaled Unix socket verification or URL-based verification.
https://github.com/tailscale/tailscale/blob/4d33f30f91eb7debdf90c8770990801f3857e30c/derp/derp_server.go#L1265

@kradalby
Copy link
Collaborator

Ok, this rings a bell, I dont think the URL should be configurable, this should be internal, so we should populate it if it is enabled:

derp:
  server:
    verify:
      enabled: true/false

I'm on the fence if an option for fail open should be included, my rational is that it should be impossible for it to not be, and if it is you have bigger problems.

The URL should be automatically populated and go over localhost, but it should have some logic in case a user uses a footgun like running it on [::1] vs 127.0.0.1 etc.

@seiuneko
Copy link
Author

@kradalby

Ok, this rings a bell, I dont think the URL should be configurable, this should be internal, so we should populate it if it is enabled:

I initially considered using server_url for automatic configuration, but I found that the situation is more complicated than expected.
I use Nginx to reverse proxy Headscale, with Nginx handling TLS decryption. When running httping on the VPS hosting Headscale with the server_url, I found the latency to be as high as 19ms. The TLS connection latency on localhost is also 3.5ms. However, the HTTP connection on localhost has only 0.4ms of latency.

httping test result

httping -SvY -t 2 -i 0.1 -g https://*.*.moe:8443/health -c 32
connected to 118.*.*.*:8443 (354 bytes), seq=31 time=  0.51+ 12.58+  5.70+  0.09+  0.02= 18.89 ms  toff=-669
--- https://*.*.moe:8443/health ping statistics ---
32 connects, 32 ok, 0.00% failed, time 3819ms
round-trip min/avg/max/sd = 18.2/19.1/33.9/2.7 ms
httping -SvY -t 2 -i 0.1 -g https://*.*.moe:8443/health -c 32 --divert-connect 127.0.0.1
connected to 127.0.0.1:8443 (354 bytes), seq=31 time=  0.01+  2.78+  0.61+  0.08+  0.01=  3.48 ms  toff=-357
--- https://*.*.moe:8443/health ping statistics ---
32 connects, 32 ok, 0.00% failed, time 3318ms
round-trip min/avg/max/sd = 3.3/3.5/5.2/0.4 ms
httping -SvY -t 2 -i 0.1 -g http://127.0.0.1:56398/health -c 32
connected to 127.0.0.1:56398 (67 bytes), seq=31 time=  0.01+  0.14+  0.03+  0.39+  0.04=  0.56 ms  toff=-742
--- http://127.0.0.1:56398/health ping statistics ---
32 connects, 32 ok, 0.00% failed, time 3217ms
round-trip min/avg/max/sd = 0.3/0.4/0.6/0.1 ms

Due to the potential performance issues with server_url, I tried concatenating the verify URL using listen_addr. However, I found that when TLS is enabled in Headscale, the URL https://127.0.0.1:8080/verify causes a certificate name validation failure.
In summary, I have decided to make the configuration option available.

I'm on the fence if an option for fail open should be included, my rational is that it should be impossible for it to not be, and if it is you have bigger problems.

I will remove this option later.

@kradalby
Copy link
Collaborator

Due to the potential performance issues with server_url, I tried concatenating the verify URL using listen_addr. However, I found that when TLS is enabled in Headscale, the URL 127.0.0.1:8080/verify causes a certificate name validation failure.
In summary, I have decided to make the configuration option available.
I fully agree that http+localhost is the way to go.

I think the only thing you need to figure out is how headscale runs, we should code it, not make it configurable, we just need to make sure it doesnt fail if someone for some reason only bound headscale to IPv6 version of localhost.

@seiuneko
Copy link
Author

@kradalby

I fully agree that http+localhost is the way to go.

The issue I found is that if Headscale is configured with TLS enabled, attempting to access Headscale using localhost triggers a TLS certificate error:

2024-11-27T19:29:42+08:00 DBG ../../home/seiuneko/forks/headscale/hscontrol/util/log.go:21 > derp: 127.0.0.1:60608: client nodekey:f02997a51a70df0fa45a878c1d1fc989f80b94c59e39bb06d0249249132e8c01 rejected: Post "https://0.0.0.0:56398/verify": tls: failed to verify certificate: x509: cannot validate certificate for 0.0.0.0 because it doesn't contain any IP SANs

One solution is to remind users in the documentation to map the headscale domain to localhost in their hosts file, but I don’t think this is ideal. I’m not sure if there’s a better solution.

we just need to make sure it doesnt fail if someone for some reason only bound headscale to IPv6 version of localhost.

I plan to concatenate the verification URL using listen_addr, which should not be an issue.

@kradalby
Copy link
Collaborator

@kradalby

I fully agree that http+localhost is the way to go.

The issue I found is that if Headscale is configured with TLS enabled, attempting to access Headscale using localhost triggers a TLS certificate error:

2024-11-27T19:29:42+08:00 DBG ../../home/seiuneko/forks/headscale/hscontrol/util/log.go:21 > derp: 127.0.0.1:60608: client nodekey:f02997a51a70df0fa45a878c1d1fc989f80b94c59e39bb06d0249249132e8c01 rejected: Post "https://0.0.0.0:56398/verify": tls: failed to verify certificate: x509: cannot validate certificate for 0.0.0.0 because it doesn't contain any IP SANs

I see, if headscale terminates the TLS, this is a problem yes, I really dont want to set up another port just for this purpose. I think one solution would be to try to make the http.Client overridable upstream, so we can inject our own that trusts that certificate, I can have a look at that tomorrow.

One solution is to remind users in the documentation to map the headscale domain to localhost in their hosts file, but I don’t think this is ideal. I’m not sure if there’s a better solution.

I dont think this is a viable solution.

we just need to make sure it doesnt fail if someone for some reason only bound headscale to IPv6 version of localhost.

I plan to concatenate the verification URL using listen_addr, which should not be an issue.

Good, thank you.

@kradalby
Copy link
Collaborator

We should probably have a section in the integration tests using this option.

@seiuneko
Copy link
Author

@kradalby
I came up with a slightly hacky solution by registering a custom protocol in http.DefaultTransport to intercept network requests. However, this method relies on tailscale.com/derp/derp_server using the default http.NewRequestWithContext when making requests; otherwise, the interception will fail. I don't think this is likely to change.

Do you think this solution is suitable? If so, I will add integration tests for it later.

Copy link
Collaborator

@kradalby kradalby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is neater, a couple of comments on renaming and restructuring, we probably dont want to use stuff made for testing in the prod path.

hscontrol/derp/server/derp_server.go Outdated Show resolved Hide resolved
hscontrol/derp/server/derp_server.go Outdated Show resolved Hide resolved
hscontrol/derp/server/derp_server.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@kradalby kradalby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just need some integration tests to verify that it works as intended.

config-example.yaml Show resolved Hide resolved
hscontrol/derp/server/derp_server.go Outdated Show resolved Hide resolved
@seiuneko seiuneko requested a review from kradalby December 3, 2024 13:45
@seiuneko seiuneko force-pushed the embedded-derp-verify-client branch from a7cb0e2 to 07a3314 Compare December 18, 2024 14:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants