Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VOIP call initiation fails #111

Closed
jorgejones7711 opened this issue Jun 24, 2020 · 7 comments · Fixed by #112
Closed

VOIP call initiation fails #111

jorgejones7711 opened this issue Jun 24, 2020 · 7 comments · Fixed by #112

Comments

@jorgejones7711
Copy link

Hello thank you for the great little firewall. I have been using qubes mirage firewall 0.7.1 for several days and everything seems to be working great. Today I found something. I use voip software over a udp vpn. The vpn has been working fine with no problems. The client is able to log into the voip server without problem as indicated with a green circle (online). However, when I try to initiate a call, it tries for about 10 seconds then fails. On sys-firewall it works normally. I have gone back and forth several times to verify that it works on sys-firewall, but does not work on qmf071. I also tried a different voip client and it is the same. It is strange because since it is over a vpn the firewall should not treat the packets differently. But the problem is real. It is the only problem I have found.

@talex5
Copy link
Collaborator

talex5 commented Jun 28, 2020

Does anything appear in the logs when this happens?

(you can view the logs using Qubes Manager : select the firewall, then Qube -> Logs -> guest-mirage-firewall.log)

@jorgejones7711
Copy link
Author

jorgejones7711 commented Jun 29, 2020

Thank you for your response @talex5
I verified again that in both cases, the vpn connects, and the voip client logs in successfully to the VOIP server. It is still true that with sys-firewall a test call connects successfully, but with mirage-firewall it times out.

Sys-net is 10.137.0.4, and the vpn attached to mirage-firewall is 10.137.0.27.
The related mtu size is 1500 on each. The sys-firewall log is posted after the mirage-firewall log in case it is helpful.

Here are the contents of guest-mirage-firewall.log:
Edit: deleted unneccesary log.

@talex5
Copy link
Collaborator

talex5 commented Jun 29, 2020

I guess WRN [firewall] Failed to write packet to 10.137.0.4: (Failure "length exceeds size") is the problem. Failed to write packet is (probably) logged at

Log.warn (fun f -> f "Failed to write packet to %a: %s"
Ipaddr.V4.pp iface#other_ip
(Printexc.to_string ex));

length exceeds size comes from https://github.com/mirage/mirage-net-xen/blob/581196b7be3ae14587dccb4605547087e88ca9a0/lib/frontend.ml#L313, which is a sanity check that fails if the user of the network device sends more data than they said they would.

The Ethernet layer writes to the network device here: https://github.com/mirage/ethernet/blob/master/src/ethernet.ml#L79
We don't seem to be specifying a size, in which case it defaults to the MTU, which is the largest permitted size.

mirage-net-xen should probably use Cstruct.sub to restrict the callback to just the area it asked for, rather than letting it write the whole block and then complaining. That might produce a better error. You could try looking at one of the failed packets in wireshark in the sending VM. Maybe the sending VM has a different idea of the MTU, or maybe it's sending valid packets but the firewall is making them larger when doing NAT? Who is familiar with this new size-and-callback system?

@hannesm
Copy link
Member

hannesm commented Jun 29, 2020

Who is familiar with this new size-and-callback system?

I am. Notes in https://mirage.io/blog/announcing-mirage-35-release#TCP-IP.

The short overview is that the target (network device) knows best about alignment restrictions and MTU. This is why the send this buffer API changed. Briefly, a higher layer calls "send" with an argument "size" (how much data to be sent by the upper layer, if known), and a callback function "fill" (buffer -> int) which receives the buffer, fills it, and returns the length which was used.

This way, the ethernet layer gets the size argument from the IP layer, adds 14, and calls the network send. In the callback, it writes source and destination addresses (and protocol type) into the buffer, and calls the IP layer fill function with the buffer shifted to where the IP header should start.

Now, considering this issue report, I suspect there's something wrong. What would help is a pcap of one of the packets sent by the client, and not forwarded to sys-net.

the firewall is making them larger when doing NAT?

No, NAT does not increase packet sizes. But our NAT library now reassembles and fragments packets, so we should ensure that the right MTU is passed (to remind myself, the MTU in our stack is different for each layer since the (static) header is subtracted).

Cstruct.sub to restrict the callback to just the area it asked for, rather than letting it write the whole block and then complaining.

Yes, this is a good idea. To fix this issue, it'd be also worth to have the numbers in the log messages (sorry for them being slightly useless since they do not contain any data).

@hannesm
Copy link
Member

hannesm commented Jul 1, 2020

Dear @jorgejones7711, I was able to locally reproduce the issue with some icmp packets. I developed a fix (in mirage-net-xen, see the reference above if of interest). Could you check that this works for you? I compiled and uploaded a new firewall virtual machine image to https://data.robur.coop/qubes-firewall-0.7.1-mtu-fixed.xen -- if you download this and put it in your dom0 as /var/lib/qubes/vm-kernel/mirage-firewall/vmlinuz and then start (or restart) your mirage-firewall and test your VPN + VoIP setup, and report back here, that'd be great.

@jorgejones7711
Copy link
Author

It works! Thank you @hannesm and @talex5! Please close this when the next release is out.

hannesm added a commit to mirage/mirage-net-xen that referenced this issue Jul 1, 2020
…98)

* In MirageOS / mirage-net, the MTU is defined as the Ethernet payload -- adjust to 1500
* pass only the requested size of the buffer to the fill function (discovered in mirage/qubes-mirage-firewall#111)
kit-ty-kate pushed a commit to ocaml/opam-repository that referenced this issue Jul 2, 2020
CHANGES:

* MirageOS (mirage-net) defines the MTU as the link-level payload size, adjust from 1514 to 1500 (@hannesm, mirage/mirage-net-xen#98)
* Only pass the sub-buffer of requested size to the fill function (solves mirage/qubes-mirage-firewall#111, @hannesm, mirage/mirage-net-xen#98)
* listen: do not catch out of memory exception (@hannesm, mirage/mirage-net-xen#97)
@jorgejones7711
Copy link
Author

Want to say it's still working great, never a problem. Thanks guys. I hope this get mainlined into qubes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants