-
Notifications
You must be signed in to change notification settings - Fork 286
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: flaky tcp test #2369
fix: flaky tcp test #2369
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #2369 +/- ##
==========================================
+ Coverage 73.09% 73.14% +0.04%
==========================================
Files 149 149
Lines 30516 30516
==========================================
+ Hits 22305 22320 +15
+ Misses 8211 8196 -15 ☔ View full report in Codecov by Sentry. |
EDIT: ignore the trace, it is useless Trace of new error
not an expert on the tcp code, but it slowly fills up the buffer, while expecting more bytes. |
found a rare potentially unrelated crash: Details
|
this is how a successful run looks like: Details
|
fixed the rng and here is the diff between a successful run and a fail: --- runfail1.txt 2024-03-07 14:51:26.525643885 +0100
+++ runsucc.txt 2024-03-07 14:51:06.842938060 +0100
@@ -1,42 +1,72 @@
$ auto_tests/auto_TCP_test
[#0] DEBUG TCP_server.c:947 new_listening_tcp_socket: successfully bound to TCP port 13215
[#0] DEBUG TCP_server.c:947 new_listening_tcp_socket: successfully bound to TCP port 33445
[#0] DEBUG TCP_server.c:947 new_listening_tcp_socket: successfully bound to TCP port 25643
[#0] DEBUG network.c:1957 net_connect: connecting socket 6 to ::1:33445
[#0] TRACE network.c:799 loglogdata: [24 = <unknown> ] T=> 128= ::1:33445 (0: OK) | 0f0d97658ead7a0f...67
[#0] TRACE TCP_server.c:1074 do_incoming: handling incoming TCP connection 0
[#0] TRACE network.c:799 loglogdata: [24 = <unknown> ] =>T 128= (IP invalid, family 0):0 (0: OK) | 0f0d97658ead7a0f...67
[#0] TRACE network.c:799 loglogdata: [55 = <unknown> ] T=> 96= (IP invalid, family 0):0 (0: OK) | 5e05f0f2a5352257...2d
[#0] TRACE TCP_server.c:1112 do_unconfirmed: handling unconfirmed TCP connection 0
[#0] TRACE network.c:799 loglogdata: [55 = <unknown> ] =>T 96= ::1:33445 (0: OK) | 5e05f0f2a5352257...2d
[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] T=> 27= ::1:33445 (0: OK) | 19260fee4ef34458...1f
[#0] TRACE TCP_server.c:1112 do_unconfirmed: handling unconfirmed TCP connection 0
[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] =>T 2= (IP invalid, family 0):0 (0: OK) | 0000000000000000...19
[#0] TRACE network.c:799 loglogdata: [26 = <unknown> ] =>T 25= (IP invalid, family 0):0 (0: OK) | 0fee4ef34458c993...1f
[#0] TRACE TCP_server.c:717 handle_tcp_packet: handling ping for 0
[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] T=> 27= (IP invalid, family 0):0 (0: OK) | 19a15235169fbec4...cb
[#0] TRACE TCP_server.c:1138 tcp_process_secure_packet: processing packet for 0: 0
[#0] DEBUG network.c:1957 net_connect: connecting socket 13 to ::1:33445
[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] =>T 2= ::1:33445 (0: OK) | 0000000000000000...19
[#0] TRACE network.c:799 loglogdata: [a1 = <unknown> ] =>T 25= ::1:33445 (0: OK) | 5235169fbec430f1...cb
-[#0] TRACE network.c:791 loglogdata: [57 = <unknown> ] T=> 128E ::1:33445 (11: Resource temporarily unavailable) | 422bb14bf9e54680...8a
+[#0] TRACE network.c:799 loglogdata: [57 = <unknown> ] T=> 128= ::1:33445 (0: OK) | 422bb14bf9e54680...8a
[#0] TRACE TCP_server.c:1074 do_incoming: handling incoming TCP connection 1
-[#0] TRACE TCP_server.c:385 read_connection_handshake: connection handshake is not ready yet
+[#0] TRACE network.c:799 loglogdata: [57 = <unknown> ] =>T 128= (IP invalid, family 0):0 (0: OK) | 422bb14bf9e54680...8a
+[#0] TRACE network.c:799 loglogdata: [d2 = <unknown> ] T=> 96= (IP invalid, family 0):0 (0: OK) | 726d18e950da70ca...15
+[#0] TRACE TCP_server.c:1112 do_unconfirmed: handling unconfirmed TCP connection 1
[#0] TRACE TCP_server.c:1138 tcp_process_secure_packet: processing packet for 0: 0
-[#0] TRACE network.c:799 loglogdata: [57 = <unknown> ] T=> 128= ::1:33445 (0: OK) | 422bb14bf9e54680...8a
+[#0] TRACE network.c:799 loglogdata: [d2 = <unknown> ] =>T 96= ::1:33445 (0: OK) | 726d18e950da70ca...15
+[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] T=> 27= ::1:33445 (0: OK) | 19105a8fcc40a8cb...5d
<<<<<<<<<<<<<< start here
-[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] T=> 56= ::1:33445 (0: OK) | 36244303110a3f8f...d1
+[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] T=> 56= ::1:33445 (0: OK) | 36a1046c37b52550...7e
sob conn2 1
[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] T=> 51= ::1:33445 (0: OK) | 315a8f49b3a39036...44
srr conn 1
-[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] T=> 51= ::1:33445 (0: OK) | 31617933b8cf4d09...09
+[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] T=> 51= ::1:33445 (0: OK) | 319181538a199080...22
srr conn2 1
-[#0] TRACE TCP_server.c:1074 do_incoming: handling incoming TCP connection 1
-[#0] TRACE network.c:799 loglogdata: [57 = <unknown> ] =>T 128= (IP invalid, family 0):0 (0: OK) | 422bb14bf9e54680...8a
-[#0] TRACE network.c:799 loglogdata: [d2 = <unknown> ] T=> 96= (IP invalid, family 0):0 (0: OK) | 726d18e950da70ca...15
[#0] TRACE TCP_server.c:1112 do_unconfirmed: handling unconfirmed TCP connection 1
+[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] =>T 2= (IP invalid, family 0):0 (0: OK) | 0000000000000000...19
+[#0] TRACE network.c:799 loglogdata: [10 = <unknown> ] =>T 25= (IP invalid, family 0):0 (0: OK) | 5a8fcc40a8cba540...5d
+[#0] TRACE TCP_server.c:717 handle_tcp_packet: handling ping for 3
+[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] T=> 27= (IP invalid, family 0):0 (0: OK) | 19f59dd151e997c8...6f
+[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] =>T 2= (IP invalid, family 0):0 (0: OK) | 0000000000000000...31
+[#0] TRACE network.c:799 loglogdata: [5a = GC_HANDSHAKE ] =>T 49= (IP invalid, family 0):0 (0: OK) | 8f49b3a39036aff4...44
+[#0] TRACE TCP_server.c:1138 tcp_process_secure_packet: processing packet for 0: 33
+[#0] TRACE TCP_server.c:690 handle_tcp_packet: handling routing request for 0
+[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] T=> 52= (IP invalid, family 0):0 (0: OK) | 325ec6825af21172...e7
+[#0] TRACE TCP_server.c:1138 tcp_process_secure_packet: processing packet for 0: 0
[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] =>T 2= (IP invalid, family 0):0 (0: OK) | 0000000000000000...36
-[#0] TRACE network.c:799 loglogdata: [24 = <unknown> ] =>T 54= (IP invalid, family 0):0 (0: OK) | 4303110a3f8f8183...d1
-[#0] ERROR TCP_common.c:311 read_packet_tcp_secure_connection: decrypted length 15 does not match expected length 54
-Aborting test program
-Aborted (core dumped)
+[#0] TRACE network.c:799 loglogdata: [a1 = <unknown> ] =>T 54= (IP invalid, family 0):0 (0: OK) | 046c37b52550cb09...7e
+[#0] TRACE TCP_server.c:1138 tcp_process_secure_packet: processing packet for 3: 38
+[#0] TRACE TCP_server.c:752 handle_tcp_packet: handling oob send for 3
+[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] T=> 56= (IP invalid, family 0):0 (0: OK) | 368fac8735926ab1...20
+[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] =>T 2= (IP invalid, family 0):0 (0: OK) | 0000000000000000...31
+[#0] TRACE network.c:799 loglogdata: [91 = FORWARDING ] =>T 49= (IP invalid, family 0):0 (0: OK) | 81538a1990805e3c...22
+[#0] TRACE TCP_server.c:1138 tcp_process_secure_packet: processing packet for 3: 33
+[#0] TRACE TCP_server.c:690 handle_tcp_packet: handling routing request for 3
+[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] T=> 52= (IP invalid, family 0):0 (0: OK) | 32ac530bde428601...3c
+[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] T=> 20= (IP invalid, family 0):0 (0: OK) | 12a726457fd8387f...06
+[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] T=> 20= (IP invalid, family 0):0 (0: OK) | 1215fa85b4f99b7b...3a
+[#0] TRACE TCP_server.c:1138 tcp_process_secure_packet: processing packet for 3: 0
+[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] =>T 2= ::1:33445 (0: OK) | 0000000000000000...32
+[#0] TRACE network.c:799 loglogdata: [5e = <unknown> ] =>T 50= ::1:33445 (0: OK) | c6825af21172d71b...e7
+[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] =>T 2= ::1:33445 (0: OK) | 0000000000000000...36
+[#0] TRACE network.c:799 loglogdata: [8f = <unknown> ] =>T 54= ::1:33445 (0: OK) | ac8735926ab1dcdc...20
+[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] =>T 2= ::1:33445 (0: OK) | 0000000000000000...12
+[#0] TRACE network.c:799 loglogdata: [15 = <unknown> ] =>T 18= ::1:33445 (0: OK) | fa85b4f99b7bcdb2...3a
+[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] =>T 2= ::1:33445 (0: OK) | 0000000000000000...19
+[#0] TRACE network.c:799 loglogdata: [f5 = <unknown> ] =>T 25= ::1:33445 (0: OK) | 9dd151e997c837d5...6f
+[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] =>T 2= ::1:33445 (0: OK) | 0000000000000000...32
+[#0] TRACE network.c:799 loglogdata: [ac = <unknown> ] =>T 50= ::1:33445 (0: OK) | 530bde428601cfea...3c
+[#0] TRACE network.c:799 loglogdata: [00 = PING_REQUEST ] =>T 2= ::1:33445 (0: OK) | 0000000000000000...12
+[#0] TRACE network.c:799 loglogdata: [a7 = <unknown> ] =>T 18= ::1:33445 (0: OK) | 26457fd8387fb57b...06 |
0453f7f
to
4f2c59b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r1.
Reviewable status: complete! 1 of 1 approvals obtained
This only fixes the symptoms, not the real problem. Sometimes or consistently on some platforms a socket might need a moment before it can be written to.
UPDATE: I am not pretty sure that the fail occurs when the creation of tcp sockets takes a while, because other processes need to run first. In my case its the firewall (eg
opensnitch
) which needs to check and allow the connection first.Adding a sleep of 50ms after every tcp connection creation fixes the test case, but it might still point to real error, because waiting longer after the actions and before the asserts does not make it recover.
This change is