
traceroute failing #274

Open
astlaurent opened this issue Jul 15, 2024 · 7 comments
Labels: possible-bug (Something isn't working)
@astlaurent

Deployment Type

Docker

Version

v2.0.4

Steps to Reproduce

I am seeing this with the built-in XR directive, with a custom directive, and with Juniper. Traceroutes to quite a few internet destinations fail, and debug shows "pattern not detected". This seems to happen if there is a timeout along the path.

Expected Behavior

Traceroute output is displayed properly.

Observed Behavior

An error is displayed instead of the traceroute output.

Configuration

No response

Devices

No response

Logs

hyperglass-1  | [DEBUG] 20240715 20:11:46 |51 | collect → Connecting to device {'device': 'BEL - Bellevue, NE', 'address': 'None:None', 'proxy': None}
hyperglass-1  | [CRITICAL] 20240715 20:11:57 |48 | default_handler → Error {'method': 'POST', 'path': '/api/query', 'detail': "\nPattern not detected: 'RP/0/RSP0/CPU0:DEVICE\\\\#' in output.\n\nThings you might try to fix this:\n1. Explicitly set your pattern using the expect_string argument.\n2. Increase the read_timeout to a larger value.\n\nYou can also look at the Netmiko session_log or debug log for more information.\n\n"}
hyperglass-1  | ERROR - 2024-07-15 20:11:57,971 - litestar - config - Uncaught exception (connection_type=http, path=/api/query):
hyperglass-1  | Traceback (most recent call last):
hyperglass-1  |   File "/usr/local/lib/python3.12/site-packages/litestar/middleware/_internal/exceptions/middleware.py", line 159, in __call__
hyperglass-1  |     await self.app(scope, receive, capture_response_started)
hyperglass-1  |   File "/usr/local/lib/python3.12/site-packages/litestar/routes/http.py", line 80, in handle
hyperglass-1  |     response = await self._get_response_for_request(
hyperglass-1  |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
hyperglass-1  |   File "/usr/local/lib/python3.12/site-packages/litestar/routes/http.py", line 132, in _get_response_for_request
hyperglass-1  |     return await self._call_handler_function(
hyperglass-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
hyperglass-1  |   File "/usr/local/lib/python3.12/site-packages/litestar/routes/http.py", line 152, in _call_handler_function
hyperglass-1  |     response_data, cleanup_group = await self._get_response_data(
hyperglass-1  |                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
hyperglass-1  |   File "/usr/local/lib/python3.12/site-packages/litestar/routes/http.py", line 200, in _get_response_data
hyperglass-1  |     else await route_handler.fn(**parsed_kwargs)
hyperglass-1  |          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
hyperglass-1  |   File "/opt/hyperglass/hyperglass/api/routes.py", line 111, in query
hyperglass-1  |     output = await execute(data)
hyperglass-1  |              ^^^^^^^^^^^^^^^^^^^
hyperglass-1  |   File "/opt/hyperglass/hyperglass/execution/main.py", line 67, in execute
hyperglass-1  |     response = await driver.collect()
hyperglass-1  |                ^^^^^^^^^^^^^^^^^^^^^^
hyperglass-1  |   File "/opt/hyperglass/hyperglass/execution/drivers/ssh_netmiko.py", line 92, in collect
hyperglass-1  |     raw = nm_connect_direct.send_command(query, **send_args)
hyperglass-1  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
hyperglass-1  |   File "/usr/local/lib/python3.12/site-packages/netmiko/utilities.py", line 592, in wrapper_decorator
hyperglass-1  |     return func(self, *args, **kwargs)
hyperglass-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
hyperglass-1  |   File "/usr/local/lib/python3.12/site-packages/netmiko/base_connection.py", line 1721, in send_command
hyperglass-1  |     raise ReadTimeout(msg)
hyperglass-1  | netmiko.exceptions.ReadTimeout: 
hyperglass-1  | Pattern not detected: 'RP/0/RSP0/CPU0:DEVICE\\#' in output.
hyperglass-1  | 
hyperglass-1  | Things you might try to fix this:
hyperglass-1  | 1. Explicitly set your pattern using the expect_string argument.
hyperglass-1  | 2. Increase the read_timeout to a larger value.
hyperglass-1  | 
hyperglass-1  | You can also look at the Netmiko session_log or debug log for more information.
hyperglass-1  | 
hyperglass-1  | 
hyperglass-1  | [INFO] 20240715 20:11:57 |1762 | callHandlers → 172.19.0.1:57876 - "POST /api/query HTTP/1.0" 500 {}
hyperglass-1  | [CRITICAL] 20240715 20:12:15 |34 | __init__ → Request timed out. (Connection timed out) {}
hyperglass-1  | ERROR - 2024-07-15 20:12:15,382 - asyncio - runners - Exception in callback Loop._read_from_self
hyperglass-1  | handle: <Handle Loop._read_from_self>
hyperglass-1  | Traceback (most recent call last):
hyperglass-1  |   File "uvloop/cbhandles.pyx", line 66, in uvloop.loop.Handle._run
hyperglass-1  |   File "uvloop/loop.pyx", line 397, in uvloop.loop.Loop._read_from_self
hyperglass-1  |   File "uvloop/loop.pyx", line 402, in uvloop.loop.Loop._invoke_signals
hyperglass-1  |   File "uvloop/loop.pyx", line 377, in uvloop.loop.Loop._ceval_process_signals
hyperglass-1  |   File "/opt/hyperglass/hyperglass/execution/main.py", line 41, in handler
hyperglass-1  |     raise DeviceTimeout(**exc_args)
hyperglass-1  | hyperglass.exceptions.public.DeviceTimeout: Request timed out. (Connection timed out)
astlaurent added the possible-bug label on Jul 15, 2024
@NaumanNahian

I'm experiencing the same issue with the same Docker image for Arista devices, not just with traceroute but also with other commands that take a bit longer to finish.

I resolved the issue by increasing the read_timeout, adding `send_args['read_timeout'] = 120` in `ssh_netmiko.py`.
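
For context, a rough sketch of where that line sits relative to the send_command() call visible in the traceback above (the surrounding ssh_netmiko.py code is paraphrased and may differ between versions; 120 seconds is an assumed value):

    # Paraphrased sketch of collect() in ssh_netmiko.py; only the read_timeout
    # entry is new. Netmiko's send_command() accepts read_timeout as a keyword.
    send_args['read_timeout'] = 120  # assumed value, matched to the config.yaml timeout
    # ...existing driver code builds the Netmiko connection and the rest of send_args...
    raw = nm_connect_direct.send_command(query, **send_args)  # line 92 in the traceback above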

@astlaurent
Author

send_args['read_timeout'] = 120

Thanks. This worked well. I needed to modify that and the timeout in the config.yaml to the same value. The ssh_netmiko.py file is embedded in the Docker image, so I had to modify the file and re-commit the Docker image.

@thatmattlove you should make this value configurable, or always have it match the timeout value in the config file. The value simply tells Netmiko the maximum time to wait for the prompt to come back; the default is about 10 seconds, which is not long enough for traceroutes.
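
For reference, a generic, standalone Netmiko illustration (not hyperglass code; the host and credentials are placeholders) of what read_timeout controls and of the ReadTimeout exception seen in the logs above:

    # Generic Netmiko example: read_timeout is the maximum time send_command()
    # waits for the prompt pattern before raising ReadTimeout.
    from netmiko import ConnectHandler
    from netmiko.exceptions import ReadTimeout

    conn = ConnectHandler(
        device_type="cisco_xr",
        host="192.0.2.1",       # placeholder device
        username="user",        # placeholder credentials
        password="password",
    )
    try:
        output = conn.send_command("traceroute 2001:db8::1", read_timeout=120)
        print(output)
    except ReadTimeout:
        print("Prompt not seen within read_timeout; slow traceroutes need a larger value")
    finally:
        conn.disconnect()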

@Dinokinni

Hey guys
I tried this solution but it doesn't work for me.
Can you show me where you place this argument?
Thanks

@astlaurent
Author

astlaurent commented Jul 24, 2024

If it is running on Docker, you need to change the file in the Docker image, not in the app directory on the OS. If you are not running Docker, it is enough to just add the value in the app directory. Here are the instructions I documented to assist:

  • Make sure the service is started
  • Enter the Docker container shell:
    sudo docker exec -it hyperglass-hyperglass-1 sh
  • Edit the Netmiko driver file:
    vi /opt/hyperglass/hyperglass/execution/drivers/ssh_netmiko.py
  • Add the following line at line 56, then save and exit the file:
    send_args['read_timeout'] = 120
  • Type exit to leave the Docker environment
  • Get the Docker container ID:
    sudo docker ps -a
  • Copy the container ID for hyperglass-hyperglass
  • Commit the Docker changes:
    sudo docker commit <ID> hyperglass-hyperglass
  • Restart the service

@Dinokinni

Thank you very much.
Now it works perfectly.
I wasted all last week working on this.

@umiseaz

umiseaz commented Jul 30, 2024

> If it is running on Docker, you need to change the file in the Docker image, not in the app directory on the OS. Here are the instructions I documented to assist: […]

Previously, IPv6 traces to 2600:: and 2a11:: were not working (for Juniper).
I followed this guide and the tips provided, and now it is working fine.
Thanks

@gondimcodes

Hi,
There is also an error with IPv6 traceroute for Huawei. The same command used for IPv4 is also being executed for IPv6, which causes an error.

/opt/hyperglass/hyperglass/defaults/directives/huawei.py:

        command="tracert -q 2 -f 1 -a {source6} {target}",

The correction:

        command="tracert ipv6 -q 2 -f 1 -a {source6} {target}",
