Fix timeout bug on very long command output #687

Open

Elektordi wants to merge 1 commit into main

SUMMARY

I was having timeout problems on IOS XE and IOS XR devices with very long configurations (more than 50k lines / 2 MB). On those same devices, a "show running-config" (with the pager disabled) over a plain OpenSSH client takes about 10-15 seconds.
Even for smaller configurations (20k lines / 700 KB), I had to raise command_timeout to 300 for my Ansible playbook to succeed.

After some debugging, I found that the problem is the regex search for errors and prompts in network_cli: it runs against the full buffer after every 4096 bytes read from the wire, so the total parsing time grows roughly quadratically with the output size.
My change restricts the regex search to the last 1 KB of the buffer; I think there is no chance of a prompt or error message being longer than 1 KB.
It prevents the timeout issues and also greatly speeds up many tasks on those devices.
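To illustrate the idea (this is a minimal sketch, not the actual patch), the tail-limited search could look roughly like the following. The names `TAIL_BYTES`, `prompt_in_tail` and `receive_until_prompt` are hypothetical and do not exist in network_cli; the chunks are passed in as a plain iterable instead of being read from an SSH channel.

```python
import re

# Hypothetical constant: only this many trailing bytes are searched.
TAIL_BYTES = 1024


def prompt_in_tail(buffer, prompt_regexes, tail_bytes=TAIL_BYTES):
    """Return True if any compiled bytes regex matches the end of the buffer."""
    # Slicing copies at most tail_bytes bytes, so each search has a bounded
    # cost no matter how large the accumulated buffer is.
    window = bytes(buffer[-tail_bytes:])
    return any(rx.search(window) for rx in prompt_regexes)


def receive_until_prompt(chunks, prompt_regexes, tail_bytes=TAIL_BYTES):
    """Accumulate chunks until a prompt shows up in the tail of the buffer.

    Returns the full output as bytes, or None if no prompt was seen.
    """
    buffer = bytearray()
    for chunk in chunks:
        buffer += chunk
        if prompt_in_tail(buffer, prompt_regexes, tail_bytes):
            return bytes(buffer)
    return None


if __name__ == "__main__":
    # Illustrative prompt pattern only; real platforms define their own.
    prompts = [re.compile(rb"[\w.-]+#\s*$")]
    chunks = [b"interface Gi0/1\n" * 300, b" description x\nrouter1#"]
    output = receive_until_prompt(chunks, prompts)
    print(len(output) if output is not None else "no prompt found")
```

One caveat with this approach: a pattern anchored to the start of a line could behave differently when the window begins in the middle of a line, so the limit should stay comfortably larger than any expected prompt or error message.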

For now, as a proof of concept, I hardcoded the limit in the source code, but if someone thinks it would benefit from being a configurable option, I can look into that.

My initial issue: https://forum.ansible.com/t/ios-config-and-iosxr-config-for-very-long-configs-50k-lines/40757

ISSUE TYPE
  • Bugfix Pull Request
COMPONENT NAME

network_cli

ADDITIONAL INFORMATION

Before changes:

2025-03-05 16:44:30,117 p=49599 u=elektordi n=ansible | jsonrpc request: b'{"jsonrpc": "2.0", "method": "run_commands", "id": "05b1a0f7-642a-4a03-92cb-ae5186b37d5c", "params": [[], {"commands": ["show running-config"], "check_rc": false}]}'
2025-03-05 16:49:30,642 p=49599 u=elektordi n=ansible | command timeout triggered, timeout value is 300 secs.
See the timeout setting options in the Network Debug and Troubleshooting Guide.
2025-03-05 16:49:30,644 p=49599 u=elektordi n=ansible | Traceback (most recent call last):
  File "/home/elektordi/.local/lib/python3.10/site-packages/ansible/utils/jsonrpc.py", line 45, in handle_request
    result = rpc_method(*args, **kwargs)
  File "/home/elektordi/.ansible/collections/ansible_collections/cisco/ios/plugins/cliconf/ios.py", line 537, in run_commands
    out = self.send_command(**cmd)
  File "/home/elektordi/.ansible/collections/ansible_collections/ansible/netcommon/plugins/plugin_utils/cliconf_base.py", line 148, in send_command
    resp = self._connection.send(**kwargs)
  File "/home/elektordi/.ansible/collections/ansible_collections/ansible/netcommon/plugins/connection/network_cli.py", line 345, in wrapped
    return func(self, *args, **kwargs)
  File "/home/elektordi/.ansible/collections/ansible_collections/ansible/netcommon/plugins/connection/network_cli.py", line 1005, in send
    response = self.receive(
  File "/home/elektordi/.ansible/collections/ansible_collections/ansible/netcommon/plugins/connection/network_cli.py", line 946, in receive
    response = self.receive_libssh(
  File "/home/elektordi/.ansible/collections/ansible_collections/ansible/netcommon/plugins/connection/network_cli.py", line 909, in receive_libssh
    if self._find_prompt(resp):
  File "/home/elektordi/.ansible/collections/ansible_collections/ansible/netcommon/plugins/connection/network_cli.py", line 1152, in _find_prompt
    match = stdout_regex.search(response)
  File "/home/elektordi/.local/lib/python3.10/site-packages/ansible/cli/scripts/ansible_connection_cli_stub.py", line 186, in command_timeout
    raise Exception(msg)
Exception: command timeout triggered, timeout value is 300 secs.
See the timeout setting options in the Network Debug and Troubleshooting Guide.

After change: OK, no error

I also did some timing checks on some devices using this command:
time ansible -vvvv -i inventory.yaml --playbook-dir . <DEVICE> -m cisco.ios.ios_command -a '{"commands":["sh runn"]}' -c ansible.netcommon.network_cli

Device with config of 55967 lines / 2188377 bytes
Before: Timeout
After: Ok in 25s

Device with config of 24651 lines / 717168 bytes
Before: Ok in 1m16s
After: Ok in 13s

Device with config of 2740 lines / 67080 bytes
Before: Ok in 13s
After: Ok in 11s
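
As a rough back-of-the-envelope check of why the gap widens with config size, the sketch below counts how many bytes a single regex would have to scan in total under each strategy. It assumes every read returns exactly 4096 bytes and that search cost is proportional to the bytes scanned; `bytes_scanned` is a hypothetical helper, and the results are estimates, not measurements.

```python
def bytes_scanned(total_bytes, chunk_size=4096, tail_bytes=None):
    """Total bytes one regex scans while receiving total_bytes of output.

    With tail_bytes=None the whole buffer is rescanned after each chunk;
    otherwise only the last tail_bytes bytes are scanned each time.
    """
    scanned = 0
    received = 0
    while received < total_bytes:
        received = min(received + chunk_size, total_bytes)
        window = received if tail_bytes is None else min(received, tail_bytes)
        scanned += window
    return scanned


if __name__ == "__main__":
    # Config sizes taken from the timing checks above.
    for size in (67080, 717168, 2188377):
        full = bytes_scanned(size)
        tail = bytes_scanned(size, tail_bytes=1024)
        print(f"{size:>8} bytes: full buffer {full:>12,} / 1 KB tail {tail:>8,}")
```

The full-buffer total grows with the square of the output size, while the tail-limited total grows linearly. The real plugin applies several patterns per chunk, so the actual cost would be a multiple of these figures.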
