[neighbor] Preserve default lo neighbor to prevent the redis operation failure when arp cache overflow #22038

Open
wants to merge 1 commit into
base: master

gord1306

Why I did it

Fixes issue #2189.
The neighbor garbage collector was clearing the default neighbor (0.0.0.0) on the loopback interface, causing Redis operations to fail and the SONiC CLI to hang. Syslog entries showed errors like:

Mar 14 10:17:38.* INFO kernel: neighbour: arp_cache: neighbor table overflow!

and netlink messages (captured with ip -ts monitor) confirmed the deletion:

[2025-03-14T10:11:14.214473] Deleted 0.0.0.0 dev lo lladdr 00:00:00:00:00:00 NOARP
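To spot this deletion while reproducing the problem, the monitor output can be filtered down to the one event of interest. The following is only a sketch: `lo_default_deleted` is a hypothetical helper name, and it assumes the monitor lines have the same shape as the sample above.

```shell
# Hypothetical helper: reads `ip -ts monitor` style lines on stdin and
# prints only those that report deletion of the 0.0.0.0 entry on lo.
lo_default_deleted() {
    grep -E 'Deleted 0\.0\.0\.0 dev lo( |$)'
}

# Usage on a device (needs iproute2; runs until interrupted):
#   ip -ts monitor neigh | lo_default_deleted
```

When used on a live pipe, GNU grep may buffer its output; adding `--line-buffered` to the grep call is one way to see matches immediately.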

This commit fixes the issue by adding a permanent neighbor entry, which the neighbor garbage collector does not evict, using:

ip neigh replace 0.0.0.0 dev lo lladdr 00:00:00:00:00:00 nud permanent
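Whether the entry is actually in the permanent state can be checked from the neighbor listing. This is a minimal sketch rather than part of the change itself: `lo_default_is_permanent` is a hypothetical helper, and it assumes the listing comes from `ip -4 neigh show dev lo nud all`, where each line starts with the address and contains the state keyword.

```shell
# Hypothetical helper: reads `ip -4 neigh show dev lo nud all` output on stdin
# and succeeds only if the 0.0.0.0 entry is in the PERMANENT state.
lo_default_is_permanent() {
    awk '$1 == "0.0.0.0" && /(^| )PERMANENT( |$)/ { found = 1 }
         END { exit !found }'
}

# Usage on a device (root is required for the replace):
#   ip -4 neigh show dev lo nud all | lo_default_is_permanent ||
#       ip neigh replace 0.0.0.0 dev lo lladdr 00:00:00:00:00:00 nud permanent
```

The `nud all` filter is used because entries in the NOARP state are hidden from the default listing, so the check sees the entry both before and after the fix.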

Work item tracking
  • Microsoft ADO (number only):

How I did it

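Reproduced the problem by continuously generating gratuitous ARP to trigger the neighbor table overflow, then added the permanent neighbor entry for 0.0.0.0 on lo described above.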

How to verify it

  • Adjust the ARP settings under /proc/sys/net/ipv4/conf/* so that gratuitous ARP is accepted:
echo 0 > /proc/sys/net/ipv4/conf/all/arp_ignore
echo 1 > /proc/sys/net/ipv4/conf/Ethernet16/arp_accept
  • Continuously generate gratuitous ARP until the neighbor table overflows.
  • Confirm that the SONiC CLI still returns output without hanging (e.g., via "show vlan brief").
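Since the failure shows up as a hanging CLI, the last check is easier to automate with a time limit. The sketch below assumes the coreutils `timeout` command is available on the device. `cli_responds` is a hypothetical helper name, and the 10-second limit in the usage comment is an arbitrary choice.

```shell
# Hypothetical helper: run a command with a time limit (in seconds) and
# report whether it finished successfully before the limit expired.
cli_responds() {
    limit="$1"
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
}

# Usage on a device while gratuitous ARP is being generated:
#   cli_responds 10 show vlan brief && echo "CLI OK" || echo "CLI hung or failed"
```

A non-zero result covers both a hang (the time limit was hit) and an ordinary command failure, so either outcome is flagged.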

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

Signed-off-by: gord_chen <gord_chen@edge-core.com>
@gord1306 gord1306 requested a review from lguohan as a code owner March 14, 2025 02:56
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage


Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.
