Description
Hello all,
What is the cause of the error in the subject, and how can it be remediated?
I am simply running
taskset 4444 exasock
Hardware is ExaNIC X10
exanic is 2.5.0-1.e17
CentOS 7.5.1804
Thank you!
Update:
I reproduced the issue with an application that has two accelerated sockets:
one read-only and one read/write.
I start multiple instances on the same multicast groups, with the following results:
The first 6 instances have no issues.
The 7th generates the error inside bind.
[pid 23731] socket(AF_INET, 2050, 0) = 15
[pid 23731] setsockopt(15, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
[pid 23731] setsockopt(15, SOL_SOCKET, SO_RCVBUF, [4194304], 4) = 0
[pid 23731] bind(15, {sa_family=AF_INET, sin_port=htons(), sin_addr=inet_addr("0.0.0.0")}, 16
exasock warning: setting of SO_RCVBUF on accelerated socket is not effective
exasock_exanic_ip_dev_init: exanic_acquire_tx_buffer failed for dev exanic0
exasock_exanic_ip_dev_init: exanic_acquire_tx_buffer failed for dev exanic0
exasock_exanic_ip_dev_init: exanic_acquire_tx_buffer failed for dev exanic0
exasock_exanic_ip_dev_init: exanic_acquire_tx_buffer failed for dev exanic0
) = 0
Ignore the SO_RCVBUF warning; it is a no-op.
Instances 8 through 34 run without errors.
If I kill the 7th instance and start it again, it generates the exact same failure.
Update 2:
Firmware version 21 solved the problem for the 7th instance. However, with version 21, all processes after the 7th produce the error.
Update 3:
Solution:
The following code change eliminates the exanic_acquire_tx_buffer call in the multicast reader:
- set IP_ADD_MEMBERSHIP before bind
- use the explicit multicast address as sin_addr in bind instead of INADDR_ANY
Otherwise, exasock acquires a TX buffer for the multicast reader on every NIC on the box.