dora node packet loss #515
Comments
For the latency, could you try removing the sleep here: https://github.com/dora-rs/autoware.universe/blob/db29af5314dbe4c3c9c5c79ab78e33f2a8bbd327/localization/ekf_localizer/src/gnss_ekf.cc#L65 |
The sleep is in the "EKF_fusion_pthread" thread, which sends data and has nothing to do with data reception. The packet loss occurs when receiving data from other nodes in the function "int run(void *dora_context)". |
Dora uses a default queue size of 10 for each input. If the receiver cannot keep up with the sender, dora deliberately drops some inputs (oldest first). You can set a custom queue size in your YAML file, e.g. lines 5 to 8 of dora/examples/python-dataflow/dataflow.yml at ea47a55.
|
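The oldest-first dropping described above can be illustrated with a small sketch. This is not dora's implementation; it only mimics the described behaviour with Python's `collections.deque`. The `InputQueue` class and its `dropped` counter are hypothetical names introduced for illustration.

```python
from collections import deque


class InputQueue:
    """Bounded queue that discards the oldest item when full (illustrative only)."""

    def __init__(self, queue_size=10):
        self.items = deque(maxlen=queue_size)
        self.dropped = 0  # number of inputs discarded so far

    def push(self, item):
        # A full deque with maxlen evicts its oldest element on append.
        if len(self.items) == self.items.maxlen:
            self.dropped += 1
        self.items.append(item)

    def pop(self):
        return self.items.popleft()


# The sender emits 25 messages before the receiver reads anything.
queue = InputQueue(queue_size=10)
for seq in range(25):
    queue.push(seq)

remaining = list(queue.items)
print(queue.dropped)  # 15
print(remaining[0])   # 15
```

With a queue size of 10, only the 10 newest messages survive, so a receiver that falls behind sees a jump in sequence numbers rather than an error.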
Hu also tested with a queue size of 100K today and can reproduce this issue. |
Thanks for the details! Could you please also share the full output of the terminal on the top left? I see that there are some errors in the output, so the full content might help with debugging this. Also, setting |
The error in the upper-left terminal occurs because we did not use a real GPS sensor during debugging; instead, we simulated GPS data from a txt file. The error is generated once all the data in the txt file has been read and the node exits, so it should not affect the data loss. The details can be found on line 26 of CGI_610_driver_dora_with_file.py (https://github.com/dora-rs/autoware.universe/blob/feature/autoware_dora/dora-hardware/vendors/gnss/CGI_610/CGI_610_driver_dora_with_file.py) |
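I just noticed that you added the queue_size parameter as a child of the custom: object. It has no effect there. Instead, you need to add it as a child field to an input. |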
For debugging, it would be useful to have a minimal example that still triggers this issue. Could you try the following:
After these steps, we would ideally have a very small example that we can easily try on our machines without digging into the autoware build details. |
Hello, this is the complete reproduction process for the above example: https://github.com/dora-rs/autoware.universe/blob/feature/autoware_dora/localization/ekf_localizer/Packet_loss_test.md |
The second link is the minimal example. The link for all examples is https://github.com/dora-rs/autoware.universe/tree/feature/autoware_dora/tools
|
Here are some new results for #515 after making the changes from Philip's comment: "I just noticed that you added the queue_size parameter as a child of the custom: object. It has no effect there. Instead, you need to add it as a child field to an input." As you can see, that change does not solve the issue.
|
The easiest way to reproduce the issue is this minimal test example: https://github.com/dora-rs/autoware.universe/tree/feature/autoware_dora/tools/C_Python_test |
Ok, so I tried your example and I can reproduce the dropped messages. If you spawn the coordinator and daemon manually and set When I increase the
The reason why this happens mostly when combining a sending C node with a Python receiver is probably that Python is slower, so it cannot keep up with the inputs. |
I think a change such as #533 would make it more obvious what's happening. |
After further investigation, there was no packet loss; the problem was a C++ string without null termination. |
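The thread does not show the faulty code, so the following is only a sketch of the general failure mode, not the actual bug. It uses Python's `ctypes` to mimic a C buffer; the buffer contents and the `seq=...` message format are hypothetical. When a payload is copied without a terminating NUL and then read as a C string, stale bytes from an earlier message can leak into the value, which can make a sequence number look wrong.

```python
import ctypes

# A 32-byte receive buffer that still holds an earlier, longer message.
buf = ctypes.create_string_buffer(b"seq=100,lat=1.0", 32)

# A new, shorter payload arrives: 5 bytes and no trailing NUL byte.
payload = b"seq=7"
ctypes.memmove(buf, payload, len(payload))

# Reading the buffer as a C string scans to the first NUL, so stale bytes leak in.
as_c_string = buf.value
# Reading exactly the reported length recovers the real payload.
by_length = buf.raw[:len(payload)]

print(as_c_string)  # b'seq=700,lat=1.0'
print(by_length)    # b'seq=7'
```

Using the explicit payload length (or appending the terminator before treating the data as a string) avoids the misread.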
Describe the bug
The dora version is v0.3.2.
I use two nodes. Node A is written in Python. Node B is written in C/C++.
Node A reads GPS data on a 20 ms cycle and sends it as a JSON string. Node B receives the data from node A and performs a coordinate conversion, but I found that node B loses data packets when receiving from node A.
Here I print the seq number of the data published by node A, the count of messages received by node B, and the time at which each message was received. There was no packet loss at the beginning, but after running for a while, the message count on node B no longer matched the seq value of the message.
Also, node B should normally receive data every 20 ms, since that is the period at which node A sends. However, the following picture shows a time difference of 40 ms between two events (lower left corner).
Here is the test code:
https://github.com/dora-rs/autoware.universe/blob/feature/autoware_dora/localization/ekf_localizer/test_ekf_dataflow.yml
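The check described above (comparing seq numbers and receive times against the 20 ms period) could be sketched as follows. This is a minimal illustration, not code from the linked test; the `check_stream` function, the tolerance value, and the sample log are all hypothetical.

```python
def check_stream(records, expected_period_ms=20, tolerance_ms=5):
    """Return (missing_seqs, late_intervals) for (seq, recv_time_ms) records."""
    missing, late = [], []
    for (prev_seq, prev_t), (seq, t) in zip(records, records[1:]):
        # A jump larger than 1 means the sequence numbers in between never arrived.
        missing.extend(range(prev_seq + 1, seq))
        interval = t - prev_t
        if interval > expected_period_ms + tolerance_ms:
            late.append((seq, interval))
    return missing, late


# Hypothetical log: seq 3 is absent, so the gap before seq 4 is 40 ms.
records = [(0, 0), (1, 20), (2, 40), (4, 80), (5, 100)]
missing, late = check_stream(records)
print(missing)  # [3]
print(late)     # [(4, 40)]
```

Checking the received seq values directly, rather than only a running message count, makes it easier to tell a real gap from a value that was parsed incorrectly.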