Python notebook times out #75

Closed
mobsy74 opened this issue May 24, 2018 · 3 comments

mobsy74 commented May 24, 2018

We use Python notebooks heavily in our workflows. However, the notebook often loses its connection and we are forced to restart the kernel, which discards all of the notebook's cached data.
This appears to happen right after the rabbitmq container logs a 60-second heartbeat timeout (log excerpt below):

rabbitmq_1           |
rabbitmq_1           | =ERROR REPORT==== 24-May-2018::16:57:24 ===
rabbitmq_1           | closing AMQP connection <0.1117.0> (10.255.3.11:34656 -> 10.255.3.9:5672):
rabbitmq_1           | missed heartbeats from client, timeout: 60s

Once this happens, any subsequent call to the Python kernel gets an error from the notebooks container indicating that the connection was closed (traceback below):

notebooks_1          | Exception in thread Thread-13:
notebooks_1          | Traceback (most recent call last):
notebooks_1          |   File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
notebooks_1          |     self.run()
notebooks_1          |   File "/usr/lib/python2.7/threading.py", line 763, in run
notebooks_1          |     self.__target(*self.__args, **self.__kwargs)
notebooks_1          |   File "/usr/local/share/jupyter/kernels/pyspark/socket_forwarder.py", line 58, in to_rabbit_forwarder
notebooks_1          |     self.to_rabbit_sender(message)
notebooks_1          |   File "/usr/local/share/jupyter/kernels/pyspark/forwarding_kernel.py", line 145, in sender
notebooks_1          |     self._send_zmq_forward_to_rabbit(stream_name, message)
notebooks_1          |   File "/usr/local/share/jupyter/kernels/pyspark/forwarding_kernel.py", line 125, in _send_zmq_forward_to_rabbit
notebooks_1          |     'body': [base64.b64encode(s) for s in message]
notebooks_1          |   File "/usr/local/share/jupyter/kernels/pyspark/rabbit_mq_client.py", line 110, in send
notebooks_1          |     message=json_message)
notebooks_1          |   File "/usr/local/share/jupyter/kernels/pyspark/rabbit_mq_client.py", line 43, in send
notebooks_1          |     body=message)
notebooks_1          |   File "/usr/local/lib/python2.7/dist-packages/pika/adapters/blocking_connection.py", line 2077, in basic_publish
notebooks_1          |     mandatory, immediate)
notebooks_1          |   File "/usr/local/lib/python2.7/dist-packages/pika/adapters/blocking_connection.py", line 2164, in publish
notebooks_1          |     self._flush_output()
notebooks_1          |   File "/usr/local/lib/python2.7/dist-packages/pika/adapters/blocking_connection.py", line 1250, in _flush_output
notebooks_1          |     *waiters)
notebooks_1          |   File "/usr/local/lib/python2.7/dist-packages/pika/adapters/blocking_connection.py", line 474, in _flush_output
notebooks_1          |     result.reason_text)
notebooks_1          | ConnectionClosed: (-1, "error(104, 'Connection reset by peer')")

I can reproduce this by creating a brand-new workflow, adding a Python notebook node, opening it, and then just waiting until the timeout error occurs (it takes about three minutes). At some point the notebooks container apparently stops sending heartbeats.
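
For context, pika's BlockingConnection only exchanges heartbeat frames while something is servicing the connection's I/O loop; if the thread that owns the connection sits idle or blocked, no heartbeats go out and the broker resets the socket, which matches the two logs above. A minimal sketch of the keep-alive pattern that avoids this (the hostname and the 10-second service interval are placeholders, not Seahorse's actual code):

import pika

# Placeholder connection parameters for illustration.
params = pika.ConnectionParameters(host='rabbitmq', port=5672)
connection = pika.BlockingConnection(params)

# The thread that owns a BlockingConnection must service it regularly;
# otherwise heartbeat frames are never sent or answered and the broker
# drops the connection after the heartbeat timeout (60s by default).
while connection.is_open:
    # process_data_events() pumps the I/O loop, which is what actually
    # dispatches heartbeats; time_limit bounds how long each call blocks.
    connection.process_data_events(time_limit=10)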

Initially I was running our fork of the code, but I confirmed that the same thing happens after checking out the master branch of deepsense-ai/seahorse and building from source.

mobsy74 commented May 25, 2018

Adding on to this: we have seen this issue on both macOS and Linux systems.

jaroslaw-osmanski commented

Turning heartbeats off, as in pull request #76, should fix that.

We're still searching for a long-term solution, but that will take more than a couple of weeks.
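
For anyone applying the workaround by hand before the fix lands, disabling heartbeats comes down to setting the heartbeat interval to 0 in the client's connection parameters. A sketch of what that looks like on the pika side (not necessarily the exact diff in #76; on older pika releases the keyword is heartbeat_interval rather than heartbeat):

import pika

# heartbeat=0 requests that heartbeats be disabled for this connection,
# so an idle or blocked client is never timed out by the broker.
# (On pika releases before 0.12 the keyword is heartbeat_interval.)
params = pika.ConnectionParameters(
    host='rabbitmq',  # placeholder hostname for illustration
    heartbeat=0,
)
connection = pika.BlockingConnection(params)

The trade-off is that with heartbeats off, a dead peer is only noticed by TCP keepalives or a failed publish, so half-open connections can linger.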

mobsy74 commented Jun 2, 2018

This workaround seems to have fixed the timeouts. Thanks.
