Queue for faulty webhooks #391
Comments
This is working as designed. If you have configured the webhook transaction setting to require each webhook to succeed, and at least one fails, we fail the request when we cannot contact your webhook. Are you asking for us to complete the request, but then continue to send the event until it succeeds? To be truly transactional, we have to ensure everything is successful in a synchronous fashion. If you reduce the TX level to "some must succeed", for example, then we will allow the request to succeed and we'll queue any failed attempts for retry. The retry queue will try up to 3 times before giving up. You can also use a Kafka integration instead of a webhook to receive the event, and then you could leverage the Kafka service, which may provide you with additional redundancy.
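As a rough illustration of the behaviour described above (a sketch only, not FusionAuth's actual implementation): with the strict level a single failed webhook fails the request, while with a relaxed level the request completes and failed deliveries are queued and retried up to 3 times. The names `Sender`, `fire_event`, and `drain_retries` are hypothetical, and the relaxed level is modelled here as "at least one webhook must succeed", which is an assumption.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Sender = Callable[[dict], bool]  # hypothetical: True means the endpoint accepted the event
MAX_RETRIES = 3                  # retry limit mentioned in this thread


def fire_event(senders: List[Sender], event: dict, all_must_succeed: bool,
               retry_queue: Deque[Tuple[Sender, dict]]) -> bool:
    """Return True if the originating request may complete."""
    results = [send(event) for send in senders]
    if all_must_succeed:
        # Synchronous and transactional: one failure fails the whole request.
        return all(results)
    # Relaxed level (assumed semantics): queue failed deliveries instead of blocking.
    for send, ok in zip(senders, results):
        if not ok:
            retry_queue.append((send, event))
    return any(results)


def drain_retries(retry_queue: Deque[Tuple[Sender, dict]]) -> List[dict]:
    """Retry each queued delivery up to MAX_RETRIES times; return events given up on."""
    dropped = []
    while retry_queue:
        send, event = retry_queue.popleft()
        # any() stops at the first successful attempt.
        if not any(send(event) for _ in range(MAX_RETRIES)):
            dropped.append(event)
    return dropped
```

The list returned by `drain_retries` is where events are lost in this model, which is the gap the rest of this thread is about.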
Yes, this is currently implemented, but blocks the User, which is not acceptable for us.
Yes, this is exactly what we need.
Cool, good to know. But with this TX level we can lose some events. If we could change the retries from 3 times to unlimited, it could solve our problem.
Kafka is an option as well, but if the Kafka broker goes down for hours, we lose events or the UI will be blocked as well. We want to make sure that we receive every webhook event successfully, even if our log server fails for a few hours. It is important that users can use FusionAuth as usual even while the log server is down. In a nutshell: we need extremely high availability of FusionAuth, but we still need to ensure that all webhook events are successfully transmitted. From your side, is there another idea for this scenario?
We could add an additional TX level that says "all must succeed... eventually" - and in this mode we would not block on the request but queue "forever" until success.
That would be a very good solution for us. Does "forever" mean the permanent persistence of the faulty webhook events? So that after a restart of FusionAuth the faulty webhook events are still available?
That is correct. Off the top of my head, we'd persist the events and then have nodes work off of that queue based upon the TX level until we can complete the request. Once we start persisting these events for these types of scenarios, we may also add a webhook event log so that there is visibility into the sent webhooks, pending events that have not yet been successfully sent, retry counts, etc.
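To make the idea of a persisted queue concrete, here is a minimal sketch that assumes a SQLite file stands in for whatever storage FusionAuth would actually use. The table `pending_events` and the functions `open_queue`, `enqueue`, and `work_queue` are hypothetical names for illustration. Undelivered events stay in the table across restarts, and each failed pass increments a retry count that an event log could display.

```python
import json
import sqlite3
from typing import Callable


def open_queue(path: str) -> sqlite3.Connection:
    """Open (or create) the on-disk queue so pending events survive a restart."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pending_events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "retry_count INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn


def enqueue(conn: sqlite3.Connection, event: dict) -> None:
    """Persist an event that could not be delivered."""
    conn.execute("INSERT INTO pending_events (payload) VALUES (?)",
                 (json.dumps(event),))
    conn.commit()


def work_queue(conn: sqlite3.Connection, send: Callable[[dict], bool]) -> int:
    """Try each pending event once, oldest first; return how many are still pending."""
    rows = conn.execute(
        "SELECT id, payload FROM pending_events ORDER BY id").fetchall()
    for event_id, payload in rows:
        if send(json.loads(payload)):
            conn.execute("DELETE FROM pending_events WHERE id = ?", (event_id,))
        else:
            # Keep the event and record the failed attempt; there is no retry limit.
            conn.execute(
                "UPDATE pending_events SET retry_count = retry_count + 1 WHERE id = ?",
                (event_id,))
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM pending_events").fetchone()[0]
```

A worker would call `work_queue` periodically; the request that produced the event is never blocked, because it only needs `enqueue` to succeed.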
Wow, that sounds amazing. Great and very helpful idea! :)
Queue for faulty webhooks
Problem
If a webhook is fired with the transaction setting "All the Webhooks must succeed" and the webhook endpoint is not reachable, the user is blocked.
The following image shows the blocking message during login:

We have several use cases for webhooks, for example a custom audit log system, or creating a shadow user in another database system to hold additional user information.
Solution
Developing a queue for webhooks that could not be sent would solve the problem.
The queue is a buffer that sends the unsent webhooks as soon as the webhook endpoint is available again.
Perhaps it is possible to add a settings option to enable the webhook queue.
For our system it is necessary that all events are successfully transmitted and that users are never blocked because of faulty webhooks.
So it would be great if a concept similar to that of a message broker with QoS were integrated into FusionAuth.
Alternatives/workarounds
As a workaround it would be possible to build a redundant system.
For example, three web servers hosted at different locations could serve as webhook endpoints, with the transaction setting set to "Any single Webhook must succeed" for the three webhooks.
However, this workaround only works for one use case (e.g. audit logs).
Related
How to vote
Please give us a thumbs up or thumbs down as a reaction to help us prioritize this feature. Feel free to comment if you have a particular need or comment on how this feature should work.