How long to wait for duplicate webhook events

The webhook guide mentions that a best practice is to use a queue to process webhook events and to generate an idempotency key based on the topic and resource self link. It doesn’t mention how long you should wait before processing the events to allow multiple events on the same resource to come in.

My architecture would be:

  1. Immediately throw every new event into a delayed jobs queue unless its idempotency key matches one already in the table.
  2. Run the event processing job after a short delay of maybe 5 minutes to allow similar events on the same resource to come in (rough sketch below).
  3. Delete the event from the processing queue to allow for future events on the same resource.
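
Here’s a rough sketch of what I’m picturing (Rails with delayed_job via ActiveJob; `DwollaEvent`, `ProcessDwollaEventJob`, and the payload field paths are just illustrative names I made up, not anything from the docs):

```ruby
require "digest"

class WebhooksController < ApplicationController
  def create
    payload = JSON.parse(request.body.read)

    # Idempotency key derived from the topic + resource self link,
    # per the best practice in the webhook guide.
    key = Digest::SHA256.hexdigest(
      "#{payload['topic']}|#{payload.dig('_links', 'resource', 'href')}"
    )

    # Step 1: enqueue only if we haven't already queued this key.
    unless DwollaEvent.exists?(idempotency_key: key)
      DwollaEvent.create!(idempotency_key: key, payload: payload)
      # Step 2: process after a 5 minute delay.
      ProcessDwollaEventJob.set(wait: 5.minutes).perform_later(key)
    end

    head :ok
  end
end
```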

Is that a reasonable delay or am I thinking about this wrong?

Appreciate the feedback.

Roger

Hey @rogerm89,

tl;dr: Your queue can process the events immediately (no delay needed), but depending on what you’re doing in response to the event, you’ll want to ensure you don’t process the same event twice.

For example, you could:

  • Have a table for storing events from Dwolla (e.g. dwolla_events) with a unique index on the column storing the event id from the webhook. This will prevent the same webhook event from being inserted into the table more than once.
    • Note: You’ll want to make sure you handle the RecordNotUnique exception (or whatever exception is thrown) and return a successful response in your webhook handler when this happens; otherwise Dwolla will continue to send you webhooks until it receives a successful response.
  • Add a handled_at column to the dwolla_events table. Then, using something like Postgres’ FOR UPDATE SKIP LOCKED, you could ensure your workers only process each dwolla_event once. There’s a rough sketch of both techniques right after this list.
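
Roughly, in Rails/Postgres terms it might look like this (the table, model, and column names are placeholders, and `handle(event)` stands in for your application logic — adapt to your schema):

```ruby
# Migration: the unique index is what actually enforces "insert once".
class CreateDwollaEvents < ActiveRecord::Migration[7.0]
  def change
    create_table :dwolla_events do |t|
      t.string :event_id, null: false
      t.jsonb :payload
      t.datetime :handled_at
      t.timestamps
    end
    add_index :dwolla_events, :event_id, unique: true
  end
end

# Webhook handler: rescue the duplicate insert and still return 200 so
# Dwolla stops retrying this event.
class DwollaWebhooksController < ApplicationController
  def create
    payload = JSON.parse(request.body.read)
    begin
      DwollaEvent.create!(event_id: payload["id"], payload: payload)
    rescue ActiveRecord::RecordNotUnique
      # Already stored this event; fall through to the 200.
    end
    head :ok
  end
end

# Worker: FOR UPDATE SKIP LOCKED lets concurrent workers each claim a
# different unhandled row, so no event is processed twice.
class ProcessDwollaEvents
  def self.run_one
    DwollaEvent.transaction do
      event = DwollaEvent.where(handled_at: nil)
                         .order(:id)
                         .lock("FOR UPDATE SKIP LOCKED")
                         .first
      return if event.nil?

      handle(event)                         # your application logic here
      event.update!(handled_at: Time.current)
    end
  end
end
```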

Those are some techniques you could use, but it really depends on your application. Hopefully it helps tho! Let us know if you have any questions.

Thanks for the response, Stephen!

My remaining concern is that, given the volume of potential events coming in, this table would get bloated pretty quickly, so I’d like to delete events after some amount of time to save space.

Is it common practice to maintain a table of all the past webhook events for an account?

@rogerm89 No problem!

Yeah that’d be fine. I’m not sure how common that particular solution is, but you basically need to implement a distributed lock to ensure the same work isn’t performed twice. Using an existing database is probably the easiest way to go about it.
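
If you do want to keep the table small, a periodic cleanup job along these lines would work (`PruneDwollaEventsJob` and the 30-day window are just illustrative; pick a retention period comfortably longer than any redelivery you’d expect):

```ruby
# Deletes handled events older than the retention window, in batches so a
# big purge doesn't hold long locks on the table. Unhandled rows
# (handled_at IS NULL) are never matched by the comparison, so they stay.
class PruneDwollaEventsJob < ApplicationJob
  RETENTION = 30.days # arbitrary; tune to your redelivery horizon

  def perform
    DwollaEvent.where("handled_at < ?", RETENTION.ago)
               .in_batches(of: 1_000)
               .delete_all
  end
end
```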

As for the lock itself, you could also use Redis, for which there are some open source distributed-lock implementations (Redlock is probably the best known).
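
A bare-bones version with the redis-rb gem would look something like this (`event_id` and `process_event` are made up for the example, and note that a real implementation like Redlock does the release check atomically with a Lua script, which this sketch skips):

```ruby
require "redis"
require "securerandom"

# SET with NX + EX acquires the lock only if no one else holds it; the TTL
# ensures a crashed worker can't hold the lock forever.
def with_lock(redis, key, ttl: 60)
  token = SecureRandom.uuid
  return false unless redis.set(key, token, nx: true, ex: ttl)

  begin
    yield
  ensure
    # Only delete the lock if we still own it. This get-then-del pair isn't
    # atomic; production implementations use a Lua script for the release.
    redis.del(key) if redis.get(key) == token
  end
  true
end

redis = Redis.new

# Usage: only one worker gets to process a given event at a time.
with_lock(redis, "lock:dwolla_event:#{event_id}") do
  process_event(event_id)
end
```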

@stephen, thanks for that. My company is currently using Redis in our Rails application, along with a SQL database and a delayed jobs worker queue. So if we had a table of processed events, and all of the events were processed through the jobs queue, I think we’d be covered, since event processing would take place sequentially and would allow us to check for previously processed duplicate events.

Thanks for clearing this up.

Cheers,

Roger