Disambiguate recoverable/non-recoverable status?

Created on 17 April 2024, 7 months ago

Problem/Motivation

I'd like to use several retries on a queue of many thousands of items to account for failures or downtime of downstream systems, but some jobs may end up in the queue with invalid arguments or in a state that will always fail. I'd like to avoid constantly retrying these known-invalid jobs, and to report their counts separately from jobs that failed due to a potentially recoverable issue and might succeed on a retry.

Example situation

For example, consider a queue and job type that goes through all users in a site and communicates with external systems to pull in attributes and data. Each week all users are added to the queue, and the queue is then processed in chunks over time, with the plugin getting a single user ID as its payload.

In addition to successful syncs, different types of errors may occur.

  • Some failures may occur due to the external system being down and an API call failing. This might work on a second or third attempt a little later if the job is retried.
  • In contrast, if the user account is deleted prior to the job being processed, then there is nothing to do. Maybe this could be considered "Success".
  • If there is no user-id in the payload, then it is impossible to look up the user. This isn't a failure of the data-fetching system, but rather an error in the job definition. Retrying later will always have the same result, but it doesn't make sense to report this as a "successful" run and it would be useful to know how many jobs are getting into this invalid state.
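To make the three cases above concrete, here is a minimal sketch in Python rather than the module's own language. Every name in it (Outcome, ExternalSystemDown, classify_sync, and the user_exists and fetch callables) is hypothetical and chosen for illustration; none of it is the module's actual API.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    """Hypothetical outcomes for one user-sync job."""
    SUCCESS = "success"
    RETRYABLE_FAILURE = "retryable_failure"
    PERMANENT_FAILURE = "permanent_failure"


class ExternalSystemDown(Exception):
    """Hypothetical error raised when the external API call fails."""


def classify_sync(
    payload: dict,
    user_exists: Callable[[int], bool],
    fetch: Callable[[int], dict],
) -> Outcome:
    user_id = payload.get("user_id")
    if user_id is None:
        # No user id in the payload: retrying can never help.
        return Outcome.PERMANENT_FAILURE
    if not user_exists(user_id):
        # Account deleted before processing: nothing to do.
        return Outcome.SUCCESS
    try:
        fetch(user_id)
    except ExternalSystemDown:
        # Downstream outage: a later attempt may succeed.
        return Outcome.RETRYABLE_FAILURE
    return Outcome.SUCCESS
```

Treating the deleted-account case as a success is one possible reading of the second bullet; it could equally be given its own outcome.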

Proposed resolution

Add an additional state for non-recoverable failures/errors that won't be retried and can be listed in the reports.
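As a rough illustration of how such a state could interact with retries and reporting, the Python sketch below requeues only recoverable failures and counts jobs per state. The state name PERMANENT_FAILURE, the Job fields, and the apply_result and count_by_state helpers are all assumptions made for this example, not names taken from the module.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical state names; PERMANENT_FAILURE stands in for the proposed state.
QUEUED = "queued"
SUCCESS = "success"
FAILURE = "failure"
PERMANENT_FAILURE = "permanent_failure"


@dataclass
class Job:
    payload: dict
    state: str = QUEUED
    num_retries: int = 0


def apply_result(job: Job, result_state: str, max_retries: int = 3) -> Job:
    """Requeue recoverable failures until retries run out; never requeue permanent ones."""
    if result_state == FAILURE and job.num_retries < max_retries:
        job.num_retries += 1
        job.state = QUEUED
    else:
        job.state = result_state
    return job


def count_by_state(jobs: list) -> Counter:
    """Per-state counts, so permanent failures are reported separately."""
    return Counter(job.state for job in jobs)
```

With this shape, a job that reports PERMANENT_FAILURE leaves the queue on its first attempt, while a job that reports FAILURE goes back to the queue until max_retries is exhausted.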

Remaining tasks

  • Determine an appropriate name for the additional JobState.
  • Add it to the JobState class.
  • Add it to the Queue and Job views.

User interface changes

  • Report the third job state for non-recoverable failures/errors in the queue list and job lists.

API changes

  • An additional job state for non-recoverable failures/errors.

Data model changes

I don't think there are any.

✨ Feature request
Status

Active

Version

1.0

Component

Code

Created by

🇺🇸 United States adamfranco
