[ GUIDE ]

How to Monitor Laravel Queues in Production

Worker supervision, failed-job handling, and per-job-class metrics across Redis, SQS, and database drivers.

QUICK ANSWER

How do I monitor Laravel queues in production?

You need three layers: (1) a worker supervisor like Horizon or Supervisor to keep workers alive and restart on crashes, (2) the failed_jobs table plus a dead-letter strategy for failures, and (3) an APM that groups attempts by job class with duration percentiles and failure alerts. Horizon covers Redis; NightOwl and Nightwatch Cloud cover all drivers.

Updated · 2026-04-13

1. Keep your workers alive

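Queue workers are long-running processes. They can leak memory, get killed by the OOM killer, and keep running old code after a deploy until they are restarted (php artisan queue:restart). You need a supervisor that brings them back up automatically.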

Supervisor config — /etc/supervisor/conf.d/laravel-worker.conf

ini
[program:laravel-worker]
process_name=%(program_name)s_%(process_num)02d
command=php /srv/app/artisan queue:work redis --sleep=3 --tries=3 --max-time=3600
autostart=true
autorestart=true
stopasgroup=true
killasgroup=true
user=www-data
numprocs=4
redirect_stderr=true
stdout_logfile=/srv/app/storage/logs/worker.log
stopwaitsecs=3600

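If you're on Redis, Horizon is usually a better choice than hand-written worker programs: it auto-balances workers across queues, exposes a dashboard, and shuts down gracefully on deploy with php artisan horizon:terminate. Horizon is itself a long-running process, so in production you still run php artisan horizon under Supervisor or a similar process monitor.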

bash
composer require laravel/horizon
php artisan horizon:install
php artisan horizon

2. Handle failures loudly, not silently

Once a job exhausts its attempts, Laravel writes it to the failed_jobs table by default. That's useful as a dead-letter store, but failures stay invisible unless you go looking.

Register a listener with Queue::failing() to alert on failures:

app/Providers/AppServiceProvider.php

php
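namespace App\Providers;

use Illuminate\Queue\Events\JobFailed;
use Illuminate\Support\Facades\Queue;
use Illuminate\Support\ServiceProvider;

class AppServiceProvider extends ServiceProvider
{
    public function boot(): void
    {
        // Called once a job has permanently failed, not on every retry.
        Queue::failing(function (JobFailed $event) {
            logger()->error('Job failed', [
                'job' => $event->job->resolveName(),      // job class name
                'connection' => $event->connectionName,   // e.g. redis, sqs, database
                'queue' => $event->job->getQueue(),
                'attempts' => $event->job->attempts(),
                'exception' => $event->exception->getMessage(),
            ]);
        });
    }
}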

Logging is a baseline. For production alerting you want something that groups failures by job class and throttles duplicate noise — raw logger() calls will drown your channel.
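As a rough illustration of the throttling idea, here is a minimal sketch in plain PHP that lets one alert through per job class per cooldown period. FailureAlertThrottle and its method names are hypothetical and not part of Laravel, and the in-memory state is an assumption that only holds inside a single worker process.

```php
<?php

/**
 * Hypothetical helper (not a Laravel class): allows one alert per job class
 * per cooldown period and suppresses the rest.
 */
final class FailureAlertThrottle
{
    /** @var array<string, int> Unix time of the last alert sent, per job class */
    private array $lastAlertAt = [];

    public function __construct(private int $cooldownSeconds = 300) {}

    /** Returns true when an alert should be sent for this failure. */
    public function shouldAlert(string $jobClass, int $now): bool
    {
        $last = $this->lastAlertAt[$jobClass] ?? null;

        if ($last !== null && ($now - $last) < $this->cooldownSeconds) {
            return false; // still inside the cooldown for this class
        }

        $this->lastAlertAt[$jobClass] = $now;

        return true;
    }
}

$throttle = new FailureAlertThrottle(cooldownSeconds: 300);

var_dump($throttle->shouldAlert('App\Jobs\ChargeCustomer', 1000)); // bool(true): first failure
var_dump($throttle->shouldAlert('App\Jobs\ChargeCustomer', 1100)); // bool(false): within cooldown
```

Inside a Queue::failing() listener you could call shouldAlert($event->job->resolveName(), time()) before notifying anyone. With several workers, the timestamps would need to live in a shared store such as the cache rather than in a PHP array.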

3. Track attempts, duration, and throughput

A job that takes 10x longer today than yesterday is a problem — even if it doesn't fail. The three metrics that matter:

  • Wait time — how long a job sat in the queue before a worker picked it up
  • Processing time — how long handle() took, with p50/p95/p99 per job class
  • Attempt count — how often jobs are being retried before succeeding

Horizon exposes wait time and throughput for Redis queues on its metrics page. For everything else (SQS, database, mixed drivers, per-class duration percentiles), you need an APM with queue-specific watchers.
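To make "p50/p95/p99 per job class" concrete, the sketch below computes nearest-rank percentiles over recorded processing times, grouped by class. It is plain PHP under stated assumptions: the function names and the shape of the input rows (class, ms) are hypothetical, and a real APM would more likely use streaming histograms than sort raw samples.

```php
<?php

/**
 * Nearest-rank percentile: the smallest sample such that at least $p percent
 * of all samples are less than or equal to it.
 *
 * @param list<int|float> $samples durations in milliseconds
 */
function percentile(array $samples, float $p): float
{
    if ($samples === []) {
        throw new InvalidArgumentException('At least one sample is required.');
    }

    sort($samples); // sorts a local copy; the caller's array is untouched
    $rank = (int) ceil($p * count($samples) / 100);

    return (float) $samples[max($rank, 1) - 1];
}

/**
 * @param list<array{class: string, ms: int|float}> $attempts one row per processed attempt
 * @return array<string, array{p50: float, p95: float, p99: float}>
 */
function durationPercentilesByClass(array $attempts): array
{
    // Bucket durations by job class.
    $byClass = [];
    foreach ($attempts as $attempt) {
        $byClass[$attempt['class']][] = $attempt['ms'];
    }

    $summary = [];
    foreach ($byClass as $class => $durations) {
        $summary[$class] = [
            'p50' => percentile($durations, 50),
            'p95' => percentile($durations, 95),
            'p99' => percentile($durations, 99),
        ];
    }

    return $summary;
}

// Illustrative data only.
print_r(durationPercentilesByClass([
    ['class' => 'App\Jobs\ChargeCustomer', 'ms' => 120],
    ['class' => 'App\Jobs\ChargeCustomer', 'ms' => 950],
    ['class' => 'App\Jobs\ChargeCustomer', 'ms' => 140],
]));
```

Comparing today's p95 for a class against yesterday's is one simple way to catch the "10x slower but not failing" case described above.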

4. Tag jobs for context

When a job fails, you want to know whose data it was processing. Use Horizon tags, or add context through public job properties (such as a user ID) that your APM can surface.

php
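namespace App\Jobs;

use App\Models\Customer; // assumed location of the Customer model
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class ChargeCustomer implements ShouldQueue
{
    // SerializesModels stores only the model identifier in the payload.
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(public Customer $customer) {}

    // Horizon indexes jobs by these tags so you can search by customer.
    public function tags(): array
    {
        return ['billing', 'customer:'.$this->customer->id];
    }

    public function handle(): void
    {
        // ...
    }
}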

THE EASY WAY

NightOwl groups every queue attempt by job class

NightOwl's queue watcher records every dispatch, released/processed/failed transition, and exception, grouped by job class. You see p50/p95/p99 processing duration per class, wait-time trends, and failure fingerprints in one view — across all drivers (Redis, SQS, database, Beanstalkd), not just Redis.

Combine it with alert channels (Slack, Discord, Email, Webhook) to get paged on failure spikes without wiring custom Queue::failing listeners.

bash
composer require nightowl/agent
php artisan nightowl:install

Telemetry lands in your PostgreSQL. Works alongside Horizon — NightOwl for the big picture, Horizon for live worker ops.

Frequently asked questions

How do I monitor Laravel queues in production?

You need three layers: a worker supervisor (Horizon for Redis, Supervisor for others) to keep workers alive, job tables (failed_jobs) to log failures, and an APM that tracks attempts, duration, and retries per job class. Horizon covers workers and metrics for Redis queues; NightOwl or Nightwatch Cloud cover all drivers with per-job-class detail and alerting.

What's the difference between Horizon and NightOwl for queue monitoring?

Horizon is a worker dashboard for Redis-backed queues — it shows worker processes, throughput, and failed jobs. NightOwl is broader: it tracks every job attempt across any driver (Redis, database, SQS, Beanstalkd), groups by job class with duration percentiles, surfaces failures with full stack traces, and alerts on failure spikes. Many teams run both.

How do I see why a Laravel job failed?

Laravel writes failed jobs to the failed_jobs table by default — you can inspect them with php artisan queue:failed and read the exception column. This shows the error but not the request context or attempt history. An APM that groups failures by job class (like NightOwl) shows you fingerprinted exceptions plus duration, attempts, and payload context.

How do I monitor queue latency in Laravel?

Queue latency has two parts: wait time (how long a job sits in the queue before a worker picks it up) and processing time (how long the handle() method takes). Horizon reports wait time for Redis queues on its metrics page. NightOwl tracks both wait and processing duration per attempt and per job class.

Should I alert on failed jobs?

Yes, but with care. A single failed job is noise; a spike in the failure rate for a specific job class is a real signal. Configure alerts at the class level on a time-windowed threshold (e.g. more than 5 failures in 10 minutes for App\Jobs\ChargeCustomer). Route alerts to Slack or Discord — never email for high-volume queues.
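A time-windowed threshold like the one above can be sketched in a few lines of plain PHP. FailureSpikeDetector is a hypothetical name, not a Laravel or NightOwl class, and the defaults simply mirror the "more than 5 failures in 10 minutes" example; the in-memory storage is an assumption that would need replacing with a shared store across workers.

```php
<?php

/**
 * Hypothetical sliding-window counter: reports a spike when one job class
 * has more than $threshold failures inside the last $windowSeconds.
 */
final class FailureSpikeDetector
{
    /** @var array<string, list<int>> failure timestamps (Unix seconds) per job class */
    private array $failures = [];

    public function __construct(
        private int $threshold = 5,
        private int $windowSeconds = 600,
    ) {}

    /** Records one failure and returns true when the class is over the threshold. */
    public function record(string $jobClass, int $now): bool
    {
        $cutoff = $now - $this->windowSeconds;

        // Drop failures that have aged out of the window, then add this one.
        $recent = array_filter(
            $this->failures[$jobClass] ?? [],
            fn (int $timestamp): bool => $timestamp > $cutoff,
        );
        $recent[] = $now;
        $this->failures[$jobClass] = array_values($recent);

        return count($recent) > $this->threshold;
    }
}

$detector = new FailureSpikeDetector(threshold: 5, windowSeconds: 600);

// Six failures one minute apart: only the sixth crosses the threshold.
foreach ([0, 60, 120, 180, 240, 300] as $second) {
    $spike = $detector->record('App\Jobs\ChargeCustomer', $second);
}
var_dump($spike); // bool(true)
```

Because counts are kept per class, a noisy low-priority job does not mask or trigger alerts for a critical one.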

How do I track job retries in Laravel?

A job class can set a public $tries property to cap retries, and the underlying job instance exposes an attempts() method that returns the current attempt number. You can log the attempt count manually, but aggregating it across thousands of jobs is the hard part. NightOwl and Nightwatch Cloud both track the full attempt history per job (released, processed, failed) with timing.
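If you do log attempt counts yourself, the aggregation step might look like the sketch below. It is plain PHP with a hypothetical function name and input shape: one row per finished job, where attempts is assumed to be the value of attempts() at the moment the job completed or failed.

```php
<?php

/**
 * Summarises retries per job class.
 *
 * @param list<array{class: string, attempts: int}> $finishedJobs
 * @return array<string, array{jobs: int, retried: int, avg_attempts: float}>
 */
function retryStatsByClass(array $finishedJobs): array
{
    $totals = [];

    foreach ($finishedJobs as $job) {
        $class = $job['class'];
        $totals[$class] ??= ['jobs' => 0, 'retried' => 0, 'attempts' => 0];

        $totals[$class]['jobs']++;
        $totals[$class]['attempts'] += $job['attempts'];

        // Anything beyond the first attempt counts as a retried job.
        if ($job['attempts'] > 1) {
            $totals[$class]['retried']++;
        }
    }

    // array_map with a single array keeps the class names as keys.
    return array_map(
        fn (array $t): array => [
            'jobs' => $t['jobs'],
            'retried' => $t['retried'],
            'avg_attempts' => (float) $t['attempts'] / $t['jobs'],
        ],
        $totals,
    );
}

// Illustrative data only.
print_r(retryStatsByClass([
    ['class' => 'App\Jobs\ChargeCustomer', 'attempts' => 1],
    ['class' => 'App\Jobs\ChargeCustomer', 'attempts' => 3],
]));
```

A rising avg_attempts for one class is often an earlier warning than the failure count, since jobs that succeed on the third try never reach failed_jobs.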

What's the best queue driver for production Laravel?

Redis is the usual choice for most teams: cheap, fast, and well supported by Horizon. SQS is the right pick if you're already on AWS and want managed infrastructure. Database queues work for low-volume apps but, as a rough rule of thumb, don't scale well past ~50 jobs/sec. Avoid sync and null in production: sync runs jobs inline in the current process, and null discards them.

