
p95 vs p99 Latency, Explained

QUICK ANSWER

What's the difference between p95 and p99 latency?

p95 latency is the response time that 95% of requests came in under; the other 5% were slower. p99 is the same cut at 99%, so only 1% were slower. Both are percentiles. The gap between them shows how heavy your latency tail is: a small gap means consistent performance, and a big gap means a small but real group of users is suffering.

Updated · 2026-04-13

The definitions

Sort every request's duration from fastest to slowest. pX is the value X% of the way up that sorted list: X% of requests finished at or below it, and the rest were slower. With 1,000 requests, p95 is the 950th-fastest duration.

  • p50 (median) — half your requests are faster, half are slower
  • p95 — 95% of requests finished under this; 5% were slower
  • p99 — 99% of requests finished under this; 1% were slower
  • p99.9 — typical SLO for mission-critical services
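As a rough sketch of the sort-and-pick definition above, here is the nearest-rank method in Python. The `percentile` helper and the 20-value sample are made up for illustration; other percentile methods (for example, ones that interpolate between neighbours) can give slightly different numbers on small samples.

```python
import math

def percentile(durations_ms, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not durations_ms:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(durations_ms)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based position in the sorted list
    return ordered[rank - 1]

# Hypothetical sample: 20 request durations in milliseconds.
sample = [42, 45, 47, 48, 50, 51, 52, 53, 55, 56,
          58, 60, 61, 63, 66, 70, 75, 90, 140, 900]

print("p50:", percentile(sample, 50))  # 10th of 20 values
print("p95:", percentile(sample, 95))  # 19th of 20 values
print("p99:", percentile(sample, 99))  # 20th of 20 values
```

With only 20 samples, p99 is simply the slowest request, which is one reason small windows give jumpy high percentiles.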

Why averages lie

Take an endpoint where 99% of requests finish in 50ms and 1% take 30 seconds. The average is around 350ms, a number that describes neither population. The fast users never see 350ms; the slow users wait 600 times longer than the fast ones, and about 86 times longer than the average suggests.

Averages mask outliers. High percentiles expose them, and the outliers are what you care about.
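The arithmetic is easy to check. This sketch builds 10,000 hypothetical requests with the 99% / 1% split from the example (the request count is an assumption; only the split matters) and compares the mean with what requests actually took.

```python
import statistics

fast_ms, slow_ms = 50, 30_000

# 10,000 hypothetical requests: 99% fast, 1% slow.
durations_ms = [fast_ms] * 9_900 + [slow_ms] * 100

mean_ms = statistics.mean(durations_ms)
median_ms = statistics.median(durations_ms)

# How many requests took anywhere between half and double the mean?
near_mean = [d for d in durations_ms if 0.5 * mean_ms <= d <= 2 * mean_ms]

print(f"mean:         {mean_ms} ms")
print(f"median (p50): {median_ms} ms")
print(f"requests within 2x of the mean: {len(near_mean)}")
print(f"slow vs fast: {slow_ms / fast_ms:.0f}x")
print(f"slow vs mean: {slow_ms / mean_ms:.1f}x")
```

Not a single request lands anywhere near the mean, which is the sense in which the average describes neither population.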

The gap is the signal

p95 tells you the story for most users. p99 tells you the story for your most frustrated users. The gap between them tells you how frustrated those users are.

Healthy service

p50:   80ms
p95:  180ms
p99:  240ms
Gap p95 → p99: 60ms  (small — consistent tail)

Service with a bad tail

p50:   85ms
p95:  220ms
p99: 3400ms
Gap p95 → p99: 3180ms  (huge — 1% of users hit something broken)

Common causes of a big p95 → p99 gap: database contention on specific rows, cache misses, garbage collection pauses, one slow downstream service affecting only some requests, uncached file reads, or — classic in Laravel — an N+1 query that fires on some pages but not others.
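To put numbers on the two profiles above, this small sketch computes the p95 → p99 gap for each. The `tail_gap` helper is hypothetical, and the p99/p95 ratio it also returns is an extra added here as one possible way to compare tails across services with different baselines.

```python
def tail_gap(p95_ms, p99_ms):
    """Return (gap in ms, p99/p95 ratio) for one service."""
    if p99_ms < p95_ms:
        raise ValueError("p99 cannot be lower than p95")
    return p99_ms - p95_ms, p99_ms / p95_ms

healthy = tail_gap(180, 240)    # the healthy service
bad_tail = tail_gap(220, 3400)  # the service with a bad tail

for name, (gap_ms, ratio) in [("healthy", healthy), ("bad tail", bad_tail)]:
    print(f"{name:>8}: gap {gap_ms}ms, p99 is {ratio:.1f}x p95")
```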

How to pick which to optimize

In typical user-facing web apps:

  • p95 — your reliable SLO. Alert when it regresses; make it the number you promise.
  • p99 — your canary. Watch the gap, not the absolute value. A widening gap is the earliest signal of degradation.
  • p99.9 — for payment, auth, health-critical paths. Worth the operational work.
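One way to watch the gap rather than the absolute value is to compare each window's gap with the gaps that came before it. The sketch below is only an illustration: the `widening_windows` helper, the 2x factor, and the five-minute history are all assumptions, not a recommended alerting rule.

```python
from statistics import median

def widening_windows(windows, factor=2.0):
    """Flag windows whose p95 -> p99 gap exceeds `factor` times the median gap of all earlier windows.

    `windows` is a list of (label, p95_ms, p99_ms) tuples in time order.
    """
    flagged = []
    earlier_gaps = []
    for label, p95_ms, p99_ms in windows:
        gap = p99_ms - p95_ms
        if earlier_gaps and gap > factor * median(earlier_gaps):
            flagged.append(label)
        earlier_gaps.append(gap)
    return flagged

# Hypothetical history: p95 barely moves at 09:15, but the gap already has.
history = [
    ("09:00", 180, 240),
    ("09:05", 185, 250),
    ("09:10", 182, 245),
    ("09:15", 190, 420),
    ("09:20", 220, 3400),
]

print(widening_windows(history))
```

In this made-up history the 09:15 window is flagged while p95 has moved by only a few milliseconds, which is the canary behaviour described above.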

How NightOwl surfaces percentiles

NightOwl stores request duration in microseconds and computes p50, p95, and p99 across any time range — per route, per query pattern, per job class. Because the data lives in your PostgreSQL, calculations use a straightforward ORDER BY duration OFFSET ... LIMIT 1 query instead of lossy histograms or sketches.
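For a feel of how an exact sort-and-offset lookup works, here is a self-contained sketch. It is not NightOwl's schema or query: the `requests` table, the `duration_us` column, and the `percentile_us` helper are hypothetical, and it runs against an in-memory SQLite database rather than PostgreSQL so it can be tried without a server. SQLite wants `LIMIT` before `OFFSET`; PostgreSQL accepts either order.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (duration_us INTEGER NOT NULL)")

# 100 hypothetical rows: 1,000 us, 2,000 us, ..., 100,000 us.
conn.executemany(
    "INSERT INTO requests (duration_us) VALUES (?)",
    [(i * 1000,) for i in range(1, 101)],
)

def percentile_us(conn, p):
    """Exact nearest-rank percentile: sort by duration and skip to the right row."""
    (n,) = conn.execute("SELECT COUNT(*) FROM requests").fetchone()
    if n == 0:
        return None
    offset = math.ceil(p * n / 100) - 1  # zero-based row to pick
    row = conn.execute(
        "SELECT duration_us FROM requests ORDER BY duration_us LIMIT 1 OFFSET ?",
        (offset,),
    ).fetchone()
    return row[0]

for p in (50, 95, 99):
    print(f"p{p}: {percentile_us(conn, p)} us")
```

Because the query reads real rows, the result is always a duration that some request actually had, which is the contrast with histogram buckets or sketches.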


Frequently asked questions

What's the difference between p95 and p99 latency?

p95 means 95% of requests completed faster than this number; 5% were slower. p99 means 99% completed faster; 1% were slower. p99 is always ≥ p95. The gap between them tells you how bad the long tail is — a small gap means consistent performance, a big gap means some users are having a much worse experience than the typical request.

Should I optimize for p95 or p99?

Most user-facing web apps optimize for p95 as the reliable SLO and watch p99 as a canary. If p99 is much larger than p95, something is wrong for a small but real population — database contention, cold caches, one slow downstream service. Ignoring p99 means ignoring your most frustrated users.

Why not just look at the average latency?

Averages are worse than useless at scale. One endpoint where 99% of requests finish in 50ms and 1% take 30 seconds has a mean of ~350ms, which misrepresents both populations. The fast requests aren't that slow, and the slow requests are invisible. Percentiles tell you what your users actually experience.

How many samples do you need for a reliable p99?

Rule of thumb: 100x the reciprocal of (1 - percentile), with the percentile written as a fraction. For p99 (1 in 100), that is 100 × 100, so you want at least 10,000 samples in the window. With fewer samples, p99 is noisy and moves around on individual slow requests. By the same rule, p95 needs ~2,000 samples to stabilize.
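The rule of thumb is a one-liner. This sketch just encodes it; the `min_samples` name is made up, and the result is a heuristic floor rather than a statistical guarantee.

```python
def min_samples(percentile):
    """Rule of thumb: 100 x 1 / (1 - percentile as a fraction)."""
    if not 0 < percentile < 100:
        raise ValueError("percentile must be between 0 and 100, exclusive")
    # 100 / (1 - percentile/100), rearranged to keep the arithmetic in whole numbers.
    return round(100 * 100 / (100 - percentile))

for p in (95, 99):
    print(f"p{p}: at least {min_samples(p):,} samples")
```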

How do I measure p95 in a Laravel app?

Time every request's duration, store it, and sort to pick the 95th percentile. Don't try to compute this in-request — it's an aggregate. An APM does it automatically: NightOwl stores request duration in microseconds and calculates p95 with an ORDER BY duration OFFSET ... LIMIT 1 query across any time range.
