[ GLOSSARY ]

SLO vs SLA — and where SLI fits

QUICK ANSWER

What's the difference between SLI, SLO, and SLA?

SLI (indicator) is what you measure — p95 latency, success rate. SLO (objective) is your internal target for that indicator, like 'p95 under 300ms over rolling 28 days.' SLA (agreement) is a contractual promise to customers, usually with financial penalties. Every SLA implies an SLO and an SLI; not every SLO has an SLA. Most teams should set SLOs; only sell SLAs when customers require them.

Updated · 2026-04-13

The three terms

Example across all three

SLI:  "The proportion of successful (non-5xx) /api/checkout requests."
      Definition: count(status < 500) / count(*)

SLO:  "99.9% of /api/checkout requests succeed over a rolling 28-day window."
      Internal target; a breach means we invest in reliability.

SLA:  "We guarantee 99.5% availability of /api/checkout.
       Customers whose failed requests exceed that allowance receive service credits."
      External contract; a breach means we pay out.
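
The SLI definition above is just a ratio, so it is easy to sketch in code. The snippet below is a minimal illustration, not a production implementation: the `Request` record, the function names, and the sample traffic are all hypothetical. It shows how one measured SLI can miss the internal 99.9% SLO while still honoring the looser 99.5% SLA.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request record; a real app would read these from its logs."""
    path: str
    status: int


def success_ratio(requests, path="/api/checkout"):
    """SLI: share of requests to `path` that did not return a 5xx status."""
    matching = [r for r in requests if r.path == path]
    if not matching:
        return None  # no traffic, so the SLI is undefined for this window
    good = sum(1 for r in matching if r.status < 500)
    return good / len(matching)


def meets_target(sli, target):
    """Compare a measured SLI to a target such as 0.999 (SLO) or 0.995 (SLA)."""
    return sli is not None and sli >= target


# Made-up sample: 2,000 checkout requests, 3 of them failing with a 5xx.
sample = [Request("/api/checkout", 200)] * 1997 + [Request("/api/checkout", 503)] * 3
sli = success_ratio(sample)
print(f"SLI = {sli:.4%}")
print("SLO 99.9% met:", meets_target(sli, 0.999))
print("SLA 99.5% met:", meets_target(sli, 0.995))
```

Keeping the SLA looser than the SLO, as in the example above, leaves room to react to an SLO breach before any credits are owed.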

Error budget — the unlock

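The complement of the SLO: 100% minus the target. If your SLO is 99.9%, your error budget is 0.1%. Over a 30-day window that's about 43 minutes (43.2) of tolerable failure; over the 28-day window used elsewhere on this page it's about 40 minutes. The budget is a currency: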

  • Budget plentiful → green light for risky deploys, new features, velocity
  • Budget running out → freeze risky changes, invest in reliability
  • Budget fully consumed → hard freeze until the next window resets

This framing ends the debate between "ship features" and "fix reliability" teams — the error budget tells you which phase you're in.
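
The arithmetic behind the budget can be sketched in a few lines. This is an illustrative sketch only: the function names are made up, and the 25% cutoff for "running out" is an assumption, since the text does not give a number.

```python
def error_budget_minutes(slo, window_days):
    """Tolerable failure time in minutes for an SLO expressed as a fraction."""
    return (1.0 - slo) * window_days * 24 * 60


def budget_remaining(slo, total_requests, failed_requests):
    """Fraction of a request-based budget still unspent (negative if overspent)."""
    allowed_failures = (1.0 - slo) * total_requests
    if allowed_failures == 0:
        return 0.0  # a 100% SLO leaves no budget at all
    return 1.0 - failed_requests / allowed_failures


def release_phase(remaining, caution_below=0.25):
    """Map remaining budget to the three phases listed above.

    caution_below is an assumed cutoff for 'running out'; pick your own.
    """
    if remaining <= 0:
        return "hard freeze"
    if remaining < caution_below:
        return "freeze risky changes"
    return "green light"


print(round(error_budget_minutes(0.999, 30), 2))  # 30-day window
print(round(error_budget_minutes(0.999, 28), 2))  # 28-day window
print(release_phase(budget_remaining(0.999, 1_000_000, 400)))
```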

Common Laravel SLOs

Endpoint class             Latency SLO     Availability SLO
Auth, checkout, payment    p95 < 300ms     99.9% / 28 days
JSON API (list / read)     p95 < 200ms     99.5% / 28 days
Page render (dashboard)    p95 < 500ms     99.5% / 28 days
Internal admin             p95 < 1000ms    99% / 28 days
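
To check a latency target from the table, you need a p95 from raw samples. The sketch below uses the nearest-rank percentile, which is one common definition among several; monitoring tools may interpolate differently. The dictionary keys and the sample data are hypothetical.

```python
# Latency targets from the table above, in milliseconds.
# The key names are made up for this example.
LATENCY_SLO_MS = {
    "auth_checkout_payment": 300,
    "json_api": 200,
    "page_render": 500,
    "internal_admin": 1000,
}


def p95(samples_ms):
    """Nearest-rank p95: smallest sample with at least 95% of samples at or below it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n), 1-based
    return ordered[rank - 1]


def latency_ok(endpoint_class, samples_ms):
    """True if the measured p95 is strictly under the class target."""
    return p95(samples_ms) < LATENCY_SLO_MS[endpoint_class]


# Made-up window: 94 fast requests and 6 slow ones.
samples = [120] * 94 + [450] * 6
print("p95 =", p95(samples), "ms")
print("page_render ok:", latency_ok("page_render", samples))
print("json_api ok:", latency_ok("json_api", samples))
```

Note how 6% slow requests are enough to move the p95 to the slow value, even though the median stays at 120ms.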

Burn-rate alerts

Alert on how fast you're burning the budget, not on raw thresholds. A 1-hour burn rate of 14x means you'd exhaust a 30-day budget in about 2 days if it continued — actionable. A brief latency spike that uses 0.5% of the budget probably isn't.

Google's SRE workbook describes canonical multi-window burn-rate alert setups: page on a high burn rate over 1 hour (14.4x in its example) or on a lower burn rate (6x) sustained over 6 hours.
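
Burn rate is the observed error ratio divided by the error budget, so a rate of 1x spends the budget exactly over the window. The sketch below is a simplified two-window check. The default thresholds of 14.4x and 6x follow the commonly cited SRE workbook example and are assumptions you should tune; the shorter confirmation windows the workbook pairs with each alert are left out for brevity.

```python
def burn_rate(error_ratio, slo):
    """How many times faster than 'exactly on budget' the service is failing."""
    return error_ratio / (1.0 - slo)


def days_to_exhaustion(rate, window_days=30):
    """Days until the whole budget is gone if the current burn rate continues."""
    return float("inf") if rate <= 0 else window_days / rate


def should_page(error_ratio_1h, error_ratio_6h, slo, fast=14.4, slow=6.0):
    """Page on a fast burn over 1 hour or a slower, sustained burn over 6 hours."""
    return (
        burn_rate(error_ratio_1h, slo) >= fast
        or burn_rate(error_ratio_6h, slo) >= slow
    )


# 1.4% of requests failing against a 99.9% SLO is roughly a 14x burn.
rate = burn_rate(0.014, 0.999)
print(f"burn rate ~ {rate:.1f}x, budget gone in ~ {days_to_exhaustion(rate):.1f} days")
print("page:", should_page(error_ratio_1h=0.02, error_ratio_6h=0.001, slo=0.999))
```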

Frequently asked questions

What's the difference between an SLI, SLO, and SLA?

SLI (Service Level Indicator) is the measurement itself — p95 latency, 5xx error rate, request success ratio. SLO (Service Level Objective) is the target you set internally for that SLI — 'p95 under 300ms over rolling 28 days'. SLA (Service Level Agreement) is the contractual promise to customers, usually with financial penalties if missed. Every SLA implies an SLO and an SLI; not every SLO has an SLA.

Do I need an SLA for my Laravel app?

Only if you're selling enterprise contracts where customers demand uptime guarantees. Internal SLOs are far more useful: they tell your team when to invest in reliability versus new features. SLAs introduce legal and financial complexity. Start with SLOs and graduate to SLAs when customers require them.

What's an 'error budget'?

The complement of an SLO. If your SLO is a 99.9% success rate, your error budget is 0.1% failures. Over a 30-day window that's 43.2 minutes of failure. Spend the budget on risky deploys or new feature velocity when it's plentiful; slow down changes when it's running out. The error budget framing turns SLOs from pass/fail gates into a currency.

What SLO should I set for a Laravel app?

Depends on the endpoint. Common starting points: 99% success rate over rolling 28 days for non-critical routes, 99.9% for payment/auth/checkout paths. Latency SLOs: p95 under 500ms for UI render endpoints, under 200ms for JSON API endpoints. These are rough floors — tune based on what users actually complain about.

Should I alert on SLO burn or on raw SLI breach?

SLO burn rate is better for noise reduction. Alerting on any p95 spike is noisy; alerting when you've burned 5% of your monthly error budget in the last hour is signal. Google's SRE workbook has canonical burn-rate alert configurations. Raw threshold alerts have their place for novel events (first occurrence of a new exception), but daily ops alerts should be burn-based.
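
A "share of budget burned in the last hour" rule and a burn-rate threshold are the same thing in different units. The sketch below converts one to the other, assuming a 30-day (720-hour) budget window; the function name is made up for this example.

```python
def burn_rate_for_budget_share(budget_share, alert_hours, window_days=30):
    """Burn rate that consumes `budget_share` of the budget within `alert_hours`."""
    window_hours = window_days * 24
    return budget_share * window_hours / alert_hours


# "5% of the monthly budget in the last hour", as in the answer above.
print(round(burn_rate_for_budget_share(0.05, alert_hours=1), 1))
# 2% in 1 hour and 5% in 6 hours give the commonly cited 14.4x and 6x thresholds.
print(round(burn_rate_for_budget_share(0.02, alert_hours=1), 1))
print(round(burn_rate_for_budget_share(0.05, alert_hours=6), 1))
```

So burning 5% of a 30-day budget in one hour corresponds to a burn rate of about 36x, well above typical paging thresholds.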

Can NightOwl enforce SLOs?

Not yet as a first-class feature. You can approximate with threshold alerts on request latency and error rates — configure alert channels to page when a route's p95 exceeds its SLO for 10+ minutes. Full SLO tooling with error budgets and burn rates is a feature on the roadmap, not shipping today.
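
The "exceeds its SLO for 10+ minutes" rule amounts to a sustained-breach check. The sketch below shows the logic in isolation. It is not NightOwl's API or configuration format; the function name and the one-reading-per-minute input are assumptions made for illustration.

```python
def sustained_breach(p95_per_minute_ms, slo_ms, minutes=10):
    """True if the most recent `minutes` one-minute p95 readings all exceed slo_ms."""
    if len(p95_per_minute_ms) < minutes:
        return False  # not enough history yet to call it sustained
    return all(value > slo_ms for value in p95_per_minute_ms[-minutes:])


# Made-up readings: a short spike, recovery, then ten slow minutes in a row.
readings = [180, 420, 190] + [350] * 10
print("page:", sustained_breach(readings, slo_ms=300))
print("page on spike only:", sustained_breach([180, 420, 190], slo_ms=300))
```

Requiring every reading in the window to breach filters out brief spikes, at the cost of reacting later than a burn-rate alert would.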
