## Latency targets by endpoint class
| Endpoint type | Good p95 | Acceptable | Problem |
|---|---|---|---|
| Auth (login, token refresh) | < 100ms | 100-200ms | > 200ms |
| UI-blocking read | < 100ms | 100-200ms | > 300ms |
| Search / filter | < 200ms | 200-400ms | > 500ms |
| Write (create / update) | < 300ms | 300-500ms | > 800ms |
| Background receipt (webhook) | < 500ms | 500ms-1s | > 2s |
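These budgets are worth encoding as data so alerting and the table above stay in sync. A minimal sketch, assuming hypothetical class keys and a four-way classification (the table's gap between "acceptable" and "problem" becomes a "borderline" band):

```php
<?php

// Latency budgets from the table above, in milliseconds. Each value is
// the upper bound of its band.
const LATENCY_BUDGETS_MS = [
    'auth'    => ['good' => 100, 'acceptable' => 200, 'problem' => 200],
    'ui_read' => ['good' => 100, 'acceptable' => 200, 'problem' => 300],
    'search'  => ['good' => 200, 'acceptable' => 400, 'problem' => 500],
    'write'   => ['good' => 300, 'acceptable' => 500, 'problem' => 800],
    'webhook' => ['good' => 500, 'acceptable' => 1000, 'problem' => 2000],
];

// Classify a measured p95 against its endpoint class's budget.
function classify(string $class, float $p95Ms): string
{
    $b = LATENCY_BUDGETS_MS[$class];
    if ($p95Ms < $b['good']) {
        return 'good';
    }
    if ($p95Ms <= $b['acceptable']) {
        return 'acceptable';
    }
    // Past "acceptable" but not yet past the "problem" threshold.
    return $p95Ms <= $b['problem'] ? 'borderline' : 'problem';
}

echo classify('search', 250); // acceptable
```

Feeding your real per-endpoint p95s through a check like this turns the table into an alert rule instead of a wiki page.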
## Why API latency is tighter than page latency

Three reasons:
- No perceptual masking. A browser showing a loading spinner at 500ms feels fine. A mobile app waiting on your API at 500ms feels janky.
- Chained calls. A single screen in a client app often fires 3-5 API calls; at 200ms each, sequential calls compound to 600-1000ms of perceived latency.
- Consumers can't batch for you. A web server can render server-side and send one response; a consumer SDK makes calls one at a time unless you offer batching endpoints.
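Batching is the usual answer to that last point. A minimal sketch, assuming a hypothetical `resolveBatch()` helper and illustrative resource keys; in a real Laravel route the resolvers would close over the authenticated request:

```php
<?php

// Hypothetical helper: runs only the requested resolvers and merges
// their results, so the client pays one round trip instead of several.
function resolveBatch(array $keys, array $resolvers): array
{
    $out = [];
    foreach ($keys as $key) {
        if (isset($resolvers[$key])) {
            $out[$key] = ($resolvers[$key])();
        }
    }
    return $out;
}

// Illustrative resolvers for one client screen.
$resolvers = [
    'profile'       => fn () => ['id' => 1, 'name' => 'Ada'],
    'notifications' => fn () => [['id' => 7, 'read' => false]],
    'settings'      => fn () => ['theme' => 'dark'],
];

// Wired into a route it might look like:
// Route::post('/batch', fn ($r) => resolveBatch($r->input('keys'), $resolvers));
print_r(resolveBatch(['profile', 'settings'], $resolvers));
```

Unknown keys are silently skipped, so old clients keep working when a resource is renamed server-side.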
## Middleware for API-specific timing context

`app/Http/Middleware/ApiTiming.php`:

```php
<?php

namespace App\Http\Middleware;

use Closure;
use Illuminate\Http\Request;

class ApiTiming
{
    public function handle(Request $request, Closure $next)
    {
        $start = hrtime(true);

        $response = $next($request);

        // hrtime() returns nanoseconds; convert to milliseconds and
        // round so the header value stays readable.
        $durationMs = round((hrtime(true) - $start) / 1e6, 1);
        $response->headers->set('Server-Timing', "total;dur={$durationMs}");

        return $response;
    }
}
```

The Server-Timing header is readable by consumer dev tools (Chrome DevTools renders it in the Network panel), which makes it good for consumer-side debugging without exposing full trace data.
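Registration is version-dependent; on a pre-Laravel-11 HTTP kernel it might look like the sketch below, appended to the `api` group so only API routes pay the cost. Newer Laravel versions register middleware in `bootstrap/app.php` instead.

```php
<?php
// app/Http/Kernel.php (pre-Laravel 11) — a sketch, not the full kernel.
class Kernel extends HttpKernel
{
    protected $middlewareGroups = [
        'api' => [
            // ...the group's existing middleware...
            \App\Http\Middleware\ApiTiming::class,
        ],
    ];
}
```

The overhead is two `hrtime()` calls and one header write per request, so it is safe to leave on in production.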
## Per-endpoint trace drilldown

For a slow endpoint:
- Open the requests dashboard, filter to the route pattern
- Sort by duration descending — find representative slow requests
- Open a slow request's trace view
- Identify the dominant span — usually a DB query or outgoing HTTP call
- Fix that one thing (eager load, add index, move to async) and watch p95 drop
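When the dominant span is an N+1 query, the fix is usually a one-line eager load. A hedged Eloquent sketch (the `Order`/`customer` models are illustrative, not from any specific codebase):

```php
<?php
// Before: 1 query for the orders, plus 1 query per order when the
// relation is first accessed — 51 queries for 50 rows.
$orders = Order::latest()->limit(50)->get();
foreach ($orders as $order) {
    echo $order->customer->name; // lazy load: fires a query each time
}

// After: with() eager loads the relation in one extra query —
// 2 queries total, regardless of result count.
$orders = Order::with('customer')->latest()->limit(50)->get();
foreach ($orders as $order) {
    echo $order->customer->name; // already in memory, no query
}
```

After shipping a change like this, re-check the endpoint's p95 on the dashboard rather than trusting a single fast request.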
See our related guides: slow query monitoring, N+1 detection, outgoing HTTP tracking.
## Budget for network
A consumer in Europe calling your US-East API has 80-120ms of fixed network RTT. That's budget you can't compress with code. If your end-to-end API SLO is 300ms and network is 100ms, your server-side budget is 200ms. Plan accordingly.
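That arithmetic is worth encoding once so dashboards and load tests agree on what "slow" means server-side. A tiny sketch (the function name is illustrative):

```php
<?php

// Server-side budget = end-to-end SLO minus fixed network RTT.
// Clamped at zero: if the RTT alone blows the SLO, no amount of server
// work fits, and the fix is topology (a closer region or edge), not code.
function serverBudgetMs(float $sloMs, float $networkRttMs): float
{
    return max(0.0, $sloMs - $networkRttMs);
}

echo serverBudgetMs(300, 100); // 200 — what the example above leaves for your code
```

Running the same calculation against the worst-case RTT (120ms here) tells you whether your p95 target is achievable for your farthest consumers at all.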
THE EASY WAY

## Per-endpoint p95 with trace drilldown

NightOwl groups API requests by route pattern with p95 / p99 per endpoint. Click any endpoint to see its slowest requests; click a request to see its spans. Data in your PostgreSQL, from $5/month flat.

```bash
composer require nightowl/agent
php artisan nightowl:install
```