Why sample at all
A fully instrumented request generates 5-20 spans. At 1,000 req/s that's roughly 430 million to 1.7 billion spans per day. At ~1 KB per span that's on the order of 430 GB-1.7 TB/day, before retention multiplies it. Storage adds up quickly; sampling trades completeness for cost.
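The arithmetic is worth sanity-checking. A quick back-of-envelope script (illustrative only, using 1 KB = 1,000 bytes):

```python
# Back-of-envelope span storage volume at the figures above.
REQ_PER_SEC = 1_000
SECONDS_PER_DAY = 86_400
BYTES_PER_SPAN = 1_000  # ~1 KB per span

for spans_per_req in (5, 20):
    spans_per_day = REQ_PER_SEC * SECONDS_PER_DAY * spans_per_req
    gb_per_day = spans_per_day * BYTES_PER_SPAN / 1e9
    print(f"{spans_per_req} spans/req -> {spans_per_day:,} spans/day, ~{gb_per_day:,.0f} GB/day")
# 5 spans/req -> 432,000,000 spans/day, ~432 GB/day
# 20 spans/req -> 1,728,000,000 spans/day, ~1,728 GB/day
```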
Head-based sampling
Decide at the root span. A deterministic hash of the trace ID lets distributed services agree on the same decision — a trace either lives or dies across every service it touches.
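The mechanism can be sketched like this (a simplified Python illustration, not the exact OTel algorithm: real SDK samplers map bits of the trace ID itself onto a threshold rather than re-hashing it, but the property is the same — the decision is a pure function of the trace ID):

```python
import hashlib

def should_sample(trace_id: str, ratio: float) -> bool:
    """Deterministic keep/drop: every service hashing the same
    trace ID reaches the same verdict, so traces stay whole."""
    digest = hashlib.sha256(trace_id.encode()).digest()
    # Map the first 8 bytes of the hash onto [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < ratio

# Same trace ID, same decision, on any host:
tid = "4bf92f3577b34da6a3ce929d0e0e4736"
assert should_sample(tid, 0.1) == should_sample(tid, 0.1)
```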
OpenTelemetry head sampler (PHP)
```php
use OpenTelemetry\SDK\Trace\Sampler\TraceIdRatioBasedSampler;
use OpenTelemetry\SDK\Trace\TracerProvider;

$sampler = new TraceIdRatioBasedSampler(0.1); // keep 10% of traces
$tracerProvider = TracerProvider::builder()
    ->setSampler($sampler)
    ->build();
```

Pros: cheap (no buffering), works cross-service, zero operational complexity. Cons: you throw traces away before knowing whether they were errors or slow; at a 10% keep rate, roughly 90% of your interesting traces are discarded along with everything else.
Tail-based sampling
Buffer every span, keyed by trace ID, for a decision window (usually 30-60 seconds). When the window expires, evaluate the assembled trace against a set of policies to decide whether to keep it.
OTel Collector tail sampling config
```yaml
processors:
  tail_sampling:
    decision_wait: 30s
    num_traces: 50000
    policies:
      - name: errors-always
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: slow-always
        type: latency
        latency:
          threshold_ms: 1000
      - name: sample-normal
        type: probabilistic
        probabilistic:
          sampling_percentage: 10
```

Pros: every error and every slow trace is kept, and the rest is sampled — much better data quality per dollar. Cons: you need a Collector cluster with enough memory to buffer all in-flight traces, which adds operational complexity.
Hybrid approach
Many production systems combine both: the application head-samples at 100% (keeping everything) and exports through a Collector that applies tail rules. The head sampler stops being a sampler and becomes a pass-through; the tail rules do the actual filtering.
The NightOwl approach
NightOwl doesn't sample by default. At typical Laravel volumes (1-10K req/s) storing everything in PostgreSQL is cheap, and the full dataset matters for debugging rare issues. At high volumes where storage cost matters, configure sampling at the agent level — the Nightwatch package supports it via its nightwatch.sample_rate config key.