Client Reports
Client reports (not to be confused with User Feedback) are a protocol feature that let clients send status reports about themselves to Sentry. They are currently mainly used to emit outcomes for events that were never sent. Chained relays are also able to emit these client reports to inform the next relay in chain about some outcomes.
Due to a bug in Relay, which discards envelopes containing unknown envelope items, the minimum required version of Sentry for client reports is 21.9.0.
Before client reports were added there were no insights into the full number of events generated within applications instrumented with Sentry SDKs. It was always clear to track the number of events dropped on Sentry server side for any number of reason, but there was a gap in knowing just how many events were never sent from the SDKs at all. Are there patterns in different platforms? Are there problems we are not aware of? If a customer were to call Sentry and ask where there events are, we would have no answer, and no way to find out if there are truly missing events from their SDKs. Client reports removes some of this doubt. That being said we are not looking to perfectly measure every nuance and edge case of events being discarded in SDKs. It is more important to have a best effort and be able to gain insights to our SDKs and their host applications.
As seen here, we communicate Accepted, filtered and dropped, and now we can send a new type discarded (not displayed in product yet).
Client reports are sent as envelope items to Sentry, typically as separate envelopes or with one of the already scheduled envelopes. They should not be sent too frequently but not too infrequently either. Their main purpose is to bring visibility into what is happening on the SDK side which affects the user experience.
For instance SDKs might drop events in a few places in the SDK and this loss of events can be invisible to a customer. Client reports let an SDK emit such event outcomes to provide data about how often this is happening. For instance SDKs might drop events if the transports hit their maximum internal queue size, because rate limits instruct the SDK to drop events as they are over quota etc.
Bugs in our SDKs are out of scope for client reports and are not being tracked using client reports at the moment.
A client report is an item in an envelope called client_report
. It consists of a JSON payload that looks roughly like this:
{
"timestamp": "2020-02-07T14:16:00Z",
"discarded_events": [
{
"reason": "queue_overflow",
"category": "error",
"quantity": 23
},
{
"reason": "queue_overflow",
"category": "transaction",
"quantity": 1321
}
]
}
Note that this must be enclosed in an envelope. So the full event looks something like this:
{}
{"type":"client_report"}
{"timestamp":"..."}
The following fields exist:
timestamp
- String | Number, optional. The timestamp of when the client report was created. Must be an ISO DateTime string or a UNIX timestamp. If not sent, the server will assume the current UTC timestamp. In the data model, this is called
received
. discarded_events
- List of outcome objects {
reason
,category
,quantity
}
reason
: A string reason that defines why events were lost.category
: The data category for which the discard reason applies. These are the same data categories used for rate limits.quantity
: The number of events which were lost
The following discard reasons are currently defined for discarded_events
:
queue_overflow
: a SDK internal queue (eg: transport queue) overflowedcache_overflow
: an SDK internal cache (eg: offline event cache) overflowedratelimit_backoff
: the SDK dropped events because an earlier rate limit instructed the SDK to back off.network_error
: events were dropped because of network errors and were not retried.sample_rate
: an event was dropped because of the configured sample rate.before_send
: an event was dropped inbefore_send
event_processor
: an event was dropped by an event processor; may also be used for ignored exceptions / errorssend_error
: an event was dropped because of an error when sending it (eg: 400 response)internal_sdk_error
: an event was dropped due to an internal SDK error (eg: web worker crash)insufficient_data
: an event was dropped due to a lack of data in the event (eg: not enough samples in a profile)backpressure
: an event was dropped due to downsampling caused by the system being under load
In case a reason needs to be added, it also has to be added to the allowlist in snuba.
Additionally the following discard reasons are reserved but there is no expectation that SDKs send these under normal operation:
rate_limited_events
,filtered_events
,filtered_sampling_events
- List of outcome objects {
reason
,category
,quantity
}
These function like discarded_events
but identify events that were rate limited, filtered or filtered by by dynamic sampling at a relay. Client SDKs must never emit these unless they are operating as a relay. The reason codes for these need to match the reason codes that relay would emit directly to Sentry.
When a transaction is dropped, we record an additional span
category event containing the number of the spans inside the transaction plus one. The plus one stems from the fact that Relay extracts an additional span from the transaction. If a transaction contains no spans, we still want to report one dropped transaction and one dropped span. This also applies to transactions that are not sampled. If certain spans are dropped in beforeSendTransaction
, an event processor etc., we also want to report those.
{
"discarded_events": [
{
"reason": "queue_overflow",
"category": "transaction",
"quantity": 1
},
{
"reason": "queue_overflow",
"category": "span",
"quantity": 3 // 2 spans + 1 span (the transaction itself should be counted)
}
]
}
The client reports feature doesn't expect 100 percent correct numbers, and it is acceptable for the SDKs to lose a small number of client reports. The expectation of this feature is to give the users an approximation of specific outcomes. Of course, the SDKs should ensure not dropping too many reports. It is not required, for example:
- to persist the data when an application crashes.
- to move an envelope item with a client report to the next envelope when the cache for envelopes is full.
SDKs are encouraged to reduce needless communication. They shall not send an envelope everytime they record a discarded event. The following approaches are recommendations that can be adjusted if required. The SDKs should keep track of the discard events in the transport. For applications with a low frequency of envelopes to ingest, such as mobile or web apps, the SDK shall attach the discarded events to an already scheduled envelope. We accept the tradeoff that apps rarely sending events will have fewer client reports. When the SDK schedules envelopes continuously, like on the backend, the SDK shall either choose to flush the discarded events periodically or attach them to an already scheduled envelope.
The one who drops an envelope item must record and report it. If the server drops an envelope item, for example, with the response 429, the client SDK must not record this as the server already does. Still, the SDK must record lost envelope items when dropping them itself, for example, caused by an active rate limit. SDKs can put a simple optional check for HTTP status codes in place where any code >= 400
except 429
will be recorded as network_error
. The client SDKs can assume that client reports never get rate limited. The server is minimizing the possibility of client reports getting rate limited, but the SDKs shouldn't worry about this edge case as this feature is best-effort.
SDKs should provide a way to turn sending of client reports on and off. This option is called send_client_reports
or sendClientReports
on SDKs that have already implemented it.
For SDKs still sending legacy events instead of envelopes for backward compatibility with older Sentry servers, the recommendation is to send the client report as a separate envelope or attach it to pending session envelopes.
There is no expectation that such bookkeeping can work transparently for custom transports. Consequently, it's acceptable if client reports are optional for custom transports.
Our documentation is open source and available on GitHub. Your contributions are welcome, whether fixing a typo (drat!) or suggesting an update ("yeah, this would be better").