[SDK] LogRecord attribute limits enforcement#4157
7bdc3c5 to
1769757
Compare
Apply attribute count and value length limits described by the logs SDK spec (https://opentelemetry.io/docs/specs/otel/logs/sdk/#logrecord-limits) during attribute writes. Limits flow from LoggerProvider through LoggerContext to each LogRecord created by Logger::CreateLogRecord.

* Add LogRecordLimits struct with spec defaults: attribute_count_limit = 128 and attribute_value_length_limit = SIZE_MAX (unlimited).
* Recordable gains a virtual SetLogRecordLimits with a no-op default so existing implementations need not change. ReadableLogRecord gains a virtual GetDroppedAttributesCount returning zero by default.
* ReadWriteLogRecord and OtlpLogRecordable enforce the limits in SetAttribute. An attribute beyond attribute_count_limit is dropped and counted as dropped; string and string-array values whose byte length exceeds attribute_value_length_limit are truncated. Truncation is byte-level, mirroring the existing Span attribute behavior. The OTLP path also populates dropped_attributes_count on the proto LogRecord.
* MultiRecordable propagates the limits to every wrapped recordable.
* LoggerContext owns a LogRecordLimits value; Logger calls SetLogRecordLimits on the recordable returned by MakeRecordable before any user attribute writes. A new LoggerProviderFactory::Create overload accepts LogRecordLimits.
* The declarative configuration path (SdkBuilder) wires LogRecordLimitsConfiguration to the runtime LogRecordLimits.

Tests cover ReadWriteLogRecord and OtlpLogRecordable: defaults, count enforcement (including the "replace existing key while at limit must not drop" case), length truncation of strings and string arrays, type selectivity (only string and array-of-string are truncated), the combined count plus length case, and a Logger-level wiring test that verifies the limits configured on LoggerProvider reach the recordable returned by Logger::CreateLogRecord.

Fixes open-telemetry#4126

Signed-off-by: thc1006 <84045975+thc1006@users.noreply.github.com>
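The count and length rules in this commit message can be sketched roughly as follows. This is an illustration only, not the PR's code: `LimitsSketch` and `RecordSketch` are hypothetical names, the record stores string values only, and the only details taken from the message are the defaults (128 attributes, unlimited length), the "drop and count" rule for new keys beyond the cap, and the rule that replacing an existing key at the cap must not drop.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for LogRecordLimits. Only the two default values come
// from the commit message; everything else here is an assumption.
struct LimitsSketch
{
  std::size_t attribute_count_limit        = 128;
  std::size_t attribute_value_length_limit = std::numeric_limits<std::size_t>::max();
};

// Toy record holding string attributes only (a simplification for brevity).
class RecordSketch
{
public:
  void SetLogRecordLimits(const LimitsSketch &limits) noexcept { limits_ = limits; }

  void SetAttribute(const std::string &key, std::string value)
  {
    auto it               = attributes_.find(key);
    const bool is_new_key = (it == attributes_.end());
    if (is_new_key && attributes_.size() >= limits_.attribute_count_limit)
    {
      ++dropped_attributes_count_;  // new key beyond the cap: drop and count it
      return;
    }
    if (value.size() > limits_.attribute_value_length_limit)
    {
      value.resize(limits_.attribute_value_length_limit);  // plain byte-level cut
    }
    if (is_new_key)
    {
      attributes_.emplace(key, std::move(value));
    }
    else
    {
      it->second = std::move(value);  // replacing at the cap is never a drop
    }
  }

  std::size_t AttributeCount() const noexcept { return attributes_.size(); }
  std::size_t DroppedAttributesCount() const noexcept { return dropped_attributes_count_; }
  const std::string &Get(const std::string &key) const { return attributes_.at(key); }

private:
  LimitsSketch limits_;
  std::unordered_map<std::string, std::string> attributes_;
  std::size_t dropped_attributes_count_ = 0;
};
```

With a count limit of 2 and a length limit of 3, writing keys "a", "b", "c" would leave two attributes and one drop, and rewriting "a" with a longer value would store its first three bytes without increasing the drop count.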
1769757 to
e244122
Compare
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #4157 +/- ##
==========================================
- Coverage 82.83% 82.77% -0.06%
==========================================
Files 406 409 +3
Lines 16913 17024 +111
==========================================
+ Hits 14009 14090 +81
- Misses 2904 2934 +30
@proost Hey, thanks for the ping. I went through #4132 and the review thread before pushing #4157. There's a fair bit of substantive work in yours that mine doesn't have: the OTEL_LOGRECORD_* env var integration, the benchmark file, ElasticSearchRecordable enforcement, and the yaml count_limit plumbing. The reason I went ahead with #4157 even after seeing yours is lalitb's comment from yesterday (#4132 (comment)), which was asking to move enforcement out of the base Recordable hot path and let each recordable handle it the way that fits. That's the shape #4157 takes. Recordable just gains a If the maintainers think #4157 is the better base, I'd like to layer the env vars, benchmark, and ES enforcement on top in follow-ups with attribution to your work. If they prefer the #4132 redesign route, happy to close #4157 and review the new version there. @marcalff @lalitb would appreciate your call on which to drive forward. |
|
Thanks for the explanation. I'm always open to feedback, and if there were concerns about the direction of #4132, I would have been happy to discuss and adjust it. I do wish those concerns had been raised directly on the PR earlier, as it would have helped avoid duplicated work. That said, I don't have a strong preference on whose implementation lands. If you would like to continue driving this work, I'm happy to step aside and close #4132 so we don't spend effort maintaining two competing PRs. So would you like to continue driving this work? |
|
@proost You're right and that one's on me. The honest reason I missed #4132: I was searching open issues by If you're comfortable stepping aside, yes I'd like to continue with #4157. After #4157 lands I'd open these as follow-ups:
If you'd rather author any of those follow-ups yourself and have me review, that works equally well. Whatever keeps the work landing. |
|
Thanks for the clarification and the apology. I’m happy for you to continue driving this work, so I’ll close #4132 to avoid duplicate effort. |
const opentelemetry::sdk::resource::Resource *resource_ = nullptr;
const opentelemetry::sdk::instrumentationscope::InstrumentationScope *instrumentation_scope_ =
    nullptr;
const opentelemetry::sdk::logs::LogRecordLimits *limits_ = nullptr;
I think we should avoid storing a pointer to the limits object here. In the normal logger path, CreateLogRecord() sets this from LoggerContext, but the returned LogRecord is an independent unique_ptr. User code can keep that record and call SetAttribute() after the logger/provider/context is gone, which would leave limits_ dangling.
Can we instead copy LogRecordLimits into the recordable instead of storing &limits? The struct is tiny, and it avoids adding a lifetime requirement to LogRecord.
Agreed. Switched to by-value storage in ba8f9047. The default-constructed value carries the spec defaults (count=128, length=unlimited), so a fresh recordable now enforces the spec count cap from construction — happy to revisit that semantic if you'd prefer a no-op-when-unset sentinel instead.
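The lifetime concern raised in this thread can be illustrated with a small sketch. All names here (`LimitsSketch`, `ContextSketch`, `RecordSketch`, `CreateRecordSketch`) are hypothetical stand-ins rather than the SDK's classes; the point is only that a record which copies the limits stays valid after the context that produced it is destroyed, and that a default-constructed record already carries the spec defaults.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <memory>

// Hypothetical limits value; the defaults mirror those quoted in the thread.
struct LimitsSketch
{
  std::size_t attribute_count_limit        = 128;
  std::size_t attribute_value_length_limit = std::numeric_limits<std::size_t>::max();
};

// Stand-in for a context object that owns the configured limits.
class ContextSketch
{
public:
  explicit ContextSketch(LimitsSketch limits) : limits_(limits) {}
  const LimitsSketch &GetLimits() const noexcept { return limits_; }

private:
  LimitsSketch limits_;
};

// Stand-in for a record. Storing a `const LimitsSketch *` here would dangle
// once the context is gone; storing a copy removes that lifetime requirement.
class RecordSketch
{
public:
  void SetLimits(const LimitsSketch &limits) noexcept { limits_ = limits; }
  std::size_t CountLimit() const noexcept { return limits_.attribute_count_limit; }

private:
  LimitsSketch limits_;  // by value: spec defaults apply until SetLimits is called
};

// Mirrors the "create, then set limits before user writes" order described above.
std::unique_ptr<RecordSketch> CreateRecordSketch(const ContextSketch &context)
{
  auto record = std::make_unique<RecordSketch>();
  record->SetLimits(context.GetLimits());
  return record;
}
```

The struct holds two `std::size_t` fields, so the copy is cheap compared with the cost of tracking the context's lifetime from user code.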
…runcate)

Address three review comments from @lalitb on PR open-telemetry#4157:

1. Store LogRecordLimits by value inside ReadWriteLogRecord and OtlpLogRecordable instead of a raw pointer to a LoggerContext-owned object. A LogRecord is handed out as a unique_ptr, so a record outliving the context that produced it would otherwise dereference a dangling pointer when the user calls SetAttribute() later. The default-constructed value already carries the spec defaults (count=128, length=unlimited), so a fresh recordable enforces the spec count cap from construction, matching the PR's "enforcement" contract.
2. Drop the now-redundant `limits_ != nullptr` short-circuit at every enforcement site (4 in total). This also closes the Codecov-reported uncovered branch in otlp_log_recordable.cc.
3. Truncate UTF-8 string attributes at a code-point boundary instead of a raw byte boundary, so an OTLP protobuf string_value produced by truncation stays valid UTF-8 when the input was. Malformed UTF-8 and trailing lead bytes degrade to plain byte truncation. Logic adapted from open-telemetry#4132 with attribution.

While in the same truncation paths, also apply the byte-length cap to raw bytes attributes (`vector<uint8_t>` on the SDK side, AnyValue `bytes_value` on the OTLP side). Both were previously passing through any size, even though the spec applies `attribute_value_length_limit` to bytes attributes as well.

Test changes:
- Rename DefaultsPassThroughWithoutLimitsObject to DefaultRecordEnforcesSpecCountCap (200 attrs in, 128 stored, 72 dropped) to reflect the new spec-correct default behavior.
- Add 3 UTF-8 regression tests on the SDK side (split prevention, exact fit at sequence boundary, malformed input falls back) plus a bytes-truncation test.
- Add 3 mirror tests on the OTLP side, a default-cap test, and a bytes-truncation test.

Refs: open-telemetry#4126
Co-authored-by: Hyeonho Kim <proost@apache.org>
Signed-off-by: thc1006 <84045975+thc1006@users.noreply.github.com>
…are truncate)

Address three review comments from @lalitb on PR open-telemetry#4157, plus a follow-up from @owent (r3434178971).

1. Store LogRecordLimits by value inside ReadWriteLogRecord and OtlpLogRecordable instead of a raw pointer to a LoggerContext-owned object. A LogRecord is handed out as a unique_ptr, so a record outliving the context that produced it would otherwise dereference a dangling pointer when the user calls SetAttribute() later. The default-constructed value already carries the spec defaults (count=128, length=unlimited), so a fresh recordable enforces the spec count cap from construction, matching the PR's "enforcement" contract.
2. Drop the now-redundant `limits_ != nullptr` short-circuit at every enforcement site (4 in total). This also closes the Codecov-reported uncovered branch in otlp_log_recordable.cc.
3. Truncate OTLP string attributes at a UTF-8 code-point boundary instead of a raw byte boundary, so the protobuf string_value produced by truncation stays valid UTF-8 when the input was. Malformed UTF-8 and trailing lead bytes degrade to plain byte truncation. Logic adapted from open-telemetry#4132 with attribution.

The SDK-side ReadWriteLogRecord truncation stays as plain byte cut. The in-memory `OwnedAttributeValue::std::string` variant may legitimately carry raw bytes when constructed from a non-UTF-8 source, so forcing UTF-8 boundary semantics there would over-truncate that case (per @owent's r3434178971, echoing the same point on open-telemetry#4132 r3409677314). Each recordable's truncation strategy now matches its own consumer's wire-format requirement: SDK in-memory has no wire requirement, OTLP protobuf requires valid UTF-8.

While in the same truncation paths, also apply the byte-length cap to raw bytes attributes (`vector<uint8_t>` on the SDK side, AnyValue `bytes_value` on the OTLP side). Both were previously passing through any size, even though the spec applies `attribute_value_length_limit` to bytes attributes as well.

Test changes:
- Rename DefaultsPassThroughWithoutLimitsObject to DefaultRecordEnforcesSpecCountCap (200 attrs in, 128 stored, 72 dropped) to reflect the new spec-correct default behavior.
- Add bytes-truncation tests on both SDK and OTLP sides.
- Add 3 UTF-8 regression tests on the OTLP side only (split prevention, exact fit at sequence boundary; malformed-fallback omitted since the algorithm's seq=1 fallback for invalid continuations is implementation detail rather than wire contract).
- Add a default-cap test on the OTLP side.

Refs: open-telemetry#4126
Co-authored-by: Hyeonho Kim <proost@apache.org>
Signed-off-by: thc1006 <84045975+thc1006@users.noreply.github.com>
ba8f904 to
1612edc
Compare
dbarker left a comment
Thanks for the PR. Please see comments below.
// in range (0x80-0xBF); otherwise the lead is treated as a one-byte unit, so
// malformed input degrades to plain byte truncation. This keeps the resulting
// protobuf `string_value` valid UTF-8 when the input was valid UTF-8.
std::size_t Utf8SafePrefixLength(const std::string &value, std::size_t max_bytes) noexcept
Please move these common utilities to otlp_populate_attribute_utils.<h, cc> and add tests. They will also be needed to enforce the span attribute length limit in a follow-up PR.
Done in c4532cd1. Utf8SafePrefixLength and TruncateProtoAttributeValue (renamed from TruncateProtoStringValue to reflect it covers string_value, bytes_value, and array_value branches) now live as OtlpPopulateAttributeUtils static methods. New exporters/otlp/test/otlp_populate_attribute_utils_test.cc adds 14 direct unit tests — UTF-8 boundary preservation, malformed-input fallback, truncated-tail fallback, invalid lead bytes, bytes_value, array recursion, null safety. The future SpanLimits PR can call both helpers from the OTLP trace recordable.
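For readers following this thread, a code-point-aware prefix computation along the lines described in the quoted comment might look like the sketch below. It is not the PR's implementation: the function name `Utf8SafePrefixLengthSketch` is hypothetical, and the forward walk, the lead-byte classification, and the "treat as one byte on invalid continuation" fallback are assumptions reconstructed from the comment and commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the largest prefix length <= max_bytes that does not split a
// well-formed UTF-8 sequence. Malformed sequences are consumed one byte at a
// time, so malformed input degrades to plain byte truncation.
std::size_t Utf8SafePrefixLengthSketch(const std::string &value, std::size_t max_bytes) noexcept
{
  if (value.size() <= max_bytes)
  {
    return value.size();  // nothing to truncate
  }

  std::size_t pos = 0;
  while (pos < max_bytes)
  {
    const unsigned char lead = static_cast<unsigned char>(value[pos]);

    // Expected sequence length from the lead byte; anything else counts as 1.
    std::size_t seq = 1;
    if ((lead & 0xE0) == 0xC0)
      seq = 2;
    else if ((lead & 0xF0) == 0xE0)
      seq = 3;
    else if ((lead & 0xF8) == 0xF0)
      seq = 4;

    // Every continuation byte must exist and lie in 0x80-0xBF.
    for (std::size_t k = 1; k < seq; ++k)
    {
      if (pos + k >= value.size() ||
          (static_cast<unsigned char>(value[pos + k]) & 0xC0) != 0x80)
      {
        seq = 1;  // malformed: fall back to a one-byte unit
        break;
      }
    }

    if (pos + seq > max_bytes)
    {
      break;  // the next sequence would cross the budget
    }
    pos += seq;
  }
  return pos;
}
```

A caller would then truncate with `value.resize(Utf8SafePrefixLengthSketch(value, limit))`. The sketch does not reject overlong encodings or surrogate code points; it only avoids cutting inside a structurally complete sequence.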
// boundary semantics here would over-truncate that case. Exporters that
// require a valid-UTF-8 wire format (OTLP protobuf, ES JSON) apply their
// own UTF-8-aware truncation at the recordable boundary.
void TruncateStringValue(opentelemetry::sdk::common::OwnedAttributeValue &value,
This will also be needed to enforce attribute length limits for spans. Consider moving it to sdk/common/attribute_utils.<h, cc>.
Done in c4532cd1. The SDK byte-length truncation is now sdk::common::TruncateAttributeValueByteLength in sdk/include/opentelemetry/sdk/common/attribute_utils.h (renamed since it covers string, string-array, AND bytes variants). Five new unit tests in sdk/test/common/attribute_utils_test.cc covering all three variants plus shorter-than-budget and non-string types. The future SpanLimits PR can reuse this from ReadWriteSpanData.

Implementation note: I put it inline in the existing header rather than splitting to a new .cc, since attribute_utils.h is currently header-only (no sdk/src/common/attribute_utils.cc exists) and the helper is ~15 LOC, matching the precedent of sdk/common/custom_hash_equality.h. Happy to split to a new .cc if you'd prefer the standard pattern.
std::string safe_key(key);

if (attributes_map_.size() >= limits_.attribute_count_limit &&
    attributes_map_.find(safe_key) == attributes_map_.end())
At the limit, this may result in two lookups for an existing attribute (here and line 213 below). Can this be a single lookup for an existing key?
Done in c4532cd1. SetAttribute now does .find() once, then conditional .emplace() only when inserting a new key. Existing-key replacement and new-key insertion each cost a single hash lookup. The end-to-end behavior contract (count cap drops new keys, replace existing key doesn't increment dropped count) is still verified by CountLimitDropsExcessAttributes and CountLimitReplaceExistingKeyDoesNotDrop.
…dupe SetAttribute lookup)

Address @dbarker's three review comments on PR open-telemetry#4157.

* r3442854800: Extract Utf8SafePrefixLength and TruncateProtoAttributeValue from the anonymous namespace in otlp_log_recordable.cc into OtlpPopulateAttributeUtils as static methods, with new direct unit tests in otlp_populate_attribute_utils_test.cc. The upcoming SpanLimits PR will reuse these from the OTLP trace recordable. The OTLP helper that was previously TruncateProtoStringValue is renamed TruncateProtoAttributeValue to reflect that it covers string_value, bytes_value, and array_value branches.
* r3443005793: Extract the SDK byte-length truncation helper into sdk::common::TruncateAttributeValueByteLength (inline, declared in the existing sdk/common/attribute_utils.h), with new direct unit tests in attribute_utils_test.cc. The upcoming SpanLimits PR will reuse this from ReadWriteSpanData. The new name reflects that the helper covers string, string-array, AND bytes variants rather than only strings.
* r3443034336: Rewrite ReadWriteLogRecord::SetAttribute to use a single unordered_map lookup. The previous code did .find() to gate the count cap, then operator[] to fetch-or-insert; the new code does .find() followed by conditional .emplace(), so existing-key replacement and new-key insertion each cost one hash lookup.

Refs: open-telemetry#4126
Signed-off-by: thc1006 <84045975+thc1006@users.noreply.github.com>
inline void TruncateAttributeValueByteLength(OwnedAttributeValue &value,
                                             std::size_t max_length) noexcept
{
  if (nostd::holds_alternative<std::string>(value))
Please use nostd::get_if, since this is a noexcept method and nostd::get throws.
Done in 31449913. Rewrote with nostd::get_if (returns nullptr if the alternative doesn't hold) so the noexcept contract is statically honored without any potentially-throwing call. Bonus: the same commit also restores the Bazel link of the new otlp_populate_attribute_utils_test target (missing //sdk/src/metrics transitive dep), which should resolve the 8 Bazel CI failures triggered on the previous push.
Two small fixes on top of c4532cd:

* Address @dbarker's r3444034234: rewrite sdk::common::TruncateAttributeValueByteLength to dispatch on the variant via nostd::get_if (returns nullptr if the alternative does not hold) instead of nostd::holds_alternative + nostd::get. The helper is declared noexcept; nostd::get throws when the alternative does not match, which would invoke std::terminate even though the preceding holds_alternative check makes that path unreachable in practice. The get_if rewrite removes the throwing call entirely so the noexcept contract is statically honored.
* Restore the Bazel link of the new otlp_populate_attribute_utils_test target by adding //sdk/src/metrics to its deps, matching the existing otlp_log_recordable_test target. The new test target links against :otlp_recordable, which transitively references sdk::metrics::AdaptingCircularBufferCounter symbols; Bazel's strict layering requires the dep to be declared at the cc_test level. CMake did not catch this because its default link aggregates the whole library.

Refs: open-telemetry#4126
Signed-off-by: thc1006 <84045975+thc1006@users.noreply.github.com>
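The get_if dispatch described in this commit can be sketched with the standard library's `std::variant`, used here as a stand-in for the SDK's nostd variant. The alias `AttributeValueSketch` and the function `TruncateByteLengthSketch` are hypothetical, and the real OwnedAttributeValue has more alternatives than shown; the sketch only illustrates that `get_if` returns a pointer (or nullptr) and never throws, so no potentially-throwing accessor appears inside the noexcept function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

// Reduced, hypothetical attribute value type for illustration.
using AttributeValueSketch = std::variant<bool,
                                          std::int64_t,
                                          double,
                                          std::string,
                                          std::vector<std::string>,
                                          std::vector<std::uint8_t>>;

// Byte-length truncation for string, string-array, and bytes alternatives.
// Other alternatives pass through untouched. Shrinking via resize() does not
// allocate, so nothing in this body is expected to throw.
inline void TruncateByteLengthSketch(AttributeValueSketch &value, std::size_t max_length) noexcept
{
  if (auto *str = std::get_if<std::string>(&value))
  {
    if (str->size() > max_length)
    {
      str->resize(max_length);
    }
  }
  else if (auto *array = std::get_if<std::vector<std::string>>(&value))
  {
    for (auto &element : *array)
    {
      if (element.size() > max_length)
      {
        element.resize(max_length);  // the limit applies per element
      }
    }
  }
  else if (auto *bytes = std::get_if<std::vector<std::uint8_t>>(&value))
  {
    if (bytes->size() > max_length)
    {
      bytes->resize(max_length);
    }
  }
}
```

Compared with a `holds_alternative` check followed by `get`, this form also reads each alternative once and keeps the pointer for the mutation.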
Fixes #4126.
Implements the LogRecord attribute count and value length limits described by the logs SDK spec (https://opentelemetry.io/docs/specs/otel/logs/sdk/#logrecord-limits). Limits flow from LoggerProvider through LoggerContext to each LogRecord created by Logger::CreateLogRecord. The infrastructure existed in the declarative configuration tier (LogRecordLimitsConfiguration) but was never wired into the runtime pipeline.
What changed
New header sdk/include/opentelemetry/sdk/logs/log_record_limits.h:
* LogRecordLimits struct with spec defaults attribute_count_limit = 128 and attribute_value_length_limit = SIZE_MAX (unlimited).
SDK Recordable hierarchy:
* Recordable gains a non-pure virtual SetLogRecordLimits with a no-op default body. Existing implementations that do not enforce limits inherit the no-op and compile unchanged. The virtual is appended at the end of the vtable to keep the change additive.
* ReadableLogRecord gains a non-pure virtual GetDroppedAttributesCount returning zero by default.
* ReadWriteLogRecord overrides both. SetAttribute checks the count limit before inserting and truncates string / array-of-string values whose length exceeds the configured limit. Truncation is byte level, mirroring the existing Span attribute behavior.
* MultiRecordable propagates SetLogRecordLimits to every wrapped recordable.
OTLP exporter:
* OtlpLogRecordable overrides SetLogRecordLimits. SetAttribute drops attributes beyond the count limit and increments the proto LogRecord.dropped_attributes_count field. Strings and string-array values are truncated after OtlpPopulateAttributeUtils::PopulateAttribute populates the proto AnyValue.
Wiring:
* LoggerContext owns a LogRecordLimits value and exposes GetLogRecordLimits(). A new fourth parameter is appended to the existing constructor with a default-constructed default, so existing call sites compile unchanged.
* Logger::CreateLogRecord (both ABI v1 and ABI v2 variants) calls recordable->SetLogRecordLimits(context_->GetLogRecordLimits()) immediately after MakeRecordable, before any user attribute write.
* LoggerProviderFactory::Create overload accepts LogRecordLimits and builds a LoggerContext with those limits internally.
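The propagation path described above (a no-op virtual on the base, an override on enforcing recordables, and fan-out through a multi-recordable) could be sketched as follows. Every class name here is a hypothetical stand-in with a `Sketch` suffix, and the signatures are assumptions; only the shape of the design comes from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical limits value with the defaults quoted in the description.
struct LimitsSketch
{
  std::size_t attribute_count_limit        = 128;
  std::size_t attribute_value_length_limit = std::numeric_limits<std::size_t>::max();
};

class RecordableSketch
{
public:
  virtual ~RecordableSketch() = default;
  // Non-pure virtual with a no-op body, so existing subclasses compile unchanged.
  virtual void SetLogRecordLimits(const LimitsSketch & /*limits*/) noexcept {}
};

// A recordable that opts in to enforcement by storing a copy of the limits.
class EnforcingRecordableSketch : public RecordableSketch
{
public:
  void SetLogRecordLimits(const LimitsSketch &limits) noexcept override { limits_ = limits; }
  const LimitsSketch &Limits() const noexcept { return limits_; }

private:
  LimitsSketch limits_;
};

// A recordable that predates the limits API and simply inherits the no-op.
class LegacyRecordableSketch : public RecordableSketch
{};

// Fan-out wrapper: forwards the limits to every wrapped recordable.
class MultiRecordableSketch : public RecordableSketch
{
public:
  void Add(std::unique_ptr<RecordableSketch> recordable)
  {
    children_.push_back(std::move(recordable));
  }

  void SetLogRecordLimits(const LimitsSketch &limits) noexcept override
  {
    for (auto &child : children_)
    {
      child->SetLogRecordLimits(limits);
    }
  }

private:
  std::vector<std::unique_ptr<RecordableSketch>> children_;
};
```

In this arrangement the logger only needs to call `SetLogRecordLimits` once on whatever recordable it created; whether anything is enforced is decided by each concrete recordable.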
Declarative configuration:
SdkBuilder::CreateLoggerProvider now maps LogRecordLimitsConfiguration from the parsed LoggerProviderConfiguration to the runtime LogRecordLimits and passes them to the new factory overload. The existing FIXME-SDK comment about wiring limits is removed.
Tests
sdk/test/logs/log_record_limits_test.cc (new):
* limits object is supplied.
* counted. Replacing an existing key while at the limit must not drop.
* int / double / bool pass through.
* TrackingRecordable produced by a TrackingProcessor confirms that the limits configured on LoggerProvider reach the recordable returned by Logger::CreateLogRecord.
exporters/otlp/test/otlp_log_recordable_test.cc augmented with four cases that verify the proto dropped_attributes_count field and the string / array truncation logic.
Verification
Built and tested in the otelcpp-dev container:
* log_record_limits_test 9/9, otlp_log_recordable_test 23/23, logger_sdk_test 9/9, logger_provider_sdk_test 9/9, simple_log_record_processor_test 10/10, batch_log_record_processor_test 13/13. No regression.
* log_record_limits_test 9/9.
* -fno-rtti (Bazel nortti CI mirror): log_record_limits_test 9/9, otlp_log_recordable_test 23/23.
* from a local clang-22 run.
* git diff --check whitespace: clean.
Known limitations
* ElasticSearchRecordable is not modified in this PR. It inherits the no-op default and does not enforce limits. A follow-up PR can add enforcement if desired.
* a multibyte UTF-8 sequence will be truncated mid-sequence, matching the existing Span attribute behavior in OtlpRecordable.
* (ParseLogRecordLimitsConfiguration) keeps the existing attribute_value_length_limit = 4096 fallback for callers that opt into the limits: block without specifying a length. Aligning that fallback with the spec default (unlimited) is left to a follow-up.