Expand attribute value type to support complex values everywhere and cleanup surrounding code#5266
Open
DylanRussell wants to merge 22 commits into
Open
Expand attribute value type to support complex values everywhere and cleanup surrounding code#5266DylanRussell wants to merge 22 commits into
attribute value type to support complex values everywhere and cleanup surrounding code#5266DylanRussell wants to merge 22 commits into
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR expands OpenTelemetry Python’s attribute value model across the API, SDK, and OTLP exporters to support “complex” values everywhere (e.g., None, heterogeneous arrays, and maps), aligning behavior with the referenced OTEP and updating surrounding utilities/tests accordingly.
Changes:
- Broadens
AnyValue/AttributeValue/Attributestyping and updates public API surfaces (tracing/logs/events/metrics) to accept complex attribute values. - Refactors attribute cleaning/storage (
BoundedAttributes) and metric attribute-keying (_hash_attributes) to handle complex values consistently. - Updates OTLP proto/json encoding to represent
Noneas an emptyAnyValueand adjusts unit tests/benchmarks for the new semantics.
Reviewed changes
Copilot reviewed 31 out of 31 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| opentelemetry-sdk/tests/trace/test_trace.py | Updates span attribute/event tests for complex values and bytes behavior. |
| opentelemetry-sdk/tests/resources/test_resources.py | Updates resource attribute validation tests for new cleaning/stringify behavior. |
| opentelemetry-sdk/tests/metrics/test_view_instrument_match.py | Updates metrics tests to use _hash_attributes aggregation keys. |
| opentelemetry-sdk/tests/metrics/test_measurement_consumer.py | Adjusts a concurrency test to pass attributes explicitly. |
| opentelemetry-sdk/src/opentelemetry/sdk/util/instrumentation.py | Switches instrumentation scope attributes typing to Attributes. |
| opentelemetry-sdk/src/opentelemetry/sdk/resources/init.py | Aligns resource attribute typing/imports with new Attributes definition. |
| opentelemetry-sdk/src/opentelemetry/sdk/metrics/_internal/_view_instrument_match.py | Introduces _hash_attributes and deep-copies measurement attributes prior to aggregation. |
| opentelemetry-sdk/src/opentelemetry/sdk/environment_variables/init.py | Clarifies docs for attribute value length limits with string/bytes focus. |
| opentelemetry-sdk/src/opentelemetry/sdk/_logs/_internal/_exceptions.py | Updates exception-attribute merging to the unified Attributes type. |
| opentelemetry-sdk/src/opentelemetry/sdk/_logs/_internal/init.py | Updates logging translation checks/warnings for AnyValue casting. |
| opentelemetry-sdk/src/opentelemetry/sdk/_events/init.py | Updates event logger provider attribute typing to Attributes. |
| opentelemetry-sdk/benchmarks/metrics/test_benchmark_metrics_histogram.py | Expands benchmarks to cover complex/mapping/array attribute shapes. |
| opentelemetry-api/tests/logs/test_proxy.py | Updates logs proxy tests to Attributes. |
| opentelemetry-api/tests/events/test_proxy_event.py | Updates events proxy tests to Attributes. |
| opentelemetry-api/tests/attributes/test_attributes.py | Reworks attribute tests around new _clean_attribute_value/BoundedAttributes. |
| opentelemetry-api/src/opentelemetry/util/types.py | Redefines AnyValue/AttributeValue and widens Attributes accordingly; removes _ExtendedAttributes. |
| opentelemetry-api/src/opentelemetry/trace/span.py | Updates span API type hints to use collections.abc.Mapping and types.Attributes in NoOp implementation. |
| opentelemetry-api/src/opentelemetry/attributes/init.py | Major refactor: new recursive cleaner and BoundedAttributes as a dict subclass. |
| opentelemetry-api/src/opentelemetry/_logs/_internal/init.py | Updates log API typing from _ExtendedAttributes to Attributes. |
| opentelemetry-api/src/opentelemetry/_events/init.py | Updates event API typing from _ExtendedAttributes to Attributes. |
| exporter/opentelemetry-exporter-otlp-proto-grpc/benchmarks/test_benchmark_trace_exporter.py | Updates patch target to TraceServiceStub. |
| exporter/opentelemetry-exporter-otlp-proto-common/tests/test_log_encoder.py | Refactors OTLP proto log encoder tests around new null/AnyValue behavior. |
| exporter/opentelemetry-exporter-otlp-proto-common/tests/test_attribute_encoder.py | Adjusts encoder failure tests to use an unencodable object. |
| exporter/opentelemetry-exporter-otlp-proto-common/src/opentelemetry/exporter/otlp/proto/common/_internal/_log_encoder/init.py | Stops using allow_null; encodes null as empty AnyValue. |
| exporter/opentelemetry-exporter-otlp-proto-common/src/opentelemetry/exporter/otlp/proto/common/_internal/init.py | Simplifies _encode_value and _encode_attributes around null-as-empty-AnyValue. |
| exporter/opentelemetry-exporter-otlp-json-common/tests/test_log_encoder.py | Updates JSON log encoder tests for null-as-empty-AnyValue body. |
| exporter/opentelemetry-exporter-otlp-json-common/tests/test_common_encoder.py | Updates JSON common encoder tests for null and unencodable values. |
| exporter/opentelemetry-exporter-otlp-json-common/src/opentelemetry/exporter/otlp/json/common/_internal/_log_encoder/init.py | Removes allow_null usage in JSON log encoding. |
| exporter/opentelemetry-exporter-otlp-json-common/src/opentelemetry/exporter/otlp/json/common/_internal/init.py | Simplifies _encode_value and _encode_attributes around null-as-empty-AnyValue. |
| docs/conf.py | Updates Sphinx type-hint resolution aliasing; now also touches metrics internals. |
| .changelog/5266.added | Adds changelog entry for extended/complex attribute support across packages. |
Comments suppressed due to low confidence (1)
opentelemetry-api/src/opentelemetry/trace/span.py:96
- The note in
Span.set_attributessays the behavior ofNonevalue attributes is undefined, but this PR explicitly adds support forNone/null attribute values across the API/SDK. Please update the docstring to reflect the new supported semantics (even ifNoneremains discouraged).
def set_attributes(
self, attributes: Mapping[str, types.AttributeValue]
) -> None:
"""Sets Attributes.
Sets Attributes with the key and value passed as arguments dict.
Note: The behavior of `None` value attributes is undefined, and hence
strongly discouraged. It is also preferred to set attributes at span
creation, instead of calling this method later since samplers can only
consider information already present during span creation.
"""
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+106
to
110
| class BoundedAttributes(dict): | ||
| """A dict with a fixed max capacity which cleans values to ensure they are valid attribute values. | ||
|
|
||
| An attribute value is valid if it is an AnyValue. | ||
| An attribute needs cleansing if: | ||
| - Its length is greater than the maximum allowed length. | ||
| """ | ||
|
|
||
| if not (key and isinstance(key, str)): | ||
| _logger.warning("invalid key `%s`. must be non-empty string.", key) | ||
| return None | ||
|
|
||
| try: | ||
| return _clean_extended_attribute_value(value, max_len=max_len) | ||
| except TypeError as exception: | ||
| _logger.warning("Attribute %s: %s", key, exception) | ||
| return None | ||
|
|
||
|
|
||
| def _clean_attribute_value( | ||
| value: types.AttributeValue, limit: int | None | ||
| ) -> types.AttributeValue | None: | ||
| if value is None: | ||
| return None | ||
|
|
||
| if isinstance(value, bytes): | ||
| try: | ||
| value = value.decode() | ||
| except UnicodeDecodeError: | ||
| _logger.warning("Byte attribute could not be decoded.") | ||
| return None | ||
|
|
||
| if limit is not None and isinstance(value, str): | ||
| value = value[:limit] | ||
| return value | ||
|
|
||
|
|
||
| class BoundedAttributes(MutableMapping): # type: ignore | ||
| """An ordered dict with a fixed max capacity. | ||
|
|
||
| Oldest elements are dropped when the dict is full and a new element is | ||
| added. | ||
| When the dict is full and a new element is added, the oldest element is dropped. | ||
| """ |
Comment on lines
+90
to
+94
| val, max_string_value_length | ||
| ) | ||
| element_type = type(element) # type: ignore | ||
|
|
||
| # The type of the sequence must be homogeneous. The first non-None | ||
| # element determines the type of the sequence | ||
| if sequence_first_valid_type is None: | ||
| sequence_first_valid_type = element_type | ||
| # use equality instead of isinstance as isinstance(True, int) evaluates to True | ||
| elif element_type != sequence_first_valid_type: | ||
| _logger.warning( | ||
| "Mixed types %s and %s in attribute value sequence", | ||
| sequence_first_valid_type.__name__, | ||
| type(element).__name__, | ||
| ) | ||
| return None | ||
|
|
||
| cleaned_seq.append(element) | ||
|
|
||
| # Freeze mutable sequences defensively | ||
| return tuple(cleaned_seq) | ||
|
|
||
| # Some applications such as Django add values to log records whose types fall outside the | ||
| # primitive types and `_VALID_ANY_VALUE_TYPES`, i.e., they are not of type `AnyValue`. | ||
| # Rather than attempt to whitelist every possible instrumentation, we stringify those values here | ||
| # so they can still be represented as attributes, falling back to the original TypeError only if | ||
| # converting to string raises. | ||
| ) != _InvalidAttributeValue.INVALID_VALUE: | ||
| cleaned_mapping[key] = cleaned_value | ||
| return cleaned_mapping |
Comment on lines
206
to
207
| # Assign _dict directly to avoid re-cleaning already clean values | ||
| # and to bypass the immutability guard in __setitem__ |
Comment on lines
+107
to
+111
| attributes = {} | ||
| if measurement.attributes: | ||
| # Deepcopy to prevent the user from modifying the attributes after the | ||
| # measurement has been consumed. | ||
| attributes = copy.deepcopy(measurement.attributes) |
Comment on lines
+29
to
35
| # TODO: Is there a better way to do this ? I have to do this re-import thing | ||
| # in like a dozen places. | ||
| # Provide AnyValue in opentelemetry.attributes module's namespace so the | ||
| # "AnyValue" forward reference in opentelemetry.util.types._ExtendedAttributes | ||
| # resolves when sphinx_autodoc_typehints calls typing.get_type_hints() on | ||
| # BoundedAttributes (whose __globals__ is the attributes module). Docs-only. | ||
| import opentelemetry.attributes # noqa: E402 |
Comment on lines
+621
to
625
| _logger.warning( | ||
| "LogRecord.msg is of type %s which is not a valid AnyValue, attempting to cast it to a string.", | ||
| type(record.msg), | ||
| ) | ||
| body = str(record.msg) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
This PR updates the code to support complex values everywhere as described in this OTEP -- basically it adds support for
Nonetype, heterogeneous arrays, and maps as attribute values.I cleaned up the surrounding code, including refactoring
BoundedAttributes.. this was in part possible because we no longer have to support 2 sets of attribute values.For the
bytestype I just pass that through and don't try to decode to a string. We don't know how bytes were encoded. TheAnyValueproto has abytesfield that our encoders put bytes into.. This matches what we do currently forExtendedAttributesbut not what we do for regularAttributes, the current code tries to utf-8 decode to a string.I updated the encoding of
Noneso that it is always encoded as an emptyAnyValue.. I believe this is in-line with the spec: open-telemetry/opentelemetry-specification#4392 - previously we would do this for Arrays but not Maps and not when None is encoded as a primitive..I added a warning log when we cast types to
str()as a fallback (#4808 has some context on why we do that) -- I think instrumentations should be passing a well formed AnyValue type intoset_attribute, we don't know the best way to cast these unknown types..One thing that's missing from the spec is if leaf nodes should count towards the max attribute total count, I think that is the intent but it was not clearly documented (see #4587 and #4587) -- I can clarify and clean that up in a follow up PR though, as this one is already pretty big..
I think the public symbols changes are acceptable --- the methods being "removed" from
BoundedAttributesstill exist in the underlyingdictclass. The changes to the types inopentelemetry-apiare additive non-breaking changes.Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Unit tests
Does This PR Require a Contrib Repo Change?
Yes I had to submit open-telemetry/opentelemetry-python-contrib#4646 -- if someone attempts to use an old (anything less than the next release of)
opentelemetry-instrumentation-loggingwith the next release ofopentelemetry-apiit will not work.. I'm confused about what to do to resolve that issue though..Checklist: