datamodel-code-generator leaks nested subobject properties into parent class #84

@simontaurus

Problem

When a JSON Schema property is an array whose items schema contains allOf references, datamodel-code-generator (observed with v0.51.0 and v0.54.1) incorrectly promotes properties of the nested item schema into the parent class.

Example

ComposedUnit schema has:

  • Top-level properties: type, main_symbol, conversion_factor_from_si, ucum_codes, factor_units, system_of_quantities_and_units, composed_units
  • composed_units.items (ComposedUnitElement) has its own properties including osw_id, conversion_factor_to_main_unit
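For orientation, the shape described above can be sketched as a heavily reduced schema. This is an illustration only, not the real ComposedUnit schema: the `$ref` targets (`Item.json`, `OntologyRelated.json`) are assumed file names, and most properties are omitted.

```python
# Reduced, hypothetical stand-in for the ComposedUnit schema.
# The $ref file names are assumptions made for this sketch.
COMPOSED_UNIT = {
    "title": "ComposedUnit",
    "type": "object",
    "allOf": [{"$ref": "Item.json"}, {"$ref": "OntologyRelated.json"}],
    "properties": {
        "main_symbol": {"type": "string"},
        "conversion_factor_from_si": {"type": "number"},
        "composed_units": {
            "type": "array",
            # Nested item schema with its own allOf and its own properties.
            "items": {
                "title": "ComposedUnitElement",
                "type": "object",
                "allOf": [{"$ref": "OntologyRelated.json"}],
                "properties": {
                    "osw_id": {"type": "string"},
                    "conversion_factor_to_main_unit": {"type": "number"},
                },
            },
        },
    },
}

# Property names declared on the parent vs. on the nested item schema.
top_level = set(COMPOSED_UNIT["properties"])
nested = set(COMPOSED_UNIT["properties"]["composed_units"]["items"]["properties"])
```

In this reduced form the two sets are disjoint, so any `nested` name appearing as a field of the generated `ComposedUnit` class would be a leak.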

Expected:

class ComposedUnit(Item, OntologyRelated):
    # Only top-level properties
    main_symbol: str
    conversion_factor_from_si: float
    factor_units: list[FactorUnit]
    composed_units: list[ComposedUnitElement]

class ComposedUnitElement(OntologyRelated):
    osw_id: str
    conversion_factor_to_main_unit: float
    factor_units: list[ComposedFactorUnit]

Actual:

class ComposedUnit(Item, OntologyRelated):
    main_symbol: str
    conversion_factor_from_si: float
    factor_units: list[FactorUnit]
    composed_units: list[ComposedUnitElement]
    osw_id: str                          # LEAKED from ComposedUnitElement
    conversion_factor_to_main_unit: float # LEAKED from ComposedUnitElement

Root cause

The temporary schema files written by osw-python's _fetch_schema are correct (no leaked properties at the top level), so the leak happens inside datamodel-code-generator, during allOf resolution of nested items schemas. The allOf reference to OntologyRelated inside composed_units.items causes the properties resolved for the item schema to bubble up into the parent class.

Same issue affects PrefixUnit (leaks from prefix_units.items) and QuantityUnit (leaks from composed_units.items).
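To find affected classes without reading every generated file by hand, one option is to compare the fields declared in a generated class body against the schema. The following is a minimal sketch using only the standard library; the helper names (`class_fields`, `nested_item_properties`, `leaked_fields`) are hypothetical and not part of osw-python or datamodel-code-generator, and the schema is a reduced stand-in.

```python
import ast


def class_fields(source: str, class_name: str) -> set:
    """Names of annotated fields declared directly in the class body."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return {
                stmt.target.id
                for stmt in node.body
                if isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name)
            }
    raise ValueError(f"class {class_name!r} not found")


def nested_item_properties(schema: dict) -> set:
    """Property names declared inline on any `items` schema one level down."""
    names = set()
    for prop in schema.get("properties", {}).values():
        items = prop.get("items")
        if isinstance(items, dict):
            names |= set(items.get("properties", {}))
    return names


def leaked_fields(schema: dict, source: str, class_name: str) -> set:
    """Fields that exist only on a nested item schema but show up in the parent class."""
    own = set(schema.get("properties", {}))
    return (class_fields(source, class_name) & nested_item_properties(schema)) - own


# Reduced stand-in schema; factor_units exists on both levels and must not be flagged.
SCHEMA = {
    "properties": {
        "main_symbol": {"type": "string"},
        "factor_units": {"type": "array"},
        "composed_units": {
            "type": "array",
            "items": {
                "properties": {
                    "osw_id": {"type": "string"},
                    "conversion_factor_to_main_unit": {"type": "number"},
                    "factor_units": {"type": "array"},
                }
            },
        },
    }
}

GENERATED = '''
class ComposedUnit(Item, OntologyRelated):
    main_symbol: str
    factor_units: list[FactorUnit]
    composed_units: list[ComposedUnitElement]
    osw_id: str
    conversion_factor_to_main_unit: float
'''

leaks = leaked_fields(SCHEMA, GENERATED, "ComposedUnit")
```

The check only looks at inline `items` schemas one level deep and does not resolve `$ref`, so it would need extending for schemas where the item type is referenced rather than inlined.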

Additional allOf issues in datamodel-code-generator

In the same codebase, we also observed:

  1. Dropped allOf base class: ComposedUnit(Item, OntologyRelated) is generated as ComposedUnit(OntologyRelated) - the Item base is dropped despite both being in allOf
  2. Wrong type default: ComposedUnit gets QuantityUnit's type default instead of its own

Both are non-deterministic and seem related to schema processing order.
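Because the dropped base class does not show up on every run, a small check that can be repeated after each generation may help to catch it. This is a sketch under the assumption that the expected base names are known up front (for example from the titles of the allOf entries); `class_bases` and `missing_bases` are hypothetical helpers.

```python
import ast


def class_bases(source: str, class_name: str) -> list:
    """Base class expressions of the named class, as source text."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return [ast.unparse(base) for base in node.bases]
    raise ValueError(f"class {class_name!r} not found")


def missing_bases(source: str, class_name: str, expected: list) -> list:
    """Expected bases that are absent from the generated class, in expected order."""
    found = set(class_bases(source, class_name))
    return [name for name in expected if name not in found]


# Generated output as described in item 1: the Item base has been dropped.
GENERATED = "class ComposedUnit(OntologyRelated):\n    main_symbol: str\n"

missing = missing_bases(GENERATED, "ComposedUnit", ["Item", "OntologyRelated"])
```

Running this over the output of several consecutive generator runs would show whether the result depends on processing order.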

Current workaround

Manually fixing the generated _model.py files, plus post-processing in osw-python-package-generator.
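The post-processing step could look roughly like the sketch below. It is not the actual code in osw-python-package-generator; `strip_fields` is a hypothetical helper that removes named annotated fields from one class by dropping their source lines.

```python
import ast


def strip_fields(source: str, class_name: str, leaked: set) -> str:
    """Return source with the given annotated fields removed from one class."""
    drop = set()  # 1-based line numbers to delete
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            for stmt in node.body:
                if (
                    isinstance(stmt, ast.AnnAssign)
                    and isinstance(stmt.target, ast.Name)
                    and stmt.target.id in leaked
                ):
                    # Cover multi-line field definitions as well.
                    drop.update(range(stmt.lineno, stmt.end_lineno + 1))
    lines = source.splitlines(keepends=True)
    return "".join(line for i, line in enumerate(lines, start=1) if i not in drop)


GENERATED = '''class ComposedUnit(Item, OntologyRelated):
    main_symbol: str
    composed_units: list[ComposedUnitElement]
    osw_id: str                          # LEAKED from ComposedUnitElement
    conversion_factor_to_main_unit: float # LEAKED from ComposedUnitElement
'''

fixed = strip_fields(GENERATED, "ComposedUnit", {"osw_id", "conversion_factor_to_main_unit"})
```

Whole lines are removed, including trailing comments. If every field of a class were stripped, the class body would be left empty and a `pass` would have to be added; the sketch does not handle that case.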

Affected versions

  • datamodel-code-generator 0.51.0 and 0.54.1
  • oold-python (uses datamodel-code-generator with allof_class_hierarchy=Always, reuse_model=True, use_title_as_name=True)
