Advance the facade to the ecosystem-pin-2026-06-22a MEOS surface #28

Open
estebanzimanyi wants to merge 23 commits into MobilityDB:main from estebanzimanyi:feat/facade-surface-22a

@estebanzimanyi

Regenerates codegen/input/meos-idl.json and the generated facade from the MEOS-API catalog at ecosystem-pin-2026-06-22a (4463 functions, 89 structs, 20 enums), and absorbs the two MEOS surface changes since the pin-15c regen.

Generator: managed wrapper for the value out-param family

The *_value_at_timestamptz accessors renamed their out-param from result to value. FunctionsGenerator now recognises value as an output-result parameter, so it emits the managed three-argument wrapper (allocate the buffer, call the boolean native, return the typed value) uniformly for tbool, tint, tfloat, tbigint, ttext, tcbuffer, tjsonb, and tnpoint. Detection is gated on a pointer C type so a by-value Datum input named value (e.g. contains_set_value) is never mistaken for an out-param, and the forwarded native argument is decoupled from the parameter name.
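A minimal sketch of the wrapper shape described above, for illustration only. It is not the generated code: `ValueAtNative`, `ManagedValueWrapper`, and `isValueOutParam` are hypothetical names, a `ByteBuffer` stands in for the jnr-ffi memory the real wrapper would allocate, and the detection rule is simplified to "named value and ends in `*`".

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical stand-in for a generated boolean native with a trailing out-param. */
interface ValueAtNative {
    // Returns true and writes the value into `out` when the temporal is defined at `t`.
    boolean call(long temporalHandle, long t, boolean strict, ByteBuffer out);
}

final class ManagedValueWrapper {
    /** Three-argument managed wrapper: allocate the buffer, call the native, decode. */
    static Double valueAt(ValueAtNative nativeFn, long temporalHandle, long t, boolean strict) {
        ByteBuffer out = ByteBuffer.allocate(Double.BYTES).order(ByteOrder.nativeOrder());
        if (!nativeFn.call(temporalHandle, t, strict, out)) {
            return null; // no value at this timestamp
        }
        return out.getDouble(0); // the value sits at offset 0
    }

    /** Simplified detection: only a pointer-typed parameter named "value" is an out-param. */
    static boolean isValueOutParam(String name, String cType) {
        return name.equals("value") && cType.trim().endsWith("*");
    }
}
```

With this rule a `double *value` parameter is treated as an out-param, while a by-value `Datum value` input is left alone.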

OO call sites adapted to the catalog

  • spanset_spans no longer takes the count out-param; the .spans() accessors drop the stale argument.
  • The value / time split operations now return the Temporal ** fragment array directly and report the bins and the fragment count through out-params; value_split, value_time_split, and time_split allocate those buffers and read the count.
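As a rough illustration of the second point, the caller has to read the fragment count from its out-param before walking the returned array. The sketch below simulates that with `ByteBuffer`s; `SplitReader` and the 8-byte handle layout are assumptions for the example, not the JMEOS code.

```java
import java.nio.ByteBuffer;

final class SplitReader {
    /**
     * Reads `count` 8-byte fragment handles from the returned array.
     * `countOut` simulates the int out-param the native fills; the count is at offset 0.
     */
    static long[] readFragments(ByteBuffer fragmentArray, ByteBuffer countOut) {
        int count = countOut.getInt(0);
        if (count < 0 || (long) count * Long.BYTES > fragmentArray.capacity()) {
            throw new IllegalStateException("implausible fragment count: " + count);
        }
        long[] handles = new long[count];
        for (int i = 0; i < count; i++) {
            handles[i] = fragmentArray.getLong(i * Long.BYTES);
        }
        return handles;
    }
}
```

The bounds check is there because a count read from the wrong place tends to be garbage, and failing early is easier to diagnose than walking past the end of the array.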

Tests

Adds runtime coverage for the value accessors and the split operations (previously uncovered). Full jmeos-core suite: 1739 tests, 0 failures, 0 errors against an all-families libmeos built at the pin.


Stacks on the open facade-regen chain (#27 pin-12l) and the pushed pin-14l / pin-15c regens, which do not yet have their own PRs — their commits appear in this PR's cumulative diff against main.

estebanzimanyi and others added 23 commits May 23, 2026 06:45
Source IDL regenerated by MEOS-API run.py from the MobilityDB
accumulate/parity-1.4 headers (@3764e6894) — the pre-merge parity target,
which carries the trgeo_* -> trgeometry_* user-API rename that master does
not yet have. 4068 functions.

This lands the trgeometry I/O + accessor surface the prior IDL missed:
trgeometry_in (constructor), trgeometry_instant_n, trgeometry_instants, and
the renamed trgeometry_* accessors/relations (the old abbreviated trgeo_*
public names are gone from libmeos, so the prior facade called renamed-away
symbols). GeneratedFunctions regenerated from it (jmeos-core compiles clean;
the legacy functions.functions surface the tests use is untouched, 0 test
refs to GeneratedFunctions).

Unblocks IDL-driven consumers (e.g. the streaming-parity Flink/Kafka facade)
to build a trgeometry sample and exercise the ~66 trgeo operators.
… tjsonb family + recovered types

Regenerate the meos-idl.json and functions.GeneratedFunctions against
ecosystem-pin-2026-06-11f (8a3a6db64): the base json/jsonb/jsonpath API is now
public in meos_json.h (IDL 137 -> 213 json fns), plus the tjsonb temporal type.
jsonb_to_text recovers to text* (was implicit-int). jsonb_in/out + tjsonb
round-trip through the binding.
MEOS keeps process-global state — meos_initialize cannot be re-run after a
meos_finalize in the same JVM. A fresh fork per test class keeps the native
MEOS lifecycle clean.
…t -> mul)

First scoped step of wiping the dual facade: route the five tnumber multiply
calls through the generated functions.GeneratedFunctions (mul_* — the
normalized name) instead of the hand-rolled legacy functions.functions
(mult_*). One family at a time; the legacy import stays until the file is fully
migrated.
Wipe step 2: route the 29 type/collection files whose every functions.functions
call has an identical-signature counterpart in the generated
functions.GeneratedFunctions through the generated facade, and drop their legacy
import. Mechanical 1:1 name+signature repoint — no behaviour change. The files
with signature-divergent calls (value_at_timestamptz / *set_values / spanset_spans
families) and rename families are migrated in follow-up scoped commits.
…f the legacy facade

Wipe step 3: IntSet/FloatSet/SpanSet/IntSpanSet/FloatSpanSet onto the generated
facade. intset_values / floatset_values / spanset_spans gained a trailing
Pointer-count out-param in the generated signature; the OO callers read the
length from the separate num_elements()/num_spans(), so they pass a throwaway
4-byte count buffer and ignore it. Array result unchanged; all five files now
fully off functions.functions.
…(pg_date_* -> date_*)

Wipe step 4: route datespan + datespanset through the generated facade. The
generated date I/O drops the legacy pg_ prefix with identical signatures
(pg_date_in -> date_in, pg_date_out -> date_out).
… date_*, dateset_values count)

Wipe step 5: route dateset through the generated facade — pg_date_in/out ->
date_in/out (identical sigs), and dateset_values gained a trailing Pointer-count
out-param (length comes from the separate num_elements(), so pass a throwaway
count buffer).
…ix value_at

Wipe step 6: route TInt/TFloat/TBool through the generated facade. value_at
reshaped to the generated *_value_at_timestamptz (manages the out-param
internally, returns a Pointer to the value or null). Fixes three latent bugs the
hand-rolled facade hid (value_at was untested): the value sits at offset 0 (was
read at offset Integer.BYTES=4); tfloat values are doubles -> read getDouble and
cast (was getFloat -> always 0.0); tbool values are 1 byte -> getByte (was getInt
-> out-of-bounds). Now null-safe: throws on a timestamp at which the temporal has no
value (was undefined/garbage). Verified 5/2.5/true via smoke.
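The three read bugs can be reproduced without MEOS. The sketch below uses a little-endian `ByteBuffer` as a stand-in for the native out buffer (an assumption: the real code reads a jnr-ffi Pointer, and `OutBufferReads` is a hypothetical helper).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class OutBufferReads {
    /** Simulated native out buffer, little-endian as on x86-64 and typical AArch64. */
    static ByteBuffer buf(int size) {
        return ByteBuffer.allocate(size).order(ByteOrder.LITTLE_ENDIAN);
    }

    // Correct reads: value at offset 0, width matching the C type.
    static int readInt(ByteBuffer out) { return out.getInt(0); }
    static double readDouble(ByteBuffer out) { return out.getDouble(0); }
    static boolean readBool(ByteBuffer out) { return out.get(0) != 0; }

    // The stale read: takes only the low 4 bytes of an 8-byte double.
    static float staleFloatRead(ByteBuffer out) { return out.getFloat(0); }
}
```

For a double such as 2.5 the low four bytes are all zero, so the stale float read yields 0.0, and a 4-byte read on a 1-byte bool buffer runs past the end.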
…t_to_cstring/interval_make now public

11g exports the base PG-compat conversion helpers in postgres_ext_defs so the
generator catalogs them (resolves the legacy-facade-wipe helper relay). IDL
+3 fns.
…cade

Wipe step 7: TextSet (text2cstring -> text_to_cstring) and ConversionUtils
(interval_make now public in 11g; pg_timestamptz_in/out -> timestamptz_in/out,
pg_interval_out -> interval_out) onto the generated facade.
…+ value_at)

Wipe step 8: TText onto the generated facade. cstring2text/text2cstring ->
cstring_to_text/text_to_cstring (now public in 11g). value_at reshaped: the
generated ttext_value_at_timestamptz returns the text* directly (or null), so
read it via text_to_cstring; null-safe (throws on no-value, was the offset-8
garbage read). Verified "hello" via smoke.
Wipe step 9: tstzspan (adjacent_period_timestamp -> adjacent_span_timestamptz,
pg_timestamptz_in -> timestamptz_in) and tstzset (timestampset_out ->
tstzset_out; tstzset_values gained a count out-param) onto the generated facade.
… public)

Wipe step 10: STBox onto the generated facade. gserialized_in -> geom_in
(identical sig); geo_expand_spatial(gs, d) -> stbox_expand_space(geo_to_stbox(gs),
d) (the public composition). 77 STBox tests green.
Functions that return a struct larger than 16 bytes by value (the seven
*Split returns plus MvtGeom) use the SysV/AArch64 sret calling convention:
the caller allocates the struct and passes a hidden pointer as an implicit
first argument; the callee fills it and returns it. The emitter previously
collapsed such a return to a bare Pointer, so jnr-ffi mis-read the return
register and the struct fields (notably count) came back as garbage.

Parse the IDL "structs" section, compute each struct's size, and for a
by-value return larger than 16 bytes emit a hidden leading Pointer _sret
parameter in the interface, allocate it in the wrapper, and return the
filled buffer. Register-returned structs (<=16B) are logged, not silently
mis-bound. Regenerates GeneratedFunctions with the new bindings.
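A sketch of the size computation, assuming natural alignment on an LP64 target (8-byte pointers, 4-byte ints). `StructLayout` and the field names in the usage note are hypothetical; the actual emitter may compute sizes differently.

```java
import java.util.List;

final class StructLayout {
    /** One C field: name, size, and alignment in bytes. */
    static final class Field {
        final String name;
        final int size;
        final int align;

        Field(String name, int size, int align) {
            this.name = name;
            this.size = size;
            this.align = align;
        }
    }

    static int alignUp(int offset, int align) {
        return (offset + align - 1) / align * align;
    }

    /** Offset of a named field, padding each field to its own alignment. */
    static int offsetOf(List<Field> fields, String name) {
        int offset = 0;
        for (Field f : fields) {
            offset = alignUp(offset, f.align);
            if (f.name.equals(name)) {
                return offset;
            }
            offset += f.size;
        }
        throw new IllegalArgumentException("no field " + name);
    }

    /** Total size, with tail padding up to the largest field alignment. */
    static int sizeOf(List<Field> fields) {
        int offset = 0;
        int maxAlign = 1;
        for (Field f : fields) {
            offset = alignUp(offset, f.align) + f.size;
            maxAlign = Math.max(maxAlign, f.align);
        }
        return alignUp(offset, maxAlign);
    }

    /** By-value returns above 16 bytes get the hidden leading pointer. */
    static boolean needsSret(List<Field> fields) {
        return sizeOf(fields) > 16;
    }
}
```

Under these assumptions a struct of two pointers and an int is 24 bytes with the int at offset 16, so it takes the sret path; a pointer plus an int is exactly 16 bytes and does not.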
Repoint value_split / value_time_split / time_split / space_split /
space_time_split (TNumber, Temporal, TPoint) onto the generated
GeneratedFunctions split wrappers, which now return the *Split struct via
the sret convention. The methods read fragments at offset 0 and count at
the struct's count offset (16 for the 3-field splits, 24 for the 4-field
time splits) instead of the stale pre-735f out-parameter signature.

Also fixes defects surfaced while migrating:
- duration/start were parsed in an inverted branch so the duration was
  dropped (or null-dereferenced) on the default-start path; parse the
  duration unconditionally and default only the start.
- timedelta_to_interval passed cumulative units (toHours/toMinutes/
  toSeconds give the whole span in each unit) to interval_make, which then
  re-added them on top of the days. It now decomposes per field and parses
  a textual interval with interval_in, sidestepping a jnr-ffi quirk that
  mis-passes interval_make's trailing double after its six int arguments.
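The cumulative-versus-per-field distinction is visible in `java.time.Duration` itself. A minimal sketch, assuming a non-negative duration and ignoring sub-second precision; `IntervalText` is a hypothetical helper, and the exact text format JMEOS hands to interval_in may differ.

```java
import java.time.Duration;

final class IntervalText {
    /** Per-field decomposition: each unit holds only the remainder below the next unit. */
    static String toIntervalText(Duration d) {
        long days = d.toDays();
        int hours = d.toHoursPart();     // 0..23, not the whole span in hours
        int minutes = d.toMinutesPart(); // 0..59
        int seconds = d.toSecondsPart(); // 0..59
        return days + " days " + hours + " hours " + minutes + " minutes " + seconds + " seconds";
    }

    /** The defective reading: cumulative hours added on top of the days. */
    static long overcountedHours(Duration d) {
        return d.toDays() * 24 + d.toHours();
    }
}
```

For a duration of 49 hours 30 minutes, `toHours()` is 49 while `toHoursPart()` is 1, so the cumulative reading counts 97 hours where the per-field text reads "2 days 1 hours 30 minutes 0 seconds".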

Verified end to end on both the Duration and the String duration paths:
all five split methods return the correct fragment counts through the OO API.
Repoint the last three main-code classes from the hand-rolled functions.functions
facade onto the generated GeneratedFunctions surface, completing the main-code
side of the dual-facade wipe. No main-code class imports functions.functions now.

Renames and reshapes applied (verified against the generated signatures):
- pg_timestamptz_in/pg_interval_in -> timestamptz_in/interval_in (identical sigs).
- the temporal spatial-relationship calls (tcontains/tdisjoint/tdwithin/
  tintersects/ttouches) drop the trailing restr, atvalue booleans the current
  MEOS signatures no longer take (tdwithin keeps its distance argument).
- value_at_timestamp uses the generated bool+out-param wrapper, which returns the
  GSERIALIZED* directly, instead of reading the out buffer at the wrong offset.

Defects surfaced and fixed while migrating (all confirmed via smoke):
- count out-parameters were read at offset 4 (getInt(Integer.BYTES)) instead of 0
  in values/make_simple/stboxes, yielding out-of-bounds garbage counts.
- Memory.allocate(Runtime.getRuntime(runtime), n) threw ClassCastException
  (Runtime.getRuntime expects a library proxy, not a Runtime); pass runtime.

Smoke: values=3, make_simple=1, stboxes=2, value_at=POINT(5 5). Type suites
TGeomPoint/TGeogPoint/TInt/TFloat/TBool all green (624 tests, 0 fail/0 err); the
only residual is the pre-existing varstr_cmp ttext_in crash in TTextTest.
JMEOS bootstraps MEOS with meos_initialize_timezone + meos_initialize_error_handler
but never meos_initialize_collation(). Text comparison goes through varstr_cmp,
which dereferences the (uninitialized) collation and segfaults; integer, float and
geometry temporals never compare text, so only the text paths crashed. This is the
long-standing ttext_in -> varstr_cmp SIGSEGV that took out TTextTest/TextSetTest and
the error-branch classes in the full suite — a binding bootstrap gap, not a MEOS bug
(raw jnr confirms: timezone-only crashes, meos_initialize() or timezone+collation
work; pure C is fine).

Initialize the collation alongside the existing init. For classes that build text
objects in instance-field initializers (TextSetTest), the init goes in a static
block so it runs at class load, before the fields are constructed. The collation
call uses GeneratedFunctions because the legacy facade has no static wrapper for it.
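The ordering the static block relies on is standard Java class initialization and can be checked in isolation. In this sketch `InitOrderDemo` is a hypothetical class; the static block stands in for the native init calls and the field initializer for a text object built at construction time.

```java
import java.util.ArrayList;
import java.util.List;

final class InitOrderDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static {
        // Stand-in for the native bootstrap (timezone, error handler, collation).
        EVENTS.add("static-init");
    }

    // Instance-field initializer: runs for each instance, after class initialization.
    private final String sample = build("field-init");

    InitOrderDemo() {
        EVENTS.add("constructor");
    }

    private static String build(String label) {
        EVENTS.add(label);
        return label;
    }
}
```

Creating one instance records static-init, then field-init, then constructor, which is why init placed in a per-test setup method would run too late for objects built in field initializers.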

Full suite now fully green for the first time: 1735 tests, 0 failures, 0 errors,
0 native crashes (was 1625 passing with two classes core-dumping).
Repoint every test from functions.functions to the generated GeneratedFunctions
surface (the calls are same-name same-signature, so this is a mechanical repoint)
and remove the hand-rolled functions.java. The dual-facade irregularity is gone:
the whole library — main code and tests — now uses the single generated facade.
The other functions-package classes (GeneratedFunctions, the Meos* error types,
error_handler/error_handler_fn) are unaffected.

Full suite after deletion: 1735 tests, 0 failures, 0 errors, 0 native crashes.
Track the pin fast-forward train to its tip (a816eec9b). Purely additive over 11n
(+5 functions, no removals, no signature changes): the base text-case helpers
text_upper / text_lower / text_initcap, meos_strtof, and the borrowed-pointer
accessor tsequenceset_value_n_p. (11o/11p in between were surface-neutral —
vendored cppcheck + a Windows tzdata cmake option.) Rebuild libmeos with -DH3=ON
(70 th3index exports), regenerate the IDL via MEOS-API (4389 functions), and
regenerate GeneratedFunctions.

Carries the full delta over the wipe: H3/th3index, text_in/out, the case helpers,
pg_interval/pg_timestamptz, and the uint64 hash_extended fix. sret + collation
preserved.

Verified: jmeos-core compiles; full suite green (1735 tests, 0 failures, 0
crashes); text_upper("hello")="HELLO" through JMEOS.
Re-vendor codegen/input/meos-idl.json from the MEOS-API catalog regenerated against
ecosystem-pin-2026-06-14l (de8b322483) — composed from the deliverable PRs (@sqlfn +
comparison-family aliases + jsonb recovery + doxygroups + sql-arity) — and re-run the
generator. 4389 -> 4466 methods (+77 new 14l functions). Picks up the 14l count-accessor
change: set_vals / set_values / set_spans / tsequenceset_sequences_p and the per-type
*set_values now carry the int *count out-param (2-arg); tiling returns the *Split structs.

Refresh the bundled libmeos.so to the 14l build. jmeos-core builds and the suite passes
1735/0/0 against libmeos 14l.
Re-vendor codegen/input/meos-idl.json from the MEOS-API catalog regenerated against
ecosystem-pin-2026-06-15c (d875308e) and re-run the generator: 4466 -> 4469 methods
(+ eintersects_tpcpoint_geo / nad_tpcpoint_geo pointcloud spatial rels). The operator
dialect is SQL-name-only, so the C surface is otherwise unchanged. Bundled libmeos.so
refreshed to 15c; suite 1735/0/0.
Regenerate codegen/input/meos-idl.json and the generated facade from the
MEOS-API catalog at ecosystem-pin-2026-06-22a (4463 functions, 89 structs,
20 enums).

Emit the managed wrapper for the value out-param of the
*_value_at_timestamptz accessor family: the wrapper allocates the result
buffer, calls the boolean native method, and returns the typed value, so
the OO layer keeps the three-argument call across tbool, tint, tfloat,
tbigint, ttext, tcbuffer, tjsonb, and tnpoint. Detection is gated on a
pointer C type so a by-value Datum input named "value" is not treated as
an out-param, and the forwarded native argument is decoupled from the
parameter name.

Adapt the OO call sites to the catalog: spanset_spans no longer takes the
count out-param, and the value and time split operations return the
fragment array directly while reporting the bins and the fragment count
through out-params.

Add runtime coverage for the value accessors and the split operations.