[dss_bench] Tool to generate automatic graphs for q/s based on various parameters by the-glu · Pull Request #1519 · interuss/monitoring

the-glu · 2026-06-18T12:55:36Z

This PR adds a new tool to generate meaningful graphs to compare the performance of various scenarios.

As of now, we do have Locust tests. They serve some purposes (mainly variations over time), but using them to validate performance can be time-consuming and prone to error. We also have a tendency to use various, incompatible parameters between tests.
An extra consideration is the fact that CockroachDB data is distributed differently between every run, meaning that tests with NUM_USS and NUM_NODE greater than one must average performance across every DSS, not just the first one.

The framework proposed here aims to measure performance as a single point: no change over time, and in theory, each test cleans up after itself. Example: a test that creates and deletes a single operational intent (included here as an example).

Then, we add a variant, which represents the X-axis of our graphs. These could be multiple; for example: the number of existing subscriptions, or the number of workers. This PR includes an inter-USS latency context as an example.

Finally, an option is available to compare different images or different datastores, with the idea of doing comparisons (for example, in a PR against master, or to compare performance between datastores, which will be needed for Raft).

The framework automatically cleans up and runs 'start-locally' for every data point, then produces a graph. A JSON file is also stored for future use.
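The per-data-point flow described above can be sketched roughly as follows. This is only an illustration: `run_sweep`, `reset_deployment` and `measure` are hypothetical names rather than code from this PR, and the deployment reset is injected as a callable instead of actually invoking 'start-locally'.

```python
import json
from pathlib import Path
from typing import Callable


def run_sweep(
    variant_values: list[int],
    reset_deployment: Callable[[int], None],
    measure: Callable[[], dict[str, float]],
    output: Path,
) -> list[dict[str, float]]:
    """Collect one data point per variant value and persist them as JSON."""
    points = []
    for value in variant_values:
        # Fresh deployment per data point so earlier runs cannot skew later ones.
        reset_deployment(value)
        points.append({"x": value, **measure()})
    # Keep the raw numbers so graphs can be regenerated later.
    output.write_text(json.dumps(points, indent=2))
    return points
```

Injecting the reset and measurement steps keeps the loop testable without a running DSS.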

The test is executed against all DSS at the same time and averaged.
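A minimal sketch of what averaging across DSS instances could look like, assuming each instance is queried for the same duration and that QPS is averaged per instance. The `aggregate` helper and its input layout are assumptions for illustration, not the PR's implementation.

```python
from statistics import mean


def aggregate(per_dss_latencies_ms: dict[str, list[float]], duration_s: float) -> dict[str, float]:
    """Summarize samples gathered from every DSS instance (hypothetical helper)."""
    # Pool all latency samples so no single instance dominates the summary.
    merged = [v for samples in per_dss_latencies_ms.values() for v in samples]
    if not merged:
        return {"mean_qps": 0.0, "mean_ms": 0.0}
    # QPS is computed per instance, then averaged over instances.
    per_instance_qps = [len(s) / duration_s for s in per_dss_latencies_ms.values()]
    return {"mean_qps": mean(per_instance_qps), "mean_ms": mean(merged)}
```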

Example graph with latest version:

This allows us to generate useful graphs, like this one showing how latency heavily impacts queries as simple as RID operational intents:

(⚠️ This graph has been generated before displaying errors)

Another example comparing the current master and the latest release on RID:

(⚠️ This graph has been generated before displaying errors)

This shows small variations (at least in terms of QPS), probably explained by the fact that I ran it on my machine while other processes were running. Note that tests should probably be run on a dedicated machine, free from external influences as much as possible. The graphs shown here are only for demonstration.

Note that a run can take a significant amount of time, especially with database initialization at high latencies.

This PR is a first step; the goal is to add more tests and variants in future PRs, especially a RID ISA test with one subscription and one SCD test (based on flightinsubs).

BenjaminPelletier · 2026-06-18T18:38:52Z

+    scopes: list[str] = []
+    default: bool = True


name is probably sufficiently self-documenting, but I'm not sure what these are from inspection and this is a base class that will be used in (presumably) a number of places -- let's document what these are.

BenjaminPelletier · 2026-06-19T03:34:25Z

+
+    try:
+        test.setup(session, base_url)
+    except Exception:


This will prevent even the user from cancelling execution with KeyboardInterrupt; it seems like we should be much narrower in the exceptions we catch. What exceptions would we want to accept and continue for here? Wouldn't we expect the setup to work, and want to stop a test as probably invalid if the setup wasn't successful?

the-glu · 2026-06-29

It doesn't prevent the user from cancelling execution: KeyboardInterrupt is not a subclass of Exception (it derives from BaseException).

    >>> import time
    >>> try:
    ...     time.sleep(200)
    ... except Exception as e:
    ...     print("Catched")
    ...
    ^CTraceback (most recent call last):
      File "<python-input-2>", line 2, in <module>
        time.sleep(200)
        ~~~~~~~~~~^^^^^
    KeyboardInterrupt

However, yes, letting the test run when setup fails is probably wrong, so I switched to an early return.

I left the broad catch around the teardown, however: a failure there is probably less of an issue, especially since the datastores are reset every time. Is that OK?
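The exception-hierarchy point and the early return can be checked with a small sketch. `run_with_setup` is a hypothetical helper for illustration, not the function from this PR.

```python
# KeyboardInterrupt derives from BaseException, not Exception,
# so `except Exception` lets Ctrl+C propagate.
assert not issubclass(KeyboardInterrupt, Exception)
assert issubclass(KeyboardInterrupt, BaseException)


def run_with_setup(setup, action) -> bool:
    """Return False without running `action` when `setup` fails (hypothetical helper)."""
    try:
        setup()
    except Exception as exc:
        # Early return: a test whose setup failed would produce invalid numbers.
        print(f"setup failed, skipping test: {exc}")
        return False
    action()
    return True
```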

BenjaminPelletier · 2026-06-19T03:50:47Z

+            test.action(session, base_url)
+            latencies_ms.append((time.monotonic() - t0) * 1000.0)
+            done += 1
+        except Exception:


This seems like an overbroad catch; could we just use query_and_describe to catch the right exceptions in the right circumstances and then check whether the query succeeded?

the-glu · 2026-06-29

We could restrict the catch, but the idea is to be broad in order to catch other potential errors (such as wrong data being returned).

query_and_describe also does much more than simple queries (including potential retries), and I don't think we want that in this testing case. The idea is to run simple queries (like the other load tests), not to bring in the "full" query framework.
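For reference, a simplified sketch of a measurement loop with a deliberately broad catch, where failures are recorded rather than silently dropped. `timed_loop` and its signature are assumptions for illustration; the loop in this PR may differ.

```python
import time
from typing import Callable


def timed_loop(action: Callable[[], None], iterations: int) -> tuple[list[float], list[float]]:
    """Return (success latencies, failure latencies) in milliseconds."""
    latencies_ms: list[float] = []
    error_latencies_ms: list[float] = []
    for _ in range(iterations):
        t0 = time.monotonic()
        try:
            action()
        except Exception:
            # Broad on purpose: bad payloads and HTTP failures both count as errors.
            # KeyboardInterrupt is not an Exception, so Ctrl+C still stops the run.
            error_latencies_ms.append((time.monotonic() - t0) * 1000.0)
        else:
            latencies_ms.append((time.monotonic() - t0) * 1000.0)
    return latencies_ms, error_latencies_ms
```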

BenjaminPelletier · 2026-06-19T04:04:40Z

+
+
+def run_test(
+    test: BenchTest, targets: list[tuple[str, str]], cfg: GlobalConfig


It's hard to figure out what "targets" is without tracing through the code; let's just make a simple data structure so it's super clear:

    @dataclass
    class Target:
        base_url: str
        audience: str

Suggested change

-    test: BenchTest, targets: list[tuple[str, str]], cfg: GlobalConfig
+    test: BenchTest, targets: list[Target], cfg: GlobalConfig

...but, it doesn't seem like carrying audience is even necessary since it's a function of the base URL (using an AuthAdapter/UTMClientSession will take care of this automatically).

BenjaminPelletier · 2026-06-19T04:23:03Z

+    # survivorship bias of percentiles computed over successes only.
+    with_errors = merged + merged_errors
+
+    return {


Data type please
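One possible shape for a typed return value, sketched under assumptions: `LatencySummary`, `summarize` and the field names are hypothetical, and only `merged`, `merged_errors` and `with_errors` come from the diff above.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class LatencySummary:
    """Hypothetical typed result; field names are illustrative only."""

    successes: int
    errors: int
    median_ms_successes: float
    median_ms_with_errors: float


def summarize(merged: list[float], merged_errors: list[float]) -> LatencySummary:
    # Including failed requests avoids the survivorship bias of success-only statistics.
    with_errors = merged + merged_errors
    return LatencySummary(
        successes=len(merged),
        errors=len(merged_errors),
        median_ms_successes=median(merged) if merged else 0.0,
        median_ms_with_errors=median(with_errors) if with_errors else 0.0,
    )
```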

the-glu mentioned this pull request Jun 18, 2026

[dss_bench] Add rid isa with one subscription #1520

Open

the-glu force-pushed the dss_bench branch from f378605 to 0c1ff05 Compare June 18, 2026 13:33

the-glu mentioned this pull request Jun 18, 2026

[rid] Add lock on subscriptions interuss/dss#1523

Draft

BenjaminPelletier reviewed Jun 19, 2026

the-glu force-pushed the dss_bench branch from 3179867 to e1a5371 Compare June 22, 2026 11:45

[dss_bench] Tool to generate automatic graphs for q/s based on variou…

1cb1e94

…s parameters

the-glu force-pushed the dss_bench branch from e1a5371 to 1cb1e94 Compare June 22, 2026 11:46

the-glu wants to merge 1 commit into interuss:main from Orbitalize:dss_bench
