Skip to content

Revert "ipc4: handler: maintain IPC set_pipeline_state order"#10868

Draft
serhiy-katsyuba-intel wants to merge 3 commits into
thesofproject:mainfrom
serhiy-katsyuba-intel:revert_wait
Draft

Revert "ipc4: handler: maintain IPC set_pipeline_state order"#10868
serhiy-katsyuba-intel wants to merge 3 commits into
thesofproject:mainfrom
serhiy-katsyuba-intel:revert_wait

Conversation

@serhiy-katsyuba-intel

Copy link
Copy Markdown
Contributor

This reverts commit 05bffd7.

The LL scheduler already guarantees that pipelines run in the order they were triggered by the host (while respecting priorities). The original workaround (#8504), which is now being reverted, has few problems: adds 1 ms startup latency per pipeline, ignores pipeline priorities and, hopefully, is unnecessary.

The original workaround was added as a result of investigating issue #8481. I believe there were two different problems that both caused the same issue.

The first is a problem in the topology. The driver sent a multi-pipeline start for pipelines 0 and 1, meaning the driver expects pipeline 1 to start after pipeline 0. However, pipeline 0 was created with priority 1 and pipeline 1 with priority 0, so the firmware started pipeline 1 first.

However, fixing the priority in the topology did not change the pipeline start order. This was probably caused by a bug in the firmware: complex IPC3 pipeline triggering logic was used instead of straightforward IPC4 logic. This appears to have been fixed by #8506.

I'm not exactly sure why the original workaround with wait_for_compound_msg() was added. Perhaps having two problems causing the same issue simultaneously led to some confusion during verification. Let's try removing it, and if we encounter any problems with pipeline triggering order (hopefully not), we'll fix them elsewhere.

Using IPC4_MOD_ID() is not the best way to check if IPC4 is enabled. For
module ID == 0, IPC4_MOD_ID() returns 0 for both IPC3 and IPC4. Module
ID 0 is a valid IPC4 BASEFW ID. Since BASEFW is never added to a pipeline,
this change doesn't fix any real problem. However, it's just more
appropriate and safer to use IS_ENABLED(CONFIG_IPC_MAJOR_4): if module ID
data becomes corrupted (zeroed) at runtime, this shouldn't make debugging
even harder by causing unexpected pipeline behavior.

Signed-off-by: Serhiy Katsyuba <serhiy.katsyuba@intel.com>
With IPC4, each pipeline is triggered separately. Exactly 1 pipeline is
expected in the pipelines list in pipeline_schedule_triggered().
Unfortunately, we still have considerable complex IPC3 pipeline triggering
code being used with IPC4. This assertion ensures that the code works
correctly for the IPC4 case.

Signed-off-by: Serhiy Katsyuba <serhiy.katsyuba@intel.com>
This reverts commit 05bffd7.

The LL scheduler already guarantees that pipelines run in the order they
were triggered by the host (while respecting priorities). The original
workaround (thesofproject#8504), which is now
being reverted, has few problems: adds 1 ms startup latency per pipeline,
ignores pipeline priorities and, hopefully, is unnecessary.

The original workaround was added as a result of investigating issue
thesofproject#8481. I believe there were two
different problems that both caused the same issue.

The first is a problem in the topology. The driver sent a multi-pipeline
start for pipelines 0 and 1, meaning the driver expects pipeline 1
to start after pipeline 0. However, pipeline 0 was created with priority 1
and pipeline 1 with priority 0, so the firmware started pipeline 1 first.

However, fixing the priority in the topology did not change the pipeline
start order. This was probably caused by a bug in the firmware: complex
IPC3 pipeline triggering logic was used instead of straightforward IPC4
logic. This appears to have been fixed by
thesofproject#8506.

I'm not exactly sure why the original workaround with
wait_for_compound_msg() was added. Perhaps having two problems causing
the same issue simultaneously led to some confusion during verification.
Let's try removing it, and if we encounter any problems with pipeline
triggering order (hopefully not), we'll fix them elsewhere.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reverts the IPC4 SET_PIPELINE_STATE ordering workaround that introduced a blocking wait (and added trigger latency), relying instead on the LL scheduler to preserve trigger order. It also adjusts several pipeline-walk/trigger/reset/prepare code paths to treat “IPC4 behavior” as a build-time property (CONFIG_IPC_MAJOR_4) rather than inferring it from component IDs.

Changes:

  • Remove the per-pipeline blocking ipc_wait_for_compound_msg() between delayed IPC4 pipeline triggers.
  • Update pipeline traversal/trigger/prepare/reset logic to use IS_ENABLED(CONFIG_IPC_MAJOR_4) for IPC4-specific behavior.
  • Add an IPC4-only assertion that the “pipelines to schedule” list contains at most one pipeline.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
File Description
src/ipc/ipc4/handler-user.c Removes the blocking wait between delayed triggers in multi-pipeline SET_PIPELINE_STATE handling.
src/audio/pipeline/pipeline-stream.c Prevents cross-pipeline trigger propagation in IPC4 builds using CONFIG_IPC_MAJOR_4.
src/audio/pipeline/pipeline-schedule.c Uses CONFIG_IPC_MAJOR_4 for IPC4-specific reset behavior; adds an IPC4-only assertion on the triggered pipeline list size.
src/audio/pipeline/pipeline-params.c Prevents prepare propagation across pipelines in IPC4 builds using CONFIG_IPC_MAJOR_4.
src/audio/pipeline/pipeline-graph.c Prevents reset propagation across pipelines in IPC4 builds using CONFIG_IPC_MAJOR_4.

@serhiy-katsyuba-intel

Copy link
Copy Markdown
Contributor Author

@kv2019i , please have a look.

@kv2019i

kv2019i commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator

Thanks @serhiy-katsyuba-intel for reviving this! I looked at this again (and especially your comment #8481 (comment) ), and I think we indeed have a mismatch between driver+topology and FW. If you look at kernel commit http://github.com/thesofproject/linux/commit/4df7d6a61f2c0e0920f4f4caa02e41797974a487 , on playback, pipelines are ordered from larger priority values to small priority values. In FW, above comments plus e.g. zephyr_ll_task_insert_unlocked(), 0 is the highest prio.

So host will a SET_PIPELINE_STATE with pipelines orderered from low priority (some larger number) to high priority (0 value).

  • With this PR, FW will pass handling to the LL thread, which will process the multiple state changes in priority order, so in reverse order where kernel sent them.
  • With baseline FW, only one pipeline is processed per LL tick, so the order kernel chose is maintained.

I do agree this is wrong. The author is no longer working with SOF, but I remember discussing with him that he followed the reference host behaviour here, but it now looks wrong. Why spend the effort to reorder the pipeline list on host side, if the FW will then reverse it?

Any takes @ujfalusi ? Not sure how could we change the policy in kernel.

Priorities are used very little in Linux topologies. We don't have any cases in CI to test the echo ref, so testing the original scenario is not straighforward. Still, I do think this looks wrong now, and we'd better look at to fix kernel and/or topologies in the few cases where we need to ensure a particular order of startup (like the echo reference pipe in on of the old bugs).

@ujfalusi

ujfalusi commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Thanks @serhiy-katsyuba-intel for reviving this! I looked at this again (and especially your comment #8481 (comment) ), and I think we indeed have a mismatch between driver+topology and FW. If you look at kernel commit http://github.com/thesofproject/linux/commit/4df7d6a61f2c0e0920f4f4caa02e41797974a487 , on playback, pipelines are ordered from larger priority values to small priority values. In FW, above comments plus e.g. zephyr_ll_task_insert_unlocked(), 0 is the highest prio.

Interesting, the Linux commit added sof_ipc4_add_pipeline_by_priority() which treats 0 as the lowest priority, higher the number, higher the priority and does the ordering accordingly (created 2023 September 15, merged on Oct 3), on FW/topology side the priority attribute got clarified by 07b762e that the highest priority is 0, lowest is 7 (dated to 2023 Nov 21).
There is obviously contradicting understanding of what the number behind priority means. What might caused the kernel patch to do what it does is a comment in reference that it has a stale comment saying that 31 is the 'Max priority`.
imho, the kernel is doing this completely wrong, upside down:

So host will a SET_PIPELINE_STATE with pipelines orderered from low priority (some larger number) to high priority (0 value).

* With this PR, FW will pass handling to the LL thread, which will process the multiple state changes in priority order, so in reverse order where kernel sent them.

* With baseline FW, only one pipeline is processed per LL tick, so the order kernel chose is maintained.

I do agree this is wrong. The author is no longer working with SOF, but I remember discussing with him that he followed the reference host behaviour here, but it now looks wrong. Why spend the effort to reorder the pipeline list on host side, if the FW will then reverse it?

Any takes @ujfalusi ? Not sure how could we change the policy in kernel.

Only this:

diff --git a/sound/soc/sof/ipc4-pcm.c b/sound/soc/sof/ipc4-pcm.c
index 0adf06097dac..6e7d8ab4c48f 100644
--- a/sound/soc/sof/ipc4-pcm.c
+++ b/sound/soc/sof/ipc4-pcm.c
@@ -125,6 +125,7 @@ static void sof_ipc4_add_pipeline_by_priority(struct ipc4_pipeline_set_state_dat
 	struct sof_ipc4_pipeline *pipeline = pipe_widget->private;
 	int i, j;
 
+	/* 0 = highest priority, 7 = lowest */
 	for (i = 0; i < trigger_list->count; i++) {
 		/* add pipeline from low priority to high */
 		if (ascend && pipeline->priority < pipe_priority[i])
@@ -165,13 +166,13 @@ sof_ipc4_add_pipeline_to_trigger_list(struct snd_sof_dev *sdev, int state,
 		 */
 		if (spipe->started_count == spipe->paused_count)
 			sof_ipc4_add_pipeline_by_priority(trigger_list, pipe_widget, pipe_priority,
-							  false);
+							  true);
 		break;
 	case SOF_IPC4_PIPE_RESET:
 		/* RESET if the pipeline is neither running nor paused */
 		if (!spipe->started_count && !spipe->paused_count)
 			sof_ipc4_add_pipeline_by_priority(trigger_list, pipe_widget, pipe_priority,
-							  true);
+							  false);
 		break;
 	case SOF_IPC4_PIPE_PAUSED:
 	case SOF_IPC4_PIPE_EOS:
@@ -182,7 +183,7 @@ sof_ipc4_add_pipeline_to_trigger_list(struct snd_sof_dev *sdev, int state,
 		 */
 		if (spipe->paused_count == (spipe->started_count - 1))
 			sof_ipc4_add_pipeline_by_priority(trigger_list, pipe_widget, pipe_priority,
-							  true);
+							  false);
 		break;
 	default:
 		break;

The sof_ipc4_add_pipeline_to_trigger_list() does not really care about priorities, it just re-sorts pipelines ascending or descending order based on the priority, so we just need to swap the direction at call sites.

Priorities are used very little in Linux topologies. We don't have any cases in CI to test the echo ref, so testing the original scenario is not straighforward. Still, I do think this looks wrong now, and we'd better look at to fix kernel and/or topologies in the few cases where we need to ensure a particular order of startup (like the echo reference pipe in on of the old bugs).

@ujfalusi

Copy link
Copy Markdown
Contributor

fyi: thesofproject/linux#5810

* is expected in the pipelines list (it's unclear whether an empty list
* should be tolerated).
*/
assert(list_is_empty(&ctx->pipelines) ||

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@serhiy-katsyuba-intel have you ever seen empty lists here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but the function will work without errors if an empty list is passed as a parameter. So I decided to add this check just to avoid introducing a crash when accessing the first list element without checking whether it exists.

@kv2019i

kv2019i commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator

@serhiy-katsyuba-intel @ujfalusi Ok, it seems there's a genuine delta between the reference we use for IPC4 and SOF implementation. We now have two methods to signal order of triggers:

  • order of pipelines when SET_PIPELINE_STATE IPC is sent with multiple pipelines
  • the priority set of each of the pipelines

The problem/difference is caused by the SOF mechanism to delegate pipeline changes to happen in the context of the audio pipeline (looping in @lyakh ). Upon first state change to running, the pipelines are not running yet (no task scheduled in LL scheduler), so ipc4_pipeline_trigger() will not return with "delayed==true". Instead the pipeline state change is performed immediately in IPC handler and the pipelines are processed in the order they were in the IPC command list (driver has decided).

Upon next trigger, the ipc4_pipeline_trigger() delegates the state change to LL thread. Here's the important bit. The LL scheduler runs the LL tasks in priority order, so if the kernel/host sent with reverse order (start low priority first), this order is ignored/reverse.

The original commit #8504 change the LL implementation such that only one pipeline is processed per one LL tick. This ensures the order of pipelines is always followed and we have some (not many but a few) topologies relying on this to ensure correct start order (e.g. otherwise #8481 (comment) happens).

I think we have a genuine conflict of semantics here. If we revert #8504, then SOF FW no longer allows host to trigger state-changes in reverse order ,compared to the order the LL scheduler will run the pipelines. I think one compromise is to put the "synchronous pipeline state changes" code of #8504 under a Kconfig and only enable it in Linux overlay (as Linux driver does depend on this).

@kv2019i kv2019i left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking with -1 just in case (see long comment above). I'm ok to go with a separate Kconfig.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants