Skip to content

MDEV-39548 Further cleanup of MDL_request boilerplate#5160

Open
longjinvan wants to merge 1 commit into
MariaDB:mainfrom
longjinvan:MDEV-39548
Open

MDEV-39548 Further cleanup of MDL_request boilerplate#5160
longjinvan wants to merge 1 commit into
MariaDB:mainfrom
longjinvan:MDEV-39548

Conversation

@longjinvan
Copy link
Copy Markdown
Contributor

@longjinvan longjinvan commented Jun 1, 2026

Description

This is a follow-up work to PR 5075 ("Cleanup MDL_request boilerplate with ACQUIRE_LOCK macro").

Change thd->backup_commit_lock from MDL_request* to MDL_ticket*

Previously, thd->backup_commit_lock stored a pointer to a stack-allocated MDL_request, creating dangling pointer risks if the function returned early or the code was refactored without clearing the pointer. This patch changes the type to MDL_ticket*, which is managed by the MDL subsystem and does not depend on the caller's stack frame. All related call sites (handler.cc, log.cc, sql_class.cc, xa.cc, sql_reload.cc) are converted to use MDL_ACQUIRE_LOCK directly.

In sql_repl.cc, reset_master() and purge_master_logs() are also converted to MDL_ACQUIRE_LOCK, but the ticket is held in a local MDL_ticket* rather than thd->backup_commit_lock, matching the original semantics where these functions used a local MDL_request.

Convert partition_info.cc to MDL_ACQUIRE_LOCK

Investigation confirmed that tl->mdl_request is not accessed by downstream code after lock acquisition -- only table->mdl_ticket is used. The conversion is safe.

Add MDL_REQUEST_LIST_ADD() helper

For batch lock acquisition via acquire_locks(), callers previously needed 3 steps: allocate an MDL_request on MEM_ROOT, initialize it, and push it into the list. The new MDL_REQUEST_LIST_ADD() macro combines these into a single call with built-in OOM checking. Call sites in sql_base.cc and sp.cc are converted.

sql_show.cc partial conversion

Internal acquire_lock() / try_acquire_lock() calls in try_acquire_high_prio_shared_mdl_lock() are replaced with MDL_ACQUIRE_LOCK() / MDL_TRY_ACQUIRE_LOCK(). However, MDL_request itself cannot be fully eliminated here because tdc_acquire_share() reads from table_list.mdl_request.key for hash lookup in the table definition cache.

Release Notes

N/A

How can this PR be tested?

  • All MTR tests pass, confirming no regressions introduced by this refactoring.

Basing the PR against the correct MariaDB version

  • This is a refactoring change, and the PR is based against the latest MariaDB development branch.

Copyright

All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors metadata locking (MDL) by replacing local MDL_request stack allocations and manual lock acquisitions with helper macros like MDL_ACQUIRE_LOCK and MDL_REQUEST_LIST_ADD, and changing backup_commit_lock to store an MDL_ticket pointer. The review feedback highlights a code smell in sql_show.cc where a sentinel pointer (MDL_ticket *) 1 is returned to indicate errors; the reviewer suggests returning a boolean instead, populating the ticket directly, and encapsulating key initialization within the helper function.

Comment thread sql/sql_show.cc Outdated
Comment on lines +5165 to +5169
static MDL_ticket *
try_acquire_high_prio_shared_mdl_lock(THD *thd, TABLE_LIST *table,
bool can_deadlock)
{
bool error;
MDL_REQUEST_INIT(&table->mdl_request, MDL_key::TABLE, table->db.str,
table->table_name.str, MDL_SHARED_HIGH_PRIO,
MDL_TRANSACTION);
MDL_ticket *ticket;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Returning (MDL_ticket *) 1 as a sentinel value to indicate an error is a code smell and can lead to undefined behavior or hard-to-debug issues if the caller ever attempts to dereference it.\n\nInstead of using a sentinel pointer, we should keep the return type as bool (returning true on error, false on success) and populate table->mdl_request.ticket directly. This preserves the original clean semantics, avoids sentinel pointer hacks, and ensures table->mdl_request.ticket is correctly populated for any downstream code.\n\nWe should also initialize table->mdl_request.key inside this function using mdl_key_init to keep the implementation details encapsulated.

static bool\ntry_acquire_high_prio_shared_mdl_lock(THD *thd, TABLE_LIST *table,\n                                      bool can_deadlock)\n{\n  MDL_ticket *ticket;\n  table->mdl_request.key.mdl_key_init(MDL_key::TABLE, table->db.str,\n                                      table->table_name.str);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already addressed in the latest revision — the function now returns bool, populates table->mdl_request.ticket directly, and initializes the key inside.

Comment thread sql/sql_show.cc
Comment on lines +5185 to 5203
bool error;
ticket= thd->mdl_context.MDL_TRY_ACQUIRE_LOCK(
MDL_key::TABLE, table->db.str, table->table_name.str,
MDL_SHARED_HIGH_PRIO, MDL_TRANSACTION, error);
if (error)
return (MDL_ticket *) 1; /* error indicator */
}
else
error= thd->mdl_context.acquire_lock(&table->mdl_request,
thd->variables.lock_wait_timeout);
{
ticket= thd->mdl_context.MDL_ACQUIRE_LOCK(
MDL_key::TABLE, table->db.str, table->table_name.str,
MDL_SHARED_HIGH_PRIO, MDL_TRANSACTION,
thd->variables.lock_wait_timeout);
if (!ticket)
return (MDL_ticket *) 1; /* error indicator */
}

return error;
return ticket;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Update the return statements and populate table->mdl_request.ticket directly to avoid using the (MDL_ticket *) 1 sentinel pointer.

    bool error;\n    ticket= thd->mdl_context.MDL_TRY_ACQUIRE_LOCK(\n        MDL_key::TABLE, table->db.str, table->table_name.str,\n        MDL_SHARED_HIGH_PRIO, MDL_TRANSACTION, error);\n    if (error)\n      return true;\n  }\n  else\n  {\n    ticket= thd->mdl_context.MDL_ACQUIRE_LOCK(\n        MDL_key::TABLE, table->db.str, table->table_name.str,\n        MDL_SHARED_HIGH_PRIO, MDL_TRANSACTION,\n        thd->variables.lock_wait_timeout);\n    if (!ticket)\n      return true;\n  }\n\n  table->mdl_request.ticket= ticket;\n  return false;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in the latest revision.

Comment thread sql/sql_show.cc Outdated
Comment on lines 5305 to 5309
table_list.mdl_request.key.mdl_key_init(MDL_key::TABLE,
table_list.db.str,
table_list.table_name.str);

if (schema_table->i_s_requested_object & OPEN_TRIGGER_ONLY)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Since table_list.mdl_request.key is now fully initialized inside try_acquire_high_prio_shared_mdl_lock, we can remove this manual initialization from the caller.

  if (schema_table->i_s_requested_object & OPEN_TRIGGER_ONLY)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in the latest revision.

Comment thread sql/sql_repl.cc
@knielsen
Copy link
Copy Markdown
Member

knielsen commented Jun 1, 2026 via email

@gkodinov gkodinov added the External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements. label Jun 2, 2026
Copy link
Copy Markdown
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a preliminary review. Thank you for your contribution!

LGTM. Please stand by for Svoj's final review.

- Change thd->backup_commit_lock from MDL_request* to MDL_ticket*,
  and convert related call sites in handler.cc, log.cc, sql_class.cc,
  xa.cc, and sql_reload.cc to use MDL_ACQUIRE_LOCK.

- Convert reset_master() and purge_master_logs() in sql_repl.cc to
  MDL_ACQUIRE_LOCK, holding the ticket in a local MDL_ticket* (these
  functions used a local MDL_request originally and should not touch
  thd->backup_commit_lock).

- Convert acquire_lock() in partition_info.cc to MDL_ACQUIRE_LOCK.

- Convert try_acquire_lock()/acquire_lock() in sql_show.cc to
  MDL_TRY_ACQUIRE_LOCK/MDL_ACQUIRE_LOCK. MDL_request cannot be fully
  removed here because tdc_acquire_share() reads from
  table_list.mdl_request.key for cache lookup.

- Add MDL_REQUEST_LIST_ADD() helper for enqueuing lock requests into
  MDL_request_list, and convert call sites in sql_base.cc and sp.cc.

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
Copy link
Copy Markdown
Contributor

@svoj svoj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks nice, thanks! A few minor comments inline.

Comment thread sql/log.cc
*/
if (thd->backup_commit_lock && thd->backup_commit_lock->ticket &&
if (thd->backup_commit_lock &&
!backup_lock_released)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to keep backup_lock_released check here? Looks like it was redundant: I don't see any loops and gotos around it.

Comment thread sql/sp.cc
if (MDL_REQUEST_LIST_ADD(&mdl_requests, thd->mem_root,
sph->get_mdl_type(), db.str, sp_name,
MDL_EXCLUSIVE, MDL_TRANSACTION))
goto error;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't just goto error here without calling ha_index_end().

Comment thread sql/sql_reload.cc
if (thd &&
thd->mdl_context.acquire_lock(&mdl_request,
thd->variables.lock_wait_timeout))
!(thd->backup_commit_lock= thd->mdl_context.MDL_ACQUIRE_LOCK(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use thd->backup_commit_lock here, make local MDL_ticket *ticket instead.

Comment thread sql/handler.cc
// reset the pointer to the ticket when it's stack instantiated
if (thd->backup_commit_lock == &mdl_backup)
// release the backup commit lock if we acquired it in this function
if (backup_commit_lock_acquired && thd->backup_commit_lock)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

backup_commit_lock_acquired == true implies valid thd->backup_commit_lock.

Comment thread sql/sql_reload.cc
{
if (mdl_request.ticket)
thd->mdl_context.release_lock(mdl_request.ticket);
if (thd->backup_commit_lock)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition looks redundant, you already check that ticket contains valid value when you acquire the lock.

Comment thread sql/sql_show.cc
table->table_name.str, MDL_SHARED_HIGH_PRIO,
MDL_TRANSACTION);
table->mdl_request.key.mdl_key_init(MDL_key::TABLE, table->db.str,
table->table_name.str);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting approach and it is where we should end up eventually: TABLE_LIST should contain only MDL_key and not MDL_request. But let's avoid this change for now, we need to develop some consistent approach for removing MDL_request from TABLE_LIST.

@svoj
Copy link
Copy Markdown
Contributor

svoj commented Jun 3, 2026

@knielsen

The intention here is to block RESET MASTER from running concurrently with
the stage in mariabackup that backs up the binlog files. Since it's somewhat
tricky to handle copying the binlog files if the server is removing them
concurrently, and quite unnecessary. BACKUP_START is a minimal lock that
doesn't block most server operations.

Sounds like MDL_BACKUP_START may indeed be suitable for this purpose. Though I'd probably add a new lock type specifically for it. Still, I guess we can live with this.

I don't see that in the piece of code you quoted? But no,
thd->backup_commit_lock doesn't sound like something that would be involved
here in reset_master().

All 3 occurrences in FLUSH BINARY LOGS, PURGE BINARY LOGS and RESET MASTER are having this pattern:

  MDL_request mdl_request;
  MDL_REQUEST_INIT(&mdl_request, MDL_key::BACKUP, "", "", MDL_BACKUP_START,
                   MDL_EXPLICIT);
  if (thd->mdl_context.acquire_lock(&mdl_request,
                                    thd->variables.lock_wait_timeout))
    return TRUE;
  thd->backup_commit_lock= &mdl_request;

FWICS it is used exclusively by the "wait for the prior commit" routine, which is irrelevant here, right?

@knielsen
Copy link
Copy Markdown
Member

knielsen commented Jun 3, 2026 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.

Development

Successfully merging this pull request may close these issues.

4 participants