Fix failure in Flex Attention UT #4031

Draft: wants to merge 1 commit into main

Contributor

@ESI-SYD ESI-SYD commented Apr 28, 2025

- // Case 4: Value is defined by arith::ExtSIOp, tt::AddPtrOp or
- // arith::AddIOp operation.
+ // Case 4: Value is defined by arith::ExtSIOp, tt::AddPtrOp,
+ // arith::AddIOp or tt::BroadcastOp,tt::SplatOp operation
Contributor

Suggested change
- // arith::AddIOp or tt::BroadcastOp,tt::SplatOp operation
+ // arith::AddIOp, tt::BroadcastOp or tt::SplatOp operation.

@@ -147,6 +147,11 @@ struct TritonIntelGPUMaterializeBlockPointerPass

LDBG("Considering tensor of pointer load op: " << loadOp);

if (!ttgi::isDivisible(ptr, 4)) {
Contributor

Why is the code below not sufficient?

      // Base pointer can be compensate by the offset and base width, where they
      // each has restriction that it has to be 4 bytes aligned.
      if (axisInfo->getDivisibility(fastChangeDim) % 4 != 0) {
        LDBG(
            "Found Non 4 bytes aligned base: " << axisInfo->getDivisibility(1));
        return false;
      }

Contributor Author

It checks the base pointer itself, similar to the existing check for a base pointer created by make_tensor_ptr:

// Ensure the base ptr is 4-byte aligned.
// Note: the HW requires the address to be 64-byte aligned, however we will
// compensate by imposing restrictions on the offsetX and baseWidth.
TypedValue<tt::PointerType> base = makeTensorPtrOp.getBase();
if (!ttgi::isDivisible(base, 4)) {
  LDBG("Found non 4-bytes aligned base: " << base);
  return false;
}
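
To make the two alignment levels in that note concrete, here is a minimal, self-contained sketch. It is not the pass's code: `isAlignedTo`, `AdjustedBase`, and `splitBase` are hypothetical names, and folding the remainder into a byte offset is only an assumed reading of "compensate by imposing restrictions on the offsetX and baseWidth".

```cpp
#include <cstdint>

// Hypothetical helper (not part of the Triton code base): true when `addr`
// is a multiple of `bytes`.
constexpr bool isAlignedTo(std::uint64_t addr, std::uint64_t bytes) {
  return bytes != 0 && addr % bytes == 0;
}

// Hypothetical illustration of compensating for a base that is only
// 4-byte aligned. ASSUMPTION: the address is rounded down to a 64-byte
// boundary and the remainder is carried as a separate byte offset.
struct AdjustedBase {
  std::uint64_t alignedBase; // multiple of 64
  std::uint64_t byteOffset;  // remainder, in the range [0, 64)
};

constexpr AdjustedBase splitBase(std::uint64_t addr) {
  return {addr & ~std::uint64_t{63}, addr & std::uint64_t{63}};
}
```

Under these assumptions, an address such as `0x1004` passes the 4-byte check but not the 64-byte one, and `splitBase(0x1004)` yields an aligned base of `0x1000` with a byte offset of 4. An address such as `0x1002` fails the 4-byte check, which is the case the `isDivisible(..., 4)` guard is meant to reject.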

Contributor

As discussed offline, the root cause still needs to be identified.

@alexbaden
Contributor

Please add a description explaining what changed and why (see https://cbea.ms/git-commit/#why-not-how for more info).

@ESI-SYD ESI-SYD marked this pull request as draft April 30, 2025 01:18

Successfully merging this pull request may close these issues.

[FlexAttention] PassManager::run failed during running FlexAttention UT
4 participants