Fix failure in Flex Attention UT #4031
base: main
- // Case 4: Value is defined by arith::ExtSIOp, tt::AddPtrOp or
- // arith::AddIOp operation.
+ // Case 4: Value is defined by arith::ExtSIOp, tt::AddPtrOp,
+ // arith::AddIOp or tt::BroadcastOp,tt::SplatOp operation
- // arith::AddIOp or tt::BroadcastOp,tt::SplatOp operation
+ // arith::AddIOp, tt::BroadcastOp or tt::SplatOp operation.
@@ -147,6 +147,11 @@ struct TritonIntelGPUMaterializeBlockPointerPass

LDBG("Considering tensor of pointer load op: " << loadOp);

if (!ttgi::isDivisible(ptr, 4)) {
Why is the code below not sufficient?
// Base pointer can be compensate by the offset and base width, where they
// each has restriction that it has to be 4 bytes aligned.
if (axisInfo->getDivisibility(fastChangeDim) % 4 != 0) {
  LDBG(
      "Found Non 4 bytes aligned base: " << axisInfo->getDivisibility(1));
  return false;
}
This checks the base pointer itself, similar to the existing check for a base pointer created by make_tensor_ptr:
// Ensure the base ptr is 4-byte aligned.
// Note: the HW requires the address to be 64-byte aligned, however we will
// compensate by imposing restrictions on the offsetX and baseWidth.
TypedValue<tt::PointerType> base = makeTensorPtrOp.getBase();
if (!ttgi::isDivisible(base, 4)) {
  LDBG("Found non 4-bytes aligned base: " << base);
  return false;
}
As discussed offline, the root cause still needs to be identified.
Please add a description explaining what changed and why (see https://cbea.ms/git-commit/#why-not-how for more info).
#3998