[triton] Strengthening the Gluon TMA Op Verifier and Adding an Illegal Instruction Sanitize Mode

January 7, 2026 | Updated: January 7, 2026

PR link: triton-lang/triton#9112 | Status: Merged | Changes: +218 / -102

Introduction

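Triton's TMA (Tensor Memory Accelerator) ops require the descriptor type and the tensor type to agree. The previous verifier required the two shapes to match exactly, which ruled out rank reduction and reshape. This PR switches the check to one based on the total number of elements, and it also consolidates the gather/scatter verifiers into a shared function.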

Key Code Analysis

Before - exact shape match required

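// Excerpt: `block`, `blockShape`, and `tensorShape` are derived from `desc`
// and `tensor`; their definitions are omitted here.
static LogicalResult verifyDescriptorLoadStoreType(
    Operation *op, TensorDescType desc, RankedTensorType tensor) {
  // Accept only when both the shape and the element type are identical.
  if (blockShape == tensorShape &&
      block.getElementType() == tensor.getElementType())
    return success();
  return op->emitOpError("tensor descriptor block and tensor types must match");
}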

After - validation based on element count

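// Excerpt: `blockShape` and `tensorShape` are derived from `desc` and
// `tensor`; their definitions are omitted here.
LogicalResult verifyDescriptorLoadStoreOp(Operation *op, TensorDescType desc,
                                          ShapedType tensor) {
  // Compare total element counts instead of requiring identical shapes.
  unsigned blockNumels = product(blockShape);
  unsigned tensorNumels = product(tensorShape);
  if (blockNumels != tensorNumels) {
    return op->emitOpError("descriptor block and tensor must have the same "
                           "number of elements");
  }
  return success();
}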
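
To make the difference between the two rules concrete, here is a minimal standalone sketch. It does not use Triton or MLIR types: `Shape`, `numElements`, `shapesMatchExactly`, `sameElementCount`, and `runExamples` are hypothetical names introduced only for illustration, and the sketch covers only the shape comparison visible in the excerpts, not any other checks the real verifier may perform.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a tensor shape; not a Triton or MLIR type.
using Shape = std::vector<int64_t>;

// Total number of elements in a shape (the product of its dimensions).
// An empty shape yields 1, the usual convention for a rank-0 tensor.
inline int64_t numElements(const Shape &shape) {
  return std::accumulate(shape.begin(), shape.end(), int64_t{1},
                         std::multiplies<int64_t>());
}

// Old-style rule: rank and every dimension must be identical.
inline bool shapesMatchExactly(const Shape &block, const Shape &tensor) {
  return block == tensor;
}

// New-style rule: only the total element count has to agree.
inline bool sameElementCount(const Shape &block, const Shape &tensor) {
  return numElements(block) == numElements(tensor);
}

// Small self-check contrasting the two rules on a rank-reduced shape.
inline void runExamples() {
  // [1, 64, 64] vs [64, 64]: different rank, same 4096 elements.
  assert(!shapesMatchExactly({1, 64, 64}, {64, 64}));
  assert(sameElementCount({1, 64, 64}, {64, 64}));
}
```

In this sketch, a block of shape [1, 64, 64] and a tensor of shape [64, 64] fail the exact comparison but pass the element-count comparison, which is the rank-reduction case the introduction describes. A reshape such as [32, 128] against [64, 64] passes as well, since both hold 4096 elements.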

Gather/Scatter verifier consolidation

// DescriptorGatherOp::verifyResultType is removed
// and consolidated into a shared function
LogicalResult verifyGatherScatterOp(Operation *op, ShapedType blockType,
                                    ShapedType resultType,
                                    ShapedType indicesType);