NVIDIA ISA Viewer

SM75 (Turing) Instructions

111 base instructions, 605 total variants

Atomic Operation on Generic Memory

Atomic Operation on Global Memory

Atomic Operation on Shared Memory

Move Barrier To Register

Bit Matrix Multiply and Accumulate

Move Special Register to Register

FP64 Fused Multiply Add

FP64 Compare And Set Predicate

Floating Point To Integer Conversion

Floating-point Range Check

FP32 Fused Multiply and Add

Find Leading One

FP32 Minimum/Maximum

Floating Point Select

FP32 Compare And Set

FP32 Compare And Set Predicate

FP32 Swizzle Add

Get Local Memory Base Address

FP16 Fused Multiply Add

Matrix Multiply and Accumulate

FP16 Compare And Set

FP16 Compare And Set Predicate

Integer To Floating Point Conversion

Integer To Integer Conversion

Integer To Integer Conversion and Packing

Integer Absolute Value

3-input Integer Addition

Integer Dot Product and Accumulate

Integer Multiply And Add

Integer Matrix Multiply and Accumulate

Integer Minimum/Maximum

Integer Compare And Set Predicate

Load from Generic Memory

Load from Global Memory

Load within Local Memory Window

Load within Shared Memory Window

Load Matrix from Shared Memory with Element Size Expansion

Load Effective Address

Load Effective PC

Logic Operation

Match Register Values Across Thread Group

Move Matrix with Transposition or Expansion

FP32 Multi Function Operation

Move Predicate Register To Register

Predicate Logic Operation

Performance Monitor Trigger

Population Count

Permute Register Pair

Move Register To Predicate Register

Move from Vector Register to a Uniform Register

PC Register Move

Move Special Register to Register

Move Special Register to Uniform Register

Select Source with Predicate

Warp Wide Register Shuffle

Atomic Op on Surface Memory

Texture MipMap Level

Texture Fetch With Derivatives

Uniform Bitfield Mask

Uniform Bit Reverse

Load Effective Address for a Constant

Uniform Find Leading One

Uniform Integer Addition

Uniform Integer Multiplication

Integer Compare and Set Uniform Predicate

Load from Constant Memory into a Uniform Register

Uniform Load Effective Address

Logic Operation

Uniform Predicate to Uniform Register

Uniform Predicate Logic Operation

Uniform Population Count

Uniform Byte Permute

Uniform Register to Uniform Predicate

Uniform Sign Extend

Uniform Funnel Shift

Absolute Difference

Absolute Difference

Vote Across SIMD Thread Group

Voting across SIMD Thread Group with Results in Uniform Destination

Unfound Instructions

Our fuzzer has not found the following 61 instructions. If you have a cubin that contains any of them and would like to contribute it, message us at collab@sf-tensor.com.

BAR

Barrier Synchronization

BMOV

Move Convergence Barrier State

BPT

BreakPoint/Trap

BRA

Relative Branch

BREAK

Break out of the Specified Convergence Barrier

BRX

Relative Branch Indirect

BRXU

Relative Branch with Uniform Register Based Offset

BSSY

Barrier Set Convergence Synchronization Point

BSYNC

Synchronize Threads on a Convergence Barrier

CALL

Call Function

CCTL

Cache Control

CCTLL

Cache Control

CCTLT

Texture Cache Control

DEPBAR

Dependency Barrier

EXIT

Exit Program

F2F

Floating Point To Floating Point Conversion

FADD32I

FP32 Add

FFMA32I

FP32 Fused Multiply and Add

FMUL32I

FP32 Multiply

FRND

Round To Integer

HADD2_32I

FP16 Add

HFMA2_32I

FP16 Fused Multiply Add

HMUL2_32I

FP16 Multiply

IADD

Integer Addition

IADD32I

Integer Addition

IDP4A

Integer Dot Product and Accumulate

IMUL

Integer Multiply

IMUL32I

Integer Multiply

ISCADD

Scaled Integer Addition

ISCADD32I

Scaled Integer Addition

JMP

Absolute Jump

JMX

Absolute Jump Indirect

JMXU

Absolute Jump with Uniform Register Based Offset

KILL

Kill Thread

LOP

Logic Operation

LOP32I

Logic Operation

MEMBAR

Memory Barrier

MOV32I

Move

NANOSLEEP

Suspend Execution

PSETP

Combine Predicates and Set Predicate

R2B

Move Register to Barrier

RED

Reduction Operation on Generic Memory

RET

Return From Subroutine

RTT

Return From Trap

SETLMEMBASE

Set Local Memory Base Address

SHL

Shift Left

SHR

Shift Right

ST

Store to Generic Memory

STG

Store to Global Memory

STL

Store to Local Memory

STS

Store to Shared Memory

SURED

Reduction Op on Surface Memory

SUST

Surface Store

UIADD3.64

Uniform Integer Addition

ULOP

Logic Operation

ULOP32I

Logic Operation

UPSETP

Uniform Predicate Logic Operation

USHL

Uniform Left Shift

USHR

Uniform Right Shift

WARPSYNC

Synchronize Threads in Warp

YIELD

Yield Control
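
Before contributing a cubin, it may help to check whether its disassembly actually contains any of the mnemonics listed above. The following is a minimal sketch, not part of the viewer: it assumes the text comes from a SASS disassembler such as `cuobjdump -sass`, and that each instruction line starts with an address comment like `/*0010*/`, followed by an optional predicate guard and the opcode. The helper names (`base_mnemonic`, `unfound_in_sass`) and the sample listing are hypothetical and hand-written for illustration; they are not real compiler output.

```python
import re

# Mnemonics copied from the "Unfound Instructions" list above.
UNFOUND = {
    "BAR", "BMOV", "BPT", "BRA", "BREAK", "BRX", "BRXU", "BSSY", "BSYNC",
    "CALL", "CCTL", "CCTLL", "CCTLT", "DEPBAR", "EXIT", "F2F", "FADD32I",
    "FFMA32I", "FMUL32I", "FRND", "HADD2_32I", "HFMA2_32I", "HMUL2_32I",
    "IADD", "IADD32I", "IDP4A", "IMUL", "IMUL32I", "ISCADD", "ISCADD32I",
    "JMP", "JMX", "JMXU", "KILL", "LOP", "LOP32I", "MEMBAR", "MOV32I",
    "NANOSLEEP", "PSETP", "R2B", "RED", "RET", "RTT", "SETLMEMBASE",
    "SHL", "SHR", "ST", "STG", "STL", "STS", "SURED", "SUST", "UIADD3.64",
    "ULOP", "ULOP32I", "UPSETP", "USHL", "USHR", "WARPSYNC", "YIELD",
}

# Assumed shape of one disassembly line (an assumption about the tool output):
#   /*0030*/   @!P0 STG.E.SYS [R2], R0 ;
SASS_LINE = re.compile(
    r"/\*[0-9a-fA-F]+\*/\s+"      # address comment, e.g. /*0030*/
    r"(?:@!?U?P[0-9T]\s+)?"       # optional guard, e.g. @P0, @!PT, @UP1
    r"([A-Z][A-Z0-9_.]*)"         # opcode with dot-separated modifiers
)


def base_mnemonic(opcode):
    """Return the listed mnemonic that `opcode` belongs to, or None."""
    parts = opcode.split(".")
    # Try the longest dotted prefix first so that "UIADD3.64" is matched
    # as a whole, while plain "UIADD3" is not reported.
    for n in range(len(parts), 0, -1):
        candidate = ".".join(parts[:n])
        if candidate in UNFOUND:
            return candidate
    return None


def unfound_in_sass(sass_text):
    """Count occurrences of listed mnemonics in disassembly text."""
    hits = {}
    for line in sass_text.splitlines():
        match = SASS_LINE.search(line)
        if match is None:
            continue  # not an instruction line (header, encoding, blank)
        name = base_mnemonic(match.group(1))
        if name is not None:
            hits[name] = hits.get(name, 0) + 1
    return hits


# Hand-written sample in the assumed format, for illustration only.
sample = """
        /*0000*/                   MOV R1, c[0x0][0x28] ;
        /*0010*/                   S2R R0, SR_TID.X ;
        /*0020*/                   BAR.SYNC 0x0 ;
        /*0030*/              @!P0 STG.E.SYS [R2], R0 ;
        /*0040*/                   EXIT ;
"""

print(sorted(unfound_in_sass(sample).items()))
# prints [('BAR', 1), ('EXIT', 1), ('STG', 1)]
```

To use it on a real file, the disassembly text would be read from wherever the disassembler output was saved and passed to `unfound_in_sass`. Matching on dotted prefixes rather than on the first token alone is a deliberate choice here, since one listed mnemonic (`UIADD3.64`) itself contains a dot.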