Cutlass
CUDA Templates for Linear Algebra Subroutines and Solvers
Defines structural properties of single-precision GEMM where any of the inputs and outputs may be fp16 or fp32. The accumulator type stays fp32.
#include "cutlass/gemm/gemm.h"
#include "cutlass/gemm/gemm_epilogue.h"
#include "cutlass/gemm/gemm_epilogue_traits.h"
#include "cutlass/gemm/gemm_global_tile.h"
#include "cutlass/gemm/gemm_shared_tile.h"
#include "cutlass/gemm/gemm_traits.h"
#include "cutlass/gemm/fp16_sgemm_multiply_add.h"
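The description above (mixed fp16/fp32 operands with an fp32 accumulator) can be illustrated with a minimal host-side sketch. This is not the CUTLASS traits class or its API: `reference_mixed_gemm`, its parameters, and the row-major layout are hypothetical names and assumptions chosen for illustration. The element types are template parameters, so in CUDA code they could be `half` or `float`; the accumulator is fixed to `float` regardless.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reference GEMM (not part of CUTLASS):
//   D = alpha * (A * B) + beta * C
// A is m x k, B is k x n, C and D are m x n, all row-major (an assumption).
// ScalarA/ScalarB/ScalarC/ScalarD may each be a different storage type,
// but every product is widened to float and accumulated in float.
template <typename ScalarA, typename ScalarB, typename ScalarC, typename ScalarD>
void reference_mixed_gemm(std::size_t m, std::size_t n, std::size_t k,
                          float alpha,
                          const std::vector<ScalarA>& a,
                          const std::vector<ScalarB>& b,
                          float beta,
                          const std::vector<ScalarC>& c,
                          std::vector<ScalarD>& d) {
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      float acc = 0.0f;  // accumulator stays fp32 whatever the operand types are
      for (std::size_t p = 0; p < k; ++p) {
        acc += static_cast<float>(a[i * k + p]) *
               static_cast<float>(b[p * n + j]);
      }
      // Narrow to the output type only once, after the full accumulation.
      d[i * n + j] = static_cast<ScalarD>(
          alpha * acc + beta * static_cast<float>(c[i * n + j]));
    }
  }
}
```

Narrowing to the output type once, after the inner loop, is what keeps rounding error from compounding across the `k` dimension when the operands are stored in a lower precision.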
Namespaces
cutlass
cutlass::gemm