The theoretical performance calculator is based on the clock speed of the GPU, the number of cores (CUDA cores or stream processors) and the number of floating-point operations the GPU can do per clock cycle.

The number of FLOPS a GPU can deliver is calculated with the equation:

FLOPS = clock speed × cores × floating-point operations per clock cycle

Usually the boost clock is used in the calculations to get the highest theoretical performance the GPU is capable of.
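The formula can be sketched as a small Python function. This is only an illustration: the function name `theoretical_mflops` and the sample card values are hypothetical, not taken from the calculator itself.

```python
def theoretical_mflops(clock_mhz: float, cores: int, ops_per_clock: float) -> float:
    """Return the theoretical peak throughput in MFLOPS.

    The result is in MFLOPS because the clock speed is passed in MHz.
    """
    return clock_mhz * cores * ops_per_clock


# Hypothetical card: 1000 MHz boost clock, 1024 cores, 2 operations per clock.
peak = theoretical_mflops(1000, 1024, 2)
print(f"{peak:,.0f} MFLOPS")  # 2,048,000 MFLOPS
```

Passing the base clock instead of the boost clock gives a lower, more conservative figure.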

Example Calculation

The Nvidia GeForce RTX 3080 10GB has a boost clock of 1710 MHz and 8704 CUDA cores, and can do 2 floating-point operations per clock cycle at half precision (FP16), 2 at single precision (FP32) and 1/32 at double precision (FP64).

For half precision (FP16), FLOPS = 1710 × 8704 × 2 = 29,767,680 MFLOPS. Divide by 1000 to get 29,767.68 GFLOPS, or by 1000 again to get 29.76768 TFLOPS.

For single precision (FP32), FLOPS = 1710 × 8704 × 2 = 29,767,680 MFLOPS as well, or 29.76768 TFLOPS.

For double precision (FP64), FLOPS = 1710 × 8704 × (1/32) = 465,120 MFLOPS, or 465.12 GFLOPS.
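The RTX 3080 figures can be reproduced with a few lines of Python. This is a minimal sketch: the constant and function names are illustrative, and `Fraction` is used so that the 1/32 double-precision rate stays exact.

```python
from fractions import Fraction

# RTX 3080 10GB figures quoted in the example above.
BOOST_CLOCK_MHZ = 1710
CUDA_CORES = 8704

# Floating-point operations per clock cycle at each precision.
OPS_PER_CLOCK = {
    "FP16": Fraction(2),
    "FP32": Fraction(2),
    "FP64": Fraction(1, 32),
}


def peak_mflops(precision: str) -> Fraction:
    """Theoretical peak in MFLOPS (the clock is in MHz)."""
    return BOOST_CLOCK_MHZ * CUDA_CORES * OPS_PER_CLOCK[precision]


for precision in OPS_PER_CLOCK:
    mflops = peak_mflops(precision)
    tflops = float(mflops / 1_000_000)
    print(f"{precision}: {float(mflops):,.0f} MFLOPS = {tflops:.5f} TFLOPS")
# FP16: 29,767,680 MFLOPS = 29.76768 TFLOPS
# FP32: 29,767,680 MFLOPS = 29.76768 TFLOPS
# FP64: 465,120 MFLOPS = 0.46512 TFLOPS
```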

The answer is in megaFLOPS (MFLOPS) because the clock speed is given in megahertz (MHz).
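Since each unit step is a factor of 1000, converting between MFLOPS, GFLOPS and TFLOPS can be sketched as below. The helper `convert_flops` and the `UNIT_STEP` table are hypothetical names used for illustration only.

```python
# Position of each unit on the decimal ladder; one step is a factor of 1000.
UNIT_STEP = {"FLOPS": 0, "KFLOPS": 1, "MFLOPS": 2, "GFLOPS": 3, "TFLOPS": 4, "PFLOPS": 5}


def convert_flops(value: float, from_unit: str, to_unit: str) -> float:
    """Convert a FLOPS figure between decimal units."""
    steps = UNIT_STEP[from_unit] - UNIT_STEP[to_unit]
    if steps >= 0:
        # Moving to a smaller unit: multiply.
        return value * 1000 ** steps
    # Moving to a larger unit: divide.
    return value / 1000 ** (-steps)


print(convert_flops(29767680, "MFLOPS", "GFLOPS"))  # 29767.68
print(convert_flops(29767680, "MFLOPS", "TFLOPS"))  # 29.76768
```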

View our calculator to convert TFLOPS to GFLOPS, or GFLOPS to TFLOPS.

Floating point operations per clock cycle

The following tables list the floating-point operations per clock cycle used to calculate performance at each precision level. The number of floating-point operations per clock cycle differs by series.

Nvidia GPUs

| Ada Lovelace Cards | FP16 Half | FP32 Single | FP64 Double |
| --- | --- | --- | --- |
| All Ada Lovelace Cards | 2 | 2 | 1/32 |

| Ampere Cards | FP16 Half | FP32 Single | FP64 Double |
| --- | --- | --- | --- |
| Nvidia Ampere (30 Series, MX 570, A2000, A3000, A2, A40) | 2 | 2 | 1/32 |
| Nvidia Ampere (Quadro & Tesla A4000 to A6000, A10, A16) | 2 | 2 | 1/16 |
| Nvidia Ampere (Only Tesla A100) | 8 | 2 | 1 |
| Nvidia Ampere (RTX 2050 Mobile Only) | 4 | 2 | 1/16 |
| Nvidia Ampere (A30 Only) | 2 | 2 | 1 |

| Turing Cards | FP16 Half | FP32 Single | FP64 Double |
| --- | --- | --- | --- |
| Nvidia Turing (Except Tesla & MX550) | 4 | 2 | 1/16 |
| Nvidia Turing (MX550 Only) | 2 | 2 | 1/32 |
| Nvidia Turing (Tesla T4 Only) | 16 | 2 | 1/16 |

| Volta Cards | FP16 Half | FP32 Single | FP64 Double |
| --- | --- | --- | --- |
| All Volta Cards | 4 | 2 | 1 |

| Pascal & Maxwell Cards | FP16 Half | FP32 Single | FP64 Double |
| --- | --- | --- | --- |
| All Pascal & Maxwell Cards (Except GT 1010, Quadro GP100, Tesla P100) | 1/32 | 2 | 1/16 |
| Nvidia Pascal (Quadro GP100 & Tesla P100 Only) | 4 | 2 | 1 |
| Nvidia Pascal (GT 1010 Only) | ? | 2 | 1/12 |
| Nvidia Pascal (Jetson TX2 Only) | 4 | 2 | 1/16 |

| Kepler Cards | FP16 Half | FP32 Single | FP64 Double |
| --- | --- | --- | --- |
| Nvidia Kepler (GeForce (except Titan and Titan Black), Quadro (except K6000), Tesla K10) | 0 | 2 | 1/12 |
| Nvidia Kepler (GeForce GTX Titan and Titan Black, Quadro K6000, Tesla (except K10)) | 0 | 2 | 2/3 |

| Fermi Cards | FP16 Half | FP32 Single | FP64 Double |
| --- | --- | --- | --- |
| Nvidia Fermi (only GeForce GTX 465–480, 560 Ti, 570–590) | 0 | 2 | 1/4 |
| Nvidia Fermi (only Quadro 600–2000) | 0 | 2 | 1/6 |
| Nvidia Fermi (only Quadro 4000–7000, Tesla) | 0 | 2 | 1 |

| Tesla 2.0 Cards | FP16 Half | FP32 Single | FP64 Double |
| --- | --- | --- | --- |
| Nvidia Tesla 2.0 (GeForce GTX 260–295) | ? | 2 | ? |
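The Nvidia values above mix whole numbers, fractions such as 1/32, and "?" for unknown rates. A minimal sketch of how such cells could be read and used in the formula follows. The names `parse_ops` and `peak_mflops` are hypothetical, only a few rows are copied from the tables, and the clock speed and core count are made-up illustration values rather than a real card's specification.

```python
from fractions import Fraction
from typing import Optional

# A few rows copied from the Nvidia tables: (FP16, FP32, FP64).
NVIDIA_OPS = {
    "Ada Lovelace": ("2", "2", "1/32"),
    "Volta": ("4", "2", "1"),
    "Pascal (GT 1010 only)": ("?", "2", "1/12"),
}


def parse_ops(cell: str) -> Optional[Fraction]:
    """Parse a cell such as "2", "1/32" or "?"; "?" means unknown (None)."""
    cell = cell.strip()
    if cell == "?":
        return None
    return Fraction(cell)


def peak_mflops(clock_mhz: int, cores: int, cell: str) -> Optional[Fraction]:
    """Peak MFLOPS for one table cell, or None when the rate is unknown."""
    ops = parse_ops(cell)
    if ops is None:
        return None
    return clock_mhz * cores * ops


# Hypothetical card: 1500 MHz, 2048 cores (assumed values, for illustration).
fp16, fp32, fp64 = NVIDIA_OPS["Ada Lovelace"]
print(peak_mflops(1500, 2048, fp64))  # 96000
print(peak_mflops(1500, 2048, NVIDIA_OPS["Pascal (GT 1010 only)"][0]))  # None
```

Returning `None` for "?" keeps an unknown rate from being silently treated as zero.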

AMD GPUs

| AMD CDNA Cards | FP16 Half | FP32 Single | FP64 Double |
| --- | --- | --- | --- |
| CDNA 3.0 | 16 | 2 | 2 |
| CDNA 2.0 | 16 | 2 | 2 |
| CDNA 1.0 | 16 | 2 | 1 |

| AMD RDNA Cards | FP16 Half | FP32 Single | FP64 Double |
| --- | --- | --- | --- |
| All RDNA 3.0 RX (Excluding 760M & 780M) | 8 | 4 | 1/8 |
| RDNA 3.0 non RX (760M & 780M) | 8 | 4 | 1/4 |
| RDNA 2.0 & RDNA 1.0 | 4 | 2 | 1/8 |

| AMD GCN Cards | FP16 Half | FP32 Single | FP64 Double |
| --- | --- | --- | --- |
| AMD GCN (only Radeon Pro W 8100–9100) | ? | 2 | 1 |
| AMD GCN (all except Radeon Pro W 8100–9100, Vega 10–20) | 4 | 2 | 1/8 |
| AMD GCN Vega 10 | 4 | 2 | 1/8 |
| AMD GCN Vega 20 (only Radeon VII) | 4 | 2 | 1/2 |
| AMD GCN Vega 20 (only Radeon Instinct MI50 / MI60 and Radeon Pro VII) | 4 | 2 | 1 |

| AMD TeraScale Cards | FP16 Half | FP32 Single | FP64 Double |
| --- | --- | --- | --- |
| AMD TeraScale 1 (Radeon HD 4000 series) | ? | 2 | 2/5 |
| AMD TeraScale 2 (Radeon HD 5000 series) | ? | 2 | 1 |
| AMD TeraScale 3 (Radeon HD 6000 series) | ? | 4 | 1 |
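Because clock speed and core count are the same at every precision, dividing one per-clock value by another shows how the theoretical rates compare on a given card. The sketch below does this for FP64 against FP32, using three rows copied from the AMD tables. The function name `fp64_to_fp32_ratio` is hypothetical.

```python
from fractions import Fraction

# Rows copied from the AMD tables: (FP16, FP32, FP64) operations per clock.
AMD_OPS = {
    "CDNA 3.0": (Fraction(16), Fraction(2), Fraction(2)),
    "RDNA 3.0 RX": (Fraction(8), Fraction(4), Fraction(1, 8)),
    "GCN Vega 20 (Radeon VII)": (Fraction(4), Fraction(2), Fraction(1, 2)),
}


def fp64_to_fp32_ratio(family: str) -> Fraction:
    """Theoretical FP64 throughput as a fraction of FP32 throughput."""
    _, fp32, fp64 = AMD_OPS[family]
    return fp64 / fp32


for family in AMD_OPS:
    print(f"{family}: FP64 runs at {fp64_to_fp32_ratio(family)} of the FP32 rate")
# CDNA 3.0: FP64 runs at 1 of the FP32 rate
# RDNA 3.0 RX: FP64 runs at 1/32 of the FP32 rate
# GCN Vega 20 (Radeon VII): FP64 runs at 1/4 of the FP32 rate
```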