















From vector to GPU based architectures (POLYTECH)

GPU in the 2023 Top500: group of TOP500 machines with NVIDIA GPUs [figure; label: TH-IV]







**Tesla H100 packaging** (V100 $\Rightarrow$ A100 $\Rightarrow$ H100)

**Tesla V100** was a 3 billion dollar R&D project. What about the A100? The H100?



































## Recent architecture issues: motivation to design RT cores

## **Ray Tracing cores:**

- Final objective: "real-time ray tracing for video"
- Currently: GPUs are not powerful enough
- → Real-time RT on a subset of rays + interpolation with Tensor Cores



#### Video games remain the main market for NVIDIA

→ GPU architecture evolutions must be useful for the video game market

[Figure: "SOL MAN" from the NVIDIA SOL ray tracing demo, running in real time on a Turing TU102 GPU with NVIDIA RTX technology. Figure label: 64 specialized computing units.]



# Tensor Core features

Tensor cores: one TC performs a stream of multiply-add operations on a stream of 4x4 matrices

- $D = A \cdot B$: produces a stream of output matrices $D$
- $D = A \cdot B + C$: accumulates the stream of $A \cdot B$ products into the matrix $C$
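
The multiply-accumulate operation above can be emulated in a few lines of plain Python. This is only an illustrative sketch: the function names `tensor_core_mma` and `accumulate_stream` are hypothetical, matrices are plain lists of rows, and the code says nothing about how the hardware is actually organized (precision, parallelism, scheduling).

```python
N = 4  # matrix size handled by one Tensor Core operation, as stated on the slide


def tensor_core_mma(a, b, c):
    """Return D = A.B + C for 4x4 matrices given as lists of rows (hypothetical helper)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(N)) + c[i][j] for j in range(N)]
        for i in range(N)
    ]


def accumulate_stream(pairs):
    """Feed a stream of (A, B) pairs through the operator, accumulating into C.

    Each step reuses the previous result as the new C, so the final matrix
    is the sum of all A.B products in the stream.
    """
    acc = [[0.0] * N for _ in range(N)]
    for a, b in pairs:
        acc = tensor_core_mma(a, b, acc)
    return acc


if __name__ == "__main__":
    identity = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]
    # Three identity products accumulated into C.
    result = accumulate_stream([(identity, identity)] * 3)
    for row in result:
        print(row)
```

Feeding the previous result back in as $C$ is what makes the operator suitable for building larger matrix products out of 4x4 tiles: each tile product is added to a running sum rather than stored separately.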

## A Tensor core:

- is a hardware implementation of a matrix operator,
- is a very useful operator for modern applications,
- including graphic applications (main GPU market).
- → A mathematical operator generic enough to justify dedicating part of the chip to it!



# Objectives of this new fast memory architecture and management:

To reduce the performance loss incurred when shared memory is not used:
many users have declined to design and implement a new cache management strategy (too difficult).





