Apple describes the GPU improvements in the A17 Pro and M3

Apps and games built on the Metal API target specific features of Apple Silicon GPUs, and those GPUs gain major increases in parallel processing with the M3 and A17 Pro. Here is how it works.

Apple gave a developer session on these new Apple Silicon GPU features, explaining what is going on under the hood to produce better performance. The video goes into considerable technical detail, but it also provides enough information to explain the changes in layman’s terms.

Developers using the Metal API do not need to make any changes to their apps to enjoy performance benefits on the M3 and A17 Pro. These chips make the GPU more performant than ever before through Dynamic Caching, hardware-accelerated ray tracing, and hardware-accelerated mesh shading.

GPU enhancements in A17 Pro and M3

Dynamic Shader Core Memory

Dynamic Caching is enabled by a next-generation shader core. On the latest GPU cores in the A17 Pro and M3, shaders can run in parallel much more effectively than before, significantly boosting performance.

Traditionally, the GPU allocates register memory for the entire duration of a shader based on the most register-hungry point within that shader. As a result, if one part of a shader needs much more register memory than the rest, the shader holds on to that larger amount for its whole run, even while most of it sits idle.

Dynamic Caching enables the GPU to allocate just the right amount of register memory for each task as it runs. Register memory that was previously reserved but unused is freed, allowing many more shader tasks to run concurrently.
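The effect on concurrency can be sketched with some simple arithmetic. The following is a toy model only: the register file size, the per-phase demand figures, and the function names are all invented for illustration, and it assumes threads are spread evenly across the phases of a shader. It is not a description of how Apple's hardware actually schedules work.

```python
# Toy model only: every number here is hypothetical, not an Apple hardware figure.
REGISTER_FILE_BYTES = 64 * 1024  # assumed register memory per shader core

# Assumed per-thread register demand (bytes) in each phase of one shader.
# One short phase needs far more registers than the others.
phase_demand = [32, 32, 256, 32]


def threads_static(register_file, demand):
    """Static allocation: each thread reserves the peak demand for its whole life."""
    return register_file // max(demand)


def threads_dynamic(register_file, demand):
    """Dynamic allocation: each thread holds only what its current phase needs.

    Simplifying assumption: threads are spread evenly across phases, so the
    average demand decides how many threads fit at once.
    """
    average = sum(demand) / len(demand)
    return int(register_file // average)


print(threads_static(REGISTER_FILE_BYTES, phase_demand))   # 256
print(threads_dynamic(REGISTER_FILE_BYTES, phase_demand))  # 744
```

In this made-up case, sizing every thread for the 256-byte peak allows 256 threads in flight, while sizing for what each phase actually needs allows roughly three times as many.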

Flexible On-Chip Memory

Previously, on-chip memory had fixed memory allocations for register, thread group, and tile memory, together with a buffer cache. This meant that if an action consumed more of one type of memory than another, significant amounts of memory were left unused.

With flexible on-chip memory, all of the on-chip memory is a cache that can be used for any memory type. As a result, a task that relies mainly on thread group memory can use the full span of on-chip memory and even spill over into main memory.

To maximise performance, the shader core dynamically adjusts occupancy to make the best use of on-chip memory. This means that developers will need to spend less effort optimising occupancy.
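The difference between fixed partitions and a single flexible pool can be illustrated with a small calculation. This is a sketch under stated assumptions: the partition sizes, the workload, and the function names are hypothetical, and the model ignores that in practice some capacity would still serve as a buffer cache and that data can spill to main memory.

```python
# Toy model only: sizes are hypothetical (KiB), not Apple hardware figures.
FIXED_PARTITIONS = {"register": 64, "threadgroup": 32, "tile": 32, "buffer_cache": 32}

# Assumed on-chip memory each thread group needs, in KiB.
# This workload leans heavily on thread group memory and uses no tile memory.
workload = {"register": 8, "threadgroup": 16, "tile": 0}


def groups_with_fixed_partitions(partitions, need):
    """Each memory type can only draw from its own fixed partition."""
    return min(
        partitions[kind] // amount for kind, amount in need.items() if amount > 0
    )


def groups_with_flexible_pool(partitions, need):
    """All on-chip memory forms one pool that any memory type can draw from."""
    return sum(partitions.values()) // sum(need.values())


print(groups_with_fixed_partitions(FIXED_PARTITIONS, workload))  # 2
print(groups_with_flexible_pool(FIXED_PARTITIONS, workload))     # 6
```

With fixed partitions the thread group partition runs out after two groups, leaving most of the other partitions idle. Treating the same capacity as one pool fits six groups in this invented example.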

High-performance ALU pipelines in Shader Core

Apple advises developers to use FP16 math in their programs. However, the high-performance ALUs can execute various combinations of integer, FP32, and FP16 instructions in parallel. Because those instructions are drawn from many tasks running concurrently, ALU utilisation improves as occupancy increases.

Essentially, if several tasks in flight have FP32, FP16, or integer instructions ready at the same time, their execution can be overlapped instead of serialised, which boosts parallelism.
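A rough way to picture this overlap is a tiny issue-slot simulation. Everything here is an illustrative assumption rather than Apple's design: the model assumes one pipeline per instruction type, one cycle per instruction, in-order issue within each stream, and a simple greedy scheduler. The function names are hypothetical.

```python
# Toy model only: assumes one pipeline per instruction type and one-cycle instructions.

def cycles_serial(streams):
    """Only one instruction issues per cycle, whatever its type."""
    return sum(len(stream) for stream in streams)


def cycles_overlapped(streams):
    """Each cycle, every pipeline type can accept one instruction.

    Each stream may issue only its next instruction, in order, and only if
    no other stream has already claimed that pipeline type this cycle.
    """
    positions = [0] * len(streams)
    cycles = 0
    while any(pos < len(stream) for pos, stream in zip(positions, streams)):
        busy = set()  # pipeline types already claimed in this cycle
        for i, stream in enumerate(streams):
            if positions[i] < len(stream) and stream[positions[i]] not in busy:
                busy.add(stream[positions[i]])
                positions[i] += 1
        cycles += 1
    return cycles


# Two hypothetical instruction streams with a mix of FP32 and FP16 work.
streams = [["fp32", "fp16", "fp32"], ["fp16", "fp32", "fp16"]]
print(cycles_serial(streams))      # 6
print(cycles_overlapped(streams))  # 3
```

The more streams there are in flight, the more likely it is that some instruction is ready for each pipeline, which is why utilisation in this model rises with occupancy. If every stream needs the same pipeline at once, the model falls back to serial behaviour.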

Hardware-accelerated graphics pipelines

Hardware-accelerated ray tracing speeds up the process by moving the critical intersection computations out of the GPU shader function and into dedicated hardware. Because that hardware handles a portion of the calculations, more operations can be performed in parallel.

A similar mechanism is used by hardware-accelerated mesh shading. It takes the geometry processing stage in the middle of the pipeline and routes it to a specialised unit, allowing more work to proceed in parallel.
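To give a sense of the intersection computations mentioned for ray tracing, here is a minimal software sketch of a single ray-triangle test using the well-known Möller–Trumbore algorithm. It is included only to show the kind of arithmetic involved. The source does not say which algorithm Apple's hardware uses, and the helper names are chosen for this example.

```python
# Illustrative software version of one ray-triangle test (Möller–Trumbore).
# Not a description of Apple's hardware implementation.

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def ray_triangle_intersection(origin, direction, v0, v1, v2, eps=1e-9):
    """Return the distance along the ray to the hit point, or None on a miss."""
    edge1 = sub(v1, v0)
    edge2 = sub(v2, v0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle's plane
    inv_det = 1.0 / det
    tvec = sub(origin, v0)
    u = dot(tvec, pvec) * inv_det  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, qvec) * inv_det
    return t if t > eps else None  # ignore hits behind the ray origin


triangle = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
hit = ray_triangle_intersection((0.25, 0.25, 1.0), (0.0, 0.0, -1.0), *triangle)
miss = ray_triangle_intersection((2.0, 2.0, 1.0), (0.0, 0.0, -1.0), *triangle)
print(hit, miss)  # 1.0 None
```

A ray-traced scene repeats tests like this for enormous numbers of rays and triangles, which is why offloading them to a dedicated unit leaves the shader cores free for other work.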

These are complex systems that cannot be summarised in a few paragraphs. We recommend watching the video for the full details, but the key point is that the A17 Pro and M3 rely on parallelism to accelerate processing.

The M3 is offered in MacBook Pro and 24-inch iMac configurations. The A17 Pro can be found in the iPhone 15 Pro.
