Today AMD will finally release its first graphics cards based on the RDNA 2 architecture. Two models are coming: the RX 6800 and the RX 6800 XT. For now only AMD's reference-design cards will be on sale, and partners' custom models will reach stores a week later. AMD has prepared a whole stack of slides for the launch, and we can already see the information they contain. Only the text has been revealed, so we will see the slides themselves only this afternoon.

The slides contain a great deal of information. AMD covers how the cards were designed, how the cooling works, what to expect from ray tracing, how Infinity Cache helps, and so on. It turns out that the RX 6800 series PCB carries 15 power stages; most likely 13 of them are left for the GPU, because two power phases go to the memory. AMD says Infinity Cache made it possible to reach the GPU's frequency potential and also to improve efficiency. The cache also cuts the RX 6800 XT's memory latency by 48% compared with the RX 5700 XT. You can read more about the upcoming RX 6800 series cards in the AMD slide text below.

 

THE FOUNDATION OF AN AMAZING PRODUCT

AMD RDNA 2 ARCHITECTURE DESIGN GOALS

  • Pushing performance with higher frequencies
  • New levels of power efficiency with AMD Infinity Cache
  • Designed with features for gamers

PRODUCT DESIGN GOALS

  • Engineering – Exceptional thermals, PCB, and electrical
  • Platform – Built with the entire PC platform in mind
  • Experience – Tangible benefits for end-users

THE ROAD TO POWER EFFICIENCY
Achieving an average of 4.1X perf/watt with AMD RDNA2

[ graph where R9 290X is 1x, RX 6800 XT is 4.1x ]

EXCEPTIONAL THERMAL DESIGN

  • Extended vapor chamber for maximum performance
  • Graphite thermal interface material on GPU for high performance and maximum reliability
  • Die-cast aluminum frame for structural rigidity
  • High-performance, ultra-soft gap pads for efficient GDDR6 and MOSFET cooling
  • Zero RPM fan mode for silent operation during light workloads
  • Custom-designed axial fans for outstanding cooling and quiet operation
  • Premium die-cast aluminum shroud

PREMIUM PCB | INNOVATIVE ELECTRICAL

  • HDMI 2.1 with FRL
  • USB Type-C
  • Low PCIe slot peak currents
  • Premium IT-170 material
  • 15 high-efficiency power-stage phases
  • Standard edge location of power connectors
  • RGB Control [header]
  • 14-layer high performance PCB with 4 layers of 2 oz. copper for exceptional power delivery

MEMORY POWER PHASE COUNTS
High performance, low power

  • RX 6800 XT: 2 power phases, 8 memory devices
  • RTX 3090: 4 power phases, 24 memory devices
  • RTX 3080: 3 power phases, 10 memory devices

PLATFORM: BUILT FOR STANDARDS
Enabled by exceptional engineering

[ A render with RX 6800 air-flow in chassis, similar to the famous RTX 30 air flow render ]

  • STANDARD Air flow for push-pull chassis configuration
  • STANDARD Enthusiast power draw for simple upgrades (RX 6800: 650W min, RX 6800 XT: 750W min PSU)
  • STANDARD Power connector and location for clean cable management

DESIGNED WITH PARTNERS IN MIND
Enabling broad ecosystem and platform partnership

  • STANDARD SIZE – A 2 to 2.5 slot form factor enables seamless integration into existing chassis and partner systems
  • STANDARD PCB FORM FACTOR – A common design language suited for after-market cooling including AIO liquid cooling casing
  • STANDARD POWER – Suited for operation with existing enthusiast PSUs starting at 650W

EXPERIENCE – PHENOMENAL ACOUSTICS
Enabled by custom fan design and extended vapor chamber

  • Radeon RX 6800 XT 6 dBA quieter than Radeon RX 5700 XT
  • 70% less perceived noise with Radeon RX 6800 XT (compared to the Radeon RX 5700 XT at 35°C intake)

LOW POWER IDLE AND FAST WAKE-UP
Enabled by system-level power management innovations

  • Low power graphics off – 0.54X power – monitor idle vs RX 5700 XT
  • Display – 850ms monitor wake-up from long idle

EXCELLENT OVERCLOCKING
Extra performance on Radeon RX 6800 XT

  • 14-layer premium PCB – 4 layers of 2 ounces of copper for overclocking stability
  • 15 power stage phases – High efficiency power stages for clean voltage draw
  • Exceptional cooling – Extra thermal and acoustics margin built-in

AMD RADEON SOFTWARE

PERFORMANCE TUNING PRESETS
Simple, one-click custom power tuning modes to improve performance or save power

BENEFITS

  • QUIET – Reduces power and fan noise for cool & quiet operation with little impact on performance
  • BALANCED – Default power levels
  • RAGE MODE – Takes advantage of any extra headroom on the GPU to deliver the ultimate gaming performance

Radeon RX 6800 XT
Preset | Game Clock | Boost Clock
QUIET | 1950 MHz | up to 2185 MHz
BALANCED | 2015 MHz | up to 2250 MHz
RAGE | 2065 MHz | up to 2310 MHz
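
To put the preset clocks in perspective, the short Python sketch below expresses QUIET and RAGE as percentage changes relative to BALANCED. The clock values come from the table above; the dictionary and function names are made up for this illustration and are not part of Radeon Software.

```python
# Clocks in MHz (game clock, boost clock), copied from the preset table above.
PRESETS = {
    "QUIET": (1950, 2185),
    "BALANCED": (2015, 2250),
    "RAGE": (2065, 2310),
}


def delta_vs_balanced(preset):
    """Return (game, boost) clock change in percent relative to BALANCED."""
    base_game, base_boost = PRESETS["BALANCED"]
    game, boost = PRESETS[preset]
    return (
        (game / base_game - 1) * 100,
        (boost / base_boost - 1) * 100,
    )


for name in PRESETS:
    game_pct, boost_pct = delta_vs_balanced(name)
    print(f"{name:9s} game {game_pct:+.1f}%  boost {boost_pct:+.1f}%")
```

By this arithmetic the presets move the rated clocks by only about 3% in either direction, so any larger difference in practice would have to come from the power and fan limits rather than from the clock targets alone.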

INTRODUCING AMD FidelityFX Super Resolution

  • Currently in development at AMD
  • Stay tuned for more information as we collaborate with game developers

RASTERIZATION VS RAY TRACING

RASTERIZATION

  • Traditional path for real-time graphics rendering
  • Fast & Flexible
  • Can look very, very good, but results not “perfect”
    • Trade-offs between performance and quality are the norm

RAY TRACING

  • Ultimate solution to recreating reality in games
  • High performance cost
  • Typically reserved for offline rendering

RAY-TRACING ACCELERATION 
Changes the game

  • As rasterization becomes more capable and complex, its performance cost grows
  • In some cases, tracing rays becomes a reasonable trade-off for improved image quality
  • Hardware acceleration of ray tracing makes some ray-traced effects feasible now

SELECTIVE RAY-TRACED EFFECTS ARE NOW POSSIBLE

  • Developers can judiciously deploy ray tracing to improve realism in their games
  • Real-time ray tracing will involve quality and performance tradeoffs
  • Developers are still learning about how best to use ray-traced effects in combination with rasterization

COMMON USES OF RAY TRACING IN HYBRID RENDERING

REFLECTIONS

  • Can show reflections of objects not currently on-screen, which rasterized reflections typically miss
  • Fallback option: FidelityFX Screen Space Reflections

SHADOWS

  • Replaces often incredibly complex shadow volume implementations with higher-quality results

AMBIENT OCCLUSION

  • More accurately renders the finer detail of light and shadow, especially in the nooks and crannies of indirectly lit areas
  • Fallback option: FidelityFX Ambient Occlusion

GLOBAL ILLUMINATION

  • Attempts to model the transport of light around a scene, especially diffuse reflections from object to object

INTRODUCING FIDELITYFX DENOISER

  • Tracing rays is computationally expensive, so ray-traced effects are typically sparsely sampled
  • The resulting ray-traced images include some visual noise
  • FidelityFX Denoiser removes this noise and produces a clean, clear image

OUR GOAL: ENABLING DEVELOPERS TO DELIVER ASTOUNDING EXPERIENCES

  • The AMD RDNA 2 architecture and its ray-tracing acceleration hardware will set the standard for the industry
  • AMD is working with developers to enable the use of ray-traced effects where they will have the best impact
  • The goal, as always, remains fast and fluid animation with compelling results

AMD RDNA 2 DEEP DIVE

AMD RDNA 2 ARCHITECTURE

Enthusiast gaming with performance-per-watt leadership

  • PERFORMANCE – Up to 2X AMD Radeon RX 5700 XT in Just Over One Year
  • EFFICIENCY – Up to 54% Performance-per-Watt Gains in Same Process Node
  • FEATURES – Deliver DX12 Ultimate Experience for Every Gamer

RDNA 2 GAMING ARCHITECTURE
MORE PERFORMANCE, LESS POWER

  • BREAKTHROUGH HIGH-SPEED DESIGN – High frequencies and superb efficiency
  • REVOLUTIONARY AMD INFINITY CACHE – 128MB cache with extreme bandwidth at lower power
  • ADVANCED FEATURES – DX12 Ultimate and support for DirectStorage API

NAVI21 GPU details

  • 7nm
    • 519.8 sqmm
    • 26.8 Billion Transistors
  • I/O
    • x16 PCIe Gen4
    • 256-bit GDDR6 @ 16 Gbps peak
  • Display Engine
    • HDMI 2.1, AMD FreeSync Technology, DSC, and VRR
    • Future Ready for up to 8K 120Hz
  • Multi-Media Engine
    • 8K AV1 Decode
    • High Quality 8K HEVC Encode Accelerator
    • H.265 B-frame support
  • Command Processors
    • Graphics Engine
    • 4 Async Compute Engines
  • Cache Hierarchy
    • 128MB AMD Infinity Cache
    • 4MB L2
    • 1MB Distributed L1
  • Up to 80 Compute Units
    • 5120 Stream Processors
    • 320 Texture Units
    • 80 Ray Accelerators
  • Geometry Processor
    • 8 Pre-Cull Prims/Cycle
    • 4 Post-Cull Prims/Cycle
  • RB+
    • 1024 HiZ Pixels/Cycle
    • 256 Depth Samples/Cycle
    • 128 Pixel Launch/Cycle
    • 128 32b Pixel color write/Cycle
    • 64 64b Pixel color write/Cycle
    • 64 Pixel color blend/Cycle
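
The per-CU ratios and the peak memory bandwidth implied by this list can be checked with a few lines of Python. This is a rough sketch rather than AMD's own calculation: it assumes the 256 in the I/O entry is a bus width in bits and that 16 Gbps is the per-pin data rate, and the helper names are invented for the example.

```python
# Figures copied from the NAVI21 list above.
COMPUTE_UNITS = 80
STREAM_PROCESSORS = 5120
TEXTURE_UNITS = 320
RAY_ACCELERATORS = 80

# Assumption: the "256" in the I/O entry is a 256-bit memory bus,
# and 16 Gbps is the peak data rate of each pin.
BUS_WIDTH_BITS = 256
DATA_RATE_GBPS = 16


def per_cu(total):
    """Divide a chip-wide unit count evenly across the compute units."""
    return total // COMPUTE_UNITS


def peak_bandwidth_gb_per_s(bus_bits, gbps_per_pin):
    """Peak bandwidth in GB/s: pins times bits per second, divided by 8 bits per byte."""
    return bus_bits * gbps_per_pin / 8


print("stream processors per CU:", per_cu(STREAM_PROCESSORS))
print("texture units per CU:", per_cu(TEXTURE_UNITS))
print("ray accelerators per CU:", per_cu(RAY_ACCELERATORS))
print("peak GDDR6 bandwidth (GB/s):", peak_bandwidth_gb_per_s(BUS_WIDTH_BITS, DATA_RATE_GBPS))
```

Under these assumptions each compute unit holds 64 stream processors, 4 texture units and 1 ray accelerator, and the external memory peaks at 512 GB/s.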

BREAKTHROUGH HIGH-SPEED DESIGN 

HIGH FREQUENCY IN THE DNA

  • Leverages world-class CPU design methodologies
  • Streamlined micro-architecture

PERFORMANCE-POWER SCALABILITY

  • Up to 1.3X frequency at the same power per CU
  • Up to 50% lower per-CU power at the same frequency

PERFORMANCE-PER-WATT ACHIEVEMENT UP TO 54%

16% – DESIGN FREQUENCY INCREASE

  • Leverages CPU high frequency expertise
  • High speed performance libraries
  • Streamlined micro-architecture and design
  • Aggressive re-pipelined logic for speed

17% – CAC and Power Optimizations

  • Pervasive fine-grain clock gating
  • Clock tree splitting and gating
  • Redesigned for minimal data movement
  • Aggressive pipeline rebalancing

21% – Performance per Clock Enhancement

  • Infinity Cache amplified low latency/power bandwidth
  • TLD streamlined for latency reductions
  • Redesigned 32-bit pipe and included new HDR format
  • Optimized geometry distribution and tessellation
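
The three percentages appear to be presented as additive shares of the 54% headline figure. The Python sketch below checks that reading and, for comparison, shows what the same three gains would give if they multiplied instead. Treating the shares as additive is an interpretation of the slide text, not something AMD states here.

```python
import math

# Contribution of each area in percent, copied from the three headings above.
CONTRIBUTIONS = {
    "design frequency increase": 16,
    "CAC and power optimizations": 17,
    "performance per clock enhancement": 21,
}

# Reading 1: the shares simply add up to the headline number.
additive = sum(CONTRIBUTIONS.values())

# Reading 2 (for comparison only): each gain multiplies the previous ones.
compounded = (math.prod(1 + pct / 100 for pct in CONTRIBUTIONS.values()) - 1) * 100

print(f"additive:   {additive}%")
print(f"compounded: {compounded:.1f}%")
```

The additive reading lands exactly on 54, which suggests the slide splits the total gain into shares rather than listing three independent multipliers.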

THE ENHANCED AMD RDNA 2 COMPUTE UNIT

  • Streamlined for increased frequency and low power
  • Mixed Precision Operations for tensor math
  • Sampler feedback streaming and texture space shading
  • Ray Accelerator: 4 Box or 1 Triangle Intersection per cycle

OPERAND / RESULT | MODE | OPS/CYCLE/CU
FP16/FP16 | Packed | 256
FP16/FP32 | Mixed Precision | 256
FP32 | Native | 128
FP64 | Native | 8
Int64 | Native | 32
Int32 | Native | 128
Int16/Int16 | Packed | 256
Int16/Int32 | Mixed Precision | 256
Int8/Int32 | Mixed Precision | 512
Int4/Int32 | Mixed Precision | 1024
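
The per-CU rates in this table translate into chip-wide peak throughput once a CU count and a clock are chosen. The sketch below is only an illustration: it pairs the "up to 80 Compute Units" figure with the 2250 MHz BALANCED boost clock quoted for the RX 6800 XT, a combination the slides do not explicitly state for any single product, and the function name is hypothetical.

```python
# A few rows from the table above: operations per cycle per compute unit.
OPS_PER_CYCLE_PER_CU = {
    "FP16 packed": 256,
    "FP32": 128,
    "FP64": 8,
    "Int8/Int32": 512,
    "Int4/Int32": 1024,
}


def peak_tera_ops(ops_per_cycle_per_cu, compute_units=80, clock_mhz=2250):
    """Peak throughput in tera-operations per second.

    The defaults are illustrative assumptions: 80 CUs is the 'up to' figure
    from the GPU details, 2250 MHz is the BALANCED boost clock.
    """
    ops_per_second_in_millions = ops_per_cycle_per_cu * compute_units * clock_mhz
    return ops_per_second_in_millions / 1e6


for mode, rate in OPS_PER_CYCLE_PER_CU.items():
    print(f"{mode:12s} {peak_tera_ops(rate):8.2f} T ops/s")
```

Passing a different `compute_units` or `clock_mhz` shows how the peak scales linearly with both, which is why the slides put so much weight on frequency.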

REDESIGNED RB+
DESIGNED GROUND UP FOR FREQUENCY, POWER, AND EFFICIENCY

  • Each RB+ natively doubled the 32bpp color rate by processing eight 32-bit pixels per cycle.
  • The RB+ in conjunction with Rasterization expands Variable Rate Shading (VRS) results for 2×1, 1×2, 2×2 modes to the destination surface.
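
Combining the eight 32-bit pixels per cycle per RB+ with the chip-wide color write rates listed in the GPU details gives a rough picture of the back end. The unit count below is an inference from those two numbers, not a figure from the slides, and the 2250 MHz clock is borrowed from the BALANCED preset purely for illustration.

```python
# Chip-wide rates from the GPU details (pixels per cycle).
COLOR_WRITE_32B_PER_CYCLE = 128
COLOR_WRITE_64B_PER_CYCLE = 64

# From the RB+ slide: each RB+ processes eight 32-bit pixels per cycle.
PIXELS_32B_PER_RB_PLUS = 8


def rb_plus_units():
    """Inferred number of RB+ units (an assumption, not stated in the slides)."""
    return COLOR_WRITE_32B_PER_CYCLE // PIXELS_32B_PER_RB_PLUS


def fill_rate_gpixels(pixels_per_cycle, clock_mhz=2250):
    """Peak fill rate in gigapixels per second at the given clock."""
    return pixels_per_cycle * clock_mhz / 1000


print("inferred RB+ units:", rb_plus_units())
print("32-bit fill rate (Gpixel/s):", fill_rate_gpixels(COLOR_WRITE_32B_PER_CYCLE))
print("64-bit fill rate (Gpixel/s):", fill_rate_gpixels(COLOR_WRITE_64B_PER_CYCLE))
```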

AMD RDNA 2 MESH SHADING

Mesh shaders process workgroups of primitives

  • A geometry front-end with the flexibility of GPU Compute

Shader-based culling and work optimizations

  • Object ID, facedness, depth, occlusion
  • Bounding volume
  • LOD-based mesh determination
  • Custom vertex and geometry data de-composition

Data reuse

  • Vertex reuse on a workgroup scale

Optimized Computation

  • Attribute shading only for primitives that are not culled
  • Particle system physics + mesh in the same shader

AMD RDNA 2 SAMPLER FEEDBACK
Sampler feedback supports both advanced streaming and next-generation rendering

Advanced streaming

  • Memory footprint optimization
  • Texture filtering constrained to resident mipmap levels
  • Asynchronous updates of resident texture data

Texture space rendering

  • Identification of texture locations used in rasterization
  • Feedback data to optimize shading workloads

AMD RDNA 2 RAYTRACING

  • Dynamic Global Illumination
  • Ray-traced soft shadows from area lights
  • Hybrid reflections mixing compute and screen-space effects with full raytracing

AMD RDNA 2 RAYTRACING 

  • 4 Ray/Box Intersections processed per CU per clock
  • 1 Ray/Triangle Intersection processed per CU per clock
  • AMD RDNA 2 implements a high-performance ray tracing intersection acceleration architecture
    • The Ray Accelerator handles intersection of rays with the BVH, and sorting of ray intersection times
  • It provides an order of magnitude increase in intersection performance compared to a software implementation
  • Traversal of the BVH and shading of ray results is handled by shader code running on the Compute Units
  • AMD Infinity Cache can hold a very high percentage of the BVH working set, reducing intersection latency
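
To make the ray/box work concrete, here is a software version of the test that the Ray Accelerator performs in hardware: a standard slab test against axis-aligned boxes, followed by sorting the hits by entry distance. It is a teaching sketch in Python, not AMD's implementation. Mapping the "4 Ray/Box Intersections" to the four children of one BVH node is an assumption, and all function names are made up.

```python
import math


def ray_box_entry(origin, direction, box_min, box_max):
    """Slab test: return the entry distance t >= 0 if the ray hits the box, else None."""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray runs parallel to this slab: it misses unless the origin is inside it.
            if o < lo or o > hi:
                return None
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return None
    return t_near


def intersect_node(origin, direction, child_boxes):
    """Test up to four child boxes; return (t, child_index) pairs, nearest first."""
    hits = []
    for index, (box_min, box_max) in enumerate(child_boxes[:4]):
        t = ray_box_entry(origin, direction, box_min, box_max)
        if t is not None:
            hits.append((t, index))
    return sorted(hits)


# A ray along +x against four boxes: two in front, one off to the side, one behind.
children = [
    ((5, -1, -1), (6, 1, 1)),
    ((2, -1, -1), (3, 1, 1)),
    ((2, 2, 2), (3, 3, 3)),
    ((-3, -1, -1), (-2, 1, 1)),
]
print(intersect_node((0, 0, 0), (1, 0, 0), children))
```

In the split the slides describe, the shader code running on the Compute Units would take a sorted list like this and decide which child node to visit next, while the box and triangle tests themselves run in the fixed-function unit.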

AMD RDNA 2 VARIABLE RATE SHADING

  • AMD RDNA 2 variable rate shading is designed to deliver the maximum usability and flexibility for developers
  • Fine grained rate selection (per 8×8 pixels) makes it easier to select the appropriate shading rate for each region. Larger regions could cause more image quality or performance compromises.
  • AMD RDNA 2 supports coarse shading rates up to 2×2 with consistent and predictable performance improvements. Up to 4x improvements in effective shading throughput are attainable.
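
The "up to 4x" figure follows directly from the coarsest 2×2 rate: one shader invocation covers four pixels. The Python sketch below counts invocations per 8×8 tile for each rate and the resulting gain for a mix of tiles. It is a simplified model that assumes shading cost is proportional to invocation count; the names are illustrative only.

```python
TILE_SIZE = 8  # rate is selected per 8x8 pixel tile, per the slide

# Shading rate name -> (pixels covered in x, pixels covered in y) per invocation.
RATES = {"1x1": (1, 1), "2x1": (2, 1), "1x2": (1, 2), "2x2": (2, 2)}


def invocations_per_tile(rate):
    """Number of pixel shader invocations needed to cover one 8x8 tile."""
    width, height = RATES[rate]
    return (TILE_SIZE // width) * (TILE_SIZE // height)


def throughput_gain(tile_rates):
    """Effective shading throughput gain versus shading every tile at 1x1."""
    full_rate = len(tile_rates) * invocations_per_tile("1x1")
    actual = sum(invocations_per_tile(rate) for rate in tile_rates)
    return full_rate / actual


print(throughput_gain(["2x2"] * 100))                # every tile coarse
print(throughput_gain(["1x1"] * 50 + ["2x2"] * 50))  # half the tiles coarse
```

The second call illustrates why the gain is quoted as "up to": as soon as part of the frame stays at full rate, the overall improvement drops well below 4x.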

AMD INFINITY CACHE BENEFITS

  • 1.3 pJ Infinity Cache Access vs 7-8 pJ GDDR6 Access (Average hit rates for 4K titles up to 58%)
  • AMD Infinity Cache unleashes the potential of high-frequency GPU
  • Performance gains from higher frequency are significantly amplified by the cache
  • Key to unlocking more power-efficient gaming performance
  • A larger configuration will generally mean higher latency (wasted power and lower performance)
  • But with Radeon RX 6800 XT we source most of our bandwidth from the AMD Infinity Cache with up to 48% lower latency than Radeon RX 5700 XT memory
  • With our higher AMD Infinity Fabric clock rates, even raw memory accesses are faster
  • Combined, we get 34% reduction in average latency for improved energy efficiency and performance
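
The energy figures in the first bullet can be turned into a back-of-the-envelope estimate of average energy per access. The sketch below uses a plain weighted average and takes 7.5 pJ as the midpoint of the quoted 7-8 pJ range. Both choices are assumptions; in particular the model ignores any lookup cost paid on a cache miss.

```python
CACHE_ENERGY_PJ = 1.3   # Infinity Cache access, from the slide
GDDR6_ENERGY_PJ = 7.5   # assumed midpoint of the quoted 7-8 pJ range


def average_energy_pj(hit_rate):
    """Weighted average energy per access for a given cache hit rate (0..1)."""
    return hit_rate * CACHE_ENERGY_PJ + (1 - hit_rate) * GDDR6_ENERGY_PJ


def energy_saving(hit_rate):
    """Fraction of access energy saved versus going to GDDR6 every time."""
    return 1 - average_energy_pj(hit_rate) / GDDR6_ENERGY_PJ


hit_rate = 0.58  # "up to 58%" average hit rate for 4K titles
print(f"average energy: {average_energy_pj(hit_rate):.2f} pJ")
print(f"saving vs GDDR6 only: {energy_saving(hit_rate):.0%}")
```

Under this simple model a 58% hit rate roughly halves the energy spent per memory access, which is consistent with the slides calling the cache key to power-efficient performance.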

BANDWIDTH ON DEMAND
Cache boost clock for turbo-charged bandwidth

  • Games go through phases with widely varying bandwidth requirements
  • Since AMD Infinity Cache sources most bandwidth, power management can boost on demand
  • Boost Infinity Fabric clock for up to a 550 GB/s BW increase when needed, save power when not