- introduces Cooperative Vectors, a GPU API that enables hardware-accelerated vector–matrix multiplications inside shaders
- explains the motivation from neural material and neural radiance caching use cases
- covers practical inference using MulOptimal matrix layouts and training via OuterProductAccumulate and VectorAccumulate
- presents a CUDA-based software rasterizer capable of rendering datasets with very large triangle counts without precomputed LODs or acceleration structures
- uses a 3-stage pipeline: small triangles are rasterized by one thread, larger triangles by one warp, and the largest triangles are split and queued for a third stage
- presents performance comparisons against Vulkan-based hardware rasterization
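
The three-way split described above can be sketched as a simple classifier. This is a minimal illustration, not the rasterizer's actual code: the use of the screen-space bounding-box extent as the size metric, the threshold values `small_max` and `medium_max`, and the function name `classify_triangle` are all assumptions made for the example.

```python
def classify_triangle(verts, small_max=4, medium_max=64):
    """Pick a rasterization path from a triangle's screen-space size.

    verts: three (x, y) pixel coordinates.
    small_max / medium_max: hypothetical thresholds in pixels.
    """
    xs = [v[0] for v in verts]
    ys = [v[1] for v in verts]
    # Longest side of the axis-aligned bounding box, in pixels.
    extent = max(max(xs) - min(xs), max(ys) - min(ys))
    if extent <= small_max:
        return "thread"  # small: one thread rasterizes the whole triangle
    if extent <= medium_max:
        return "warp"    # larger: one warp cooperates on the triangle
    return "split"       # largest: split and queue for the third stage


if __name__ == "__main__":
    print(classify_triangle([(0, 0), (2, 1), (1, 3)]))
    print(classify_triangle([(0, 0), (40, 10), (5, 30)]))
    print(classify_triangle([(0, 0), (500, 10), (5, 300)]))
```
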
- presentation about the micropolygon system implemented in Anvil
- presents the whole pipeline, from data preparation through culling to the actual rasterization step
- shows how to integrate it into the streaming system and presents optimizations across the system
- additionally presents the challenges of getting the system to run on Switch 2
- The presentation discusses the challenges for the path tracing mode in F1 2025
- discusses the combination of ReSTIR and ReGIR used
- focuses on the challenges of night races with a large number of dynamic lights
- additionally presents a look at the debug and development tools developed for the process
- The talks discuss the rendering engine used for Anno 117
- covers how objects are built from individual subobjects layered together, how meshes are adapted to the terrain, and how both interact with the LOD system
- covers procedural grass generation
- presents the approach to ray tracing integration
- talk from the technical artist summit track that focuses on rendering a 3D world with a stylized 2D sprite look
- presents how to apply outline rendering, stylized shading, sky, and clouds
- additionally presents challenges related to pixel snapping, depth sorting, and shadows
- reverse engineers Apple’s undocumented Metal Lossy Compression format
- describes the memory layout: variable-size blocks packed within 128-byte tiles, with per-block metadata
- details the multiple encoding modes supported
- an experimental C# library that interprets HLSL shader code on the CPU, enabling CPU-side execution of shaders
- includes a built-in shader testing framework with HLSL-native test support
- supports the majority of HLSL features, including wave intrinsics, divergent control flow, groupshared memory, and texture types
- Investigates using Bayer matrix traversal for pixel visitation in online k-means color quantization, resulting in an ordered dithering pattern
- Shows that omitting the final Euclidean pixel mapping in Bayer traversal creates visible dither patterns without extra dithering steps
- Shuffling the block order within the Bayer matrix combines structure and randomness, yielding cleaner dithering than pure raster or random methods
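
As a rough illustration of what a Bayer-ordered pixel visitation could look like, the sketch below builds the standard recursive Bayer index matrix and visits cells in ascending index order. Treating "Bayer traversal" as "visit pixels in ascending Bayer index" is an assumption made for this example, and the function names `bayer_matrix` and `bayer_traversal` are hypothetical; the article's exact traversal and its block-shuffling variant may differ.

```python
def bayer_matrix(n):
    """Return the 2**n x 2**n Bayer index matrix as a list of rows."""
    m = [[0]]
    for _ in range(n):
        size = len(m)
        new = [[0] * (2 * size) for _ in range(2 * size)]
        for y in range(size):
            for x in range(size):
                v = 4 * m[y][x]
                new[y][x] = v                    # top-left quadrant
                new[y][x + size] = v + 2         # top-right quadrant
                new[y + size][x] = v + 3         # bottom-left quadrant
                new[y + size][x + size] = v + 1  # bottom-right quadrant
        m = new
    return m


def bayer_traversal(n):
    """Return (x, y) cell coordinates sorted by ascending Bayer index."""
    m = bayer_matrix(n)
    size = len(m)
    cells = [(m[y][x], x, y) for y in range(size) for x in range(size)]
    return [(x, y) for _, x, y in sorted(cells)]


if __name__ == "__main__":
    for row in bayer_matrix(2):
        print(row)
    print(bayer_traversal(1))
```

Consecutive cells in this order land far apart within the tile, which is the property that makes Bayer indices useful for ordered dithering.
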
- presents how to implement bloom using OpenGL
- explains the theory and shows the practical implementation walkthrough
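
The general shape of a bloom effect (bright pass, blur, additive composite) can be sketched on the CPU as follows. This is a generic illustration and not the linked article's OpenGL implementation: the box blur, the zero padding at the borders, the single-channel image, and all function and parameter names are assumptions chosen to keep the example short.

```python
def bright_pass(img, threshold=1.0):
    """Keep only pixels brighter than the threshold; zero the rest."""
    return [[p if p > threshold else 0.0 for p in row] for row in img]


def box_blur_1d(row, radius):
    """Box blur of one row, treating samples outside the row as zero."""
    n = len(row)
    width = 2 * radius + 1
    return [sum(row[max(0, i - radius):min(n, i + radius + 1)]) / width
            for i in range(n)]


def blur(img, radius=1):
    """Separable blur: horizontal pass, then vertical pass."""
    rows = [box_blur_1d(r, radius) for r in img]
    cols = [box_blur_1d(list(c), radius) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def bloom(img, threshold=1.0, radius=1, intensity=1.0):
    """Add the blurred bright regions back onto the original image."""
    glow = blur(bright_pass(img, threshold), radius)
    return [[p + intensity * g for p, g in zip(row, glow_row)]
            for row, glow_row in zip(img, glow)]


if __name__ == "__main__":
    image = [[0.0, 0.0, 0.0],
             [0.0, 4.0, 0.0],
             [0.0, 0.0, 0.0]]
    for row in bloom(image):
        print(row)
```

A GPU version would typically run the same three steps as fragment shader passes over offscreen render targets, often with a Gaussian or multi-resolution blur in place of the box filter.
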
Thanks to Jasper Bekkers for supporting this series.
Would you like to see your name here too? Become a patron of this series on Patreon.