Nono.MA

APRIL 26, 2024

I have an Apple M3 Max 14-inch MacBook Pro with 64 GB of Unified Memory (RAM) and 16 cores (12 performance and 4 efficiency).

It's awesome that PyTorch now supports Apple Silicon's Metal Performance Shaders (MPS) backend for GPU acceleration, which makes local inference and training much, much faster. For instance, each denoising step of Stable Diffusion XL takes ~2s with the MPS backend and ~20s on the CPU.

Want to see older publications? Visit the archive.

Listen to Getting Simple .