P2I.ai

Apple Edge LLM Lab: MPS vs MLX vs Metal benchmarking

A reproducible benchmark suite to compare inference latency, throughput, and memory on Apple Silicon.

Thu Jan 15 2026 • Apple Silicon, MPS, MLX, Metal, Benchmark

Summary

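This work benchmarks LLM inference on Apple Silicon across three backends: MPS (Metal Performance Shaders), MLX, and Metal-centric pipelines. Each backend is measured on latency, throughput (tokens/sec), and peak memory, using repeatable measurements.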

Architecture diagram 1: Apple Edge LLM Lab, MPS vs MLX vs Metal benchmarking
Mermaid source
flowchart LR
  A[Model + Prompt] --> B[Runner]
  B --> C{Backend}
  C --> D[MPS]
  C --> E[MLX]
  C --> F[Metal]
  D --> G[Metrics]
  E --> G
  F --> G
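The flow in the diagram (one model and prompt, a runner, a choice of backend, and a shared metrics sink) could be sketched roughly as follows. This is a minimal illustration, not the suite's actual code: the `Backend` signature, `RunRecord`, `run_benchmark`, and `fake_backend` are all hypothetical names, and the stub backend only sleeps instead of calling MPS, MLX, or Metal.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical backend signature (an assumption, not the suite's API):
# takes a prompt and a token budget, returns the number of tokens generated.
Backend = Callable[[str, int], int]


@dataclass
class RunRecord:
    backend: str
    latency_s: float
    tokens: int


def fake_backend(prompt: str, max_tokens: int) -> int:
    """Stand-in for a real MPS, MLX, or Metal runner; it only simulates work."""
    time.sleep(0.001)
    return max_tokens


def run_benchmark(backends: Dict[str, Backend], prompt: str, max_tokens: int,
                  warmup: int = 1, repeats: int = 3) -> List[RunRecord]:
    """Run the same prompt through every backend and record timed runs."""
    records: List[RunRecord] = []
    for name, generate in backends.items():
        # Warmup runs are discarded so one-time setup cost does not skew timings.
        for _ in range(warmup):
            generate(prompt, max_tokens)
        for _ in range(repeats):
            start = time.perf_counter()
            tokens = generate(prompt, max_tokens)
            elapsed = time.perf_counter() - start
            records.append(RunRecord(name, elapsed, tokens))
    return records


# All three entries point at the stub; real runners would be swapped in here.
BACKENDS = {"mps": fake_backend, "mlx": fake_backend, "metal": fake_backend}
records = run_benchmark(BACKENDS, "Hello", max_tokens=16)
```

Keeping every backend behind the same callable signature means the timing loop is identical for all three, which is one way to keep the comparison repeatable.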

Results (placeholder)

  • Latency: TBD
  • Tokens/sec: TBD
  • Peak memory: TBD
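
Until real numbers are available, the three metrics above could be derived from raw per-run timings along these lines. This is a sketch under stated assumptions: `summarize` and `peak_rss_mib` are hypothetical helpers, the input values are made-up illustration data rather than benchmark results, and process peak RSS may not reflect every GPU-side allocation, so a backend-specific memory counter may be more accurate.

```python
import resource
import statistics
import sys
from typing import Dict, List


def summarize(latencies_s: List[float], tokens: List[int]) -> Dict[str, float]:
    """Aggregate per-run latencies (seconds) and token counts into summary metrics."""
    total_time = sum(latencies_s)
    return {
        "latency_median_s": statistics.median(latencies_s),
        "latency_max_s": max(latencies_s),
        # Throughput over all runs: total tokens divided by total wall time.
        "tokens_per_sec": sum(tokens) / total_time,
    }


def peak_rss_mib() -> float:
    """Peak resident set size of this process in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


# Illustration data only, not measured results.
summary = summarize([0.5, 1.0, 1.5], [50, 100, 150])
```

Reporting the median rather than the mean keeps a single slow run from dominating the latency figure.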