Get started with tensorflow-metal

Accelerate the training of machine learning models with TensorFlow right on your Mac. Install base TensorFlow and the tensorflow-metal PluggableDevice to accelerate training with Metal on Mac GPUs.

Learn about TensorFlow PluggableDevices

Requirements

  • Mac computers with Apple silicon or AMD GPUs
  • macOS 12.0 or later
  • Python 3.8 or later
  • Xcode command-line tools: xcode-select --install
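The Python and macOS requirements above can be checked before installing anything. The following is a minimal sketch using only the standard library; the function name check_requirements is hypothetical, and the check does not cover the GPU type or the Xcode command-line tools.

```python
import platform
import sys


def check_requirements(python_version, macos_version, system):
    """Return a list of problems; an empty list means the checks passed.

    python_version: tuple such as (3, 9)
    macos_version:  string such as "12.3.1" (as returned by platform.mac_ver()[0])
    system:         value of platform.system(), "Darwin" on macOS
    """
    problems = []
    if system != "Darwin":
        problems.append("not running on macOS")
    else:
        # Keep only major.minor and pad, so "12" is treated as (12, 0).
        parts = tuple(int(p) for p in macos_version.split(".")[:2] if p.isdigit())
        parts = (parts + (0, 0))[:2]
        if parts < (12, 0):
            problems.append("macOS %s is older than 12.0" % macos_version)
    if tuple(python_version[:2]) < (3, 8):
        problems.append("Python %d.%d is older than 3.8" % tuple(python_version[:2]))
    return problems


if __name__ == "__main__":
    found = check_requirements(
        sys.version_info[:2], platform.mac_ver()[0], platform.system()
    )
    print("OK" if not found else "\n".join(found))
```

Passing the values in as arguments, rather than reading them inside the function, keeps the check easy to try against other version combinations.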

Get started

1. Set up

arm64 : Apple silicon

Download and install the Conda environment (the commands below assume the installer was saved as ~/miniconda.sh)

bash ~/miniconda.sh -b -p $HOME/miniconda
source ~/miniconda/bin/activate
conda install -c apple tensorflow-deps

x86 : AMD

Virtual environment

python3 -m venv ~/venv-metal
source ~/venv-metal/bin/activate
python -m pip install -U pip

2. Install base TensorFlow

python -m pip install tensorflow-macos

3. Install tensorflow-metal plug-in

python -m pip install tensorflow-metal

4. Verify

You can verify using a simple script:

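import tensorflow as tf

# Load CIFAR-100: 32x32 RGB images with 100 integer class labels.
cifar = tf.keras.datasets.cifar100
(x_train, y_train), (x_test, y_test) = cifar.load_data()

# Build ResNet50 from scratch (no pretrained weights) for 100 classes.
model = tf.keras.applications.ResNet50(
    include_top=True,
    weights=None,
    input_shape=(32, 32, 3),
    classes=100,
)

# ResNet50's top layer applies softmax by default, so the model outputs
# probabilities rather than logits.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
model.compile(optimizer="adam", loss=loss_fn, metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=64)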
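A training run that completes does not by itself show that the GPU was used. As a lighter check, the sketch below asks TensorFlow which GPU devices it can see. It assumes the plug-in registers the Mac GPU under the device type "GPU"; the helper names visible_gpus and describe are hypothetical.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Assumption: the Metal plug-in exposes the Mac GPU as device type "GPU".
    return [device.name for device in tf.config.list_physical_devices("GPU")]


def describe(gpus):
    """Turn the result of visible_gpus() into a one-line status message."""
    if gpus is None:
        return "TensorFlow is not installed in this environment"
    if not gpus:
        return "TensorFlow is installed but no GPU device is visible"
    return "GPU devices: " + ", ".join(gpus)


print(describe(visible_gpus()))
```

If no GPU device is listed, a likely cause is that tensorflow-metal was installed into a different environment from the one running the script.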

Releases

tensorflow-macos   tensorflow-metal   macOS version   Features
v2.5               0.1.2              12.0+           Pluggable device
v2.6               0.2.0              12.0+           Variable sequences for RNN layers
v2.7               0.3.0              12.0+           Custom op support
v2.8               0.4.0              12.0+           RNN performance improvements
v2.9               0.5.0              12.1+           Distributed training
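If the rows of the release table are read as matching pairs (an assumption; the table does not say so explicitly), the two packages can be pinned together. The sketch below copies the pairings from the table and builds pip requirement strings; the RELEASES mapping and the pip_pins helper are hypothetical names.

```python
# Pairings copied from the Releases table; nothing newer is assumed.
RELEASES = {
    "2.5": "0.1.2",
    "2.6": "0.2.0",
    "2.7": "0.3.0",
    "2.8": "0.4.0",
    "2.9": "0.5.0",
}


def pip_pins(tf_macos_version):
    """Return pip requirement strings for a tensorflow-macos minor version.

    Accepts "2.9" or "v2.9". Raises ValueError for versions not in the table.
    """
    key = tf_macos_version.lstrip("v")
    metal = RELEASES.get(key)
    if metal is None:
        raise ValueError("no tensorflow-metal pairing listed for %r" % tf_macos_version)
    # "==2.9.*" matches any patch release of the 2.9 series.
    return ["tensorflow-macos==%s.*" % key, "tensorflow-metal==%s" % metal]


print(" ".join(pip_pins("v2.9")))
```

The printed strings can be passed to python -m pip install, each one quoted so the shell does not expand the asterisk.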

Troubleshooting

  • Error: “Could not find a version that satisfies the requirement tensorflow-macos (from versions: none).” The package manager couldn’t find a TensorFlow wheel that matches the current Python environment. Check that the environment uses a supported Python version (3.8, 3.9, or 3.10).
  • Error: “No registered OpKernel. (OpKernel was found, but attributes didn’t match) Requested Attributes: dtype=DT_COMPLEX64.” Complex data type isn’t supported by tensorflow-metal.
  • Error: “Cannot assign a device for operation: Could not satisfy explicit device specification because the node was colocated with a group of nodes that required incompatible device.” This colocation error occurs when an operation has no GPU implementation available. Please report the missing operation by posting on the Apple Developer Forums.
  • CPU performance is faster than GPU on your network. Check whether your workload is large enough to benefit from the GPU. On small networks running with small batch sizes, the CPU may be faster overall because of the overhead of dispatching computations to the GPU. This overhead is amortized as the batch or model size grows, since the GPU can then make better use of parallelism in the computations.
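To see where the GPU starts to pay off for a given workload, it can help to time the same operation on both devices at several sizes. The sketch below is one rough way to do that, using a square matrix multiplication as a stand-in for real work; best_time and compare_devices are hypothetical helpers, and the device strings "/CPU:0" and "/GPU:0" assume a single GPU exposed under the device type "GPU".

```python
import time


def best_time(fn, repeats=5):
    """Run fn() `repeats` times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def compare_devices(size=2048, repeats=5):
    """Time a size x size matmul per device; returns {} if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return {}
    devices = ["/CPU:0"]
    if tf.config.list_physical_devices("GPU"):
        devices.append("/GPU:0")
    results = {}
    for device in devices:
        with tf.device(device):
            a = tf.random.normal((size, size))
            b = tf.random.normal((size, size))
            # .numpy() forces the result to be computed before the timer stops.
            results[device] = best_time(lambda: tf.matmul(a, b).numpy(), repeats)
    return results
```

Calling compare_devices with a small size (for example 256) and a large one (for example 4096) gives a rough picture of how the dispatch overhead is amortized. Taking the minimum over several runs reduces the effect of one-time warm-up costs.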

Currently not supported

  • Multi-GPU support
  • Acceleration for Intel GPUs
  • V1 TensorFlow networks

Questions and feedback

To ask questions and share feedback about the tensorflow-metal plug-in, visit the Apple Developer Forums.