ML Compute

Accelerate training and validation of neural networks using the CPU and GPUs.

ML Compute Documentation

Posts under ML Compute tag

43 Posts
Post not yet marked as solved
0 Replies
19 Views
NLEmbedding.wordEmbedding is not available for Korean. This is a very serious issue for any service that caters to Korean users; please fix it quickly. Sample code is below.

import UIKit
import CoreML
import NaturalLanguage

class MLTextViewController: UIViewController {
    override func viewDidLoad() {
        super.viewDidLoad()
        execute()
    }

    func execute() {
        // Nothing below runs if no word embedding exists for the requested language.
        if let embedding = NLEmbedding.wordEmbedding(for: .korean) {
            let word = "bicycle"
            if let vector = embedding.vector(for: word) {
                print(vector)
            }
            let specificDistance = embedding.distance(between: word, and: "motorcycle")
            print("✅ \(specificDistance.description)")
            embedding.enumerateNeighbors(for: word, maximumCount: 5) { neighbor, distance in
                print("\(neighbor): \(distance.description)")
                return true
            }
        }
    }
}
Posted by karrotman.
Post not yet marked as solved
0 Replies
81 Views
I cannot find the bug: running this Python code on the torch device mps:0 is slow, and it is quicker on cpu:0 or cpu:1. Where is the bug? Or should it run on the Neural Engine with cpu:1? You need a setup like this:

#!/bin/bash
export HOMEBREW_BREW_GIT_REMOTE="https://github.com/Homebrew/brew"  # put your Git mirror of Homebrew/brew here
export HOMEBREW_CORE_GIT_REMOTE="https://github.com/Homebrew/homebrew-core"  # put your Git mirror of Homebrew/homebrew-core here
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
eval "$(/opt/homebrew/bin/brew shellenv)"
brew update --force --quiet
chmod -R go-w "$(brew --prefix)/share/zsh"
export OPENBLAS=$(/opt/homebrew/bin/brew --prefix openblas)
export CFLAGS="-falign-functions=8 ${CFLAGS}"
brew install wget
brew install unzip
conda init --all
conda create -n torch-gpu python=3.10
conda activate torch-gpu
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 -c pytorch
conda install -c conda-forge jupyter jupyterlab
python3 -m pip install --upgrade pip
python3 -m pip install insightface==0.2.1 onnx imageio scikit-learn scikit-image moviepy
python3 -m pip install googledrivedownloader
python3 -m pip install imageio==2.4.1
python3 -m pip install Cython
python3 -m pip install --no-use-pep517 numpy
python3 -m pip install torch
python3 -m pip install image
python3 -m pip install timm
python3 -m pip install Pillow  # the PIL package is published on PyPI as Pillow
python3 -m pip install h5py
for i in `seq 1 6`; do
    python3 test.py
done
conda deactivate
exit 0

test.py:

import torch
import math

# this ensures that the current macOS version is at least 12.3+
print(torch.backends.mps.is_available())
# this ensures that the current PyTorch installation was built with MPS activated.
print(torch.backends.mps.is_built())

dtype = torch.float
device = torch.device("cpu",0)
#device = torch.device("cpu",1)
#device = torch.device("mps",0)

# Create random input and output data
x = torch.linspace(-math.pi, math.pi, 2000, device=device, dtype=dtype)
y = torch.sin(x)

# Randomly initialize weights
a = torch.randn((), device=device, dtype=dtype)
b = torch.randn((), device=device, dtype=dtype)
c = torch.randn((), device=device, dtype=dtype)
d = torch.randn((), device=device, dtype=dtype)

learning_rate = 1e-6
for t in range(2000):
    # Forward pass: compute predicted y
    y_pred = a + b * x + c * x ** 2 + d * x ** 3

    # Compute and print loss
    loss = (y_pred - y).pow(2).sum().item()
    if t % 100 == 99:
        print(t, loss)

    # Backprop to compute gradients of a, b, c, d with respect to loss
    grad_y_pred = 2.0 * (y_pred - y)
    grad_a = grad_y_pred.sum()
    grad_b = (grad_y_pred * x).sum()
    grad_c = (grad_y_pred * x ** 2).sum()
    grad_d = (grad_y_pred * x ** 3).sum()

    # Update weights using gradient descent
    a -= learning_rate * grad_a
    b -= learning_rate * grad_b
    c -= learning_rate * grad_c
    d -= learning_rate * grad_d

print(f'Result: y = {a.item()} + {b.item()} x + {c.item()} x^2 + {d.item()} x^3')
Posted by Smiril.
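Editor's sketch: one way to rule out an arithmetic bug in the hand-written backprop of test.py, before looking at device speed, is to compare the analytic gradients with finite differences. The following is a minimal pure-Python version of that check on a smaller grid. The helper names loss_and_grads and numeric_grads, the grid size, and the starting coefficients are illustrative assumptions, not part of the post.

```python
import math


def loss_and_grads(coeffs, xs, ys):
    """Sum-of-squares loss and its analytic gradient, mirroring test.py."""
    a, b, c, d = coeffs
    loss = 0.0
    grads = [0.0, 0.0, 0.0, 0.0]
    for x, y in zip(xs, ys):
        y_pred = a + b * x + c * x ** 2 + d * x ** 3
        err = y_pred - y
        loss += err ** 2
        grad_y_pred = 2.0 * err
        # d(loss)/d(coefficient k) accumulates grad_y_pred * x**k
        for k in range(4):
            grads[k] += grad_y_pred * x ** k
    return loss, grads


def numeric_grads(coeffs, xs, ys, eps=1e-4):
    """Central finite differences of the loss, one coefficient at a time."""
    result = []
    for k in range(4):
        up = list(coeffs)
        up[k] += eps
        down = list(coeffs)
        down[k] -= eps
        loss_up, _ = loss_and_grads(up, xs, ys)
        loss_down, _ = loss_and_grads(down, xs, ys)
        result.append((loss_up - loss_down) / (2 * eps))
    return result


n = 200  # smaller than the 2000 points in test.py, to keep the check fast
xs = [-math.pi + 2 * math.pi * i / (n - 1) for i in range(n)]
ys = [math.sin(x) for x in xs]
coeffs = [0.1, -0.2, 0.3, -0.4]  # arbitrary fixed starting point (assumption)

_, analytic = loss_and_grads(coeffs, xs, ys)
numeric = numeric_grads(coeffs, xs, ys)
max_rel_err = max(abs(p - q) / max(1.0, abs(p)) for p, q in zip(analytic, numeric))
print("max relative gradient error:", max_rel_err)
```

If the two gradients agree to within rounding, the update rule itself is probably not the problem, and the difference between devices would have to come from somewhere other than the math.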
Post not yet marked as solved
0 Replies
173 Views
Hi, I am looking for a routine to perform complex-valued linear algebra on the GPU in Python for scientific programming, in particular quantum physics simulations. At the moment I am looking for a routine for complex-valued matrix multiplication. I found that MLX has a routine for float matrix multiplication, but it does not work directly for complex-valued matrices. I found a workaround by splitting the complex-valued matrix into its real and imaginary parts and working with the pair, but that makes it cumbersome to integrate with the rest of the code. I was hoping for a library-based implementation similar to CuPy. I also tried the TensorFlow linear algebra routines, but so far I couldn't get them to run on the GPU. Specifically, a test file with a tensorflow.keras.applications.ResNet50 routine runs on the GPU, but the routines from tensorflow.linalg and tensorflow.math that I tested (matmul, expm, eigh) did not run on the GPU. Any advice on how to make linear algebra calculations work on Mac GPUs is highly appreciated! For my application the unified memory might be especially beneficial. Thank you!
Posted by MG607.
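Editor's sketch: the real/imaginary split described in the post above follows from (A + iB)(C + iD) = (AC - BD) + i(AD + BC), so one complex product costs four real products. The pure-Python version below only illustrates that identity. The function names are hypothetical, and swapping the plain matmul helper for a GPU float routine (such as the MLX one the post mentions) is an assumption that has not been tested here.

```python
def matmul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def complex_matmul(a_re, a_im, b_re, b_im):
    """Product of (a_re + i*a_im) and (b_re + i*b_im) from four real products."""
    ac = matmul(a_re, b_re)
    bd = matmul(a_im, b_im)
    ad = matmul(a_re, b_im)
    bc = matmul(a_im, b_re)
    real = [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(ac, bd)]
    imag = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(ad, bc)]
    return real, imag


def split(matrix):
    """Split a complex matrix into separate real and imaginary matrices."""
    real = [[z.real for z in row] for row in matrix]
    imag = [[z.imag for z in row] for row in matrix]
    return real, imag


a = [[1 + 2j, 3 - 1j], [0 + 1j, 2 + 0j]]
b = [[2 - 1j, 0 + 0j], [1 + 1j, -1 + 3j]]

a_re, a_im = split(a)
b_re, b_im = split(b)
real, imag = complex_matmul(a_re, a_im, b_re, b_im)

# Reference result using Python's built-in complex arithmetic
expected = matmul(a, b)
```

Wrapping the pair in a small class with its own multiplication operator is one possible way to keep the split out of the calling code.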
Post not yet marked as solved
0 Replies
373 Views
In theory, sending signals from iPhone apps to and from the brain with non-invasive technology could be achieved through a combination of brain-computer interface (BCI) technologies, machine learning algorithms, and mobile app development.

Brain-Computer Interface (BCI): BCI technology can be used to record brain signals and translate them into commands that can be understood by a computer or a mobile device. Non-invasive BCIs, such as electroencephalography (EEG), can track brain activity using sensors placed on or near the head[6]. For instance, a portable, non-invasive, mind-reading AI developed by UTS uses an AI model called DeWave to translate EEG signals into words and sentences[3].

Machine Learning Algorithms: Machine learning algorithms can be used to analyze and interpret the brain signals recorded by the BCI. These algorithms can learn from large quantities of EEG data to translate brain signals into specific commands[3].

Mobile App Development: A mobile app can be developed to receive these commands and perform specific actions on the iPhone. The app could also potentially send signals back to the brain using technologies like transcranial magnetic stimulation (TMS), which can deliver information to the brain[5].

However, it's important to note that while this technology is theoretically possible, it's still in the early stages of development and faces significant technical and ethical challenges. Current non-invasive BCIs do not have the same level of fidelity as invasive devices, and the practical application of these systems is still limited[1][3]. Furthermore, ethical considerations around privacy, consent, and the potential for misuse of this technology must also be addressed[13].

Sources
[1] You can now use your iPhone with your brain after a major breakthrough | Semafor https://www.semafor.com/article/11/01/2022/you-can-now-use-your-iphone-with-your-brain
[2] ! Are You A Robot? https://www.sciencedirect.com/science/article/pii/S1110866515000237
[3] Portable, non-invasive, mind-reading AI turns thoughts into text https://techxplore.com/news/2023-12-portable-non-invasive-mind-reading-ai-thoughts.html
[4] Elon Musk's Neuralink implants brain chip in first human https://www.reuters.com/technology/neuralink-implants-brain-chip-first-human-musk-says-2024-01-29/
[5] BrainNet: A Multi-Person Brain-to-Brain Interface for Direct Collaboration Between Brains - Scientific Reports https://www.nature.com/articles/s41598-019-41895-7
[6] Brain-computer interfaces and the future of user engagement https://www.fastcompany.com/90802262/brain-computer-interfaces-and-the-future-of-user-engagement
[7] Mobile App + Wearable For Neurostimulation - Accion Labs https://www.accionlabs.com/mobile-app-wearable-for-neurostimulation
[8] Signal Generation, Acquisition, and Processing in Brain Machine Interfaces: A Unified Review https://www.frontiersin.org/articles/10.3389/fnins.2021.728178/full
[9] Mind-reading technology has arrived https://www.vox.com/future-perfect/2023/5/4/23708162/neurotechnology-mind-reading-brain-neuralink-brain-computer-interface
[10] Synchron Brain Implant - Breakthrough Allows You to Control Your iPhone With Your Mind - Grit Daily News https://gritdaily.com/synchron-brain-implant-controls-tech-with-the-mind/
[11] Mind uploading - Wikipedia https://en.wikipedia.org/wiki/Mind_uploading
[12] BirgerMind - Express your thoughts loudly https://birgermind.com
[13] Elon Musk wants to merge humans with AI. How many brains will be damaged along the way? https://www.vox.com/future-perfect/23899981/elon-musk-ai-neuralink-brain-computer-interface
[14] Models of communication and control for brain networks: distinctions, convergence, and future outlook https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7655113/
[15] Mind Control for the Masses—No Implant Needed https://www.wired.com/story/nextmind-noninvasive-brain-computer-interface/
[16] Elon Musk unveils Neuralink’s plans for brain-reading ‘threads’ and a robot to insert them https://www.theverge.com/2019/7/16/20697123/elon-musk-neuralink-brain-reading-thread-robot
[17] Essa and Kotte https://arxiv.org/pdf/2201.04229.pdf
[18] Synchron's Brain Implant Breakthrough Lets Users Control iPhones And iPads With Their Mind https://hothardware.com/news/brain-implant-breakthrough-lets-you-control-ipad-with-your-mind
[19] An Apple Watch for Your Brain https://www.thedeload.com/p/an-apple-watch-for-your-brain
[20] Toward an information theoretical description of communication in brain networks https://direct.mit.edu/netn/article/5/3/646/97541/Toward-an-information-theoretical-description-of
[21] A soft, wearable brain–machine interface https://news.ycombinator.com/item?id=28447778
[22] Portable neurofeedback App https://www.psychosomatik.com/en/portable-neurofeedback-app/
[23] Intro to Brain Computer Interface http://learn.neurotechedu.com/introtobci/
Posted by ztick.
Post marked as solved
1 Replies
320 Views
Hello, My understanding of the paper below is that iOS ships with a MobileNetv3-based ML model backbone, which then uses different heads for specific tasks in iOS. I understand that this backbone is accessible for various uses through the Vision framework, but I was wondering if it is also accessible for on-device fine-tuning for other purposes. As an example, if I want a model that detects some unique object in a photo, can I use the built-in backbone, or do I have to include my own in the app? Thanks very much for any advice, and apologies if I didn't understand something correctly. Source: https://machinelearning.apple.com/research/on-device-scene-analysis
Posted by Sark.
Post not yet marked as solved
0 Replies
324 Views
I am currently facing a performance issue while using Core ML on iOS 16+ devices to run a simple grid_sample model. When profiling the model with the Xcode profiler, I noticed that before each NPU computation there is a significant delay caused by the "input copy" and "neural engine-data copy" operations. I have specified that both the input and the output of the model are of type float16, so there shouldn't be any data type conversion. I would appreciate any insights or suggestions regarding the reasons behind this delay and possible solutions.

My simple model is:

import os

import coremltools
import numpy as np
import torch
import torch.nn.functional as F


class GridSample(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, input: torch.Tensor, grid: torch.Tensor) -> torch.Tensor:
        output = F.grid_sample(
            input,
            grid.to(input),
            mode='nearest',
            padding_mode='zeros',
            align_corners=True,
        )
        return output


tr_input = torch.randn((8, 64, 512, 512))
tr_grid = torch.randn((8, 256, 256, 2))
simple_model = GridSample()
simple_model.eval()
traced_model = torch.jit.trace(simple_model, [tr_input, tr_grid])

# Declare float16 inputs and outputs so that no type conversion should be needed
coreml_input = [
    coremltools.TensorType(name="image_input", shape=tr_input.shape, dtype=np.float16),
    coremltools.TensorType(name="warp_grid", shape=tr_grid.shape, dtype=np.float16),
]
mlmodel = coremltools.converters.convert(
    traced_model,
    inputs=coreml_input,
    convert_to="mlprogram",
    minimum_deployment_target=coremltools.target.iOS16,
    compute_units=coremltools.ComputeUnit.ALL,
    compute_precision=coremltools.precision.FLOAT16,
    outputs=[coremltools.TensorType(name="x0", dtype=np.float16)],
    debug=False,
)
mlmodel.save("./grid_sample.mlpackage")
os.system("xcrun coremlcompiler compile './grid_sample.mlpackage' './'")
Posted by jwyyy.
Post not yet marked as solved
0 Replies
243 Views
I have a neural network that should run on my device with 3 different input shapes. When I convert it to mlmodel or mlpackage files with a fixed input size, it runs on the ANE. But when I convert it with EnumeratedShapes, it runs only on the CPU. Why? I think the problematic layer is the slice (which is converted to SliceStatic in the flexible model), but I don't understand why, or whether there is any way to solve it and run the enumerated model on the ANE. Here is my code:

import coremltools as ct
import torch


class TestModel(torch.nn.Module):
    def __init__(self):
        super(TestModel, self).__init__()
        self.dw1 = torch.nn.Conv2d(in_channels=641, out_channels=641, kernel_size=(5,4), groups=641)
        self.pw1 = torch.nn.Conv2d(in_channels=641, out_channels=512, kernel_size=(1,1))
        self.relu = torch.nn.ReLU()
        self.pw2 = torch.nn.Conv2d(in_channels=512, out_channels=641, kernel_size=(1,1))
        self.dw2 = torch.nn.Conv2d(in_channels=641, out_channels=641, kernel_size=(5,1), groups=641)
        self.pw3 = torch.nn.Conv2d(in_channels=641, out_channels=512, kernel_size=(1,1))
        self.block1_dw = torch.nn.Conv2d(in_channels=512, out_channels=512, kernel_size=(5,1), groups=512)
        self.block1_pw = torch.nn.Conv2d(in_channels=512, out_channels=512, kernel_size=(1,1))

    def forward(self, inputs):
        x = self.dw1(inputs)
        x = self.pw1(x)
        x = self.relu(x)
        x = self.pw2(x)
        x = self.dw2(x)
        x = self.pw3(x)
        x = self.relu(x)
        y = self.block1_dw(x)
        y = self.block1_pw(y)
        y = self.relu(y)
        # Slice x along the height axis so it matches y before the residual add
        z = x[:,:,4:,:] + y
        return z


ex_input = torch.rand(1, 641, 44, 4)
traced_model = torch.jit.trace(TestModel().eval(), [ex_input,])
# enum_shape holds the enumerated input shapes; its definition is not shown in the post
ct_enum_inputs = [ct.TensorType(name='inputs', shape=enum_shape)]
ct_outputs = [ct.TensorType(name='out')]
mlmodel_enum = ct.convert(traced_model, inputs=ct_enum_inputs, outputs=ct_outputs, convert_to="neuralnetwork")
mlmodel_enum.save(...)

Thanks.
Posted by yanivz.
Post not yet marked as solved
0 Replies
231 Views
I created a new environment on Conda and then installed TensorFlow using the command "pip install TensorFlow" on my Mac M1 Pro machine. But TensorFlow is not working.
Post marked as solved
1 Replies
366 Views
Hello Apple Developer community, I hope this message finds you well. I am currently facing an issue with Create ML in Xcode, and I am seeking assistance from the knowledgeable members of this forum. Any help or guidance would be greatly appreciated.

Problem Description: I am encountering an unexpected issue when attempting to create a classification model for images using Create ML in Xcode. Upon opening Create ML, the application closes unexpectedly when I choose to create a new image classification model.

Steps I Have Taken: I have already tried the following steps to troubleshoot the issue:
Updated Xcode and macOS to the latest versions.
Restarted Xcode and my computer.
Created a new sample project to isolate the issue.
Despite these efforts, the problem persists.

System Information:
Xcode Version: 15.2
macOS Version: Sonoma 14.0

I am on a tight deadline for a project, and resolving this issue quickly is crucial. Your help is invaluable, and I thank you in advance for any support you can provide. Best regards.
Posted by JuanLos.
Post not yet marked as solved
2 Replies
454 Views
Hello, I followed the instructions provided here: https://developer.apple.com/metal/tensorflow-plugin/ and while trying to run the example I am getting the following error:

NotFoundError: dlopen(/Users/nedimhadzic/venv-metal/lib/python3.11/site-packages/tensorflow-plugins/libmetal_plugin.dylib, 0x0006): Symbol not found: __ZN10tensorflow16TensorShapeProtoC1ERKS0_
Referenced from: <C62E0AB4-567E-3E14-8F96-9F07A746C4DC> /Users/nedimhadzic/venv-metal/lib/python3.11/site-packages/tensorflow-plugins/libmetal_plugin.dylib
Expected in: <FFF31651-3926-3E79-A442-143B7156FB13> /Users/nedimhadzic/venv-metal/lib/python3.11/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so

tensorflow: 2.15.0
tensorflow-metal: 1.0.0
macos: 14.2.1
Intel CPU and AMD Radeon Pro 5500M

Any idea? Regards, Nedim
Posted by nedo99.
Post not yet marked as solved
0 Replies
322 Views
When the input dimension is 600w, the operator runs on the ANE. But when the input shape is 100w or 200w, the operator can only run on the CPU. The data dimension has decreased, yet it does not run on the ANE. What is the reason for this, and are there ways to avoid it?
Posted by zhouzheng.
Post not yet marked as solved
1 Replies
466 Views
Hello, I'm trying to train an MLImageClassifier on a dataset in Swift using the function MLImageClassifier.train. The dataset size makes no difference (I have the same problem with a smaller one): when training reaches a completedUnitCount of 9 out of 10, a soft lock seems to occur, even though CPU usage stays high, and the model never reaches completion (or an error). The dataset consists of JPEG images, and training with the Create ML app shows no problem. Is there any known issue with the Create ML training APIs around part 9 of the process? Is there any information about this part of the training job? Thank you
Post not yet marked as solved
1 Replies
540 Views
I'm trying to create an updatable model, but this seems possible only by creating a neural network model from scratch and then calling the make_updatable method through the NeuralNetworkBuilder. I ran into a lot of problems along the way. In this example I try to open a converted ML model (neural network) using the NeuralNetworkBuilder:

import coremltools

model = coremltools.models.MLModel("SimpleImageClassifier.mlpackage")
spec = model.get_spec()
builder = coremltools.models.neural_network.NeuralNetworkBuilder(spec=spec)
builder.inspect_layers()

But I get this error on the line that creates the builder instance:

AttributeError: 'NoneType' object has no attribute 'layers'

I also tried to define a neural network using the NeuralNetworkBuilder, but then what do I have to do with this object? I didn't find a way to save it or convert it. The result I want is simple: the ability to train the model further on the user's device to meet their needs. However, the way to obtain an updatable model seems incomprehensible. In my case, the model should be an image classifier. What approach should I follow to achieve this result? Thank you
Post not yet marked as solved
2 Replies
513 Views
In Instruments' CPU profiling tool I've noticed that a significant portion (22.5%) of the CPU-side overhead related to MPS matrix multiplication (GEMM) is in a call to getenv(). Please see attached screenshot. It seems unnecessary to perform this same check over and over, as whatever hack needs it should be able to call getenv() only once and cache the result for future use.
Posted by jacobgorm.
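Editor's sketch: the "look it up once and cache the result" pattern suggested in the post above, written in Python purely as an illustration. MPS is not Python code and its internals are not shown in the post, so the variable name HYPOTHETICAL_GEMM_DEBUG and the functions env_flag and gemm_dispatch are made-up stand-ins.

```python
import os
from functools import lru_cache


@lru_cache(maxsize=None)
def env_flag(name: str) -> bool:
    """Read and parse an environment variable once, then reuse the result."""
    return os.environ.get(name, "0") not in ("", "0", "false", "False")


def gemm_dispatch(call_count: int) -> int:
    """Stand-in for a hot path that consults the flag on every call."""
    enabled_calls = 0
    for _ in range(call_count):
        # Only the first call reaches os.environ; later calls hit the cache.
        if env_flag("HYPOTHETICAL_GEMM_DEBUG"):
            enabled_calls += 1
    return enabled_calls


os.environ["HYPOTHETICAL_GEMM_DEBUG"] = "1"
env_flag.cache_clear()  # start from an empty cache for the demonstration
enabled_calls = gemm_dispatch(1000)
info = env_flag.cache_info()
print(info.misses, "environment lookup(s) for", info.hits + info.misses, "calls")
```

The trade-off is that a change to the variable after the first lookup is no longer seen, which is usually acceptable for a flag that is meant to be set before launch.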
Post not yet marked as solved
1 Replies
395 Views
I've been running TensorFlow with Python 3.9 to train a CNN model, with the process accelerated by the GPU. After 80 epochs the process went to sleep (status S) and its GPU usage dropped to 0 percent. I am wondering whether this training process crashed the GPU, or whether the OS is forcing the process to sleep because it takes up too much GPU time. Thanks a lot!
Posted by chaoyi240.
Post not yet marked as solved
0 Replies
444 Views
Hi everyone, I'm wondering if you know how the device decides which compute unit (GPU, CPU, or ANE) to use when compute units are set to ALL. I'm working on optimizing a GPT2 model to run on the ANE. I ran the performance report for the existing model, and it showed operators not supported by the ANE. I then removed these operators and converted the model to Core ML again. This time the performance report showed that every operator is supported by the ANE, but the device still prefers the GPU when the compute units are set to ALL and prefers the CPU when the compute units are set to CPU and ANE. Does anyone know why? Thank you in advance!
Posted by dcdcdc123.
Post not yet marked as solved
1 Replies
1.7k Views
The project is based on Python 3.8 and 3.9 and contains some C and C++ source. How can I do parallel computing on the CPU and GPU of an M1 Max? I bought the M1 Max Mac for its strong GPU to do quantitative finance, where speed is extremely important. Unfortunately, CUDA is not compatible with the Mac. Please show me how to do it, thanks.

1. Can Accelerate (for the CPU) and Metal (for the GPU) speed up any source by building like this?
Step 1: Download the source from GitHub.
Step 2: Create a file named "site.cfg" in the source directory and add this content:
[accelerate]
libraries = Metal, Accelerate, vecLib
Step 3: In Terminal: NPY_LAPACK_ORDER=accelerate python3 setup.py build
Step 4: pip3 install . or python3 setup.py install? (I am not sure which method to apply.)

2. How compatible is this method? I need to speed up numpy, pandas, and even an open source project such as https://github.com/microsoft/qlib

3. Just show me the code.

4. When compiling the C++ and C source, a lot of errors were reported. Which gcc and g++ should I choose? The default gcc installed by brew is 4.2.1, which does not work. I even tried downloading gcc from the official ARM website, and it still does not work. Please give me a hint. Thanks so much, this is urgent.
Posted by jefftang.
Post not yet marked as solved
1 Replies
515 Views
Hi folks, I'm working on converting a GPT2 model to Core ML with KV caching enabled. I have a GPT2 model running on the GPU with a static input shape. It seems that once I enable flexible shapes (i.e. either a range shape or enumerated shapes), the model runs on the CPU according to the performance report. I can see new operators being added (get_shape and general_slice), and they are not supported by the GPU / ANE. Is there any way around this to get the model running on the GPU / ANE? How does the machine decide whether to run the model on the GPU or the Neural Engine? Thanks!
Posted by dcdcdc123.
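Editor's sketch: one commonly used alternative to flexible shapes is to keep the converted model at a static shape (or a few separately converted static shapes) and pad inputs on the host side. The pure-Python helper below shows only the padding step. The bucket lengths, the pad token id, and the name pad_to_bucket are hypothetical, and whether this keeps the GPT2 model in the post on the GPU / ANE, or how it interacts with the KV cache, is not confirmed by anything in the thread.

```python
from bisect import bisect_left

ENUMERATED_LENGTHS = [32, 64, 128]  # hypothetical static sequence lengths, sorted
PAD_TOKEN_ID = 0                    # hypothetical padding token id


def pad_to_bucket(token_ids, lengths=ENUMERATED_LENGTHS, pad_id=PAD_TOKEN_ID):
    """Right-pad token_ids to the smallest allowed length that fits them.

    Returns (padded_ids, attention_mask); the mask holds 1 for real tokens
    and 0 for padding so that padded positions can be ignored downstream.
    """
    index = bisect_left(lengths, len(token_ids))
    if index == len(lengths):
        raise ValueError(
            f"sequence of length {len(token_ids)} exceeds the largest bucket {lengths[-1]}"
        )
    padding = lengths[index] - len(token_ids)
    padded_ids = list(token_ids) + [pad_id] * padding
    attention_mask = [1] * len(token_ids) + [0] * padding
    return padded_ids, attention_mask


tokens = list(range(1, 41))  # a 40-token example prompt
padded, mask = pad_to_bucket(tokens)
print(len(padded), sum(mask))
```

Each bucket length would then correspond to one static input shape, at the cost of some wasted computation on the padded positions.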