
Add coreml_compute_plan.py: report which CoreML ops dispatch to ANE / GPU / CPU#19252

Open
john-rocky wants to merge 2 commits into
pytorch:mainfrom
john-rocky:coreml/compute-plan-analyzer

Conversation

@john-rocky
Contributor

@john-rocky john-rocky commented May 1, 2026

Summary

CoreML decides at compile/load time which device each MIL operation will
execute on, and coremltools 9.0+ exposes that decision through MLComputePlan.
The recurring question on the issue tracker is "why isn't my model
running fully on the ANE?" — for example pytorch#4091, pytorch#11541, and pytorch#8439.

Today the only way for an ExecuTorch user to answer it is to break out
Swift / Xcode. This PR adds a Python wrapper around MLComputePlan so
the answer is one shell command:

```
$ python coreml_compute_plan.py --model_path my_model.mlpackage \
      --compute_units cpu_and_ne --show_non_ane

=== my_model.mlpackage ===
  ANE:   412 / 480 ( 85.8%)
  CPU:    68 / 480 ( 14.2%)

  Non-ANE op types:
       32  ios17.cast
       18  ios17.gather
       12  ios17.reshape
        6  ios17.constexpr_blockwise_shift_scale
```
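The breakdown above is straightforward to aggregate once per-op rows are in hand. Here is a minimal pure-Python sketch of that aggregation, assuming rows shaped as `(op_type, device)` tuples — the tuple shape and the helper names are illustrative, not the script's actual data model:

```python
# Sketch: aggregate per-op dispatch rows into a device breakdown plus a
# histogram of op types that did not land on the ANE.
from collections import Counter


def summarize(rows):
    """Return (per-device counts, non-ANE op-type histogram)."""
    devices = Counter(device for _, device in rows)
    non_ane = Counter(op for op, device in rows if device != "ANE")
    return devices, non_ane


def print_report(name, rows):
    devices, non_ane = summarize(rows)
    total = len(rows)
    print(f"=== {name} ===")
    for device, n in devices.most_common():
        print(f"  {device + ':':5s}{n:5d} / {total} ({100.0 * n / total:5.1f}%)")
    if non_ane:
        print("\n  Non-ANE op types:")
        for op, n in non_ane.most_common():
            print(f"    {n:6d}  {op}")
```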

Inputs supported:

| Input | Behavior |
| --- | --- |
| `.pte` | Extract every Core ML partition into a tempdir, then analyze each. |
| `.mlpackage` | Compile to `.mlmodelc` in a tempdir, then analyze. |
| `.mlmodelc` | Analyze directly. |

The PTE path reuses the same JSON/named-data extraction logic that
extract_coreml_models.py uses, and is inlined into the script so it can
be run against a plain CoreML model without depending on the executorch
package.

Test plan

Added test_coreml_compute_plan.py covering:

  • _device_name(...) for None and a stub MLNeuralEngineComputeDevice.
  • _COMPUTE_UNIT_CHOICES mapping (cpu_and_ne / all).
  • analyze_one(...) end-to-end on a tiny relu(x @ x.T) + x.sum()
    mlpackage built with coremltools.convert(...): returns rows for
    every dispatched op, with a main function and the expected MIL op
    types (matmul, relu, add, reduce_sum).
```
$ python -m pytest examples/apple/coreml/scripts/test_coreml_compute_plan.py -v
============================== 7 passed in 3.68s ===============================
```

I also ran the script against a few hand-built .mlpackage and
.mlmodelc files on macOS 26 with coremltools 9.0 and verified the
output matches what MLComputePlan returns directly.

Authored with Claude.

cc @kimishpatel @YifanShenSZ @cymbalrush @metascroy

CoreML decides at compile/load time which device each MIL operation
will execute on; that decision is exposed through MLComputePlan in
coremltools 9.0+.  This script wraps it so users can answer 'why
isn't my model running on the ANE?' without writing Swift, which is
the recurring question behind issues like pytorch#4091, pytorch#11541, and pytorch#8439.

Inputs supported:
  * .pte         — extracts every Core ML partition first.
  * .mlpackage   — compiles to .mlmodelc in a tempdir.
  * .mlmodelc    — analyzed directly.

Reports per-op dispatch (ANE / GPU / CPU), an aggregate breakdown,
and optionally the op types that did not get assigned to the ANE
(--show_non_ane).

Authored with Claude.
@john-rocky john-rocky requested a review from metascroy as a code owner May 1, 2026 05:53
@pytorch-bot

pytorch-bot Bot commented May 1, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19252

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 1, 2026
@john-rocky
Contributor Author

@pytorchbot label "release notes: apple"

@pytorch-bot pytorch-bot Bot added the release notes: apple Changes to the Apple backend delegate label May 2, 2026
@nil-is-all nil-is-all added the module: coreml Issues related to Apple's Core ML delegation and code under backends/apple/coreml/ label May 4, 2026
@kimishpatel
Contributor

I thought what op runs where is decided at compile time. Is this being extracted from AOT compile or just from lowering?

@john-rocky
Contributor Author

@kimishpatel Compile time, not runtime — MLComputePlan is a static analysis hook that coremltools 9.0+ exposes around the same dispatch decision the CoreML runtime would make. It loads from a compiled .mlmodelc (the script compiles a .mlpackage to a tempdir if needed) and reports preferred_compute_device per MIL operation without actually running predictions.

Quoting coremltools' own docs:

Represents the plan for executing a model. The application can use the plan to estimate the necessary cost and resources of the model before running the predictions.

So no AOT extraction or runtime profiling involved; just a wrapper that walks MLModelStructureProgram and calls get_compute_device_usage_for_mlprogram_operation per op. That's why this works against any of .pte, .mlpackage, or .mlmodelc — they all reduce to a compiled program the framework can plan against.
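That walk can be sketched in a few lines. The following is a hedged illustration of the coremltools 9.0 `compute_plan` API named above (`MLComputePlan.load_from_path`, `get_compute_device_usage_for_mlprogram_operation`); it only runs on macOS, attribute names such as `model_structure.program.functions` should be checked against your coremltools version, and `device_name` is a simplified stand-in for the script's `_device_name`, classifying devices by class name:

```python
# Sketch: load a compiled .mlmodelc, walk the MIL program's operations, and
# ask the compute plan for each op's preferred device. macOS-only; the
# coremltools import is deferred so the pure helper stays importable anywhere.


def device_name(device) -> str:
    # Map a compute-device object to a short label via its class name
    # (e.g. MLNeuralEngineComputeDevice -> "ANE").
    if device is None:
        return "Unknown"
    cls = type(device).__name__
    if "NeuralEngine" in cls:
        return "ANE"
    if "GPU" in cls:
        return "GPU"
    if "CPU" in cls:
        return "CPU"
    return cls


def walk_plan(mlmodelc_path: str):
    import coremltools as ct  # deferred: compute_plan is macOS-only

    plan = ct.models.compute_plan.MLComputePlan.load_from_path(mlmodelc_path)
    rows = []
    for fn_name, fn in plan.model_structure.program.functions.items():
        for op in fn.block.operations:
            usage = plan.get_compute_device_usage_for_mlprogram_operation(op)
            preferred = usage.preferred_compute_device if usage else None
            rows.append((fn_name, op.operator_name, device_name(preferred)))
    return rows
```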

```python
)


def _extract_models_from_pte(pte_path: str, out_dir: str) -> List[str]:
```
Contributor


Can we reuse utilties in the extract_coreml_model script in the same folder?

Contributor Author


Done — _extract_models_from_pte is gone. extract_coreml_models.extract_coreml_models now takes an optional out_dir and returns the list of extracted paths, and coreml_compute_plan.py imports it via the executorch.examples.apple.coreml.scripts namespace.

@metascroy
Contributor

I thought what op runs where is decided at compile time. Is this being extracted from AOT compile or just from lowering?

I think it's being extracted from modelc compilation (so won't work on linux), see the _ensure_compile call.

```python
import coremltools as ct
import torch

sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
```
Contributor


Can we avoid this?

Contributor Author


Done — the sys.path.insert is gone. The test imports through the same namespace path (from executorch.examples.apple.coreml.scripts.coreml_compute_plan import ...).

@metascroy
Contributor

Looks good @john-rocky ! Main comment is to reuse or consolidate the existing utilities in extract_coreml_models script in the same folder.

Can you also test this works with multi-function?

Move the .pte extraction logic into extract_coreml_models.extract_coreml_models,
which now takes an optional out_dir and returns the list of extracted paths;
coreml_compute_plan.py imports and uses it instead of carrying its own copy.

Multifunction .mlpackage inputs are now analyzed function-by-function: each
function is projected as the `main` of a temp single-function copy so
MLComputePlan.load_from_path covers it (coremltools 9.0 only exposes the plan
for the default function otherwise).

test_coreml_compute_plan.py uses the executorch.examples.apple.coreml.scripts
namespace import directly instead of mutating sys.path, and adds two tests
confirming both functions of a multifunction package are surfaced.
@john-rocky
Contributor Author

Thanks for the review! Addressed both inline comments and added real multifunction support (rather than just a test that pins the limitation).

What changed (216256a)

  • _extract_models_from_pte is removed. extract_coreml_models.extract_coreml_models now takes an optional out_dir and returns the list of extracted partition paths; coreml_compute_plan.py imports it via the executorch.examples.apple.coreml.scripts namespace.
  • test_coreml_compute_plan.py drops the sys.path.insert and imports through the same namespace path.
  • analyze_one now handles multifunction .mlpackage inputs. MLComputePlan.load_from_path in coremltools 9.0 only exposes the plan for the default function — for multifunction inputs, each function is re-projected as main of a temp single-function copy via MultiFunctionDescriptor / save_multifunction, and rows are gathered back under the original function names.
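The per-function projection might look roughly like the sketch below. It assumes the `ct.utils.MultiFunctionDescriptor` / `save_multifunction` workflow named above; the `ct=` parameter is an addition of mine so the loop can be exercised without coremltools installed, and the exact argument names are assumptions to check against the coremltools docs:

```python
# Sketch: re-project each function of a multifunction .mlpackage as the sole
# "main" of a temporary single-function package, so MLComputePlan (which only
# exposes the default function in coremltools 9.0) can plan it.
import os
import tempfile


def project_functions(mlpackage_path, function_names, ct=None):
    """Yield (function_name, single-function package path) pairs."""
    if ct is None:
        import coremltools as ct  # deferred: macOS-only workflow

    with tempfile.TemporaryDirectory() as tmp:
        for name in function_names:
            desc = ct.utils.MultiFunctionDescriptor()
            # Copy one function out of the source package, renamed to "main".
            desc.add_function(
                mlpackage_path,
                src_function_name=name,
                target_function_name="main",
            )
            desc.default_function_name = "main"
            out = os.path.join(tmp, f"{name}.mlpackage")
            ct.utils.save_multifunction(desc, out)
            yield name, out
```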

Tests

```
$ python -m unittest -v executorch.examples.apple.coreml.scripts.test_coreml_compute_plan
... 9 tests, all OK ...
TestAnalyzeOneMultifunction.test_reports_every_function ............. ok
TestAnalyzeOneMultifunction.test_each_function_lowers_the_same_ops .. ok
```

The two new tests build a multifunction package (prefill + decode sharing a tiny relu/matmul/add body), run analyze_one, and assert both function names — plus their MIL ops — appear in the result.

Net diff: +154 / −122 across 3 files.
