Even more counters in Sysman are broken on Arc Alchemist #926

@ProjectPhysX

Description

Pre-submission Checklist

  • I am using the latest GPU driver version (releases)
  • I have searched for similar issues and found none

GPU Hardware

Intel Arc A750

DRI Devices Information

GPU Detailed Information (lspci output)

Driver Version

26.05.37020.3

Installed GPU Driver Packages

No response

Driver Installation Details

Followed installation instructions from here: https://github.com/intel/compute-runtime/releases/tag/26.14.37833.4

Linux Distribution

Ubuntu 24.04 LTS

Other Linux Distribution

No response

Kernel Version & Boot Parameters

kernel 6.17.0-22-generic

Actual Behavior

Issue moved from oneapi-src/level-zero#440

After plugging in my trusty Intel Arc A750 for more testing, I found many more broken Sysman counters specific to Arc Alchemist GPUs, in addition to the broken counters reported in oneapi-src/level-zero#434:

| Counter on Arc A750 | Windows | Linux |
| --- | --- | --- |
| zes_mem_bandwidth_t::maxBandwidth | ❌ wrongly reports max bandwidth in bits/s, not bytes/s | ❌ returns 0 |
| zes_mem_bandwidth_t::readCounter/writeCounter/timestamp | ✅ works | ❌ always return 0 |
| zesDeviceEnumTemperatureSensors() | ✅ works | zesDeviceEnumTemperatureSensors() function returns ZE_RESULT_SUCCESS but returns 0 zes_temp_handle_t's |
| zesTemperatureGetState() | ⚠ works, but suffers with frequent value dropouts to 0 | ❔ cannot test because no zes_temp_handle_t's |
| zes_fan_handle_t | ✅ works | ❌ broken (no fans available) |
| ZES_FREQ_DOMAIN_MEMORY | ✅ available | ❌ unavailable |
| zes_freq_state_t::actual for ZES_FREQ_DOMAIN_MEMORY | ❌ returns frequency in MT/s, not MHz (a factor 8 too large for GDDR6) | ❌ unavailable |
| zesDeviceGetCardPowerDomain() | ❌ returns ZE_RESULT_ERROR_UNSUPPORTED_FEATURE, workaround with zesDeviceEnumPowerDomains() required | ❌ returns ZE_RESULT_ERROR_UNSUPPORTED_FEATURE, workaround with zesDeviceEnumPowerDomains() required |
| zes_pci_stats_t::txCounter/rxCounter/timestamp | ❌ return 0 | ❌ return 0 |
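
Until the driver is fixed, a monitoring tool could compensate for the Windows-side unit mix-ups and temperature dropouts listed above on the application side. The following is only a minimal sketch of that idea: the helper names (`max_bandwidth_bytes_per_s`, `gddr6_memory_mhz`, `TemperatureFilter`) are hypothetical and not part of Level Zero, the factor 8 for GDDR6 is taken from the description above, and treating a reading of 0 as a dropout is an assumption that would hide a genuine 0 °C value.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical application-side helpers, NOT part of the Level Zero API.
// They only undo the unit mix-ups described for Arc Alchemist on Windows.

// maxBandwidth is reported in bits/s there; convert to the expected bytes/s.
constexpr std::uint64_t max_bandwidth_bytes_per_s(std::uint64_t reported_bits_per_s) {
    return reported_bits_per_s / 8u;
}

// The memory frequency is reported in MT/s; per the issue text this is
// a factor 8 too large for GDDR6, so divide to obtain MHz.
constexpr double gddr6_memory_mhz(double reported_mt_per_s) {
    return reported_mt_per_s / 8.0;
}

// Masks temperature dropouts to 0 by holding the last non-zero reading.
// Assumption: a reading of exactly 0 (or below) is a dropout, not a real value.
class TemperatureFilter {
public:
    // Returns the filtered value, or std::nullopt until a valid sample arrives.
    std::optional<double> update(double raw_celsius) {
        if (raw_celsius > 0.0) {
            last_valid_ = raw_celsius;
        }
        return last_valid_;
    }

private:
    std::optional<double> last_valid_;
};
```

These corrections must only be applied on the affected platform and GPU family; once the driver reports correct units, they would make the values wrong again.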

On Windows, at least, Arc Alchemist requires no administrator permissions to read the counters.

Expected Behavior

All Sysman counters should work and none of them should require sudo / administrator permissions.

Reproduction Rate

Always reproduces - 100%

Steps to Reproduce

For debugging you may use https://github.com/ProjectPhysX/hw-smi
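
When reproducing, a check along the following lines could flag the "returns 0" symptom from two successive memory bandwidth samples. This is a rough sketch, not code from hw-smi: `BandwidthSample` is a simplified stand-in for the `zes_mem_bandwidth_t` fields named above, `looks_broken` and `utilization` are hypothetical helpers, and the timestamp unit (microseconds) is an assumption that should be verified against the Level Zero specification.

```cpp
#include <cstdint>
#include <optional>

// Simplified stand-in for the zes_mem_bandwidth_t fields named in this issue.
struct BandwidthSample {
    std::uint64_t readCounter;   // bytes read so far
    std::uint64_t writeCounter;  // bytes written so far
    std::uint64_t maxBandwidth;  // expected in bytes/s
    std::uint64_t timestamp;     // assumed to be in microseconds
};

// Heuristic: true if the sample shows the "returns 0" symptom.
// Counters that are both 0 may be legitimate right after boot.
inline bool looks_broken(const BandwidthSample& s) {
    return s.maxBandwidth == 0 || s.timestamp == 0 ||
           (s.readCounter == 0 && s.writeCounter == 0);
}

// Fraction of the maximum bandwidth used between samples a and b,
// or std::nullopt if either sample is unusable or time did not advance.
inline std::optional<double> utilization(const BandwidthSample& a,
                                         const BandwidthSample& b) {
    if (looks_broken(a) || looks_broken(b) || b.timestamp <= a.timestamp) {
        return std::nullopt;
    }
    const double bytes = static_cast<double>(b.readCounter - a.readCounter) +
                         static_cast<double>(b.writeCounter - a.writeCounter);
    const double seconds = static_cast<double>(b.timestamp - a.timestamp) / 1e6;
    return bytes / (seconds * static_cast<double>(b.maxBandwidth));
}
```

With the Linux behavior described above, every sample would be all zeros, so `utilization` would return `std::nullopt` instead of dividing by zero.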

Is this a regression?

  • Yes, this is a regression - functionality that previously worked is now broken

Last Known Working Driver Version

No response

First Known Failing Driver Version

No response

API Call Logs

No response

strace Logs

No response

System Logs / dmesg Output

No response

Backtrace (if crash or hang occurred)

No response

Source Code / Reproducer

No response

Command Line / Application Details

No response

oneAPI Version (if applicable)

No response

Screenshots / Video

No response

Additional Notes

No response

Metadata

Assignees

No one assigned

    Labels

    OS: Linux. Issue specific to Linux distributions (Ubuntu, Fedora, RHEL, etc.)
    Type: Bug. General bug report, unexpected behavior or crash
