JIT: Use faster mod for uint16 values #128509

Draft
MihaZupan wants to merge 3 commits into dotnet:main from MihaZupan:uint16-mod

Conversation

@MihaZupan
Member

@MihaZupan MihaZupan commented May 22, 2026

Another attempt at #111535
Closes #111492

Change the transformation for

int Mod(char c) => c % 42;
-int Mod(char c) => (int)(c - (uint)(((ulong)((uint)c >> 1) * 818089009u) >> 34) * 42);
+int Mod(char c) => (int)(((ulong)(102261127u * c) * 42) >> 32);

    movzx    rax, di
-   mov      ecx, eax
-   shr      ecx, 1
-   imul     rcx, rcx, 0x30C30C31
-   shr      rcx, 34
-   imul     ecx, ecx, 42
-   sub      eax, ecx
+   imul     eax, eax, 0x6186187
+   imul     rax, rax, 42
+   shr      rax, 32
    ret
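
As a sanity check on the diff above, the two sequences can be compared against the built-in `%` for every 16-bit input. This is a minimal standalone sketch, not code from the PR: the function names are hypothetical, and `fastmod_u16` assumes the same pattern extends to any non-zero uint16 divisor by using `ceil(2^32 / divisor)` as the multiplier.

```cpp
#include <cassert>
#include <cstdint>

// Old sequence from the diff: c - (c / 42) * 42, with the division done as a
// multiply by 0x30C30C31 followed by a shift.
uint32_t old_mod42(uint16_t c)
{
    uint32_t quotient = static_cast<uint32_t>((static_cast<uint64_t>(c >> 1) * 818089009u) >> 34);
    return c - quotient * 42u;
}

// New sequence from the diff: the 32-bit multiply wraps on purpose, and the
// high half of the following 64-bit product is the remainder.
uint32_t new_mod42(uint16_t c)
{
    uint32_t lowbits = 102261127u * c; // 0x6186187 == ceil(2^32 / 42)
    return static_cast<uint32_t>((static_cast<uint64_t>(lowbits) * 42u) >> 32);
}

// Hypothetical generalization (an assumption, not taken from the PR):
// the same two multiplies for any non-zero uint16 divisor.
uint32_t fastmod_u16(uint16_t value, uint16_t divisor)
{
    assert(divisor != 0);
    uint32_t magic = UINT32_MAX / divisor + 1u; // ceil(2^32 / divisor); wraps to 0 when divisor == 1
    uint32_t lowbits = magic * value;           // wraps modulo 2^32
    return static_cast<uint32_t>((static_cast<uint64_t>(lowbits) * divisor) >> 32);
}

// Compares all three against the built-in % for every uint16 input.
bool sequences_agree()
{
    for (uint32_t c = 0; c <= 0xFFFF; ++c)
    {
        uint16_t v = static_cast<uint16_t>(c);
        uint32_t expected = c % 42u;
        if (old_mod42(v) != expected || new_mod42(v) != expected || fastmod_u16(v, 42) != expected)
        {
            return false;
        }
    }
    return true;
}
```

The new form only holds because the input is known to fit in 16 bits; with a 32-bit multiplier it is not a drop-in replacement for a general 32-bit `%`.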

Let's see what CI thinks.

Diffs: https://gist.github.com/MihuBot/38b0b6eeabd80de528a6e1967a7df7cb

@MihaZupan MihaZupan added this to the 11.0.0 milestone May 22, 2026
@MihaZupan MihaZupan self-assigned this May 22, 2026
Copilot AI review requested due to automatic review settings May 22, 2026 23:41
@MihaZupan MihaZupan added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 22, 2026
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@MihaZupan
Member Author

@MihuBot

Contributor

Copilot AI left a comment

Pull request overview

This PR updates CoreCLR JIT morphing/assertion propagation/lowering to enable a cheaper remainder sequence when both operands are proven to fit in uint16 and the divisor is a non-zero constant, and adds a JIT regression test that exercises relevant % const patterns (including char-based modulo).

Changes:

  • Teach morphing to avoid rewriting % const into a - (a / b) * b when lowering can apply a cheaper uint16 FastMod-style sequence (and convert MOD→UMOD when safe).
  • Add a new GTF_UMOD_UINT16_OPERANDS hint set by assertion propagation and consumed by lowering to trigger the specialized expansion.
  • Add a new JIT test under src/tests/JIT/opt/Divide/Regressions/ to cover representative modulo patterns.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/coreclr/jit/morph.cpp | Skips early MOD-to-SUB/MUL/DIV morphing (and may flip MOD→UMOD) so lowering can apply the cheaper uint16 modulo path. |
| src/coreclr/jit/lower.cpp | Implements the uint16-specialized FastMod lowering for GT_UMOD by constant divisors. |
| src/coreclr/jit/assertionprop.cpp | Improves IntegralRange reasoning and sets GTF_UMOD_UINT16_OPERANDS when uint16-range operands are proven. |
| src/coreclr/jit/gentree.h | Introduces the new GTF_UMOD_UINT16_OPERANDS flag. |
| src/coreclr/jit/gentree.cpp | Ensures tree comparison accounts for the new mod/div-related flag. |
| src/tests/JIT/opt/Divide/Regressions/Regression4_Divide.csproj | Adds the new regression test project. |
| src/tests/JIT/opt/Divide/Regressions/Regression4_Divide.cs | Adds the new regression test cases exercising modulo scenarios. |

Comment thread src/coreclr/jit/lower.cpp
Comment thread src/coreclr/jit/morph.cpp
@MihaZupan
Member Author

/azp run Fuzzlyn

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Copilot AI review requested due to automatic review settings May 23, 2026 22:04
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Comment thread src/coreclr/jit/lower.cpp
BlockRange().InsertBefore(divMod, multiplier, mul1, castUp);

// Reuse the existing constant divisor as a TYP_LONG operand for the second multiply.
divisor->BashToConst(static_cast<int64_t>(divisorValue));
The VN-based small-type refinement in IntegralRange::ForNode for
GT_LCL_VAR claimed a tight range whenever the local's conservative VN
was a CAST to a small type. That is unsound for normalize-on-load
locals: their storage only contains the small-type bits in the low byte
(upper bytes are stale), but the tight range let fgOptimizeCast drop a
required sign/zero extending load downstream, causing it to read those
stale bits.

Restrict the refinement to locals whose storage is fully normalized:
either non-small locals, or small locals with lvNormalizeOnStore set.

Repros fuzzlyn seed 12902382323863156506 (and others) where
'(short)(arg0 ^ arg2)' with sbyte arg0 began returning the unsigned
byte interpretation in release.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
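
The failure mode described in this commit message can be illustrated outside the JIT. The following is a minimal sketch under stated assumptions: `Slot` and both load functions are hypothetical stand-ins for a 32-bit storage location backing an sbyte local, not JIT data structures.

```cpp
#include <cstdint>

// Hypothetical model of a 32-bit storage slot backing an sbyte local.
// Only the low 8 bits are meaningful; the upper bits may be stale.
struct Slot
{
    uint32_t bits;
};

// Normalize-on-load: every read sign-extends from the low 8 bits,
// so stale upper bits never reach the consumer.
int32_t load_with_extension(Slot slot)
{
    int32_t low = static_cast<int32_t>(slot.bits & 0xFFu);
    return (low >= 128) ? (low - 256) : low;
}

// The same read with the extension dropped, which is only valid if the
// whole slot is already known to hold a normalized value.
int32_t load_without_extension(Slot slot)
{
    return static_cast<int32_t>(slot.bits);
}
```

With `Slot{0x000000F0u}`, the extending load yields -16 while the raw load yields 240, which matches the "unsigned byte interpretation" symptom reported for the Fuzzlyn repro.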
@MihaZupan
Member Author

/azp run Fuzzlyn

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

// extending load and read those stale bits.
if ((compiler->vnStore != nullptr) && (!varTypeIsSmall(varDsc->TypeGet()) || varDsc->lvNormalizeOnStore()))
{
ValueNum vn = compiler->vnStore->VNConservativeNormalValue(node->gtVNPair);
Member

@jakobbotsch jakobbotsch May 24, 2026

I do not think IntegralRange::ForNode should depend on VNs. VNs are a flow-sensitive concept that is only valid during specific phases. I think this is too general a utility to start using VNs.

I would be ok with introducing specific helpers that can refine integral ranges based on VNs though, and then using it explicitly from phases where VNs are known to be valid. But in the end you would probably want to use range check instead.

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Development

Successfully merging this pull request may close these issues.

Constant mod over chars can use a cheaper FastMod

3 participants