Update GetRangeFromAssertions to handle some basic TYP_LONG scenarios where it FitsIn<int32_t> #128906

Open
tannergooding wants to merge 3 commits into dotnet:main from tannergooding:better-rngchk2

Conversation

tannergooding (Member) commented Jun 2, 2026

This is an alternative to #128676. It needs confirmation that the diffs/TP is acceptable and may require a few iterations or pulling back prior to it being ready for review.

Copilot AI review requested due to automatic review settings June 2, 2026 16:45
github-actions (bot) added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Jun 2, 2026
dotnet-policy-service (Contributor) commented

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copilot AI (Contributor) left a comment

Pull request overview

This PR broadens assertion-based range derivation in the JIT so it can reason about some TYP_LONG value numbers when the resulting values are known to fit in int32, and wires the updated API through rangecheck and assertion propagation to enable additional folding / bounds-check reasoning.

Changes:

  • Update ValueNumStore::IsVNIntegralConstant to coerce constants as int64_t, allowing TYP_LONG constants that fit to be recognized as int32 constants.
  • Extend RangeCheck::GetRangeFromAssertions/worker to accept an explicit var_types and add limited handling for TYP_LONG scenarios (notably RSZ/RSH shift cases and other VN ops).
  • Update assertion propagation and range analysis callsites to pass the expression type and tolerate unknown ranges where TYP_LONG can’t be represented as an int32-based Range.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| src/coreclr/jit/valuenum.h | Enables integral-constant extraction from TYP_LONG VNs via int64_t coercion. |
| src/coreclr/jit/rangecheck.h | Updates GetRangeFromAssertions signature and adds Range::IsUnknown() helper. |
| src/coreclr/jit/rangecheck.cpp | Implements the new typed assertion-range logic and extends range computation to consult it in more cases. |
| src/coreclr/jit/assertionprop.cpp | Adapts assertion-prop folding to the new API and to possibly-unknown ranges for wider types. |

Comment thread src/coreclr/jit/rangecheck.cpp
Copilot AI review requested due to automatic review settings June 2, 2026 19:19
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

src/coreclr/jit/rangecheck.cpp:796

  • In VNF_Cast handling, when result is non-constant (e.g., casting to/from types that Range can’t represent) the code unconditionally propagates castOpRange if it’s a constant range. This is unsound for sign-changing casts (e.g., int -> uint, uint -> long) when the operand range can include negatives: the cast changes negative values to large positives, but castOpRange would still contain negatives and exclude the large values.

This can cause incorrect tightening and downstream folding/removal based on a range that doesn’t describe the cast result.

                // Now see if we can do better by looking at the cast source.
                // if its range is within the castTo range, we can use that (and the cast is basically a no-op).
                if (varTypeIsIntegral(arg0Typ))
                {
                    Range castOpRange =
                        GetRangeFromAssertionsWorker(comp, arg0Typ, arg0VN, assertions, --budget, visited);

                    if (castOpRange.IsConstantRange())
                    {
                        if (!result.IsConstantRange())
                        {
                            result = castOpRange;
                        }
                        else if ((castOpRange.LowerLimit().GetConstant() >= result.LowerLimit().GetConstant()) &&
                                 (castOpRange.UpperLimit().GetConstant() <= result.UpperLimit().GetConstant()))
                        {
                            result = castOpRange;
                        }
                    }
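
The sign-changing cast concern can be illustrated with a small standalone sketch. It is not the JIT's code: `IntRange`, `CastToUInt`, and `RangeDescribesCastToUInt` are hypothetical names, and the check only models an int -> uint cast.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical closed interval of int32 values; not the JIT's Range type.
struct IntRange
{
    int32_t lo;
    int32_t hi;
};

// Models an int -> uint cast: the bits are kept, so negatives become large positives.
constexpr uint32_t CastToUInt(int32_t value)
{
    return static_cast<uint32_t>(value);
}

// The operand range still describes the cast result only if it holds no negative values.
constexpr bool RangeDescribesCastToUInt(IntRange operand)
{
    return operand.lo >= 0;
}

// -1 becomes 4294967295, which lies far outside an operand range such as [-5, 10].
static_assert(CastToUInt(-1) == 4294967295u, "negative input wraps to a large value");
static_assert(!RangeDescribesCastToUInt(IntRange{-5, 10}), "range with negatives must not be reused");
static_assert(RangeDescribesCastToUInt(IntRange{0, 10}), "non-negative range survives the cast");
```

Under this model, reusing the operand range is only safe when its lower bound is already known to be non-negative.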

Comment thread src/coreclr/jit/rangecheck.h
tannergooding (Member, Author) commented Jun 2, 2026

Extracted two parts of this into #128922 and #128923 to get better TP and diff metrics and decide how much to preserve. Most of this change requires things working together to fully light up; otherwise we only get relatively small diffs for any single portion.

tannergooding added a commit that referenced this pull request Jun 3, 2026
…ent/symbolic cases (#128922)

This is a smaller change from #128906 that doesn't involve more complex handling around `TYP_LONG`
Copilot AI review requested due to automatic review settings June 4, 2026 00:34
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment thread src/coreclr/jit/rangecheck.cpp
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment thread src/coreclr/jit/rangecheck.cpp
tannergooding (Member, Author) commented

/azp run fuzzlyn, runtime-coreclr jitstress, runtime-coreclr jitstressregs

azure-pipelines commented

Azure Pipelines successfully started running 3 pipeline(s).

tannergooding (Member, Author) commented

Diffs are here.

| Platform | Overall | FullOpts |
|---|---|---|
| Linux Arm64 | -476,080 bytes | -476,080 bytes |
| Linux x64 | -432,892 bytes | -432,892 bytes |
| Windows Arm64 | -393,596 bytes | -393,596 bytes |
| Windows x64 | -260,911 bytes | -260,911 bytes |
| Linux arm | -83,344 bytes | -83,344 bytes |
| Windows x86 | -46,131 bytes | -46,131 bytes |

| Platform | Overall | FullOpts |
|---|---|---|
| Linux x64 | +0.05% to +0.18% | +0.05% to +0.20% |
| Windows arm64 | +0.08% to +0.25% | +0.08% to +0.26% |
| Windows x64 | +0.07% to +0.25% | +0.07% to +0.25% |

tannergooding (Member, Author) commented

Overall, the diffs are the standard set you'd expect. We have places that change from sign-extension to zero-extension because we know the value is never negative, and we have removal of code that is now provably dead, unreachable, or unnecessary.

This lights up for places where we're explicitly using long or nint on 64-bit platforms, including places like TensorPrimitives, BigInteger, and the various SpanHelpers where we extend the length up to nint to do the rest of the algorithm.
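
As a rough illustration of why a known non-negative range allows swapping sign-extension for zero-extension, here is a minimal standalone sketch. `SignExtend` and `ZeroExtend` are hypothetical helper names, not JIT functions.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extension: widen int32 to int64 while keeping the numeric value.
constexpr int64_t SignExtend(int32_t value)
{
    return static_cast<int64_t>(value);
}

// Zero-extension: treat the low 32 bits as unsigned, then widen.
constexpr int64_t ZeroExtend(int32_t value)
{
    return static_cast<int64_t>(static_cast<uint32_t>(value));
}

// For a non-negative input (such as a span length) both extensions agree.
static_assert(SignExtend(1234) == ZeroExtend(1234), "same result when non-negative");
// For a negative input they differ, so the swap needs the range proof.
static_assert(SignExtend(-1) != ZeroExtend(-1), "results differ for a negative input");
static_assert(ZeroExtend(-1) == 4294967295LL, "zero-extension of -1 is 2^32 - 1");
```

The two functions only agree on non-negative inputs, which is why the range information is needed before the extension kind can change.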

}
else if ((elementCount == 32) && varTypeIsLong(rangeType))
{
    return {SymbolicIntegerValue::Zero, UpperBoundForType(TYP_UINT)};

Member

why not just rangeType = TYP_UINT like above?

Member Author

Because LowerBoundForType doesn't handle TYP_UINT, only UpperBoundForType does, and so ForType would hit an unreached.

I believe this is intentional and to avoid bugs since we shouldn't normally encounter TYP_UINT for anything except rare special scenarios like this

*isKnownNonNegative = true;
}
if ((rng.LowerLimit().GetConstant() > 0) || (rng.UpperLimit().GetConstant() < 0))
Range rng = RangeCheck::GetRangeFromAssertions(this, tree->TypeGet(), treeVN, assertions);

Member

I really don't like the fact we need to pass type. It should be evaluated from VN, shouldn't it?

Member Author

The issue is the initial if ((num == ValueNumStore::NoVN) || (budget <= 0)) check in GetRangeFromAssertions.

That is, it has to produce a range even if no VN exists. This previously relied on the fact that we would only ever have TYP_INT, but now we can have TYP_LONG as well.

We'd have to have no VN produce keUnknown and for the callers to assume a constant range based on the type in that case instead, which is additional churn either way. I don't have a strong preference here on either approach, and went with passing the type through to avoid regressing the status quo.
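
The trade-off can be sketched in isolation. The following is a hypothetical model, not the JIT's implementation: `SketchType`, `SketchRange`, and `FallbackRangeForType` are invented names that stand in for `var_types`, `Range`, and the no-VN path discussed above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the JIT's var_types and Range; assumptions for illustration.
enum class SketchType
{
    Int,
    Long
};

struct SketchRange
{
    bool    isConstant; // false models an unknown (keUnknown-like) range
    int32_t lo;
    int32_t hi;
};

// Fallback used when no value number is available: the type alone decides the answer.
constexpr SketchRange FallbackRangeForType(SketchType type)
{
    return (type == SketchType::Int)
               ? SketchRange{true, INT32_MIN, INT32_MAX} // the full int32 range is representable
               : SketchRange{false, 0, 0};               // int64 bounds do not fit int32 limits
}

static_assert(FallbackRangeForType(SketchType::Int).isConstant, "int keeps a constant range");
static_assert(!FallbackRangeForType(SketchType::Long).isConstant, "long falls back to unknown");
```

In this model the type argument is what lets the int case keep its full constant range while the long case gives up.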

EgorBo (Member) commented Jun 4, 2026

so can we return just keUnknown if no VN? since this function already may return keUnknown

Member Author

We could, but then that will regress TYP_INT scenarios with no VN where we previously would've gotten an appropriate [INT32_MIN, INT32_MAX] constant range.

That might be fine, since most things should have VNs, but it might also not be since we lose VN info in various places or may not have it for new nodes.

I could restrict it to just GetRangeFromAssertions, have it take the tree, extract the VN, and then call GetRangeFromAssertionsWorker which doesn't further propagate the type since it must be from a VN at that point, but I think that's the "best" we can do since we need a range for two possible types now.

if ((rng.LowerLimit().GetConstant() > 0) || (rng.UpperLimit().GetConstant() < 0))
Range rng = RangeCheck::GetRangeFromAssertions(this, tree->TypeGet(), treeVN, assertions);

if (rng.IsConstantRange())

Member

It's unfortunate we broke the contract for GetRangeFromAssertions to always return a constant range. In my opinion we shouldn't do it and should instead either upgrade Range to TYP_LONG or introduce a new GetRangeFromAssertions for 64-bit ranges.

Member Author

> or introduce a new GetRangeFromAssertions for 64-bit ranges

This is a lot of unnecessary code duplication while keeping the same overall handling regardless. That is, it doesn't change what assertionprop has to handle here; rather, it just forces it into two paths, one calling GetRangeFromAssertions32 and one calling GetRangeFromAssertions64, while still handling the fact that one path may not produce a constant range. It saves nothing and just makes things more complex.

With this approach, we have a single method handling both and the caller just has to handle the fact it might not be constant, rather than having to own dispatch to the right method and still handle that nuance anyways.

> instead either upgrade Range to TYP_LONG

This is the eventual goal, but I'm trying to get this done incrementally and in a way that is easier to test, review, and handle.

Fully handling TYP_LONG is quite a bit more complex than only extending it to handle places where FitsIn<int32_t> remains "trivially true".

Namely, it involves extending Range/Limit to track int64_t constants, having all RangeOperations support int64_t, adding range ops that handle the 32-bit vs. 64-bit limits, ensuring we understand that ADD(int, x, y) and ADD(long, x, y) have different overflow limits to check and track, and having the various handlers consider those nuances as well.
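
The point about differing overflow limits can be made concrete with a minimal sketch. `AddOverflowsInt32` and `AddOverflowsInt64` are hypothetical helpers written for this illustration; they are not part of the JIT.

```cpp
#include <cassert>
#include <cstdint>

// True if x + y does not fit in int32, i.e. an ADD(int, x, y) would overflow.
constexpr bool AddOverflowsInt32(int32_t x, int32_t y)
{
    // Computed in 64 bits, where the sum of two int32 values can never overflow.
    return (static_cast<int64_t>(x) + y > INT32_MAX) || (static_cast<int64_t>(x) + y < INT32_MIN);
}

// True if x + y does not fit in int64, i.e. an ADD(long, x, y) would overflow.
constexpr bool AddOverflowsInt64(int64_t x, int64_t y)
{
    // Compare against the remaining headroom instead of forming the sum.
    return (y > 0) ? (x > INT64_MAX - y) : (x < INT64_MIN - y);
}

// The same operands overflow as int but not as long.
static_assert(AddOverflowsInt32(INT32_MAX, 1), "int add overflows at INT32_MAX + 1");
static_assert(!AddOverflowsInt64(INT32_MAX, 1), "long add has room for INT32_MAX + 1");
static_assert(AddOverflowsInt64(INT64_MAX, 1), "long add overflows at INT64_MAX + 1");
```

A 64-bit Range would need both checks and a way to pick the right one per node type, which is part of the extra complexity described above.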

}
else
{
// TODO: We could return `0, keUnknown` for `elementCount == 32` if the result is TYP_LONG

Member

I still don't understand what 0, keUnknown is supposed to mean, keUnknown implies it can be something that overflows making the range invalid

Member Author

The point here is more conceptual and using it as a sentinel because we don't just use it for overflow, but just generally as a "limit cannot be determined/represented". It could well just be something like say 0, keMaxValue instead.

The general point, however, is that while Range is limited to int32_t, it may still be beneficial to propagate up that we know the lower bound and simply cannot represent the upper bound, thus a given TYP_LONG is known to be "never negative" and all the usual "known unsigned value" optimizations can kick in.
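
A half-known range of this kind can be modeled with a small sketch. The types below are hypothetical (`SketchLimit`, `SketchRange`, `IsKnownNonNegative`) and only approximate the idea of a known lower limit paired with an unrepresentable upper limit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical limit: a known int32 constant, or "cannot be determined/represented".
struct SketchLimit
{
    bool    known;
    int32_t value;
};

struct SketchRange
{
    SketchLimit lower;
    SketchLimit upper;
};

// A value is provably non-negative once its lower limit is a known constant >= 0,
// even when the upper limit is unknown.
constexpr bool IsKnownNonNegative(SketchRange range)
{
    return range.lower.known && (range.lower.value >= 0);
}

// Models the "0, unknown" case for a TYP_LONG whose upper bound exceeds int32.
static_assert(IsKnownNonNegative(SketchRange{{true, 0}, {false, 0}}), "lower bound alone suffices");
static_assert(!IsKnownNonNegative(SketchRange{{false, 0}, {true, 10}}), "unknown lower bound proves nothing");
```

In this model the upper limit never participates in the non-negativity check, which is the property the "0, keUnknown" idea relies on.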

Member

Well, this problem won't exist if we make Range 64bit (or Range)

Member Author

Right, which is an eventual goal. I'm even going to try to do it after this PR lands; it's just a much more involved change and potentially reaches into many other places in the JIT where we're limiting checks to just genActualType() == TYP_INT.

Comment thread src/coreclr/jit/rangecheck.h Outdated
if (varTypeIsGC(vnType))
{
#if TARGET_64BIT
return Limit(Limit::keUnknown);

Member

I suspect it's fine to always give up on gc types unconditionally

Member Author

We are always giving up unconditionally? The difference is just preserving the status quo where it returned a constant range for GC types on 32-bit.

Member

I don't think it's useful, just some weird scenario where a TYP_INT PHI had a byref PHI_ARG on 32bit, I doubt we can ever deduce anything from that anyway, and it's 32bit

Member Author

Right, but the same consideration will exist when we eventually extend this to full 64-bit. So we basically want to make a decision of "return the full range in that scenario, if we can" or "always return keUnknown". I deferred to maintaining the status quo since that's the less risky change.

Comment thread src/coreclr/jit/assertionprop.cpp Outdated
Comment thread src/coreclr/jit/rangecheck.cpp Outdated
Copilot AI review requested due to automatic review settings June 4, 2026 12:34
EgorBo (Member) commented Jun 4, 2026

I am uncomfortable with the assumptions here: if op1 and op2 are TYP_LONG but we found out that their ranges fit into TYP_INT, what guarantees that none of these operations do something like "we have two single constants, so even if (operation) on them overflows it's fine to return it, since Ranges are TYP_INT only"?

Same for unary operators
Same concerns regarding unsigned comparisons

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment on lines +756 to +761
// We're going from a small type to a large type
// and so regardless of whether we zero or sign-extend
// the value is preserved within the confines of its
// original input for the destination, i.e. it always
// passes the FitsIn<fromType> check.
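
The claim in the quoted code comment can be checked with a standalone sketch. `FitsInSketch` is a hypothetical re-implementation written for this illustration (the JIT has its own FitsIn helper), and it only supports target types whose limits fit in int64_t.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: does a 64-bit value lie within the limits of type To?
// Assumes To's limits are representable as int64_t (so not uint64_t).
template <typename To>
constexpr bool FitsInSketch(int64_t value)
{
    return (value >= static_cast<int64_t>(std::numeric_limits<To>::min())) &&
           (value <= static_cast<int64_t>(std::numeric_limits<To>::max()));
}

// Sign-extending a small signed value to 64 bits keeps it within the source type.
static_assert(FitsInSketch<int16_t>(static_cast<int64_t>(int16_t{-123})), "sign-extend preserves value");
// Zero-extending a small unsigned value to 64 bits keeps it within the source type.
static_assert(FitsInSketch<uint16_t>(static_cast<int64_t>(uint16_t{65535})), "zero-extend preserves value");
// A value outside the source type's limits is rejected.
static_assert(!FitsInSketch<int16_t>(40000), "40000 does not fit in int16_t");
```

Both widening directions leave the value inside the limits of the type it started from, matching the comment's reasoning.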

Comment on lines +5543 to +5547
GenTreeBoundsChk* arrBndsChk = tree->AsBoundsChk();
GenTree* arrBndsChkIdx = arrBndsChk->GetIndex();
GenTree* arrBndsChkLen = arrBndsChk->GetArrayLength();
ValueNum vnCurIdx = vnStore->VNConservativeNormalValue(arrBndsChk->GetIndex()->gtVNPair);
ValueNum vnCurLen = vnStore->VNConservativeNormalValue(arrBndsChk->GetArrayLength()->gtVNPair);
tannergooding (Member, Author) commented

> I am uncomfortable with the assumptions here: if op1 and op2 are TYP_LONG but we found out that their ranges fit into TYP_INT, what guarantees that none of these operations do something like "we have two single constants, so even if (operation) on them overflows it's fine to return it, since Ranges are TYP_INT only"?

All of the APIs that can overflow (ADD, MUL) currently return keUnknown on any overflow, so say we get ADD(long, [INT32_MAX], [1]), then we try RangeOps::Add, find out that it overflows, and return keUnknown.

But then for RSH/RSZ we're explicitly handling this anyway, since the underlying RangeOps::ShiftRight operation doesn't assume overflow is possible. LSH does assume overflow is possible (it even calls Multiply), but we handle it anyway since it has checks assuming the shiftAmount is for TYP_INT and it's better to be safer here.

AND, OR are bitwise and so can never introduce new bits.

UMOD is always an identity operation or reduction, it also cannot produce new bits.
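
The behaviors described in this reply can be sketched with simplified stand-ins. None of this is the JIT's RangeOps code: `SketchResult`, `AddOrUnknown`, `AndStaysWithinOperands`, and `UModStaysWithinDividend` are hypothetical names that operate on single constants rather than ranges.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result of arithmetic on single constants; not the JIT's RangeOps.
struct SketchResult
{
    bool    known; // false models keUnknown
    int32_t value;
};

// Add two int32-representable constants, giving up if the sum leaves int32.
constexpr SketchResult AddOrUnknown(int32_t x, int32_t y)
{
    return ((static_cast<int64_t>(x) + y > INT32_MAX) || (static_cast<int64_t>(x) + y < INT32_MIN))
               ? SketchResult{false, 0}
               : SketchResult{true, static_cast<int32_t>(x + y)};
}

// Bitwise AND can only clear bits, so the result never exceeds either operand.
constexpr bool AndStaysWithinOperands(uint32_t x, uint32_t y)
{
    return ((x & y) <= x) && ((x & y) <= y);
}

// Unsigned remainder never exceeds the dividend (divisor assumed non-zero).
constexpr bool UModStaysWithinDividend(uint32_t x, uint32_t y)
{
    return (x % y) <= x;
}

// Mirrors the ADD(long, [INT32_MAX], [1]) example: the sum is reported as unknown.
static_assert(!AddOrUnknown(INT32_MAX, 1).known, "overflowing add yields unknown");
static_assert(AddOrUnknown(40, 2).value == 42, "in-range add yields the sum");
static_assert(AndStaysWithinOperands(0x7FFFFFFFu, 0x0000FFFFu), "AND stays within operands");
static_assert(UModStaysWithinDividend(100u, 7u), "UMOD stays within the dividend");
```

The sketch only covers constants; the real code reasons about ranges, but the same "give up on overflow" and "cannot grow" properties are what the reply appeals to.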
