Kokkos Automatic Differentiation (Undisplaced)#32597
Conversation
1e1ea85 to
9e48e13
Compare
|
Job Documentation, step Docs: sync website on a2f3df9 wanted to post the following: View the site here This comment will be updated on new commits. |
e09c807 to
50259af
Compare
be28b83 to
c18bb1b
Compare
17b218d to
a8ac7f0
Compare
|
They can also do copy construction. It will be just a bit of waste |
b9b4ccf to
e5a0d29
Compare
| _dot = tag == problem.getVectorTagID(Moose::SOLUTION_DOT_TAG); | ||
|
|
||
| if (problem.vectorTagExists(Moose::OLD_SOLUTION_TAG)) | ||
| _old = _old || tag == problem.getVectorTagID(Moose::OLD_SOLUTION_TAG); |
| _dot = tag == problem.getVectorTagID(Moose::SOLUTION_DOT_TAG); | ||
|
|
||
| if (problem.vectorTagExists(Moose::OLD_SOLUTION_TAG)) | ||
| _old = _old || tag == problem.getVectorTagID(Moose::OLD_SOLUTION_TAG); |
|
could we maybe build some unit tests to bring down the 140 unused lines introduced in |
|
|
||
| Kokkos-MOOSE kernels also support [automatic differentiation (AD)](automatic_differentiation/index.md). | ||
| AD kernels can be derived from `Moose::Kokkos::ADKernel` and should be registered with `registerKokkosADResidualObject()`. | ||
| AD kernels requires `computeQpResidual()` to be defined with the following signature, where everything remains the same with the ordinary kernels except the return type being `Moose::Kokkos::ADReal`: |
There was a problem hiding this comment.
| AD kernels requires `computeQpResidual()` to be defined with the following signature, where everything remains the same with the ordinary kernels except the return type being `Moose::Kokkos::ADReal`: | |
| AD kernels require `computeQpResidual()` to be defined with the following signature, where everything remains the same with the ordinary kernels except the return type being `Moose::Kokkos::ADReal`: |
| AssemblyDatum & datum) const; | ||
| ``` | ||
|
|
||
| `computeQpJacobian()` and `computeQpOffDiagJacobian()` are unused, as AD automatically assembles Jacobian. |
There was a problem hiding this comment.
| `computeQpJacobian()` and `computeQpOffDiagJacobian()` are unused, as AD automatically assembles Jacobian. | |
| `computeQpJacobian()` and `computeQpOffDiagJacobian()` are unused, as AD automatically assembles the Jacobian. |
| _dims.copyToDevice(); | ||
| _offsets.copyToDevice(); | ||
| _data.create(_offsets.last() + stride); | ||
| _data.create(_offsets.last() + stride + 1); |
There was a problem hiding this comment.
was this a bug before? Some comment in the code regarding the +1 here would be nice
| auto & sys = kokkosSystem(_kokkos_var.sys(comp)); | ||
| auto row = sys.getElemLocalDofIndex(datum.elem().id, i, _kokkos_var.var(comp)); | ||
|
|
||
| for (dof_id_type t = 0; t < _matrix_tags.size(); ++t) |
There was a problem hiding this comment.
I'd really like to get KOKKOS_FUNCTION support for make_range and index_range (FYI @roystgnr). Because strictly speaking we could be promoting the integer type returned by _matrix_tags.size(), which is the safe way to do it compared to the other other way around. But the pedantic/perfectionist part of me doesn't love it
There was a problem hiding this comment.
Wow, now that you mention it, so do I!
I don't want to put it in libMesh/libmesh#4451 (because that's almost ready to get merged, because I've already hassled @rochi00 enough about it, and because it is a logically separate feature) ... although, now that I look at things, I'm quite surprised that Rochi didn't have to get it in there. Looks like, although we #include <int_range.h> in type_vector.h, we only got around to using it for l1_norm() (which his code doesn't use) and we never switched any of the older methods to it!
| { | ||
| auto tag = _matrix_tags[t]; | ||
|
|
||
| if (sys.isMatrixTagActive(tag) && !sys.hasNodalBCMatrixTag(row, tag)) |
There was a problem hiding this comment.
So this is to avoid an intermediate matrix close between the application of integrated residual objects and application of nodal BC constraints? I'm surprised I haven't noticed this before; maybe because of our brief convo with respect to the Claude review. I don't like the asymmetry with the residual. We don't check for a nodal BC vector tag when deciding whether to accumulate into the vector or not, which would allow us to avoid the intermediate vector close.
I don't know if we'll ever apply Kokkos to contact, but we do rely on the intermediate population of residuals and Jacobians by integrated residual objects as then the residual at the interface represents forces which we sometimes then move over to the non-constrained surface of the contact interface. Skipping the intermediate population of the global residual/matrix would make it impossible to do that kind of kinematic constraint enforcement
There was a problem hiding this comment.
This becomes very tricky in parallel. If an elemental kernel contributes to nodes covered by nodal BCs but not belonging to the current process, the remote process can't zero the nodes. I can't retrieve the closed matrix back into the COO ordering that I used for preallocation and assembly. PETSc doesn't have such capability.
There was a problem hiding this comment.
That sucks. Is the COO ordering required for things like nodal bcs?
There was a problem hiding this comment.
MatSetPreallocationCOO and MatSetValuesCOO are the only GPU matrix assembly method they support.
There was a problem hiding this comment.
I wonder if they can make something like MatGetValuesCOO. Then I can compute nodal BCs on it.
There was a problem hiding this comment.
What is different between MatZeroRows (which is what we've always used in MOOSE) and MatZeroRowsColumns with respect to the COO format?
There was a problem hiding this comment.
According to the documentation:
Unlike MatZeroRows(), this ignores the MAT_KEEP_NONZERO_PATTERN option value set with MatSetOption(), it merely zeros those entries in the matrix, but never removes them from the nonzero pattern. The nonzero pattern of the matrix can still change if a nonzero needs to be inserted on a
diagonal entry that was previously missing.
There was a problem hiding this comment.
I'm already zeroing out the row where nodal BCs apply in the current code I think?
There was a problem hiding this comment.
So is the plan then to remove these nodal bc matrix tags, close the matrix after element assembly, apply MatZeroRowsAndColumns, and then do the nodal BC matrix application? To me I still feel like MatZeroRows with MAT_KEEP_NONZERO_PATTERN would still be a more robust pattern as in general we don't know if a nodal BC has dof coupling in which case the preset trick won't work
There was a problem hiding this comment.
I implemented something. I hope the performance is not too bad because MatZeroRows is a CPU API and MatSetValuseCOO is now called twice.
| Real3 grad = 0; | ||
| Real3 grad; |
There was a problem hiding this comment.
I don't like this change combined with the += below. I guess it must work but a developer who sees this would be worried that we're doing += with unitialized memory
| auto n_dofs = kokkosAssembly().getNumDofs(info.type, fe); | ||
| auto & grad_phi = kokkosAssembly().getGradPhiFace(info.subdomain, info.type, fe)(side); | ||
|
|
||
| ADReal3 grad; |
| input = 'kokkos_2d_ad_diffusion_test.i' | ||
| cli_args = 'Outputs/exodus=false' | ||
| recover = false | ||
| requirement = 'The Kokkos system shall provide exact Jacobian using automatic differentation.' |
There was a problem hiding this comment.
| requirement = 'The Kokkos system shall provide exact Jacobian using automatic differentation.' | |
| requirement = 'The Kokkos system shall provide an exact Jacobian using automatic differentation.' |
| run_sim = True | ||
| ratio_tol = 1e-7 | ||
| difference_tol = 1e-6 | ||
| requirement = 'The Jacobian from KokkosADTimeDerivative shall be perfect.' |
There was a problem hiding this comment.
| requirement = 'The Jacobian from KokkosADTimeDerivative shall be perfect.' | |
| requirement = 'The Jacobian from a Kokkos based time derivative kernel shall be perfect.' |
| run_sim = True | ||
| ratio_tol = 1e-7 | ||
| difference_tol = 1e-6 | ||
| requirement = 'The Jacobian from KokkosADCoupledTimeDerivative shall be perfect.' |
There was a problem hiding this comment.
| requirement = 'The Jacobian from KokkosADCoupledTimeDerivative shall be perfect.' | |
| requirement = 'The Jacobian from a kokkos-based coupled variable time derivative kernel shall be perfect.' |
No, this requires the full problem to be built for testing. Hopefully we can get more coverage as the time goes on and more tests get added. |
|
Job Test, step Results summary on a2f3df9 wanted to post the following: Framework test summaryCompared against 54450aa in job civet.inl.gov/job/3821125. Added tests
Modules test summaryCompared against 54450aa in job civet.inl.gov/job/3821125. No change |
Refs #32699.