Skip to content
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -61,10 +61,8 @@ trait LocalElementMixin extends ParticleMixin with LocalElementGrammarMixin {
else if (representation =:= Representation.Binary) true
else false
}
case LengthKind.EndOfParent if isComplexType =>
notYetImplemented("lengthKind='endOfParent' for complex type")
case LengthKind.EndOfParent =>
notYetImplemented("lengthKind='endOfParent' for simple type")
// per DFDL Spec 9.3.2, endOfParent is already positioned at parent's end so length is zero
Comment thread
stevedlawrence marked this conversation as resolved.
Outdated
case LengthKind.EndOfParent => false
}
res
}.value
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -269,6 +269,26 @@ trait Term
.getOrElse(false)
}

final lazy val immediatelyEnclosingElementParent: Option[ElementBase] = {
val p = optLexicalParent.flatMap {
Comment thread
stevedlawrence marked this conversation as resolved.
case e: ElementBase => Some(e)
case ge: GlobalElementDecl => Some(ge.asRoot)
case s: SequenceTermBase => s.immediatelyEnclosingElementParent
case c: ChoiceTermBase => c.immediatelyEnclosingElementParent
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to return the choice in some cases? It looks like the logic for EOP sometimes cares about the choice so I'm not sure we can bypass this?

case ct: ComplexTypeBase => {
ct.optLexicalParent.flatMap {
case e: ElementBase => Some(e)
case ge: GlobalElementDecl => Some(ge.asRoot)
case _ => {
None
}
}
}
case _ => None
}
p
}

final lazy val immediatelyEnclosingGroupDef: Option[GroupDefLike] = {
optLexicalParent.flatMap { lexicalParent =>
val res: Option[GroupDefLike] = lexicalParent match {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -456,7 +456,11 @@ trait AlignedMixin extends GrammarMixin { self: Term =>
}
case LengthKind.Delimited => encodingLengthApprox
case LengthKind.Pattern => encodingLengthApprox
case LengthKind.EndOfParent => LengthMultipleOf(1) // NYI
case LengthKind.EndOfParent =>
eb.immediatelyEnclosingElementParent match {
case Some(parent) => parent.elementSpecifiedLengthApprox
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is quite right. The length of this element isn't the same as the parent length, it's whatever is left over of the parent after the previous siblings.

So this elements length is kindof parent.elementSpecifiedLenghtApprox - priorAlignmentApprox (i.e the length of the parent minus wherever we are starting) but we can't just subtract approx things since they are potentially multiples.

That said, I wonder if we don't really need to get this elements approx length perfect, because no elements come after it, and the endingAlignApprox of the parent won't need this specific is going to be known since it has an explicit length? Maybe this just becomes LengthMultipleOf 1 or 8 depending on length units? This might need some more thought...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm Length Units don't apply for endOfParent, so we might be fine leaving it as LengthMultipleOf(1)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But then if its parent has siblings, does that make an impact on us not approximating?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think you're right that we can just say length approx of an EOP element is multiple of 1.

In the case of a parent of an EOP element having siblings, I think the parent will calculate contentEndAlignment to figure out where it ends to figure out what alignment is needed for its sibling. And we have this code that handles that:

      case eb: ElementBase => {
        val res = if (eb.isComplexType && eb.lengthKind == LengthKind.Implicit) {
          eb.complexType.group.contentEndAlignment
        } else {
          // simple type or complex type with specified length
          contentStartAlignment + elementSpecifiedLengthApprox
        }
        res
      }

I don't think implicit is allowed to a be a parent of an EOP child, so we would fall into the contentStartAlignment + elementSpecifiedLengthApprox branch.

And I think elementSpeciifedLengthApprox will calculate the length of the parent and know where it ends to figure out alignment for the sibling--it doesn't even care where the children ended. So when we have EOP, we kindof don't really care about the approx length of the child, because the paren't already knows what the length will be, so having the EOP child being LengthMultipleOf(1) doesn't really matter for the parent.

So I think we are okay and will correctly detect any needed alignment?

case _ => LengthMultipleOf(1)
}
// If an element is lengthKind="prefixed", the element's length is the length
// of the value of the prefix element, which can't be known till runtime
case LengthKind.Prefixed => LengthMultipleOf(1) // NYI (see DAFFODIL-3066)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -19,10 +19,14 @@ package org.apache.daffodil.core.grammar

import java.lang.Long as JLong

import org.apache.daffodil.core.dsom.ChoiceTermBase
import org.apache.daffodil.core.dsom.ElementBase
import org.apache.daffodil.core.dsom.ExpressionCompilers
import org.apache.daffodil.core.dsom.InitiatedTerminatedMixin
import org.apache.daffodil.core.dsom.ModelGroup
import org.apache.daffodil.core.dsom.PrefixLengthQuasiElementDecl
import org.apache.daffodil.core.dsom.Root
import org.apache.daffodil.core.dsom.SequenceTermBase
import org.apache.daffodil.core.grammar.primitives.*
import org.apache.daffodil.core.runtime1.ElementBaseRuntime1Mixin
import org.apache.daffodil.lib.exceptions.Assert
Expand Down Expand Up @@ -52,6 +56,7 @@ trait ElementBaseGrammarMixin

requiredEvaluationsIfActivated(checkPrefixedLengthElementDecl)
requiredEvaluationsIfActivated(checkDelimitedLengthEVDP)
requiredEvaluationsIfActivated(checkEndOfParentElem)

private val context = this

Expand Down Expand Up @@ -252,6 +257,134 @@ trait ElementBaseGrammarMixin
}
final lazy val prefixedLengthBody = prefixedLengthElementDecl.parsedValue

final lazy val parentEffectiveLengthUnits: LengthUnits =
immediatelyEnclosingElementParent match {
case Some(parent: ElementBase) => {
parent.lengthKind match {
case LengthKind.Explicit | LengthKind.Prefixed => parent.lengthUnits
case LengthKind.Pattern => LengthUnits.Characters
case _
if parent.isInstanceOf[ChoiceTermBase] && (parent
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a choice might not necessarily be a direct parent? It's not uncommon for a choice branch to be a sequence (or a nested sequences). I think we need to look up the parent until we find a choice or an element.

.asInstanceOf[ChoiceTermBase]
.choiceLengthKind == ChoiceLengthKind.Explicit) =>
LengthUnits.Bytes
case LengthKind.EndOfParent => parent.parentEffectiveLengthUnits
case _ =>
Assert.invariantFailed(
s"Could not figure effective length unit of parents of ${context}"
)
}
}
case None if this.isInstanceOf[Root] => LengthUnits.Characters
case _ =>
Assert.invariantFailed(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this invariant might break if we have something like a global element decl with a child with EOP. That EOP will want to reach up to find where it's used but wont' be able to find a parent because it only looks lexically.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean a child that's an element reference? Do we have any other way to look at a parent that's not optLexicalParent? Even immediatelyEnclosingGroupDef uses optLexicalParent.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the below test and it works as expected where LK is EOP

	<xs:element name="text_string_txt_bytes" type="xs:string" dfdl:lengthUnits="bytes" nillable="true" />
	<xs:element name="text_string_txt_ref2">
		<xs:complexType>
			<xs:sequence>
				<xs:element ref="ex:text_string_txt_bytes"/>
			</xs:sequence>
		</xs:complexType>
	</xs:element>
	<xs:element name="text_string_txt_ref3">
		<xs:complexType>
			<xs:sequence>
				<xs:element ref="ex:text_string_txt_ref2"/>
			</xs:sequence>
		</xs:complexType>
	</xs:element>

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might I'm just forgetting how optLexicalParent works.

Refreshing my memory, it looks like the GlobalElementDecl DSOM object that represents the global definition does not have an optLexicalParent (or I guess more correctly it does, but it is the SchemaDocument).

But we also have an "ElementRef" DSOM object that represents the local element ref to that global definition. And the "ElementRef" is what is in the DSOM tree.

So as long as these functions are run within the context of ElementRef then maybe these invariants hold.

But I think maybe things get tricky if we try to recursively look up multiple parents though?

For example, if we are in the context of the ElementRef(text_string_txt_bytes) and ask for its optLexicalParent we'll get the global GlobalElemenDecl(text_string_txt_ref2). But if we then recursively ask for that GlobalElementDecl's optLexicalParent we'll get a SchemaDocument.

Similarly, say we have a group like this:

<group name="foo">
  <sequence>
    <element name="someEopElement" ... />
  </sequence>
</group>

Asking for the optLexicalParent of someEopElement returns the GlobalSequenceGroupDef for foo, who's optLexicalParent is the SchemaDocument and not whatever references the group.

And that makes sense because multiple different things could reference the group, so we don't really know which parent to examine.

So I'm not really sure how recursively looking up parents can work. The recrusion essentially ends at the global definition. It works fine if the thing you are looking for is within the scope of of your global element (e.g. Element Ref is always inside an Element, but I think it breaks once groups get involved. Unless those are handelded somehow else, and maybe I'm just looking at the wrongs spot when I'm inspeting values of optLexicalparent?

s"Could not figure effective length unit of parents of ${context}"
)
}
final lazy val checkEndOfParentElem: Unit = {
if (lengthKind != LengthKind.EndOfParent) ()
else {
schemaDefinitionWhen(
hasTerminator,
"%s is specified as dfdl:lengthKind=\"endOfParent\", but specifies a dfdl:terminator.",
context
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to include context in the error string. I believe the error context is capture and output as part of the SDE.

)
schemaDefinitionWhen(
trailingSkip != 0,
"%s is specified as dfdl:lengthKind=\"endOfParent\", but specifies a non-zero dfdl:trailingSkip.",
context
)
schemaDefinitionWhen(
maxOccurs > 1,
"%s is specified as dfdl:lengthKind=\"endOfParent\", but specifies a maxOccurs greater than 1.",
context
)
schemaDefinitionWhen(
nextSibling.isDefined && nextSibling.get.isInstanceOf[ModelGroup],
"%s is specified as dfdl:lengthKind=\"endOfParent\", but a model group is defined between this element and the end of the enclosing component",
context
)
Comment thread
stevedlawrence marked this conversation as resolved.
schemaDefinitionWhen(
nextSibling.isDefined && nextSibling.get.isRepresented,
"%s is specified as dfdl:lengthKind=\"endOfParent\", but a represented element is defined between this element and the end of the enclosing component",
context
)
immediatelyEnclosingElementParent match {
case Some(parent: ElementBase) =>
parent.lengthKind match {
case LengthKind.Implicit | LengthKind.Delimited =>
schemaDefinitionError(
"%s is specified as dfdl:lengthKind=\"endOfParent\", but its parent is an element with dfdl:lengthKind 'implicit' or 'delimited'.",
context
)
case _ => // do nothing
}
case _ => // do nothing
}
schemaDefinitionWhen(
representation == Representation.Text && knownEncodingWidthInBits != 8 && parentEffectiveLengthUnits != LengthUnits.Characters,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some testing and change the CVS schema to this:

  <element name="file" dfdl:lengthKind="explicit" dfdl:length="10" dfdl:terminator="%NL;">
    <complexType>
      <sequence>
        <element name="field" type="xs:string" dfdl:lengthKind="endOfParent" />
      </sequence>
    </complexType>
  </element>

And a got this stack trace:

org.apache.daffodil.lib.exceptions.Abort: Invariant broken: KnownEncodingMixin.this.isKnownEncoding
org.apache.daffodil.lib.exceptions.Assert$.abort(Assert.scala:153)
org.apache.daffodil.runtime1.processors.KnownEncodingMixin.knownEncodingName(EncodingRuntimeData.scala:56)
org.apache.daffodil.runtime1.processors.KnownEncodingMixin.knownEncodingName$(EncodingRuntimeData.scala:46)
org.apache.daffodil.core.dsom.LocalElementDeclBase.knownEncodingName$lzyINIT1(LocalElementDecl.scala:25)
	at org.apache.daffodil.lib.exceptions.Assert$.abort(Assert.scala:153)
	at org.apache.daffodil.runtime1.processors.KnownEncodingMixin.knownEncodingName(EncodingRuntimeData.scala:56)
	at org.apache.daffodil.runtime1.processors.KnownEncodingMixin.knownEncodingName$(EncodingRuntimeData.scala:46)
	at org.apache.daffodil.core.dsom.LocalElementDeclBase.knownEncodingName$lzyINIT1(LocalElementDecl.scala:25)
	at org.apache.daffodil.core.dsom.LocalElementDeclBase.knownEncodingName(LocalElementDecl.scala:25)
	at org.apache.daffodil.runtime1.processors.KnownEncodingMixin.knownEncodingCharset(EncodingRuntimeData.scala:62)
	at org.apache.daffodil.runtime1.processors.KnownEncodingMixin.knownEncodingCharset$(EncodingRuntimeData.scala:46)
	at org.apache.daffodil.core.dsom.LocalElementDeclBase.knownEncodingCharset$lzyINIT1(LocalElementDecl.scala:25)
	at org.apache.daffodil.core.dsom.LocalElementDeclBase.knownEncodingCharset(LocalElementDecl.scala:25)
	at org.apache.daffodil.runtime1.processors.KnownEncodingMixin.knownEncodingWidthInBits(EncodingRuntimeData.scala:81)
	at org.apache.daffodil.runtime1.processors.KnownEncodingMixin.knownEncodingWidthInBits$(EncodingRuntimeData.scala:46)
	at org.apache.daffodil.core.dsom.LocalElementDeclBase.knownEncodingWidthInBits$lzyINIT1(LocalElementDecl.scala:25)
	at org.apache.daffodil.core.dsom.LocalElementDeclBase.knownEncodingWidthInBits(LocalElementDecl.scala:25)
	at org.apache.daffodil.core.grammar.ElementBaseGrammarMixin.checkEndOfParentElem(ElementBaseGrammarMixin.scala:325)
	at org.apache.daffodil.core.grammar.ElementBaseGrammarMixin.checkEndOfParentElem$(ElementBaseGrammarMixin.scala:49)
	at org.apache.daffodil.core.dsom.LocalElementDeclBase.checkEndOfParentElem$lzyINIT1(LocalElementDecl.scala:25)
	at org.apache.daffodil.core.dsom.LocalElementDeclBase.checkEndOfParentElem(LocalElementDecl.scala:25)

I think the issue is that the default encoding used by csv is UTF-8, which seems to cause problems here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not able to replicate this. And I think CSV's default encoding is ASCII?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like it uses the $dfdl:encoding variable which I think defaults to UTF-8

https://github.com/DFDLSchemas/CSV/blob/master/src/csv-base-format.dfdl.xsd#L48

That's a relatively new change to CSV, maybe you're using an older version?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the issue is specifically due to the use of dfdl:encoding var..replacing with UTF-8 works as expected. Will investigate

<xs:element name="file" dfdl:lengthKind="explicit" dfdl:length="10" dfdl:terminator="%NL;" dfdl:encoding="{$dfdl:encoding}">
	<xs:complexType>
		<xs:sequence>
			<xs:element name="field" type="xs:string" dfdl:lengthKind="endOfParent" dfdl:encoding="{$dfdl:encoding}" />
		</xs:sequence>
	</xs:complexType>
</xs:element>

"%s is specified as dfdl:lengthKind=\"endOfParent\", but the element has text representation, does not have a single-byte character set encoding, and the effective length units of the parent is not 'characters'.",
context
)

immediatelyEnclosingModelGroup match {
case Some(s: SequenceTermBase) => {
schemaDefinitionWhen(
s.separatorPosition == SeparatorPosition.Postfix,
"%s is specified as dfdl:lengthKind=\"endOfParent\", but is in a sequence with dfdl:separatorPosition defined as 'postfix'.",
context
)
schemaDefinitionWhen(
s.sequenceKind != SequenceKind.Ordered,
"%s is specified as dfdl:lengthKind=\"endOfParent\", but is in a sequence with dfdl:sequenceKind defined as 'unordered'.",
context
)
schemaDefinitionWhen(
s.hasTerminator,
"%s is specified as dfdl:lengthKind=\"endOfParent\", but is in a sequence with a dfdl:terminator.",
context
)
schemaDefinitionWhen(
s.elementChildren.exists(e => e.floating == YesNo.Yes),
"%s is specified as dfdl:lengthKind=\"endOfParent\", but is in a sequence with elements defining dfdl:floating='yes'.",
context
)
schemaDefinitionWhen(
s.trailingSkip != 0,
"%s is specified as dfdl:lengthKind=\"endOfParent\", but is in a sequence with a non-zero dfdl:trailingSkip."
)
}
case Some(c: ChoiceTermBase) if c.choiceLengthKind == ChoiceLengthKind.Implicit => {
schemaDefinitionWhen(
c.hasTerminator,
"%s is specified as dfdl:lengthKind=\"endOfParent\", but is in a choice with a dfdl:terminator.",
context
)
schemaDefinitionWhen(
c.trailingSkip != 0,
"%s is specified as dfdl:lengthKind=\"endOfParent\", but is in a choice with a non-zero dfdl:trailingSkip.",
context
)
}
case _ => // do nothing
}

if (isSimpleType) {
schemaDefinitionUnless(
(primType eq PrimType.String)
|| (representation == Representation.Text)
|| (primType eq PrimType.HexBinary)
|| (representation == Representation.Binary
&& Seq(BinaryNumberRep.Packed, BinaryNumberRep.Bcd, BinaryNumberRep.Ibm4690Packed)
.contains(binaryNumberRep)),
"%s is a simple type specified as dfdl:lengthKind=\"endOfParent\", but isn't a string type, doesn't have text representation, isn't a hexbinary type, or doesn't have binary representation with packed decimal representation.",
context
)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if this last check as already handled by below changes? FOr example, just below this you have a SDE for

        SDE("lengthKind='endOfParent' only supported for packed binary formats.")

Seems like it would only be possible to hit one of these SDEs. I don't think we necessarily need to have all checks in this checkEndOfParentRestrictionsOnCurrentElement if they are handled in other places with potentially more specific error messages.

}

}
}

/**
* Quite tricky when we add padding or fill
*
Expand Down Expand Up @@ -648,10 +781,7 @@ trait ElementBaseGrammarMixin
Assert.invariant(pt == PrimType.String)
StringOfSpecifiedLength(this)
}
case LengthKind.EndOfParent if isComplexType =>
notYetImplemented("lengthKind='endOfParent' for complex type")
case LengthKind.EndOfParent =>
notYetImplemented("lengthKind='endOfParent' for simple type")
case LengthKind.EndOfParent => StringOfSpecifiedLength(this)
}
}

Expand All @@ -667,6 +797,10 @@ trait ElementBaseGrammarMixin
new HexBinaryLengthPrefixed(this)
}

private lazy val hexBinaryLengthEndOfParent = prod("hexBinaryLengthEndOfParent") {
new HexBinaryLengthEndOfParent(this)
}

private lazy val hexBinaryValue = prod("hexBinaryValue") {
schemaDefinitionWhen(
lengthUnits == LengthUnits.Characters,
Expand All @@ -678,10 +812,7 @@ trait ElementBaseGrammarMixin
case LengthKind.Delimited => hexBinaryDelimitedEndOfData
case LengthKind.Pattern => hexBinaryLengthPattern
case LengthKind.Prefixed => hexBinaryLengthPrefixed
case LengthKind.EndOfParent if isComplexType =>
notYetImplemented("lengthKind='endOfParent' for complex type")
case LengthKind.EndOfParent =>
notYetImplemented("lengthKind='endOfParent' for simple type")
case LengthKind.EndOfParent => hexBinaryLengthEndOfParent
}
}

Expand Down Expand Up @@ -1280,10 +1411,7 @@ trait ElementBaseGrammarMixin
case LengthKind.Implicit =>
LiteralValueNilOfSpecifiedLength(this)
case LengthKind.Prefixed => LiteralValueNilOfSpecifiedLength(this)
case LengthKind.EndOfParent if isComplexType =>
notYetImplemented("lengthKind='endOfParent' for complex type")
case LengthKind.EndOfParent =>
notYetImplemented("lengthKind='endOfParent' for simple type")
case LengthKind.EndOfParent => LiteralValueNilOfSpecifiedLength(this)
}
}
case NilKind.LiteralCharacter => {
Expand Down Expand Up @@ -1396,10 +1524,7 @@ trait ElementBaseGrammarMixin
case LengthKind.Implicit
if isSimpleType && impliedRepresentation == Representation.Binary =>
new SpecifiedLengthImplicit(this, body, implicitBinaryLengthInBits)
case LengthKind.EndOfParent if isComplexType =>
notYetImplemented("lengthKind='endOfParent' for complex type")
case LengthKind.EndOfParent =>
notYetImplemented("lengthKind='endOfParent' for simple type")
case LengthKind.EndOfParent => new SpecifiedLengthEndOfParent(this, body)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is is only needed for complex types? My thinking is that simple types with lengthKind EOP should have parent parser (ether complex or explicity length choice) that already set the bit limit via one of these specified length parsers. So the bit limit has already been set correctly and we don't need another parser to do that.

But this is needed for complex types with EOP since they need to skip any bits up to their parents bit limit that thier children might not have consumed.

So I think this wants to be:

case LengthKind.EndOfParent if isComplexType => new SpecifiedLengthEndOfParent(this, body)

And then bodyRequiresSpecifiedLength wants to be modified to make it so it evaluates to false if this is a simple type with lengthKind EOP.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm what if the simpleType is a root element with lengthKind EOP (where the user intends it to go to the end of the datastream)?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. So maybe it instead wants to be something like this?

case LengthKind.EndOfParent => {
  Assert.invariant(isComplexType || isRoot)
  new SpecifiedLengthEndOfParent(this, body)
}

case LengthKind.Delimited | LengthKind.Implicit =>
Assert.impossibleCase(
"Delimited and ComplexType Implicit cases should not be reached"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably update this comment to mention EOP, or maybe reword it to be more generic.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -191,6 +191,16 @@ case class HexBinaryLengthPrefixed(e: ElementBase) extends Terminal(e, true) {
new HexBinaryMinLengthInBytesUnparser(e.minLength.longValue, e.elementRuntimeData)
}

case class HexBinaryLengthEndOfParent(e: ElementBase) extends Terminal(e, true) {

override lazy val parser: DaffodilParser = new HexBinaryEndOfBitLimitParser(
e.elementRuntimeData
)

override lazy val unparser: DaffodilUnparser =
new HexBinaryMinLengthInBytesUnparser(e.minLength.longValue, e.elementRuntimeData)
}

abstract class PackedIntegerDelimited(
e: ElementBase,
packedSignCodes: PackedSignCodes
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -132,6 +132,30 @@ class SpecifiedLengthExplicit(e: ElementBase, eGram: => Gram, bitsMultiplier: In

}

class SpecifiedLengthEndOfParent(e: ElementBase, eGram: => Gram)
extends SpecifiedLengthCombinatorBase(e, eGram) {

lazy val kind = "EndOfParent_" + e.lengthUnits.toString

lazy val parser: Parser = {
if (eParser.isEmpty) eParser
else
new SpecifiedLengthEndOfParentParser(
eParser,
e.elementRuntimeData
)
}

lazy val unparser: Unparser = {
if (eUnparser.isEmpty) eUnparser
else
new SpecifiedLengthEndOfParentUnparser(
eUnparser,
e.elementRuntimeData
)
}
}

class SpecifiedLengthImplicit(e: ElementBase, eGram: => Gram, nBits: Long)
extends SpecifiedLengthCombinatorBase(e, eGram)
with SpecifiedLengthExplicitImplicitUnparserMixin {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -84,11 +84,8 @@ final class HexBinarySpecifiedLengthParser(erd: ElementRuntimeData, lengthEv: Le
}

final class HexBinaryEndOfBitLimitParser(erd: ElementRuntimeData)
extends HexBinaryLengthParser(erd) {
extends HexBinaryLengthParser(erd),
BitLengthFromBitLimitMixin {

override def runtimeDependencies = Vector()

override def getLengthInBits(pstate: PState): Long = {
pstate.bitLimit0b.get - pstate.bitPos0b
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -132,6 +132,17 @@ class SpecifiedLengthPatternParser(
}
}

class SpecifiedLengthEndOfParentParser(
eParser: Parser,
erd: ElementRuntimeData
) extends SpecifiedLengthParserBase(eParser, erd),
BitLengthFromBitLimitMixin {

override protected def getBitLength(s: PState): MaybeULong = {
MaybeULong(super[BitLengthFromBitLimitMixin].getBitLength(s))
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we need something different for SpecifiedLengthEndOfParentParser, at least the context of complex types. If we look at SpecifiedLengthParserBase, then logic it follows for complex types is essentially

  1. getBitLength
  2. ignored checking isDefinedForLength
  3. withBitLengthLimit { children.parse() }
  4. skip any remaining bits not consumed by children

Note that the reason for step 2 to ignore isDefinedForLength is because isDefinedForLength will cause the underlying input source too fill a bunch of buckets for the full length of the complex types, but there are limits to how much we can bucket.

Instead, we just call all the children, have them slowly read all the bytes as needed, and then skip whatever is left over at the end. So we don't require the full length of the complex type until we finish parsing the children.

But say we have an EOP complex root element. The first thing this does is call getBiLength which will to scan the entire stream for the EOD, which does exactly what we were trying to avoid by skipping isDefinedForLength in step 2.

So it feels like we need special logic for complex EOP elements so that they do something different if there is no current bit limit. For example, the logic wants to be more like this for complex EOP elements:

if (bitLimit.isDefined) {
  val len = getBitLength  // will use bitLimit
  withBitLengthLength { children.parse }
  val reminaingBits = ...
  dis.skip(remaiingBits)
} else {
   children.parse()
   val len = getBitLength // will scan for EOD
   val reminaingBits = ...
   dis.skip(remainingBits)
}

So we kindof need to reverse the order of getBitLengthand children.parse.

I'm not sure if it makes sense to change SpecifiedLengthParserBase to know about EOP and check for bitLimit or SpecifiedLengthEndOfParentParser wants to be it's own thing, maybe with helper functions that the Base parser has for the common things like skipping remaining bits.

Note this is only an issue for complex types, simple types need the length no matter what.

}

class SpecifiedLengthExplicitParser(
eParser: Parser,
erd: ElementRuntimeData,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,20 @@ final class SpecifiedLengthExplicitImplicitUnparser(
}
}

final class SpecifiedLengthEndOfParentUnparser(
eUnparser: Unparser,
erd: ElementRuntimeData
) extends CombinatorUnparser(erd) {

override def runtimeDependencies = Vector()

override def childProcessors = Vector(eUnparser)

override final def unparse(state: UState): Unit = {
eUnparser.unparse1(state)
}
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this unparser doesn't do anything I would suggest we shouldn't even have it and the SpecifiedLengthEndOfParent primitive just wants to return eUnparser. It looks like the pattern parser does something similar, for example.


/**
* This trait is to be used with prefixed length unparsers where the length
* must be calculated based on the content length of the data. This means the
Expand Down
Loading
Loading