Support DTS-HD with packets across PES boundary by nift4 · Pull Request #3147 · androidx/media

nift4 · 2026-03-28T16:16:08Z

In DTS-HD streams, the core packet can be in one PES while the extension
substream is in the next PES. Ensure we can read ahead into the next PES
in this case. We need to keep using the old timestamp, as the extension
substream (extss) and core always belong together.
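
As a rough illustration of the timestamp rule described above, here is a minimal sketch. It assumes a fixed, known sample size and duration, and every name in it (`CrossPesSampleAssembler`, `packetStarted`, `consume`) is hypothetical rather than taken from `DtsReader`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model: one sample = core + extension substream, possibly split across PES. */
final class CrossPesSampleAssembler {
  private final int sampleSize;        // Assumed known up front for this sketch.
  private final long sampleDurationUs; // Assumed constant for this sketch.
  private final List<Long> emittedTimesUs = new ArrayList<>();
  private int bytesRead;
  private long timeUs;

  CrossPesSampleAssembler(int sampleSize, long sampleDurationUs) {
    if (sampleSize <= 0) {
      throw new IllegalArgumentException("sampleSize must be positive");
    }
    this.sampleSize = sampleSize;
    this.sampleDurationUs = sampleDurationUs;
  }

  /** Called at the start of each PES packet with that packet's timestamp. */
  void packetStarted(long pesTimeUs) {
    // Adopt the new timestamp only if no sample is in progress. If the core was
    // already read from the previous PES, the rest keeps the old timestamp.
    if (bytesRead == 0) {
      timeUs = pesTimeUs;
    }
  }

  /** Consumes {@code length} payload bytes, recording a timestamp per completed sample. */
  void consume(int length) {
    while (length > 0) {
      int toRead = Math.min(length, sampleSize - bytesRead);
      bytesRead += toRead;
      length -= toRead;
      if (bytesRead == sampleSize) {
        emittedTimesUs.add(timeUs);
        timeUs += sampleDurationUs;
        bytesRead = 0;
      }
    }
  }

  List<Long> emittedTimesUs() {
    return emittedTimesUs;
  }

  public static void main(String[] args) {
    CrossPesSampleAssembler assembler = new CrossPesSampleAssembler(100, 10_000);
    assembler.packetStarted(0);     // First PES carries the first 60 bytes.
    assembler.consume(60);
    assembler.packetStarted(5_000); // Second PES carries the remaining 40 bytes.
    assembler.consume(40);
    System.out.println(assembler.emittedTimesUs()); // prints [0]
  }
}
```

In this model, a single sample is produced with the timestamp of the PES in which it started, even though its tail arrived in a later PES.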

FongMi · 2026-03-28T17:54:30Z

Not enough. Please follow the ffmpeg implementation.

nift4 · 2026-03-28T18:27:22Z

@FongMi please can you elaborate a bit?

FongMi · 2026-03-28T18:46:16Z

https://github.com/FongMi/media/blob/release-1.10.0-fongmi/libraries/extractor/src/main/java/androidx/media3/extractor/ts/DtsReader.java
https://github.com/FongMi/media/blob/release-1.10.0-fongmi/libraries/extractor/src/main/java/androidx/media3/extractor/DtsUtil.java

This is what I wrote using copilot + sonnet 4.6

nift4 · 2026-03-28T19:07:24Z

@FongMi Well, I see that you made a lot of changes, and most of them seem to address the same issue: that we shouldn't emit the core format but the extension format instead (I solved it with the PatientTrackOutput wrapper, while you added a new state and coreFormatPendingEmit). My fix is a bit subtle because it intercepts the format() call and waits until the extension header is parsed before sending, which allows the cached format to be overwritten. But it works and it's readable.

I also see another issue which I didn't solve but you did: we should wait for a core frame after seeking instead of sending an extension substream frame first. That makes sense and I'll add it to my patch.

Is there any other problem I need to address?

FongMi · 2026-03-28T19:17:22Z

That should be all. I'll test the ISO file again after you finish.

nift4 · 2026-03-28T19:29:33Z

@FongMi It should be good now, let me know if it works. Thanks :)

FongMi · 2026-03-29T02:44:43Z

Users have reported that the playback quality of DTS-HD is not as good as my version, and that stuttering occurs. This may be because your Core+EXTSS is sent in segments, while mine is sent all at once.

FongMi · 2026-03-29T04:43:14Z

FongMi@6755c3e
Please also implement DTS-MA and DTS-X recognition.
I've switched DtsUtil to your version to reduce merge conflicts.

nift4 · 2026-03-29T08:39:30Z

@FongMi

> Users have reported that the playback quality of DTS-HD is not as good as my version, and that stuttering occurs. This may be because your Core+EXTSS is sent in segments, while mine is sent all at once.

Should be fixed, please check again. Thanks for testing!

> Please also implement DTS-MA and DTS-X recognition.

It's on my TODO list, but I will do it in a separate PR when this is merged. This one is already quite big.

FongMi · 2026-03-30T02:17:16Z

@nift4 Users say it's still very laggy.

Core Architectural Difference

nick — `PatientTrackOutput` double-buffer

TS input → PatientTrackOutput internal buffer → flush() → TrackOutput

Every frame goes through two copies:

// Copy 1: into internal ParsableByteArray
public void sampleData(ParsableByteArray data, int length) {
    this.data.ensureCapacity(this.data.limit() + length);
    ByteBuffer tmp = ByteBuffer.wrap(this.data.getData()); // new object every call
    tmp.put(data.getData(), data.getPosition(), length);
}

// Copy 2: out to the real TrackOutput
public void flush() {
    output.sampleData(data, data.limit());
}

fongmi — zero-copy passthrough

TS input → TrackOutput (direct)

Instead of buffering, it adds a STATE_CHECKING_FOR_EXTSS_AFTER_CORE state: after reading a core frame, it peeks 4 bytes to check whether an EXSS follows, then decides to combine or emit immediately — no intermediate buffer needed.
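
To make the 4-byte check concrete, here is a minimal sketch. The two sync word constants are the big-endian DTS core and extension substream sync words; the class, enum, and method names are hypothetical and not taken from either branch. Note the explicit "need more data" result: when fewer than 4 bytes are available, the decision has to wait for the next call.

```java
/** Hypothetical helper illustrating a sync word peek; not code from either branch. */
final class DtsSyncPeek {
  /** Big-endian DTS core sync word. */
  static final int SYNC_CORE_BE = 0x7FFE8001;
  /** Big-endian DTS-HD extension substream sync word. */
  static final int SYNC_EXTSS_BE = 0x64582025;

  enum Next { EXTSS, NOT_EXTSS, NEED_MORE_DATA }

  /** Inspects the 4 bytes at {@code offset} without consuming them. */
  static Next peekAfterCore(byte[] data, int offset, int limit) {
    if (limit - offset < 4) {
      return Next.NEED_MORE_DATA; // Cannot decide yet; the caller must wait.
    }
    int word =
        ((data[offset] & 0xFF) << 24)
            | ((data[offset + 1] & 0xFF) << 16)
            | ((data[offset + 2] & 0xFF) << 8)
            | (data[offset + 3] & 0xFF);
    return word == SYNC_EXTSS_BE ? Next.EXTSS : Next.NOT_EXTSS;
  }

  public static void main(String[] args) {
    byte[] extss = {0x64, 0x58, 0x20, 0x25};
    byte[] core = {0x7F, (byte) 0xFE, (byte) 0x80, 0x01};
    System.out.println(peekAfterCore(extss, 0, extss.length)); // prints EXTSS
    System.out.println(peekAfterCore(core, 0, core.length));   // prints NOT_EXTSS
    System.out.println(peekAfterCore(extss, 0, 2));            // prints NEED_MORE_DATA
  }
}
```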

Performance Comparison

| Criterion | fongmi | nick |
| --- | --- | --- |
| Data copies per frame | 1 (direct write) | 2 (buffered) |
| Per-frame heap allocation | None | `ByteBuffer.wrap()` each call |
| `ensureCapacity` reallocation risk | None | Yes (large frames) |
| DTS:X (XLL-X) auto-detection | Yes | No |
| Post-seek resync | `skipExtssUntilCore` (precise) | `waitingForResyncAfterSeek` (basic) |
| GC pressure | Low | Higher |

Verdict

fongmi is faster, for three reasons:

- Zero-copy: data is written directly to TrackOutput, eliminating the PatientTrackOutput intermediate buffer and second memcpy. For large DTS-HD MA frames (potentially hundreds of KB each), this is significant.
- No per-frame heap allocation: nick allocates a new ByteBuffer wrapper on every sampleData() call, increasing GC pressure.
- Richer functionality: fongmi adds XLL-X scanning to auto-detect DTS:X object audio and properly upgrades the track format to AUDIO_DTS_X, which nick lacks entirely.

nick's PatientTrackOutput is a cleaner abstraction conceptually, but the double-copy cost makes it a poor trade-off for a hot path that processes audio frames continuously.
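
For reference, appending into a reusable buffer does not require a wrapper object per call. The sketch below is an illustration only, using plain `System.arraycopy` on a growable `byte[]`; the class `GrowableSampleBuffer` is hypothetical and does not appear in either branch.

```java
import java.util.Arrays;

/** Hypothetical growable buffer that appends without allocating a wrapper per call. */
final class GrowableSampleBuffer {
  private byte[] data = new byte[0];
  private int limit;

  /** Appends {@code length} bytes from {@code src}, growing the backing array if needed. */
  void append(byte[] src, int offset, int length) {
    if (limit + length > data.length) {
      // Grow geometrically so repeated appends do not reallocate every time.
      data = Arrays.copyOf(data, Math.max(limit + length, data.length * 2));
    }
    System.arraycopy(src, offset, data, limit, length);
    limit += length;
  }

  /** Marks the buffer as empty while keeping the backing array for reuse. */
  void reset() {
    limit = 0;
  }

  int limit() {
    return limit;
  }

  byte[] data() {
    return data;
  }

  public static void main(String[] args) {
    GrowableSampleBuffer buffer = new GrowableSampleBuffer();
    buffer.append(new byte[] {1, 2, 3}, 1, 2);
    buffer.append(new byte[] {4}, 0, 1);
    System.out.println(buffer.limit()); // prints 3
  }
}
```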

nift4 · 2026-03-30T07:09:05Z

Hi @FongMi, I ran my sample file through both versions of the extractor, and the only difference I saw in the result was that the average bitrate was set on mine. That's set from the core header; I imagine it might cause the buffer allocation to be too small, so I removed it. Can you re-test?

> No per-frame heap allocation: nick allocates a new ByteBuffer wrapper on every sampleData() call, increasing GC pressure.

Also, I don't think that's the problem causing audible lags, but I changed it to only do that on the first frame (after seek).

rohitjoins · 2026-04-02T14:57:43Z

Hi @nift4,

Thank you for all the work and iterations on this!

The PR in its current form, touches / fixes lots of different issues in extractor. Could we break this down further into smaller issues, focusing one thing at a time?

Suggestion:

- Fix for mime type in TsExtractor
- Format emission issue in TsExtractor
- Fix for MatroskaExtractor
- Fix for Mp4Extractor

Once done, I'm happy to look at them individually and merge them.

nift4 · 2026-04-02T15:11:35Z

Hi @rohitjoins,

> The PR in its current form, touches / fixes lots of different issues in extractor. Could we break this down further into smaller issues, focusing one thing at a time?

Because the other parts depend on the mime type detection change in TsExtractor, I had it all on one branch; otherwise there would be conflicts. But we can instead merge this change first (in this PR, I removed everything except "Fix for mime type in TsExtractor"), and I will add the other changes in new PRs once this one is merged.

Also please note that the format = null; change is required (and not left over by mistake) in order to have the DTS-HD MA sample ever work with DumpFileAsserts, because without this line, a seek back to 0 would yield different result and test would not work.

Looking forward to review, thanks!

rohitjoins · 2026-04-02T16:57:40Z

> Also please note that the format = null; change is required (and not left over by mistake) in order to have the DTS-HD MA sample ever work with DumpFileAsserts, because without this line, a seek back to 0 would yield different result and test would not work.

Thanks for clarifying! While setting format = null; makes the test pass, we unfortunately can't use this in production.

Clearing the format on seek() forces the extractor to re-emit the format. This causes the player to reinitialize the audio decoder on every seek, which leads directly to audio dropouts and the stuttering reported IMO.

The correct way to make the DumpFileAsserts test pass is to avoid the two-step (Core -> HD) format emission entirely. If we wait to check for the extension substream before emitting the format (e.g., using a peek state), the extractor will only ever emit the final DTS-HD format. This ensures the initial playback and the seek() behavior match naturally, fixing the test without causing decoder resets.
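
The difference between two-step emission and waiting for the extension substream can be modelled in a few lines. This is a deliberately simplified sketch: the frame types, the `"core"` and `"hd"` labels, and the one-frame lookahead are all assumptions made for illustration, and the real reader cannot look ahead this freely.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of which formats an extractor would emit; not DtsReader code. */
final class FormatEmissionModel {
  enum Frame { CORE, EXTSS }

  /**
   * Returns the sequence of formats emitted for {@code frames}. The labels "core" and
   * "hd" are placeholders for the core and extension formats.
   */
  static List<String> emittedFormats(List<Frame> frames, boolean waitForExtss) {
    List<String> formats = new ArrayList<>();
    String current = null;
    for (int i = 0; i < frames.size(); i++) {
      boolean nextIsExtss = i + 1 < frames.size() && frames.get(i + 1) == Frame.EXTSS;
      boolean hd = frames.get(i) == Frame.EXTSS || (waitForExtss && nextIsExtss);
      // Emit on the first frame, and again only when upgrading from "core" to "hd".
      if (current == null || (hd && !current.equals("hd"))) {
        current = hd ? "hd" : "core";
        formats.add(current);
      }
    }
    return formats;
  }

  public static void main(String[] args) {
    List<Frame> frames = List.of(Frame.CORE, Frame.EXTSS, Frame.CORE, Frame.EXTSS);
    System.out.println(emittedFormats(frames, false)); // prints [core, hd]
    System.out.println(emittedFormats(frames, true));  // prints [hd]
  }
}
```

With the wait enabled, the model only ever emits the final format, so there is no intermediate core format for a seek to disagree with.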

nift4 · 2026-04-02T17:16:43Z

Hi @rohitjoins,

> The correct way to make the DumpFileAsserts test pass is to avoid the two-step (Core -> HD) format emission entirely. If we wait to check for the extension substream before emitting the format (e.g., using a peek state), the extractor will only ever emit the final DTS-HD format. This ensures the initial playback and the seek() behavior match naturally, fixing the test without causing decoder resets.

Yes, I did do this, as you can see on my final branch here it's no longer needed to reset the format:
https://github.com/nift4/media/blob/dtshdints-backup/libraries/extractor/src/main/java/androidx/media3/extractor/ts/DtsReader.java#L124

It's only in this PR because the DTS-HD MA sample (which is added to demonstrate the mime type change) is separated from the fix to emit format only once.

So, I guess it was the wrong order to merge changes. Instead, I will remove mime type change from the branch, and add format-only-once fix here. Then we can merge that first, then the mime type fix, and then the MKV/MP4 related changes; and format = null; won't be needed at any point in time.

I pushed this, so now the format-only-once fix is in this PR. I opted for this wrapper in order to be minimally invasive and, most importantly, so that I don't duplicate DTS related business logic. FongMi has an approach where a number of additional states are added, which is a different set of tradeoffs for code flow with the same end result. Let me know if mine is OK or if you'd rather have the different state approach.

rohitjoins · 2026-04-08T17:24:46Z

Hi @nift4,

Thanks for explaining your rationale. While I have not compared the performance differences for either of the approaches, I think it would make sense to avoid temporary copies in the memory before being flushed out to the actual output?

> don't duplicate DTS related business logic

Can you please expand more on this? Can we avoid this by refactoring into reusable helper methods?

nift4 · 2026-04-08T18:09:06Z

Hi @rohitjoins,

> I think it would make sense to avoid temporary copies in the memory before being flushed out to the actual output?

Please note there is only one copy, for the very first core frame, before the format is emitted. After the format is emitted to ExtractorOutput, there are no more copies and the buffer is released (at this point the flush() calls only trigger emitting sampleMetadata, not sampleData); I removed the extra copying after FongMi pointed it out. (edit: force pushed to fix a small bug causing useless memory copies)

Given the constraints that:

- consume() in DtsReader cannot peek into future bytes, or go back
- there is no guarantee we get all bytes of a frame or even just header on first invocation of consume()
- we need to read extension substream's header before calling output.format() - and hence before calling output.sampleData()

...I think there is no way to get rid of this one temporary memory copy in any way.

> Can you please expand more on this? Can we avoid this by refactoring into reusable helper methods?

We need to read extension substream's header before calling format(). That involves:

1. first finding core syncword if any (there may be none for DTS Express or DTS-HD MA without backward compatible)
   a. parsing core header
   b. storing core sample data in temporary buffer
2. then finding extension syncword if any (if there is another core one, emit the core sample data and format BEFORE parsing new one)
   a. then parsing its header
   b. call output.format() based on extension header data
   c. only now we can emit core sampleData()

This is the business logic I was talking about. Almost the entirety of the consume() method is required for this job. But consume() is optimized to pass data directly to ExtractorOutput, even though we cannot pass any data to it until we have called format().

The difficulty in extracting helper method here is that consume() method may be called multiple times until we finish with any one specific aspect of this parsing. So there is a need to keep state in the class, as is currently done already.

But due to different needs of:

- first frame (cannot call ExtractorOutput methods in any case and must cache data in the class with temporary buffer)
- second frame (based on first frame type, we have to maybe parse header before emitting format, or maybe after, then emit old sample data if any, then proceed as normal)
- subsequent frames (should instantly output data to ExtractorOutput without special logic)

I am not sure how code can be shared in elegant way. I'd be interested if you see any way to do it nicely.

Let me know if that's clear or if you have further questions, Thanks for the feedback!
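
The ordering constraint described in this comment (hold the first core frame, emit the format only once the extension header is known, then pass data straight through) can be sketched as follows. This is a minimal illustration under stated assumptions: `Sink` is a hypothetical stand-in and not the media3 `TrackOutput` interface, and `DeferredFormatOutput` is not the `PatientTrackOutput` class from the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical wrapper that buffers sample data until a format has been emitted. */
final class DeferredFormatOutput {
  /** Hypothetical stand-in for a track output. */
  interface Sink {
    void format(String format);

    void sampleData(byte[] data, int offset, int length);
  }

  private final Sink sink;
  private byte[] pending = new byte[0];
  private int pendingLength;
  private boolean formatEmitted;

  DeferredFormatOutput(Sink sink) {
    this.sink = sink;
  }

  void sampleData(byte[] data, int offset, int length) {
    if (formatEmitted) {
      sink.sampleData(data, offset, length); // Steady state: no intermediate copy.
      return;
    }
    // Before the format is known, the data has to be held back (the one copy).
    if (pendingLength + length > pending.length) {
      pending = Arrays.copyOf(pending, pendingLength + length);
    }
    System.arraycopy(data, offset, pending, pendingLength, length);
    pendingLength += length;
  }

  void format(String format) {
    sink.format(format);
    if (!formatEmitted) {
      formatEmitted = true;
      if (pendingLength > 0) {
        sink.sampleData(pending, 0, pendingLength);
      }
      pending = new byte[0]; // Release the temporary buffer.
      pendingLength = 0;
    }
  }

  public static void main(String[] args) {
    List<String> events = new ArrayList<>();
    DeferredFormatOutput output =
        new DeferredFormatOutput(
            new Sink() {
              @Override
              public void format(String format) {
                events.add("format:" + format);
              }

              @Override
              public void sampleData(byte[] data, int offset, int length) {
                events.add("data:" + length);
              }
            });
    output.sampleData(new byte[3], 0, 3); // Core data arrives before the format is known.
    output.format("hd");                  // Extension header parsed: emit format, then flush.
    output.sampleData(new byte[2], 0, 2); // Later data passes straight through.
    System.out.println(events);           // prints [format:hd, data:3, data:2]
  }
}
```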

FongMi · 2026-05-02T08:41:18Z

Sorry, I don't have the physical ISO file either, because the file is stored on Quark Cloud Drive in China, and you need a Chinese mobile phone number to register for the cloud drive.

FongMi · 2026-05-02T08:42:55Z

I can ask my users for the cloud drive URL, but you may need to find a way to download it yourself.😂

FongMi · 2026-05-02T10:26:30Z

@nift4 https://115cdn.com/s/swf1cnv3hqk?password=1234#

nift4 · 2026-05-02T11:33:43Z

Hi, I see, indeed I have no way to download from that site. Is it possible to use another site such as https://transfer.it/ ?

FongMi · 2026-05-02T11:40:32Z

I can't download it either, because I'm Taiwanese.😂


rohitjoins · 2026-05-11T17:39:18Z

Hi @nift4,

I have added a test with a single sample split across a PES boundary. When run on the main branch, it outputs two samples in the dump file, as expected, which is the reason for the stuttering.

With the fixes in this PR, it correctly produces a single sample.

nift4 · 2026-05-11T17:45:56Z

Hi @rohitjoins,

I still don't have an affected sample myself (and I have to admit I'm a bit curious how you made this one), but it's good to know we can verify PR fixes it! Thanks again. It looks good to merge from my side.

rohitjoins · 2026-05-11T18:03:39Z

I first tried using ffmpeg with small PES payload sizes, but it prefers to keep complete audio frames together. So I used a custom Python script to take a raw DTS frame and manually mux it into TS packets, forcing a new PES boundary right in the middle of the frame!
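
The script itself is not shown in the thread. As a rough idea of what forcing a PES boundary inside a frame involves, the sketch below wraps the two halves of a frame in separate PES packets, each with its own PTS. It is an assumption-laden illustration: all names are hypothetical, the stream ID is only an example value, and the final step of packing the PES packets into 188-byte TS packets is omitted.

```java
import java.util.Arrays;

/** Hypothetical sketch of splitting one frame across two PES packets. */
final class PesSplitSketch {
  /** Wraps {@code payload} in a PES packet that carries only a PTS (90 kHz units). */
  static byte[] pesPacket(int streamId, long pts, byte[] payload) {
    int headerDataLength = 5; // PTS only.
    int pesPacketLength = 3 + headerDataLength + payload.length; // Bytes after the length field.
    if (pesPacketLength > 0xFFFF) {
      throw new IllegalArgumentException("payload too large for a bounded PES packet");
    }
    byte[] out = new byte[6 + pesPacketLength];
    out[2] = 1; // Start code prefix 0x000001 (first two bytes are already zero).
    out[3] = (byte) streamId;
    out[4] = (byte) (pesPacketLength >> 8);
    out[5] = (byte) pesPacketLength;
    out[6] = (byte) 0x80; // '10' marker bits, no optional flags set.
    out[7] = (byte) 0x80; // PTS_DTS_flags = '10' (PTS only).
    out[8] = (byte) headerDataLength;
    // 33-bit PTS spread over 5 bytes with marker bits.
    out[9] = (byte) (0x20 | ((pts >> 29) & 0x0E) | 0x01);
    out[10] = (byte) (pts >> 22);
    out[11] = (byte) (((pts >> 14) & 0xFE) | 0x01);
    out[12] = (byte) (pts >> 7);
    out[13] = (byte) (((pts << 1) & 0xFE) | 0x01);
    System.arraycopy(payload, 0, out, 14, payload.length);
    return out;
  }

  /** Splits {@code frame} at {@code splitAt} and wraps each part in its own PES packet. */
  static byte[][] splitAcrossTwoPes(
      int streamId, byte[] frame, int splitAt, long firstPts, long secondPts) {
    return new byte[][] {
      pesPacket(streamId, firstPts, Arrays.copyOfRange(frame, 0, splitAt)),
      pesPacket(streamId, secondPts, Arrays.copyOfRange(frame, splitAt, frame.length))
    };
  }

  public static void main(String[] args) {
    byte[] frame = new byte[10];
    // 0xBD (private_stream_1) is used purely as an example stream ID.
    byte[][] packets = splitAcrossTwoPes(0xBD, frame, 4, 0, 900);
    System.out.println(packets[0].length + " " + packets[1].length); // prints 18 20
  }
}
```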


nift4 · 2026-05-12T21:54:47Z

Thanks!

Read more of the DTS-HD header in order to find out extension substream type, to get correct mime type which is relevant for buffer size decision logic (as DTS Express has way lower maximum bit rate than DTS-HD). Issue: androidx#2487 Issue: androidx#3147

MP4 sadly doesn't store information required to understand which DTS format is which either (similar to MKV), so use the helper to detect format in MP4 container. Issue: androidx#3147

nift4 force-pushed the dtshdints branch from 181cf37 to 1eb50e8 Compare March 28, 2026 19:21

nift4 changed the title ~~Support DTS-HD in ts extractor~~ Support differentiating between DTS-HD and DTS Express in TS and Matroska extractors Mar 28, 2026

nift4 force-pushed the dtshdints branch from 1eb50e8 to ff9e3bf Compare March 28, 2026 19:28

nift4 changed the title ~~Support differentiating between DTS-HD and DTS Express in TS and Matroska extractors~~ Support differentiating between DTS-HD and DTS Express in TS, MP4 and Matroska extractors Mar 29, 2026

oceanjules assigned rohitjoins Apr 2, 2026

rohitjoins added the pending comments label Apr 2, 2026

nift4 force-pushed the dtshdints branch 2 times, most recently from ea7a9c6 to 90611d7 Compare April 2, 2026 15:08

nift4 changed the title ~~Support differentiating between DTS-HD and DTS Express in TS, MP4 and Matroska extractors~~ Support DTS-HD mime type detection in ts extractor Apr 2, 2026

nift4 changed the title ~~Support DTS-HD mime type detection in ts extractor~~ Support DTS-HD in ts extractor Apr 2, 2026

nift4 force-pushed the dtshdints branch from 90611d7 to ce68449 Compare April 2, 2026 17:16

nift4 force-pushed the dtshdints branch from ce68449 to d247497 Compare April 7, 2026 14:43

nift4 force-pushed the dtshdints branch from d247497 to 1de2697 Compare April 8, 2026 18:13

nift4 force-pushed the dtshdints branch 3 times, most recently from 5bd1f1a to 4c17803 Compare May 7, 2026 16:53

nift4 changed the title ~~Support DTS-HD in ts extractor~~ Support DTS-HD with packets across PES boundary May 7, 2026

nift4 force-pushed the dtshdints branch from 4c17803 to d36a289 Compare May 7, 2026 16:55

rohitjoins force-pushed the dtshdints branch from d36a289 to 897865b Compare May 8, 2026 01:47

rohitjoins mentioned this pull request May 8, 2026

Fix issue where end of stream wasn't signaled if last pes had length field set #3206

Merged

nift4 and others added 2 commits May 11, 2026 16:18

Support DTS-HD with packets across PES boundary

54c05fd

In DTS-HD streams, it can happen that core packet is in one PES and the extension substream is in another PES. Ensure we can read ahead into next PES in this case. We need to use the old timestamp as extss and core always belong together.

Fix pendingTimeUs leak across PES boundaries

d0432a9

rohitjoins force-pushed the dtshdints branch from 404ee95 to d0432a9 Compare May 11, 2026 15:19

Add test for DTS-HD packets across PES boundaries

fa965e1

rohitjoins added the should merge label May 11, 2026

rohitjoins force-pushed the dtshdints branch from d5b4d46 to 0a91c4c Compare May 11, 2026 18:19

Clear pendingTimeUs when finding new sync

9860b33

Based on internal review feedback to prevent stale timestamps overriding the clock after a sync loss.

rohitjoins force-pushed the dtshdints branch from 0a91c4c to 9860b33 Compare May 12, 2026 16:21

copybara-service Bot merged commit 38f5c46 into androidx:main May 12, 2026
1 check passed

nift4 deleted the dtshdints branch May 12, 2026 21:54

nift4 mentioned this pull request May 15, 2026

Apply DTS-HD and DTS Express detection to MP4 #3229

Merged
