[webkit-dev] [MSE] Range ends inclusion is deleting wanted MediaSample's

Jer Noble jer.noble at apple.com
Thu Nov 9 14:23:27 PST 2017


Hi Alicia,

It should be possible to make a “mock” testcase which doesn’t rely on any particular media engine and which can demonstrate this bug.

Continued:

> On Nov 9, 2017, at 2:03 PM, Alicia Boya García <aboya at igalia.com> wrote:
> 
> Hi, WebKittens!
> 
> In the YouTube Media Source Extensions conformance tests there is one
> called 36.AppendOpusAudioOutOfOrder where two audio media segments are
> appended out of order to a SourceBuffer: First, a segment with the PTS
> ranges [10, 20) is added. Then, another one with [0, 10) is added.
> 
> (I have rounded the actual timestamps to near integers for easier
> understanding).
> 
> Almost at the very end of the process the buffered ranges are like this:
> 
> [ 0,  9)
> [10, 20)
> 
> At this point, SourceBuffer::sourceBufferPrivateDidReceiveSample() is
> called with the last audio frame, that has PTS=9 and DUR=1.
> 
> The execution reaches this conditional block:
> 
> ```
> if (trackBuffer.highestPresentationTimestamp.isValid() &&
> trackBuffer.highestPresentationTimestamp <= presentationTimestamp) {
> ```
> 
> trackBuffer.highestPresentationTimestamp contains the highest PTS so far
> within the current segment. The condition is true (9 <= 9) as expected
> for sequentially appended frames.
> 
> Inside there is this block of code:
> 
> ```
> MediaTime highestBufferedTime = trackBuffer.buffered.maximumBufferedTime();
> 
> PresentationOrderSampleMap::iterator_range range;
> if (highestBufferedTime - trackBuffer.highestPresentationTimestamp <
> trackBuffer.lastFrameDuration)
>    range =
> trackBuffer.samples.presentationOrder().findSamplesWithinPresentationRangeFromEnd(trackBuffer.highestPresentationTimestamp,
> frameEndTimestamp);
> else
>    range =
> trackBuffer.samples.presentationOrder().findSamplesWithinPresentationRange(trackBuffer.highestPresentationTimestamp,
> frameEndTimestamp);
> 
> if (range.first != trackBuffer.samples.presentationOrder().end())
>    erasedSamples.addRange(range.first, range.second);
> ```
> 
> The first if block there is an optimization, it decides whether to do a
> binary search in the entire collection of MediaSample's or do a linear
> search starting with the MediaSample with the highest PTS (which is
> faster when appends always occur at the end), but the result is the same
> in both cases.
> 
> presentationOrder() is a std::map<MediaTime, MediaSample>.
> 
> findSamplesWithinPresentationRange(beginTime, endTime) and its *FromEnd
> counterpart both return a pair of STL-style iterators which cover a
> range of MediaSample objects whose presentation timestamps sit in the
> range (beginTime, endTime] (beginTime is exclusive, endTime is inclusive).
> 
> Then, it marks those MediaSample objects (frames) for deletion.
> 
> My question is... shouldn't the range ends inclusivity be the other way
> around i.e. [beginTime, endTime)?

beginTime was intended to be exclusive so that, given a [0, 9] range, searching for (9, 10] wouldn’t match the last sample in the [0, 9] range.

But now that you mention it, perhaps the real problem is that both beginTime _and_ endTime should be exclusive.

> As I understand it, the point of that part of the algorithm is to delete
> old samples that are -- even partially -- in the presentation time range
> of the newly appended one, but using (beginTime, endTime] fails to
> accomplish that in two cases:
> 
> a) If there is an existing MediaSample with PTS=9 and DUR=1 it will not
> be removed because beginTime (=9) is exclusive.

Eh, not really. The search space is the entire duration of the sample, so an exclusive beginTime of 9 _should_ match PTS=9, DUR=1 because 9 < presentationEndTime=10.  I’m pretty sure this is correct, but a mock test case should clear this up pretty quickly.

> b) If there is an existing MediaSample with PTS=10 and DUR=1 it WILL be
> removed even though there is no overlap with the sample being appended
> (PTS=9 DUR=1) because endTime (=10) is inclusive. This is exactly what
> is making the YTTV test fail in my case.

This may be true because endTime is inclusive (and shouldn’t be). 

>    Before:
> 
>        [ 0,  9)
>        [10, 20)
> 
>    Expected result after adding [9, 10):
> 
>        [0, 20)
> 
>    Actual result in WebKit:
> 
>        [ 0, 10)
>        [11, 20)

It looks like findSamplesWithinPresentationRange() is only ever used in that specific part of sourceBufferPrivateDidReceiveSample(), so this should be very safe to change.

-Jer
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 3381 bytes
Desc: not available
URL: <http://lists.webkit.org/pipermail/webkit-dev/attachments/20171109/ed5e9bdb/attachment.bin>


More information about the webkit-dev mailing list