[webkit-dev] are fuzzer tests appropriate layout tests?

Wed Jun 13 12:27:52 PDT 2012

I guess I was saying two slightly different things ...

1) I have a strong bias for individual tests that are fast
2) I have a strong bias for individual tests that are simple, focused,
easy to understand, and predictable. All other things being equal
(which of course they never are), I would prefer 100 different tests
that can fail individually to 1 test that tests 100 different things.

Of course you have to weigh this against coverage and establishing
correctness; I wouldn't want to lose coverage, either.

-- Dirk

On Wed, Jun 13, 2012 at 12:17 PM, Filip Pizlo <fpizlo at apple.com> wrote:
> Are we sure that we want to make this a general rule?
>
> We have two profitable fuzzers in fast/js that I believe deserve to be in LayoutTests and should be run every time you make any JSC change:
>
> LayoutTests/fast/js/dfg-double-vote-fuzz.html
> LayoutTests/fast/js/dfg-poison-fuzz.html
>
> Both are somewhat long-running (I seem to recall some buzz about them being marked either SLOW or TIMEOUT on Chromium) but both have caught lots of bugs in the JSC optimizing JIT.  They generate ~1000 simple programs and eval them, each program differing in the position of some evil operation.  When you get a crash or a failure, it's pretty easy to use them to quickly identify what went wrong since the offending code is nice and tidy.  On the other hand, if it weren't for their use of fuzzing, they would certainly have reduced coverage because the exact shape of a program that would cause a failure depends on the number of registers available and on compiler heuristics, both of which can change with unrelated changes to the JIT or if you switch hardware targets.
>
> So these tests are great for testing things like register allocation, OSR, and type inference.  Even seemingly unrelated changes to JSC, or possibly even JSC bindings, could either cause or reveal bugs that these tests would catch.  Hence it would be bad if they were not part of the LayoutTests.  We would lose coverage while gaining very little in return, since although these tests are on the slower end of the execution time spectrum, the other fast/js tests put together take much longer and probably don't catch as many juicy bugs.  Certainly no other test in LayoutTests/fast/js does nearly as good a job of covering the code paths that deal with register allocation under register pressure, or type inference under evil control flow, in the presence of an operation that would cause an OSR exit.
>
> More broadly, I think this is a question of test economics.  Does this particular fuzzer test catch enough bugs to justify its run-time?  If yes then we should keep it.  And if nobody can recall a time when the test saved them from making a broken commit, or when it helped a bot watcher identify a genuinely broken changeset, then we should probably get rid of it.
>
> -F
>
>
> On Jun 13, 2012, at 11:58 AM, Dirk Pranke wrote:
>
>> I agree that the fuzzer should be used to create dedicated layout
>> tests, but we shouldn't run the fuzzer itself as part of the layout
>> test regression. I would have no objection to it being a separate test
>> step.
>>
>> -- Dirk
>>
>> On Tue, Jun 12, 2012 at 5:17 PM, Ojan Vafai <ojan at chromium.org> wrote:
>>> See https://bugs.webkit.org/show_bug.cgi?id=87772.
>>>
>>> It's great to use a fuzzer in order to find cases where we're broken and
>>> then make reduced layout tests from those. The viewspec-parser tests are
>>> themselves just a fuzzer though. Granted, they are deterministic because they
>>> avoid using an actual random function, but I don't think throwing randomly
>>> generated bits at a parser is appropriate for layout testing. If nothing
>>> else it's very slow.
>>>
>>> These tests regularly time out on the Chromium debug bots and occasionally
>>> time out on the Apple Lion bots. Even on the bots where they don't time out,
>>> they're slow. I don't think it makes sense to spend 1+ minutes running these 5
>>> tests when more targeted reductions could get the same effective coverage
>>> much faster.
>>>
>>> Am I wrong? If not, does anyone object to moving these tests over to
>>> ManualTests or just deleting them entirely?
>>>
>>> Ojan
>>>
>>> _______________________________________________
>>> webkit-dev mailing list
>>> webkit-dev at lists.webkit.org
>>> http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
>>>
>
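
For readers unfamiliar with the pattern Filip describes (many small generated programs, each differing only in where one "evil" operation sits, produced deterministically rather than from a random function), here is a minimal sketch. It is a Python stand-in, not the code of dfg-double-vote-fuzz.html or dfg-poison-fuzz.html, and it does not exercise any JIT. Every name in it (STATEMENTS, make_program, expected, run_all) is hypothetical, and the choice of "evil" operation (switching an integer accumulator to a float) is an assumption made purely for illustration.

```python
# Hypothetical sketch: none of these names come from the WebKit tests.
STATEMENTS = 8  # number of ordinary statements per generated program


def make_program(evil_position):
    """Return source for a function whose 'evil' step sits at evil_position."""
    lines = ["def f(x):", "    acc = x"]
    for i in range(STATEMENTS):
        if i == evil_position:
            # Assumed "evil" operation: switch the accumulator from int to float.
            lines.append("    acc = acc + 0.5")
        lines.append(f"    acc = acc + {i}")
    lines.append("    return acc")
    return "\n".join(lines)


def expected(x):
    """Reference result, computed without running any generated code."""
    return x + sum(range(STATEMENTS)) + 0.5


def run_all():
    """Generate and run one program per evil position; return the failures."""
    failures = []
    for position in range(STATEMENTS):
        source = make_program(position)
        namespace = {}
        exec(source, namespace)  # plays the role of eval in the JS tests
        result = namespace["f"](10)
        if result != expected(10):
            # Keep the offending source so a failure is easy to inspect.
            failures.append((position, source))
    return failures
```

Because the programs are enumerated by position instead of drawn at random, a run is reproducible, and any failure reports the exact small program that triggered it. That is the property the thread credits with making such failures easy to diagnose.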