[webkit-dev] Simplifying syntax in test_expectations.txt (bug 86691)

Fri May 18 00:05:26 PDT 2012

On May 17, 2012, at 5:39 PM, Dirk Pranke <dpranke at chromium.org> wrote:

> I probably polarized things by saying that your input was less
> valuable than those of people who were long-time users. I did not mean
> to offend, and I'm sorry. I certainly didn't mean to imply that I was
> not listening or not open to feedback from anyone, and I hope I've
> made that clear. That does not mean I will agree to any proposal
> without argument, obviously.

Apology accepted. I, in turn, am sorry for getting overly grumpy at you.

On May 17, 2012, at 4:57 PM, Ojan Vafai <ojan at chromium.org> wrote:

> 
> All the proposals that are not just bikeshedding we seem to agree on and will happen (the half-dozen things I listed above).

I think it's great that we came up with some changes that seem useful and which no one is objecting to. We should certainly do those ASAP! Unfortunately they are buried in the middle of a megathread. Maybe it would be worth forking off a new thread just to notify the folks who may not be following this one. That thread would identify the changes and give examples of the new syntax.

We can discuss further changes separately, with that set as a baseline.

> Sure TEXT, IMAGE, etc are not very clear, but no one has actually proposed something better.

I believe proposals were made which were more clear, but were rejected for other reasons (mainly not being as familiar to people used to the format afaict). Let me add another. I would propose the following replacements for current states:

neither TEXT nor IMAGE
   ==> (continue to say nothing)

TEXT or TEXT+IMAGE
   ==> FAIL
FAIL would mean the test fails. For text-only tests, it means a text failure. For render tree tests, it also means a text failure (who cares if the pixel test somehow accidentally passes at that point; that's not a meaningfully distinct state). For ref tests, it would mean a reference failure.

IMAGE
   ==> PIXELFAIL or PIXELONLYFAIL
This would apply only to render tree tests, and only when the pixel test fails but the text test does not. We have historically called this mode "pixel tests", not "image tests"; let's be consistent. Not applicable to text-only tests or reference tests. Mutually exclusive with FAIL.

TEXT IMAGE
   ==> FLAKY
If one of the text test or the image test will fail, but maybe not both, the test is nondeterministic. It should be marked as flaky, and its results should not affect the greenness of the bots so long as it does not hang or crash. As far as I can tell from the bots, we don't currently have a FLAKY result expectation; you are supposed to indicate flakiness by listing all possible kinds of failures, but that seems unhelpful. Also, a flaky test that happens to pass entirely on multiple runs in a row will turn the bots red, which seems bad. Let's just have a FLAKY state instead, where we don't get upset whether the test passes or fails.
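
To make the proposed mapping concrete, here is a minimal Python sketch. It is only an illustration, not the actual test_expectations.txt parser: the names PROPOSED, translate and turns_bot_red are hypothetical, PIXELFAIL is picked arbitrarily over PIXELONLYFAIL, and the result names ("PASS", "CRASH", "TIMEOUT") and the rule that any unexpected result on a non-flaky test turns the bot red are assumptions.

```python
# Hypothetical sketch of the proposed renaming. Not the real parser.
# Keys are the sets of current keywords on a line; values are the
# proposed replacement (None means "continue to say nothing").
PROPOSED = {
    frozenset(): None,
    frozenset({"TEXT"}): "FAIL",
    frozenset({"TEXT+IMAGE"}): "FAIL",
    frozenset({"IMAGE"}): "PIXELFAIL",  # or PIXELONLYFAIL
    frozenset({"TEXT", "IMAGE"}): "FLAKY",
}


def translate(old_keywords):
    """Return the proposed keyword for a collection of current keywords."""
    key = frozenset(old_keywords)
    if key not in PROPOSED:
        # Combinations the proposal does not discuss are left undefined.
        raise ValueError("not covered by the proposal: %s" % sorted(key))
    return PROPOSED[key]


def turns_bot_red(expectation, result):
    """Assumed bot policy: does this result count against greenness?

    expectation is None, "FAIL", "PIXELFAIL" or "FLAKY".
    result is "PASS", "FAIL", "PIXELFAIL", "CRASH" or "TIMEOUT"
    (assumed names; "TIMEOUT" stands in for a hang).
    """
    if result in ("CRASH", "TIMEOUT"):
        return True  # hangs and crashes are never excused in this sketch
    if expectation == "FLAKY":
        return False  # passing or failing are both acceptable
    expected = expectation or "PASS"
    return result != expected
```

With this sketch, translate(["TEXT", "IMAGE"]) gives "FLAKY", and turns_bot_red("FLAKY", "PASS") is False, so a flaky test that happens to pass several times in a row no longer counts against the bots.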