There is a significant practical problem with "turn the tree red and work with someone to rebaseline the tests": it takes multiple hours for some bots to build and test a given patch. That means that at any moment you may have tens, and in some cases hundreds, of failing tests associated with one changelist that you need to track on the bots, more failing tests associated with a different changelist, and so on.
How do you propose to track which baselines have been computed by which bots for which changelists? In the past I have seen failing tests because baselines were checked in too soon and some bot had not generated new baselines. Things are even worse when a change goes in late in the day. How will the pending rebaselines be handed off to the new gardener?
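To make that bookkeeping concrete, here is a minimal sketch of the state a gardener would have to carry around (and hand off intact to the next gardener). The bot names and the PendingRebaseline structure are invented for illustration; nothing here corresponds to an existing WebKit tool.

    from dataclasses import dataclass, field

    # Invented bot names, purely for illustration.
    ALL_BOTS = {"mac-release", "mac-debug", "win-release", "win-debug", "linux-release"}

    @dataclass
    class PendingRebaseline:
        """Which bots still owe new baselines for one in-flight changelist."""
        changelist: str
        bots_missing: set = field(default_factory=lambda: set(ALL_BOTS))

        def record_baseline(self, bot: str) -> None:
            # Mark that this bot has cycled the change and its baselines were collected.
            self.bots_missing.discard(bot)

        def ready_to_check_in(self) -> bool:
            # Checking baselines in before every bot has cycled is exactly
            # the "checked in too soon" failure mode described above.
            return not self.bots_missing

    # A gardener juggling several changelists ends up maintaining a table like this.
    pending = {
        "r98765": PendingRebaseline("r98765"),
        "r98790": PendingRebaseline("r98790"),
    }
    pending["r98765"].record_baseline("mac-release")
    print({cl: sorted(p.bots_missing) for cl, p in pending.items()})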
In either case, your responsibility for the patch does not end with the patch landing in the tree. There may be regressions from your change or additional feedback from reviewers after the patch has landed. You can watch the tree at build.webkit.org to make sure your patch builds and passes tests on all platforms. It is your responsibility to be available should regressions arise and to respond to additional feedback that happens after a check-in.
Changes should succeed on all platforms, but it can be difficult to test on every platform WebKit supports. Be certain that your change does not introduce new test failures on the high-traffic Mac or Windows ports by comparing the list of failing tests before and after your change. Your change must at least compile on all platforms.
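One simple way to do that comparison is sketched below. It assumes you have saved the failing-test lists (from the bots or from your local test run) to two plain-text files, one test path per line; the file names are placeholders, not part of any WebKit tooling.

    def load_failures(path: str) -> set[str]:
        # One failing test path per line; blank lines are ignored.
        with open(path) as f:
            return {line.strip() for line in f if line.strip()}

    before = load_failures("failures-before.txt")
    after = load_failures("failures-after.txt")

    new_failures = sorted(after - before)        # regressions introduced by the change
    no_longer_failing = sorted(before - after)   # tests the change happened to fix

    print("New failures:")
    for test in new_failures:
        print("  " + test)
    print("No longer failing:")
    for test in no_longer_failing:
        print("  " + test)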
As the Chromium owner of tests that are frequently broken by third-party contributors (they do their best to help us, but cannot generate all 15 or so Chromium expectations), I would much rather have a bug filed against me and a line in the expectations file. At least then I have visibility into what's happening and a chance to verify the results. In a recent pass to clean up expectations we found multiple cases where the wrong result had been committed because the person who committed it did not realize it was wrong. While you might say "fix the test to make it obvious", it is not obvious how to do that for all tests.
Why not simply attach an owner and a resolution date to each expectation? The real problem right now is a lack of accountability and of any mechanism to remind people that they have left expectations hanging.
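A lightweight version of that could be an owner/expiry annotation on each line plus a script that nags once the date passes. The OWNER/EXPIRES comment syntax below is invented for illustration, the entry lines only approximate the Chromium-style expectations format, and the script is a hypothetical sketch, not an existing tool.

    import re
    from datetime import date

    # Hypothetical annotated entries; the trailing OWNER/EXPIRES comment is an
    # invented convention, not part of the real expectations syntax.
    EXPECTATIONS = """\
    BUGCR12345 WIN LINUX : fast/dom/example-test.html = IMAGE+TEXT  // OWNER=someone@example.org EXPIRES=2011-11-01
    BUGWK67890 MAC : fast/forms/other-test.html = TEXT  // OWNER=someone-else@example.org EXPIRES=2011-12-15
    """

    ANNOTATION = re.compile(r"OWNER=(?P<owner>\S+)\s+EXPIRES=(?P<expires>\d{4}-\d{2}-\d{2})")

    def overdue(expectations: str, today: date):
        """Return (owner, entry) pairs for expectations past their resolution date."""
        stale = []
        for line in expectations.splitlines():
            m = ANNOTATION.search(line)
            if m and date.fromisoformat(m.group("expires")) < today:
                stale.append((m.group("owner"), line.split("//")[0].strip()))
        return stale

    for owner, entry in overdue(EXPECTATIONS, date.today()):
        print("Remind %s about: %s" % (owner, entry))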