WTFCrash()

newer
ImageBitmap and OffscreenCanvas...

older
Removal of the Grid Layout runtime...

Michael Catanzaro

7 Mar 2018 7 Mar '18

5:09 a.m.

Hi, After [1], WTFCrash() is used only in debug builds on Darwin. For Darwin release builds, inline assembly is used to trigger a SIGTRAP. I experimented with this today and found it works quite badly on Linux, somehow confusing gdb and clobbering the top frames of the stacktrace, so I think we should leave that unchanged and keep it Darwin-only. So this mail applies only to debug builds on Darwin, and to non-Darwin ports. Now, currently, WTFCrash() does three things: (1) Calls a user-configurable crash hook (2) Print a low-quality backtrace to stderr (3) Crash somehow: - If ASAN is used, with __builtin_trap() (that's SIGILL on Linux x86_64) - Then with *(int *)(uintptr_t)0xbbadbeef = 0, which might fail to crash if 0xbadbeef is a valid address, and is SIGSEGV otherwise - Then with __builtin_trap() if COMPILER(GCC_OR_CLANG) - Then with ((void(*)())0)() otherwise (presumably SIGSEGV or SIGBUS, dunno) This is all rather more complicated than it needs to be. First off, the crash hook is (almost) unused and should simply be removed, see [2]. Next, the low-quality backtrace. Does anyone think this is useful? It's mainly annoying to me, because it's not anywhere near as good as a proper backtrace that shows stack members, it's mangled so function names are unnecessarily-difficult to read, and it takes all of five seconds to get a much nicer one with modern Linux developer tools. If other developers like it, perhaps we could keep it for debug builds only, and skip right to the crashing in release builds? I suppose we could keep printing it always if there is desire to do this, because it has never caused any problems with Linux crash telemetry and doesn't seem to be harming anything, but otherwise my instinct is to simplify. Now, as for how exactly to crash. Current logic, with asan disabled, prefers SIGSEGV, then SIGILL if that fails, then SIGSEGV again. I don't like that WTFCrash() triggers a SIGSEGV. This is not very clean; at least on Linux, it's conventional for assertion failures and other intentional crashes to cause SIGABRT instead of trying to write to 0xbadbeef. SIGILL is also quite unusual. I think I'd be happy if we replace all of WTFCrash() with a single call to abort(). Any objections to this? Is there a special reason for the current convoluted logic? Michael [1] https://bugs.webkit.org/show_bug.cgi?id=153996 [2] https://bugs.webkit.org/show_bug.cgi?id=183369

Show replies by date

Mark Lam

7 Mar 7 Mar

6:36 a.m.

...

On Mar 6, 2018, at 9:09 PM, Michael Catanzaro <mcatanzaro@igalia.com> wrote:

Hi,

After [1], WTFCrash() is used only in debug builds on Darwin. For Darwin release builds, inline assembly is used to trigger a SIGTRAP. I experimented with this today and found it works quite badly on Linux, somehow confusing gdb and clobbering the top frames of the stacktrace, so I think we should leave that unchanged and keep it Darwin-only. So this mail applies only to debug builds on Darwin, and to non-Darwin ports.

Now, currently, WTFCrash() does three things:

(1) Calls a user-configurable crash hook (2) Print a low-quality backtrace to stderr (3) Crash somehow: - If ASAN is used, with __builtin_trap() (that's SIGILL on Linux x86_64) - Then with *(int *)(uintptr_t)0xbbadbeef = 0, which might fail to crash if 0xbadbeef is a valid address, and is SIGSEGV otherwise - Then with __builtin_trap() if COMPILER(GCC_OR_CLANG) - Then with ((void(*)())0)() otherwise (presumably SIGSEGV or SIGBUS, dunno)

This is all rather more complicated than it needs to be.

First off, the crash hook is (almost) unused and should simply be removed, see [2].

Next, the low-quality backtrace. Does anyone think this is useful? It's mainly annoying to me, because it's not anywhere near as good as a proper backtrace that shows stack members, it's mangled so function names are unnecessarily-difficult to read, and it takes all of five seconds to get a much nicer one with modern Linux developer tools. If other developers like it, perhaps we could keep it for debug builds only, and skip right to the crashing in release builds? I suppose we could keep printing it always if there is desire to do this, because it has never caused any problems with Linux crash telemetry and doesn't seem to be harming anything, but otherwise my instinct is to simplify.

On Darwin, we currently only use WTFCrash() on Debug builds (see definition of the CRASH() macro). Feel free to make linux do the same. FWIW, I use this crash trace a lot when debugging crashes when I don’t already have a debugger attached yet. Of course, with a debugger attached, it is of less value.

...

Now, as for how exactly to crash. Current logic, with asan disabled, prefers SIGSEGV, then SIGILL if that fails, then SIGSEGV again. I don't like that WTFCrash() triggers a SIGSEGV. This is not very clean; at least on Linux, it's conventional for assertion failures and other intentional crashes to cause SIGABRT instead of trying to write to 0xbadbeef. SIGILL is also quite unusual. I think I'd be happy if we replace all of WTFCrash() with a single call to abort(). Any objections to this? Is there a special reason for the current convoluted logic?

For Darwin, I think we want assertion failures to generate crash logs for post-mortem analysis. I’m not sure if SIGABRT will give us the crash logs we want. The choice of a SIGSEGV on 0xbadbeef (for debug builds) and WTFBreakpointTrap() e.g. int3 on x86_64 (for release builds), is so that we can discern the crash log for an assertion failure vs any other crash. The reason for release builds using WTFBreakpointTrap() is so that we can get a crash with minimal perturbation to the register state, to help with post-mortem crash analysis. Debug builds are only used during development and internal testing. So, we take the opportunity to get more crash info there (hence, the dumping of the crash trace). The reason that ASan builds use __builtin_trap() is because ASan does not like the memory access of 0xbadbbeef (if I remember correctly). In addition, we also have infrastructure in place that looks for these crash patterns. So, changing to use SIGABRT (even if it generates a crash log on Darwin) would mean additional cost and churn to update that infrastructure, with not much gain to show for it. Hence, I object to the change. Feel free to change the linux implementation of CRASH() to use abort() if that works better for linux folks. BTW, CRASH() is what you want to define/redirect. WTFCrash() is only one implementation of CRASH(). No client should be calling WTFCrash() directly. Mark

...

Michael

[1] https://bugs.webkit.org/show_bug.cgi?id=153996 [2] https://bugs.webkit.org/show_bug.cgi?id=183369

_______________________________________________ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev

Michael Catanzaro

4:37 p.m.

On Wed, Mar 7, 2018 at 12:36 AM, Mark Lam <mark.lam@apple.com> wrote:

...

On Darwin, we currently only use WTFCrash() on Debug builds (see definition of the CRASH() macro). Feel free to make linux do the same. FWIW, I use this crash trace a lot when debugging crashes when I don’t already have a debugger attached yet. Of course, with a debugger attached, it is of less value.

Unfortunately that somehow breaks the top frames in the backtrace on Linux, so we can't do that.

...

The reason for release builds using WTFBreakpointTrap() is so that we can get a crash with minimal perturbation to the register state, to help with post-mortem crash analysis. Debug builds are only used during development and internal testing. So, we take the opportunity to get more crash info there (hence, the dumping of the crash trace). The reason that ASan builds use __builtin_trap() is because ASan does not like the memory access of 0xbadbbeef (if I remember correctly).

In addition, we also have infrastructure in place that looks for these crash patterns. So, changing to use SIGABRT (even if it generates a crash log on Darwin) would mean additional cost and churn to update that infrastructure, with not much gain to show for it. Hence, I object to the change.

OK, interesting. I won't change the behavior for Darwin, then. Our crash telemetry does not care about the crash signal that's used; we'll get a nice backtrace regardless. Linux developers kinda expect SIGABRT to be used for intentional crashes like WTFCrash(), though, so it would be (very) slightly more useful for us to use abort().

...

Feel free to change the linux implementation of CRASH() to use abort() if that works better for linux folks. BTW, CRASH() is what you want to define/redirect. WTFCrash() is only one implementation of CRASH(). No client should be calling WTFCrash() directly.

CRASH() always calls WTFCrash(), except in Darwin release builds. So one option is to change only WTFCrash(). I experimented with changing CRASH() to call abort() directly and decided it doesn't really matter either way. The advantage of changing CRASH() is that there is one fewer uninteresting frame at the top of the backtrace, since WTFCrash will not be present. The cost is it will be very slightly harder to notice that CRASH() was called, since WTFCrash() won't be present in the backtrace. Michael

Michael Catanzaro

4:45 p.m.

On Wed, Mar 7, 2018 at 10:37 AM, Michael Catanzaro <mcatanzaro@igalia.com> wrote:

...

Unfortunately that somehow breaks the top frames in the backtrace on Linux, so we can't do that.

I mean, we can't do it in exactly the same way that Darwin does. Yes, of course we can change CRASH() to call abort() or do whatever we want.

Adrian Perez de Castro

9 Mar 9 Mar

11:34 a.m.

Hello, Just a quick note below about the backtraces... On Tue, 06 Mar 2018 23:09:18 -0600, Michael Catanzaro <mcatanzaro@igalia.com> wrote:

...

After [1], WTFCrash() is used only in debug builds on Darwin. For Darwin release builds, inline assembly is used to trigger a SIGTRAP. I experimented with this today and found it works quite badly on Linux, somehow confusing gdb and clobbering the top frames of the stacktrace, so I think we should leave that unchanged and keep it Darwin-only. So this mail applies only to debug builds on Darwin, and to non-Darwin ports.

Now, currently, WTFCrash() does three things:

(1) Calls a user-configurable crash hook (2) Print a low-quality backtrace to stderr (3) Crash somehow: - If ASAN is used, with __builtin_trap() (that's SIGILL on Linux x86_64) - Then with *(int *)(uintptr_t)0xbbadbeef = 0, which might fail to crash if 0xbadbeef is a valid address, and is SIGSEGV otherwise - Then with __builtin_trap() if COMPILER(GCC_OR_CLANG) - Then with ((void(*)())0)() otherwise (presumably SIGSEGV or SIGBUS, dunno)

This is all rather more complicated than it needs to be.

First off, the crash hook is (almost) unused and should simply be removed, see [2].

Next, the low-quality backtrace. Does anyone think this is useful? [...]

In some cases I have worked with embedded devices where it was extremely difficult to even run “gdbserver”, mainly due to memory constraints, and having the limited backtrace from WTFCrash() can save the day in those cases. I think it's perfectly fine to skim over and ignore the backtraces printed by WTFCrash() if one has a better way of debugging.

...

[...] It's mainly annoying to me, [...]

This is not a good reason to remove them. Just don't use them and skip reading them and you are done ;-)

...

[...] because it's not anywhere near as good as a proper backtrace that shows stack members, [...]

True, but sometimes it's all we can have (see above), which is always better than nothing.

...

[...] it's mangled so function names are unnecessarily-difficult to read,

Pipe the output to “c++filt” and you will get back demangled names. Nifty little tool :D

...

[...] and it takes all of five seconds to get a much nicer one with modern Linux developer tools.

That is, *IF* one has access to modern developers tools.

...

[...] If other developers like it, perhaps we could keep it for debug builds only, and skip right to the crashing in release builds? I suppose we could keep printing it always if there is desire to do this, because it has never caused any problems with Linux crash telemetry and doesn't seem to be harming anything, but otherwise my instinct is to simplify.

I would enable printing backtraces to -DENABLE_DEVELOPER_MODE=ON instead, because they can be useful in release (well, or RelWithDebInfo) builds. Cheers, -- Adrián 🎩

Michael Catanzaro

6:59 p.m.

On Fri, Mar 9, 2018 at 5:34 AM, Adrian Perez de Castro <aperez@igalia.com> wrote:

...

I would enable printing backtraces to -DENABLE_DEVELOPER_MODE=ON instead, because they can be useful in release (well, or RelWithDebInfo) builds.

OK, let's do this, then. I don't want to make it impossible to solve bugs on small embedded systems. I would really hate to work in an environment where you can't get a proper backtrace... that sounds awful. :(

Adrian Perez de Castro

13 Mar 13 Mar

12:21 a.m.

On Fri, 09 Mar 2018 12:59:01 -0600, Michael Catanzaro <mcatanzaro@igalia.com> wrote:

...

On Fri, Mar 9, 2018 at 5:34 AM, Adrian Perez de Castro <aperez@igalia.com> wrote:

...
I would enable printing backtraces to -DENABLE_DEVELOPER_MODE=ON instead, because they can be useful in release (well, or RelWithDebInfo) builds.

OK, let's do this, then. I don't want to make it impossible to solve bugs on small embedded systems.

Great, thanks!

...

I would really hate to work in an environment where you can't get a proper backtrace... that sounds awful. :(

It is painful, yep ;-) -- Adrián 🎩

2800

Age (days ago)

2806

Last active (days ago)

List overview

Download

6 comments

3 participants

participants (3)

Adrian Perez de Castro
Mark Lam
Michael Catanzaro

WTFCrash()

Michael Catanzaro

Mark Lam

Michael Catanzaro

Michael Catanzaro

Adrian Perez de Castro

Michael Catanzaro

Adrian Perez de Castro

tags

participants (3)