[Webkit-unassigned] [Bug 227715] New: [webkitcorepy] run-webkit-tests may hang with python2 after r271683
bugzilla-daemon at webkit.org
Tue Jul 6 11:24:26 PDT 2021
https://bugs.webkit.org/show_bug.cgi?id=227715
Bug ID: 227715
Summary: [webkitcorepy] run-webkit-tests may hang with python2
after r271683
Product: WebKit
Version: WebKit Nightly Build
Hardware: Unspecified
OS: Unspecified
Status: NEW
Severity: Normal
Priority: P2
Component: Tools / Tests
Assignee: webkit-unassigned at lists.webkit.org
Reporter: clopez at igalia.com
CC: bugs-noreply at webkitgtk.org, jbedard at apple.com
While checking the GTK WK2 layout-test EWS, I noticed that most runs were failing to complete the layout-test step and were hanging at the end.
Example: https://ews-build.webkit-uat.org/#/builders/34/builds/34722
09:54:19.733 3 worker/5 imported/w3c/web-platform-tests/service-workers/service-worker/fetch-cors-xhr.https.html passed
09:54:19.748 3 worker/15 imported/w3c/web-platform-tests/streams/piping/flow-control.any.worker.html passed
09:54:19.778 3 worker/6 imported/w3c/web-platform-tests/streams/transform-streams/lipfuzz.any.worker.html passed
09:54:19.780 3 worker/2 "ruby -I /app/webkit/Websites/bugs.webkit.org/PrettyPatch /app/webkit/Websites/bugs.webkit.org/PrettyPatch/prettify.rb /home/ews/worker/GTK-WK2-Tests-EWS/build/layout-test-results/imported/w3c/web-platform-tests/streams/readable-streams/general.any-diff.txt" took 0.18s
09:54:19.780 3 [42759/54367] imported/w3c/web-platform-tests/streams/readable-streams/general.any.html failed unexpectedly (text diff)
09:54:20.079 3 worker/2 imported/w3c/web-platform-tests/streams/readable-streams/general.any.html failed:
09:54:20.079 3 worker/2 text diff
09:54:20.128 3 Some workers failed to gracefully shut down, but in-flight exception taking precedence
command timed out: 1200 seconds without output running ['python3', 'Tools/Scripts/run-webkit-tests', '--no-build', '--no-show-results', '--no-new-test-results', '--clobber-old-results', '--release', '--gtk', '--results-directory', 'layout-test-results', '--debug-rwt-logging', '--exit-after-n-failures', '30', '--skip-failing-tests'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=1895.925172
The issue is caused by a combination of two things:
- The GTK workers are still using python2 to run the layout tests. This is because the shebang in the scripts points to "python", which defaults to python2. Even if you invoke the script with "python3", as in "python3 Tools/Scripts/run-webkit-tests", it actually ends up running with python2 because of the flatpak-sdk and how the processes are re-executed inside the container. We need to change the shebangs here.
- A race condition occurs when RWT is executed with python2, the layout tests are run with "--exit-after-n-failures N", and more than N unexpected failures happen. When the limit is exceeded, a TestRunInterruptedException() is raised from the worker, and the exit handler for the task queue then terminates the workers via os.kill(). After that the queue of workers is closed, which causes a hang with python2 while waiting for the threads to be joined. This issue is not reproducible with python3.
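The hang in the second point comes from waiting without limit on workers that may never finish. As a minimal sketch of a bounded wait (this is not the webkitcorepy code and not a proposed patch; join_all_with_deadline is a hypothetical helper name), the snippet below joins a set of workers under one overall deadline and reports the ones that are still alive, so the caller can log or force-kill them instead of blocking forever. It relies only on join(timeout) and is_alive(), which both threading.Thread and multiprocessing.Process provide.

```python
import threading
import time


def join_all_with_deadline(workers, timeout):
    """Join every worker, spending at most `timeout` seconds in total.

    Hypothetical helper for illustration only. `workers` may be
    threading.Thread or multiprocessing.Process objects. Returns the
    workers that are still alive once the deadline has passed.
    """
    deadline = time.time() + timeout
    for worker in workers:
        # Give each worker only the time left before the shared deadline.
        remaining = max(0.0, deadline - time.time())
        worker.join(remaining)
    return [worker for worker in workers if worker.is_alive()]


if __name__ == '__main__':
    release = threading.Event()
    quick = threading.Thread(target=lambda: None)
    stuck = threading.Thread(target=release.wait)  # blocks until released
    stuck.daemon = True
    for thread in (quick, stuck):
        thread.start()
    leftover = join_all_with_deadline([quick, stuck], timeout=0.5)
    print('%d worker(s) did not shut down in time' % len(leftover))
    release.set()
```

A bounded join does not remove the underlying race; it only turns an indefinite hang into a reportable condition.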
A way to reproduce this that worked reliably for me is:
# Invalidate all the expected results for the css3 tests (truncate them to 10 chars)
$ find LayoutTests/css3 -name \*expected.txt -exec truncate -s10 '{}' \;
# Run the tests with python2
$ python2 Tools/Scripts/run-webkit-tests --no-build --release --gtk --debug-rwt-logging --exit-after-n-failures 100 fast css3
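For environments without the find and truncate utilities, the invalidation step above could be approximated in Python. This is only a sketch: truncate_expected_results is a hypothetical helper, not part of the WebKit tooling, and it assumes the same "*expected.txt" naming and 10-byte size used in the shell command.

```python
import os


def truncate_expected_results(root, size=10):
    """Truncate every file under `root` whose name ends in 'expected.txt'.

    Hypothetical helper mirroring:
        find ROOT -name '*expected.txt' -exec truncate -s10 '{}' ;
    Returns the sorted list of paths that were truncated.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith('expected.txt'):
                path = os.path.join(dirpath, name)
                # 'r+b' opens the existing file without clearing it first.
                with open(path, 'r+b') as handle:
                    handle.truncate(size)
                changed.append(path)
    return sorted(changed)
```

Calling truncate_expected_results(os.path.join('LayoutTests', 'css3')) from the top of a WebKit checkout would modify the expected results in place, so it should only be run on a checkout whose changes can be reverted afterwards.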