[webkit-qt] The bot infrastructure and gardening.

Thu May 10 10:32:18 PDT 2012

On Thursday, May 10, 2012 05:01:05 PM Osztrogonac Csaba wrote:
> Hi All,
> 
> Alexis Menard wrote:
> > Hi,

[snip]

> The biggest problem is that people who don't use Ubuntu 11.10 get thousands
> of failing tests because of minor font differences. In this case the best
> solution isn't "I can't reproduce the results, so I won't run layout
> tests anymore." It would be more valuable for the whole project if
> font(config) experts tried to make WebKit, Qt, fontconfig or anything
> else use the same fonts. I don't know if it is possible or not, I don't know
> anything about fonts. Is it possible somehow to bundle a chosen fontconfig
> with Qt or WebKit and use it for regression testing on all distros instead
> of sweating because of different system fontconfig versions?

About the minor differences, I think using --ignore-metrics when running the
tests locally could help a lot with those font issues, depending on what you
are working on.
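
To illustrate the idea behind ignoring metrics: a render tree dump can be
compared on structure only by stripping positions and sizes before diffing.
The following is a minimal sketch of that idea, not the actual
run-webkit-tests implementation. The function names and the regular
expressions are assumptions made for illustration.

```python
import re

# Hypothetical patterns for the metric parts of a render tree dump line.
# They are illustrative assumptions, not copied from the WebKit scripts.
_METRIC_PATTERNS = [
    re.compile(r"at \(-?\d+,-?\d+\) *"),      # position, e.g. "at (0,0) "
    re.compile(r"size -?\d+x-?\d+ *"),        # box size, e.g. "size 784x18"
    re.compile(r"text run width -?\d+: "),    # what is left of a text run header
]


def strip_metrics(dump: str) -> str:
    """Remove position and size information, keeping only the tree structure."""
    stripped = []
    for line in dump.splitlines():
        for pattern in _METRIC_PATTERNS:
            line = pattern.sub("", line)
        stripped.append(line.rstrip())
    return "\n".join(stripped)


def same_structure(expected: str, actual: str) -> bool:
    """True if two dumps differ only in the metrics stripped above."""
    return strip_metrics(expected) == strip_metrics(actual)


expected = (
    "RenderBlock {DIV} at (0,0) size 784x18\n"
    "  RenderText {#text} at (0,0) size 75x18\n"
    '    text run at (0,0) width 75: "Hello"'
)
# Same tree, slightly different font metrics (as on another distro).
actual = (
    "RenderBlock {DIV} at (0,0) size 784x20\n"
    "  RenderText {#text} at (0,0) size 78x20\n"
    '    text run at (0,0) width 78: "Hello"'
)
print(same_structure(expected, actual))
```

A comparison like this hides real layout regressions too, so it only helps
when the change under test is not about geometry.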

> > - We don't have any gardening plan.
> 
> The missing gardening plan is not the only problem. In my
> opinion, introducing contribution rules would be more important.
> For example:
>   - Developers should build the patch and run tests before committing.
>     (Or at least watch the bots after landing and fix or roll out quickly
>     if something goes wrong.)
>   - What should I do if I broke the build / a layout test / an API test?
>   - What should a gardener do if somebody doesn't care about the
>     regression he/she caused?
>   - What should the boss do if somebody regularly and intentionally
>     breaks the rules? :)
> 
> > What could be improved :
> > 
> > - We need to make a gardening plan. We can't be serious about making
> > web browsers/APIs without improving our coverage. I know we don't have
> > many resources, but I think it should be OK to have one person doing it
> > for a week and then rotate. Really, a week of it may be boring, but it
> > comes around rarely (about once every two to three months). This
> > will free Ossy up to do something else, so Ossy can go back to
> > proper coding. I can make that list if people agree. Also it needs to
> > be enforced (maybe reviews could be the exception).
> 
> Gardening isn't so simple that one person can do it alone. One person can be
> enough for fire-fighting: build fixes, updating expected files, reporting
> bugs, fixing some trivial bugs. But one person isn't enough to fix all the
> regressions caused by others who don't take responsibility at all, or
> regressions that occurred in a part of WebKit you don't know anything about.
> Not to mention that there are many complex tests, and it isn't trivial to
> decide whether the new result is correct or not.
> 
> I added our gardening timetable to this wiki:
> https://trac.webkit.org/wiki/QtWebKitBuildBots
> 
> All new volunteers are very welcome. ;-) It would be great if you guys at
> INdT could join, since you are close to the PDT timezone. And handling
> problems while they are fresh is always simpler than waiting for the
> Hungarian morning and trying to solve dozens of new regressions, broken
> builds, assertions, flaky tests, ...
> 
> > - We need to be able to test/stress/break the bot environment. Today
> > the fact that none of us can mess with the bot makes it hard to
> > reproduce bot failures that you can't see on your own machine.
> > While I do understand that Ossy doesn't give us the key to the bot
> > (and we don't want that), we still need to have one to mess around with.
> 
> We have hacked many times in the past to make the layout test system able to
> run more than one bot on the same 8-24 core machine. But that only works
> for a single Linux user. We still have a strict limitation: another user
> trying to run tests on the same machine can kill all the bots, so now only
> one user is allowed. In this case it isn't a good idea for anybody to log in
> and hack on something. When I have to do it, I'm very very careful, but
> sometimes I have broken everything accidentally.
> 
> > So if we are moving to EC2 could we create one instance there that would
> > be the exact clone of the real bot (the only running layout test, WK2)
> > and with a free access for devs? That would allow us to mess
> > around/figure out problems, come up with a list of things to do on the
> > main bot (that Ossy or whatever admin could do) and then we rollback
> > the dev instance to a clone of the main one and it gets free for the
> > next dev.
> 
> EC2 is simpler. You don't have to rollback the instance you hacked on.
> Just stop it and then delete it. Then the next developer runs a new instance.
> Running a new instance takes only several minutes. And you only pay for
> runtime (all started hours) and a little bit for storage, IO and network.
> 
> But there are still some technical problems (policy, account, users, ...)
> that should be solved. We are working hard to find all the necessary things,
> and then I will talk to Simon about the details.
> 
> > This will allow every single of us to have the exact same
> > environment with freedom to test what's going on. I know Linode
> > supports cloning instances, but does EC2 support it? Also Linode allows you
> > to rollback the VM to any state you saved before (so you take the VM,
> > save the state, do your testing, fix, then rollback for the next guy).
> 
> I don't know anything about Linode. But EC2 supports cloning AMI images.
> We are going to maintain a master image, create a new AMI when it is
> necessary, and then replace the bots with a few clicks instead of installing
> all new packages and security updates and updating Qt5 on each machine
> in parallel.
> 
> > Also Ossy, could you describe that move to EC2 a bit? Are we moving all
> > the bots there (I'm not sure the build-only bots bring much
> > value, but whatever)?
> 
> I think moving all bots is impossible and absolutely unnecessary. Now
> we have only one EC2 instance (High-Memory Quadruple Extra Large) with
> 8 cores, and a clean build takes ~15 minutes (~$7200 a year + IO + network
> if you buy a reserved instance for a year).
> 
> Migrating build-only buildbots and EWS bots to Amazon would be absolutely
> unnecessary and a waste of money ... Our build farm has 150-200 cores and
> building for all testers isn't a big deal for it. And for testing, a
> 2-core machine is more than enough and is cheaper (~$850 a year). Now our
> bots are two-in-one builder and tester, but separating them into builder and
> tester is simple. Only debug bots shouldn't be separated, because
> uploading 1.2Gb from Szeged to the Apple master and then downloading it from
> the Apple master to Amazon EC2 would be very slow and very expensive.
> 
> Additionally, our ARM bots and Windows cross-compiler bots aren't ready for
> migrating to Amazon now. Of course, it is possible in the future, but it
> needs more work on them. And migrating perf bots is impossible, they
> need dedicated hardware.
> 
> br,
> Ossy