[webkit-dev] Static source code analysis

Sun Jun 8 13:12:54 PDT 2008

Hi,

Darin Adler wrote:
> On May 29, 2008, at 6:21 AM, Ferenc, Rudolf wrote:
> 
>> - Calculating source code metrics (e.g. highlighting the too complex 
>> functions)
> 
> I'm not sure this would be valuable. However, there's no harm in sending 
> the list of those functions to this mailing list. I just wouldn't expect 
> any concrete results from posting such a list.

In our experience, metrics are most useful when they are continuously tracked 
and monitored. This means that the source code has to be measured regularly and 
the results stored in a database. Then, when the metric values change in the 
wrong direction, the developers can be alerted.
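
To illustrate the monitoring idea, here is a rough sketch only (not our actual 
tool). The function name, the metric values and the tolerance parameter are all 
hypothetical, and the two snapshots stand in for results that would normally be 
read from the database:

```python
def find_regressions(previous, current, tolerance=0):
    """Return {name: (old, new)} for every metric value that got worse.

    Assumption: a higher value is worse (e.g. a complexity metric).
    Functions that are new in `current` have no baseline and are skipped.
    """
    regressions = {}
    for name, new_value in current.items():
        old_value = previous.get(name)
        if old_value is not None and new_value > old_value + tolerance:
            regressions[name] = (old_value, new_value)
    return regressions


# Hypothetical complexity values from two consecutive measurements.
previous = {"parseToken": 12, "layoutBlock": 30}
current = {"parseToken": 12, "layoutBlock": 37, "paintText": 5}

for name, (old, new) in find_regressions(previous, current).items():
    print(f"ALERT: {name} went from {old} to {new}")
```

A real setup would run this after every measurement and notify the developers 
who touched the affected functions.
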
In the past we performed studies on the source code of Mozilla and on Bugzilla 
and showed that, e.g., the CBO metric (Coupling Between Object classes) can 
predict bugs in classes with an accuracy of 70%. In other words, classes which 
are highly coupled to other classes are more fault-prone than those with a lower 
coupling value.
We published these results in the IEEE Transactions on Software Engineering,
http://doi.ieeecomputersociety.org/10.1109/TSE.2005.112
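
For readers not familiar with CBO, the following is a simplified sketch of how 
such a value could be computed. It assumes that coupling is counted in both 
directions (a class uses another one or is used by it) and that the "uses" 
relation has already been extracted from the source. The class names are made up 
for the example and are not measurement results:

```python
from collections import defaultdict


def cbo(uses):
    """Map each class to the number of distinct other classes it is coupled to.

    `uses` maps a class name to the set of class names it references.
    Coupling is counted in both directions; self-references are ignored.
    """
    coupled = defaultdict(set)
    for cls, targets in uses.items():
        coupled[cls]  # make sure classes without any coupling still appear
        for target in targets:
            if target != cls:
                coupled[cls].add(target)
                coupled[target].add(cls)
    return {cls: len(others) for cls, others in coupled.items()}


# Hypothetical "uses" relation between four classes.
uses = {
    "Frame": {"Document", "Page"},
    "Document": {"Node"},
    "Page": set(),
    "Node": {"Document"},
}

print(cbo(uses))
```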

We are willing to perform these measurements for you, if you wish.

>> - Checking for rule violations (e.g. the reported issues).
> 
> So far, the reported issues haven't been critical ones. But bug reports 
> with specific issues discovered by the tool are still nice to have.

We're glad to hear this. Our experience shows that very simple rules can have a 
great effect on source code quality. E.g. one of the simplest rule checks (write 
only one instruction per line) proved to be one of the most effective rules in 
predicting bugs.
Again, eliminating all such rule violations is usually not feasible in practice 
(although we will do our best to send you patches for this purpose), but it is 
important to take care not to introduce new violations. This, too, can be 
achieved by continuously monitoring the source code.
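
To give an idea of how simple such a rule check can be, here is a deliberately 
naive sketch of the "one instruction per line" rule for C-like code. It is only 
a heuristic written for this mail (it counts semicolons and knows nothing about 
block comments or preprocessor macros), not the checker we actually use, and the 
function name is hypothetical:

```python
import re


def multi_statement_lines(source):
    """Return 1-based numbers of lines that seem to hold several statements."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        code = line.split("//", 1)[0]                  # drop trailing // comments
        code = re.sub(r'"(\\.|[^"\\])*"', '""', code)  # blank out string literals
        if re.match(r"\s*for\b", code):                # for (a; b; c) is allowed
            continue
        if code.count(";") > 1:
            flagged.append(number)
    return flagged


sample = (
    'int a = 0; int b = 1;\n'
    'for (int i = 0; i < n; ++i)\n'
    '    total += i;\n'
    'puts("a;b");\n'
)

print(multi_statement_lines(sample))
```

Comparing the flagged lines of two consecutive revisions is enough to notice 
newly introduced violations without having to fix the old ones first.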

>> - Checking for design flaws (so called "bad code smells").
> 
> I'm highly skeptical about the value of this. But again, I can't see the 
> harm in showing the results to the folks working on the project.

By design flaws (so-called "bad code smells"), I meant the kinds of problems 
Martin Fowler described in his book "Refactoring" 
(http://en.wikipedia.org/wiki/Code_smell, 
http://en.wikipedia.org/wiki/Martin_Fowler).
These kinds of problems usually do not indicate bugs, but they certainly 
decrease the maintainability of the source code.

>> - Detecting code duplications (copy/paste programming).
> 
> This might be helpful. Hard to say, really.

It certainly is. Just imagine the situation when somebody in a hurry copies 
source code similar to what he needs now. Later it turns out that the copied 
code contains a bug. Now that bug exists in two places, and both have to be 
found and fixed...
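
As a small illustration of the idea, the sketch below reports runs of identical 
consecutive lines that occur more than once. It is a toy line-based approach 
with a hypothetical window size, shown only to make the idea concrete; it is not 
the technique our tool uses and it would miss copies with renamed variables:

```python
from collections import defaultdict


def find_duplicates(source, window=3):
    """Return lists of 1-based start lines of repeated `window`-line chunks.

    Lines are compared after stripping whitespace; blank lines are ignored.
    """
    lines = [(number, text.strip())
             for number, text in enumerate(source.splitlines(), start=1)
             if text.strip()]
    seen = defaultdict(list)
    for i in range(len(lines) - window + 1):
        chunk = tuple(text for _, text in lines[i:i + window])
        seen[chunk].append(lines[i][0])
    return [starts for starts in seen.values() if len(starts) > 1]


sample = "\n".join([
    "a = read();",
    "if (a < 0)",
    "    a = 0;",
    "use(a);",
    "a = read();",
    "if (a < 0)",
    "    a = 0;",
])

print(find_duplicates(sample))
```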

>> Would you be interested in these results?
> 
> I think it's fine for you to post the results. But please keep your 
> expectations low -- it's quite likely these won't show us flaws that are 
> more important than the ones already known and reflected in 
> bugs.webkit.org bug reports.

Of course, functional bugs are the most important ones and those have to be 
addressed first. We'll do our best to send you patches to fix the other 
problems described above.

>> Addressing these issues can significantly improve the quality of the 
>> source code regarding testability, complexity and maintainability.
> 
> A strong claim!

Yes it is, but it is not baseless. E.g. we created and are maintaining the 
official code size benchmark of GCC called CSiBE 
(http://gcc.gnu.org/benchmarks/, http://www.inf.u-szeged.hu/csibe/). Introducing 
this benchmark to GCC resulted in a rapid decrease in the size of the generated 
object code (desperately needed by developers of embedded software).
The key issue here was again: continuous measurement and monitoring. You can 
only control the quality if you have the appropriate measurement data available.

We will collect the current measurement results and make them available to you 
for download on our university server.

Best regards,
   Rudi.