[webkit-dev] [webkit-reviewers] usage of auto

Tue Jan 10 22:17:03 PST 2017

Brady asked:

> Have you identified any situation where explicitly calling out the type in a range-based for loop has been better than using the proper form of auto?

> Have you identified a situation where explicitly calling out a nasty templated type, like in my example, added to readability over using auto?

Darin asked:

> I’d love to see examples where using auto substantially hurts readability so we could debate them.

In many places, I agree that auto is better.  But there are a bunch of algorithms in the compiler and GC where knowing the type helps me read the code more quickly.  Here are a few examples.  (1) is a loop, (2) is loop-like, and (3) is a nasty templated type.

1) B3 compiler code cloning loop.

I find that this code is easier to read with types:

                Value* clone = m_proc.clone(value);
                for (Value*& child : clone->children()) {
                    if (Value* newChild = mappings[i].get(child))
                        child = newChild;
                }
                if (value->type() != Void)
                    mappings[i].add(value, clone);

                cases[i]->append(clone);
                if (value->type() != Void)
                    cases[i]->appendNew<UpsilonValue>(m_proc, value->origin(), clone, value);

Here's another code cloning loop I found - sort of the same basic algorithm:

            for (Value* value : *tail) {
                Value* clone = m_proc.clone(value);
                for (Value*& child : clone->children()) {
                    if (Value* replacement = map.get(child))
                        child = replacement;
                }
                if (value->type() != Void)
                    map.add(value, clone);
                block->append(clone);
            }

When reading this code, it's pretty important to know that value, clone, child, newChild, and replacement are all Values.  As soon as you know this piece of information the algorithm - its purpose and how it functions - becomes clear.  You can infer this information if you know where to look - m_proc.clone() and clone->children() are give-aways - but this isn't as immediately obvious as the use of the type name.

If someone refactored this code to use auto, it would be harder for me to read this code.  I would spend more time reading it than I would have spent if it spelled out the type.  I like seeing the type spelled out because that's how I recognize if the loop is over blocks, values, or something else.  I like to spell out the type even when it's super obvious:

        for (BasicBlock* block : m_proc) {
            for (Value* value : *block) {
                if (value->opcode() == Phi && candidates.contains(block))
                    valuesToDemote.add(value);
                for (Value* child : value->children()) {
                    if (child->owner != block && candidates.contains(child->owner))
                        valuesToDemote.add(child);
                }
            }
        }

Sticking to this format for compiler loops means that I spend less time reading the code, because I can recognize important patterns at a glance.

2) Various GC loops

The GC usually loops using lambdas, but the same question comes into play: is the value's type auto or is it spelled out?

    forEachFreeCell(
        freeList,
        [&] (HeapCell* cell) {
            if (false)
                dataLog("Free cell: ", RawPointer(cell), "\n");
            if (m_attributes.destruction == NeedsDestruction)
                cell->zap();
            clearNewlyAllocated(cell);
        });

It's useful to know that 'cell' is a HeapCell, not a FreeCell or a JSCell.  I believe that this code will only compile if you say HeapCell.  Combined with the function name, this tells you that this is a cell that is free, but not necessarily on a free-list ('cause then it would be a FreeCell).  This also tells you that the cell wasn't necessarily a JSCell before it was freed - it could have been a raw backing store.  That's important because then we don't have a guarantee about the format of its header.

I think that spelling out the type really helps here.  In the GC, we often assert everything, everywhere, all the time.  Typical review comes back with suggestions for more assertions.  Types are a form of assertion, so they are consistent with how we hack the GC.

3) AirEmitShuffle

My least favorite part of compilers is the shuffle.  That's the algorithm that figures out how to move data from one set of registers to another, where the sets may overlap.

It has code like this:

            while (Arg src = worklist.pop()) {
                HashMap<Arg, Vector<ShufflePair>>::iterator iter = mapping.find(src);
                if (iter == mapping.end()) {
                    // With a shift it's possible that we previously built the tail of this shift.
                    // See if that's the case now.
                    if (verbose)
                        dataLog("Trying to append shift at ", src, "\n");
                    currentPairs.appendVector(shifts.take(src));
                    continue;
                }
                Vector<ShufflePair> pairs = WTFMove(iter->value);
                mapping.remove(iter);

                for (const ShufflePair& pair : pairs) {
                    currentPairs.append(pair);
                    ASSERT(pair.src() == src);
                    worklist.push(pair.dst());
                }
            }

I sure appreciate the type of that iterator.  I totally agree that in 99% of cases, the type of an iterator should be auto because you really don't want to know its type.

But in this code, seeing the type of that iterator makes it much easier to tell at a glance what the code is doing.  Without that type, I'd have to scroll around to see what mapping really is.  It's great to know that the HashMap maps Args to Vector<ShufflePair>, since this reassures me that there are no fancy conversions when src is used as a key and when the iter->value is WTFMoved to my pairs variable.

Oh, and if the while loop used auto instead of Arg, I'd be super confused.  In the compiler, I think that loops that benefit from explicit type far outnumber loops that don't.

-Filip
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.webkit.org/pipermail/webkit-dev/attachments/20170110/2930cfea/attachment.html>