[webkit-reviews] review denied: [Bug 228781] [ARM64] Add pattern matching for Load/Store Pair : [Attachment 436577] Patch

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Thu Aug 26 19:06:06 PDT 2021


Saam Barati <sbarati at apple.com> has denied Yijia Huang
<yijia_huang at apple.com>'s request for review:
Bug 228781: [ARM64] Add pattern matching for Load/Store Pair
https://bugs.webkit.org/show_bug.cgi?id=228781

Attachment 436577: Patch

https://bugs.webkit.org/attachment.cgi?id=436577&action=review




--- Comment #18 from Saam Barati <sbarati at apple.com> ---
Comment on attachment 436577
  --> https://bugs.webkit.org/attachment.cgi?id=436577
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=436577&action=review

> Source/JavaScriptCore/ChangeLog:88
> +	   The equivalent Pattern is:
> +	   Store Pair Pattern 1:
> +	       memory1 = Store(value1, base, offset + bytes)
> +	       ...
> +	       memory2 = Store(value2, base, offset)
> +	   Store Pair Pattern 2:
> +	       memory1 = Store(value1, base, offset)
> +	       ...
> +	       memory2 = Store(value2, base, offset + bytes)
> +
> +	   Note: The bytes means 4/8 for 32/64 bit memory ops.
> +
> +	   First, we need to convert to the canonical form, which is subject to
control equivalence. The
> +	   conversion is shown in below:
> +
> +	   Convert Store Pair Pattern 1 to the Canonical Form:
> +	       memory1 = Store(value1, base, offset + bytes)		memory1
= Nop
> +	       ...							...
> +	       memory2 = Store(value2, base, offset)		-->	memory2
= Store(value2, base, offset)
> +	       ...						       
newMemory1 = Store(value1, base, offset + bytes)
> +
> +	   Convert Store Pair Pattern 2 to the Canonical Form:
> +	       memory1 = Store(value1, base, offset)			memory1
= Nop
> +	       ...							...
> +	       ...						       
newMemory1 = Store(value1, base, offset)
> +	       memory2 = Store(value2, base, offset + bytes)	-->	memory2
= Store(value2, base, offset + bytes)
> +
> +	   Then, lower the canonical form to Air:
> +	       memory1 = Store(value1, base, offset)
> +	       memory2 = Store(value2, base, offset + bytes)	-->    
StorePair %value1, %value2, (%base, offset)

I think you can just say something like "Store Pair canonicalization works like
Load pair, except we don't need to worry about users of the store"

> Source/JavaScriptCore/b3/B3CanonicalizeLoadStorePair.cpp:62
> +static void allPathsUtil(BasicBlock* start, BasicBlock* end, bool* visited,
unsigned* path, unsigned pathIndex, HashSet<unsigned>& blocks)
> +{
> +    if (start == end) {
> +	   for (unsigned i = 1; i < pathIndex; ++i)
> +	       blocks.add(path[i]);
> +	   return;
> +    }
> +
> +    unsigned startBlockIndex = start->index();
> +    visited[startBlockIndex] = true;
> +    path[pathIndex] = startBlockIndex;
> +    for (BasicBlock* next : start->successorBlocks()) {
> +	   if (!visited[next->index()])
> +	       allPathsUtil(next, end, visited, path, pathIndex + 1, blocks);
> +    }
> +    visited[startBlockIndex] = false;
> +}

This algorithm is overkill.

We can cheat, since we know that start dominates end (we can assert it), we
don't need to do anything crazy to backtrack if we don't reach end. We'll know
each path leaving start will reach end. So we can just iterate successors
iteratively, checking if they're visited, to add to a set of basic blocks.

> Source/JavaScriptCore/b3/B3CanonicalizeLoadStorePair.cpp:77
> +    HeapRange range = m1->opcode() == Load ? m1->effects().reads :
m1->effects().writes;

this isn't quite right for fenced operations. I also don't think we can move
anything that's fenced around anything else that's fenced. So we need to be
careful w.r.t that.

> Source/JavaScriptCore/b3/B3CanonicalizeLoadStorePair.cpp:78
> +    for (unsigned i : blocks) {

Let's tabulate this information on a per-block basis (maybe we can do this
lazily when first needed), by taking the union of all effects in an entire
block. So we don't have to repeatedly iterate all values in a block. We can
just our cache for the summation of a block's effects.

> Source/JavaScriptCore/b3/B3CanonicalizeLoadStorePair.cpp:81
> +	       if (value->effects().reads.overlaps(range)
> +		   || value->effects().writes.overlaps(range))

let's call effects() once, and use its result.

> Source/JavaScriptCore/b3/B3CanonicalizeLoadStorePair.cpp:142
> +		       // we should double check in case any collisions appears
in base offset hashing

why would there be a collision?

> Source/JavaScriptCore/b3/B3CanonicalizeLoadStorePair.cpp:240
> +	   for (Value* child : value->children()) {
> +	       auto iter = loadUses.find(child);
> +	       if (iter != loadUses.end())
> +		   iter->value.append(value);
> +	   }

I think this algorithm would be easier if you just moved memory2 to memory1,
since memory1 dominates memory2, you can always move memory2 to memory1's
program location without violating SSA. (There are other constraints to be able
to move it, of course.) Doing that would allow you to drop this loadUses stuff.

> Source/JavaScriptCore/b3/B3LowerToAir.cpp:3480
> +		   if (!isValidForm(opcode, Arg::PreIndex, Arg::Tmp) ||
!m_index || memory->accessBank() != GP)

add a test?


I also think this code isn't considering if the memory value has a fence, and
doing the wrong thing there. Same for load with increment.

> Source/JavaScriptCore/b3/B3LowerToAir.cpp:3523
> +		   if (!isValidForm(opcode, Arg::Tmp, Arg::Tmp, Arg::Addr) ||
!m_index || memory2->accessBank() != GP)

do we have a test for accessBank() here?

> Source/JavaScriptCore/b3/air/AirOpcode.opcodes:673
> +arm64: StorePair64 U:G:64, U:G:64, D:G:64

this is sorta weird, since it's defining 128 bits of stuff.

> Source/JavaScriptCore/b3/air/AirOpcode.opcodes:676
> +arm64: LoadPair64 U:G:64, D:G:64, D:G:64

ditto, bur reading 128 bits

> Source/JavaScriptCore/b3/air/AirOpcode.opcodes:695
> +arm64: StorePair32 U:G:32, U:G:32, ZD:G:32

ditto, this is sorta weird since it's defining 64bits of stuff

> Source/JavaScriptCore/b3/air/AirOpcode.opcodes:698
> +arm64: LoadPair32 U:G:32, ZD:G:32, ZD:G:32

ditto, but reading.


More information about the webkit-reviews mailing list