[webkit-reviews] review granted: [Bug 195514] Invalid flags in a RegExp literal should be an early SyntaxError : [Attachment 364151] Patch with YarrFlags refactor

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Sun Mar 10 13:34:22 PDT 2019


Darin Adler <darin at apple.com> has granted Ross Kirsling
<ross.kirsling at sony.com>'s request for review:
Bug 195514: Invalid flags in a RegExp literal should be an early SyntaxError
https://bugs.webkit.org/show_bug.cgi?id=195514

Attachment 364151: Patch with YarrFlags refactor

https://bugs.webkit.org/attachment.cgi?id=364151&action=review




--- Comment #14 from Darin Adler <darin at apple.com> ---
Comment on attachment 364151
  --> https://bugs.webkit.org/attachment.cgi?id=364151
Patch with YarrFlags refactor

View in context: https://bugs.webkit.org/attachment.cgi?id=364151&action=review

Looks like a really good change.

I’m not sure the includes here are minimal. For example, YarrFlags.h includes
OptionSet.h but there are many cases where we are adding both. Similarly,
anywhere we add StringHash.h we can remove other includes like WTFString.h.

>> Source/JavaScriptCore/bytecompiler/NodesCodegen.cpp:147
>>	    return generator.emitNewRegExp(generator.finalDestination(dst),
regExp);
> 
> Note:
> We could almost assert isValid here, if not for invalid references in Unicode
patterns like /\1/u and /\k<a>\u.
> These too are supposed* to be early errors though, so we can handle this in a
subsequent patch.
> 
> * https://test262.report/browse/language/literals/regexp/u-dec-esc.js
>  
https://test262.report/browse/language/literals/regexp/named-groups/invalid-dan
gling-groupname-without-group-u.js

Perhaps we can do *more* than just assert isValid in a subsequent patch. I
suspect we could get rid of the concept of an invalid RegExp entirely and
remove the RegExp::invalid function as well.

> Source/JavaScriptCore/runtime/CachedTypes.cpp:42
> +#include "YarrFlags.h"

I think we could use a forward declaration here instead of including the
header. Not 100% sure. Maybe not worth trying to optimize that.

> Source/JavaScriptCore/runtime/CachedTypes.cpp:44
>  #include <wtf/Forward.h>

I doubt this is needed. I am sure that one of the other headers includes it.

> Source/JavaScriptCore/runtime/RegExpCache.h:34
> +#include "YarrFlags.h"

Can this be done with a forward declaration instead of an include? I believe
that in the past we found that we can just forward declare enumerations to
compile things like this.

> Source/JavaScriptCore/runtime/RegExpConstructor.cpp:212
> +    EXCEPTION_ASSERT(!!scope.exception() == (flags ==
Yarr::Flags::Invalid));
> +    if (UNLIKELY(flags == Yarr::Flags::Invalid))
>	   return nullptr;

Seems like we should remove the unnecessary EXCEPTION_ASSERT trick here and
just do RETURN_IF_EXCEPTION here. I don’t think it’s significantly better for
performance and it requires us to preserve a special Invalid value which I
don’t think we want unless it is critical for performance.

> Source/JavaScriptCore/runtime/RegExpConstructor.cpp:254
> +	       EXCEPTION_ASSERT(!!scope.exception() == (flags ==
Yarr::Flags::Invalid));
> +	       if (flags == Yarr::Flags::Invalid)
>		   return nullptr;

Ditto.

> Source/JavaScriptCore/runtime/RegExpPrototype.cpp:160
> -	   RegExpFlags flags = NoFlags;
> +	   OptionSet<Yarr::Flags> flags;
>	   if (!arg1.isUndefined()) {
> -	       flags = regExpFlags(arg1.toWTFString(exec));
> +	       flags = Yarr::parseFlags(arg1.toWTFString(exec));
>	       RETURN_IF_EXCEPTION(scope, encodedJSValue());
> -	       if (flags == InvalidFlags)
> +	       if (flags == Yarr::Flags::Invalid)
>		   return throwVMError(exec, scope, createSyntaxError(exec,
"Invalid flags supplied to RegExp constructor."_s));
>	   }

Seems like this should be sharing more code with the toFlags function in
RegExpConstructor.cpp. This does a lot of the same work in slightly different
ways, which seems unnecessary.

> Source/JavaScriptCore/testRegExp.cpp:332
> -    RegExp* r = RegExp::create(vm, pattern.toString(), regExpFlags(line +
i));
> +    RegExp* r = RegExp::create(vm, pattern.toString(), Yarr::parseFlags(line
+ i));

Seems like we should handle the invalid flags here, rather than passing invalid
flags into RegExp::create.

> Source/JavaScriptCore/yarr/YarrFlags.h:29
> +#include <wtf/OptionSet.h>
> +#include <wtf/text/StringView.h>

We don’t need either of these includes to compile this header. Forward.h would
be sufficient.

> Source/JavaScriptCore/yarr/YarrFlags.h:40
> +    Invalid = 1 << 6,

Do we really need to keep Flags::Invalid? In most other similar cases, we just
use Optional in places where we are parsing. Using Optional is nice because it
makes certain mistakes fail to compile rather than doing something unexpected
at runtime. For example, someone could parse and look for one of the flags and
forget to check for Invalid. But you can’t compile code that makes that mistake
with Optional (or Expected).

> Source/JavaScriptCore/yarr/YarrFlags.h:41
> +    DeletedValue = 1 << 7

On the other hand, I suppose we do need DeletedValue for OptionSet<Flags> so we
can use it as a hash table key and I don’t object to keeping it even though
it’s inelegant and worth looking for a more elegant yet efficient way to do it
in the long run.

> Source/JavaScriptCore/yarr/YarrFlags.h:44
> +JS_EXPORT_PRIVATE OptionSet<Flags> parseFlags(StringView);

I suggest having this return Optional<OptionSet<Flags>> instead of using a
special invalid value.

> Source/JavaScriptCore/yarr/YarrPattern.cpp:1110
> +    ASSERT(m_flags != Flags::Invalid);

This assertion is only needed because of the poor pattern of having an Invalid
value that we should eliminate. Worth noting that a deleted value here is
equally bad and we should consider asserting about that, too.

Also, I’m thinking maybe we should put the assertions in the constructor
instead of here?


More information about the webkit-reviews mailing list