[Webkit-unassigned] [Bug 228122] Add functionalities for parsing URL query string

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Mon Aug 2 11:06:04 PDT 2021


https://bugs.webkit.org/show_bug.cgi?id=228122

Darin Adler <darin at apple.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #434736|review?                     |review-
              Flags|                            |

--- Comment #24 from Darin Adler <darin at apple.com> ---
Comment on attachment 434736
  --> https://bugs.webkit.org/attachment.cgi?id=434736
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=434736&action=review

I appreciate the work to do a sorting-based algorithm instead of a hash table.

> Source/WTF/wtf/HashFunctions.h:182
> +    template<typename T, typename U> struct KeyValuePairHash {
> +        static unsigned hash(const KeyValuePair<T, U>& p)
> +        {
> +            return pairIntHash(DefaultHash<T>::hash(p.key), DefaultHash<U>::hash(p.value));
> +        }
> +        static bool equal(const KeyValuePair<T, U>& a, const KeyValuePair<T, U>& b)
> +        {
> +            return DefaultHash<T>::equal(a.key, b.key) && DefaultHash<U>::equal(a.value, b.value);
> +        }
> +        static constexpr bool safeToCompareToEmptyOrDeleted = DefaultHash<T>::safeToCompareToEmptyOrDeleted && DefaultHash<U>::safeToCompareToEmptyOrDeleted;
> +    };

We aren’t using a KeyValuePair as a hash table key. Can we either not do this at all at this time, or do it in a separate patch?

> Source/WTF/wtf/HashFunctions.h:282
> +    template<typename T, typename U> struct DefaultHash<KeyValuePair<T, U>> : KeyValuePairHash<T, U> { };

We aren’t using a KeyValuePair as a hash table key. Can we either not do this at all at this time, or do it in a separate patch?

> Source/WTF/wtf/HashTraits.h:431
> +using WTF::KeyValuePairHashTraits;

We aren’t using a KeyValuePair as a hash table key. Can we either not do this at all at this time, or do it in a separate patch?

> Source/WTF/wtf/KeyValuePair.h:67
> +    template <typename K, typename V>
> +    bool operator==(const KeyValuePair<K, V>& other) const
> +    {
> +        return key == other.key && value == other.value;
> +    }

I’d like to see this be done as a free function, after the class definition:

    template<typename KeyType, typename ValueType> constexpr bool operator=(const KeyValuePair<KeyType, ValueType>& a, const KeyValuePair<KeyType, ValueType>& b)
    {
        return a.key == b.key && a.value == b.value;
    }

But also, this change is no longer needed in this patch. We aren’t using a KeyValuePair as a hash table key. Can we either not do this at all at this time, or do it in a separate patch?

> Source/WTF/wtf/URL.cpp:1189
> +        StringBuilder firstKeyValuePairAppended;
> +        StringBuilder secondKeyValuePairAppended;
> +        firstKeyValuePairAppended.append(firstKeyValuePair.key, firstKeyValuePair.value);
> +        secondKeyValuePairAppended.append(secondKeyValuePair.key, secondKeyValuePair.value);
> +        return firstKeyValuePairAppended.toString().utf8() < secondKeyValuePairAppended.toString().utf8();

1) There is no reason to repeat this lambda twice! We can put it in a local variable and use it twice. Also, we repeat the comparison below, and should reuse for that too.

2) Concatenating strings using StringBuilder, then converting to UTF-8 is a super-expensive way to compare and makes this unnecessarily inefficient. Seems unlikely the sorting order matters at all:

Code should be more like this:

    auto compare = [&] (const KeyValuePair<String, String>& a, const KeyValuePair<String, String>& b) {
        if (int result = codePointCompare(a.key, b.key))
            return result;
        return codePointCompare(a.value, b.value);
    };
    auto comparesLessThan = [&] (const KeyValuePair<String, String>& a, const KeyValuePair<String, String>& b) {
        return compare(a, b) < 0;
    };

    std::sort(firstQueryParameters.begin(), firstQueryParameters.end(), comparesLessThan);
    std::sort(secondQueryParameters.begin(), secondQueryParameters.end(), comparesLessThan);

> Source/WTF/wtf/URL.cpp:1219
> +        if (tempParameterFromFirst.toString().utf8() < tempParameterFromSecond.toString().utf8()) {
> +            differingQueryParameters.append(firstQueryParameters[indexInFirstQueryParameter]);
> +            indexInFirstQueryParameter++;
> +        } else if (tempParameterFromSecond.toString().utf8() < tempParameterFromFirst.toString().utf8()) {
> +            differingQueryParameters.append(secondQueryParameters[indexInSecondQueryParameter]);
> +            indexInSecondQueryParameter++;
> +        } else {
> +            indexInFirstQueryParameter++;
> +            indexInSecondQueryParameter++;
> +        }

Same issue here, should not be building temporary strings and converting to UTF-8 just compare. Also, since it’s important to use the same comparison function so we do it consistently, this makes the refactoring above more valuable.

    int comparison = compare(firstQueryParameters[indexInFirstQueryParameter], secondQueryParameters[indexInSecondQueryParameter]);
    if (comparison < 0) {
        ...
    } else if (comparison > 0) {
        ...
    } else {
        ...
    }

> Source/WTF/wtf/URL.cpp:1240
> +    return url.string()[url.pathEnd() - 1] == '/' ? url.pathEnd() - 1 : url.pathEnd();

I don’t understand why this "/" removal is needed. Which test case falls if we remove this logic?

> Source/WTF/wtf/URL.cpp:1251
> +bool isEqualIgnoringQueryAndFragments(const URL& firstURL, const URL& secondURL)
> +{
> +    if (!firstURL.isValid() || !secondURL.isValid())
> +        return false;
> +    
> +    auto lengthOfFirstURLIgnoringQueryAndFragments = lengthOfURLIgnoringQueryAndFragments(firstURL);
> +    auto secondLengthOfSecondURLIgnoringQueryAndFragments = lengthOfURLIgnoringQueryAndFragments(secondURL);
> +    return StringView(firstURL.string()).substring(0, lengthOfFirstURLIgnoringQueryAndFragments) == StringView(secondURL.string()).substring(0, secondLengthOfSecondURLIgnoringQueryAndFragments);
> +}

Seems a little peculiar that two invalid URLs always compare as unequal even if they are identical. Is that really the behavior we want?

>From a factoring point of view, one more function and shorter argument names can make this code much clearer and easier to read:

    static StringView substringIgnoringQueryAndFragments(const URL& url)
    {
        return StringView { url.string() }.left(lengthOfURLIgnoringQueryAndFragments(url));
    }

    bool isEqualIgnoringQueryAndFragments(const URL& a, const URL& b)
    {
        return a.isValid() && b.isValid() && substringIgnoringQueryAndFragments(a) == substringIgnoringQueryAndFragments(b);
    }

-- 
You are receiving this mail because:
You are the assignee for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.webkit.org/pipermail/webkit-unassigned/attachments/20210802/0d0e7430/attachment-0001.htm>


More information about the webkit-unassigned mailing list