[Webkit-unassigned] [Bug 38572] [WTFURL] Add core URL parser
bugzilla-daemon at webkit.org
bugzilla-daemon at webkit.org
Wed May 5 17:45:40 PDT 2010
https://bugs.webkit.org/show_bug.cgi?id=38572
--- Comment #12 from Adam Barth <abarth at webkit.org> 2010-05-05 17:45:38 PST ---
> > 'sources': [
> > 'src/URLComponent.h',
> > + 'src/URLParser.h',
>
> Is this intentional? My understanding was that no port (including chromium) was
> using this right away, and it's strange that only a header was added.
This build file isn't used by any port at the moment, but I'm trying to land
the library in an incrementally buildable form. We can omit the build file if
you like. (I have a local git repo with a complete build file that I can work
from.)
> > + * The contents of this file are subject to the Mozilla Public License Version
>
> Aren't we supposed to only leave LGPL when importing Mozilla code? I don't know
> for sure, but I never saw any official word permitting tri-license in WebKit
> repository.
Dunno. I'm happy to strip this down to LGPL if that's the thing to do.
> > + // This handles everything that may be an authority terminator, including
> > + // backslash. For special backslash handling see parseAfterScheme.
> > + static bool isAuthorityTerminator(CHAR ch)
>
> Why not call it "isPossibleAuthorityTerminator" then? BTW, I didn't know that
> semicolon could do that.
Will do.
> Also, I'd make this function just take UChar32. We only need CHAR in arrays.
The original library used concrete types for these also, but it used a
different concrete type. :)
My plan is to leave the concrete string types one or two layers up in the
library so that different folks can use it with different types of strings
(e.g., WTFString and std::wstring).
> > + // Given an already-identified auth section, breaks it into its consituent
>
> Typo: "consituent"
Will do.
> > + static void parseAuthority(const CHAR* spec, const URLComponent& auth, URLComponent* username, URLComponent* password, URLComponent* host, URLComponent* port)
>
> All URLComponent arguments are required here, so they should all be passed in
> the same way, either by pointer or by reference. Or is it some calling
> convention I do not know about?
This is an artifact of the original Google style of this function. In that
convention, input parameters are passed by constant reference and output
parameters are passed by pointer. In this case, we're extracting the username,
password, host, and port from the auth "meta"-component. I can convert the
output parameters to non-constant references if you prefer.
> > + // FIXME: add ASSERT(auth.isValid()); // We should always get an authority.
> > + if (auth.length() == 0) {
>
> Should be !auth.length() (or can we even say auth.isEmpty()?)
I'll add an isEmpty() method because I personally dislike !auth.length().
> > + // Search backwards for @, which is the separator between the user info and
> > + // the server info.
> > + int i = auth.begin() + auth.length() - 1;
> > + while (i > auth.begin() && spec[i] != '@')
> > + --i;
>
> This will find '@' as the last character, but not as the first one. If that's
> correct, this should ideally have a comment explaining why we don't care.
I believe it's correct, but I don't have a reference off-hand. The reason is
that people might have passwords that contain @ characters and they might not
URL-encode them.
> I only read a small portion of the patch, but it changed while I was looking at
> it; will start over later or tomorrow.
Ah sorry about that. I updated it to reflect the fact that Maciej asked me to
rename ref to fragment.
Thanks!
--
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
More information about the webkit-unassigned
mailing list