[webkit-help] How WebKit builds HTTP GET Request headers, and how WebKit resolves relative paths in <link>s, <img>s, etc.

Benjamin Poulain benjamin at webkit.org
Tue Feb 17 23:58:25 PST 2015


Relative URLs are resolved against the document's base URL. See 
http://trac.webkit.org/browser/trunk/Source/WebCore/dom/Document.cpp#L4196
A document's base URL is generally the document's URL, but there are ways 
to override it. See 
http://trac.webkit.org/browser/trunk/Source/WebCore/dom/Document.cpp#L2659
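
As a rough illustration of "generally the document's URL, with ways to 
override it", here is a minimal Python sketch. It is not WebKit's code. 
It assumes the only override is a <base href="..."> element, and the 
function name document_base_url and the example.com URLs are made up for 
the example.

```python
from urllib.parse import urljoin


def document_base_url(document_url, base_href=None):
    """Pick the URL that relative references are resolved against.

    Hypothetical helper: models only a <base href> override, which is
    an assumption and not the full logic in Document.cpp.
    """
    if base_href is None:
        # No override: the document's own URL is the base.
        return document_url
    # A <base href> may itself be relative, so resolve it against
    # the document's URL first.
    return urljoin(document_url, base_href)


print(document_base_url("http://www.example.com/a/b.html"))
print(document_base_url("http://www.example.com/a/b.html", "/static/"))
print(document_base_url("http://www.example.com/a/b.html",
                        "http://assets.example.com/"))
```

In this model the page itself decides the base, either implicitly through 
its own URL or explicitly through markup.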

Once you have the base, resolving a URL is well defined.
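
To make that concrete for the paths in your message, here is a small 
Python sketch. It uses the standard library's urljoin, which follows the 
RFC 3986 rules rather than WebKit's own URL parser, so treat it as an 
approximation. It also assumes the page's base URL is 
http://www.lordandtaylor.com/ (no override in the page). The point is 
that the "Host" value falls out of the resolved URL; it is not an input 
to the resolution.

```python
from urllib.parse import urljoin, urlsplit

# Assumed base URL: the document's own URL, with no override.
BASE_URL = "http://www.lordandtaylor.com/"


def resolve(base_url, reference):
    """Resolve a reference against the base and report the host.

    The host is read from the resolved URL's authority part, which is
    what an HTTP client would place in the Host request header.
    """
    resolved = urljoin(base_url, reference)
    host = urlsplit(resolved).netloc
    return resolved, host


references = [
    "/wcsstore/dojoHBC/dojo/dojo.js",          # relative: base supplies host
    "http://1.shrd.lordandtaylor.com",         # absolute: base is ignored
    "/wcsstore/HBCStorefrontAssetStore/javascript/jquery.min.js",
]

for reference in references:
    resolved, host = resolve(BASE_URL, reference)
    print(host, resolved)
```

Each reference is resolved independently against the same base, which is 
why the request after the absolute URL goes back to www.lordandtaylor.com.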

Benjamin

On 2/15/15 2:44 PM, Lew Hollerbach wrote:
>
> Hello,
>
> In the most general terms, my question is around relative paths in 
> links (e.g., <link>, <img>, <script>) and the related HTTP GET 
> request, and the corresponding Request/Response headers: how WebKit 
> builds the headers, and how WebKit resolves relative paths.
>
> What happens if the URL to a resource is not an absolute URL but a 
> relative one? My own testing suggests that the “Host” request-header 
> field is what’s used to resolve relative paths. Is that correct? (If 
> it isn’t correct, then what is used to resolve relative URLs?)
>
> But, if it is, then how does the browser/WebKit know what the value of 
> this field should be? Often — but certainly not always — the value is 
> just the basic domain name, like “www.somesite.com”. But it can also 
> have different sub-domains, like “assets.somesite.com”. So what does 
> the browser use to determine this “host” value?
>
> And, is there anything in a response-header — from the original 
> request — that the browser uses to set this value? So, for a real 
> example, if you go to “www.lordandtaylor.com”, when the page is loaded 
> and parsed, the very first <script>’s “src” is 
> “/wcsstore/dojoHBC/dojo/dojo.js”. So how does the browser know that 
> the “host” value should be “www.lordandtaylor.com” here?
>
> Then, a few <script>s later, a request goes to 
> “http://1.shrd.lordandtaylor.com” (absolute URL), and the immediately 
> following request again has a relative path of 
> “/wcsstore/HBCStorefrontAssetStore/javascript/jquery.min.js”; again, 
> how does the browser know to — again — use “www.lordandtaylor.com” for 
> the “host” value for this second request?
>
> This area is confusing for me and I don’t have the necessary knowledge 
> to understand the inner workings here.
>
> If we do end up knowing how this “host” value is arrived at, how can 
> we — from within JavaScript — set it so that, as the page is parsed 
> and rendered, the rendering engine can know what this “host” should 
> be? Is there some global or window.variable, or other global setting 
> that can be set via an API from JavaScript? Obviously the rendering 
> engine (or some other part of the browser) must know what it is so 
> that the Request-header is correctly set when the various HTTP GET 
> requests — to fetch images, CSS files, JavaScript files, etc. — are 
> invoked. Would you know how we can set this “host” value?
>
> And, as I mentioned earlier, if it’s not “host” that determines how 
> relative paths are resolved, then what is?
>
> I’m trying to load a Website through a proxy, to bypass the 
> same-origin restrictions, and have it fully rendered inside an 
> <iframe>. Not for any nefarious purposes such as clickjacking; no, 
> we’re building a consumer-facing app that features a way to view and 
> browse any Website from the app, with some other features. And the app 
> (a hybrid mobile app) runs inside embedded WebKit.
>
> I know this can be done: A few companies have successfully implemented 
> such a feature, for use cases such as customer service, co-browsing, 
> etc. Every company that I know of that has successfully done this has 
> been acquired — presumably exactly for this feature — and so their IP 
> is clearly a trade secret and can’t be easily (if at all) gotten at.
>
> If what we’re trying to do can’t be done using proxies, is there 
> another approach that would work and would result in the same 
> ultimate experience — of being able to load any given Website into 
> some container and have it be fully functional, allow all the links to 
> be traversed and assets loaded?
>
> Thanks for any and all insights, suggestions, and general help!
>
> Lew
