[webkit-dev] Out-of-process networking and potential for sharing memory cache (was Re: Feature Announcement: Moving HTML Parser off the Main Thread)

Maciej Stachowiak mjs at apple.com
Wed Jan 16 23:33:58 PST 2013


Hi Adam,

You raise a number of interesting points which I'll try to address.

On Jan 12, 2013, at 1:14 AM, Adam Barth <abarth at webkit.org> wrote:

> On Sat, Jan 12, 2013 at 12:03 AM, Maciej Stachowiak <mjs at apple.com> wrote:
>> 
>> Do you think thread in the UI process vs. completely separate process is a topic worth discussing? It seems like the WebCore layer is unaffected by the difference, and in fact the impact of Chromium's choice is not even visible in the WebKit repository afaict.
> 
> I don't know.  As I wrote above, I haven't really thought through the
> consequences of that design choice.  My point was just that the design
> wasn't discussed with the community at all.

The NetworkProcess and its nature as a process have been mentioned before. At the time, no one expressed an opinion about the matter or pressed for an alternative, and it seems you have not (yet) done so either. If you have an interest in discussing it yourself, I at least would be happy to discuss it. If, for example, you would like to ask questions about it, advocate for a different design, argue that it's important for WK2 and Cr architectures to be consistent in this regard, present the Chromium team's reasons for choosing a thread, or anything else, then I would gladly engage in such discussion.

Indeed, if anyone has a substantive point to make, I'll concede the foul of insufficient prior discussion. If no one does, then it doesn't seem very valuable to debate the meta-point.


>> (One reason for our particular choice, FYI, is so that we can give the NetworkProcess a tighter sandbox profile than the Safari UI process).
> 
> I'm surprised that the Mac OS X sandboxing mechanisms are
> sophisticated enough to provide a meaningful sandbox for the
> NetworkProcess.  That's certainly not possible on other platforms
> (e.g., Windows).  The reference monitor we use in Chromium for network
> requests contains a great deal of web-specific details that are
> necessary to prevent, for example, an attacker from stealing
> confidential information (such as tax returns) stored in the user's
> file system.

I take your word for it that it's not possible on Windows.

I believe that the NetworkProcess can feasibly be denied the following privileges via Mac OS X sandboxing mechanisms, whereas none of these can feasibly be denied to the UI process:

- Access to the window server
- Access to the pasteboard server
- Access to arbitrary local files in the filesystem

I'm going to guess you are the most skeptical about whether the third can be done in a meaningful way. If you would like to get deep into the details of how it might be done, then I'd sincerely love to have your expert review, but it might be something that we should discuss outside this thread.

However, to give you a quick overview: Mac OS X sandboxing mechanisms allow fine-grained control over the file access for a process. The sandbox profile's rules can allow or deny access to any section of the filesystem. There is also a way to dynamically grant temporary access to a given file or portion of the filesystem from a more privileged process. Deny rules can also take precedence over extensions. Thus, at minimum, it is straightforward to achieve the following:

(1) When the user has not explicitly opened any local files, the NetworkProcess can only access its assigned whitelist of files (including such things as cache and cookies directories, for instance).
(2) While the user has an explicitly chosen local document open, the NetworkProcess can be temporarily granted access to the whole filesystem, except for sensitive blacklisted areas.
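
As a rough illustration of the policy shape described above, here is a minimal Python sketch. It is a toy model, not the Mac OS X sandbox API: the class name, method names and example paths are all hypothetical, and it assumes deny rules win over both the static whitelist and any temporary extension.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


def _is_under(path: str, root: str) -> bool:
    """Return True if path equals root or lies beneath it.

    Purely lexical: this toy model does not resolve ".." or symlinks.
    """
    p, r = PurePosixPath(path), PurePosixPath(root)
    return p == r or r in p.parents


@dataclass
class SandboxPolicy:
    """Hypothetical model of a NetworkProcess file-access policy."""
    allow_roots: set = field(default_factory=set)  # static whitelist
    deny_roots: set = field(default_factory=set)   # sensitive blacklist
    extensions: set = field(default_factory=set)   # temporary grants

    def grant_extension(self, root: str) -> None:
        """Temporarily widen access, e.g. while a local document is open."""
        self.extensions.add(root)

    def revoke_extension(self, root: str) -> None:
        """Withdraw a temporary grant, e.g. when the document is closed."""
        self.extensions.discard(root)

    def can_read(self, path: str) -> bool:
        # Assumption: deny rules take precedence over everything else.
        if any(_is_under(path, d) for d in self.deny_roots):
            return False
        granted = self.allow_roots | self.extensions
        return any(_is_under(path, a) for a in granted)
```

In this model, case (1) is the state with no extensions granted, and case (2) corresponds to calling grant_extension("/") for as long as the local document stays open.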

(2) is obviously suboptimal, since social engineering attacks could be used to trick a user into opening a local file, but it does prevent drive-by attacks. But it's possible to do better with more work. Let me throw out two rough strawman designs. Let's presume for both of these that local files are opened in separate WebProcesses from any remote webpages. There are at least two approaches to avoiding giving the NetworkProcess the run of the filesystem even temporarily. One is to grant the extension to the local file WebProcesses, and have them load files directly rather than via proxy. Another is to have a FileAccessProcess or the like which only handles file: URLs but otherwise works like the NetworkProcess, and only WebProcesses serving a local file document would be given a handle to it. There might be other even more clever ways but I hope this suffices for an existence proof.
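
To make the second strawman a little more concrete, here is a small Python sketch of the handle-granting rule. Everything in it is hypothetical (the WebProcess record, the function names, the use of plain strings as handles), and it assumes that a WebProcess serving a local document also keeps its NetworkProcess handle, which is a detail I have not thought through.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class WebProcess:
    """Hypothetical record describing one WebProcess."""
    name: str
    serves_local_document: bool


def broker_handles(process: WebProcess) -> set:
    """Return the loader processes this WebProcess is allowed to talk to."""
    handles = {"NetworkProcess"}
    # Only WebProcesses serving a local file document get the file broker.
    if process.serves_local_document:
        handles.add("FileAccessProcess")
    return handles


def route_load(process: WebProcess, url: str) -> str:
    """Pick the loader for a URL; refuse file: loads without a handle."""
    is_file = urlsplit(url).scheme == "file"
    target = "FileAccessProcess" if is_file else "NetworkProcess"
    if target not in broker_handles(process):
        raise PermissionError(f"{process.name} has no handle to {target}")
    return target
```

The point of the sketch is only that a compromised WebProcess showing remote content has no channel at all through which to request a file: URL, so the NetworkProcess never needs filesystem-wide access.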

Again, we could dig more into the details, but I think that would merit a separate discussion (perhaps off of webkit-dev).

For these reasons, I am reasonably confident that a separate process for networking can be made to have better security properties on Mac OS X (and platforms with similarly sophisticated sandboxing mechanisms) than a thread inside the UI process. And these better properties are intrinsically tied to being a process rather than a thread.

Against this, the one advantage I can think of to the thread approach is saving the memory and IPC costs associated with an extra process.


> 
>>> The main point I was trying to make in the document is that hooking in
>>> at the CachedResource layer has worse security properties than hooking
>>> in at the ResourceHandle layer.  I understand that Safari and Chromium
>>> have different goals when it comes to security, which I believe is the
>>> main reason you're even willing to consider hooking in at the
>>> CachedResource layer.
>>> 
>>> In particular, Chromium needs to be able to run in a strong sandbox on
>>> Windows XP.  Unlike Mac OS X, Windows XP does not have a particularly
>>> sensible sandboxing mechanism, mostly because the OS mechanisms that
>>> we use in Chromium were not intended to be used for sandboxing by the
>>> designers of XP.  As a consequence, any network requests that we fail
>>> to hook will simply fail (or more likely just crash).  For that
>>> reason, we need complete interposition, which means we need to hook
>>> into WebCore at a layer like ResourceHandle that actually has
>>> complete interposition.
>>> 
>>> Moreover, we've enjoyed the security benefits of complete
>>> interposition for a while now and have raised our security goals,
>>> namely we're interested in achieving software fault isolation based
>>> API integrity.  As far as I can tell, the approach WebKit2 has taken
>>> to out-of-process networking is not compatible with software fault
>>> isolation based API integrity.  Now, I understand if that sounds very
>>> futuristic to you given the current security posture of the apple-mac
>>> port, but if you're serious about working on a common design, you need
>>> to be willing to accept design goals and constraints beyond those that
>>> affect the apple-mac port.  As long as you minimize or ignore those
>>> constraints, it's unlikely that the folks who have those goals and
>>> constraints will be interested in adopting your designs.
>> 
>> I'm certainly not blowing off your use case. But I would like to understand how it impacts the design.
>> 
>> Can you explain why the WebKit2 approach to out-of-process networking is incompatible with software fault isolation based API integrity, whereas interposing at the ResourceHandle layer is? Is it (a) because some loads don't go through the CachedResource layer currently, (b) because there are some loads you think it is fundamentally impossible (or at least unduly challenging) to send through the CachedResource layer, or (c) some other reason?
> 
> Issue (a) is already a problem with the CachedResource approach with
> the sandbox we use today.  Network requests that don't get intercepted
> will not work on Windows because code running inside the sandbox we
> use on Windows cannot talk to the network.

Our long-term goal is to make all WebCore loads go through ResourceLoader/CachedResource, and have nothing hit ResourceHandle directly. I agree with you that this is not the case today. I take it from your response that you don't think it is fundamentally impossible or overly complex to make this happen. And it seems like attaining this goal would have a number of advantages regardless of the interposition layer of choice, such as making all loads, even obscure ones, correctly respect ResourceLoader-level concepts such as ApplicationCache and WebArchives.

> 
> When using software fault isolation (SFI), the untrusted code cannot
> talk with the underlying operating system at all (not even on advanced
> operating systems like Mac OS X that provide high quality sandboxing
> APIs).  In particular, code inside an SFI sandbox cannot use
> CFNetwork.  Given that WebCore contains code that uses ResourceHandle,
> we're left with the following choice:
> 
> 1) Implement ResourceHandle in terms of an SFI API rather than an
> operating system API.
> 2) Move the parts of WebCore that call ResourceHandle out of the SFI
> sandbox so that ResourceHandle can use operating system APIs.
> 
> Generally speaking, the more code we put in the sandbox, the better
> security we'll achieve.  I'd certainly hope we'd be able to put at
> least WebCore inside the sandbox, which makes (2) unattractive.  (In
> fact, I hope we'll be able to put much of the "content" and "chrome"
> layers of Chromium inside the SFI sandbox as well, much as these
> layers are inside the current Chromium sandbox.)  In order to sandbox
> WebCore, we'll need to choose (1), which means we'll need to intercept
> network requests at the ResourceHandle layer anyway.

I'm going to admit that I don't fully grok the details of what SFI is about. So I may be misunderstanding your point. I hope you will correct me if so.

My understanding of your point is that, when interposing at the CachedResource/ResourceLoader level rather than at the ResourceHandle level, in some sense "more code" has to run outside the web process/render process sandbox, and that's the sandbox that we are most confident can be applied across many platforms, and made really tight, so that it can work with schemes like SFI.

Is that correct?

If so, I concede your point to some extent, but I'm not sure I concede it entirely. It seems to me that a network process or thread will have code for the following things, whatever level it hooks in at:
- Making network connections
- Parsing/processing http, ssl, ftp and other protocols
- Accessing the disk cache
- Accessing the cookie store
- Maintaining some form of memory cache
- Organizing per-host connection pools and scheduling resource loads into them.

And it seems like in either scheme, it doesn't need to do much more. So it doesn't seem inevitable to me that a lot more code overall will run in the network process / network thread if it interposes at a higher layer. This seems like a question whose answer is fact-intensive and sensitive to details, and cannot be answered a priori. It may also be that for some of this functionality, in one case it would end up handled by WebCore code and in another case by code written separately, but this doesn't seem like it would have a major effect on exposed attack surface.

There may be some nuance or detail here that I am missing. If you think there is functionality that has to end up living in the NetworkProcess instead of WebProcess when interposing at a high level instead of a low level, I'd be curious what it is.

> 
> Another way to think about this issue is to consider how you'd sandbox
> the NetworkProcess in these scenarios.  How would you sandbox the
> NetworkProcess on Windows (where the sandboxing mechanisms are
> primitive)?  How would you sandbox the NetworkProcess using SFI (where
> the sandboxing mechanism is draconian)?

I think we would not attempt to sandbox the NetworkProcess on Windows or using SFI if doing so is infeasible. Still, sandboxing it on those platforms where it is possible and to the extent possible seems like an advantage over having it be a thread in a privileged UI process.

> 
> The only way I can see of solving these problems is to implement
> ResourceHandle out-of-process (in the Windows case) or with an SFI API
> (in the SFI case).  If you're going to do that anyway, what have you
> gained by creating a NetworkProcess in the first place?

I think having networking code in a non-SFI sandbox on some platforms is better than having it in a privileged process on all platforms. That seems like a gain. Do you disagree?


In addition to the potential for better cache sharing, one other reason for interposition at a high level is to avoid layering violations in ResourceHandle. ResourceHandle was designed to be a wrapper around a network library that is otherwise meant to be unaware of higher-level concepts relating to WebCore and the Web platform. That is the reason it's in WebCore/platform/. But proxying at that level seems to require threading higher-level information, such as the frame associated with a load, the resource load type, priority following from page visibility, etc, through the ResourceHandle level. If ResourceHandle has knowledge of such Web-level concepts, then it has no meaningfully distinct role from ResourceLoader. But if it were merged into ResourceLoader, the end result would be not much different from interposing at a higher level in the first place. I concede that this design concern is trumped by actual practical advantages or disadvantages of this approach. But I wanted to mention it as an additional point of concern.
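
A small Python sketch of the layering concern, in case it helps. None of these types exist in WebKit; the names and fields are invented for illustration. It assumes an out-of-process scheduler that wants to serve visible pages first, and shows that a message sent from the platform layer then has to carry Web-level context alongside the plain request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformRequest:
    """What a pure network-library wrapper would need to know."""
    url: str
    method: str = "GET"


@dataclass(frozen=True)
class WebLoadContext:
    """Web-level concepts that normally live above the platform layer."""
    frame_id: int
    load_type: str
    page_visible: bool


@dataclass(frozen=True)
class LowLevelProxyMessage:
    """Hypothetical IPC message for a proxy at the ResourceHandle level."""
    request: PlatformRequest
    context: WebLoadContext  # the layering leak: Web concepts in platform code


def schedule(messages):
    """Order loads so that visible pages go first; ties keep arrival order."""
    return sorted(messages, key=lambda m: not m.context.page_visible)
```

The scheduler cannot be written against PlatformRequest alone, which is the sense in which the low-level hook ends up needing ResourceLoader-level knowledge.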


Thanks,
Maciej





