Sandboxing npm packages


(Lars Willighagen) #1

A static safety-check for npm might not work, but – hear me out – a dynamic runtime one could. Perhaps. The idea may be a bit of a stretch, the implementation definitely is, and the usefulness is limited (as is the concept in and of itself). However, it could work, and I can’t have this idea sitting in the back of my head any longer.

The Idea

Permissions are applied per scope, a scope usually being the root directory of a package and everything below it. Exceptions are node_modules, which always has zero permissions, and possibly bin files. Each package in node_modules has its own set of permissions, although those could be inherited from parent packages (see “Usage”).

The Implementation

This is where tink comes in. See, tink implements its own require function. I think. If it doesn’t, @babel/register does, so it’s definitely possible. Anyway, if you redefine the require function you can implement some proxy logic in there that keeps track of the permissions of each scope. Then, by grabbing the filename of the require caller (which is a bit awkward, I know), you can determine the scope and what permissions it has. Of course, requiring a file or module that falls outside those permissions is not allowed, and a PermissionError can be thrown.

This beats any static analysis, as it should be pretty obfuscation-proof, and robust against edge cases if used with require.resolve(). To avoid workarounds, file-system and child_process access would be limited too, as well as possibly process and other global variables, as they may lead to other modules via constructors or prototypes.
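
Something like this rough sketch is what I have in mind; the permission table, scopeOf() and PermissionError below are all made up for illustration, not anything tink actually does:

```js
// Rough sketch only: the permission table, scopeOf() and PermissionError
// are hypothetical, not actual tink (or npm) APIs.
const Module = require('module');

class PermissionError extends Error {}

// Hypothetical permissions per scope: which builtins a package may load.
const permissions = new Map([
  ['some-parser', new Set(['path'])],        // no fs, no child_process
  ['some-fs-tool', new Set(['fs', 'path'])]
]);

// Placeholder: a real implementation would derive the scope from the
// caller's path under node_modules (and handle the edge cases).
function scopeOf(filename) {
  return filename && filename.includes('node_modules') ? 'some-parser' : null;
}

const originalRequire = Module.prototype.require;

Module.prototype.require = function (request) {
  const scope = scopeOf(this.filename); // `this` is the calling module
  if (scope !== null && Module.builtinModules.includes(request)) {
    const allowed = permissions.get(scope) || new Set();
    if (!allowed.has(request)) {
      throw new PermissionError(`${scope} may not require '${request}'`);
    }
  }
  // non-builtin requests would additionally be resolved (require.resolve)
  // and checked against the scope's file-system boundary here
  return originalRequire.call(this, request);
};
```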

Other problems include the fact that, of course, sensitive modules can be passed around in both directions (a package with filesystem access could simply hand the fs module to one without). I only realized this after writing this entire post. The only workaround for this I can think of is simply expecting people not to do that, for their own sake. So that probably invalidates my entire idea. Great.

The only thing I can remember that comes close to solving this is the sandbox of https://happening.im/, which, if I remember correctly, prevented passing around modules anywhere. However, that part of their code isn’t open source, the project is discontinued, the developers are gone, and I have no idea how they did it.

The Usage

Now comes the hard part: when should permissions be given? Packages can’t simply declare permissions themselves for a number of reasons:

  • it would have to be opt-in, kind of defeating the purpose
  • being opt-in, it wouldn’t stop truly malicious people
  • etc.

On the other hand, you can’t require (pun intended) users to sift through every dependency in their project to manually assign permissions to everything. A middle ground of having users assign permissions to top-level dependencies could work, but could also be the worst of both: giving too many permissions to sub-dependencies, and a trial-and-error process for the user. One good thing: in contrast to static checks, permissions aren’t required for optional packages that aren’t actually used.
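
For illustration, the kind of thing I imagine users writing, say in a hypothetical package-permissions.json (the format is completely made up):

```json
{
  "permissions": {
    "request": ["net"],
    "rimraf": ["fs", "path"],
    "left-pad": []
  }
}
```

Sub-dependencies would then inherit whatever their top-level parent was granted.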

The Use Cases

The use cases are pretty limited too, as mentioned. Properly containing malicious packages is somewhat doable for packages that don’t need any access. However, packages that legitimately need access to, say, the filesystem can probably work around the imposed limitations.


I don’t think this is worthy of an actual RFC, as it’s not even slightly realistic, and probably not something for npm anyway, so I just posted it here.


(David Gilbertson) #2

Interesting, I didn’t even think of tink when writing my post about permissions. As you mentioned in your note, I think the bullet points under “Packages can’t simply declare permissions themselves for a number of reasons” might be avoidable.

As it stands, what are the biggest hurdles you see with this?

I think the two factors that would most affect the success of this approach are the technical side and the incentives/mechanism design. If tink wraps require and fs, that’s a pretty good start on the technical side of things.


(Lars Willighagen) #3

The biggest hurdle I saw with this was finding an authoritative source of the scope of the requester, the scope being the dependency it’s part of. While determining the scope from a file path could still prove tricky (symlinks may act up, who knows?), I think I found an authoritative source of the file path: Module._load. So if that gets middleware applied before any client code is run, tink should be able to intercept any require calls and redirect/cancel them appropriately.
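
Roughly like this; Module._load is internal and undocumented, and isAllowedToLoad() is just a stand-in for the real permission lookup:

```js
const Module = require('module');

// Stand-in check for the sketch: code under node_modules may not
// load child_process at all.
function isAllowedToLoad(request, callerFile) {
  const insideDependency = Boolean(callerFile && callerFile.includes('node_modules'));
  return !(insideDependency && request === 'child_process');
}

const originalLoad = Module._load;

Module._load = function (request, parent, isMain) {
  // `parent` is the module doing the require; its filename is the closest
  // thing to an authoritative source of who is asking.
  const callerFile = parent && parent.filename;
  if (!isAllowedToLoad(request, callerFile)) {
    throw new Error(`${callerFile} is not permitted to load '${request}'`);
  }
  return originalLoad.call(Module, request, parent, isMain);
};
```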

Other hurdles of all sizes:

  • As mentioned, determining the scope. It may be as easy as taking the first few directories after the last node_modules (there’s a sketch of that below, after the footnote), but there should be no edge cases.
  • If this will be in tink it would be with the other overrides, and the permissions data has to get there somehow (but that’s just a minor inconvenience).
  • There are a lot of builtin modules, classifying them should be done carefully. child_process should be treated as full permissions AFAICT, but fs could – say – rewrite package-permissions.json(1), and there may be some other innocent module that exposes fs via prototypes.

(1) Although the wrapper could fix that with a special case.
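
For the path-parsing part, something along these lines could work; untested, and the edge cases (symlinks, nested node_modules, scoped packages) are exactly the worry:

```js
const path = require('path');

// Derive the scope (package name) from a file path: the first one or two
// directory names after the last node_modules segment.
function scopeOf(filename) {
  const parts = path.normalize(filename).split(path.sep);
  const i = parts.lastIndexOf('node_modules');
  if (i === -1 || i + 1 >= parts.length) return null; // application code
  const name = parts[i + 1];
  // @scope/name packages occupy two directory levels
  return name.startsWith('@') && parts[i + 2]
    ? `${name}/${parts[i + 2]}`
    : name;
}

scopeOf('/app/node_modules/lodash/index.js');          // 'lodash'
scopeOf('/app/node_modules/@babel/core/lib/index.js'); // '@babel/core'
scopeOf('/app/src/index.js');                          // null (not a dependency)
```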


(Lars Willighagen) #4

Unfortunately, after some testing, it seems that modules can change their reported filename, simply with:

```js
module.filename = '/home/user/index.js'
```
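
Any check that trusts the caller’s filename, including the parent.filename that Module._load receives (it’s the same module object), would presumably then see the spoofed path:

```js
// in some malicious dependency:
module.filename = '/home/user/index.js'; // pretend to be application code

// a permission check keyed to the caller's filename now sees the spoofed
// path and would grant this code application-level permissions
require('fs');
```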

(Kat Marchán) #5

I’ve been poking at something very similar to this. We definitely need something like tink to pull it off, because it involves patching the module system and the modules themselves. I’ll get back to you after the holidays and after I’ve had actual time to work on things, but it mostly seems to be feasible – the problem being access to module.require and the module cache, which would instantly allow any module to break out of its sandbox, but we can work on that.
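
For example, if only the injected require function were wrapped, the module object itself still offers escape hatches, roughly:

```js
// escape hatches on the module object, assuming only the injected
// `require` function has been wrapped:
const viaModule = module.require('fs');                   // Module.prototype.require
const viaLoader = module.constructor._load('fs', module); // Module._load directly
const viaCache  = require.cache;                          // already-loaded modules' exports
```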

tl;dr you probably need to pass in custom module objects. I imagine a SandboxedModule sort of thing, and that will involve deep patching of Node’s module system.
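
Very roughly the shape I’m imagining, entirely hypothetical and with a made-up makeSandboxedModule name:

```js
// Hypothetical sketch: sandboxed code would get this object as its
// `module`/`require` instead of the real ones, so there is no path back
// to Module, the raw loader, or the module cache.
function makeSandboxedModule(realModule, allowed) {
  return {
    id: realModule.id,
    filename: realModule.filename,
    exports: realModule.exports,
    require(request) {
      if (!allowed.has(request)) {
        throw new Error(`'${realModule.id}' may not require '${request}'`);
      }
      return realModule.require(request);
    }
    // deliberately missing: parent, children, constructor (Module), cache
  };
}
```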


(Lars Willighagen) #6

:partying_face:

Of course, these were just some thoughts.


(Maël Nison) #7

We’ve also studied this. My conclusion was that it cannot be done in a safe way unless an actual sandbox mechanism is available in Node. The vm module looks like a good match, but it is not a sandbox, and as such is not yet acceptable from a security perspective for running untrusted code. There are alternatives (vm2), but they have shortcomings that make them unsuitable for widespread use.
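
To be concrete about why vm is not a sandbox: the context’s globals are created in the host realm, so the well-known escape just walks the constructor chain back out:

```js
const vm = require('vm');

// classic escape: `this` inside the context is an object from the host
// realm, so its constructor chain reaches the host's Function constructor,
// which evaluates code outside the "sandbox"
const untrusted = 'this.constructor.constructor("return process")()';
const hostProcess = vm.runInNewContext(untrusted, {});

console.log(hostProcess === process); // true: the code got the real process object
```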

Note that security isn’t only about preventing require (even if it’s the most basic need) - it also covers side-channel attacks and a lot of other APIs. One such example is timeout handlers - a malicious package could cancel all registered handlers with ease, which would probably not be intended. Something similar could happen with file descriptors - even if the assumption could be that fs access means all access.

I’d really like to see it available though, so feel free to experiment, but imo the best way to achieve this is to also talk with the Node committees, because they’ll have the bulk of the work to do (on the bright side, I’ve already spoken to some of them about this, and I know it’s something they have on their plate).


(Kat Marchán) #8

What made it unsuitable for widespread use?


(Joel Edwards) #9

I was looking into how to enforce a minimal runtime with the recent introduction of npx cards, and I came across a new package which attempts to do something similar. This is reminiscent of Java’s Project Jigsaw.