Sandboxing npm packages
A static safety-check for npm might not work, but – hear me out – a dynamic runtime one could. Perhaps. The idea may be a bit of a stretch, the implementation definitely is, and the usefulness is limited (as is the concept in and of itself). However, it could work, and I can’t have this idea sitting in the back of my head any longer.
Permissions are applied through the scope of a set of files, which is usually limited to the root directory of a package and downwards. Exceptions are `node_modules`, which always has zero permissions, and possibly bin files. Each package in `node_modules` has its own set of permissions, although those could be inherited from parent packages (see “Usage”).
This is where `tink` comes in. See, tink implements its own `require` function. I think. If it doesn’t, `@babel/register` does, so it’s definitely possible. Anyway, if you redefine the `require` function you can implement some proxy logic in there that keeps track of the permissions of each scope. Then, by grabbing the filename of the `require` caller (which is a bit awkward, I know), you can determine the scope and what permissions it has. Of course, `require`-ing a file or module that falls outside those permissions is not allowed, and a `PermissionError` can be thrown.
This beats any static analysis, as this should be pretty obfuscation-proof, and safe against edge cases if used with `require.resolve()`. To avoid workarounds, file-system and `child_process` access would be limited too, as well as possibly `process` and other global variables, as they may lead to packages via constructors or prototypes.
Other problems include the fact that, of course, sensitive modules can be passed in both directions. I only realized this after writing this entire post. The only workaround for this I can think of is simply expecting people not to do that, for their own sake. So that probably invalidates my entire idea. Great.
The only thing I can remember that comes close to solving this is the sandbox of https://happening.im/, which, if I remember correctly, prevented passing around modules anywhere. However, that part of their code isn’t open source, the project is discontinued, the developers are gone, and I have no idea how they did it.
Now comes the hard part: when should permissions be given? Packages can’t simply declare permissions themselves for a number of reasons:
- it would have to be opt-in, kind of defeating the purpose
- being opt-in, it wouldn’t stop truly malicious people
On the other hand, you can’t require (pun intended) users to sift through every dependency in their project to manually assign permissions to everything. A middle ground of having users assign permissions to top-level dependencies could work, but could also be the worst of both: giving too many permissions to sub-dependencies, and a trial-and-error process for the user. One good thing: in contrast to static checks, permissions aren’t required for optional packages that aren’t actually used.
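For illustration only, top-level permission assignment might look something like this hypothetical `package-permissions.json` (the package names and capability keys are invented):

```json
{
  "lodash":   { "fs": false, "child_process": false, "net": false },
  "rimraf":   { "fs": true,  "child_process": false, "net": false },
  "node-gyp": { "fs": true,  "child_process": true,  "net": true }
}
```

Sub-dependencies would then inherit these grants unless narrowed further, which is exactly where the over-granting problem bites.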
The Use Cases
The use cases are pretty limited too, as mentioned. Proper safe-keeping of malicious packages is somewhat doable for packages that don’t need any access. However, packages that legitimately need access to, say, the filesystem can probably work around the imposed limitations.
I don’t think this is worthy of an actual RFC, as it’s not even slightly realistic, and probably not something for npm anyway, so I just posted it here.
Interesting, I didn’t even think of tink when writing my post about permissions. As you mentioned in your note, I think the bullet points under “Packages can’t simply declare permissions themselves for a number of reasons” might be avoidable.
As it stands, what are the biggest hurdles you see with this?
I think the two factors that would most affect the success of this approach are the technical side and the incentives/mechanism design. If tink wraps `fs`, that’s a pretty good start to the technical side of things.
The biggest hurdle I saw with this was finding an authoritative source of the scope of the requester, scope being the dependency it’s part of. While determining the scope from a file path could still prove tricky (symlinks may act up, who knows?), I think I found an authoritative source of the file path: `Module._load`. So if that gets middleware before any client code is run, `tink` should be able to intercept any `require` calls and redirect/cancel them appropriately.
Other hurdles of all sizes:
- As mentioned, determining the scope. It may be as easy as getting the first few directories after the last `node_modules`, but there should be no edge cases.
- If this will be in `tink`, it would be with the other overrides, and the permissions data has to get there somehow (but that’s just a minor inconvenience).
- There are a lot of builtin modules; classifying them should be done carefully. `child_process` should be treated as full permissions AFAICT, but `fs` could – say – rewrite `package-permissions.json`(1), and there may be some other innocent module that exposes…
(1) Although the wrapper could fix that with a special case.
Unfortunately, after some testing, it seems that files can change their filename, simply with:

```javascript
module.filename = '/home/user/index.js'
```
I’ve been poking at something very similar to this. We definitely need something like `tink` to pull it off, because it involves patching the module system and the modules themselves. I’ll get back to you after the holidays and after I’ve had actual time to work on things, but it mostly seems to be feasible – the problem being access to `module.require` and the module cache, which would instantly allow any module to break out of its sandbox, but we can work on that.
tl;dr you probably need to pass in custom `module` objects. I imagine a `SandboxedModule` sort of thing, and that will involve deep patching of Node’s module system.
Of course, these were just some thoughts.
We’ve also studied this. My conclusion was that it cannot be done in a safe way unless an actual sandbox mechanism is available in Node. The `vm` module looks like a good match, but it is not a sandbox and as such is not yet acceptable from a security perspective for running untrusted code. There are alternatives (`vm2`), but they have shortcomings that make them unsuitable for widespread use.
Note that security isn’t only about preventing `require` (even if it’s the most basic need) - it also covers side-channel attacks and a lot of other APIs. One such example is timeout handlers - a malicious package could cancel all registered handlers with ease, which would probably not be intended. Something similar could happen with file descriptors - even if the assumption could be that `fs` access means all access.
I’d really like to see it available though, so feel free to experiment, but imo the best way to achieve this is to also talk with the Node committees, because they’ll have the bulk of the work to do (on the bright side, I’ve already spoken to some about this, and I know it’s something they have on their plate).
What made it unsuitable for widespread use?