Normalization of dependencies across multiple projects via caches

Hi everyone, first post here!

I apologise if this has been raised before, but I can’t find any existing threads here or on GitHub that cover this topic. If this idea has already been debated, I’d appreciate it if someone could link the discussion so I can read it!

The concept is quite simple: many developers work on many projects on a given machine. Their dependency trees overlap heavily, because a significant number of modules are used very frequently across the ecosystem, even if only as transitive dependencies. With each project keeping its own copy of those modules, it seems like there may be a better alternative for machines like these.

I propose a new mechanism which doesn’t copy dependencies into each project, but instead links to them in the cache, laid out in the same hierarchy a normal install would produce. There is already a cache which stores each module by version, so it seems like it would be a matter of linking modules rather than copying them (which sounds like it would have limited impact on the codebase).
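To make the idea concrete, here is a minimal sketch in Python (chosen only for brevity; it is not how npm works today). It assumes a hypothetical cache laid out as `<cache>/<name>/<version>/` holding unpacked modules, which is an assumption for illustration and not necessarily how npm's cache is organised. The `link_dependencies` helper is likewise made up for this post. It only links top-level entries and does not try to solve how transitive dependencies would resolve.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def link_dependencies(cache_dir: Path, project_dir: Path, resolved: Dict[str, str]) -> List[Path]:
    """Symlink each resolved dependency from a shared cache into node_modules.

    `resolved` maps a module name to the exact version to link.
    """
    node_modules = project_dir / "node_modules"
    links = []
    for name, version in sorted(resolved.items()):
        # Hypothetical cache layout: <cache>/<name>/<version>/
        source = cache_dir / name / version
        if not source.is_dir():
            raise FileNotFoundError(f"{name}@{version} is not in the cache")
        target = node_modules / name
        # Creates node_modules/ and, for scoped names, the @scope/ directory.
        target.parent.mkdir(parents=True, exist_ok=True)
        if target.is_symlink():
            target.unlink()  # replace a stale link left by an earlier run
        # Raises FileExistsError if a real (copied) directory is already there.
        target.symlink_to(source, target_is_directory=True)
        links.append(target)
    return links


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        cache = root / "cache"
        (cache / "left-pad" / "1.3.0").mkdir(parents=True)
        # Two projects end up pointing at the same directory on disk.
        for project in ("project-a", "project-b"):
            for link in link_dependencies(cache, root / project, {"left-pad": "1.3.0"}):
                print(link, "->", link.resolve())
```

Two projects that resolve the same version end up sharing a single directory on disk, which is where the space and install-time savings would come from. Note that creating symlinks on Windows may need extra privileges, so a real implementation would have to handle that platform differently.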

Originally I was planning on writing a standalone tool for this, but I reconsidered as it’s probably desirable enough to warrant inclusion in npm. It’s also unclear to me how it would be built outside of npm itself. There are a few ways it could be supported:

  • Making this the default behaviour in the next major version
  • Adding a switch to npm install to enable it
  • Adding an alternative command (much like npx) to make the difference explicit to the developer

I don’t think changing the default (at least right now) would be a good move. If it is going to exist in npm itself, I think a switch would be the better option for now and maybe we can revisit it as a default in future after gathering some more data points. I’m not sure how many people rely on things like versioning their modules in Git, for example.

I’m torn between a switch to npm install and a new command (maybe npc?), so I would like to hear thoughts on this. npc install is a single-character switchover (assuming it supports the same options as npm install), whereas a switch is usually a little more annoying to type out repeatedly. I also don’t know the development overhead of adding an npc command versus a switch; I’d assume a new command is more complicated.

In general, this feature would have several benefits:

  • Minimal space overhead when working on multiple projects
  • Typically faster installation phases as most copying is avoided (this is particularly handy on CI systems)
  • All dependency disk usage is located in one place, making it much easier to exclude from things like drive backups

If there’s any interest in adding this, I can try to get some statistics from the current state of my daily machine to gauge “typical” overlap and space savings. I’m also willing to contribute to any implementation (although I might need some hand-holding initially).
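As a rough idea of how such statistics could be gathered, the sketch below walks the node_modules folders of a few projects and reports how many bytes belong to repeated copies of the same name@version. The function names are made up for this post, and it rests on two simplifying assumptions: two installs with the same name and version have identical contents, and logical file sizes are a good enough proxy for disk usage.

```python
import json
import os
import sys
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


def iter_packages(node_modules: Path) -> Iterator[Path]:
    """Yield every installed package directory, including nested installs."""
    for entry in sorted(node_modules.iterdir()):
        # Skip .bin and other dot entries, plain files, and symlinked packages.
        if entry.name.startswith(".") or entry.is_symlink() or not entry.is_dir():
            continue
        # An @scope directory holds packages one level further down.
        if entry.name.startswith("@"):
            candidates = [p for p in sorted(entry.iterdir()) if p.is_dir() and not p.is_symlink()]
        else:
            candidates = [entry]
        for pkg in candidates:
            yield pkg
            nested = pkg / "node_modules"
            if nested.is_dir():
                yield from iter_packages(nested)


def read_identity(pkg: Path) -> Optional[Tuple[str, str]]:
    """Return (name, version) from package.json, or None if unreadable."""
    try:
        meta = json.loads((pkg / "package.json").read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None
    if not isinstance(meta, dict):
        return None
    name, version = meta.get("name"), meta.get("version")
    if isinstance(name, str) and isinstance(version, str):
        return name, version
    return None


def package_size(pkg: Path) -> int:
    """Sum file sizes in a package, excluding its nested node_modules."""
    total = 0
    for root, dirs, files in os.walk(pkg):
        dirs[:] = [d for d in dirs if d != "node_modules"]  # counted separately
        for filename in files:
            path = os.path.join(root, filename)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def overlap_report(projects: Iterable[Path]) -> Dict[str, int]:
    """Estimate how many bytes are spent on repeated copies of name@version."""
    copies: Dict[Tuple[str, str], List[int]] = defaultdict(list)
    for project in projects:
        node_modules = Path(project) / "node_modules"
        if not node_modules.is_dir():
            continue
        for pkg in iter_packages(node_modules):
            identity = read_identity(pkg)
            if identity is not None:
                copies[identity].append(package_size(pkg))
    total = sum(sum(sizes) for sizes in copies.values())
    unique = sum(max(sizes) for sizes in copies.values())  # keep one copy of each
    return {
        "total_bytes": total,
        "unique_bytes": unique,
        "duplicated_bytes": total - unique,
        "duplicated_copies": sum(len(sizes) - 1 for sizes in copies.values()),
    }


if __name__ == "__main__":
    # Usage: python overlap.py path/to/project-a path/to/project-b ...
    print(json.dumps(overlap_report(Path(p) for p in sys.argv[1:]), indent=2))
```

The `duplicated_bytes` figure is only an upper-bound style estimate of what linking could save, since it ignores filesystem block sizes and any packages whose contents differ despite sharing a version.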

Thank you all for reading, please let me know your thoughts!

For your interest, there is a project in flight covering a lot of this: tink FAQ: a Package Unwinder for JavaScript