package-lock.json is not complete on first run for some modules.

What I Wanted to Do

Run npm install and not see any changes to the package-lock.json file recently been committed to git and no new dependencies have been introduced.

What Happened Instead

The package-local.json file should not have changed on the second run for npm install. The first time I installed nodemon the loc file should have been fully generated with all the required fields set.

But instead it generated this diff

diff --git a/package-lock.json b/package-lock.json
index c22a22c..f71c6f9 100644
--- a/package-lock.json
+++ b/package-lock.json
@@ -288,11 +288,13 @@
         },
         "balanced-match": {
           "version": "1.0.0",
-          "bundled": true
+          "bundled": true,
+          "optional": true
         },
         "brace-expansion": {
           "version": "1.1.11",
           "bundled": true,
+          "optional": true,
           "requires": {
             "balanced-match": "^1.0.0",
             "concat-map": "0.0.1"
@@ -305,15 +307,18 @@
         },
         "code-point-at": {
           "version": "1.1.0",
-          "bundled": true
+          "bundled": true,
+          "optional": true
         },
         "concat-map": {
           "version": "0.0.1",
-          "bundled": true
+          "bundled": true,
+          "optional": true
         },
         "console-control-strings": {
           "version": "1.1.0",
-          "bundled": true
+          "bundled": true,
+          "optional": true
         },
         "core-util-is": {
           "version": "1.0.2",
@@ -416,7 +421,8 @@
         },
         "inherits": {
           "version": "2.0.3",
-          "bundled": true
+          "bundled": true,
+          "optional": true
         },
         "ini": {
           "version": "1.3.5",
@@ -426,6 +432,7 @@
         "is-fullwidth-code-point": {
           "version": "1.0.0",
           "bundled": true,
+          "optional": true,
           "requires": {
             "number-is-nan": "^1.0.0"
           }
@@ -438,17 +445,20 @@
         "minimatch": {
           "version": "3.0.4",
           "bundled": true,
+          "optional": true,
           "requires": {
             "brace-expansion": "^1.1.7"
           }
         },
         "minimist": {
           "version": "0.0.8",
-          "bundled": true
+          "bundled": true,
+          "optional": true
         },
         "minipass": {
           "version": "2.2.4",
           "bundled": true,
+          "optional": true,
           "requires": {
             "safe-buffer": "^5.1.1",
             "yallist": "^3.0.0"
@@ -465,6 +475,7 @@
         "mkdirp": {
           "version": "0.5.1",
           "bundled": true,
+          "optional": true,
           "requires": {
             "minimist": "0.0.8"
           }
@@ -537,7 +548,8 @@
         },
         "number-is-nan": {
           "version": "1.0.1",
-          "bundled": true
+          "bundled": true,
+          "optional": true
         },
         "object-assign": {
           "version": "4.1.1",
@@ -547,6 +559,7 @@
         "once": {
           "version": "1.4.0",
           "bundled": true,
+          "optional": true,
           "requires": {
             "wrappy": "1"
           }
@@ -652,6 +665,7 @@
         "string-width": {
           "version": "1.0.2",
           "bundled": true,
+          "optional": true,
           "requires": {
             "code-point-at": "^1.0.0",
             "is-fullwidth-code-point": "^1.0.0",

Reproduction Steps

Run this script

#!/usr/bin/env bash

mkdir test-dir
cd test-dir
npm init --yes
npm intall nodemon@1.11.0 --save
git init .
git add package.json package-lock.json
git commit -m 'Initial commit'
rm -Rf node_modules
npm install
git status

Details

Platform Info

$ npm --versions
6.4.1
$ node --version
8.11.4
$ node -p process.platform
linux
2 Likes

Possibly related to package-lock.json changes from one `npm install` to the next

I can more or less explain what’s going on. I can reproduce it on Linux this way:

$ echo {} > package.json
$ npm i chokidar   # (1)
$ git init
$ echo /node_modules > .gitignore
$ git add .
$ git ci -av -m 'Initial commit'

$ rm -rf node_modules
$ npm i   # (2)
// package-lock.json changed

$ git ci -av -m c2
$ rm -rf node_modules
$ npm i   # (3)
// package-lock.json changed

$ git ci -av -m c3
$ rm -rf node_modules
$ npm i   # (4)
// no changes at last

The important thing here is that fsevents is an optional dependency of chokidar. And that fsevents doesn’t work under Linux. Then, what is an optional package to npm? Basically, the one which reverse dependencies are optional. Although, there’s a line I don’t really understand. If a package was seen during the lookup, it’s treated as nonoptional.

Let’s take balanced-match. It becomes optional on second npm i run. On first run:

+ /fsevents/balanced-match
	swOptional: false
    + /fsevents/brace-expression
        fakeChild: false
		swOptional: false
        + /fsevents/minimatch
            fakeChild: false
			swOptional: false
            + /fsevents/glob
                fakeChild: false
				swOptional: false
                + /fsevents/rimraf
                    fakeChild: false
					swOptional: false
                    + /fsevents/node-pre-gyp
                        fakeChild: false
						swOptional: false
                        + /fsevents
                            fakeChild: false
							swOptional: false
                            + /chokidar
                                fakeChild: false
								return true since isOptDep
                            return true
                        return true
                    return true
                return true
            + /fsevents/ignore-walk
                fakeChild: false
				swOptional: false
                + /fsevents/npm-packlist
                    fakeChild: false
					swOptional: false
                    + /fsevents/node-pre-gyp
                        fakeChild: false
                        return false since already seen
						...

Here, node-pre-gyp is seen twice, so the end result is false.

swOptional is an optional status taken from package-lock.json. isOptDep checks if dependency is optional according to corresponding package.json.

fakeChild is information about the package, taken from package-lock.json. So on first run, no package has fakeChild. On second and later runs, fakeChild field is added to each package. But then it’s reset on refresh-package-json step.

But if the package fails, e.g. owing to unsupported platform. Like fsevents does on Linux. No further actions are performed for such package and its dependencies. Particularly, refresh-package-json is not executed for such packages, and fakeChild is not set to false.

The interesting thing is that package’s dependencies that were processed before the package itself get processed by refresh-package-json, and have their fakeChild set to false. As opposed to those that are processed after the package. So some fsevents’ dependencies don’t have this property.

So, if a package fails, and therefore has fakeChild field, npm is forced to use data from package-lock.json.

Hopefully, I’m not far too off with my interpretations.

On second run:

+ /fsevents/balanced-match
	swOptional: false
    + /fsevents/brace-expression
        fakeChild: false
		swOptional: false
        + /fsevents/minimatch
            fakeChild: false
			swOptional: false
            + /fsevents/glob
                fakeChild: true
				swOptional: true
                + /fsevents/rimraf
                    fakeChild: true
					return true since fakeChild && swOptional
                return true
            + /fsevents/ignore-walk
                fakeChild: false
				swOptional: true
                + /fsevents/npm-packlist
                    fakeChild: false
					swOptional: true
                    + /fsevents/node-pre-gyp
                        fakeChild: true
						return true since fakeChild && swOptional
                    return true
                return true
            return true
        return true
    return true

Here, swOptional is true for /fsevents/glob, and no package is seen twice, so the result is true.

So, after first run /fsevents/glob is marked as optional. That prevents node-pre-gyp from being seen twice. And balanced-match gets optional status.

Let’s take ansi-regex, that becomes optional on third run. On first run:

+ /fsevent/ansi-regex
	swOptional: false
    + /fsevents/strip-ansi
        fakeChild: false
		swOptional: false
        + /fsevents/gauge
            fakeChild: false
			swOptional: false
            + /fsevents/npmlog
                fakeChild: false
				swOptional: false
                + /fsevents/node-pre-gyp
                    fakeChild: false
					swOptional: false
                    + /fsevents
                        fakeChild: false
						swOptional: false
                        + /chokidar
                            fakeChild: false
							return true since isOptDep
                        return true
                    return true
                return true
            return true
        + /fsevents/string-width
            fakeChild: false
			swOptional: false
            + /fsevents/gauge
                fakeChild: false
                return false since already seen
            + /fsevents/wide-align

The result is false since /fsevents/gauge was seen twice. On second run:

+ /fsevent/ansi-regex
    swOptional: false
    + /fsevent/strip-ansi
		fakeChild: true
        swOptional: false
        + /fsevent/gauge
			fakeChild: true
            swOptional: true
            + /fsevents/npmlog
				fakeChild: true
				return true since fakeChild && swOptional
            return true
        + /fsevent/string-width
            fakeChild: true
            swOptional: false
            + /fsevent/gauge
                fakeChild: true
                return false since already seen
            + /fsevent/wide-align

The result is false since /fsevents/gauge was seen twice again. On third run:

+ /fsevent/ansi-regex
    swOptional: false
    + /fsevent/strip-ansi
        fakeChild: true
        swOptional: false
        + /fsevent/gauge
            fakeChild: true
            swOptional: true
            + /fsevents/npmlog
                fakeChild: true
				return true since fakeChild && swOptional
			return true
        + /fsevent/string-width
            fakeChild: true
            swOptional: true
            + /fsevent/gauge
                fakeChild: true
				return true since fakeChild && swOptional
            + /fsevent/wide-align
                fakeChild: true
				return true since fakeChild && swOptional
            return true

This time string-width is optional according to package-lock.json, and that prevents /fsevents/gauge from being seen twice. So the result is true.

So, only after gauge (first run), then string-width (second run) become optional, gauge is not seen twice, and the result is true.

1 Like

Nice work! Based on your diagnosis, I think this is related to this issue:

I believe the weird line you mentioned (this one) checks in the first part for circular dependencies (and the second part is explained in the comment above it). The problem with that check is it has side effects for other “paths” from the module being checked upwards to package.json. So when packages are required by an optional package in two different ways, it’s always marked as seen on the second go, and therefore marked as non-optional.

My theory is that on the second install, the package-lock.json already knows that some packages are optional, and the packages that are required in multiple ways are all children of optional packages. As a result, they are all recognized as optional. With this theory, there may even be more than two installs necessary for more complicated dependency graphs.

Anyway, I have a solution for the side effect issue that seems to work. Running the command list, I can’t reproduce the issue anymore.

1 Like

Good point.

That’s what my examples demonstrate. Optional status gradually (with each run) propagates to fsevents’ dependencies.

So, you need optional package that fails (some fakeChilds are left). And several paths between the package and its dependency A, where package B is seen twice or more.

@larsgw Do note that fakeChild plays important role in all this. If not for it, isOptional would have to inspect parents all the way up to fsevents, and no optional status would propagate.

It takes 4 runs to install chokidar on Linux :)

And it might make sense to fix the above-mentioned inconsistency, that some fsevents’ dependencies have their fakeChild set to false. From what I can see, the idea is that if package fails, the package itself and its dependencies are not to have fakeChild reset.

1 Like

Let me add a couple more ways to reproduce the issue I can think of.

  1. On Linux/Windows:
$ npm init --yes
$ npm i chokidar
$ echo /node_modules > .gitignore; git init; git add .; git commit -m c1
$ git push ...

Then

$ git clone ...
$ npm i
$ git diff   # extra `optional: true` lines
  1. On Linux/Windows:
$ npm init --yes
$ npm i chokidar
$ rm -rf node_modules
$ npm i
$ echo /node_modules > .gitignore; git init; git add .; git commit -m c1
$ git push ...

Then on MacOS:

$ git clone ...
$ npm i
$ git diff   # removed `optional: true` lines

UPD Now that I think about it, fsevents should be an optional dependency regardless of the platform. So second case either won’t reproduce the problem, or would have a different cause. Although it makes sense to check.

UPD On third thought, on MacOS fsevents wouldn’t fail. So no fakeChilds left. And optional: true lines should be removed according to the way it works now. So second case should reproduce the problem as well.

UPD Actually any install or probably update under Linux/Windows until all optional: true has propagated to all the dependencies should reproduce the issue. And any install/upgrade under MacOS if package-lock.json contains propagated optional dependencies.

I should have probably read that better then. :slight_smile:

But it doesn’t have to propagate in that way (through multiple runs) if the optional status is determined correctly in the first run, i.e. when seen.has(node) works properly, right? Or am I missing something again?

Well, I had a hard time explaining it. Had a couple of misconceptions I later corrected. So you were totally in a position to miss something :)

I think it’s as you say. With your fix fakeChilds shouldn’t affect the optional status. And it should be determined correctly on the first run.

It might make sense to do something about fakeChild being set to false for some dependencies. But that’s probably out of scope of this issue. And might not have any practical implications.

@larsgw I confirm that your change https://github.com/npm/cli/pull/76 fixes this issue for me. I have the issue with npm 6.4.1.

3 Likes

@larsgw awesome work on this! Can also confirm that this fixes an issue we are seeing in Dependabot where our build server runs on Linux: https://github.com/dependabot/feedback/issues/197

3 Likes

The impact of this problem is even worse than x-yuri demonstrated. It may appear that package-lock.json has stabilized after running rm -rf node_modules; npm install four times. However, after doing this, running npm install without first removing node_modules sends us back to the beginning of the cycle.

$ echo {} > package.json
$ npm install chokidar
$ shasum package-lock.json
0a71085202680435b34b4d80bdea25211e59cee5  package-lock.json
$ rm -rf node_modules
$ npm install
$ shasum package-lock.json
735cc900821a0154c8c41a832919bbb6348989e7  package-lock.json
$ rm -rf node_modules
$ npm install
$ shasum package-lock.json
6abe51d4a08c42c43ac5197ca2e616cb0b47533f  package-lock.json
$ rm -rf node_modules
$ npm install
$ shasum package-lock.json
6abe51d4a08c42c43ac5197ca2e616cb0b47533f  package-lock.json
$ npm install
$ shasum package-lock.json
0a71085202680435b34b4d80bdea25211e59cee5  package-lock.json

This infinite cycle of churn interferes with the ability to usefully commit package-lock.json to Git.

I can confirm that npm/cli#76 fixes this (you only get the third version, which is presumably correct).

I have a question that might be silly since I do not know all the ins and outs of npm and how issues get fixed and merged into npm.

What exactly is stopping npm/cli#76 ?
I’m mostly asking so that we can start the discussion on how to solve those issues and get the package-lock.json file to be stable.

At the moment it’s a review from @iarna.

1 Like

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

This is really helpful analysis, thank you.

I’ve been spending the last month or so combing through all of the tree-generating and tree-munging logic in lib/install/*.js. My initial goal was to figure out the right entry point to implement workspaces and figure out why my manual symbolic links sometimes get their deps munged. My suspicion was that workspaces would never be possible as long as links weren’t properly handled.

As it turns out, I uncovered one performance issue and three very interesting problems, with a single (albeit my-time-expensive) solution.

  1. The perf issue was not really a problem, but more of an optimization opportunity. fs.realpath was being called on every module folder at ever step in the process of loading a node_modules tree. For deep package trees with lots of deps, this results in tens of thousands of fs.lstat calls on the same folders repeatedly. This is fixed in npm 6.10.0.

  2. The first real problem (and the root of the others) is that npm’s read-package-tree module loads children of a Link with the Link node itself as the parent. This is a design bug dating back at least to npm v1.0, but until npm was deduplicating by default, it never caused any problems.

    This results in some very strange behavior around dependency analysis, because npm thinks that a module could be a dep resolution even when node.js would never load it, by virtue of the realpath in its module loading algorithm. Also, since those nodes appear to be packages in our tree, and since their dependencies would thus appear to be resolved, npm was pruning out packages from the linked target folder, thus breaking the module being linked! There is some code to work around this (grep for isInLink to find some of it), and so simply updating the source of the glitch caused a lot of problems.

  3. The second problem is what’s described here. The shrinkwrap-inflating and node_modules-reading logic is not identical, resulting in cases where optional and dev flags are set inconsistently, based on whether a package’s metadata came from disk or from the package-lock.json.

  4. The third problem is that there’s interestingly confusing behavior around how packages get flagged if they are in both the dev and optional trees, requiring rewalks in many cases where a single walk through the dep graph should be sufficient. Consider a dependency graph like this:

     root
     +-- opt --> a
     |           +--> b (deduped)
     +-- dev --> d
                 +--> b
    

    In this case, it’s clear that a should have optional: true. And, d should have dev: true. However, because it’s not “only dev” and not “only optional”, b gets neither flag set. In order to make prune or install work properly when neither dev nor optional deps should be included (ie, --no-optional --only=production), extra work has to be done.

My conclusion at this time (which was, as I understand it, the intention of Kat and Rebecca prior to leaving the project) is that the logic to read/munge/reify package trees needs to be separated out from the CLI itself, so that the “installer” can just point the tree munger at a folder, make a request for some kind of transformation (add a package, prune, update, etc.) and then listen to events and update the logger.

So, I’m knee-deep in this work right now, over at https://github.com/npm/arborist. (The original code name for this module was shishigami, as in the spirit of the forest, but I figured it is less problematic, if less cute, to name it after my own culture’s word for a tree-doctor, rather than appropriate a term from a spiritual tradition that may have connotations or implications I’m not familiar with.)

I’m working on a more thorough and thoughtful roadmap post for v7, but with this work underway, my thinking now is that the tl;dr will be: workspaces, dropping node 6 support, fixing this exact weirdness around optional/dev flags, and no longer mucking with linked package dependencies.

Also, we should just write a yarn.lock file, and respect it if it’s there, because why not? It’s not a very complicated file format, and it’d be nice if mixed yarn/npm teams could just go on with their lives in peace and use whatever tool they want, or switch between them without getting scolded. Let it start with us. I could almost do it with arborist right now, but I need to get the shasum (not sha512sum) for package tarballs, which isn’t super handy right now. But it’s in the packument, so it’s not hard to get once arborist is actually fetching data.

Sorry for the long reply. I should probably turn this into a blog post :slight_smile:

2 Likes