Entrypoint Hooks (carry over discussion from Austin Collab Summit) #43384

jasnell · 2022-06-11T20:03:58Z

jasnell
Jun 11, 2022
Maintainer

Discussion moved to #43408 ... where it will get more visibility.

At the Austin Collaborator Summit, there was significant discussion around the need for a more well-defined startup lifecycle with a clearer boundary between the preload phase and the loading/evaluation of the user entry point. The use cases include more reliable handling for APMs, dynamic transpilers, diagnostic tooling, and more. I took the task of working up an initial proposal. Here is that proposal:

Entrypoint Hooks

Currently, the Node.js startup process consists of a single bootstrap phase in which the Node.js core internal mechanisms and environment are set up followed by the loading and instantiation of the user-provided entry point script.

  == Node.js Bootstrap == Preloads == User Entry point Script == Start Event Loop ==>

The User Entry point here is the script that is provided as the argument to the node binary (e.g. in node foo.js, foo.js is the User Entry point).

Historically with Node.js, there have always been scenarios where it is desirable to load and run code before the User Entry point performs any actions. This can be accomplished with several methods:

Require-first: By strategically positioning require and import statements at the beginning of the User Entry point so that they are loaded and evaluated before anything else. This is the mechanism typically used by many Node.js APMs.
Wrapper/Loader: By using an alternative user entry point that ensures that certain code is loaded and evaluated first before the actual user entry point is loaded. This is the mechanism typically used by certain test frameworks and by serverless environments like Lambda.
Preloads: By using the Node.js -r command-line argument, Node.js can be instructed to load and evaluate one or more CommonJS scripts synchronously before loading and evaluating the user entry point script. This is used, for instance, by tools like Node.js Clinic to preload diagnostic tooling into the Node.js process.
Module Loaders: By providing an alternative module loader implementation using the still experimental loader API, it is possible to execute startup code the first time a module is loaded – including the user entry point module. This is the mechanism used by tools such as ts-node, for instance.

While each of these has historically been effective, they all suffer from a number of limitations, not the least of which is the lack of a clear separation between the execution of the preload code and the user entry point. Take, for instance, the following example:

Imagine a preload script with a single line of code:

// preload.js
setImmediate(() => console.log('preload'));

And a User Entrypoint script with the following:

// entry.js
console.log('entrypoint');

Now run the node binary as:

node -r ./preload.js entry.js

The order of the statements printed will be:

entrypoint
preload

This is because, while the preload script does run before the entry point script, it schedules async activity that is not invoked until the event loop has started, which happens only after the entry point script has been evaluated. While waiting for the preload script to complete, a lot of user code can run.

In other words, while there is a clear boundary at which preload can begin, there is no such boundary for when preload completes.

This is a proposal for establishing a clearer lifecycle boundary.

Proposal

In the proposed new model, a new Entrypoint Hook phase is introduced into the Node.js startup following the completion of the bootstrap. During the Entrypoint Hook phase, one or more preload scripts can be loaded and evaluated in a user-defined order, in precisely the same way that preload scripts (using the -r argument) are loaded today, with one very important distinction: immediately after loading and evaluating these preload scripts, the Node.js event loop is started so that any asynchronous operations they initiated can run to completion. When that first run of the event loop has no further async tasks to complete, the Entrypoint Hook phase is considered complete, the event loop is reset, and the user entry point is loaded and evaluated, continuing the Node.js startup just as it does today. If there are no preload scripts to run, this entire new phase is skipped.

  == Node.js Bootstrap == Preloads == Run Event Loop == User Entry point Script == Start Event Loop ==>
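The ordering in this diagram can be approximated in userland to see the intended effect. The sketch below is only a simulation, not the proposed implementation: the real phase would drain the event loop inside Node.js itself, whereas here each preload is assumed to return a promise that settles when its async work is done. The names startup, preload, and entrypoint are hypothetical.

```javascript
// Userland simulation of the proposed ordering (illustrative only).
// Assumption: each preload returns a promise that settles when its
// asynchronous work has finished; the real proposal would instead run
// the event loop until it has nothing left to do.

// Hypothetical preload: schedules async work, like preload.js above.
function preload(log) {
  return new Promise((resolve) => {
    setImmediate(() => {
      log.push('preload');
      resolve();
    });
  });
}

// Hypothetical user entry point.
function entrypoint(log) {
  log.push('entrypoint');
}

// Evaluate all preloads in order, wait for their async work to settle,
// and only then evaluate the entry point.
async function startup(preloads, entry) {
  const log = [];
  const pending = preloads.map((run) => run(log));
  await Promise.all(pending);
  entry(log);
  return log;
}

startup([preload], entrypoint).then((log) => console.log(log.join('\n')));
// prints:
// preload
// entrypoint
```

Compared with the node -r example earlier, the order of the two lines is reversed, because the entry point is not evaluated until the preload's scheduled work has completed.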

With this approach, the preload scripts run during the Entrypoint Hook phase are permitted to fully complete and can alter the user entrypoint before it begins.

Use Case: Serverless

In the serverless use case, a serverless host environment can use the entry point hook phase to load any supporting framework code and run any initialization process it needs before loading the actual user entry point script.

Use Case: APMs/Diagnostic Tools

In the APM use case, diagnostic tools can use the entry point hook phase to load any diagnostic instrumentation they need to prepare, even if that tooling is initialized asynchronously (e.g. to query the file system or network for license or configuration data).

Use Case: Dynamic Transpilers

Because the entry point hook is guaranteed to run to completion before the start of the user entry point, it can be used to implement dynamic transpilation of the user entry point before it runs. For instance, a TypeScript entry point hook can transpile a TypeScript file passed in as the user entry point and trigger Node.js to load and execute the compiled JavaScript result rather than trying to run the TypeScript file that was provided.
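The proposal does not define a hook API, so the following is only a rough sketch of the idea under stated assumptions. The transpile function is a toy stand-in for a real compiler (it strips a few simple type annotations and nothing else), and runTranspiledEntry is a hypothetical name; only the built-in vm module is real. The point is the shape: the transpile step is async and must finish before the entry point source is evaluated.

```javascript
const vm = require('vm');

// Toy stand-in for a real compiler: removes simple `: number`,
// `: string`, and `: boolean` annotations only. It is async on purpose,
// to mirror a transpiler that loads configuration before compiling.
async function transpile(source) {
  return source.replace(/:\s*(number|string|boolean)\b/g, '');
}

// Hypothetical hook body: compile the entry point source, then evaluate
// the resulting JavaScript instead of the original source.
async function runTranspiledEntry(source, sandbox = {}) {
  const js = await transpile(source);
  // runInNewContext returns the value of the last statement evaluated.
  return vm.runInNewContext(js, sandbox);
}

const tsSource =
  'const add = (a: number, b: number): number => a + b; add(2, 3);';

runTranspiledEntry(tsSource).then((result) => console.log(result));
// prints: 5
```

With today's -r preloads, nothing holds the entry point back until an async step like transpile has finished, which is the gap the Entrypoint Hook phase is meant to close.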

What about startup time? Cold starts?

Entrypoint Hook scripts will have an impact on Node.js binary startup time when used. There are, fortunately, mechanisms for mitigating such costs. It would be possible, for instance, to capture a snapshot of the preloads such that loading and initial evaluation cost is reduced in exactly the same way that we have created snapshots of the Node.js bootstrap and are working to create snapshots of the user entry point. Preloads, however, are not trivial and effort will need to be made to ensure a minimal performance cost.

mhdawson · 2022-06-14T16:02:29Z

mhdawson
Jun 14, 2022
Maintainer

@jasnell thanks for capturing the discussion from the collab summit and opening this issue.

Do you have thoughts on how the entry point hooks would be specified? It would be interesting to see an example of how that would be done for each of the three use cases listed (serverless, APMs, transpilers).
