From 87b8944b0c08d9bc9b360f34810e1a4da440d3b1 Mon Sep 17 00:00:00 2001 From: hardfist Date: Tue, 26 Sep 2023 14:12:20 +0800 Subject: [PATCH 1/2] chore: add architecture --- src/SUMMARY.md | 7 +- src/architecture/intro.md | 1 - src/architecture/rspack/intro.md | 5 + src/architecture/rspack/loader.md | 91 +++ src/architecture/webpack/dependency.md | 618 +++++++++++++++++ src/architecture/webpack/intro.md | 7 + src/architecture/webpack/loader.md | 903 +++++++++++++++++++++++++ 7 files changed, 1630 insertions(+), 2 deletions(-) delete mode 100644 src/architecture/intro.md create mode 100644 src/architecture/rspack/intro.md create mode 100644 src/architecture/rspack/loader.md create mode 100644 src/architecture/webpack/dependency.md create mode 100644 src/architecture/webpack/intro.md create mode 100644 src/architecture/webpack/loader.md diff --git a/src/SUMMARY.md b/src/SUMMARY.md index df7cb9f..24b171d 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -24,7 +24,12 @@ - [Managing labels](./contributing/managing-labels.md) - +# Architecture + - [rspack](./architecture/rspack/intro.md) + - [loader](./architecture/rspack/loader.md) + - [webpack](./architecture/webpack/intro.md) + - [loader](./architecture/webpack/loader.md) + - [dependency](./architecture/webpack/dependency.md) diff --git a/src/architecture/intro.md b/src/architecture/intro.md deleted file mode 100644 index 1e0981f..0000000 --- a/src/architecture/intro.md +++ /dev/null @@ -1 +0,0 @@ -# Intro diff --git a/src/architecture/rspack/intro.md b/src/architecture/rspack/intro.md new file mode 100644 index 0000000..1c69379 --- /dev/null +++ b/src/architecture/rspack/intro.md @@ -0,0 +1,5 @@ +# rspack +This is the architecture of current rspack implementation + +# Table of Contents +[loader](./loader.md) \ No newline at end of file diff --git a/src/architecture/rspack/loader.md b/src/architecture/rspack/loader.md new file mode 100644 index 0000000..97b62d6 --- /dev/null +++ b/src/architecture/rspack/loader.md @@ 
-0,0 +1,91 @@ +# Related PRs + +- [rspack#2780](https://github.com/web-infra-dev/rspack/pull/2789) +- [rspack#2808](https://github.com/web-infra-dev/rspack/pull/2808) + +# Summary + +The old architecture is quite simple and only supports loaders in the normal stage. +Pitching loaders were not taken into consideration. The basic concept of the old version is to +convert the normal loader to a native function which can be called from the Rust side. +Furthermore, for performance reasons, Rspack also composes loaders on the JS side to +mitigate the performance cost of Node/Rust communication. + +In this new architecture, loaders will not be converted directly into native functions. +Instead, the approach is almost the same as how webpack's loader-runner resolves its loaders, by +leveraging the identifier. Every time Rspack wants to invoke a JS loader, the identifiers will +be passed to the handler provided by the Node side for processing. The implementation also keeps +the feature of composing JS loaders for performance reasons. + + +# Guide-level explanation + +The refactor does not introduce any breaking changes, so it is backwards compatible. +The change in architecture also helps us implement pitching loaders with composability. + +## Pitching loader + +Pitching loader is a technique to change the loader pipeline flow. It is usually used with +inline loader syntax for creating another loader pipeline. Loaders such as style-loader, +which may consume the evaluated result of the following loaders, use this technique. +There are other techniques to achieve the same result, but they are outside the scope of this article. + +See [Pitching loader](https://webpack.js.org/api/loaders/#pitching-loader) for more detail. + + +# Reference-level explanation + +## Actor of loader execution + +In the original implementation of loaders, Rspack first converts the normal loaders, +then passes them to the Rust side.
While building modules, these loaders are called directly: + +![Old architecture](https://user-images.githubusercontent.com/10465670/233357319-e80f6b32-331c-416d-b4b5-30f3e0e394bd.png) + +The loader runner lives only on the Rust side and executes the loaders directly from there. +This mechanism strongly limits our ability to use webpack's loader-runner for composed loaders. + +In the new architecture, we will delegate the loader request from the Rust core to a dispatcher +located on the JS side. The dispatcher will normalize the loaders and execute them using a modified +version of webpack's loader-runner: + +![image](https://user-images.githubusercontent.com/10465670/233357805-923e0a27-609d-409a-b38d-96a083613235.png) + +Loader functions for pitch or normal will not be passed to the Rust side. Instead, each JS loader has +an identifier that uniquely represents it. If a module requests a loader for processing the module, +Rspack will pass the identifier with options to the JS side to instruct the webpack-like loader-runner to +process the transform. This also reduces the complexity of writing our own loader composer. + +## Passing options + +Options are normally converted to a query. However, some options contain fields that cannot be +serialized, so Rspack reuses the _**loader ident**_ created by webpack to uniquely identify the options +and restore them in the later loading process. + +## Optimization for pitching + +As noted above, each loader has two steps: pitch and normal. For performance-friendly +interoperability, we must keep the communication between Rust and JS to a minimum.
+Normally, the execution steps of loaders will look like this: + +![image](https://user-images.githubusercontent.com/10465670/233360942-7517f22e-3861-47cb-be9e-6dd5f5e02a4a.png) + +The execution order of the loaders above will look like this: + +``` +loader-A(pitch) + loader-B(pitch) + loader-C(pitch) + loader-B(normal) +loader-A(normal) +``` + +The example above does not contain any JS loaders, but if, say, we mark these loaders as registered on the +JS side: + +![image](https://user-images.githubusercontent.com/10465670/233362338-93e922f6-8812-4ca9-9d80-cf294e4f2ff8.png) + +The execution order will not change, but Rspack will compose steps 2/3/4 together into a single +round of communication. + + diff --git a/src/architecture/webpack/dependency.md b/src/architecture/webpack/dependency.md new file mode 100644 index 0000000..4dffab1 --- /dev/null +++ b/src/architecture/webpack/dependency.md @@ -0,0 +1,618 @@ +> Based on *Webpack version: 5.73.0*. +> Some source code is omitted for cleaner demonstration in the example. + + +# Summary + +Explain how webpack dependencies affect the compilation, what kind of problem webpack was facing at the time, and the solution to that problem. + + +# Glossary + +> What's the meaning of a word used to describe a feature? +> +> Why does the Webpack introduce this and what's the background of introducing this? What kind of problem Webpack was facing at the time? + +## High-level presentations of *Dependencies* + +- [Dependency(fileDependency)](https://webpack.js.org/api/loaders/#thisadddependency): An existing dependency that is marked as watchable. This is the most widely used type of dependency. CSS preprocessors like `postcss` strongly depend on this in order to mark their dependencies as watchable. +- [ContextDependency](https://webpack.js.org/api/loaders/#thisaddcontextdependency): Most useful for requests in which Glob and Regexp are used.
For real-world usage, see [this](https://webpack.js.org/guides/dependency-management/#require-with-expression). +- [MissingDependency](https://webpack.js.org/api/loaders/#thisaddmissingdependency): A missing dependency that is marked as watchable (handles the creation of files during compilation before watchers are attached correctly.) +- [BuildDependency](https://webpack.js.org/configuration/cache/#cachebuilddependencies): Related to persistent cache. +- PresentationalDependency: Dependencies that only affect presentation; mostly used with their associated template. + +## Others + +- [LoaderContext](https://webpack.js.org/api/loaders/#the-loader-context): Context provided by Webpack *loader-runner*, which can be accessed through `this` in each loader function. +- ModuleGraph: A graph to describe the relationship between modules. + +# Guide-level explanation + +## `Dependency` + +`dependency` (`fileDependency`) stands for the file *dependency*, as opposed to `missingDependency`, `contextDependency`, etc. The created dependency will be marked as watchable, which is useful for *Hot Module Replacement* in development mode. + +In the case below, webpack implicitly creates two dependencies internally. + +```js +import foo from "./foo"; +import "./style.css"; +``` + +## `ContextDependency` + +`contextDependency` is mostly used in scenarios where we want to dynamically load a module at runtime. In this case, webpack cannot be sure at compile time which module will be included in the final bundle. In order to make the code runnable at runtime, webpack first has to create multiple bundle modules corresponding to the matching filenames, such as `./components/a.js` and `./components/b.js`, etc. + +```js +// index.js +import("./components/" + componentName).then(...) +``` + +```js +// components/a.js +... +export default ComponentA; +``` + +```js +// components/b.js +...
+export default ComponentB; +``` + +For loaders, you can call `this.addContextDependency` in each loader function. +For plugins, you can access them via `module.buildInfo.contextDependencies`. + + + +# Reference-level explanation + + +> The abstraction of *Dependency* of Webpack was introduced in Webpack version 0.9 with a big refactor. [Redirect to the commit](https://github.com/webpack/webpack/commit/ee01837d66a44f1dd52fd1e174a6669e0d18dd55) + + +## Stakeholders of *Dependency* + +### High-level + +![image-20220919171608629](https://raw.githubusercontent.com/h-a-n-a/static/main/2022/09/upgit_20220919_1663578968.png) + + + +### Low-level + +![image-20220919171841624](https://raw.githubusercontent.com/h-a-n-a/static/main/2022/09/upgit_20220919_1663579121.png) + + + + +## How *dependencies* affect the creation of *module graph*? + + +### Duplicated module detection + +Each module has its own `identifier`; for `NormalModule`, you can find it in `NormalModule#identifier`. If the identifier already exists in `this._modules`, webpack directly skips the remaining build process for that module. [[source]](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/Compilation.js#L1270-L1274) + +Basically, a `NormalModule` identifier contains these parts: +1. `type` [`string`]: The module type of a module. If the type of the module is `javascript/auto`, this field can be omitted +2. `request` [`string`]: Request to the module. All loaders, whether inline or matched by a config, will be stringified. If _inline match resource_ exists, inline loaders will be executed before any normal-loaders after pre-loaders. A module with a different loader passed through will be treated as a different module regardless of its path. +3. `layer`: applied if provided + + + +### Module resolution + +`getResolve` is a loader API on the `LoaderContext`.
Loader developers can pass `dependencyType` to its `option` which indicates the category of the module dependency that will be created. Values like `esm` can be passed, then webpack will use type `esm` to resolve the dependency. + +The resolved dependencies are automatically added to the current module. This is driven by the internal plugin system of `enhanced-resolve`. Internally, `enhanced-resolve` uses plugins to handle the dependency registration like `FileExistsPlugin` [[source]](https://github.com/webpack/enhanced-resolve/blob/e5ff68aef5ab43b8197e864181eda3912957c526/lib/FileExistsPlugin.js#L34-L54) to detect whether a file is located on the file system or will add this file to a list of `missingDependency` and report in respect of the running mode of webpack. The collecting end of Webpack is generated by the `getResolveContext` in `NormalModule` [[source]](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/NormalModule.js#L513-L524) + + + + +### *Module dependency* in *ModuleGraph* + +Here's a module graph with `esm` import between modules: + +![image-20220919172119861](https://raw.githubusercontent.com/h-a-n-a/static/main/2022/09/upgit_20220919_1663579279.png) + +The dependency type introduced by `import` or `require` is a derived dependency: *ModuleDependency*. + +A *ModuleDependency* contains three important fields. + +1. `category`: used to describe the category of dependency. e.g. "esm" | "commonjs" +2. `request`: see the explanation above. +3. `userRequest`: Resource and its inline loader syntax will be stringified and applied, but loaders in `module.rules` will be omitted. + +It's also good to note a field we will talk about later: +1. `assertions`: assertions in `import xx from "foo.json" assert { type: "json" }` + +More fields can be found in abstract class of *Dependency* and *ModuleDependency*. 
[source: Dependency](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/Dependency.js#L88) [source: ModuleDependency](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/dependencies/ModuleDependency.js#L17) + + +```js +// null -> index.js + +EntryDependency { + category: "esm", + request: "./index.js", + type: "entry", + _parentModule: undefined +} +``` + +```js +// index.js -> foo.js + +HarmonyImportSideEffectDependency { + category: "esm", + request: "./foo", + type: "harmony side effect evaluation", + _parentModule: NormalModule { identifier: "index.js" } +} +``` + +```js +// index.js -> bar.js + +HarmonyImportSideEffectDependency { + category: "esm", + request: "./bar", + type: "harmony side effect evaluation", + _parentModule: NormalModule { identifier: "index.js" } +} +``` + +```js +// bar.js -> foo.js +HarmonyImportSideEffectDependency { + category: "esm", + request: "./foo", + type: "harmony side effect evaluation", + _parentModule: NormalModule { identifier: "bar.js" } +} +``` + +### Resolving a module + +*ModuleDependencies* with different dependency category such as `esm` or `commonjs` will affect the resolving part. For ECMAScript modules, they may prefer `"module"` to `"main"`, and for *CommonJS* modules, they may use `"main"` in `package.json`. On top of that, conditional exports are also necessary to be taken into account. [doc](https://nodejs.org/api/packages.html#conditional-exports) + + +### Different types of *module dependencies* + +#### ESM-related derived types + +There are a few of *ModuleDependencies* introduced in ESM imports. A full list of each derived type can be reached at [[source]](https://github.com/webpack/webpack/blob/86a8bd9618c4677e94612ff7cbdf69affeba1268/lib/dependencies/HarmonyImportDependencyParserPlugin.js) + + +##### Import + +**`HarmonyImportDependency`** + +The basic type of harmony-related *module dependencies* are below. 
[[source]](https://github.com/webpack/webpack/blob/86a8bd9618c4677e94612ff7cbdf69affeba1268/lib/dependencies/HarmonyImportDependency.js#L51) + +**`HarmonyImportSideEffectDependency`** + +```js +import { foo, bar } from "./module" +import * as module from "./module" +import foo from "./module" +import "./module" +``` + +Every import statement comes with a `HarmonyImportSideEffectDependency`, no matter what the specifiers look like. The specifiers are handled by `HarmonyImportSpecifierDependency` below. + +The field `assertions` will be stored if any import assertions exist for later consumption. +The field `category` will be used as `dependencyType` to resolve modules. + +**`HarmonyImportSpecifierDependency`** + +```js +import { foo, bar } from "./module" +import * as module from "./module" +import foo from "./module" +``` + +Example: + +```js +import { foo, bar } from "./module" + +console.log(foo, bar) +``` + +A specifier is mapped into a specifier dependency if and only if it is used. The JavaScript parser first tags each variable [[source]](https://github.com/webpack/webpack/blob/86a8bd9618c4677e94612ff7cbdf69affeba1268/lib/dependencies/HarmonyImportDependencyParserPlugin.js#L137), then creates the corresponding dependency on each read of the variable [[source]](https://github.com/webpack/webpack/blob/86a8bd9618c4677e94612ff7cbdf69affeba1268/lib/dependencies/HarmonyImportDependencyParserPlugin.js#L189), and finally each usage is replaced with the generated `importVar`. + + +##### Export (these are not actually module dependencies, but they are placed here for convenience) + +**`HarmonyExportHeaderDependency`** + +> PresentationalDependency + +```js +export const foo = "foo"; +export default "foo"; +``` + +This is a *presentational dependency*. We will spend more time on this later.
+ +**`HarmonyExportSpecifierDependency`** + +```js +export const foo = "foo"; // `foo` is a specifier + +HarmonyExportSpecifierDependency { + id: string; + name: string; +} +``` + +**`HarmonyExportExpressionDependency`** + +```js +export default "foo"; // "foo" is an expression + +HarmonyExportExpressionDependency { + range: [number, number] // range of the expression + rangeStatement: [number, number] // range of the whole statement +} +``` + + + + + +## How *dependencies* affect code generation + + +### *Presentational dependency* + +> A type of dependency that only affects code presentation. + +**`ConstDependency`** + +``` +ConstDependency { + expression: string + range: [number, number] + runtimeRequirements: Set | null +} +``` + +You can think of the passed `expression` as a `replacement` for the corresponding `range`. For the real world example, you can directly refer to *Constant Folding*. + + +### _Template_ + +Remember the fact that Webpack is an architecture wrapped around source code modifications. _Template_ is the solution that helps Webpack to do the real patch on the source code. Each dependency has its associated _template_ which affects a part of the code generation scoped per dependency. In other words, the effect of each _template_ is strictly scoped to its associated dependency. 
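
To make the "one template per dependency, each scoped to its own range" idea concrete, here is a minimal, self-contained sketch. It does not use webpack at all: `ToyReplaceSource`, `ToyConstDependency`, and `generate` are hypothetical names invented for this illustration, and the exclusive `end` offset is an assumption of the sketch, not a statement about how webpack's own `ReplaceSource` behaves.

```js
// Hypothetical stand-in for a replace-only source (NOT webpack-sources).
// Assumption for this sketch: `end` is exclusive.
class ToyReplaceSource {
  constructor(code) {
    this.code = code;
    this.replacements = [];
  }

  // Record a replacement; nothing is applied until `source()` is called.
  replace(start, end, content) {
    this.replacements.push({ start, end, content });
  }

  source() {
    // Apply from the back of the file so earlier offsets stay valid.
    const sorted = [...this.replacements].sort((a, b) => b.start - a.start);
    let result = this.code;
    for (const { start, end, content } of sorted) {
      result = result.slice(0, start) + content + result.slice(end);
    }
    return result;
  }
}

// Hypothetical dependency shaped like the ConstDependency record shown earlier:
// an expression plus the range it should replace.
class ToyConstDependency {
  constructor(expression, range) {
    this.expression = expression;
    this.range = range;
  }
}

// The template only touches the range owned by its dependency.
ToyConstDependency.Template = class ToyConstDependencyTemplate {
  apply(dependency, source) {
    source.replace(dependency.range[0], dependency.range[1], dependency.expression);
  }
};

// Look up each dependency's template and let it patch the shared source.
function generate(code, dependencies) {
  const source = new ToyReplaceSource(code);
  for (const dep of dependencies) {
    const template = new dep.constructor.Template();
    template.apply(dep, source);
  }
  return source.source();
}

const input = 'if (DEBUG) { log("x"); }';
const start = input.indexOf("DEBUG");
const output = generate(input, [
  new ToyConstDependency("false", [start, start + "DEBUG".length]),
]);
console.log(output);
```

Recording replacements first and applying them in one pass is what lets several templates patch the same module without invalidating each other's offsets.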
+ +![image-20220919173300220](https://raw.githubusercontent.com/h-a-n-a/static/main/2022/09/upgit_20220919_1663579980.png) + +There are three types of modification: +- `source` +- `fragments` +- `runtimeRequirements` + +A boilerplate of the dependency template looks like this: +```js +class SomeDependency {} + +SomeDependency.Template = class SomeDependencyTemplate { + /** + * @param {Dependency} dependency the dependency for which the template should be applied + * @param {ReplaceSource} source the current replace source which can be modified + * @param {DependencyTemplateContext} templateContext the context object + * @returns {void} + */ + apply(dependency, source, templateContext) { + // do code mod here + } +} +``` + +There are three parameters in the function signature: +- dependency: The associated dependency of this template +- source: The source code represent in `ReplaceSource`, which can be used to replace a snippet of code with a new one, given the start and end position +- templateContext: A context of template, which stores the corresponding `module`, `InitFragments`, `moduleGraph`, `runtimeRequirements`, etc. (not important in this section) + + + + +#### `Source` + +Again, given an example of [`ConstDependency`](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/dependencies/ConstDependency.js#L20), even if you don't have an idea what it is, it doesn't matter. We will cover this in the later sections. 
+ +The associated template modifies the code with `Source` (`ReplaceSource` to be more specific): +```js +ConstDependency.Template = class ConstDependencyTemplate extends ( + NullDependency.Template +) { + apply(dependency, source, templateContext) { + const dep = /** @type {ConstDependency} */ (dependency); + + // unnecessary code is removed for a clearer demonstration + + if (dep.runtimeRequirements) { + for (const req of dep.runtimeRequirements) { + templateContext.runtimeRequirements.add(req); + } + } + + source.replace(dep.range[0], dep.range[1] - 1, dep.expression); + } +}; +``` + + +#### `runtimeRequirements` + +As you can see from the `Source` section above, there is another modification we talked about: `runtimeRequirements`. It adds + runtime requirements for the current `compilation`. We will explain more in the later sections. + + +#### `Fragments` + +Essentially, a [_fragment_](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/InitFragment.js) is a pair of code snippets to be wrapped around each _module_ source. Note the word "wrap": a fragment can contain two parts, `content` and `endContent` [[source]](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/InitFragment.js#L69). To make it more illustrative, see this: + +image + +The order of the fragments is determined by two parts: +1. The stage of a fragment: if the stages of two fragments differ, they are placed in the order defined by the stage +2. If two fragments share the same stage, they are placed in [position](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/InitFragment.js#L41) order.
+[[source]](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/InitFragment.js#L153-L159) + +**A real-world example** + +```js +import { foo } from "./foo" + +foo() +``` + +Given the example above, here's the code to generate a dependency that replaces `import` statement with `__webpack_require__`. + +```js +// some code is omitted for cleaner demonstration +parser.hooks.import.tap( + "HarmonyImportDependencyParserPlugin", + (statement, source) => { + const clearDep = new ConstDependency( + "", + statement.range + ); + clearDep.loc = statement.loc; + parser.state.module.addPresentationalDependency(clearDep); + + const sideEffectDep = new HarmonyImportSideEffectDependency( + source + ); + sideEffectDep.loc = statement.loc; + parser.state.module.addDependency(sideEffectDep); + + return true; + } +); +``` +Webpack will create two dependencies `ConstDependency` and `HarmonyImportSideEffectDependency` while parsing [[source]](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/dependencies/HarmonyImportDependencyParserPlugin.js#L110-L132). + +Let me focus on `HarmonyImportSideEffectDependency` more, since it uses `Fragment` to do some patch. + +```js +// some code is omitted for cleaner demonstration +HarmonyImportSideEffectDependency.Template = class HarmonyImportSideEffectDependencyTemplate extends ( + HarmonyImportDependency.Template +) { + apply(dependency, source, templateContext) { + super.apply(dependency, source, templateContext); + } +}; +``` +As you can see in its associated _template_ [[source]](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/dependencies/HarmonyImportSideEffectDependency.js#L59), the modification to the code is made via its superclass `HarmonyImportDependency.Template` [[source]](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/dependencies/HarmonyImportDependency.js#L244). 
+ +```js +// some code is omitted for cleaner demonstration +HarmonyImportDependency.Template = class HarmonyImportDependencyTemplate extends ( + ModuleDependency.Template +) { + apply(dependency, source, templateContext) { + const dep = /** @type {HarmonyImportDependency} */ (dependency); + const { module, chunkGraph, moduleGraph, runtime } = templateContext; + + const connection = moduleGraph.getConnection(dep); + const referencedModule = connection && connection.module; + + const moduleKey = referencedModule + ? referencedModule.identifier() + : dep.request; + const key = `harmony import ${moduleKey}`; + + // 1 + const importStatement = dep.getImportStatement(false, templateContext); + // 2 + templateContext.initFragments.push( + new ConditionalInitFragment( + importStatement[0] + importStatement[1], + InitFragment.STAGE_HARMONY_IMPORTS, + dep.sourceOrder, + key, + // omitted for cleaner code + ) + ); + } +} +``` + +As you can see from the simplified source code above, the actual patch made to the generated code is via `templateContext.initFragments`(2). The import statement generated from the dependency looks like this. + +```js +/* harmony import */ var _foo__WEBPACK_IMPORTED_MODULE_0__ = __webpack_require__(/*! ./foo */ "./src/foo.js"); //(1) +``` + + Note that the real require statement is generated via _initFragments_, `ConditionalInitFragment` to be specific. Don't be afraid of the naming; for more information, see the [background](https://github.com/webpack/webpack/pull/11802) of this _fragment_, which led webpack to change it from `InitFragment` to `ConditionalInitFragment`. + +**How does webpack solve the compatibility issue?** + +For ESM modules, webpack will additionally call a helper to define `__esModule` on exports as a hint: + +```js +__webpack_require__.r(__webpack_exports__); +``` +The call to this helper is always placed ahead of any `require` statements. You have probably guessed why: `STAGE_HARMONY_EXPORTS` has a higher priority than `STAGE_HARMONY_IMPORTS`.
Again, this is achieved via `initFragments`. The logic of the compatibility helper is defined in [this file](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/dependencies/HarmonyCompatibilityDependency.js); feel free to check it out. + + +### Runtime + +Runtime generation is based on the `runtimeRequirements` previously collected in the different dependency templates and is done after the code generation of each module. Note: it happens after the code generation of each module, not after `renderManifest`. + +![image-20220919173829765](https://raw.githubusercontent.com/h-a-n-a/static/main/2022/09/upgit_20220919_1663580309.png) In the first iteration of collection, sets of `runtimeRequirements` are collected from each module's code generation result and added to its `ChunkGraphModule`. + +In the second iteration of collection, the collected `runtimeRequirements` are already stored in `ChunkGraphModule`, so Webpack collects them from there and stores the runtime requirements of each chunk in its `ChunkGraphChunk`. This is a kind of hoisting of the required runtimes. + +Finally, in the third iteration of collection, Webpack hoists `runtimeRequirements` from the chunks that are referenced by the entry chunk and stores them on the `ChunkGraphChunk` in a different field named `runtimeRequirementsInTree`, which contains not only the runtime requirements of the chunk itself but also those of its children. + +![image-20220919174132772](https://raw.githubusercontent.com/h-a-n-a/static/main/2022/09/upgit_20220919_1663580492.png) + +The referenced source code can be found [here](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/Compilation.js#L3379), and these steps are basically done in `processRuntimeRequirements`. This reminds me of the linking procedure of a rollup-like bundler. Anyway, after this procedure, we can finally generate _runtime modules_.
Actually, I lied here: thanks to the hook system of Webpack, the creation of _runtime modules_ is done within this method via calls to `runtimeRequirementInTree` [[source]](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/Compilation.js#L3498). All of this is done in the `seal` step. After that, webpack will process each chunk and create a few code generation jobs, and finally, emit assets. + + + +### *Hot module replacement* + +Changes made via *hot module replacement* mostly come from `HotModuleReplacementPlugin`. + + + +Given the code below: + +```js +if (module.hot) { + module.hot.accept(...) +} +``` + +Webpack will replace expressions like `module.hot` and `module.hot.accept`, etc. with `ConstDependency`, the *presentationalDependency* discussed previously. [[source]](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/HotModuleReplacementPlugin.js#L97-L101) + +A simple expression replacement is not enough, so the plugin also introduces additional runtime modules for each entry. [[source]](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/HotModuleReplacementPlugin.js#L736-L748) + +The plugin is quite complicated, and you should definitely check out what it actually does, but for things related to dependencies, this is enough. + + + + +## How *dependencies* affect production optimizations + + +### Constant folding + +> The logic is defined in ConstPlugin: [[source]](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/ConstPlugin.js#L135) + + +_Constant folding_ is an optimization technique. For example: + +**Source** + +```js +if (process.env.NODE_ENV === "development") { + ... +} else { + ... +} +``` + +**Generated** + +```js +if (true) { + ...
+} +``` + +With mode set to `"development"`, webpack will "fold" the expression `process.env.NODE_ENV === "development"` into the expression `true`, as you can see in the code generation result. + +In the `make` procedure of webpack, Webpack internally uses a `JavaScriptParser` for JavaScript parsing. If an `ifStatement` is encountered, Webpack creates a corresponding `ConstDependency`. Essentially, for the `ifStatement`, the `ConstDependency` looks like this: + +```js +ConstDependency { + expression: "true", + range: [start, end] // range to replace +} +``` + +It's almost the same for the `else` branch: if there are no _side effects_ (refer to the source code for more detail), Webpack will create another `ConstDependency` with `expression` set to `""`, which in the end removes the `else` branch. + +In the `seal` procedure of Webpack, the recorded dependencies are applied to the original source code to generate the final result, as you may have already seen above. + + + +### Tree shaking & DCE + +Tree-shaking is a technique for bundle-wise DCE (dead code elimination). In the following content, I will use tree-shaking to refer to bundle-wise and DCE to refer to module-wise code elimination.
(I know it's not quite appropriate, but you get the point) + + + +Here's an example: + +```js +// webpack configuration +module.exports = { + optimization: { + usedExports: true + } +} +``` + +![image-20220919182656468](https://raw.githubusercontent.com/h-a-n-a/static/main/2022/09/upgit_20220919_1663583216.png) + +![image-20220919190553215](https://raw.githubusercontent.com/h-a-n-a/static/main/2022/09/upgit_20220919_1663585553.png) + +![image-20220919190925073](https://raw.githubusercontent.com/h-a-n-a/static/main/2022/09/upgit_20220919_1663585765.png) + +As you can see from the red square, the `initFragment` is generated based on the usage of the exported symbol in the `HarmonyExportSpecifierDependency` [[source]](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/dependencies/HarmonyExportSpecifierDependency.js#L91-L107) + +If `foo` is used in the graph, then the generated result will be this: + +```js +/* harmony export */ __webpack_require__.d(__webpack_exports__, { +/* harmony export */ "foo": function() { return /* binding */ foo; } +/* harmony export */ }); +const foo = "foo"; +``` + +In the example above, the `foo` is not used, so it will be excluded in the code generation of the template of `HarmonyExportSpecifierDependency` and it will be dead-code-eliminated in later steps. For terser plugin, it eliminates all unreachable code in `processAssets` [[source]](https://github.com/webpack-contrib/terser-webpack-plugin/blob/580f59c5d223a31c4a9c658a6f9bb1e59b3defa6/src/index.js#L836). + + + +## Things related to Persistent cache + +*TODO* + + + + +## Wrap it up! + +Let's wrap everything up in a simple example! Isn't it exciting? + +![image-20220919223228146](https://raw.githubusercontent.com/h-a-n-a/static/main/2022/09/upgit_20220919_1663597948.png) + +Given a module graph that contains three modules, the entry point of this bundle is `index.js`. 
To not make this example too complicated, we use normal import statements to reference each module (i.e: only one chunk that bundles everything will be created). + +### `Make` + +![image-20220919223558327](https://raw.githubusercontent.com/h-a-n-a/static/main/2022/09/upgit_20220919_1663598158.png) + + + +### Dependencies after `make` + +![image-20220919223720739](https://raw.githubusercontent.com/h-a-n-a/static/main/2022/09/upgit_20220919_1663598240.png) + +### `seal` + +![image-20220920180915326](https://raw.githubusercontent.com/h-a-n-a/static/main/2022/09/upgit_20220920_1663668558.png) + + + +# References +*TODO* + + diff --git a/src/architecture/webpack/intro.md b/src/architecture/webpack/intro.md new file mode 100644 index 0000000..58ffafd --- /dev/null +++ b/src/architecture/webpack/intro.md @@ -0,0 +1,7 @@ +# webpack + +This is the architecture of webpack implementation + +# Table of Contents +[loader](./loader.md) +[dependency](./dependency.md) \ No newline at end of file diff --git a/src/architecture/webpack/loader.md b/src/architecture/webpack/loader.md new file mode 100644 index 0000000..1cd22c4 --- /dev/null +++ b/src/architecture/webpack/loader.md @@ -0,0 +1,903 @@ +> Based on *Webpack version: 5.73.0*. + + +# Summary + +Explain how webpack loader works. Even though it's a little bit long and tedious, It's still a teeny-tiny peek at the loader system of Webpack. + + + +# Glossary + +> What's the meaning of a word used to describe a feature? +> +> Why does the Webpack introduce this and what's the background of introducing this? What kind of problem Webpack was facing at the time? + + + +## Request Related + +```javascript +import Styles from '!style-loader!css-loader?modules!./styles.css'; +``` + +- [Inline loader syntax](https://webpack.js.org/concepts/loaders/#inline): The syntax that chains the loader together within the specifier, followed by the file requested. e.g. 
`!style-loader!css-loader?modules!./style.css` +- `request`: The request with *inline loader syntax* retained. Webpack will convert relative URLs and module requests to absolute URLs for loaders and files requested. e.g. `!full-path-to-the-loader-separated-with-exclamation-mark!full-path-to-styles.css` + + + +## Resource Related + +```javascript +import xxx from "./index.js?vue=true&style#some-fragment" +``` + +- [`resource`](https://webpack.js.org/api/loaders/#thisresource): The absolute path to the requested file with `query` and `fragment` retained but inline loader syntax removed. e.g. `absolute-path-to-index-js.js?vue=true&style#some-fragment` +- [`resourcePath`](https://webpack.js.org/api/loaders/#thisresourcepath): The absolute path to the requested file only. e.g. `absolute-path-to-index-js.js` +- [`resourceQuery`](https://webpack.js.org/api/loaders/#thisresourcequery): Query with question mark `?` included. e.g. `?vue=true&style` +- [`resourceFragment`](https://webpack.js.org/api/loaders/#thisresourcefragment): e.g. `#some-fragment` +- inline match resource: + - Used to redirect the `module.rules` to another, which is able to adjust the loader chain. We will cover this later. + - Ref: [related PR](https://github.com/webpack/webpack/pull/7462) [Webpack Doc1](https://webpack.js.org/api/loaders/#thisimportmodule) [Webpack Doc2](https://webpack.js.org/api/loaders/#inline-matchresource) + +- `virtualResource`: + - The proposed solution to support asset type changing(A sugar to inline matchResource, which can also affect the asset filename generation) + - See more: [the background of this property](https://github.com/webpack/webpack/issues/14851) + + + + +## Others but also important to note + +- Virtual Module: A kind of module that does not locate in the real file system. But you can still import it. 
To create a virtual module, you need to follow the [spec](https://www.ietf.org/rfc/rfc2397.txt) and it's also worth noting that Node.js and Webpack both support it under the scheme of `data:`. This is also known as a `data:` import. [Doc to Node.js](https://nodejs.org/api/esm.html#data-imports) +- [Module types](https://webpack.js.org/concepts/modules/#supported-module-types) with native support: Webpack supports the following module types natively: `'javascript/auto'` | `'javascript/dynamic'` | `'javascript/esm'` | `'json'` | `'webassembly/sync'` | `'webassembly/async'` | `'asset'` | `'asset/source'` | `'asset/resource'` | `'asset/inline'`. You can use these types **without a loader**. From webpack version 4.0+, webpack can understand more than `javascript` alone. + + + +# Guide-level explanation + +## Loader configuration + +Webpack decides which loaders apply to which modules based on `module.rules`. + + + +```javascript +const MiniCssExtractPlugin = require("mini-css-extract-plugin") + +module.exports = { + module: { + rules: [ + { + test: /\.vue$/, + use: ["vue-loader"] + }, + { + test: /\.css$/, + use: [MiniCssExtractPlugin.loader, "css-loader"] + } + ] + }, + plugins: [new MiniCssExtractPlugin()] +} +``` + +Here is a simple configuration for `vue-loader`. `module.rules[number].test` is one of the conditions used to test **whether a rule should be applied**. For `vue-loader` alone, it's kind of confusing how webpack passes the result to the `css` rule; we will cover this later. But for now, it's good to notice that **`test` is not the only option to decide if a rule should be applied**. You can find the full list of supported conditions [here](https://webpack.js.org/configuration/module/#rule). Here are some examples of other conditions you can use. + +```javascript +module.exports = { + module: { + rules: [ + { + test: /\.vue$/, // of course, test if the file extension matches `vue`. 
+ scheme: "data", // if the specifier of a request starts with `data:` + resourceQuery: "?raw", // if the `resourceQuery` matches then the rule will be applied. For this example, it's a great idea to apply a `raw-loader` here. + type: "css" // use webpack's native resource handling for css + } + ] + } +} +``` + + + +## Examples + +### Vue(1 to n) + +In a single file component(SFC) of Vue, there are commonly three blocks or more blocks([custom blocks](https://vue-loader.vuejs.org/guide/custom-blocks.html#example)) contained. The basic idea of implementing this loader is to convert it into JavaScript / CSS and let webpack handle the chunk generation(e.g. Style should be generated into a separate `.css` file) + +```vue + + + + + +``` + + + +⬇️⬇️⬇️⬇️⬇️⬇️ + +`Vue-loader` will firstly turn into the `*.vue` file into something like that. + +```javascript +import "script-path-to-vue-sfc"; +import "template-path-to-vue-sfc"; +import "style-path-to-vue-sfc"; +``` + + + +You may find it weird how webpack handles these imports and build the transformed code. But if I change the code a little bit, you will find the idea. + +```javascript +import "script:path-to-vue-sfc"; +import "template:path-to-vue-sfc"; +import "style:path-to-vue-sfc"; +``` + +and if we tweak the configuration a little bit to this, webpack will know exactly how to work with these import statements. + +```javascript +module.exports = { + module: { + rules: [ + { + test: /\.vue$/, + use: ["vue-loader"] + }, + { + scheme: "script", + use: ["apply-your-javascript-loader", "vue-script-extract-loader"] + }, + { + scheme: "template", + use: ["apply-your-javascript-loader", "vue-template-extract-loader"] + }, + { + scheme: "style", + use: ["apply-your-style-loader", "vue-style-extract-loader"] + } + ] + } +} +``` + +We added a few loaders to handle the splitting. I know it's still kind of weird here, but please stick with me and we will find a better way out. 
+ +- vue-script-extract-loader: extract the `script` block from an SFC file. +- vue-style-extract-loader: extract the `style` block from an SFC file. +- vue-template-extract-loader: extract the `template` block from an SFC file and convert it into JavaScript. + + + +You will find it really noisy: just to transform a `*.vue` file, four loaders were introduced, and I believe none of you would like to separate a simple loader into four. It's a real bummer! It would be great to use a single loader, `vue-loader`, alone. The current vue-loader implementation uses `resourceQuery` to handle this. But how? + + + +#### Loader optimizations I + +We know that webpack uses a few conditions to decide whether a rule should be applied. Even with `rule.test` alone, `this.resourceQuery` is still available on the `loaderContext`, which developers can access with `this` in any loader function (don't worry if you don't get this yet, you will understand it later). Based on that, we change the `rule` to this: + +```javascript +module.exports = { + module: { + rules: [ + { + test: /\.vue$/, + use: ["vue-loader"] + } + ] + } +} +``` + +This indicates "If an import specifier ending in `.vue` is encountered, please pass it to vue-loader"! 
If you remember the import transformation above, we could adjust the transformation a little bit to this: + + + +**Before** + +```javascript +import "script-path-to-vue-sfc"; +import "template-path-to-vue-sfc"; +import "style-path-to-vue-sfc"; +``` + + + +**After** + +```javascript +import "path-to-vue-sfc.vue?script=true"; +import "path-to-vue-sfc.vue?template=true"; +import "path-to-vue-sfc.vue?style=true"; +``` + +These requests still match the `test` of the rule above (the resource path still ends with `.vue`), and in the loader we can handle them like this: + +```javascript +// pseudo code, only a proof of concept +const compiler = require("some-vue-template-compiler") + +const loader = function(source) { + const { + resourceQuery /* ?script=true or something else */, + resourcePath /* path-to-vue-sfc.vue */ + } = this + + if (resourceQuery === "?script=true") { + return compiler.giveMeCodeofScriptBlock(resourcePath) // javascript code + } else if (resourceQuery === "?template=true") { + return compiler.giveMeCodeofTemplateBlock(resourcePath) // javascript code + } else if (resourceQuery === "?style=true") { + return compiler.giveMeCodeofStyleBlock(resourcePath) // style code + } else { + // the inner quotes must not be backticks, or the template literal would end early + return ` + import "${resourcePath}?script=true"; + import "${resourcePath}?template=true"; + import "${resourcePath}?style=true"; + ` + } +} + +module.exports = loader +``` + +You can see the loader in the example above will be used four times. + +1. Encounter a `*.vue` file and transform the code to a few import statements. +2. For each import statement introduced in the first transformation, the loader will be used again as they share the same extension `vue`. + + + +Is this the end? No! Even if you wrote the code like this, it will still fail to load. + +1. 
For CSS: You haven't told webpack how to handle the CSS. Remember that the CSS part is required to go through `css-loader` and then `mini-css-extract-plugin` (if you want to generate CSS for the chunk) or `style-loader` (if you want to append it directly to the DOM). After all, you have to make the result of the style block pass through these loaders. +2. For JS: You haven't passed the code to any transpiler. It will fail if your runtime doesn't support the syntax (TypeScript, for example), and webpack's internal acorn parser cannot help you with that. + + + +**Pass the code to the corresponding loaders** + +We tweak the configuration a little bit again. + +```javascript +const MiniCssExtractPlugin = require("mini-css-extract-plugin") + +module.exports = { + module: { + rules: [ + { + test: /\.vue$/, + use: ["vue-loader"] + }, + { + test: /\.css$/, + use: [MiniCssExtractPlugin.loader, "css-loader"] + }, + { + test: /\.js$/, + use: ["babel-loader"] + } + ] + } +} +``` + +It looks a bit more like the "normal" Webpack configuration. Note that `rule.test` is based on the file extension, so `vue-loader` did a little bit of a hack here. + +```javascript +// pseudo code, only a proof of concept +const compiler = require("some-vue-template-compiler") + +const loader = function(source) { + const { + resourceQuery /* ?script=true or something else */, + resourcePath /* path-to-vue-sfc.vue */ + } = this + + if (resourceQuery === "?script=true") { + const code = compiler.giveMeCodeofScriptBlock(this.resourcePath) // javascript code + this.resourcePath += ".js" + return code + } else if (resourceQuery === "?template=true") { + const code = compiler.giveMeCodeofTemplateBlock(this.resourcePath) // javascript code + this.resourcePath += ".js" + return code + } else if (resourceQuery === "?style=true") { + const code = compiler.giveMeCodeofStyleBlock(this.resourcePath) // style code + this.resourcePath += ".css" // based on the `lang` in each script, the extension will be set accordingly. 
+ return code + } else { + // the inner quotes must not be backticks, or the template literal would end early + return ` + import "${this.resourcePath}?script=true"; + import "${this.resourcePath}?template=true"; + import "${this.resourcePath}?style=true"; + ` + } +} + +module.exports = loader +``` + +Webpack uses `resourcePath` to match `module.rules`. So this hack will let webpack treat the blocks as if they were real files with extensions of `js` | `css` | `...`. + + + +Finally! But this is only a proof of concept. For the real implementation, you should definitely check out [`vue-loader`](https://github.com/vuejs/vue-loader) yourself. + + + +#### Loader Optimization II + +Well done! We implemented a simple and rudimentary version of `vue-loader`. However, the real pain-in-the-ass part of this implementation is hacking the extension to match the configuration. Since almost every user has other `js` | `css` files in the project, the Vue team decided to use this kind of strategy to reuse the user configuration. + +Instead of hacking the extension, webpack later provided a more legitimate way to handle this kind of **rule matching problem**, which is known as ***inline match resource*** (we covered it in the glossary part). + + + +**inline match resource** + +Webpack can do almost anything with an import specifier, like the loader chaining we covered in the glossary part. *Inline match resource* is another case. By taking advantage of it, you can force an import statement to go through a `module.rules` entry by introducing the `!=!` syntax. 
For example, if we want to force a `css` file to go through a `less` loader, it will look like this: + +```javascript +module.exports = { + module: { + rules: [ + { + test: /\.less$/, + use: ["style-loader", "css-loader", "less-loader"] + } + ] + } +} +``` + +```javascript +// This import should be converted with a loader + +// treat the file as `less` +import "./index.css.less!=!./index.css" +``` + +The slice before the `!=!` modifies the extension of a single file and forces it to match the `module.rules`. This transformation is usually done in a loader; otherwise, your application code becomes specialized for Webpack only. + + + +After going through the basic example, let's see how we're going to optimize out the hack used in `vue-loader`. + +```javascript +// pseudo code, only a proof of concept +const compiler = require("some-vue-template-compiler") + +const loader = function(source) { + const { + resourceQuery /* ?vue=true&script=true or something else */, + resourcePath /* path-to-vue-sfc.vue */ + } = this + + if (resourceQuery === "?vue=true&script=true") { + return compiler.giveMeCodeofScriptBlock(resourcePath) // javascript code + } else if (resourceQuery === "?vue=true&template=true") { + return compiler.giveMeCodeofTemplateBlock(resourcePath) // javascript code + } else if (resourceQuery === "?vue=true&style=true") { + return compiler.giveMeCodeofStyleBlock(resourcePath) // style code + } else { + // the inner quotes must not be backticks, or the template literal would end early + return ` + import "${resourcePath}.js!=!${resourcePath}?vue=true&script=true"; + import "${resourcePath}.js!=!${resourcePath}?vue=true&template=true"; + import "${resourcePath}.css!=!${resourcePath}?vue=true&style=true"; + ` + } +} + +module.exports = loader +``` + +Webpack will internally use the match resource part (before `!=!`) as the data to match loaders. In order to let `vue-loader` still match the resource, we have two options: + +1. Loose test +2. *Inline loader syntax* + + + +**1. 
Loose test** + +```javascript +module.exports = { + module: { + rules: [ + { + test: /\.vue/, // original: `/\.vue$/`, we removed the `$` to allow resources with `.vue` included to match this rule. + use: ["vue-loader"] + } + ] + } +} +``` + +We removed the `$` to allow resources that merely contain `.vue` to match this rule. Personally speaking, this is not a good idea, because a loose match might cause mismatches. + + + +**2. Inline loader syntax** + +```javascript +// vue-loader/index.js + +module.exports = function() { + // ... code omitted + // the inner quotes must not be backticks, or the template literal would end early + return ` + import "${this.resourcePath}.js!=!${__filename}!${this.resourcePath}?vue=true&script=true"; + import "${this.resourcePath}.js!=!${__filename}!${this.resourcePath}?vue=true&template=true"; + import "${this.resourcePath}.css!=!${__filename}!${this.resourcePath}?vue=true&style=true"; + ` +} +``` + +This technique takes advantage of the ***inline loader syntax*** to force the request to go through `vue-loader`. It avoids the mismatching problem of the loose test, and we can still retain the test regex `/\.vue$/` as-is. + + + + + +#### Final art and conclusion + +**Configuration** + +```javascript +module.exports = { + module: { + rules: [ + { + test: /\.vue$/, + use: ["vue-loader"] + }, + // ... other rules for js, or css, etc. 
+ ] + } +} +``` + + + +**Loader** + +```javascript +// pseudo code, only a proof of concept +const compiler = require("some-vue-template-compiler") + +const loader = function(source) { + const { + resourceQuery /* ?vue=true&script=true or something else */, + resourcePath /* path-to-vue-sfc.vue */ + } = this + + if (resourceQuery === "?vue=true&script=true") { + return compiler.giveMeCodeofScriptBlock(resourcePath) // javascript code + } else if (resourceQuery === "?vue=true&template=true") { + return compiler.giveMeCodeofTemplateBlock(resourcePath) // javascript code + } else if (resourceQuery === "?vue=true&style=true") { + return compiler.giveMeCodeofStyleBlock(resourcePath) // style code + } else { + // the inner quotes must not be backticks, or the template literal would end early + return ` + import "${resourcePath}.js!=!${__filename}!${resourcePath}?vue=true&script=true"; + import "${resourcePath}.js!=!${__filename}!${resourcePath}?vue=true&template=true"; + import "${resourcePath}.css!=!${__filename}!${resourcePath}?vue=true&style=true"; + ` + } +} + +module.exports = loader +``` + + + +**Conclusion** + +Vue-loader is quite complex. The basic needs of the loader are: + +1. Separate a `*.vue` file request into a number of parts. For each block, explicitly change the resource matching mechanism (using ***inline match resource***). The killer *inline match resource* not only gives us great composability with user-defined loaders, but also the ability to interact with webpack's natively supported types, and we will cover this part later. +2. When `vue-loader` is requested again for a block, it returns the code of that block and lets webpack handle the changed match resource (e.g. `./App.vue.css`) with user-defined loaders (webpack does this internally). 
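The rule-matching behaviour summarized above can be sketched in a few lines. This is only a simplified model, not webpack's actual rule matcher: `matchLoaders` and the rule list are hypothetical, and real rules support many more conditions than `test`.

```javascript
// Simplified model of rule matching (assumption: only `test` is considered).
const rules = [
  { test: /\.vue$/, use: ["vue-loader"] },
  { test: /\.css$/, use: ["style-loader", "css-loader"] },
  { test: /\.js$/, use: ["babel-loader"] }
]

// `matchLoaders` is a hypothetical helper, not a webpack API.
// When a match resource (the part before `!=!`) exists, it replaces
// the resource path as the value that the rules are tested against.
function matchLoaders(rules, { resourcePath, matchResource }) {
  const target = matchResource !== undefined ? matchResource : resourcePath
  return rules
    .filter((rule) => rule.test.test(target))
    .flatMap((rule) => rule.use)
}

// A plain request to the SFC matches the `.vue` rule only.
console.log(matchLoaders(rules, { resourcePath: "/src/App.vue" }))

// The style block carries a match resource, so the `.css` rule applies instead.
console.log(
  matchLoaders(rules, {
    resourcePath: "/src/App.vue",
    matchResource: "/src/App.vue.css"
  })
)
```

In this model the `.vue` rule no longer matches the style block, which is why the final loader above adds `${__filename}` through the inline loader syntax.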
+ + + +### Use natively supported module types + +Webpack used to support only `JavaScript` natively. Starting from version `4.0.0` ([changelog](https://github.com/webpack/webpack/releases/tag/v4.0.0)), it can handle more module types on its own. + + + +#### Simplified pre-processor's configuration + +> With the experimental support of CSS, webpack knows how to handle CSS files natively. + + + +**Before** + +```javascript +module.exports = { + module: { + rules: [ + { + test: /\.less$/, + use: ["style-loader", "css-loader", "less-loader"], + type: "javascript/auto" // this field is an implicit one, if not defined, it will be set to `"javascript/auto"` + } + ] + } +} +``` + + + +**After** + +```javascript +module.exports = { + module: { + rules: [ + { + test: /\.less$/, + use: ["less-loader"], + type: "css" + } + ] + }, + experiments: { + css: true + } +} +``` + +With `experiments.css` on, webpack can experimentally parse and generate `css` files by itself, which gets rid of `css-loader` and `style-loader`. For the full list of natively supported `Rule.type` values, see [here](https://webpack.js.org/configuration/module/#ruletype). + + + + + +#### Asset modules + +> From *webpack 5.0.0+*, asset modules are supported natively + +```javascript +module.exports = { + module: { + rules: [ + { + test: /\.(png|jpg)/, + type: "asset" + } + ] + } +} +``` + +`Rule.type === "asset"` indicates that webpack will automatically decide whether the asset is inlined or emitted as a file on the real file system. The possible options are: `'asset'` | `'asset/source'` | `'asset/resource'` | `'asset/inline'` + + + + + +### Svgr + +A webpack loader reads the source as a UTF-8 string by default. For SVG files, this fits the webpack defaults. 
+ + + +```javascript +// Proof of concept of svgr-loader +module.exports = function(source) { + if (this.resourceQuery === "?svgr=true") { // the real transform part + let { code } = svgrTransformer.transform(source); + return code + } + return `require("${this.resourcePath}.jsx!=!${__filename}!${this.resourcePath}?svgr=true")` // the request part +} +``` + +Again here we use double-pass to firstly convert each request to the request part with *inline match resource*, and do the real request with query `?svgr=true`, and let *inline match resource* handle the `jsx` conversion. Before that, we have to call a third-party `jsx` transformer, could be *ESBuild* for example, for which we cannot reuse other `module.rules` set by the user-side. *Inline match resource* saved our ass again! + + + +### Scheme imports + +> Supported in *Webpack version 5.38.0*, doc: [Rule.scheme](https://webpack.js.org/configuration/module/#rulescheme) + +```javascript +// JavaScript +import x from "data:text/javascript,export default 42" +console.log('x:',x); +``` + +```css +/* CSS */ +@import ("data:text/css, body { background: #fff; }"); +``` + +Webpack handles `data:` imports for JavaScript internally. + + + +### Asset transform and rename + +> [**Asset**](https://webpack.js.org/guides/asset-management/): This is a general term for the images, fonts, media, and any other kind of files that are typically used in websites and other applications. These typically end up as individual files within the [output](https://webpack.js.org/glossary/#o) but can also be inlined via things like the [style-loader](https://webpack.js.org/loaders/style-loader) or [url-loader](https://webpack.js.org/loaders/url-loader). +> +> *Originally posted at Webpack [Glossary](https://webpack.js.org/glossary/#a)* + + + +#### Default resource reading override + +Asset could be formatted in both text(`*.svg`) or binary (`*.png` / `*.jpg`). 
For loaders, webpack provides an option [`raw`](https://webpack.js.org/api/loaders/#raw-loader) to override the default built-in resource reading strategy from UTF-8 `string` to `Buffer`: + +```javascript +module.exports = function(source /* Buffer */ ) { + // loader implementation +} + +module.exports.raw = true +``` + + + +#### Transform and rename + +Imagine there is a need to transform an asset from `png` to `jpg`. There are two abilities that webpack needs to support: + +1. Handle the asset with `raw` content, or a `Buffer`. We can simply override the default resource reading behavior by exporting `raw` (covered before). +2. Change the filename, and reuse the loaders for both `png` and `jpg`. + + + +##### Configuration + +```javascript +module.exports = { + module: { + rules: [ + { + test: /\.png/, + use: ["png-to-jpg-loader"] // some png to jpg loader, we will implement this + }, + { + test: /\.jpg/, + use: ["jpg-optimizer"], // some jpg optimizer, we will not cover this + type: "asset/resource" + } + ] + } +} +``` + +1. Rule1: For files with extension `png`, we want to use a `png` to `jpg` loader, which will be covered in this article. +2. Rule2: + 1. For files with extension `jpg`, we want to use a third-party `jpg-optimizer`, which will not be covered in this article. + 2. `type: "asset/resource"`: As soon as all the loaders have run, we want webpack to emit the file as an external resource on the file system regardless of the file size (`type: "asset"` would automatically check the size of an asset to determine whether it is inlined or emitted as a separate file). +3. For those `jpg` files converted from `png`, we want to apply the `jpg-optimizer` to them too (i.e. 
reuse the loaders defined in `module.rules`) + + + +##### Loader + +```javascript +// `pngToJpg` stands for some png-to-jpg converter (not shown here) +module.exports = function(source) { + if (this.resourceQuery === "?pngToJPG=true") { + return pngToJpg.transform(source) + } + + return `require("${this.resourcePath}.jpg!=!${__filename}!${this.resourcePath}?pngToJPG=true")` +} + +module.exports.raw = true +``` + +We use double-pass again. First, we convert the extension to `.jpg`, so that the matched rules (in this case `test: /\.jpg/`) are applied after the transformation of `png-to-jpg-loader`. The generated asset module filename will be based on the *inline match resource*, which is `xxxx.jpg` in this case. + + + +### AST reuse + +Webpack provides a way to pass metadata (the fourth parameter) along the loader chain [doc](https://webpack.js.org/api/loaders/#thiscallback). The most commonly used value is `webpackAST`, which accepts an `ESTree`-compatible AST (webpack internally uses `acorn`). This improves performance because webpack, instead of parsing the returned code into an AST again, **will directly use the AST (`webpackAST`) returned from a loader**. (However, **a complete walk of the AST cannot be omitted**, as webpack needs it to analyze dependencies; it is only done once, so it is not a big overhead.) + +```javascript +const acorn = require("acorn") + +module.exports = function(source) { + // parser options are assumptions, adjust them to your source + const ast = acorn.parse(source, { + ecmaVersion: "latest", + sourceType: "module", + ranges: true, + locations: true + }) + + // pass the source through unchanged and attach the AST as metadata + this.callback(null, source, null, { + webpackAST: ast + }) +} +``` + +Note that only `ESTree` is compatible, so you cannot pass a CSS AST, or webpack will complain with `"webpackAst is unexpected for the CssParser"`. It's OK if you don't get this yet; let's move on to the reference-level explanation for an in-depth analysis. + + + + +## Reference-level explanation + +This is the reference-level explanation part of webpack's internal loader implementation. 
+ + + +### Loader composability + +> If you don't quite get this concept, you may refer to the Glossary and *Example* part of the Guide-level explanation first and pick up this as soon as you finished. + +The high-level idea of previously talked *inline match resource* is to let **loader developers** to customize the behavior of matching to match the pre-defined `module.rules`. It's an API to write composable loaders. But what does composition mean? For those users who are familiar with React hooks and Vue composable APIs, you may get this faster. Actually, webpack provides a lot of ways to help loader developers and users do the composition. + + + +#### User-defined loader flows + +```javascript +module.exports = { + module: { + rules: [ + { + test: /\.js$/, + use: ["babel-loader"], + type: "javascript/auto" + }, + { + test: /\.svg$/, + use: ["svgr-loader", "svgo-loader"], + } + ] + } +} +``` + +Webpack users can take the advantage of `module.rules[number].use` with a loader list for each request that matches the corresponding conditions. Note that I use the wording of `request,` not the `file` , which can include a request to `data:text/javascript` not the files on the real file system only. (In Parcel bundler, it's called [*pipelines*](https://parceljs.org/features/plugins/#pipelines), but this will not be covered in this article.) + + + +Apparently, user-declared loader flow is not able to cover up every case that a loader wants. You can see from the previous examples, `vue-loader` wants to split a file into many blocks, and remain the reference to it. `svgr-loader` wants to do the transformation first and let other loaders deal with the `jsx`. `svg-loader` wants to use the internal ability of `Asset Module` to let Webpack decide whether an asset is inlined or emitted to the real file system. and there are more to come... Based on the complexity of the loader, Webpack also provides a syntax to allow loader implementors to do the composition by themselves. 
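As a rough illustration of what a `use` list means, the sketch below models loaders as plain functions and applies them from the last entry to the first, which is the order webpack uses for normal loaders. The two loader functions and `applyUseList` are made up for this example; real loaders receive a loader context and may be asynchronous.

```javascript
// Toy stand-ins for real loaders (hypothetical, for illustration only).
const loaders = {
  // pretend optimizer: collapse whitespace
  "svgo-loader": (source) => source.replace(/\s+/g, " ").trim(),
  // pretend transformer: wrap the markup into a component-like module
  "svgr-loader": (source) => `export default () => (${source})`
}

// Apply a `use` list from right to left: the last loader sees the raw
// source, and each earlier loader receives the previous loader's output.
function applyUseList(use, source) {
  return use.reduceRight((content, name) => loaders[name](content), source)
}

const output = applyUseList(
  ["svgr-loader", "svgo-loader"],
  "<svg>\n  <path />\n</svg>\n"
)
console.log(output)
```

The order in the list is all a user can express here, which is why loaders that need to split or re-route a request rely on the syntax described next.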
+ + + +#### The syntax for loader composition + + + +##### Inline loader syntax (Chaining loaders) + +> Supported from *webpack v1* [chaining-loaders](https://webpack.js.org/migrate/3/#chaining-loaders) +> +> It's possible to specify loaders in an `import` statement, or any [equivalent "importing" method](https://webpack.js.org/api/module-methods). Separate loaders from the resource with `!`. Each part is resolved relative to the current directory. [doc](https://webpack.js.org/concepts/loaders/#inline) + +```javascript +import Styles from '!style-loader!css-loader?modules!./styles.css'; +``` + +The *inline loader syntax* executes each loader for each request from right to left. Webpack handles the interaction with user-defined loaders carefully. So by default, the user-defined normal loader will be executed prior to the inline loaders, you can disable this behavior by prefixing `!` , (full reference could be found here [doc](https://webpack.js.org/concepts/loaders/#inline)). + +The custom specifier is parsed before the `module.rules` as the *inline loader syntax* interferes the user-defined loaders(See the [source code](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/NormalModuleFactory.js#L390-L403)). Then, webpack will get the `module.rules` combined with the required conditions to calculate the matching rule set (See the [source code](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/NormalModuleFactory.js#L493-L510)). + +At the moment, you cannot change the matching behavior with the syntax, loaders are always matched with the provided *resourcePath*, etc, which leads to a bunch of hack code in the implementations of loaders (see this [code snippet](https://github.com/vuejs/vue-loader/blob/e9314347d75a1b0e54f971272d23a669fc3e6965/src/select.ts#L31) in `vue-loader`). The possibilities for changing the matching behavior leaves to the later-coming *inline match resource*. 
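To make the shape of such a request concrete, here is a minimal sketch that splits an inline request into its loaders and resource. `parseInlineRequest` is a hypothetical helper and not webpack's parser: it tolerates a leading `!`, but it does not model the semantics of the `!`, `!!` and `-!` prefixes, nor fragments.

```javascript
// `parseInlineRequest` is a hypothetical helper, not webpack's parser.
function parseInlineRequest(request) {
  // Drop empty segments so a leading `!` does not produce an empty loader.
  const parts = request.split("!").filter(Boolean)
  // The last segment is always the requested resource.
  const resource = parts.pop()
  const queryIndex = resource.indexOf("?")
  return {
    // Loaders keep their source order; they are executed from right to left.
    loaders: parts.map((part) => {
      const [loader, options = ""] = part.split("?")
      return { loader, options }
    }),
    resourcePath: queryIndex === -1 ? resource : resource.slice(0, queryIndex),
    resourceQuery: queryIndex === -1 ? "" : resource.slice(queryIndex)
  }
}

console.log(parseInlineRequest("!style-loader!css-loader?modules!./styles.css"))
```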
+ +Nevertheless, the architecture of loaders at this moment is sound and solid. Another good example is the implementation-independent filter (i.e. the filtering logic of a *Loader* is not declared in the loader itself), which is the fundamental root of loader composition; otherwise, implementors would need a lot of hacks. (It's way too dirty to talk about here, but you can take the rollup [svgr](https://github.com/gregberge/svgr/blob/1dbc3e2c2027253b3b81b92fd4eb09a4aa8ae25e/packages/rollup/src/index.ts#L52) plugin as a reference) + +In conclusion, *inline loader syntax* gives us a chance to control the loader flow together with user-defined rules. + + + +##### Inline match resource + +To extend the matching ability, *inline match resource* enables loader implementors to reuse some of the user-defined configurations with more flexibility. + +On top of the previous example, webpack also provides a way to make use of the natively-supported *module types*. + +```javascript +// For module type `css` to work, you need to enable `experiments.css` +import "./style.less.webpack[css]!=!path-to-less-loader!./style.less" +``` + +```javascript +// webpack.config.js +module.exports = { + experiments: { + css: true + } +} +``` + +Given the configuration above, the overview of the complete flow will be like this: + +1. Webpack: Parse the specifier of the import and create the loader list for the current request +2. Webpack: Merge the result from the first step with the user-defined `module.rules` in `webpack.config`, which in this case is `[]` +3. Webpack: Load `style.less` as a UTF-8 string +4. Less-loader: Accept the UTF-8 string as the first parameter of the loader function and transform it to the content of `css`. +5. Webpack: Call the registered native `CSS` parser, and later at the code generation step the registered native `CSS` generator generates the result. 
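The first step of the flow above can be sketched as follows. This is a simplified illustration under assumptions: `splitMatchResource` is a hypothetical helper, the separator is the `!=!` syntax introduced in the guide-level part, and the trailing `.webpack[<type>]` marker is assumed to be stripped from the match resource once the module type has been read.

```javascript
// `splitMatchResource` is a hypothetical helper, not a webpack API.
function splitMatchResource(request) {
  const index = request.indexOf("!=!")
  if (index === -1) {
    // No inline match resource: the request is left untouched.
    return { matchResource: undefined, type: undefined, rest: request }
  }

  let matchResource = request.slice(0, index)
  const rest = request.slice(index + "!=!".length)

  // A trailing `.webpack[<type>]` selects the module type and is
  // removed from the match resource afterwards.
  let type
  const match = /\.webpack\[([^\]]+)\]$/.exec(matchResource)
  if (match) {
    type = match[1]
    matchResource = matchResource.slice(0, -match[0].length)
  }

  return { matchResource, type, rest }
}

console.log(
  splitMatchResource("./style.less.webpack[css]!=!path-to-less-loader!./style.less")
)
```

The `rest` part is an ordinary inline loader request, so the remaining steps (loading the file, running `less-loader`, handing the result to the native `css` parser) do not need to know about the match resource at all.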
+ + + +For *asset modules*, you can also use this: + +```javascript +import "./logo.png.jpg.webpack[asset/resource]!=!path-to-loaders!./logo.png" +``` + +The first part, also known as `matchResource`, will be used as a part of the `filename` of the final code generation. (See the [source code](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/asset/AssetGenerator.js#L293-L348)) + + + +### Performance optimizations + +Before moving on to the detailed implementations, here is a short glossary to help you understand the architecture as a whole. + + + +#### Glossary + +- `NormalModuleFactory`: A factory used to create a `NormalModule`, which basically exposes a `create` method. +- `NormalModule`: Most of the time, a module in webpack is a `NormalModule`. With different implementations of `parser` / `generator` / `Module Type`, the module can be of almost any kind. It also exposes a `build` method. For example, a `NormalModule` with a JavaScript parser, a JavaScript generator, and `type === "javascript/auto"` will be regarded as a module with JavaScript-related functionality. It is also worth noting that a module may not exist on the real file system; a `data:` URI is one example. + + + +#### The module creation workflow + +> This only introduces a slice of webpack's internal implementation from **the Loader's perspective**. For more, you should refer directly to the source code. + +When an import statement is detected, webpack will initiate a module creation. Based on the type of *Dependency* (an abstraction of webpack, it's not important here), webpack can find the linked *ModuleFactory* (the abstract class). In most cases, the derived factory is `NormalModuleFactory`, which exposes a `create` method. + + + +##### Prepare data needed for module creation + +`NormalModuleFactory#create` gathers enough information to create a real `NormalModule`, and then creates it. 
In the `create` method, webpack basically does the following (some steps unrelated to loaders are omitted): + +- Resolve loaders from the request: resolve the request and parse the inline loader syntax, which covers both *inline match resource* and *inline loader syntax*. +- Analyze the parsed loader syntax to decide whether user-defined `normal/post/pre` loaders are going to be included. [doc](https://webpack.js.org/concepts/loaders/#inline) +- Resolve the resource: resolve the resource to its absolute path, fragment, query, etc. (these are also provided in `LoaderContext`). For the full source code you may refer to [this](https://github.com/webpack/webpack/blob/main/lib/NormalModuleFactory.js#L653-L678) +- Use the resolved resource data to match `module.rules` defined in the configuration, and get the matched rules. This is also a part of the module creation data. +- Apply some special logic for *inline match resource*, since a match resource ending with something like `.webpack[css]` changes `Rule.type`. Also store the match resource data, since it might affect the filename generation for *asset modules*. + + + +##### Create a module based on the prepared data + +After the data needed for module creation is prepared, `NormalModuleFactory` will `new NormalModule` with the data provided. It contains basically everything that a `NormalModule` needs (see the [source code](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/NormalModule.js#L271-L287)). Most importantly, it contains the `loaders`: every loader parsed and ordered during the `create` step. + + + +#### The module build step + +The module build step is fairly straightforward. Webpack will invoke the `build` method for each `NormalModule` instance, which invokes `loader-runner` (see the [source code](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/NormalModule.js#L819)) to go through every loader that was collected during the create step. 
It is clear that **the composition of loaders happens on the same module**. + + + +#### A peek at the support for *Module Types* + +At this point, the article might be getting a little bit tedious. But have you ever wondered how webpack supports these *module types* natively? It is still worth covering to get a more complete understanding of the AST optimizations. To support JavaScript, webpack's JavaScript plugin registers different types of parsers and generators for each *module type*, which will be used as the `parser` / `generator` of a `NormalModule` (see the [source code](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/javascript/JavascriptModulesPlugin.js#L202-L231)). + + + +#### Reusing AST in Webpack + +Based on the parser and generator we introduced before, webpack did a little hack around the fourth parameter of `this.callback` (from *loaderContext*) with `webpackAST`: after each loader call, the `webpackAST` is stored in the loader context and passed on to the next loader. Finally, the AST is passed to the `parser` (the parser could be of any type, depending on the *module type*, but webpack only supports AST reuse for JavaScript) (see the [source code](https://github.com/webpack/webpack/blob/9fcaa243573005d6fdece9a3f8d89a0e8b399613/lib/NormalModule.js#L1087)). + +There is an issue about trying to use SWC's AST to avoid the time-consuming code parsing done by the Acorn parser, but it faces some AST compatibility issues as well as performance issues caused by the overhead of interop with native code (Rust). 
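
As an illustration of the `webpackAST` hand-off, here is a minimal sketch of a loader that parses the source once and passes the AST through the fourth argument of `this.callback`. The factory `createAstSharingLoader` and its `parse` parameter are hypothetical names for this example. In practice, `parse` would have to return an ESTree-compatible AST (for example from Acorn), since that is what webpack's JavaScript parser works with.

```javascript
// Hypothetical factory: builds a loader that shares its AST with webpack.
// `parse` is an assumed function: (source: string) => ESTree-compatible AST.
function createAstSharingLoader(parse) {
  return function astSharingLoader(source, sourceMap, meta) {
    const ast = parse(source);
    // The fourth argument is the `meta` object; the AST travels on it
    // under the `webpackAST` key, alongside any meta from earlier loaders.
    this.callback(null, source, sourceMap, { ...meta, webpackAST: ast });
  };
}
```

Passing the parser in as a parameter is a design choice made only to keep the sketch self-contained; a real loader would import its parser directly and export the loader function with `module.exports`.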
+ + + + + +## References + +- loader plugin api design (Analysis) [#315](https://github.com/speedy-js/rspack/discussions/315) + +- RFC-011 Supports `data:text/javascript` protocol [#457](https://github.com/speedy-js/rspack/discussions/457) + +- Webpack: `matchResource` with natively-supported module types [doc](https://webpack.js.org/api/loaders/#thisimportmodule) + +- Webpack: Loader context [doc](https://webpack.js.org/api/loaders/#the-loader-context) + +- Webpack: Module rules [doc](https://webpack.js.org/configuration/module/#rule) + +- SWC-loader for performance optimizations [issue](https://github.com/webpack/webpack/issues/13425#issuecomment-1013560170) + From f07d544bdc2a50826ca9b9eb7388121652836147 Mon Sep 17 00:00:00 2001 From: hardfist Date: Tue, 26 Sep 2023 14:17:20 +0800 Subject: [PATCH 2/2] chore: fix typo --- src/architecture/webpack/dependency.md | 6 +++--- src/architecture/webpack/loader.md | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/src/architecture/webpack/dependency.md b/src/architecture/webpack/dependency.md index 4dffab1..59ce8c9 100644 --- a/src/architecture/webpack/dependency.md +++ b/src/architecture/webpack/dependency.md @@ -30,7 +30,7 @@ Explain how webpack dependency affects the compilation and what kind of problem ## `Dependency` -`dependency`(`fileDependency`) stands for the file *dependency* among `missingDependeny` and `contextDependency`, etc. The created dependency will be marked as watchable, which is useful in *Hot Module Replacement* in developer mode. +`dependency`(`fileDependency`) stands for the file *dependency* among `missingDependency` and `contextDependency`, etc. The created dependency will be marked as watchable, which is useful in *Hot Module Replacement* in developer mode. The implicit behavior for webpack internally in the case below is to create two dependencies internally. 
@@ -223,7 +223,7 @@ console.log(foo, bar) Specifier will be mapped into a specifier dependency if and only if it is used. JavaScript parser will first tag each variable [[source]](https://github.com/webpack/webpack/blob/86a8bd9618c4677e94612ff7cbdf69affeba1268/lib/dependencies/HarmonyImportDependencyParserPlugin.js#L137), and then create corresponding dependencies on each reading of dependency. [[source]](https://github.com/webpack/webpack/blob/86a8bd9618c4677e94612ff7cbdf69affeba1268/lib/dependencies/HarmonyImportDependencyParserPlugin.js#L189) and finally be replaced to the generated `importVar`. -##### Export(They are not module dependencies to be actual, but I placed here for convienence) +##### Export(They are not module dependencies to be actual, but I placed here for convenience) **`HarmonyExportHeaderDependency`** @@ -330,7 +330,7 @@ ConstDependency.Template = class ConstDependencyTemplate extends ( apply(dependency, source, templateContext) { const dep = /** @type {ConstDependency} */ (dependency); - // not necessary code is removed for clearer demostration + // not necessary code is removed for clearer demonstration if (dep.runtimeRequirements) { for (const req of dep.runtimeRequirements) { diff --git a/src/architecture/webpack/loader.md b/src/architecture/webpack/loader.md index 1cd22c4..0788784 100644 --- a/src/architecture/webpack/loader.md +++ b/src/architecture/webpack/loader.md @@ -646,7 +646,7 @@ module.exports.raw = true Image there is a need to transform an asset formatted with `png` to `jpg`. There is two abilities that webpack needs to support: -1. Handle the asset with `raw` content, or a `Buffer`. We can simply override the defualt resource reading behavior by exporting `raw`(covered before). +1. Handle the asset with `raw` content, or a `Buffer`. We can simply override the default resource reading behavior by exporting `raw`(covered before). 2. Change the filename, and reuse the loader for both `png` and `jpg`