Deep dive into Rspack & webpack tree shaking #17

hardfist · 2024-04-17T03:05:25Z

hardfist
Apr 17, 2024
Maintainer

This article primarily focuses on understanding the concept of Webpack Tree Shaking rather than delving deeply into the underlying code implementation. Code examples can be found at https://github.com/hardfist/treeshaking-cases.

One of the challenging aspects of Webpack Tree Shaking is that it involves multiple optimizations working together. Webpack's own use of the term "Tree Shaking" is somewhat inconsistent, often broadly referring to optimizations for dead code elimination. Tree Shaking is defined as:

Tree shaking is a term commonly used in the JavaScript context for dead-code elimination. 
It relies on the static structure of ES2015 module syntax, 
i.e. import and export. 
The name and concept have been popularized by the ES2015 module bundler rollup.

Elsewhere, Webpack's documentation uses "tree shaking" to mean the usedExports optimization specifically and treats sideEffects as a separate optimization:

The sideEffects and usedExports (more known as tree shaking) 
optimizations are two different things.

To avoid any ambiguity in understanding Tree Shaking, this discussion will not focus on Tree Shaking itself but rather on the various code optimizations under the category of Webpack Tree Shaking.

Webpack Tree Shaking primarily involves three types of optimizations:

usedExports Optimization: This involves removing unused export variables from modules, thereby further eliminating related side-effect-free statements.
sideEffects Optimization: This removes modules from the module graph when none of their export variables are used and the modules have no side effects.
DCE (Dead Code Elimination) Optimization: This is typically implemented by general minification tools to remove dead code, although similar functionalities can also be achieved by tools like Webpack's ConstPlugin.

These optimizations operate on different dimensions: usedExports focuses on export variables, sideEffects on entire modules, and DCE on JavaScript statements.

Consider the following example:

In lib.js, variable b is unused, and related code does not appear in the final output due to usedExports optimization.
In util.js, no export variables are used, resulting in the absence of the util module in the final output, which is a result of sideEffects optimization.
In bootstrap.js, the console.log('bad') call sits in a branch that can never execute, so it is removed from the final output, demonstrating DCE optimization.

// index.js
import { a } from './lib';
import { c } from './util';
import './bootstrap';
console.log(a);
// lib.js
export const a = 1;
export const b = 2;
// util.js
export const c = 3;
export const d = 4;
// bootstrap.js
console.log('bootstrap');
if(false){
   console.log('bad');
}else {
   console.log('good');
}

These optimizations are implemented independently but can influence each other. Below, we detail these optimizations and their interrelationships.
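As a rough orientation, each of the three optimizations is controlled by its own webpack setting. The sketch below assumes a webpack 5 project with a webpack.config.js; the combination of values is illustrative rather than a recommendation.

```javascript
// webpack.config.js (illustrative sketch)
module.exports = {
  // 'production' mode already enables these optimizations by default;
  // they are spelled out here only to show which flag does what.
  mode: 'production',
  optimization: {
    usedExports: true, // drop export properties that nothing imports
    sideEffects: true, // skip whole modules that are unused and side-effect-free
    minimize: true,    // run the minifier (Terser by default) for statement-level DCE
  },
};
```

Toggling one flag at a time is a convenient way to see which optimization is responsible for a given change in the output.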

DCE Optimization

DCE is relatively straightforward in Webpack, with two important scenarios:

False Branch

if(false){ 
   false_branch;
} else { 
   true_branch;
}

Here, because the false_branch will never execute, it can be directly removed. This has two effects: reducing the final code size and affecting the usage relationships of variables. Consider the following example:

import { a } from './a';
if(false){
  console.log(a);
}else {
  
}

If the false_branch is not removed, variable a would be considered used. Removing it marks a as unused, which can further influence analyses for usedExports and sideEffects. To address this, Webpack offers two opportunities for DCE:

Through the ConstPlugin during the parsing stage, which performs basic DCE so that the usage of imported and exported variables can be determined as accurately as possible, thereby improving the subsequent sideEffects and usedExports optimizations.
Through Terser's minify during the processAssets stage for more complex DCE, primarily aimed at reducing code size.

Terser's DCE is more time-consuming and intricate, whereas the ConstPlugin's optimization is simpler. For example, Terser can work out that the condition below is always false and remove the branch, but the ConstPlugin might not manage it:

function get_one(){
  return 1;
}
let res = get_one() + get_one();
if(res != 2){
  console.log(c);
}
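To make the difference concrete, here is a toy constant evaluator over hand-written ESTree-style nodes. It is not webpack's ConstPlugin; the function name and result shape are assumptions for illustration. It only shows why a literal test such as if(false) is easy to resolve during parsing, while a test that depends on a variable is not.

```javascript
// Toy model only: this is NOT webpack's ConstPlugin, just the general idea of
// "evaluate the test expression if it is trivially constant, otherwise give up".
function evaluateTest(node) {
  switch (node.type) {
    case "Literal":
      return { known: true, value: Boolean(node.value) };
    case "UnaryExpression": {
      if (node.operator !== "!") return { known: false };
      const inner = evaluateTest(node.argument);
      return inner.known ? { known: true, value: !inner.value } : inner;
    }
    default:
      // Identifiers, calls, and comparisons on variables would need
      // data-flow analysis, which this simple pass does not attempt.
      return { known: false };
  }
}

// if (false) { ... }    -> the test is a literal
const literalTest = { type: "Literal", value: false };

// if (res != 2) { ... } -> the test depends on the variable `res`
const variableTest = {
  type: "BinaryExpression",
  operator: "!=",
  left: { type: "Identifier", name: "res" },
  right: { type: "Literal", value: 2 },
};

const literalResult = evaluateTest(literalTest);   // known: branch can be dropped
const variableResult = evaluateTest(variableTest); // unknown: branch must be kept
```

A minifier can afford to inline get_one() and fold the arithmetic before deciding, which is the extra work the parsing-stage pass skips.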

Unused Top Level Statement

In a module, a top-level declaration that is neither exported nor referenced anywhere else in the module can also be removed, because deleting it has no observable effect. For example, b and test in the following can be safely deleted (assuming this is a module and not a script, as scripts would pollute the global scope and cannot be safely removed). Webpack's usedExports optimization leverages this characteristic to simplify its implementation.

// index.js
export const a = 10;
const b = 20;
function test(){
}

usedExports Optimization

Compared to similar optimizations by other bundlers, Webpack's usedExports optimization is quite clever. It uses the active status of dependencies to determine whether variables within a module are used. Then, during the code generation phase, if an export variable is unused, it does not generate corresponding export properties, thereby making the code segments that depend on the export variable dead code. This is further aided by subsequent minification for DCE.

Webpack enables usedExports optimization through the optimization.usedExports configuration. Consider the following example:

// index.js
import { a } from './lib';
console.log({a});
// lib.js
export const a = 1;
export const b = 2;

Without optimization.usedExports enabled, you can see that the output still contains the export of b:

var __webpack_modules__ = [ , (__unused_webpack_module, __webpack_exports__, __webpack_require__) => {
    __webpack_require__.r(__webpack_exports__);
    __webpack_require__.d(__webpack_exports__, {
        a: () => a,
        b: () => b  // b is not removed
    });
    const a = 1;
    const b = 2;
} ];

When optimization.usedExports is enabled, you see that the export of b is removed, but const b = 2 still exists. However, since b is unused, const b = 2 also becomes dead code:

/***/ ((__unused_webpack_module, __webpack_exports__, __webpack_require__) => {
/* harmony export */ __webpack_require__.d(__webpack_exports__, {
/* harmony export */   a: () => (/* binding */ a)
/* harmony export */ });
/* unused harmony export b */
const a = 1;
const b = 2; // this is actually dead code
/***/ })

With minification also enabled on top of optimization.usedExports, const b = 2 is removed because it is dead code:

(__unused_webpack_module, __webpack_exports__, __webpack_require__) => {
    __webpack_require__.d(__webpack_exports__, {
        a: () => a
    });
    const a = 1;
}, __webpack_module_cache__ = {};
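The marking step can be pictured with a small toy model. The data shapes and function name below are hypothetical and do not mirror webpack internals; the sketch only shows the idea of deriving unused exports from the references that importers actually make.

```javascript
// Toy model (hypothetical data shapes, not webpack internals).
const moduleExports = new Map([["./lib", ["a", "b"]]]);

// Each entry records "importer references export `name` of module `to`".
const importReferences = [{ from: "./index", to: "./lib", name: "a" }];

function findUnusedExports(moduleExports, importReferences) {
  const used = new Set(importReferences.map((ref) => `${ref.to}#${ref.name}`));
  const unused = new Map();
  for (const [id, names] of moduleExports) {
    unused.set(id, names.filter((name) => !used.has(`${id}#${name}`)));
  }
  return unused;
}

const unusedByModule = findUnusedExports(moduleExports, importReferences);
// For "./lib" only `b` is left over, so only `a` would get an export property.
```

Once the export property for b is no longer generated, nothing refers to const b = 2, and the minifier can treat it like any other unused local.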

However, analyzing whether b is used is not always straightforward. Consider the following case:

// index.js
import { a,b } from './lib';
console.log({a});
function test(){
  console.log(b);
}
function test1(){
  test();
}
// lib.js
export const a = 1;
export const b = 2;

Here, b is referenced inside the function test, so b is not removed from the output. Although test is never called, which implies that b is effectively unused, the basic usedExports analysis does not look that deep and cannot deduce this relationship:

(__unused_webpack_module, __webpack_exports__, __webpack_require__) => {
    __webpack_require__.d(__webpack_exports__, {
        a: () => a,
        b: () => b
    });
    const a = 1, b = 2;
}, __webpack_module_cache__ = {}

Fortunately, Webpack offers another configuration, optimization.innerGraph, which allows for deeper static analysis of the code. This can determine that b is not used, thus successfully removing the export property of b:

((__unused_webpack_module, __webpack_exports__, __webpack_require__) => {
/* harmony export */ __webpack_require__.d(__webpack_exports__, {
/* harmony export */   a: () => (/* binding */ a)
/* harmony export */ });
/* unused harmony export b */
const a = 1;
const b = 2;
/***/ })
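The idea behind this deeper analysis can be sketched as a reachability walk over the module's top-level symbols. This is a toy model with hypothetical names, not webpack's inner-graph implementation: an import counts as used only if it can be reached from code that runs when the module is evaluated.

```javascript
// Toy model of the idea behind inner-graph analysis (not webpack's code).
// Keys are top-level symbols of index.js; values are what each one references.
const symbolReferences = {
  "<top-level>": ["a"], // console.log({a}) runs when the module is evaluated
  test: ["b"],          // function test() { console.log(b); }
  test1: ["test"],      // function test1() { test(); }
};
const importedNames = new Set(["a", "b"]);

function reachableImports(symbolReferences, importedNames) {
  const seen = new Set();
  const stack = ["<top-level>"];
  while (stack.length > 0) {
    const symbol = stack.pop();
    if (seen.has(symbol)) continue;
    seen.add(symbol);
    for (const next of symbolReferences[symbol] ?? []) stack.push(next);
  }
  return [...importedNames].filter((name) => seen.has(name));
}

const usedImports = reachableImports(symbolReferences, importedNames);
// Neither test nor test1 is reachable from the top level, so b never shows up.
```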

DCE also impacts usedExports optimization. Consider the following case:

// index.js
import { a, b, c } from './lib';
console.log({a});
if(false){
  console.log(b);
}
function get_one(){
  return 1;
}
let res = get_one() + get_one();
if(res != 2){
  console.log(c);
}
// lib.js
export const a = 1;
export const b = 2;
export const c = 3;

Webpack relies on its internal ConstPlugin for this DCE: it successfully removes the export of b, but because of the ConstPlugin's limited capability, it fails to remove the export of c.

((__unused_webpack_module, __webpack_exports__, __webpack_require__) => {
/* harmony export */ __webpack_require__.d(__webpack_exports__, {
/* harmony export */   a: () => (/* binding */ a),
/* harmony export */   c: () => (/* binding */ c)
/* harmony export */ });
/* unused harmony export b */
const a = 1;
const b = 2;
const c = 3;
/***/ })

sideEffects Optimization

While usedExports optimization focuses on optimizing export variables, sideEffects optimization is more thorough and efficient, targeting the removal of entire modules. For a module to be safely removed, it must meet two conditions: none of its export variables are used, and the module must be side-effect-free.

Webpack enables sideEffects optimization through the optimization.sideEffects configuration. Let's look at a simple example:

// index.js
import { a } from './lib';
import { c } from './util';
console.log({a});
// lib.js
export const a = 1;
export const b = 2;
// util.js
export const c = 123;
export const d = 456;

Without optimization.sideEffects enabled, the output retains the util module:

/***/ "./src/lib.js":
/***/ ((__unused_webpack_module, __webpack_exports__, __webpack_require__) => {
__webpack_require__.r(__webpack_exports__);
/* harmony export */ __webpack_require__.d(__webpack_exports__, {
/* harmony export */   a: () => (/* binding */ a),
/* harmony export */   b: () => (/* binding */ b)
/* harmony export */ });
const a = 1;
const b = 2;
/***/ }),
/***/ "./src/util.js":
/***/ ((__unused_webpack_module, __webpack_exports__, __webpack_require__) => {
__webpack_require__.r(__webpack_exports__);
/* harmony export */ __webpack_require__.d(__webpack_exports__, {
/* harmony export */   c: () => (/* binding */ c),
/* harmony export */   d: () => (/* binding */ d)
/* harmony export */ });
const c = 123;
const d = 456;
/***/ })

When optimization.sideEffects is enabled, util.js is removed from the output. This occurs because util meets both conditions required for removal. Let's explore what happens when we violate each condition:

First, introduce side effects in util.js:

export const c = 123;
export const d = 456;
console.log('hello');

This change causes util.js to reappear in the output. Now, revert that change and modify index.js to use variable c from util.js:

import { a } from './lib';
import { c } from './util';
console.log({a}, c);

This modification also causes util.js to reappear in the output. These experiments demonstrate that both conditions must be met for a module to be safely removed. Ensuring these conditions are met is crucial for effectively leveraging sideEffect optimizations in practical applications.
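The three experiments can be summarized as a single predicate. This is a toy sketch that assumes both facts about the module have already been computed elsewhere; the function and field names are hypothetical.

```javascript
// Toy predicate: a module may be dropped only when both conditions hold.
function canRemoveModule({ usedExports, hasSideEffects }) {
  return usedExports.length === 0 && !hasSideEffects;
}

// util.js as originally written: nothing used, no side effects.
const original = canRemoveModule({ usedExports: [], hasSideEffects: false });

// util.js with console.log('hello') added at the top level.
const withConsoleLog = canRemoveModule({ usedExports: [], hasSideEffects: true });

// index.js changed to console.log({a}, c).
const withCUsed = canRemoveModule({ usedExports: ["c"], hasSideEffects: false });
```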

Let's revisit the two conditions necessary for the safe removal of a module:

Unused Export Variables

This condition, while seemingly straightforward, encounters similar challenges to those found in usedExports optimization and may require extensive analysis to determine how a variable is used.

Consider the following example, where c is used within the function test, preventing the successful removal of util.js:

// index.js
import { a } from './lib';
import { c } from './util';
console.log({a});
function test(){
  console.log(c);
}
// lib.js
export const a = 1;
export const b = 2;
// util.js
export const c = 123;
export const d = 456;

When we enable optimization.innerGraph, Webpack conducts a deeper analysis and determines that test is also unused, which implies that c is unused as well, allowing for the correct removal of util.js.

sideEffects Property

Compared to whether a variable is used, determining if a module has side effects is a more complex process. Consider the following modification to util.js:

export const c = 123;
export const d = test();
function test(){
  return 456;
}

In this case, although the function test is a side-effect-free function call, Webpack is unable to determine this and still considers the module as potentially having side effects. As a result, util.js is included in the final output.

To inform Webpack that test has no side effects, two approaches are available:

Pure Annotation: By marking the function call with a pure annotation, you indicate that the function has no side effects:

export const c = 123;
export const d = /*#__PURE__*/ test();
function test(){
  return 456;
}

sideEffects Property: When a module contains numerous top-level statements, marking each with a pure annotation can be cumbersome and error-prone. Thus, Webpack introduced the sideEffects property to label the entire module as side-effect-free. Adding "sideEffects": false to the module's package.json allows util.js to be safely removed:

// package.json
{
  "sideEffects": false
}
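The two approaches can be contrasted in a toy model. The field names below (pure, packageSideEffects, initializers) are hypothetical stand-ins for a /*#__PURE__*/ annotation and for the package.json flag; real bundlers work on a full AST and handle many more statement kinds.

```javascript
// Toy model with hypothetical field names; real bundlers work on a full AST.
function initializerIsPure(init) {
  if (init.type === "Literal") return true;
  // `pure` stands in for a /*#__PURE__*/ annotation on the call.
  if (init.type === "CallExpression") return init.pure === true;
  return false; // anything else is conservatively treated as impure
}

function moduleHasSideEffects(mod) {
  // An explicit "sideEffects": false short-circuits statement analysis.
  if (mod.packageSideEffects === false) return false;
  return !mod.initializers.every(initializerIsPure);
}

// export const c = 123; export const d = test();
const plainCall = { initializers: [{ type: "Literal" }, { type: "CallExpression" }] };
// export const d = /*#__PURE__*/ test();
const annotated = { initializers: [{ type: "Literal" }, { type: "CallExpression", pure: true }] };
// Same source as plainCall, but the package declares "sideEffects": false.
const flagged = { packageSideEffects: false, initializers: plainCall.initializers };
```

The annotation is a per-call promise, while the package-level flag is a promise about the whole file, which is why the flag scales better but is also easier to get wrong.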

However, a challenge arises when a module marked as sideEffects: false depends on another module marked as sideEffects: true. Consider the scenario where button.js imports button.css, with button.js being sideEffects: false and button.css being sideEffects: true:

// package.json
{
    "sideEffects": ["**/*.css", "**/side-effect.js"]
}
// a.js
import { Button } from 'antd';
// index.js
import { Button } from './button';
// button.js
import './button.css';
import './side-effect';
export const Button = () => {
  return `<div class="button">btn</div>`
}
// button.css
.button {
  background-color: red;
}
// side-effect.js
console.log('side-effect');

If sideEffects were only marking the current module for side effects, according to ESM standards, because button.css and side-effect.js have side effects, they should be bundled. However, Webpack's output does not include button.css or side-effect.js.

Therefore, the true meaning of the sideEffects field is:

sideEffects is much more effective since it allows to skip whole modules/files and the complete subtree.

If a module is marked as sideEffects: false, it implies that if the module's export variables are unused, then the module and its entire subtree can be safely removed. This explanation clarifies why, in the given example, both button.js and its subtree (including button.css and side-effect.js) can be safely deleted, which is particularly useful in the context of component libraries.
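The subtree rule can be sketched as a graph walk that refuses to enter a side-effect-free module when none of its exports are used. The graph shape and field names are hypothetical, and this is a toy model rather than webpack's algorithm.

```javascript
// Toy model of the subtree rule (hypothetical data shapes, not webpack internals).
const graph = {
  "index.js": { sideEffectFree: false, imports: [{ to: "button.js", usedNames: [] }] },
  "button.js": {
    sideEffectFree: true, // does not match the "sideEffects" globs
    imports: [
      { to: "button.css", usedNames: [] },
      { to: "side-effect.js", usedNames: [] },
    ],
  },
  "button.css": { sideEffectFree: false, imports: [] },
  "side-effect.js": { sideEffectFree: false, imports: [] },
};

function includedModules(graph, entry) {
  const included = new Set();
  const visit = (id) => {
    if (included.has(id)) return;
    included.add(id);
    for (const { to, usedNames } of graph[id].imports) {
      // A side-effect-free module with no used exports is skipped,
      // and so is everything that is only reachable through it.
      if (graph[to].sideEffectFree && usedNames.length === 0) continue;
      visit(to);
    }
  };
  visit(entry);
  return [...included];
}

// Button is imported but never used: only the entry survives.
const withoutUse = includedModules(graph, "index.js");

// Once index.js actually uses Button, the whole subtree comes back.
graph["index.js"].imports[0].usedNames = ["Button"];
const withUse = includedModules(graph, "index.js");
```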

Unfortunately, this behavior varies across different bundlers. Testing has shown:

Webpack: Safely deletes side-effect-laden CSS and JS in the subtree.
esbuild: Deletes side-effect-laden JS in the subtree but not CSS.
Rollup: Does not delete side-effect-laden JS in the subtree (does not handle CSS).

Barrel Module

SideEffects optimization can optimize not only leaf node modules but also intermediate nodes. Consider a common pattern where a module re-exports the contents of other modules. If such a module itself (here referred to as mid) does not have any of its export variables used and only serves to re-export other modules' content, is it necessary to retain the re-export module?

// index.js
import { Button } from './components';
console.log('button:', Button);
// components/index.js
export * from './button';
export * from './tab';
export const mid = 'middle';
// components/button.js
export const Button = () => 'button';

Testing shows that Webpack removes the re-export module entirely, and index.js imports the content directly from button.js:

    (() => {
        __webpack_require__.r(__webpack_exports__);
        var _components__WEBPACK_IMPORTED_MODULE_0__ = __webpack_require__("./src/components/button.js");
        console.log("button:", _components__WEBPACK_IMPORTED_MODULE_0__.Button);
    })();

This behavior appears as if the source code's import path was directly modified:

- import { Button } from './components';
+ import { Button } from './components/button';

Frameworks like Next.js and UmiJS also offer similar optimizations (for example, Next.js's Optimize Package Imports). Their approach involves rewriting these paths at the loader stage. It's important to note that while Webpack's barrel optimization focuses on the output, it still builds components/index.js and its sub-dependencies during the build phase. In contrast, the techniques used by Next.js and others modify the source code directly, meaning components/index.js does not participate in the build. This can significantly optimize builds involving libraries that re-export hundreds or thousands of sub-modules.
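A source-level rewrite of this kind can be sketched with a small string transform. This is not how Next.js or UmiJS implement it; the function name, the regular expression, and the export map are assumptions for illustration, and a real implementation would work on a parsed AST.

```javascript
// Toy source rewrite (hypothetical helper, not a framework API).
// `exportMap` maps an exported name to the file that really defines it.
function rewriteBarrelImports(source, barrel, exportMap) {
  const pattern = /import\s*\{([^}]*)\}\s*from\s*(['"])([^'"]+)\2;?/g;
  return source.replace(pattern, (statement, names, quote, specifier) => {
    if (specifier !== barrel) return statement;
    const wanted = names.split(",").map((n) => n.trim()).filter(Boolean);
    // Leave the import untouched unless every name has a known direct source
    // (this also leaves aliased imports such as `X as Y` alone).
    if (!wanted.every((n) => exportMap.has(n))) return statement;
    return wanted
      .map((n) => `import { ${n} } from ${quote}${exportMap.get(n)}${quote};`)
      .join("\n");
  });
}

const before = "import { Button } from './components';\nconsole.log('button:', Button);";
const after = rewriteBarrelImports(
  before,
  "./components",
  new Map([["Button", "./components/button"]])
);
```

Because the barrel specifier no longer appears in the rewritten source, the bundler never has to resolve or parse components/index.js at all.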

We also tested the behavior of esbuild and Rollup regarding this:

esbuild: Deletes side effects within the barrel module. See example
Rollup: Does not delete side effects within the barrel module. See example

Investigating Webpack Tree Shaking Issues

A frequent issue encountered during on-call duties is "Why has my tree shaking failed?" Troubleshooting such issues can be quite challenging. When faced with this question, the first thought is typically "Which of the tree shaking optimizations has failed?" This generally falls into one of three categories:

SideEffect Optimization Failure

The failure of sideEffect optimization is typically indicated by a module, whose export variables are not used, being included in the bundle.

A lesser-known feature of Webpack is its ability to debug various optimization bailouts through stats.optimizationBailout, including reasons for sideEffect bailouts. Consider the following example:

// index.js
import { a } from './lib';
import { abc } from './util';
console.log({a});
// lib.js
export const a = 1;
export const b = 2;
// util.js
export function abc(){
  console.log('abc');
}
export function def(){
  console.log('def')
}
console.log('xxx');

Compile with optimization.sideEffects=true and stats.optimizationBailout:true:

Webpack's logs clearly indicate that the console.log('xxx') on line 7 of util.js caused the sideEffect optimization to fail, resulting in the module being included in the bundle.
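For reference, the two settings used in this experiment could be expressed in webpack.config.js roughly as follows. This is a minimal sketch; any other options the project needs are omitted.

```javascript
// webpack.config.js (illustrative sketch)
module.exports = {
  optimization: {
    sideEffects: true,
  },
  stats: {
    // Print the reason whenever an optimization bails out for a module.
    optimizationBailout: true,
  },
};
```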

If we further configure sideEffects: false in package.json, this warning disappears because, with the sideEffect Property set, Webpack ceases side effect analysis and directly bases sideEffect optimization on the sideEffects field.

usedExports Optimization Failure

A failure in usedExports optimization manifests when an unused export variable still generates export properties.

In such cases, it is necessary to identify where the export properties are being used.

However, determining why and where a variable is used can be unclear, as Webpack does not provide detailed records of this. A possible improvement for Webpack could be to track and report where in the module tree specific export variables are used. This would greatly facilitate the analysis and troubleshooting of usedExports optimization issues.

DCE (Dead Code Elimination) Optimization Failure

Beyond the issues with sideEffect and usedExports optimizations, most other tree shaking failures can be attributed to failures in DCE. Common causes of DCE failure include dynamic code constructs like eval and new Function, which can lead to bailout during minification. Troubleshooting these issues typically relates to the minifier used and often requires bisecting the output code to identify the problem. Unfortunately, current minifiers seldom provide detailed reasons for bailouts, which is an area where future enhancements could be beneficial.
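A small example shows why dynamic constructs are a problem for static analysis. In the sketch below (names are illustrative), the local variable is never mentioned by identifier outside its declaration, yet removing or renaming it would break the program.

```javascript
function compute() {
  const offset = 41; // looks unused if you only read identifiers statically
  // Direct eval runs in the enclosing scope, so it can read `offset`.
  return eval("offset + 1");
}

const result = compute();
```

A minifier that wanted to delete or rename offset would have to prove that the evaluated string never refers to it, which in general it cannot, so it has to be conservative about the surrounding scope.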

In conclusion, effective tree shaking in Webpack requires a deep understanding of the various optimizations involved and how they interact. By correctly configuring and applying these optimizations, developers can significantly reduce the size of their bundles, enhancing performance and efficiency. As Webpack and other bundling tools evolve, ongoing learning and adjustment will be necessary to maintain optimal application performance.

alvaro450 · 2024-04-19T20:24:41Z

alvaro450
Apr 19, 2024

@hardfist this is a great explanation of Tree Shaking and the whole process involved, thank you for putting this together!
There are a few minor invalid references in the usedExports Optimization: The explanation refers to const b = 1; but the code example has const b = 2;

1 reply

hardfist Apr 20, 2024
Maintainer Author

thanks, it's fixed now

khangviet1996 · 2024-08-01T01:18:18Z

khangviet1996
Aug 1, 2024

Love your deep-dive articles. Keep writing <3

0 replies

wxiaoyun · 2024-08-02T08:29:18Z

wxiaoyun
Aug 2, 2024

Very insightful and digestible The examples brought the point across very clearly ❤️

0 replies

shuyulxf · 2024-10-25T08:07:56Z

shuyulxf
Oct 25, 2024

In this example, why are these useless statement not removed by DCE instead of usedExports

0 replies

sibelius · 2024-12-03T02:31:44Z

sibelius
Dec 3, 2024

any plans to bring barrel optimization to rspack or swc ?

0 replies
