Implementing Webpack from Scratch, But in Rust - [2] MVP Version

ayou - Oct 28 - Dev Community

Referencing mini-webpack, I implemented a simple webpack from scratch using Rust. This allowed me to gain a deeper understanding of webpack and also improve my Rust skills. It's a win-win situation!

Code repository: https://github.com/ParadeTo/rs-webpack

This article corresponds to Pull Request

The previous section introduced how to use Oxc to parse and modify JS code, solving the core problem. Now we can implement an MVP version. The goal of this MVP is to bundle the following code and produce the expected output:

// index.js
const b = require('./const.js')
console.log(b)

// const.js
module.exports = 'hello'

We create a Compiler struct with the following properties and methods:

pub struct Compiler {
    config: Config,
    entry_id: String,
    root: String,
    modules: HashMap<String, String>,
    assets: HashMap<String, String>,
}

impl Compiler {
    pub fn new(config: Config) -> Compiler {
        // Implementation
    }

    fn parse(
        &self,
        module_path: PathBuf,
        parent_path: &Path,
    ) -> (String, Rc<RefCell<Vec<String>>>) {
        // Get module's code and dependencies
    }

    fn build_module(&mut self, module_path: PathBuf, is_entry: bool) {
        // Call build_module recursively to get modules (key is module_id, value is the code)
    }

    fn emit_file(&mut self) {
        // Output result
    }

    pub fn run(&mut self) {
        // Entry point
    }
}

In this code, run is the entry point. It calls build_module, which first calls parse to get the transformed code of a JS module together with the list of its dependencies. build_module then calls itself recursively for each dependency, filling the modules HashMap along the way. The keys of modules are module IDs (paths relative to the root) and the values are the transformed module code. Finally, run calls emit_file to write the result. You can find the complete code in the Pull Request. Let's discuss some key points.
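To make the recursion concrete, here is a minimal sketch that is not the repository's code. It assumes an in-memory files map instead of the file system, a hypothetical find_requires helper that uses a naive string scan in place of the Oxc-based parse, and that every file sits in the root so the require argument is already the module ID. The check for already-built modules is also an addition for illustration.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical stand-in for the Oxc-based `parse`: collects the string
/// arguments of `require('...')` / `require("...")` with a naive scan.
fn find_requires(code: &str) -> Vec<String> {
    let mut deps = Vec::new();
    let mut rest = code;
    while let Some(pos) = rest.find("require(") {
        rest = &rest[pos + "require(".len()..];
        // Only handle a quoted string literal as the first argument.
        if let Some(quote) = rest.chars().next().filter(|c| *c == '\'' || *c == '"') {
            let body = &rest[1..];
            if let Some(end) = body.find(quote) {
                deps.push(body[..end].to_string());
                rest = &body[end..];
            }
        }
    }
    deps
}

struct Compiler {
    /// Assumption: in-memory sources keyed by module ID (replaces disk reads).
    files: HashMap<String, String>,
    /// module ID -> transformed code (BTreeMap only to get a stable order).
    modules: BTreeMap<String, String>,
}

impl Compiler {
    fn build_module(&mut self, module_id: &str) {
        // Skip modules that were already built; this also stops cycles.
        if self.modules.contains_key(module_id) {
            return;
        }
        let source = match self.files.get(module_id) {
            Some(source) => source.clone(),
            None => return,
        };
        let deps = find_requires(&source);
        // Naive textual transform standing in for the AST rewrite.
        let transformed = source.replace("require(", "__webpack_require__(");
        self.modules.insert(module_id.to_string(), transformed);
        // Recurse into every dependency found in this module.
        for dep in deps {
            self.build_module(&dep);
        }
    }
}
```

Calling build_module("./index.js") on the two files from the beginning of the article should leave two entries in modules, one per file, with require already rewritten in the entry module.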

Transform

In a previous example, we showed how to modify the argument of require by changing it to a fixed value, full_path_of_b:

string_literal.value = Atom::from("full_path_of_b")

However, in actual development, this argument is dynamic and related to the current JS module's path. Let's use the following example to demonstrate how to add a dynamic prefix to the argument of require:

struct MyTransform {
    prefix: String,
}

impl<'a> Traverse<'a> for MyTransform {
    fn enter_call_expression(&mut self, node: &mut CallExpression<'a>, ctx: &mut TraverseCtx<'a>) {
        if node.is_require_call() {
            let argument: &mut Argument<'a> = &mut node.arguments.deref_mut()[0];
            match argument {
                Argument::StringLiteral(string_literal) => {
                    let old_name = string_literal.value.as_str();
                    let new_name = format!("{}{}", self.prefix, old_name);

                    // !!!!!! `new_name` does not live long enough
                    string_literal.value = Atom::from(new_name.as_str());
                }
                _ => {}
            }
        }
    }
}

The above code does not compile. The error is: new_name does not live long enough. new_name is a local String that is dropped at the end of the match arm, but the Atom stored in the AST node must borrow a string that lives for 'a, the lifetime of the AST itself. The solution is to use:

string_literal.value = ctx.ast.atom(new_name.as_str());

The reason can be explained by examining the source code of ctx.ast.atom:

#[inline]
pub fn atom(self, value: &str) -> Atom<'a> {
    Atom::from(String::from_str_in(value, self.allocator).into_bump_str())
}

We can see that the atom method does not introduce a lifetime parameter of its own; the 'a in its return type comes from AstBuilder<'a>. It still ends up calling Atom::from, but the string passed in is a copy created by String::from_str_in inside self.allocator, and that allocator reference carries the lifetime 'a:

pub struct AstBuilder<'a> {
    pub allocator: &'a Allocator,
}

Allocator is an arena (bump) allocator built on bumpalo. Arena allocators are commonly used when implementing parsers. You can refer to this tutorial for more information, but for now, we will skip it.
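The lifetime argument can still be illustrated without bumpalo. The sketch below is not Oxc's implementation: ToyArena, Literal and add_prefix are hypothetical names, and the arena is a deliberately simple one that copies each string onto the heap and frees everything when it is dropped. It only shows why copying the string into something that outlives the function makes the borrow valid for 'a.

```rust
use std::cell::RefCell;

/// Hypothetical toy arena: owns every string copied into it until dropped.
struct ToyArena {
    items: RefCell<Vec<*mut str>>,
}

impl ToyArena {
    fn new() -> Self {
        ToyArena { items: RefCell::new(Vec::new()) }
    }

    /// Copy `value` into the arena. The returned reference is valid for as
    /// long as the arena is borrowed, not just for the caller's scope.
    fn alloc_str<'a>(&'a self, value: &str) -> &'a str {
        let raw: *mut str = Box::into_raw(value.to_owned().into_boxed_str());
        self.items.borrow_mut().push(raw);
        // SAFETY: the allocation is freed only in `Drop`, which cannot run
        // while a `&'a self` borrow (and thus the returned `&'a str`) exists.
        unsafe { &*raw }
    }
}

impl Drop for ToyArena {
    fn drop(&mut self) {
        for raw in self.items.get_mut().drain(..) {
            // SAFETY: every pointer came from `Box::into_raw` and is freed once.
            unsafe { drop(Box::from_raw(raw)) };
        }
    }
}

/// Hypothetical stand-in for an AST string literal node.
struct Literal<'a> {
    value: &'a str,
}

fn add_prefix<'a>(arena: &'a ToyArena, node: &mut Literal<'a>, prefix: &str) {
    let new_name = format!("{}{}", prefix, node.value);
    // `node.value = new_name.as_str();` would be rejected here:
    // `new_name` is dropped when this function returns.
    node.value = arena.alloc_str(&new_name);
}
```

With an arena created before the node, add_prefix(&arena, &mut node, "./src/") leaves node.value pointing at the arena's copy, which stays valid after new_name is gone. This mirrors, in spirit, what ctx.ast.atom does with the Oxc allocator.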

Emit File

The final bundle is rendered with a template engine called sailfish. The template looks like this:

(function(modules) {
    var installedModules = {};
    ...

    // Load entry module and return exports
    return __webpack_require__(__webpack_require__.s = "<%- entry_id %>");
})
({
   <% for (key, value) in modules { %>
     "<%- key %>":
     (function(module, exports, __webpack_require__) {
       eval(`<%- value %>`);
     }),
   <%}%>
});

We only need to pass entry_id and the contents of modules to the template.
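To show what the template loop amounts to, here is a rough sketch that builds the same two pieces with plain format! calls instead of sailfish. render_modules and render_entry_call are hypothetical helper names, the runtime part elided in the template is left out, and a BTreeMap is used only so the output order is stable.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: renders the object literal that the template's
/// `for (key, value) in modules` loop produces.
fn render_modules(modules: &BTreeMap<String, String>) -> String {
    let mut out = String::from("({\n");
    for (key, value) in modules {
        // `{{` and `}}` are literal braces inside format strings.
        out.push_str(&format!(
            "  \"{}\":\n  (function(module, exports, __webpack_require__) {{\n    eval(`{}`);\n  }}),\n",
            key, value
        ));
    }
    out.push_str("});\n");
    out
}

/// Hypothetical helper: renders the line that loads the entry module.
fn render_entry_call(entry_id: &str) -> String {
    format!(
        "return __webpack_require__(__webpack_require__.s = \"{}\");",
        entry_id
    )
}
```

Because each module's code is placed inside a JavaScript template literal, a module containing a backtick or ${ would need escaping first; the sketch, like the template, inserts the code unescaped.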

Running cargo run produces the following output:

// out/bundle.js
(function(modules) {
    var installedModules = {};
    ...
    // Load entry module and return exports
    return __webpack_require__(__webpack_require__.s = "./index.js");
})
({

     "./const.js":
     (function(module, exports, __webpack_require__) {
       eval(`module.exports = "hello";
`);
     }),

     "./index.js":
     (function(module, exports, __webpack_require__) {
       eval(`const b = __webpack_require__("./const.js");
console.log(b);
`);
     }),

});

If running out/bundle.js with Node.js prints hello, the MVP works.

Please kindly give me a star!
