Using mini-webpack as a reference, I implemented a simple webpack from scratch in Rust. This gave me a deeper understanding of webpack and improved my Rust skills along the way. It's a win-win!
Code repository: https://github.com/ParadeTo/rs-webpack
This article corresponds to Pull Request
The previous section introduced how to use Oxc to parse and modify JS code, solving the core problem. Now we can implement an MVP version. The goal of this MVP is to bundle the following code and produce the expected output:
// index.js
const b = require('./const.js')
console.log(b)
// const.js
module.exports = 'hello'
We create a `Compiler` struct with the following fields and methods:
pub struct Compiler {
    config: Config,
    entry_id: String,
    root: String,
    modules: HashMap<String, String>,
    assets: HashMap<String, String>,
}

impl Compiler {
    pub fn new(config: Config) -> Compiler {
        // Implementation
    }

    fn parse(
        &self,
        module_path: PathBuf,
        parent_path: &Path,
    ) -> (String, Rc<RefCell<Vec<String>>>) {
        // Get module's code and dependencies
    }

    fn build_module(&mut self, module_path: PathBuf, is_entry: bool) {
        // Call build_module recursively to get modules (key is module_id, value is the code)
    }

    fn emit_file(&mut self) {
        // Output result
    }

    pub fn run(&mut self) {
        // Entry point
    }
}
In this code, `run` is the entry point and calls `build_module`. `build_module` first calls `parse`, which returns the transformed code of the JS module together with its dependencies. It then calls itself recursively on each dependency, filling the `modules` HashMap, whose keys are module IDs (paths relative to the root) and whose values are the transformed module code. Finally, `run` calls `emit_file` to write the output. You can find the complete code in the Pull Request. Let's discuss some key points.
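The recursion described above can be sketched with the standard library alone. This is only an illustration under several assumptions: sources come from an in-memory `files` map instead of the disk, a naive string scan for single-quoted `require('...')` calls stands in for the Oxc-based `parse`, and every module sits in the root directory so the `require` argument is already the module ID. The names `parse_naive` and `build_module_sketch` are hypothetical and do not appear in the repository.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the Oxc-based `parse`: rewrites each
/// single-quoted `require('x')` to `__webpack_require__("x")` and
/// returns the new code together with the collected dependencies.
fn parse_naive(source: &str) -> (String, Vec<String>) {
    let marker = "require('";
    let mut deps = Vec::new();
    let mut code = String::new();
    let mut rest = source;
    while let Some(pos) = rest.find(marker) {
        let after = &rest[pos + marker.len()..];
        let Some(end) = after.find("')") else { break };
        let dep = &after[..end];
        deps.push(dep.to_string());
        code.push_str(&rest[..pos]);
        code.push_str(&format!("__webpack_require__(\"{}\")", dep));
        rest = &after[end + 2..];
    }
    code.push_str(rest);
    (code, deps)
}

/// Builds `module_id`, then recurses into its dependencies.
/// Assumption of this sketch: the `require` argument is already the module ID.
fn build_module_sketch(
    files: &HashMap<String, String>,
    module_id: &str,
    modules: &mut HashMap<String, String>,
) {
    // Skip modules that were already built; this also stops cycles.
    if modules.contains_key(module_id) {
        return;
    }
    // Unknown modules are silently ignored in this sketch.
    let Some(source) = files.get(module_id) else { return };
    let (code, deps) = parse_naive(source);
    modules.insert(module_id.to_string(), code);
    for dep in deps {
        build_module_sketch(files, &dep, modules);
    }
}
```

Calling `build_module_sketch(&files, "./index.js", &mut modules)` on the two example files fills `modules` with one entry per file, keyed by module ID. The early return for already-built modules is a choice of this sketch, not something the article states about the real implementation.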
Transform
In a previous example, we showed how to modify the argument of `require` by changing it to a fixed value, `full_path_of_b`:
string_literal.value = Atom::from("full_path_of_b")
However, in actual development, this argument is dynamic and depends on the current JS module's path. The following example tries to add a dynamic `prefix` to the argument of `require`:
struct MyTransform {
    prefix: String,
}

impl<'a> Traverse<'a> for MyTransform {
    fn enter_call_expression(&mut self, node: &mut CallExpression<'a>, ctx: &mut TraverseCtx<'a>) {
        if node.is_require_call() {
            let argument: &mut Argument<'a> = &mut node.arguments.deref_mut()[0];
            match argument {
                Argument::StringLiteral(string_literal) => {
                    let old_name = string_literal.value.as_str();
                    let new_name = format!("{}{}", self.prefix, old_name);
                    // !!!!!! `new_name` does not live long enough
                    string_literal.value = Atom::from(new_name.as_str());
                }
                _ => {}
            }
        }
    }
}
The code above does not compile; the compiler reports that `new_name` does not live long enough. `new_name` is a local `String` that is dropped when its enclosing block ends, but `string_literal.value` is an `Atom<'a>`, so the string slice passed to `Atom::from` must stay valid for the whole lifetime `'a`. The solution is to use:
string_literal.value = ctx.ast.atom(new_name.as_str());
The reason becomes clear from the source code of `ctx.ast.atom`:
#[inline]
pub fn atom(self, value: &str) -> Atom<'a> {
    Atom::from(String::from_str_in(value, self.allocator).into_bump_str())
}
The `atom` method does not introduce a lifetime of its own and ultimately calls `Atom::from` as well, but the string it passes is created by `String::from_str_in`, which copies `value` into `self.allocator`. That allocator reference carries the lifetime `'a`, so the copied string lives as long as the allocator does:
pub struct AstBuilder<'a> {
    pub allocator: &'a Allocator,
}
The `Allocator` is a memory allocator based on bumpalo. Arena allocators like this seem to be commonly used in parser implementations. You can refer to this tutorial for more information, but for now, we will skip it.
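The lifetime argument can be reproduced without Oxc. The sketch below uses a toy, std-only arena as a stand-in for the bump allocator; `ToyArena`, `Literal`, and `add_prefix` are hypothetical names invented for illustration and are not Oxc or bumpalo APIs. It only aims to show why copying the string into a longer-lived owner satisfies `'a`, not how bumpalo is actually implemented.

```rust
use std::cell::RefCell;

/// Toy arena: owns every string copied into it and frees them all on drop.
struct ToyArena {
    // Raw pointers to individually heap-allocated strings.
    strings: RefCell<Vec<*mut str>>,
}

impl ToyArena {
    fn new() -> Self {
        ToyArena { strings: RefCell::new(Vec::new()) }
    }

    /// Copies `value` into the arena and returns a slice that is valid
    /// for as long as the arena itself is borrowed.
    fn alloc_str<'a>(&'a self, value: &str) -> &'a str {
        let raw: *mut str = Box::into_raw(String::from(value).into_boxed_str());
        self.strings.borrow_mut().push(raw);
        // SAFETY: the allocation is never moved or freed before the arena
        // is dropped, and the arena cannot be dropped while `'a` is alive.
        unsafe { &*raw }
    }
}

impl Drop for ToyArena {
    fn drop(&mut self) {
        for raw in self.strings.get_mut().drain(..) {
            // SAFETY: each pointer came from `Box::into_raw` and is freed once.
            unsafe { drop(Box::from_raw(raw)) };
        }
    }
}

/// Stand-in for an AST string literal whose value borrows for `'a`.
struct Literal<'a> {
    value: &'a str,
}

fn add_prefix<'a>(arena: &'a ToyArena, prefix: &str, literal: &mut Literal<'a>) {
    let new_name = format!("{}{}", prefix, literal.value);
    // `literal.value = new_name.as_str();` would be rejected here:
    // `new_name` is dropped at the end of this function.
    literal.value = arena.alloc_str(&new_name);
}
```

Because `alloc_str` ties the returned slice to the borrow of the arena rather than to the temporary `String`, the assignment type-checks, which mirrors the role `self.allocator` plays in `atom`.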
Emit File
When outputting the final bundled file, a template engine called sailfish is used. The template looks like this:
(function(modules) {
  var installedModules = {};
  ...
  // Load entry module and return exports
  return __webpack_require__(__webpack_require__.s = "<%- entry_id %>");
})
({
  <% for (key, value) in modules { %>
  "<%- key %>":
  (function(module, exports, __webpack_require__) {
    eval(`<%- value %>`);
  }),
  <%}%>
});
We only need to render `entry_id` and the contents of `modules` into the template.
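To make the loop in the template concrete, here is a hand-written, std-only approximation of what it renders for the module map. It is not how the project does it (the project uses sailfish), and `render_modules` is a hypothetical helper. The sketch uses a `BTreeMap` so the output order is deterministic, whereas the `Compiler` struct stores its modules in a `HashMap`.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the template's module loop: emits one
/// `"id": (function(...) { eval(`code`); }),` entry per module.
fn render_modules(modules: &BTreeMap<String, String>) -> String {
    let mut out = String::from("({\n");
    for (key, value) in modules {
        // `{{` and `}}` are escaped braces in `format!`.
        out.push_str(&format!(
            "\"{}\":\n(function(module, exports, __webpack_require__) {{\neval(`{}`);\n}}),\n",
            key, value
        ));
    }
    out.push_str("});\n");
    out
}
```

The returned string corresponds to the second half of the template, the object literal passed to the wrapper function; the wrapper itself is left out because its body is elided in the article.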
Running `cargo run` produces the following output:
// out/bundle.js
(function(modules) {
  var installedModules = {};
  ...
  // Load entry module and return exports
  return __webpack_require__(__webpack_require__.s = "./index.js");
})
({
  "./const.js":
  (function(module, exports, __webpack_require__) {
    eval(`module.exports = "hello";
`);
  }),
  "./index.js":
  (function(module, exports, __webpack_require__) {
    eval(`const b = __webpack_require__("./const.js");
console.log(b);
`);
  }),
});
If running the bundle with Node.js prints `hello`, the MVP has been completed successfully.
Please kindly give me a star!