Implementing Webpack from Scratch, But in Rust - [1] Parsing and Modifying JS Code Using Oxc

ayou - Oct 24 - Dev Community

Using mini-webpack as a reference, I implemented a simple webpack from scratch in Rust. This gave me a deeper understanding of webpack and improved my Rust skills at the same time. It's a win-win situation!

Code repository: https://github.com/ParadeTo/rs-webpack

This article corresponds to Pull Request

To implement a simple webpack, the first problem to solve is parsing JavaScript code. Building a JavaScript parser from scratch is a monumental task, so it's better to choose an existing tool. Here, I chose oxc, which has been endorsed by Evan You.

Although oxc doesn't have as detailed documentation as Babel, the usage patterns are similar. First, we need to use oxc_parser to parse the JS code and generate an AST (Abstract Syntax Tree):

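use std::{env, path::Path, sync::Arc};

use oxc_allocator::Allocator;
use oxc_parser::Parser;
use oxc_span::SourceType;

// The statements below run inside a function that returns a Result (note the `?`).
// Read the file named by the first CLI argument, defaulting to test.js
let name = env::args().nth(1).unwrap_or_else(|| "test.js".to_string());
let path = Path::new(&name);
let source_text = Arc::new(std::fs::read_to_string(path)?);
// Infer the source type (JS/TS, module/script) from the file extension
let source_type = SourceType::from_path(path).unwrap();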

// Memory arena where Semantic and Parser allocate objects
let allocator = Allocator::default();

// 1 Parse the source text into an AST
let parser_ret = Parser::new(&allocator, &source_text, source_type).parse();
let mut program = parser_ret.program;

println!("Parse result");
println!("{}", serde_json::to_string_pretty(&program).unwrap());

The content of test.js is as follows:

const b = require('./b.js')

The parsed AST looks like this:

{
  "type": "Program",
  "start": 0,
  "end": 28,
  "sourceType": {
    "language": "javascript",
    "moduleKind": "module",
    "variant": "jsx"
  },
  "hashbang": null,
  "directives": [],
  "body": [
    {
      "type": "VariableDeclaration",
      "start": 0,
      "end": 27,
      "kind": "const",
      "declarations": [
        {
          "type": "VariableDeclarator",
          "start": 6,
          "end": 27,
          "id": {
            "type": "Identifier",
            "start": 6,
            "end": 7,
            "name": "b",
            "typeAnnotation": null,
            "optional": false
          },
          "init": {
            "type": "CallExpression",
            "start": 10,
            "end": 27,
            "callee": {
              "type": "Identifier",
              "start": 10,
              "end": 17,
              "name": "require"
            },
            "typeParameters": null,
            "arguments": [
              {
                "type": "StringLiteral",
                "start": 18,
                "end": 26,
                "value": "./b.js"
              }
            ],
            "optional": false
          },
          "definite": false
        }
      ],
      "declare": false
    }
  ]
}
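
The `start` and `end` fields in this output are offsets into the source text, so they can be checked by slicing the original string. The sketch below does that for three of the nodes. `span_text` is a hypothetical helper written for this illustration, not part of oxc, and the source string assumes test.js ends with a single newline (which is what the Program `end` of 28 suggests).

```rust
/// Returns the slice of `source` covered by a node's `start..end` offsets.
/// Hypothetical helper for illustration only; not an oxc API.
fn span_text(source: &str, start: usize, end: usize) -> &str {
    &source[start..end]
}

fn main() {
    // Assumed contents of test.js, including a trailing newline.
    let source = "const b = require('./b.js')\n";

    // Offsets copied from the AST printed above.
    println!("{}", span_text(source, 10, 17)); // callee Identifier
    println!("{}", span_text(source, 18, 26)); // StringLiteral, quotes included
    println!("{}", span_text(source, 10, 27)); // the whole CallExpression
}
```

Note that the StringLiteral span covers the quotes, while its `value` field holds only `./b.js`.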

As webpack users know, during bundling, we need to replace require with __webpack_require__ and replace the relative path ./b.js with the full path. To achieve this, we need to modify the original code using oxc_traverse, which allows us to traverse the nodes in the AST and perform operations on the nodes we are interested in.
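
To make "replace the relative path with the full path" concrete, here is a minimal sketch of the resolution step using only the standard library. It is not the code from rs-webpack: `resolve_request` and the `/project/src` paths are hypothetical, and the resolution is purely lexical (no file system access, no extension or node_modules lookup like webpack's real resolver).

```rust
use std::path::{Component, Path, PathBuf};

/// Resolves a relative `require` request against the file that contains it.
/// Lexical only: `.` and `..` are collapsed without touching the file system.
fn resolve_request(importer: &Path, request: &str) -> PathBuf {
    // Start from the directory of the importing file.
    let base = importer.parent().unwrap_or_else(|| Path::new(""));
    let mut resolved = PathBuf::new();
    for component in base.join(request).components() {
        match component {
            // `.` adds nothing to the path.
            Component::CurDir => {}
            // `..` removes the last pushed segment.
            Component::ParentDir => {
                resolved.pop();
            }
            // Root, prefix, and normal segments are kept as they are.
            other => resolved.push(other.as_os_str()),
        }
    }
    resolved
}

fn main() {
    // Hypothetical location of the file that contains the require call.
    let importer = Path::new("/project/src/test.js");
    println!("{}", resolve_request(importer, "./b.js").display());
    println!("{}", resolve_request(importer, "../lib/c.js").display());
}
```

With Unix-style paths, the two requests resolve to `/project/src/b.js` and `/project/lib/c.js`. In the transform shown later in this article, a value like this would take the place of the `full_path_of_b` placeholder.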

From the AST result above, we can see that the node of interest is CallExpression. Therefore, we can implement a Transform to modify this node as follows:

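use std::ops::DerefMut;

use oxc_ast::ast::{Argument, CallExpression, Expression};
use oxc_span::Atom;
use oxc_traverse::{Traverse, TraverseCtx};

struct MyTransform;

impl<'a> Traverse<'a> for MyTransform {
    fn enter_call_expression(&mut self, node: &mut CallExpression<'a>, _ctx: &mut TraverseCtx<'a>) {
        // Only touch calls of the form require("...")
        if node.is_require_call() {
            // Rename the callee: require -> __webpack_require__
            if let Expression::Identifier(identifier_reference) = &mut node.callee {
                identifier_reference.name = Atom::from("__webpack_require__");
            }

            // Replace the module path; "full_path_of_b" is a placeholder for the resolved path
            let argument: &mut Argument<'a> = &mut node.arguments.deref_mut()[0];
            if let Argument::StringLiteral(string_literal) = argument {
                string_literal.value = Atom::from("full_path_of_b");
            }
        }
    }
}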
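
For readers who want to experiment with the idea without pulling in oxc, the same rewrite can be sketched on a toy AST. Everything here is hypothetical and much simpler than oxc's real types: the `Expr` enum, the simplified `is_require_call` check, and `rewrite_requires` only mimic the shape of the transform above.

```rust
/// A deliberately tiny expression tree; a stand-in for oxc's AST, not its real types.
#[derive(Debug, PartialEq)]
enum Expr {
    Identifier(String),
    StringLiteral(String),
    Call { callee: Box<Expr>, arguments: Vec<Expr> },
}

/// Simplified check: the callee is `require` and the only argument is a string literal.
fn is_require_call(callee: &Expr, arguments: &[Expr]) -> bool {
    matches!(callee, Expr::Identifier(name) if name == "require")
        && matches!(arguments, [Expr::StringLiteral(_)])
}

/// Walks the tree depth-first and rewrites every `require("...")` call in place.
/// `resolve` maps the original request to the path that should replace it.
fn rewrite_requires(expr: &mut Expr, resolve: &dyn Fn(&str) -> String) {
    if let Expr::Call { callee, arguments } = expr {
        if is_require_call(callee, arguments) {
            **callee = Expr::Identifier("__webpack_require__".to_string());
            if let Expr::StringLiteral(path) = &mut arguments[0] {
                let resolved = resolve(path);
                *path = resolved;
            }
        }
        // Keep descending so nested calls are visited too.
        rewrite_requires(callee, resolve);
        for argument in arguments.iter_mut() {
            rewrite_requires(argument, resolve);
        }
    }
}

fn main() {
    // Toy AST for `require('./b.js')`.
    let mut call = Expr::Call {
        callee: Box::new(Expr::Identifier("require".to_string())),
        arguments: vec![Expr::StringLiteral("./b.js".to_string())],
    };
    // Placeholder resolver, like the hard-coded "full_path_of_b" above.
    rewrite_requires(&mut call, &|_request: &str| "full_path_of_b".to_string());
    println!("{:?}", call);
}
```

The real `Traverse` trait does the walking for you and calls `enter_call_expression` on each call node, so only the body of the `if` needs to be written by hand.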

You can use this Transform as follows:

// 2 Semantic analysis
let semantic = SemanticBuilder::new(&source_text)
    .build_module_record(path, &program)
    // Enable additional syntax checks not performed by the parser
    .with_check_syntax_error(true)
    .build(&program);
let (symbols, scopes) = semantic.semantic.into_symbol_table_and_scope_tree();

// 3 Transform
let t = &mut MyTransform;
traverse_mut(t, &allocator, &mut program, symbols, scopes);

Note that, unlike Babel, we first need to run semantic analysis with oxc_semantic to obtain the symbols and scopes, which are then passed to traverse_mut.

Finally, we use oxc_codegen to regenerate the code:

// 4 Generate Code
let new_code = CodeGenerator::new()
    .with_options(CodegenOptions::default())
    .build(&program)
    .code;

println!("{}", new_code);

The resulting code will be:

const b = __webpack_require__('full_path_of_b')

Please kindly give me a star!
