Deep dive into React codebase [EP3: Reverse engineer the most famous React snippet]

Nick - Jan 23 '22 - - Dev Community

TL;DR: This post turned out to be quite long and boring. You may skim through it and go to the next one. There will be a recall with all essential info. All next posts are much more bite-sized and lightweight.

In the previous episode we finished with the React repository setup.
In today's episode, we'll scratch the surface of the actual source code and reverse engineer probably the most well-known React snippet.

Recall

What we learned in the previous episodes

The React monorepo contains a lot of React-related packages, including React core, renderers, reconciler, utility packages, devtools, and testing utilities.
Some of them (like react, react-dom and react-reconciler) are more relevant for developing a good understanding of React source code as a library for building UI in a browser environment.
Others are related to more advanced stuff, like testing, tooling, or React Native, and are only relevant if we explore React together with its toolset.

Knowing all this, we are ready to dive straight into the code.

Finding the right approach

It's hard to come up with the right approach for exploring the React codebase, mainly because it's huge and complex in its current state.
I've already tried to do it a couple of times head-first without an approximate understanding or a plan of how to do it.
This time, we'll try it another way.

Plan for today

We'll try to discover the codebase in the most logical way I could come up with. We won't do the "start with the package.json, find an entry index.js file and move from there" because it's extremely hard not to get lost this way.
Instead, we'll start with the simplest React code, which most of us have seen dozens of times, and reverse engineer it with the help of the real React source code.



import React from 'react';
import ReactDOM from 'react-dom';

import App from './App.js';

ReactDOM.render(<App />, document.getElementById('root'));



This approach keeps things simple, follows a gentle learning curve and allows you to start with the most practical and intriguing stuff. It's similar to how we create production-ready code, starting with the outline of a solution and going into details on demand. Simply put, we forge our own path from the basics to the final destination, not the other way around.

Sidenote: It's an experimental approach, so I don't know whether it actually works well at scale.
So if you like it and it works for you, leave a comment to let me know that I should continue using it.
Or if it's the other way around for you, leave a comment on what was wrong and I'll try to design a better approach based on your feedback.
Thanks in advance 🙏🏻

Materials for the episode

I set up a repository on GitHub for this series. We'll explore, experiment, and play around there.
It's a monorepo (yeah, like the React repository), so it will contain a directory for each episode from now on.
Clone the repo to your local machine.



$ git clone https://github.com/fromaline/deep-dive-into-react-codebase.git



Or open it in your favorite online code editor, like Gitpod or CodeSandbox.

Our setup

In the repo you'll find a directory for the current episode, called ep3, with the simplest possible React setup. It's just an HTML page where react and react-dom are added through unpkg.



<!-- index.html -->
<body>
    <div id="root"></div>

    <script src="https://unpkg.com/react@17.0.0/umd/react.development.js"></script>
    <script src="https://unpkg.com/react-dom@17.0.0/umd/react-dom.development.js"></script>
    <script  src="./index.js"></script>
</body>



And a JS file with the well-known setup that you can find, in one form or another, in virtually any React web application.



// index.js
const App = <div>Hello world!</div>;

// App is already an element, so we pass it directly instead of <App />.
ReactDOM.render(App, document.getElementById('root'));



Such a simple setup declutters our investigation. It removes the complexity that modern frontend tooling, like webpack and Babel, introduces for the convenience of end users. But we don't want to be just end users; we aspire to develop an in-depth understanding, so we don't need these tools.

Get up and running

Now we need to spin up the index.html in the browser.
I use http-server, but you may use your favorite one, like live-server from VSCode or Python http.server.



$ http-server episodes/ep3



The first thing we see is an error like this.



Uncaught SyntaxError: Unexpected token '<' index.js:1



[Screenshot: the JSX syntax error in the browser console]

This error occurred because we use JSX without an appropriate tool, like Babel, to compile it. So we need to "compile" JSX ourselves.

What Babel does internally is pretty straightforward. It replaces JSX with calls to React.createElement, or to another function if one was explicitly specified with a special annotation comment.



// @jsx React.createElement

const App = <div>Hello world!</div>;



After the transpilation phase, the code looks like plain old JavaScript. You may double-check it in the Babel REPL.



const App =  React.createElement('div', null, 'Hello world!');



[Screenshot: the browser showing the Hello world example]
Now we see our Hello world example and may finally go on!
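
To make the JSX-to-function-call mapping a bit more concrete, here is a rough sketch with slightly larger markup. The h function is a hypothetical stand-in for React.createElement (it only records its arguments), so the snippet runs without React or Babel. The element names and the className prop are made up for illustration.

```javascript
// Hypothetical stand-in for React.createElement: it only records its arguments.
function h(type, props, ...children) {
  return { type, props, children };
}

// JSX source (needs a compiler such as Babel):
//   <ul className="list">
//     <li>One</li>
//     <li>Two</li>
//   </ul>
//
// Roughly what the classic JSX transform emits,
// with h in place of React.createElement:
const list = h(
  'ul',
  { className: 'list' },
  h('li', null, 'One'),
  h('li', null, 'Two')
);

console.log(JSON.stringify(list, null, 2));
```

Every nested tag becomes a nested call, and the children are simply passed as extra arguments after the props.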

Reverse engineering

The goal

Our goal for today's and the next episode is to grasp how react-dom mounts the tree of React components to the real DOM. It's important to understand this process because it's the first thing you set up in a React app.

The hypothesis

Let's form a hypothesis to start with.
Based on my understanding of how the real DOM works, I assume that react-dom traverses a tree of React components (virtual DOM) formed by the react package.



const App = {
  type: 'div',
  props: {},
  children: ['Hello world!'],
};



Then react-dom creates a real DOM structure, based on the virtual DOM.



const el = document.createElement(App.type);
// ...
if (App.children.length > 0) {
  const child = App.children[0];
  // ...
  if (typeof child === 'string') {
    // A text child becomes the text content of the created element.
    el.textContent = child;
  }
}



Then react-dom mounts the result in the provided container.



container.appendChild(el);


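
Put together, the three steps of the hypothesis could look like the sketch below. It's only the hypothesis written out as code, not how react-dom actually works. The naiveMount function and the fakeDocument object are made up for illustration; the fake document exists only so the sketch also runs outside a browser.

```javascript
// Hypothesis only: a naive mount function, not react-dom's real algorithm.
// `doc` is any document-like object; in a browser you could pass `document`.
function naiveMount(vnode, container, doc) {
  const el = doc.createElement(vnode.type);

  for (const child of vnode.children) {
    if (typeof child === 'string') {
      // Text children become text nodes.
      el.appendChild(doc.createTextNode(child));
    } else {
      // Nested virtual nodes are mounted recursively.
      naiveMount(child, el, doc);
    }
  }

  container.appendChild(el);
  return el;
}

// Minimal fake document so the sketch runs in Node as well.
const fakeDocument = {
  createElement(tag) {
    return {
      tag,
      childNodes: [],
      appendChild(node) {
        this.childNodes.push(node);
        return node;
      },
    };
  },
  createTextNode(text) {
    return { textContent: text };
  },
};

const root = fakeDocument.createElement('div');
const App = { type: 'div', props: {}, children: ['Hello world!'] };

naiveMount(App, root, fakeDocument);
console.log(root.childNodes[0].childNodes[0].textContent); // prints "Hello world!"
```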

Test the hypothesis

Now we'll test the hypothesis and find out whether we were right or not.

What does React.createElement do and how does it work?

First of all, let's check out how React.createElement actually works and what it returns. We already know that it belongs to the react package, so let's check the packages/react directory.



// packages/react/index.js

// ...
export {
  // ...
  createElement,
  // ...
} from './src/React';



Here it is, so let's find the place it's exported from.



// packages/react/src/React.js

const createElement = __DEV__ ? createElementWithValidation : createElementProd;



As you can see, createElement's value depends on the __DEV__ global variable, which in turn indicates whether the code was compiled in so-called development mode or not.

Based on the names of these two functions and the meaning of the __DEV__ variable, I assume that createElementWithValidation does additional validation to provide meaningful error messages and warnings in development mode, while createElementProd is probably more performant and generally tailored towards production use.
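
Here is a minimal sketch of this "pick the implementation by build flag" pattern, under the assumption stated above. The function bodies are made up for illustration, and __DEV__ is a plain constant here, while in a real build it would be injected by the tooling.

```javascript
// Assumption: normally the build injects __DEV__; here it's a plain constant.
const __DEV__ = true;

// Hypothetical production implementation: creates the element, no checks.
function createElementProd(type, props, ...children) {
  return { type, props: { ...props, children } };
}

// Hypothetical development wrapper: runs extra checks, then delegates.
function createElementWithValidation(type, props, ...children) {
  if (type == null) {
    // Placeholder for development-only warnings.
    console.error('Warning: type is invalid, got: ' + type);
  }
  return createElementProd(type, props, ...children);
}

// The public name points at one of the two, depending on the flag.
const createElement = __DEV__ ? createElementWithValidation : createElementProd;

const element = createElement('div', null, 'Hello world!');
console.log(element.type); // prints "div"
```

The nice part of this design is that callers always use the same name, and the production bundle doesn't need to run the extra checks at all.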

createElementWithValidation

First, let's check the former assumption by introducing an error in our React app. We provide a null value instead of an actual valid type.



// index.js

const App = React.createElement(null, null, 'Hello world!');



Great, now we see a typical React warning and can easily trace where it originated.

Warning: React.createElement: type is invalid -- expected a string (for built-in components) or a class/function (for composite components) but got: null. from react.development.js:245

The place where it was initially called is our createElementWithValidation function, so click on react.development.js:2240 in the stack trace to see the actual code.

[Screenshot: the stack trace of the warning]

It becomes clear from this code snippet that our first assumption is near the truth. createElementWithValidation checks whether the provided type is valid and, if not, logs different warnings based on what exactly is wrong with the provided type.

Sidenote: You may ask why there is such a weird statement in the code.



{
  error('React.createElement: type is invalid...')
}



Simply put, it's a block statement without an if condition.
The if (__DEV__) check was stripped out at build time (React's own build uses Rollup rather than webpack), because in a development build the condition is always true, so all warnings and errors must show up.
This topic is a bit out of scope for this article; for more info, check out my Twitter thread.

Now let's remove the error and observe what else happens inside this function.



function createElementWithValidation(type, props, children) {
  var validType = isValidElementType(type);

  // We warn in this case but don't throw. We expect the element creation to
  // succeed and there will likely be errors in render.
  if (!validType) {
    // warnings, but no returns!
  }

  // ... the rest of the function is covered below
}



The first interesting bit here is how error handling is implemented; there is even a comment about it right after the validType variable.
React developers don't throw an exception when the type is invalid. Instead, they proceed but expect some errors in the render.
We know that render in React is handled by renderers, in our case react-dom.
So from this we can assume that there are some validations regarding React components, and appropriate warnings, inside react-dom itself.
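
The "warn, but don't throw" idea can be sketched like this. Note that isValidElementType below is a heavily simplified stand-in (the real one accepts more kinds of types), and createWithWarning is a made-up name for illustration.

```javascript
// Simplified stand-in: the real isValidElementType accepts more kinds of types.
function isValidElementType(type) {
  return typeof type === 'string' || typeof type === 'function';
}

// Hypothetical wrapper that warns about a bad type but still creates the element.
function createWithWarning(type, props, ...children) {
  if (!isValidElementType(type)) {
    // Report the problem and keep going, so element creation still succeeds.
    console.error('Warning: type is invalid, got: ' + String(type));
  }
  return { type, props: { ...props, children } };
}

const broken = createWithWarning(null, null, 'Hello world!');
console.log(broken.type); // prints "null": the element exists despite the warning
```

Whoever consumes this element later (a renderer, in React's case) has to deal with the invalid type itself.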

Sidenote: It's an interesting assumption because it implies that the output of the react package is not always valid, and renderers need to validate what they get from it on their own.
We'll definitely test this assumption in one of the next articles.

Let's continue with the function. After the initial check, it calls the more general-purpose createElement function.



var element = createElement.apply(this, arguments);



This probably indicates that there is a single createElement function, which actually creates the element, and that createElementWithValidation and createElementProd are only wrappers that add some extra functionality.
We'll test this assumption after we are done with current observations.

Here we see the check against null with type coercion and the useful comment.



// The result can be nullish if a mock or a custom function is used.
// TODO: Drop this when these are no longer allowed as the type argument.
if (element == null) {
  return element;
}



This snippet shows that element can be null or even undefined if "a mock or a custom function" is used.
It's hard to say for sure now how a custom function can be used here, because createElement is hardcoded, but we definitely will figure it out later.
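
As a quick refresher on why element == null covers both cases: loose equality with null is true for null and undefined, but not for other falsy values. The isNullish helper below is just an illustrative name for the equivalent explicit check.

```javascript
// Loose equality with null matches null and undefined,
// but not other falsy values such as 0, '' or false.
const candidates = [null, undefined, 0, '', false, NaN];

for (const value of candidates) {
  console.log(String(value), '->', value == null);
}

// The equivalent explicit check.
function isNullish(value) {
  return value === null || value === undefined;
}
```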

Sidenote: Right now I can't fully understand what the TODO part means. My initial guess is that this check could be removed once null or undefined are no longer allowed as a value of the element.
If you have a better idea of what it means, write it in the comments section! I would be grateful.

Next thing is a validation of child keys.



// Skip key warning if the type isn't valid since our key validation logic
// doesn't expect a non-string/function type and can throw confusing errors.
// We don't want exception behavior to differ between dev and prod.
// (Rendering will throw with a helpful message and as soon as the type is
// fixed, the key warnings will appear.)
if (validType) {
  for (var i = 2; i < arguments.length; i++) {
    validateChildKeys(arguments[i], type);
  }
}



From the actual snippet, we can conclude that key validation only happens if the initially provided element's type was valid. The first two sentences of the comment make the reason behind this behavior more obvious: validateChildKeys doesn't expect a non-string/function type and as a result can throw confusing errors that would differ from the production version.

Sidenote: it's a bit mind-blowing for me, that key validation logic requires the type of the element to be valid because at first glance they seem mostly unrelated.

From the third sentence of the comment we again see, that proper error handling is expected from a renderer, instead of the react package.
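
To give a feel for what key validation is about, here is a toy version. It's a simplified stand-in, not the real validateChildKeys (which handles more cases), and findChildrenWithoutKeys is a made-up name. The warning text is only in the spirit of React's key warning.

```javascript
// Toy stand-in for key validation; the real validateChildKeys handles more cases.
function findChildrenWithoutKeys(child) {
  // Only arrays of elements are checked here; a single child is left alone.
  if (!Array.isArray(child)) {
    return [];
  }
  return child.filter(
    (item) => typeof item === 'object' && item !== null && item.key == null
  );
}

const items = [
  { type: 'li', key: 'a', props: { children: 'One' } },
  { type: 'li', key: null, props: { children: 'Two' } },
];

const missing = findChildrenWithoutKeys(items);
if (missing.length > 0) {
  console.error(
    'Warning: each child in a list should have a unique "key" prop. ' +
      'Missing on ' + missing.length + ' item(s).'
  );
}
```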

Finally, the function ends with some other validation and a return statement.



if (type === exports.Fragment) {
  validateFragmentProps(element);
} else {
  validatePropTypes(element);
}

return element;



Here we see a simple return and two separate validations before it:

  • Fragment's props validation
  • General element's props validation

So we can conclude that prop-types validation happens here, and that props validation is handled differently if the element is a fragment.
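
Why would fragments need separate validation? A fragment is only expected to carry key and children, so anything else is suspicious. Here is a rough sketch of that idea. The findInvalidFragmentProps helper is hypothetical and only modeled on what validateFragmentProps is presumably for.

```javascript
// Hypothetical helper: a fragment is only expected to carry `key` and `children`.
function findInvalidFragmentProps(props) {
  return Object.keys(props).filter(
    (name) => name !== 'children' && name !== 'key'
  );
}

// Nothing to warn about: the result is an empty array.
console.log(findInvalidFragmentProps({ children: 'Hi' }));

// `className` makes no sense on a fragment, so it ends up in the result.
console.log(findInvalidFragmentProps({ children: 'Hi', className: 'box' }));
```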

Now let's check what createElementProd does and how it differs from createElementWithValidation.

createElementProd

Let's get back to our packages/react/src/React.js and trace from where createElementProd is exported.



// packages/react/src/React.js

const createElement = __DEV__ ? createElementWithValidation : createElementProd;



We can use the standard feature of modern IDEs to find where createElementProd is implemented or just check the imports at the beginning of the file. I'll use the latter method.



// packages/react/src/React.js

import {
  createElement as createElementProd,
  // ...
} from './ReactElement';



In fact, createElementProd is just an import alias for the createElement function.
So our initial assumption regarding createElementWithValidation and createElementProd was nearly correct, but not quite.
In reality, the case is even simpler:

  • We just have a single createElement function that is used in the production environment.
  • The createElementWithValidation function adds additional validation to provide meaningful warnings, and it's used in the development environment.

createElement

With our new knowledge about this whole create-element situation, we just need to figure out what createElement returns to understand how elements are created in both the prod and dev environment.
To do this let's jump to createElement function from its call inside createElementWithValidation.

[Screenshot: the createElement function in the debugger]

And put a debugger breakpoint right after the return statement.

[Screenshot: the value returned inside createElement]

Finally, we see what we get from the React.createElement call. Now let's fix the inaccurate part of the hypothesis to reflect our new knowledge.

Tweak the hypothesis

I assume from my understanding of how real DOM works, that react-dom traverses a tree of React components (virtual DOM), formed by react package.



const App = {
 type: 'div',
 props: {},
 children: ['Hello world!'],
};


In reality the tree of React components looks more like this.



const App = {
  "$$typeof": Symbol.for('react.element'),
  "type": "div",
  "key": null,
  "ref": null,
  "props": {
    "children": "Hello world!"
  },
  "_owner": null,
  "_store": {},
  "_self":  null,
  "_source":  null
}



Where were we wrong in the original version?

  • children is not a separate property; instead, it's a property inside props
  • If there is only one child, it's passed without a wrapping array. At least if the only child is text.
  • React components have a couple of other properties (we have yet to figure out what they are about), more specifically:
    • $$typeof
    • key
    • ref
    • _owner
    • _store
    • _self
    • _source

But overall the first part of our hypothesis was pretty accurate! We just broadened it and fixed minor issues.
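
To sum up the corrected shape, here is a simplified sketch of a function that produces such objects. It's based only on what we observed in the debugger, not the real implementation: createElementSketch is a made-up name, and the extra fields (_owner, _store, _self, _source) are left out because we haven't figured them out yet.

```javascript
// Simplified sketch based on the shape observed in the debugger.
// Not the real implementation: _owner, _store, _self and _source are omitted.
const REACT_ELEMENT_TYPE = Symbol.for('react.element');

function createElementSketch(type, config, ...children) {
  const props = {};
  let key = null;
  let ref = null;

  if (config != null) {
    // key and ref live on the element itself, not inside props.
    if (config.key !== undefined) key = '' + config.key;
    if (config.ref !== undefined) ref = config.ref;
    for (const name of Object.keys(config)) {
      if (name !== 'key' && name !== 'ref') props[name] = config[name];
    }
  }

  // A single child is stored as-is; several children are stored as an array.
  if (children.length === 1) {
    props.children = children[0];
  } else if (children.length > 1) {
    props.children = children;
  }

  return { $$typeof: REACT_ELEMENT_TYPE, type, key, ref, props };
}

const App = createElementSketch('div', null, 'Hello world!');
console.log(App.props.children); // prints "Hello world!"

const list = createElementSketch('ul', { key: 1 }, 'a', 'b');
console.log(list.key, Array.isArray(list.props.children)); // prints "1 true"
```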

Wrap up

It was a long journey and we learned a ton today!
In the next episode, we are going to continue with our hypothesis. More precisely we'll try to find out what exactly react-dom does with virtual DOM and how the render actually works.
So I'm looking forward to seeing you in the next episode!

What we learned today

IMO, the main thing we learned today has nothing to do with React's inner workings. It is rather the approach we can take to understand how some code works under the hood.
So, I hope you'll apply it yourself!
