Include HTML-in-HTML: an iteration

lionel-rowe - Aug 9 '21 - Dev Community

Inspired by @fjones's comment on this article:

This strikes me as a very interesting use case for webcomponents over the (rather crude) method in htmlinclude.js. It also seems like this would hit quite a lot of CSP problems. E.g. I suspect you would struggle to load any script tags or external resources from the included file.

Sounds like a challenge! Here are the design goals:

  • Simple, front-end-only HTML API to include HTML fragments inside other HTML documents, similar to htmlinclude.js
  • No HTML boilerplate required in the included HTML fragments. For example, <div></div> is fine; it doesn't need to be <!DOCTYPE html><html lang="en"><head><title>title</title></head><body><div></div></body></html>
  • Renders multiple-child fragments with no problem. For example, <div>1</div> <div>2</div> works just as well as <div><div>1</div> <div>2</div></div> does
  • Once rendered, the include-html component is no longer present in the DOM
  • Allows including cross-origin content, as long as CORS headers are set correctly on the resource
  • Runs script tags on same-origin content, unless the sanitize attribute is set
  • Doesn't run script tags or anything else dangerous from cross-origin content
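
The last two goals boil down to a single boolean rule. Here's a rough sketch of it; shouldSanitize is a hypothetical name used only for illustration, and it takes plain booleans so it can run anywhere:

```javascript
// Hypothetical helper (not part of the article's code): decides whether
// fetched HTML should be sanitized before it is inserted into the page.
/**
 * @param {boolean} sameOrigin - the fragment comes from the page's own origin
 * @param {boolean} forceSanitize - the sanitize attribute is present
 */
const shouldSanitize = (sameOrigin, forceSanitize) =>
    !sameOrigin || forceSanitize

// Same-origin, no attribute: scripts are allowed to run
console.log(shouldSanitize(true, false)) // false
// Same-origin, but the author opted in to sanitization
console.log(shouldSanitize(true, true)) // true
// Cross-origin content is always sanitized, attribute or not
console.log(shouldSanitize(false, false)) // true
console.log(shouldSanitize(false, true)) // true
```

In other words, the sanitize attribute can only tighten the behavior; it can never switch sanitization off for cross-origin content.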

Without further ado, here's the implementation.

isSameOrigin

We use this function to check that the included content is from the same origin. If not, it'll definitely need sanitization, as we don't want 3rd parties to be able to inject scripts.

/** @param {string} src */
const isSameOrigin = (src) =>
    new URL(src, window.location.origin).origin === window.location.origin

By passing a second argument, base, to the URL constructor, we resolve src relative to the current origin. We then compare the origin of the resolved URL with the current origin.

For example:

  • new URL('./bar.html', 'https://foo.co') resolves to https://foo.co/bar.html, of which the origin is still https://foo.co, so the result will be true
  • new URL('https://baz.co/quux.html', 'https://foo.co') resolves to https://baz.co/quux.html. The base parameter in this case is ignored, as the src is already fully qualified. The origin is https://baz.co, different from https://foo.co, so the result will be false
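
To try these cases outside a browser, here's a small sketch that takes the page origin as an explicit argument instead of reading window.location.origin. The name isSameOriginAs and the pageOrigin parameter are assumptions made for this example, not part of the component:

```javascript
// Variant of isSameOrigin for illustration only: the page origin is
// passed in explicitly, so the check also runs in Node.
/**
 * @param {string} src
 * @param {string} pageOrigin
 */
const isSameOriginAs = (src, pageOrigin) =>
    new URL(src, pageOrigin).origin === pageOrigin

const pageOrigin = 'https://foo.co'

// Relative URL: resolved against the base, so the origin is unchanged
console.log(new URL('./bar.html', pageOrigin).href) // "https://foo.co/bar.html"
console.log(isSameOriginAs('./bar.html', pageOrigin)) // true

// Absolute URL: the base is ignored, so the origin comes from src
console.log(new URL('https://baz.co/quux.html', pageOrigin).origin) // "https://baz.co"
console.log(isSameOriginAs('https://baz.co/quux.html', pageOrigin)) // false

// Same host but a different scheme counts as a different origin
console.log(isSameOriginAs('http://foo.co/bar.html', pageOrigin)) // false
```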

safeHtml

This is the function we use to sanitize the HTML, if required.

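/** @param {{ sanitize?: boolean }} [options] */
const safeHtml = ({ sanitize } = {}) =>
    /** @param {string} html */
    (html) => {
        // Sanitize by default; skip only when sanitize is explicitly false
        const sanitized = sanitize !== false ? DOMPurify.sanitize(html) : html

        // Wrap in a String object that also carries an __html property
        return Object.assign(sanitized, {
            __html: sanitized,
        })
    }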

We use DOMPurify, a widely used and battle-tested solution for HTML sanitization.

Using Object.assign on a string gives a String object with the additional properties added. By adding an __html property, we could directly use the result with React's dangerouslySetInnerHTML if we wanted, but we can still assign it directly to an element's innerHTML, as it's still a string... sort of.

const result = safeHtml()('<hr/>')

result // String {"<hr>", __html: "<hr>"}
result.valueOf() // "<hr>"
'' + result // "<hr>"
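
The "sort of" matters in a few places. As a quick sketch of where a String object behaves differently from a primitive string (the variable name wrapped is just for this example, and no DOMPurify is involved):

```javascript
// Object.assign converts the primitive string target to a String object
const wrapped = Object.assign('<hr>', { __html: '<hr>' })

console.log(typeof wrapped) // "object", not "string"
console.log(wrapped instanceof String) // true
console.log(wrapped.__html) // "<hr>"

// Loose equality unwraps the object, strict equality does not
console.log(wrapped == '<hr>') // true
console.log(wrapped === '<hr>') // false

// Anything that coerces to a primitive string gets the plain value back
console.log(`${wrapped}`) // "<hr>"
console.log(String(wrapped) === '<hr>') // true
```

Assigning to innerHTML stringifies the value, which is why the wrapped result still works there, but code that checks typeof or uses === against a string literal will treat it as an object.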

IncludeHtml Web Component

Here's the web component itself:

class IncludeHtml extends HTMLElement {
    async connectedCallback() {
        const forceSanitize = Boolean(this.attributes.sanitize)
        const src = this.attributes.src.value

        if (!this.innerHTML.trim()) {
            this.textContent = 'Loading...'
        }

        const res = await fetch(src)

        const html = safeHtml({
            sanitize: !isSameOrigin(src) || forceSanitize,
        })(await res.text())

        const range = document.createRange()

        // make rendering of fragment context-aware
        range.selectNodeContents(this.parentElement)

        this.replaceWith(range.createContextualFragment(html))
    }
}

customElements.define('include-html', IncludeHtml)

Using range.createContextualFragment means we can create an HTML fragment that will also execute any script tags present upon rendering (assuming we haven't sanitized them away). Calling range.selectNodeContents on the parent element first means the HTML is parsed in a way that's aware of the surrounding context. For example, a tr parsed outside of a table is dropped by the HTML parser, but it works as expected within a table.
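
To isolate that parsing step, here's a minimal sketch. The helper name parseInContext and the element id "rows" are made up for this example; the document is passed in as a parameter so that the function itself doesn't depend on a global, and the usage part only runs in a browser:

```javascript
/**
 * Parses html as if it were being inserted as a child of parent.
 * Hypothetical helper, extracted from the component for illustration.
 * @param {Document} doc
 * @param {Element} parent
 * @param {string} html
 * @returns {DocumentFragment}
 */
const parseInContext = (doc, parent, html) => {
    const range = doc.createRange()
    // The range's start node becomes the context element for parsing
    range.selectNodeContents(parent)
    return range.createContextualFragment(html)
}

// Browser-only usage; assumes the page contains
// <table><tbody id="rows"></tbody></table>
if (typeof document !== 'undefined') {
    const tbody = document.getElementById('rows')
    if (tbody) {
        // Parsed with tbody as context, so the tr is kept
        tbody.append(parseInContext(document, tbody, '<tr><td>1</td></tr>'))
    }
}
```

The component does the same thing with this.parentElement as the context and then swaps itself out for the resulting fragment.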

By using this.replaceWith, we immediately remove the Web Component from the DOM as the content is rendered, which is similar to what you'd expect from a back-end templating framework.

Usage

Finally, here are some examples of the component in use:

<nav>
    <include-html src="./includes/nav.html"></include-html>
</nav>

<main>
    <!--
        Including from a 3rd-party source works
        (if CORS headers are set properly on the source)
    -->
    <include-html
        src="https://dinoipsum.herokuapp.com/api/?format=html&paragraphs=2&words=15"
    ></include-html>
</main>

<footer>
    <include-html sanitize src="./includes/footer.html"></include-html>
</footer>

You can see the rendered output and try it out yourself in this live CodeSandbox demo:

Thanks for reading! What improvements would you make to the API or features?