HTML is the core technology of the web. Yet, it has changed very little in recent years. Is HTML complete? Is it neglected? What is the state of HTML?
What is the state of HTML?
My takes...
How is HTML used?
- People don't write HTML in a HTML file very often these days. They write HTML embedded in something else: a template language such as Nunjucks, or a component in a UI framework (mostly React).
- For producing content, a lot of people write in Markdown.
- For interactive content, some people prefer the authoring experience of Markdown plus their favourite UI framework. MDX is probably the most popular variant of that.
- In some respects, HTML has become more of an output format like PDF.
The medium
- As a document-oriented medium, HTML has covered all of the bases for a long time. Dealing with text and multimedia content has been in the language since HTML 5 for the most part. That was 2008.
- As an application-oriented medium, serving dynamic content as HTML in the client-server model is well served by server-side languages. People refer to this as Multi-Page Apps (MPAs) now. Quoting Jason Miller's application holotypes, content websites and storefronts are well catered for here. The difficulty arises when you want a richer interactive experience - a desktop-like app in the browser.
- There are incongruities between what HTML offers and what is desirable for demanding applications. There is a desire to make granular changes to a webpage without requesting an entire HTML file from the server. A webpage is a HTML document, and that is the unit of currency in the client-server world. HTTP is a stateless protocol, meaning that the server does not keep any data (state) between two requests. Therefore every page request can be like a reset. These are the major hurdles for a desktop-like experience over a network using HTTP and HTML. This has led to the development of Single-Page Apps (SPAs) and various hybrid approaches, usually where HTML is generated client-side from data.
Building user interfaces
- People hands down reach for UI frameworks. Native web components were in the doldrums for some time. People may have their own component libraries or use UI kits.
- There are deficiencies for building user interfaces. Some UI controls (the `input` element and others like `select`) are hard to customize in both style and behaviour. Not many UI controls have pseudo-elements that can be referenced in CSS. People often make their own custom UI controls.
- There is interest in improving the state of the UI control elements, but it hasn't yielded any results yet as far as I know. The Open UI group are advocating for adopting an industry standard definition of UI on the web. They did get an experimental input control called `selectmenu` added to Chrome recently; it is proposed as a replacement for the `select` element. They have proposals for other UI elements too.
Evolution of HTML
- There have been some, but not many, changes to HTML in the last 2 to 3 years. The changes I am aware of are: the `dialog` element became evergreen, the `search` element was added, the `inert` attribute was added, lazy loading of images and iframes through the `loading` attribute has gotten wider support, and declarative shadow DOM has been implemented in some browsers.
- There does not seem to be an interest in augmenting HTML to send and retrieve data at a more granular level than a file. The HTMX library is an example of what this could look like. It gives you access to AJAX, CSS transitions, web sockets, and server-sent events directly in HTML using attributes. Other projects like Hotwire (Ruby) offer a similar approach where HTML fragments are requested to partially update a web page.
- People are still a bit confused about the versioning of HTML since it moved from a number such as HTML 4.01 to a living standard i.e. constant evolution.
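As a rough illustration of the additions listed above, the sketch below puts the `search` element, the `inert` attribute, `loading="lazy"`, and a `dialog` on one page. The ids, the image path, and the form action are made up for the example; browser support still varies, so treat it as a sketch rather than a reference.

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Newer HTML features (sketch)</title>
</head>
<body>
  <!-- The search element marks up a search landmark -->
  <search>
    <form action="/search">
      <label for="q">Search</label>
      <input id="q" name="q" type="search">
      <button>Go</button>
    </form>
  </search>

  <!-- inert makes a subtree non-interactive and hides it from assistive technology -->
  <aside inert>
    <a href="/archive">Archive (temporarily disabled)</a>
  </aside>

  <!-- loading="lazy" defers the request until the image is near the viewport -->
  <img src="/images/chart.png" alt="Chart of survey results"
       width="800" height="450" loading="lazy">

  <button id="open-help">Help</button>
  <dialog id="help">
    <p>This is a modal dialog.</p>
    <!-- method="dialog" closes the dialog without a network request -->
    <form method="dialog"><button>Close</button></form>
  </dialog>

  <script>
    // showModal() opens the dialog as a modal with a backdrop
    document.getElementById("open-help").addEventListener("click", () => {
      document.getElementById("help").showModal();
    });
  </script>
</body>
</html>
```

Only opening the dialog needs a line of JavaScript here; closing it is handled declaratively by the form inside it.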
Handling data that is bound to HTML
- Nowadays people use the `form` element far less for sending data to a server. Having an independent backend and frontend (split architecture) became more prominent with the advent of AJAX (Asynchronous JavaScript and XML). Now the `fetch` API or a similar library is used to retrieve data from a REST or GraphQL endpoint. The frontend binds this data to HTML.
- Static-site generators have become more sophisticated at fetching data from different sources to generate HTML at build time.
- Application frameworks built on UI frameworks, like Next (React) and Nuxt (Vue), provide data fetching as part of their feature set; usually the code sits alongside your components. Most of the application frameworks tend to spit out a SPA, but some offer other "modes" such as static site generation. Rich Harris calls this hybrid modality Transitional Apps (TAs I guess), where you can opt into a mode for a component that will determine where and when your HTML is generated: client, server, or prerendered. He is trying to implement this concept in SvelteKit.
- The trend now is shifting towards a broad notion of server components. The concept is to have the same web component authoring experience but do it closer to the data on the server. The server returns mostly HTML rather than data and JavaScript. Frameworks like Astro are calling this content-first websites, where pre-rendering HTML and doing things server-side is the default. Astro allows you to bring your own UI framework and handles transformation. React is the first UI framework to formalize a standard (React Server Components) for this concept that application frameworks can implement. It is early days for it.
- Server-side languages are still doing their thing. The default is to serve HTML. This is considered boring tech now. A tonne of the web still lives on the LAMP stack. You can build a server-side web application in almost any language now.
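To make the contrast between a plain `form` submission and the `fetch` API concrete, here is a minimal sketch of a form that works on its own and is then enhanced with `fetch` when JavaScript is available. The `/subscribe` endpoint and the ids are hypothetical, and the server is assumed to accept a normal form POST.

```html
<form id="signup" action="/subscribe" method="post">
  <label for="email">Email</label>
  <input id="email" name="email" type="email" autocomplete="email" required>
  <button>Subscribe</button>
</form>
<p id="status" role="status"></p>

<script>
  const form = document.getElementById("signup");
  const statusLine = document.getElementById("status");

  form.addEventListener("submit", async (event) => {
    // Without this handler the browser would submit the form and load a new page
    event.preventDefault();
    try {
      const response = await fetch(form.action, {
        method: form.method,
        body: new FormData(form), // same fields a normal submission would send
      });
      statusLine.textContent = response.ok ? "Subscribed." : "Something went wrong.";
    } catch (error) {
      statusLine.textContent = "Network error, please try again.";
    }
  });
</script>
```

If the script fails to load, the form still submits the old-fashioned way, which is one argument for starting from the `form` element rather than replacing it.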
Education
I don't have much insight into how well HTML is taught in general now. From what I gather, there is a tendency to teach just enough HTML and CSS to get to building things with JavaScript. It is left up to the student to cover things in more depth. It does appear some courses put more emphasis on accessibility now, which is a step forward.
Perception of HTML
- Maybe, the last time there was excitement about HTML was when HTML 5 was shipped. Remember the website html5rocks.com?
- There is a perception by some that there is not much to HTML. You can learn it quickly.
- The industry surveys run by Devographics on frontend technology started with the State of JavaScript survey in 2016. Subsequently, annual surveys were added for CSS in 2019 and HTML in 2023. That order demonstrates the relative priority given to the frontend technologies. These surveys are used as an input to the Interop initiative, which is used to prioritise improvement of browser features, so they carry some weight.
- Some people don't see HTML as a programming language, some do. It's not a productive debate in my opinion.
Defining moments
When you look back to try to understand where we are, it is hard to extrapolate the reasons why this or that happened. I would say that the inability to build a webpage composed from multiple HTML files was probably the biggest spur for change. It is not practical to duplicate common fragments such as the website navigation across many HTML files. In the early days, people had some sort of HTML concatenation happening on a server or in a build step. There were technologies like Common Gateway Interface that enabled running scripts. Microsoft developed their server-side language Active Server Pages to be able to write dynamic webpages (gasp). None of it was particularly pretty initially. It eventually led to the rise of PHP and WordPress on the web, which provided a better experience of creating HTML from data server-side.
The web platform's proposal to resolve some of this in HTML was called HTML includes or something similar. It never happened. I guess the route was to permit building custom elements (native web components) in HTML instead. Native web components had a very rocky inception and did not offer a viable alternative to UI frameworks. You can read the article A Criticism of Web Components to get a background on the shortcomings. The platform was too stagnant in this area for too long and the solutions came from elsewhere.
I believe that native web components are more mature now and are a viable option. I am not speaking from experience. Dave Rupert has advocated for them; he writes about them in his article and talk titled HTML with Superpowers. Superpowers are welcome! In any case, it will take considerable time before there is a significant shift to native web components. As much as you may want to just use the platform, there are challenges to overcome.
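For a sense of what a native web component looks like, here is a minimal sketch of a custom element that renders a shared navigation block, the kind of fragment that HTML includes would have covered. The `site-nav` name and the links are invented for the example, and it is deliberately the simplest form (no shadow DOM, no attributes).

```html
<site-nav></site-nav>

<script>
  // A custom element name must contain a hyphen
  class SiteNav extends HTMLElement {
    // Called by the browser when the element is attached to the document
    connectedCallback() {
      this.innerHTML = `
        <nav aria-label="Main">
          <a href="/">Home</a>
          <a href="/blog/">Blog</a>
          <a href="/about/">About</a>
        </nav>`;
    }
  }
  customElements.define("site-nav", SiteNav);
</script>
```

Every page that loads the script can then use `<site-nav></site-nav>`, though the navigation only appears once JavaScript has run, which is one of the trade-offs compared with a server-side include.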
The desire to have every type of application on the web with a desktop-like experience led to divergent architectures such as SPAs. The crux was atomic changes to a webpage and the preservation of application state. Perhaps a different path could have been taken if HTML and related technologies had been augmented to accommodate a wider range of applications with less reliance on JavaScript. At the moment, we are slimming down some of the fat clients that have become prevalent to reach a better compromise.
Reviewing survey questions from 2023 State of HTML Survey
The inaugural State of HTML survey was done recently. The arc of the survey seems to be that it is trying to ascertain what parts of HTML you know and use, and what parts are difficult to use. It is quite comprehensive: there are over 100 questions. The questions reiterated to me that there are always some tidbits about HTML that you do not know! There are some things that I don't use, and some things that I may never use!
Some things I never heard of:
- `<input type="file" accept="video/*" capture>` - Captures input from the user’s camera. That's an interesting one!
- `controlslist` attribute - Prevent certain controls from appearing in the toolbar of a media element e.g. `video`. Nice to have the option but did not reach for it before.
- `input.showPicker()` - Programmatically opening the picker of form controls that have one (color pickers, date inputs etc).
- `contenteditable="plaintext-only"` - Permits editing of the element's raw text, but not rich text formatting. I was aware of the attribute but not the value. Cool distinction but I'd always question if you should use this in the first place!
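The four items above can be tried together in a small page like the one below. This is a sketch: the media path and ids are placeholders, `controlslist` is not supported in every browser, and `showPicker()` is feature-detected because it may be missing too.

```html
<!-- capture hints that the device camera should be used directly (mostly on mobile) -->
<label>
  Record a clip
  <input type="file" accept="video/*" capture="environment">
</label>

<!-- controlslist asks the browser to hide specific built-in media controls -->
<video src="/media/demo.mp4" controls controlslist="nodownload noremoteplayback"></video>

<label for="when">Date</label>
<input id="when" type="date">
<button type="button" id="pick">Open picker</button>

<!-- plaintext-only allows text editing but drops rich formatting -->
<div contenteditable="plaintext-only">Edit me; pasted formatting is discarded.</div>

<script>
  document.getElementById("pick").addEventListener("click", () => {
    const input = document.getElementById("when");
    // showPicker() must be called in response to a user gesture such as this click
    if ("showPicker" in input) {
      input.showPicker();
    }
  });
</script>
```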
Some things I heard of but have not used:
- `<datalist>` provides a way of offering a list of presets for a user to select in a form control, while still allowing custom options. Kind of helps with autocompletion. I never remember this one because it is not built into `input`!!
- `autocomplete` attribute
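A short sketch of both: the `list` attribute on the `input` is what connects it to the `datalist`, and `autocomplete` tells the browser what kind of saved value to offer. The field names and options are just examples.

```html
<label for="browser">Favourite browser</label>
<!-- list points at the id of the datalist; free text is still allowed -->
<input id="browser" name="browser" list="browsers">
<datalist id="browsers">
  <option value="Firefox"></option>
  <option value="Chrome"></option>
  <option value="Safari"></option>
</datalist>

<label for="email">Email</label>
<!-- autocomplete="email" lets the browser suggest a stored email address -->
<input id="email" name="email" type="email" autocomplete="email">
```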
New things I heard of:
- `dialog`, `search` elements.
- `inert` attribute
- Popover API
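Of these, the Popover API is perhaps the least familiar, so here is a minimal sketch. It needs no JavaScript: the `popovertarget` attribute on a button refers to the id of an element carrying the `popover` attribute. The id and text are invented for the example.

```html
<button popovertarget="tips">Toggle tips</button>

<!-- Hidden by default; shown in the top layer when toggled -->
<div id="tips" popover>
  <p>Press Esc or click outside to dismiss this popover.</p>
  <!-- popovertargetaction limits this button to hiding only -->
  <button popovertarget="tips" popovertargetaction="hide">Close</button>
</div>
```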
Regular pain points:
- Styling `input` and `select` elements.
- Disclosure widgets - why can't I just use `details`? I need to look up the caveats.
- Adding `width` and `height` to images to prevent Cumulative Layout Shift. Need to configure a tool to take care of this.
- There is a bit of a paradigm clash between the responsive syntax for images, which is based on viewport size, and container queries, which are based on the element size. I don't know how to tackle this in a consistent way yet.
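For reference, the markup behind the last three pain points looks roughly like this. The file names and breakpoint are placeholders. Note that the `sizes` attribute can only describe the image width in terms of the viewport, which is where the mismatch with container queries comes from.

```html
<!-- A native disclosure widget: no JavaScript needed to open and close it -->
<details>
  <summary>Shipping information</summary>
  <p>Orders are sent within two working days.</p>
</details>

<!-- width and height give the browser the aspect ratio before the image loads,
     so it can reserve space and avoid layout shift -->
<img
  src="/images/hero-800.jpg"
  srcset="/images/hero-400.jpg 400w,
          /images/hero-800.jpg 800w,
          /images/hero-1600.jpg 1600w"
  sizes="(min-width: 60em) 50vw, 100vw"
  width="800" height="450"
  alt="A desk with a laptop showing HTML source">
```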
HTML-y things that I haven't really used but would like to use more at some point in time:
- Native web components
- Some Progressive Web App APIs - the ability to build a proper web app with core technologies and publish to an app store is a dream!
Conclusion
Web development is weird. Once all you had was HTML. It was everything on the web. Now, HTML has become more of a final output format like PDF rather than something you write a webpage in! It is like a garnish on a dish, the most visible bit on top that you consider last and regard least! Yet, still a webpage is nothing without HTML. It is a bit of a paradox.
I think there are improvements that can be made to HTML that will make the web better. HTML is not complete! The first port of call for me would be to improve the experience of making user interfaces with HTML. Make UI controls (`select` et al) easier to style and add behaviour to. The Open UI group are advocating for adopting an industry standard definition of UI on the web. Let's listen to them. There is some movement on this front with the experimental `selectmenu` element in Chrome. Let's crank that up!
I would like to see HTML evolve further. A paradigm to offer a partial update to a webpage is needed. There should be a HTML-y way to do the islands and hydration stuff that is sweeping through JavaScript. The HTMX library offers an example of what this could look like using attributes and leveraging existing protocols such as HTTP and websockets.
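To give a flavour of the attribute-driven approach, here is a small HTMX sketch. The `hx-get`, `hx-target`, and `hx-swap` attributes are part of HTMX; the `/fragments/latest-posts` endpoint is hypothetical and is assumed to return an HTML fragment rather than JSON, and the CDN URL and pinned version are only one way to load the library.

```html
<!-- Load HTMX (URL and version are an assumption; use whatever your project pins) -->
<script src="https://unpkg.com/htmx.org@1.9.10"></script>

<!-- On click, GET the fragment and swap it into #posts without a full page load -->
<button hx-get="/fragments/latest-posts" hx-target="#posts" hx-swap="innerHTML">
  Load latest posts
</button>

<div id="posts">
  <p>No posts loaded yet.</p>
</div>
```

The server stays in charge of producing HTML, and the page is updated piece by piece, which is the kind of partial update I would like HTML itself to offer.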
I will be interested to see the results of the State of HTML survey.