A tour of web capabilities

Pere Sola - Apr 3 - - Dev Community

This post compiles notes from Max Firtman's course on Frontend Masters.

  • What is a capability? The ability of a web browser to perform an action, typically through a built-in API. Compatibility can be an issue, so some capabilities may not be available in some browsers.
  • Maturity of the capabilities: green (mature), light-green (may not be available in every browser, normally Chromium-based browsers vs Firefox/Safari), yellow (not mature and only available in some browsers), red (you can't use them yet; they may be added in the long term).
  • Resources to check the capabilities: MDN, caniuse.com, web.dev, webkit.org/blog (specifically for Safari), chromestatus.com, web.dev/baseline (a multi-vendor effort that defines a list of stable capabilities for the web, one list per year starting in 2022).
  • Core capabilities NOT covered in this course: fetch, web workers (for threading), WebAssembly, WebSocket, WebRTC, WebPerformance APIs, Network information (see course here), Device memory, WebOTP (one-time password), Web Crypto (cryptography), storage (another course), Web Components, CSS, 2D Canvas, WebGL, Pointer Lock (for gaming), screen capture, PWA (course), page visibility, background sync, background fetch, web push, and notifications, media session (all these in the course), Web Authentication, passkeys, credential management, (all these in the course).
  • Permissions: some capabilities are harmless, others carry a privacy risk or a cost. Where there is a cost or risk, browsers may limit the capability through user engagement requirements or a permission dialog. Most capabilities require HTTPS. Some need a user interaction to be enabled (i.e. you can't trigger them on page load). Permissions are granted on a per-origin basis (scheme, domain, and port). If the user denies permission, the API won't be able to ask again; the user has to re-enable it manually. A permission may have no time limit, or it may be limited by time, session, or usage. Permissions apply to the main navigation (the HTML document that was loaded); cross-origin iframes don't get them by default. To turn a capability off, use the Permissions Policy spec: an HTTP header (Permissions-Policy) for the document, or an HTML attribute (allow) for an iframe.
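
As a rough illustration of the Permissions-Policy header mentioned in the last point: the header value is a comma-separated list of feature=(allowlist) entries, and an empty allowlist turns the feature off. The sketch below builds such a value. The helper name buildPermissionsPolicy and its input shape are made up for this example, not something from the course.

```typescript
// Hypothetical helper: maps each feature name to the origins allowed to use it.
// An empty list disables the feature everywhere, including the top-level page.
type PolicyMap = Record<string, string[]>;

function buildPermissionsPolicy(policy: PolicyMap): string {
  return Object.entries(policy)
    .map(([feature, allowlist]) => {
      // `self` is a keyword; concrete origins must be wrapped in double quotes.
      const entries = allowlist.map((entry) =>
        entry === "self" ? entry : `"${entry}"`
      );
      return `${feature}=(${entries.join(" ")})`;
    })
    .join(", ");
}

const headerValue = buildPermissionsPolicy({
  geolocation: [],
  camera: ["self", "https://example.com"],
});
// headerValue is: geolocation=(), camera=(self "https://example.com")
```

The server would send this string as the value of the Permissions-Policy response header. For an iframe, the allow attribute plays a similar role, e.g. allow="camera; geolocation".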

Capabilities

Permissions API

1. Permissions API.

MDN Docs here
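
A minimal sketch of how the Permissions API can be used to check the state of a permission before calling a capability. navigator.permissions.query() resolves to an object whose state is "granted", "denied", or "prompt". The helper planPermissionFlow, its return values, and the small structural types are assumptions made for this example so that it can also run outside a browser.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
type PermissionStateLike = "granted" | "denied" | "prompt";
interface PermissionsLike {
  query(descriptor: { name: string }): Promise<{ state: PermissionStateLike }>;
}

type PermissionPlan = "use-directly" | "explain-then-ask" | "show-manual-steps";

// Hypothetical helper: decides what the UI should do before using a capability.
async function planPermissionFlow(
  permissions: PermissionsLike,
  permissionName: string
): Promise<PermissionPlan> {
  const result = await permissions.query({ name: permissionName });
  if (result.state === "granted") return "use-directly";
  // A denied permission cannot be re-prompted from script;
  // the user has to re-enable it manually in the browser settings.
  if (result.state === "denied") return "show-manual-steps";
  return "explain-then-ask";
}

// In a browser:
// planPermissionFlow(navigator.permissions, "geolocation").then(console.log);
```

Not every permission name is queryable in every browser, so a real call should also handle a rejected promise.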

Sensors, Geolocation, and Input Devices

  • Sensors on mobile devices: accelerometer (3 axes: x, y, z), gyroscope (turning the phone: alpha, beta, gamma), magnetometer (compass), proximity (to the user), light sensor.

2. Two ways to consume them on the web. The old APIs are global DOM APIs, and they are the only option on iPhones and iPads. They only support the magnetometer, gyroscope, and accelerometer, are consumed with event listeners (i.e. devicemotion and deviceorientation, MDN docs here and here), and need user permission. The newer Sensor API is not yet in Safari or Firefox (see MDN).
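
A sketch of the permission step for the old event-based APIs. Safari on iOS exposes a static requestPermission() method on DeviceMotionEvent (and DeviceOrientationEvent) that has to be called from a user gesture, while other browsers may not have it at all. The helper name ensureMotionPermission and the structural types are assumptions for illustration.

```typescript
type MotionPermission = "granted" | "denied";
interface MotionEventCtorLike {
  // Present where the browser asks explicitly (e.g. Safari on iOS), absent elsewhere.
  requestPermission?: () => Promise<MotionPermission>;
}

// Hypothetical helper: resolves to true when motion events can be listened to.
async function ensureMotionPermission(
  ctor: MotionEventCtorLike | undefined
): Promise<boolean> {
  if (!ctor) return false; // the browser has no devicemotion support
  if (typeof ctor.requestPermission !== "function") return true; // no explicit prompt
  return (await ctor.requestPermission()) === "granted";
}

// Browser usage (inside a user gesture such as a click handler):
// button.addEventListener("click", async () => {
//   if (await ensureMotionPermission((globalThis as any).DeviceMotionEvent)) {
//     window.addEventListener("devicemotion", (e) => {
//       console.log(e.accelerationIncludingGravity?.x, e.rotationRate?.alpha);
//     });
//   }
// });
```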

3. Geolocation API (one of the first capabilities on the web).

Google, Apple, etc. can know your location through Wi-Fi: your device knows which Wi-Fi networks are around, and these companies keep a database of SSIDs and their locations. The API gives you the location and is provider-agnostic (Wi-Fi, GPS, etc.). It works only in the foreground; it won't work in the background or in workers. Since the API is old, it's callback-based (no promises). MDN docs here.
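
Because the API is callback-based, a common pattern is to wrap getCurrentPosition in a Promise so it can be used with async/await. A minimal sketch follows; getPosition and the structural types are names made up for this example, while getCurrentPosition and its success, error, and options arguments are the real API shape.

```typescript
interface PositionLike {
  coords: { latitude: number; longitude: number; accuracy: number };
}
interface PositionErrorLike {
  code: number; // 1 = permission denied, 2 = position unavailable, 3 = timeout
  message: string;
}
interface PositionOptionsLike {
  enableHighAccuracy?: boolean;
  timeout?: number;
  maximumAge?: number;
}
interface GeolocationLike {
  getCurrentPosition(
    success: (position: PositionLike) => void,
    error?: (error: PositionErrorLike) => void,
    options?: PositionOptionsLike
  ): void;
}

// Hypothetical wrapper: turns the callback pair into a Promise.
function getPosition(
  geo: GeolocationLike,
  options?: PositionOptionsLike
): Promise<PositionLike> {
  return new Promise<PositionLike>((resolve, reject) => {
    geo.getCurrentPosition(resolve, reject, options);
  });
}

// In a browser:
// const position = await getPosition(navigator.geolocation, { timeout: 10000 });
// console.log(position.coords.latitude, position.coords.longitude);
```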

4. Screen orientation API.

Green availability: reading whether the device is in portrait or landscape mode. Other features, such as locking the screen orientation, are not yet widely available.
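
A small sketch of the widely available part, reading the current orientation. screen.orientation.type is one of four strings; isPortrait is a made-up helper name for this example.

```typescript
type OrientationTypeLike =
  | "portrait-primary"
  | "portrait-secondary"
  | "landscape-primary"
  | "landscape-secondary";

// Hypothetical helper: collapses the four orientation types into portrait vs landscape.
function isPortrait(type: OrientationTypeLike): boolean {
  return type.startsWith("portrait");
}

// In a browser:
// console.log(isPortrait(screen.orientation.type));
// screen.orientation.addEventListener("change", () => {
//   console.log("now", screen.orientation.type, screen.orientation.angle);
// });
```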

5. Touch events API.

They work with touch screens: touchstart, touchend, touchcancel, etc. They started out as an Apple-proprietary API. MDN docs here. That is why the following API was created:

6. Pointer events API.

MDN docs here. Based on mouse events, and multi-pointer (they work with mouse, trackpad, touch pen, stylus, etc.). You receive one event per touch interaction (i.e. with 3 fingers you get 3 events, vs 1 event in the Touch events API above).
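
Since each pointer produces its own events, a typical way to follow several fingers at once is to key them by pointerId. A minimal sketch, assuming a hypothetical PointerTracker class; pointerId, pointerType, clientX, and clientY are real PointerEvent properties.

```typescript
interface PointerLike {
  pointerId: number;
  pointerType: string; // "mouse", "pen", or "touch" in browsers
  clientX: number;
  clientY: number;
}

// Hypothetical helper: remembers the latest event of every pointer that is currently down.
class PointerTracker {
  private active = new Map<number, PointerLike>();

  down(e: PointerLike): void {
    this.active.set(e.pointerId, e);
  }

  move(e: PointerLike): void {
    // Ignore moves from pointers that never went down (e.g. a hovering mouse).
    if (this.active.has(e.pointerId)) this.active.set(e.pointerId, e);
  }

  up(e: PointerLike): void {
    this.active.delete(e.pointerId);
  }

  get count(): number {
    return this.active.size;
  }
}

// In a browser:
// const tracker = new PointerTracker();
// element.addEventListener("pointerdown", (e) => tracker.down(e));
// element.addEventListener("pointermove", (e) => tracker.move(e));
// element.addEventListener("pointerup", (e) => tracker.up(e));
// element.addEventListener("pointercancel", (e) => tracker.up(e));
```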

7. VirtualKeyboard API.

MDN docs here. This API lets us show/hide the virtual keyboard and know how much space it takes up on the screen.

8. Gamepad API.

MDN docs here. Green (mature) API. It's a high-level API: you don't need to know the specifics of the gamepad, and it covers the minimum that all gamepads have (e.g. a speaker is not common on gamepads, so it is not supported via the API). It's also low latency. You create a requestAnimationFrame loop (typically 60 fps) and check the status of each button, exposed as a boolean.
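
A sketch of the polling approach described above. navigator.getGamepads() returns the connected gamepads (with null for empty slots), and every button has a pressed boolean. The helper pressedButtons and the structural types are assumptions for this example.

```typescript
interface GamepadButtonLike {
  pressed: boolean;
  value: number;
}
interface GamepadLike {
  index: number;
  buttons: ReadonlyArray<GamepadButtonLike>;
  axes: ReadonlyArray<number>;
}

// Hypothetical helper: returns the indexes of the buttons that are currently pressed.
function pressedButtons(pad: GamepadLike): number[] {
  const result: number[] = [];
  pad.buttons.forEach((button, i) => {
    if (button.pressed) result.push(i);
  });
  return result;
}

// In a browser, poll once per animation frame:
// function frame() {
//   for (const pad of navigator.getGamepads()) {
//     if (pad) console.log(pad.index, pressedButtons(pad));
//   }
//   requestAnimationFrame(frame);
// }
// window.addEventListener("gamepadconnected", () => requestAnimationFrame(frame), { once: true });
```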

9. Web HID API (HID = Human Interface Device).

MDN docs here. Limited availability (light-green).

Speech, voice, and camera

10. Web Speech API.

  • Speech recognition API

MDN docs here.

  • Speech synthesis API.

We can make the web app speak.
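
A minimal sketch of speech synthesis: create a SpeechSynthesisUtterance and hand it to speechSynthesis.speak(). The helper speak and the SpeechEnvLike type are made up for this example. The environment is passed in as a parameter only so the sketch can be exercised outside a browser.

```typescript
interface UtteranceLike {
  lang: string;
  rate: number;
}
interface SpeechEnvLike {
  speechSynthesis?: { speak(utterance: UtteranceLike): void };
  SpeechSynthesisUtterance?: new (text: string) => UtteranceLike;
}

// Hypothetical helper: speaks the text and returns false when synthesis is unavailable.
function speak(env: SpeechEnvLike, text: string, lang: string = "en-US"): boolean {
  const synth = env.speechSynthesis;
  const Utterance = env.SpeechSynthesisUtterance;
  if (!synth || !Utterance) return false;
  const utterance = new Utterance(text);
  utterance.lang = lang; // BCP 47 language tag
  synth.speak(utterance); // queued; the browser speaks utterances in order
  return true;
}

// In a browser:
// speak(window, "Hello from the web");
```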

11. Shape detection API. Between yellow and red.

  • Barcode detector API. MDN docs here. Experimental, it is only available for Chrome on Android as of this writing. Chromestatus here

  • Face detector API. Experimental, it is only available for Chrome on Android as of this writing.

  • Text detector API (OCR). Experimental, it is only available for Chrome on Android as of this writing.

12. Media devices. It lets you open the camera and microphone, and get the stream.
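
A sketch of how a stream is usually requested: navigator.mediaDevices.getUserMedia() takes a constraints object and resolves to a MediaStream. The helper buildCameraConstraints is a made-up name; facingMode with "user" (front camera) and "environment" (back camera) is part of the real constraints format.

```typescript
type CameraSide = "front" | "back";
interface StreamConstraintsLike {
  audio: boolean;
  video: { facingMode: "user" | "environment" };
}

// Hypothetical helper: builds the constraints object passed to getUserMedia.
function buildCameraConstraints(
  side: CameraSide,
  withAudio: boolean = false
): StreamConstraintsLike {
  return {
    audio: withAudio,
    video: { facingMode: side === "front" ? "user" : "environment" },
  };
}

// In a browser (triggers the camera permission prompt):
// const stream = await navigator.mediaDevices.getUserMedia(buildCameraConstraints("back"));
// videoElement.srcObject = stream;
// Later, to turn the camera off:
// stream.getTracks().forEach((track) => track.stop());
```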

13. Advanced control camera (aka Camera PTZ - Pan, Tilt, Zoom). Lets you manipulate the camera.

13. Augmented reality (AR, XR).

14. Screen Wake Lock API. MDN docs here. Light green (not supported on Safari on iOS).
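
A minimal sketch of requesting a screen wake lock. navigator.wakeLock.request("screen") resolves to a sentinel object that can later be released, and the request can reject (for example when the page is not visible). The helper keepScreenAwake and the structural types are assumptions for this example.

```typescript
interface WakeLockSentinelLike {
  released: boolean;
  release(): Promise<void>;
}
interface WakeLockLike {
  request(type: "screen"): Promise<WakeLockSentinelLike>;
}

// Hypothetical helper: resolves to the sentinel, or null when the lock is unavailable.
async function keepScreenAwake(
  wakeLock: WakeLockLike | undefined
): Promise<WakeLockSentinelLike | null> {
  if (!wakeLock) return null; // API not supported in this browser
  try {
    return await wakeLock.request("screen");
  } catch {
    // The browser refused the request; carry on without a lock.
    return null;
  }
}

// In a browser:
// let lock = await keepScreenAwake(navigator.wakeLock);
// The lock is released when the page is hidden, so request it again on return:
// document.addEventListener("visibilitychange", async () => {
//   if (document.visibilityState === "visible") lock = await keepScreenAwake(navigator.wakeLock);
// });
```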

External hardware and Devices

15. Web Bluetooth API. Limited availability. MDN docs here. Firefox and Safari don't implement it, and it looks like they never will, reportedly over security and privacy concerns. So it's only on Chromium-based browsers for now. It only works with BLE (Bluetooth Low Energy) devices.
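
A sketch of the usual Web Bluetooth flow, reading the standard battery level from a BLE device: request a device, connect to its GATT server, get a service, get a characteristic, and read the value. The helper readBatteryLevel and the structural types are made up for this example. The bluetooth object is passed in so the sketch does not depend on browser typings.

```typescript
interface CharacteristicLike {
  readValue(): Promise<DataView>;
}
interface ServiceLike {
  getCharacteristic(name: string): Promise<CharacteristicLike>;
}
interface GattServerLike {
  getPrimaryService(name: string): Promise<ServiceLike>;
}
interface BluetoothDeviceLike {
  gatt?: { connect(): Promise<GattServerLike> };
}
interface BluetoothLike {
  requestDevice(options: {
    filters: Array<{ services: string[] }>;
  }): Promise<BluetoothDeviceLike>;
}

// Hypothetical helper: resolves to the battery percentage (0 to 100).
async function readBatteryLevel(bluetooth: BluetoothLike): Promise<number> {
  // Opens the browser's device chooser; must be triggered by a user gesture.
  const device = await bluetooth.requestDevice({
    filters: [{ services: ["battery_service"] }],
  });
  if (!device.gatt) throw new Error("GATT is not available on this device");
  const server = await device.gatt.connect();
  const service = await server.getPrimaryService("battery_service");
  const characteristic = await service.getCharacteristic("battery_level");
  const value = await characteristic.readValue();
  return value.getUint8(0); // the battery level is a single byte
}

// In a Chromium-based browser, from a click handler:
// const level = await readBatteryLevel((navigator as any).bluetooth);
```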

16. Web Audio API. MDN docs here. Green. Low-level API that lets you generate dynamic audio and 3D audio, and even communicate with devices over ultrasound. The Sonic socket library uses it for ultrasound communication between devices.

17. Web Midi API. MDN docs here. Limited availability (Chromium-based browsers only). It's similar to HID but much older.

18. Web serial API. Very low level API. MDN docs here. Experimental.

19. Web USB API. MDN docs here. Experimental. Low-level API for device vendors. You probably won't be using this API.

19. Vibration API. MDN docs here. Targeting phones only.
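
A small sketch of the Vibration API. navigator.vibrate() accepts either a duration in milliseconds or a pattern array that alternates vibration and pause durations. The helper pulsePattern is a made-up name used to build such a pattern.

```typescript
// Hypothetical helper: builds an alternating [vibrate, pause, vibrate, ...] pattern.
function pulsePattern(pulses: number, onMs: number, offMs: number): number[] {
  const pattern: number[] = [];
  for (let i = 0; i < pulses; i++) {
    pattern.push(onMs);
    // No trailing pause after the last pulse.
    if (i < pulses - 1) pattern.push(offMs);
  }
  return pattern;
}

// pulsePattern(3, 200, 100) is [200, 100, 200, 100, 200]

// In a browser that supports it:
// if ("vibrate" in navigator) navigator.vibrate(pulsePattern(3, 200, 100));
// navigator.vibrate(0) cancels an ongoing vibration.
```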

20. Battery status. MDN docs here. Doesn't work in many browsers.

21. Idle detection. MDN docs here. Limited availability.

22. Web NFC. MDN docs here. Limited availability.
