Hey! If you're reading this now, note that some of the articles linked here are not in English yet. I'm doing my best to translate them and will update the links as soon as I do!
A while ago, I published a post on my Medium in which I covered the Iterator protocol and its user interface. However, in addition to APIs like Promise.finally, ECMAScript 2018 brought us another way of handling iterators: async iterators.
The problem
Let's put ourselves in a fairly common situation. We're working with Node.js and we have to read a file line by line. Node has a module for this type of task called readline (see the full documentation here). This module is a wrapper that lets you read data from an input stream line by line instead of having to parse the input buffer and break the text into small pieces yourself.
It exposes an event API, which you can listen to like this:
const fs = require('fs')
const readline = require('readline')
const reader = readline.createInterface({
input: fs.createReadStream('./file.txt'),
crlfDelay: Infinity
})
reader.on('line', (line) => console.log(line))
Imagine we have a simple file:
line 1
line 2
line 3
If we run this code on the file we've created, we'll get line-by-line output on our console. However, working with events is not the best way to write maintainable code: listeners fire asynchronously, outside the normal top-to-bottom flow of the program, and the only way to react to them is to attach a callback.
The solution
In addition to the event API, readline also exposes an async iterator. This means that, instead of reading each line through a listener on the 'line' event, we can read it through a new way of using the for keyword.
Today we have a few options for using a for loop. The first one is the most common model, using a counter and a condition:
for (let x = 0; x < array.length; x++) {
// Code here
}
We can also use the for ... in notation to read array indexes:
const a = [1,2,3,4,5,6]
for (let index in a) {
console.log(a[index])
}
In the previous case, console.log(a[index]) prints the numbers from 1 to 6. If we use console.log(index) instead, we log the indexes of the array, that is, the numbers from 0 to 5.
For the next case, we can use the for ... of notation to get the values of the array directly. Unlike for ... in, which walks the enumerable property names, for ... of asks the array for its iterator and walks the values it produces:
const a = [1,2,3,4,5,6]
for (let item of a) {
console.log(item)
}
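To make the difference between the two loops concrete, here is a small sketch. The compareLoops helper and the label property are hypothetical, added only for illustration; they show that for ... in visits property names (including non-index ones), while for ... of visits only the values the array's iterator produces.

```javascript
// Collect what each loop visits so the two can be compared side by side.
function compareLoops (list) {
  const keys = []
  const values = []
  for (const key in list) keys.push(key)       // enumerable property names (strings)
  for (const value of list) values.push(value) // values produced by the array's iterator
  return { keys, values }
}

const letters = ['a', 'b', 'c']
letters.label = 'extra' // hypothetical non-index property, for illustration only

const result = compareLoops(letters)
console.log(result.keys)   // [ '0', '1', '2', 'label' ]
console.log(result.values) // [ 'a', 'b', 'c' ]
```

This is one reason for ... of is usually the safer choice when you only care about the items of an array.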
Notice that all the ways I've described are synchronous. So, how do we read a sequence of promises in order?
Imagine that we have another interface that always returns a Promise, which resolves to the content we want to read. To resolve these promises in order, we need to do something like this:
async function readLine (files) {
for (const file of files) {
const line = await readFile(file) // Imagine readFile is our cursor
console.log(line)
}
}
However, thanks to the magic of async iterables (like readline) we can do the following:
const fs = require('fs')
const readline = require('readline')
const reader = readline.createInterface({
input: fs.createReadStream('./file.txt'),
crlfDelay: Infinity
})
async function read () {
for await (const line of reader) {
console.log(line)
}
}
read()
Notice that we are now using a new form of for: for await (const x of y).
For Await and Node.js
The for await notation is natively supported in the Node.js runtime from version 10.x onwards. If you are using versions 8.x or 9.x, you need to start Node with the --harmony_async_iteration flag when running your JavaScript file. Unfortunately, async iterators are not supported in Node.js versions 6 or 7.
In order to understand the concept of async iterators, we need to take a look at what iterators themselves are. My previous article is a great source of information, but in short, an Iterator is an object that exposes a next() function that returns another object in the shape {value: any, done: boolean}, where value is the value of the current iteration and done indicates whether the sequence has finished.
A simple example is an iterator that goes through all the items in an array:
const array = [1,2,3]
let index = 0
const iterator = {
next: () => {
if (index >= array.length) return { done: true }
return {
value: array[index++],
done: false
}
}
}
On its own, an iterator is of little practical use; to get some use out of it, we need an iterable. An iterable is an object that has a Symbol.iterator key holding a function, which returns our iterator:
// ... Iterator code here ...
const iterable = {
[Symbol.iterator]: () => iterator
}
Now we can use it normally, with for (const x of iterable), and we'll have all the values in the array being iterated one by one.
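One detail worth noting: the iterator above keeps its index outside the iterable, so it can only be walked once. A minimal sketch of an alternative, assuming a hypothetical createIterable helper (not part of the original example), creates a fresh iterator on every call to Symbol.iterator:

```javascript
// A hypothetical helper (not from the article) that wraps an array in an
// iterable. Every call to [Symbol.iterator]() creates a new iterator with
// its own index, so the iterable can be consumed more than once.
function createIterable (array) {
  return {
    [Symbol.iterator]: () => {
      let index = 0
      return {
        next: () => index < array.length
          ? { value: array[index++], done: false }
          : { value: undefined, done: true }
      }
    }
  }
}

const numbers = createIterable([1, 2, 3])
for (const x of numbers) console.log(x) // 1, 2, 3
console.log([...numbers])               // [ 1, 2, 3 ]
```

Because the state lives inside the function stored under Symbol.iterator, the for ... of loop and the spread operator each get their own independent walk over the data.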
If you want to know a bit more about Symbols, take a look at this other article I wrote only about it.
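Under the hood, arrays (like strings, Maps and Sets) have a Symbol.iterator method, which is why we can write for (let x of [1,2,3]) and get the values we want. Plain objects do not have one by default, so they cannot be used with for ... of unless we add it ourselves.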
As you might expect, an async iterator works just like an iterator, except that the iterable exposes a Symbol.asyncIterator instead of a Symbol.iterator, and next() returns a Promise that resolves to a {value, done} object instead of returning that object directly.
Let's turn our iterator above into an async iterator:
const array = [1,2,3]
let index = 0
const asyncIterator = {
next: () => {
if (index >= array.length) return Promise.resolve({done: true})
return Promise.resolve({value: array[index++], done: false})
}
}
const asyncIterable = {
[Symbol.asyncIterator]: () => asyncIterator
}
Iterating asynchronously
We can iterate any iterator manually by calling the next() function:
// ... Async iterator Code here ...
async function manual () {
  const promise = asyncIterator.next() // Promise
  await promise // Object { value: 1, done: false }
  await asyncIterator.next() // Object { value: 2, done: false }
  await asyncIterator.next() // Object { value: 3, done: false }
  await asyncIterator.next() // Object { done: true }
}
In order to iterate through our async iterator, we have to use for await. Remember that the await keyword can only be used inside an async function, which means we need something like this:
// ... Code above ...
async function iterate () {
for await (const num of asyncIterable) console.log(num)
}
iterate() // 1, 2, 3
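The example above resolves every value immediately. To see for await actually waiting, here is a rough sketch in which each value arrives after a short timer. The delay, createAsyncIterable and collect names are hypothetical helpers introduced only for this illustration.

```javascript
// Hypothetical helper: resolves with `value` after `ms` milliseconds.
const delay = (ms, value) => new Promise((resolve) => setTimeout(resolve, ms, value))

// Each call to [Symbol.asyncIterator]() starts a fresh iteration.
function createAsyncIterable (array, ms = 10) {
  return {
    [Symbol.asyncIterator]: () => {
      let index = 0
      return {
        next: () => index < array.length
          ? delay(ms, { value: array[index++], done: false })
          : Promise.resolve({ value: undefined, done: true })
      }
    }
  }
}

// Gathers every value of an async iterable into an array.
async function collect (asyncIterable) {
  const values = []
  for await (const value of asyncIterable) values.push(value)
  return values
}

collect(createAsyncIterable(['a', 'b', 'c'])).then(console.log) // [ 'a', 'b', 'c' ]
```

The loop body only runs once the Promise returned by next() has resolved, so the values are still processed one at a time and in order.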
However, since async iterators are not available by default in Node 8.x or 9.x (super old, I know), to use an async iterator in these versions without the flag we can simply take the next function from our iterator and call it manually:
// ... Async Iterator Code here ...
async function iterate () {
const {next} = asyncIterable[Symbol.asyncIterator]() // we take the next iterator function
for (let {value, done} = await next(); !done; {value, done} = await next()) {
console.log(value)
}
}
Note that for await is much cleaner and more concise because it behaves like a regular loop. Besides being simpler to understand, it also checks for the end of the iterator by itself, via the done key.
Handling errors
What happens if a promise is rejected within our iterator? Well, like any rejected promise, we can catch its error through a simple try/catch (since we're using await):
// Reject with an Error object so that e.message is defined in the catch below
const asyncIterator = { next: () => Promise.reject(new Error('Error')) }
const asyncIterable = { [Symbol.asyncIterator]: () => asyncIterator }
async function iterate () {
try {
for await (const num of asyncIterable) {}
} catch (e) {
console.log(e.message)
}
}
iterate()
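A rejection does not have to happen on the first step. The sketch below, built around a hypothetical createFailingIterable helper (an assumption for illustration only), fails partway through and shows that the values delivered before the rejection are still processed, after which the loop stops and control moves to the catch block.

```javascript
// Hypothetical iterable that rejects when it reaches the step `failAt`.
function createFailingIterable (values, failAt) {
  return {
    [Symbol.asyncIterator]: () => {
      let index = 0
      return {
        next: () => {
          if (index === failAt) return Promise.reject(new Error(`failed at step ${index}`))
          if (index >= values.length) return Promise.resolve({ value: undefined, done: true })
          return Promise.resolve({ value: values[index++], done: false })
        }
      }
    }
  }
}

// Records what was seen before the failure, plus the error message (if any).
async function iterateSafely (source) {
  const seen = []
  try {
    for await (const num of source) seen.push(num)
  } catch (e) {
    return { seen, error: e.message }
  }
  return { seen, error: null }
}

iterateSafely(createFailingIterable([1, 2, 3], 2)).then(console.log)
// { seen: [ 1, 2 ], error: 'failed at step 2' }
```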
Fallbacks
Something quite interesting about for await is that it falls back to Symbol.iterator when the object has no Symbol.asyncIterator, which means that you can also use it with regular iterables, for example, an array of promises:
const promiseArray = [
fetch('https://lsantos.dev'),
fetch('https://lsantos.me')
]
async function iterate () {
for await (const response of promiseArray) console.log(response.status)
}
iterate() // 200, 200
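If you want to try the fallback without making network requests, a minimal sketch with timers works too. The delay helper and the sample values are assumptions made for this illustration. Note that the results follow the order of the array, not the order in which the promises settle, and that values which are not promises pass through unchanged.

```javascript
// Hypothetical helper: resolves with `value` after `ms` milliseconds.
const delay = (ms, value) => new Promise((resolve) => setTimeout(resolve, ms, value))

async function inOrder () {
  // A plain array: it only has Symbol.iterator, not Symbol.asyncIterator.
  const promises = [delay(30, 'slow'), delay(10, 'fast'), 'not a promise']
  const results = []
  for await (const value of promises) results.push(value)
  return results
}

inOrder().then(console.log) // [ 'slow', 'fast', 'not a promise' ]
```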
Async Generators
For the most part, iterators and async iterators can be created from generators.
Generators are functions whose execution can be paused and resumed, so that it is possible to run part of the function and then fetch the next value via a next() function.
This is a very simplified description of generators, and it is essential to read the article that talks only about them so that you can understand generators quickly and in depth.
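An async generator returns an object that is both an async iterator and an async iterable, so it works directly with for await. If the generator never finishes on its own, as in the infinite loop below, the consumer has to implement the stopping mechanism. For example, let's build a random git commit message generator to make your colleagues super happy with their contributions: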
async function* gitCommitMessageGenerator () {
const url = 'https://whatthecommit.com/index.txt'
while (true) {
const response = await fetch(url)
yield await response.text() // We return the value
}
}
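Note that the generator wraps every yielded value in a {value, done} object for us, but because of the while (true) it never reports done: true, so a for await loop over it would never finish by itself. We can stop it from the consumer side with a function like this: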
// Previous Code
async function getCommitMessages (times) {
let execution = 1
for await (const message of gitCommitMessageGenerator()) {
console.log(message)
if (execution++ >= times) break
}
}
getCommitMessages(5)
// I'll explain this when I'm sober .. or revert it
// Never before had a small typo like this one caused so much damage.
// For real, this time.
// Too lazy to write descriptive message
// Ugh. Bad rebase.
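By contrast, a generator that finishes on its own needs no manual stop. The sketch below uses a hypothetical countTo generator (not part of the original example) to show the objects that next() resolves to, including the final done: true that ends a for await loop.

```javascript
// Hypothetical generator: yields `count` numbers and then returns on its own.
async function* countTo (count) {
  for (let i = 1; i <= count; i++) {
    yield i // delivered to the consumer as { value: i, done: false }
  }
  // Reaching the end of the function produces { value: undefined, done: true }
}

async function demo () {
  const generator = countTo(2)
  const steps = [
    await generator.next(), // { value: 1, done: false }
    await generator.next(), // { value: 2, done: false }
    await generator.next()  // { value: undefined, done: true }
  ]

  const looped = []
  for await (const n of countTo(3)) looped.push(n) // ends without a break
  return { steps, looped }
}

demo().then(console.log)
```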
Use cases
For a more interesting example, let's build an async iterator for a real use case. Currently, the Oracle Database driver for Node.js supports a resultSet API, which executes a query on the database and returns a stream of records that can be read one by one using the getRow() method.
To create this resultSet we need to execute a query in the database, like this:
const oracle = require('oracledb')
const options = {
user: 'example',
password: 'example123',
connectString: 'string'
}
async function start () {
const connection = await oracle.getConnection(options)
const { resultSet } = await connection.execute('query', [], { outFormat: oracle.OBJECT, resultSet: true })
return resultSet
}
start().then(console.log)
Our resultSet has a method called getRow() that returns a Promise of the next row to be fetched from the database. That's a nice use case for an async iterator, isn't it? We can create a cursor that returns this resultSet row by row. Let's make it a bit more complex by creating a Cursor class:
class Cursor {
constructor(resultSet) {
this.resultSet = resultSet
}
getIterable() {
return {
[Symbol.asyncIterator]: () => this._buildIterator()
}
}
_buildIterator() {
return {
next: () => this.resultSet.getRow().then((row) => ({ value: row, done: row === undefined }))
}
}
}
module.exports = Cursor
Notice that the cursor receives the resultSet it should work on and stores it in its own state. So, let's change our previous method so that we iterate through the cursor instead of returning the resultSet in one go:
const oracle = require('oracledb')
const Cursor = require('./cursor') // assumption: the Cursor class above was saved as ./cursor.js
const options = {
user: 'example',
password: 'example123',
connectString: 'string'
}
async function getResultSet() {
const connection = await oracle.getConnection(options)
const { resultSet } = await connection.execute('query', [], { outFormat: oracle.OBJECT, resultSet: true })
return resultSet
}
async function start() {
const resultSet = await getResultSet()
const cursor = new Cursor(resultSet)
for await (const row of cursor.getIterable()) {
console.log(row)
}
}
start()
This way we can loop through all the returned rows without resolving each Promise by hand.
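The same cursor idea can probably be expressed more compactly with an async generator. The sketch below is not the driver's API: createFakeResultSet is a hypothetical stand-in for a real resultSet so the code runs without a database, and it assumes, as the Cursor class does, that getRow() resolves to undefined once there are no rows left.

```javascript
// Stand-in for the driver's resultSet (an assumption for illustration):
// getRow() resolves to the next row, or undefined when there are none left.
function createFakeResultSet (rows) {
  let position = 0
  return {
    getRow: () => Promise.resolve(rows[position++])
  }
}

// Same idea as the Cursor class, written as an async generator.
async function* rowsOf (resultSet) {
  let row
  while ((row = await resultSet.getRow()) !== undefined) {
    yield row
  }
}

// Reads every row of a result set into an array.
async function readAll (resultSet) {
  const rows = []
  for await (const row of rowsOf(resultSet)) rows.push(row)
  return rows
}

readAll(createFakeResultSet([{ id: 1 }, { id: 2 }])).then(console.log)
// [ { id: 1 }, { id: 2 } ]
```

The generator takes care of the {value, done} bookkeeping, so the only logic left to write is the condition that decides when the rows have run out.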
Conclusion
Async iterators are extremely powerful, especially in dynamic and asynchronous languages like JavaScript. With them you can turn a complex execution into simple code, hiding most of the complexity from the user.
Be sure to follow more of my content on my blog and sign up for my newsletter to receive weekly news!