TL;DR - Skip the theory - Take me to the code
Prerequisites
Note: To follow this article, you need a working installation of Node.js on your machine. You will also need an HTTP client to send requests. For this purpose, I will use Postman.
What are streams for Node.js?
Streams are a very basic method of data transmission. In a nutshell, they divide your data into smaller chunks and transfer (pipe) them, one by one, from one place to another. Whenever you're watching a video on Netflix, you're experiencing them first hand: the whole video isn't sent to your browser up front, but only parts of it, piece by piece.
A lot of npm and native Node modules use them under the hood, as they come with a few neat features:
- Asynchronously sending requests and responses
- Reading data from one physical location and writing it to another
- Processing data without loading all of it into memory at once
The last point makes streams particularly charming: it makes dealing with bigger files more efficient and fits the spirit of Node's event loop and its non-blocking I/O.
To visualize streams, consider the following example.
You have a single file with a size of 4 GB. When this file is processed without streams, all of it is loaded into your computer's memory. That would be quite a boulder to digest all at once.
Buffering means loading data into RAM. Only after the full file has been buffered is it sent to a server.
Streams, in comparison to the example above, would not read/write the file as a whole, but rather split it into smaller chunks. These can then be sent, consumed or worked through one by one, lowering stress for the hardware during runtime. And that's exactly what we'll build now.
Instead of loading the whole file, streams process parts (chunks) of it one by one.
In a nutshell, streams split a computer resource into smaller pieces and work through these one by one, instead of processing it as a whole.
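To make the difference tangible, here is a minimal sketch that reads the same file once as a whole and once as a stream. The demo file name and the countChunks helper are my own placeholders and not part of the project we build below.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical demo file: 1 MiB of zeros in the OS temp directory
const demoFile = path.join(os.tmpdir(), 'stream-demo.bin');
fs.writeFileSync(demoFile, Buffer.alloc(1024 * 1024));

// Buffered: the whole file is held in memory at once
const whole = fs.readFileSync(demoFile);
console.log(`Buffered read: ${whole.length} bytes in one piece`);

// Streamed: the file arrives in chunks, so only one chunk
// needs to be held in memory at a time
const countChunks = (file) =>
  new Promise((resolve, reject) => {
    let chunks = 0;
    let bytes = 0;
    fs.createReadStream(file)
      .on('data', (chunk) => {
        chunks += 1;
        bytes += chunk.length;
      })
      .on('end', () => resolve({ chunks, bytes }))
      .on('error', reject);
  });

countChunks(demoFile).then(({ chunks, bytes }) => {
  console.log(`Streamed read: ${bytes} bytes in ${chunks} chunks`);
});
```

Both variants see the same number of bytes. The streamed one just never has to hold all of them at the same time.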
Get started
... or skip to the full example right away
Let's formulate the features we'd like to have:
- To keep it simple, we will work with a single index file that opens an express server.
- Inside of it, there's a route that reacts to POST requests and in which the streaming will take place.
- The file sent will be uploaded to the project's root directory.
- (Optional): We are able to monitor the streaming progress while the upload takes place.
Also, let's do the following to get started:
- Open up your favourite text editor and create a new folder.
- Initialize an npm project and install the necessary modules.
- Add an index.js file, which we'll populate with our code in a moment.
# Initialize the project
$ npm init -y
# Install the express module
$ npm i express
# Optionally add nodemon as dev dependency
$ npm i -D nodemon
# Create the index.js file
# On Windows PowerShell:
# $ New-Item index.js
# On a Linux terminal:
$ touch index.js
When everything is done, you should have a folder structure that looks like this:
project-directory
| - node_modules
| - package.json
| - index.js
Create the server
Add the following to your index.js file to create a server that listens for requests:
// Load the necessary modules and define a port
const app = require('express')();
const fs = require('fs');
const path = require('path');
const port = process.env.PORT || 3000;
// Add a basic route to check if server's up
app.get('/', (req, res) => {
  res.status(200).send(`Server up and running`);
});
// Mount the app to a port
app.listen(port, () => {
  console.log(`Server running at http://127.0.0.1:${port}/`);
});
Then open the project directory in a terminal / shell and start the server up.
# If you're using nodemon, go with this
# in the package.json:
# { ...
# "scripts": {
# "dev": "nodemon index.js"
# }
# ... }
# Then, run the dev script
$ npm run dev
# Else, start it up with the node command
$ node index.js
Navigate to http://localhost:3000. You should see the expected response.
Writing a basic stream to save data to a file
The fs module provides two methods to create streams: one for reading and one for writing. A very simple example of how to use them looks like this, where whereFrom and whereTo are the paths the streams read from and write to. Each path can be passed as a string, a Buffer or a file: URL object.
const fs = require("fs");
const readStream = fs.createReadStream(whereFrom)
const writeStream = fs.createWriteStream(whereTo)
// You could achieve the same with destructuring:
const {createReadStream, createWriteStream} = require("fs");
From its creation until it closes, the stream emits a series of events that we can attach callback functions to. One of these events is 'open', which fires once the underlying file has been opened.
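As a rough illustration of both methods and their events working together, the following sketch copies a file by piping a read stream into a write stream and records which events the write stream emits. The paths and the copyFile helper are assumptions made up for this demo.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical paths in the OS temp directory, used only for this demo
const whereFrom = path.join(os.tmpdir(), 'pipe-demo-source.txt');
const whereTo = path.join(os.tmpdir(), 'pipe-demo-copy.txt');
fs.writeFileSync(whereFrom, 'Hello streams!\n'.repeat(1000));

// Copy a file chunk by chunk and collect the write stream's events
const copyFile = (from, to) =>
  new Promise((resolve, reject) => {
    const events = [];
    const readStream = fs.createReadStream(from);
    const writeStream = fs.createWriteStream(to);

    readStream.on('error', reject);
    writeStream.on('error', reject);
    writeStream.on('open', () => events.push('open'));
    writeStream.on('finish', () => events.push('finish'));
    writeStream.on('close', () => {
      events.push('close');
      resolve(events);
    });

    // pipe() forwards every chunk and ends the write stream when done
    readStream.pipe(writeStream);
  });

copyFile(whereFrom, whereTo).then((events) => {
  console.log(`Write stream events in order: ${events.join(', ')}`);
});
```

Note that 'finish' (all data flushed) and 'close' (file handle released) are two separate events. The upload route below only relies on 'open' for now.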
Add the following below the app.get() method in the index.js file:
app.post('/', (req, res) => {
  const filePath = path.join(__dirname, `/image.jpg`);
  const stream = fs.createWriteStream(filePath);
  stream.on('open', () => req.pipe(stream));
});
What I found particularly interesting about this one is: why does the req argument have a pipe method?
The answer is noted in the documentation of the http module, which Express builds on: a request is an object that inherits from the Readable stream class and therefore has all of its methods available.
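You can verify this yourself with Node's core modules alone. The sketch below checks the class behind the request object, http.IncomingMessage; the two variable names are just placeholders of mine.

```javascript
const http = require('http');
const { Readable } = require('stream');

// http.IncomingMessage is the class behind `req` in Node's http server
const reqIsReadable = http.IncomingMessage.prototype instanceof Readable;
const reqHasPipe = typeof http.IncomingMessage.prototype.pipe === 'function';

console.log(`IncomingMessage extends Readable: ${reqIsReadable}`); // true
console.log(`IncomingMessage has pipe(): ${reqHasPipe}`); // true
```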
Having added the stream, let us now reload the server, move to Postman and do the following:
- Change the request method to POST and add the URL localhost:3000.
- Select the 'Body' tab, check the binary option and choose a file you would like to upload. As we've hardcoded the name to be 'image.jpg', an actual image would be preferable.
- Click on 'Send' and check back to the code editor.
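If everything went well, you'll notice the file you just chose is now available in the project's root directory. Try to open it and check whether the upload was successful. Note that Postman will keep waiting for a response, as the route does not send one yet. We'll take care of that in the next section.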
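If you'd rather test from a script than from Postman, a small Node client can stream a file to the route as well. This is only a sketch: sendFile, the example file name and the URL in the usage comment are assumptions of mine, and the promise only settles once the route actually sends a response.

```javascript
const fs = require('fs');
const http = require('http');

// Hypothetical helper: streams a local file to a POST route
const sendFile = (filePath, url) =>
  new Promise((resolve, reject) => {
    const { size } = fs.statSync(filePath);
    const request = http.request(
      url,
      {
        method: 'POST',
        headers: {
          'Content-Type': 'application/octet-stream',
          'Content-Length': size,
        },
      },
      (response) => {
        // Collect the server's answer and hand it back to the caller
        let body = '';
        response.setEncoding('utf8');
        response.on('data', (chunk) => (body += chunk));
        response.on('end', () =>
          resolve({ status: response.statusCode, body })
        );
      }
    );
    request.on('error', reject);

    // The outgoing request is a writable stream, so we can pipe into it
    const file = fs.createReadStream(filePath);
    file.on('error', (err) => {
      request.destroy(err);
      reject(err);
    });
    file.pipe(request);
  });

// Usage, assuming the server from this article is running locally:
// sendFile('./photo.jpg', 'http://localhost:3000/').then(console.log);
```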
If that was the functionality you were looking for, you could stop reading here. If you're curious to see what else a stream has in store, read on.
Use stream events and methods
Streams, after being created, emit events. In the code above, we're using the 'open' event to only pipe data from the request to its destination after the stream is opened. You register callbacks for these events much like you register handlers with app.use(), and they run through Node's event loop. Let's now take a look at some of the events that can be used to control the code flow.
Event 'open'
As soon as the stream has opened its file and is ready to do its job, it fires the 'open' event. That is the perfect opportunity to start processing data, just as we've done previously.
Event 'drain'
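A writable stream buffers incoming chunks internally. Once that buffer has filled up and has then been flushed ('drained') to its destination, the stream fires 'drain' to signal that it is ready for more data. You can use this event to, for example, monitor how many bytes have been written so far. For small uploads that never fill the buffer, it may not fire at all.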
Event 'close'
After all data has been sent, the stream closes. A simple use case for 'close' is to notify a calling function that the file has been completely processed and can be considered available for further operations.
Event 'error'
If things go sideways, the 'error' event lets you handle the failure, for example by logging it and sending an error response.
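To see 'drain' in isolation, consider this sketch. It assumes a deliberately tiny internal buffer (a highWaterMark of 16 bytes) so that a single write fills it. The file name and the demoDrain helper are made up for this demo.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical target file in the OS temp directory
const target = path.join(os.tmpdir(), 'drain-demo.bin');

const demoDrain = (file) =>
  new Promise((resolve, reject) => {
    // A tiny highWaterMark (16 bytes) makes the buffer fill up immediately
    const stream = fs.createWriteStream(file, { highWaterMark: 16 });
    let drained = false;
    stream.on('error', reject);

    // write() returns false once the buffered data reaches the highWaterMark
    const canContinue = stream.write(Buffer.alloc(1024));

    // 'drain' fires after the buffered data has been flushed to the file
    stream.on('drain', () => {
      drained = true;
      stream.end();
    });
    stream.on('close', () => resolve({ canContinue, drained }));
  });

demoDrain(target).then(({ canContinue, drained }) => {
  console.log(`write() returned ${canContinue}, drain fired: ${drained}`);
  // write() returned false, drain fired: true
});
```

When you use pipe(), as we do in the upload route, this pause-and-resume handshake is handled for you.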
Let us now integrate the three new events with some basic features. Add the following to your index.js file, inside the app.post() callback, below the 'open' event handler:
stream.on('drain', () => {
  // Calculate how much data has been written so far
  const written = stream.bytesWritten;
  const total = parseInt(req.headers['content-length'], 10);
  const pWritten = ((written / total) * 100).toFixed(2);
  console.log(`Processing ... ${pWritten}% done`);
});
stream.on('close', () => {
  // Send a success response back to the client
  const msg = `Data uploaded to ${filePath}`;
  console.log('Processing ... 100%');
  console.log(msg);
  res.status(200).send({ status: 'success', msg });
});
stream.on('error', err => {
  // Send an error message to the client
  console.error(err);
  res.status(500).send({ status: 'error', err });
});
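One caveat: the percentage relies on the client sending a Content-Length header, which is not guaranteed (for example with chunked uploads). A defensive variant of the calculation could look like the following sketch. formatProgress is a hypothetical helper of mine, not something the code in this article requires.

```javascript
// Hypothetical helper: returns a progress label for the 'drain' handler
const formatProgress = (written, contentLength) => {
  const total = Number.parseInt(contentLength, 10);
  // Without a valid Content-Length (e.g. chunked uploads), report raw bytes
  if (!Number.isFinite(total) || total <= 0) {
    return `${written} bytes written`;
  }
  const percent = Math.min((written / total) * 100, 100).toFixed(2);
  return `${percent}% done`;
};

console.log(formatProgress(512, '2048')); // 25.00% done
console.log(formatProgress(512, undefined)); // 512 bytes written
```

Inside the 'drain' handler, you would call it as formatProgress(stream.bytesWritten, req.headers['content-length']).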
Wrap up & modularization
Since you probably would not drop your logic right into a .post() callback, let's go ahead and move it into its own function to wrap this article up. I'll spare you the details; you can find the finalized code below.
Also, if you skipped from above, the following is happening here:
- The code below creates an express server that handles incoming post requests.
- When a client sends a file stream to the route, its contents are uploaded.
- During the upload, four events are fired.
- In these, functions are called to process the file's content and provide basic feedback on the upload progress.
Now it's your turn. How about building a user interface that takes over the job of sending a file to the root path? To make it more interesting, try using the browser's FileReader API and send the file asynchronously, instead of using a form. Or use a module like Sharp to process an image before streaming it back to the client.
PS: In case you try the former method, make sure to send the file as an ArrayBuffer.
// Load the necessary modules and define a port
const app = require('express')();
const fs = require('fs');
const path = require('path');
const port = process.env.PORT || 3000;
// Take in the request & filepath, stream the file to the filePath
const uploadFile = (req, filePath) => {
  return new Promise((resolve, reject) => {
    const stream = fs.createWriteStream(filePath);
    // With the open event, data will start being written
    // from the request to the stream's destination path
    stream.on('open', () => {
      console.log('Stream open ... 0.00%');
      req.pipe(stream);
    });
    // Drain is fired whenever the stream's internal buffer has been flushed.
    // When that happens, print how much data has been written so far.
    stream.on('drain', () => {
      const written = stream.bytesWritten;
      const total = parseInt(req.headers['content-length'], 10);
      const pWritten = ((written / total) * 100).toFixed(2);
      console.log(`Processing ... ${pWritten}% done`);
    });
    // When the stream is finished, print a final message
    // Also, resolve the location of the file to the calling function
    stream.on('close', () => {
      console.log('Processing ... 100%');
      resolve(filePath);
    });
    // If something goes wrong, reject the promise
    stream.on('error', err => {
      console.error(err);
      reject(err);
    });
  });
};
// Add a basic get route to check if server's up
app.get('/', (req, res) => {
  res.status(200).send(`Server up and running`);
});
// Add a route to accept incoming post requests for the file upload.
// Also, attach two callback functions to handle the response.
app.post('/', (req, res) => {
  const filePath = path.join(__dirname, `/image.jpg`);
  uploadFile(req, filePath)
    .then(uploadedPath => res.send({ status: 'success', path: uploadedPath }))
    .catch(err => res.status(500).send({ status: 'error', err }));
});
// Mount the app to a port
app.listen(port, () => {
  console.log(`Server running at http://127.0.0.1:${port}/`);
});
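As a possible next step, and not what the code above does: Node also ships a pipeline() helper in the stream/promises module (available in recent Node versions) that pipes streams together and cleans both up if either side fails, for example when a client aborts the upload. A minimal sketch, with uploadWithPipeline being a name I made up, could look like this:

```javascript
const fs = require('fs');
const { pipeline } = require('stream/promises');

// Hypothetical alternative to uploadFile(): pipeline() pipes the source
// into the file and rejects if either side fails, destroying both streams
const uploadWithPipeline = async (source, filePath) => {
  await pipeline(source, fs.createWriteStream(filePath));
  return filePath;
};

// Usage inside the post route, assuming the same filePath as above:
// uploadWithPipeline(req, filePath)
//   .then(uploadedPath => res.send({ status: 'success', path: uploadedPath }))
//   .catch(err => res.status(500).send({ status: 'error', err }));
```

You lose the explicit 'open' and 'drain' hooks this way, so it is a trade-off between progress feedback and less error-handling code.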
This post was originally published at https://q-bit.me/use-node-streams-to-upload-files/
Thank you for reading. If you enjoyed this article, let's stay in touch on Twitter 🐤 @qbitme