Manipulating files and directories is a basic operation for any program. Since Node.js is a server-side platform that can interact directly with the computer it runs on, being able to manipulate files is a basic feature.
Fortunately, Node.js has an fs module built into its library. It has many functions that help with manipulating files and folders. Supported operations include basic ones like opening and manipulating both directories and files.
It can do this both synchronously and asynchronously, and its asynchronous API includes functions that support promises.
It can also show statistics for a file. Almost all the file operations that we can think of can be done with the built-in fs module. In this article, we will create read streams to read a file’s data sequentially and listen to events from a read stream. Since Node.js ReadStreams are descendants of the Readable object, we will also listen to the events they inherit from it.
Streams are collections of data that may not be available all at once and don’t have to fit in memory. This makes streams handy for processing large amounts of data.
They are handy for files because files can be big, and streams let us get a small amount of data at a time. In the fs module, there are 2 kinds of streams: the ReadStream and the WriteStream.
ReadStream
ReadStreams are for reading in data from a file and then outputting it a small part at a time. A ReadStream can read a small part of a file or it can read in the whole file.
To create a ReadStream, we can use the fs.createReadStream function. The function takes 2 arguments. The first argument is the path of the file, which can be a string, a Buffer object, or a URL object.
The second argument is an object that can have a variety of options as properties. The flags option is the file system flag that sets the mode for opening the file. The default flag is 'r'. The list of flags is below:
'a' – Opens a file for appending, which means adding data to the existing file. The file is created if it does not exist.
'ax' – Like 'a' but an exception is thrown if the path exists.
'a+' – Opens a file for reading and appending. The file is created if it doesn’t exist.
'ax+' – Like 'a+' but an exception is thrown if the path exists.
'as' – Opens a file for appending in synchronous mode. The file is created if it does not exist.
'as+' – Opens a file for reading and appending in synchronous mode. The file is created if it does not exist.
'r' – Opens a file for reading. An exception is thrown if the file doesn’t exist.
'r+' – Opens a file for reading and writing. An exception is thrown if the file doesn’t exist.
'rs+' – Opens a file for reading and writing in synchronous mode.
'w' – Opens a file for writing. The file is created (if it does not exist) or overwritten (if it exists).
'wx' – Like 'w' but fails if the path exists.
'w+' – Opens a file for reading and writing. The file is created (if it does not exist) or overwritten (if it exists).
'wx+' – Like 'w+' but an exception is thrown if the path exists.
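To see how the flags change what happens when the file is missing, here is a minimal sketch. The helper name openErrorCode and the temporary file names are made up for this illustration. It assumes a fresh temporary directory, so missing.txt is guaranteed not to exist at first.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: tries to open `file` with the given flags and resolves
// with the error code if opening fails, or null if the file could be opened.
function openErrorCode(file, flags) {
  return new Promise(resolve => {
    const stream = fs.createReadStream(file, { flags });
    stream.on("error", err => resolve(err.code));
    stream.on("open", () => {
      stream.destroy(); // we only care whether the file could be opened
      resolve(null);
    });
  });
}

// A fresh temporary directory guarantees that missing.txt does not exist yet.
const flagsDir = fs.mkdtempSync(path.join(os.tmpdir(), "flags-"));
const missingFile = path.join(flagsDir, "missing.txt");

openErrorCode(missingFile, "r")
  .then(code => {
    console.log("'r' on a missing file:", code);
    // 'a+' creates the file if needed, so opening should succeed here.
    return openErrorCode(missingFile, "a+");
  })
  .then(code => console.log("'a+' on a missing file:", code));
```

With 'r', the failure arrives through the stream’s error event rather than as a thrown exception, because the file is opened asynchronously.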
The encoding option is a string that sets the character encoding used to decode the data, such as 'utf8'. The default value is null, which means the chunks are delivered as Buffer objects.
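The effect of the encoding option can be sketched as follows. The helper firstChunk and the temporary file are assumptions made for this example only.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: resolves with the first chunk that the stream emits.
function firstChunk(file, encoding = null) {
  return new Promise((resolve, reject) => {
    const stream = fs.createReadStream(file, { encoding });
    stream.once("data", chunk => {
      stream.destroy(); // one chunk is enough for this demo
      resolve(chunk);
    });
    stream.on("error", reject);
  });
}

// Throwaway file in a fresh temporary directory.
const encodingDir = fs.mkdtempSync(path.join(os.tmpdir(), "encoding-"));
const encodingFile = path.join(encodingDir, "file.txt");
fs.writeFileSync(encodingFile, "datadatadatadata");

firstChunk(encodingFile).then(chunk => {
  console.log("no encoding, is Buffer:", Buffer.isBuffer(chunk));
});
firstChunk(encodingFile, "utf8").then(chunk => {
  console.log("utf8 encoding, type:", typeof chunk);
});
```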
The fd option is an integer file descriptor, which can be obtained with the fs.open function and its variants. If the fd option is set, then the path argument will be ignored. The default value is null.
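As a rough illustration of the fd option, the sketch below opens the file with fs.openSync and hands the descriptor to createReadStream. The helper readViaFd and the temporary file are hypothetical names for this example.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: reads a whole file through an existing file descriptor.
function readViaFd(file) {
  return new Promise((resolve, reject) => {
    const fd = fs.openSync(file, "r");
    // The first argument is ignored because the fd option is set.
    const stream = fs.createReadStream(file, { fd, encoding: "utf8" });
    let text = "";
    stream.on("data", chunk => {
      text += chunk;
    });
    // autoClose defaults to true, so the descriptor is closed for us.
    stream.on("end", () => resolve(text));
    stream.on("error", reject);
  });
}

// Throwaway file in a fresh temporary directory.
const fdDir = fs.mkdtempSync(path.join(os.tmpdir(), "fd-"));
const fdFile = path.join(fdDir, "file.txt");
fs.writeFileSync(fdFile, "datadatadatadata");

readViaFd(fdFile).then(text => console.log(text));
```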
The mode option sets the file permission and sticky bits of the file, as an octal number with the same meaning as Unix or Linux file permissions. It’s only used if the file is created. The default value is 0o666. The autoClose option specifies whether the file descriptor will be closed automatically. The default value is true.
If it’s false, then the file descriptor won’t be closed even if there’s an error. It’s completely up to us to close it if autoClose is set to false, to make sure there’s no file descriptor leak. Otherwise, the file descriptor will be closed automatically when an error or end event is emitted.
The emitClose option controls whether the stream emits the close event after it has been destroyed. The default value was false in older Node.js releases; since Node.js 14 it is true.
The start and end options specify the range of bytes to read from the file instead of the whole file. Both are byte offsets that count from 0, and both are inclusive, so the bytes at start and end are read along with everything in between.
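A short sketch of reading a byte range looks like this. The helper readRange and the sample file content are assumptions for the example.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: reads only the bytes from `start` to `end`, inclusive.
function readRange(file, start, end) {
  return new Promise((resolve, reject) => {
    const stream = fs.createReadStream(file, { encoding: "utf8", start, end });
    let text = "";
    stream.on("data", chunk => {
      text += chunk;
    });
    stream.on("end", () => resolve(text));
    stream.on("error", reject);
  });
}

// Throwaway file whose content makes the byte offsets easy to see.
const rangeDir = fs.mkdtempSync(path.join(os.tmpdir(), "range-"));
const rangeFile = path.join(rangeDir, "digits.txt");
fs.writeFileSync(rangeFile, "0123456789");

// Bytes 2, 3, 4 and 5 are read, so this prints "2345".
readRange(rangeFile, 2, 5).then(text => console.log(text));
```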
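The highWaterMark option is the maximum number of bytes that the stream buffers internally before it stops reading from the file until the buffered data is consumed. For streams created with fs.createReadStream, the default is 64 KiB. It is a threshold rather than a hard limit: if data keeps being pushed into a stream faster than it is consumed, for example when backpressure is ignored, the buffer keeps growing, memory usage will be high and the garbage collection performance will be poor, or it can crash the program with the Allocation failed - JavaScript heap out of memory error.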
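To get a feel for highWaterMark, the following sketch reads a 10-byte file with a deliberately tiny value of 4, so the data arrives in several small chunks. The helper readChunks and the file are hypothetical. The exact chunk boundaries are left unstated because they depend on how the stream fills its buffer.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: collects the chunks emitted for a given highWaterMark.
function readChunks(file, highWaterMark) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    const stream = fs.createReadStream(file, { encoding: "utf8", highWaterMark });
    stream.on("data", chunk => chunks.push(chunk));
    stream.on("end", () => resolve(chunks));
    stream.on("error", reject);
  });
}

// Throwaway 10-byte file in a fresh temporary directory.
const hwmDir = fs.mkdtempSync(path.join(os.tmpdir(), "hwm-"));
const hwmFile = path.join(hwmDir, "digits.txt");
fs.writeFileSync(hwmFile, "0123456789");

readChunks(hwmFile, 4).then(chunks => {
  // No chunk is longer than 4 bytes, so the file arrives in several pieces.
  console.log(chunks.length, "chunks:", chunks);
});
```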
The createReadStream function returns a ReadStream object that we can attach event handlers to.
To create a ReadStream, we can call createReadStream like in the following code:
const fs = require("fs");
const file = "./files/file.txt";
// Open the file for reading and decode its contents as UTF-8 text
const stream = fs.createReadStream(file, {
  flags: "r",
  encoding: "utf8",
  mode: 0o666,
  autoClose: true,
  emitClose: true,
  start: 0
});
stream.on("open", () => {
  console.log("Stream opened");
});
stream.on("ready", () => {
  console.log("Stream ready");
});
stream.on("data", data => {
  console.log(data);
});
stream.on("readable", () => {
  // Declare chunk locally instead of leaking an implicit global
  let chunk;
  while ((chunk = stream.read())) {
    console.log(chunk);
  }
});
stream.on("close", () => {
  console.log("Stream closed");
});
When we run the code above, we should get something like the following output on the screen, assuming that we have ‘datadatadatadata’ written in the file.txt file:
Stream opened
Stream ready
datadatadatadata
datadatadatadata
Stream closed
Stream closed
ReadStream Events
With a ReadStream, we can listen to the following events. The close event is emitted when the ReadStream’s underlying file descriptor has been closed, which happens after the file is read.
The open event is emitted when the file is opened. The file descriptor number fd is passed with the event when it’s emitted. The ready event is emitted when the ReadStream is ready to be used. It’s fired immediately after the open event is fired.
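The order of these events can be observed with a small sketch like the one below. The helper eventOrder is a made-up name, and the example assumes a recent Node.js version and a file small enough to arrive in a single chunk.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: records the names of the events in the order they fire.
function eventOrder(file) {
  return new Promise((resolve, reject) => {
    const seen = [];
    const stream = fs.createReadStream(file, { emitClose: true });
    stream.on("open", fd => {
      console.log("file descriptor:", fd);
      seen.push("open");
    });
    stream.on("ready", () => seen.push("ready"));
    stream.on("data", () => seen.push("data")); // also starts the flow of data
    stream.on("end", () => seen.push("end"));
    stream.on("close", () => {
      seen.push("close");
      resolve(seen);
    });
    stream.on("error", reject);
  });
}

// Throwaway file that fits in one chunk.
const orderDir = fs.mkdtempSync(path.join(os.tmpdir(), "order-"));
const orderFile = path.join(orderDir, "file.txt");
fs.writeFileSync(orderFile, "datadatadatadata");

eventOrder(orderFile).then(seen => console.log(seen.join(" -> ")));
```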
The ReadStream extends the stream Readable object, which emits events of its own. The data event is emitted whenever the stream passes a chunk of data to the consumer. This happens when the stream is switched to flowing mode by calling readable.pipe() or readable.resume(), or by attaching a listener callback to the data event.
The data event will also be emitted when the readable.read() function is called and a chunk of data is available to be returned. The end event is emitted when there’s no more data to be consumed from the stream. It won’t be emitted until the data is completely consumed.
This can be done by switching the stream to flowing mode or by calling stream.read() repeatedly until all the data is consumed.
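Consuming a stream by calling stream.read() repeatedly could look like the sketch below. The helper readAllPaused and the temporary file are assumptions for this example. The end event only fires once read() has drained everything.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: drains the stream with read() and resolves on "end".
function readAllPaused(file) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    const stream = fs.createReadStream(file, { encoding: "utf8" });
    stream.on("readable", () => {
      let chunk;
      // read() returns null once the internal buffer is empty
      while ((chunk = stream.read()) !== null) {
        chunks.push(chunk);
      }
    });
    stream.on("end", () => resolve(chunks.join("")));
    stream.on("error", reject);
  });
}

// Throwaway file in a fresh temporary directory.
const pausedDir = fs.mkdtempSync(path.join(os.tmpdir(), "paused-"));
const pausedFile = path.join(pausedDir, "file.txt");
fs.writeFileSync(pausedFile, "datadatadatadata");

readAllPaused(pausedFile).then(text => console.log(text));
```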
The error event is emitted whenever an error occurs during the streaming or consumption of the stream. It can be because the stream can’t generate data due to an internal failure, or because a stream attempts to push an invalid chunk of data. The pause event is emitted whenever ReadStream.pause() is called and readableFlowing isn’t false.
readableFlowing can have one of 3 states: null, true, or false. When it’s null, this means that no mechanism for consuming the stream’s data is provided and therefore the stream won’t generate data.
When readableFlowing is null, attaching a listener for the 'data' event, calling the readable.pipe() method, or calling the readable.resume() method will switch readable.readableFlowing to true, causing the ReadStream to start emitting events as data is generated.
Calling readable.pause() or readable.unpipe(), or receiving backpressure, which is the situation where data fills the buffer, will cause readable.readableFlowing to be set to false, temporarily halting the flow of events but not halting the generation of data.
Attaching a listener for the 'data' event will not switch readable.readableFlowing to true when readable.readableFlowing is set to false.
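The three states can be checked directly, since readableFlowing changes as soon as we attach a listener or call pause(). This is only a sketch, and the file and variable names are invented for the example.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Throwaway file in a fresh temporary directory.
const flowDir = fs.mkdtempSync(path.join(os.tmpdir(), "flowing-"));
const flowFile = path.join(flowDir, "file.txt");
fs.writeFileSync(flowFile, "datadatadatadata");

const flowStream = fs.createReadStream(flowFile);
const states = [];

states.push(flowStream.readableFlowing); // null: nothing consumes the data yet

flowStream.on("data", () => {});
states.push(flowStream.readableFlowing); // true: a 'data' listener was attached

flowStream.pause();
states.push(flowStream.readableFlowing); // false: the stream was paused

flowStream.on("data", () => {});
states.push(flowStream.readableFlowing); // still false: a new listener does not resume

flowStream.destroy(); // we only wanted to inspect the states
console.log(states); // [ null, true, false, false ]
```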
The readable event is emitted when there’s data available to be read from the stream or the end of the stream has been reached. Attaching an event listener for the readable event may cause some amount of data to be read into an internal buffer.
It will also be emitted when the end of the stream is reached but before the end event is emitted. The resume event is emitted when ReadStream.resume() is called and readableFlowing isn’t true.
A ReadStream object also has the following properties. The bytesRead property lets us get the number of bytes read so far.
The path property is a string or a buffer that gets us the reference to the file. It’s the same as the first argument of createReadStream().
The data type will also be the same as what we pass in as the first argument. The pending property is a boolean which is true if the underlying file hasn’t been opened yet, that is, before the ready event is emitted.
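These properties can be inspected with a sketch along these lines. The helper inspectStream and the temporary file are hypothetical, and the example assumes a Node.js version recent enough to have the pending property.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: samples the properties at different points in time.
function inspectStream(file) {
  return new Promise((resolve, reject) => {
    const stream = fs.createReadStream(file);
    const info = {
      path: stream.path,
      pendingAtStart: stream.pending // the file has not been opened yet
    };
    stream.on("ready", () => {
      info.pendingAfterReady = stream.pending;
    });
    stream.on("end", () => {
      info.bytesRead = stream.bytesRead; // total bytes read from the file
      resolve(info);
    });
    stream.on("error", reject);
    stream.resume(); // consume the data without a 'data' listener
  });
}

// Throwaway file in a fresh temporary directory.
const propsDir = fs.mkdtempSync(path.join(os.tmpdir(), "props-"));
const propsFile = path.join(propsDir, "file.txt");
fs.writeFileSync(propsFile, "datadatadatadata");

inspectStream(propsFile).then(info => console.log(info));
```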
By using the fs.createReadStream function, we created read streams to read a file’s data sequentially and listened to events from a read stream. Since Node.js ReadStreams are descendants of the Readable object, we also listened to the events they inherit from it.
We have lots of control over how the read stream is created. We can set the path or file descriptor of the file. We can also set the flags for opening the file, and the permission and sticky bits used if the file is created.
We can choose whether to close the stream automatically and whether to emit the close event. We can also set the highWaterMark option, which sets the maximum buffer size for storing the read data.
Finally, we can call pipe to move data to a writable stream, pause the streaming of data with the pause function, and resume streaming with the resume function.
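As a closing illustration, piping a ReadStream into a WriteStream might be sketched as follows. The helper copyWithPipe and the file names are assumptions for this example, not part of the fs API.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: copies src to dest by piping a read stream into a
// write stream, and resolves once all data has been flushed.
function copyWithPipe(src, dest) {
  return new Promise((resolve, reject) => {
    const reader = fs.createReadStream(src);
    const writer = fs.createWriteStream(dest);
    reader.on("error", reject);
    writer.on("error", reject);
    writer.on("finish", resolve);
    reader.pipe(writer); // pipe pauses and resumes the reader as needed
  });
}

// Throwaway source file in a fresh temporary directory.
const pipeDir = fs.mkdtempSync(path.join(os.tmpdir(), "pipe-"));
const pipeSrc = path.join(pipeDir, "source.txt");
const pipeDest = path.join(pipeDir, "copy.txt");
fs.writeFileSync(pipeSrc, "datadatadatadata");

copyWithPipe(pipeSrc, pipeDest).then(() => {
  console.log(fs.readFileSync(pipeDest, "utf8"));
});
```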