tutorial - Soulserv Team

Terminal-friendly application with Node.js - Part. III: User Inputs!

Cédric Ronvel — Wed, 13 May 2015 15:28:06 GMT

This is the third part of a series of tutorials about terminal-friendly applications. This one will focus on user inputs!

This tutorial will focus exclusively on the Terminal-kit lib for Node.js. Make sure to npm install terminal-kit before trying the given examples.

Keyboard inputs!

The .grabInput() method turns input grabbing on. When input grabbing is on, the terminal will switch to what is known as the raw mode.

From wikipedia:

Raw mode:

Raw mode is the other of the two character-at-a-time modes. The line discipline performs no line editing, and the control sequences for both line editing functions and the various special characters ("interrupt", "quit", and flow control) are treated as normal character input. Applications programs reading from the terminal receive characters immediately, and receive the entire character stream unaltered, just as it came from the terminal device itself.

The first thing you need to know: your program will not exit when you hit CTRL-C anymore. Instead, you will have to watch for CTRL-C and use process.exit() by yourself.

Turning input grabbing on makes your terminal interface emit key events.

Here is a small example:

var term = require( 'terminal-kit' ).terminal ;

term.grabInput() ;

term.on( 'key' , function( name , matches , data ) {  
    console.log( "'key' event:" , name ) ;

    // Detect CTRL-C and exit 'manually'
    if ( name === 'CTRL_C' ) { process.exit() ; }
} ) ;

The key event is emitted with three arguments:

name string the key name
matches Array of matched key name
data Object contains more information, mostly useful for debugging purposes, where:
- isCharacter boolean is true if this is a regular character, i.e. not a control character
- codepoint number (optional) the Unicode code point of the character, if relevant
- code number or Buffer, for multibyte character it is the raw Buffer input, for single byte character it is a number between 0 and 255

Usually, if the name argument's length is 1, this is a regular character; if it is longer, it is a special key code, like CTRL_C, ENTER, DELETE, TAB, UP, HOME, F1, and so on... But be careful! Some regular characters have a length of 2: any character outside of the Basic Multilingual Plane (rare Chinese, Japanese and Korean ideographs, emoji...) is stored as a surrogate pair in a Javascript string. So you should not rely on the length, and instead always check data.isCharacter if you want to know whether it is a true regular character or not.
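To make the pitfall concrete, here is a minimal sketch. The describeKey() helper is hypothetical (it is not part of Terminal-kit), and the event arguments are hand-made stand-ins for what a 'key' event would provide:

```javascript
// Hypothetical helper, not part of Terminal-kit: classify a 'key' event
// using data.isCharacter rather than the length of the key name.
function describeKey( name , data ) {
    return data && data.isCharacter ? 'character' : 'special key' ;
}

// Hand-made event arguments, for illustration only.
var emoji = '\uD83D\uDE00' ;   // one visible character, stored as a surrogate pair

console.log( emoji.length , 'UP'.length ) ;                     // 2 2
console.log( describeKey( emoji , { isCharacter: true } ) ) ;   // character
console.log( describeKey( 'UP' , { isCharacter: false } ) ) ;   // special key
```

Both names have a length of 2, yet only one of them is a regular character: the length alone cannot tell them apart.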

The full list of special key codes can be found in the Terminal-kit documentation.

Note that there are few issues with the way keys produce inputs in a terminal application that you should be aware of.

The lib should not be blamed for that: this is the way terminals actually work. Keep in mind that there is no keyboard driver passing keys to our application, we are just reading from the Standard Input stream (STDIN). And it is your terminal that pushes bytes into that STDIN stream.

That is why the matches argument contains all matched keys: sometimes, the input stream produces a code that matches many possibilities. E.g. ENTER, KP_ENTER and CTRL_M all produce a 0x0d in STDIN. TAB and CTRL_I both produce 0x09, BACKSPACE usually produces a 0x08 like CTRL_H, ...

When multiple matches happen, Terminal-kit will pass the most useful match as the name argument. That is why ENTER has a greater priority than CTRL_M, TAB a greater priority than CTRL_I, and BACKSPACE a greater priority than CTRL_H.

Actually, every Ctrl-letter combo produces a control character, i.e. one of the 32 lowest ASCII characters. But most of those control characters are useless nowadays, so it is safe to use almost all Ctrl-letter combos, except CTRL_M, CTRL_I and CTRL_H.
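The collisions mentioned above follow from simple arithmetic: traditionally, holding Ctrl keeps only the 5 low bits of the letter's ASCII code. A small sketch (ctrlCode() is a name made up for this example):

```javascript
// A Ctrl-letter combo sends the letter's ASCII code with only its 5 low bits kept.
function ctrlCode( letter ) {
    return letter.toUpperCase().charCodeAt( 0 ) & 0x1f ;
}

[ 'M' , 'I' , 'H' , 'C' ].forEach( function( letter ) {
    console.log( 'CTRL_' + letter , '0x' + ctrlCode( letter ).toString( 16 ).padStart( 2 , '0' ) ) ;
} ) ;
// CTRL_M 0x0d  (same byte as ENTER)
// CTRL_I 0x09  (same byte as TAB)
// CTRL_H 0x08  (same byte as BACKSPACE)
// CTRL_C 0x03
```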

Finally, you should be aware that special keys produce inputs in STDIN that vary greatly from one terminal to another. E.g. there are rarely two terminals that produce the same escape sequences for all F1 to F12 keys. Terminal-kit tries to abstract that away from you, but exotic terminals can still cause some detection troubles. That's because there isn't any standard for that.

Also, some terminals like Gnome-terminal will intercept function keys for their own purposes, e.g. F1 will open the Gnome-terminal help window, F11 will go fullscreen, ALT_F4 will close the window, and your application will never get those intercepted keys. So, the best practice is to bind multiple keys to the same action in your application. If you are going to use function keys, try to bind a function key and its shift or ctrl variants to the same action, e.g. F1, CTRL_F1 and SHIFT_F1: if the terminal intercepts F1, there are chances that SHIFT_F1 will work...

Even better, if it's relevant and you can afford it, allow your users to configure their own key binding.
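One possible way to organize this is a plain binding table, sketched below. The action names, the default table and both helpers are assumptions made up for this example, they are not provided by Terminal-kit:

```javascript
// Hypothetical binding table: several keys trigger the same action, so that
// the action stays reachable when the terminal intercepts one of them.
var defaultBindings = {
    F1: 'help' , SHIFT_F1: 'help' , CTRL_F1: 'help' ,
    CTRL_C: 'quit' , CTRL_Q: 'quit'
} ;

// User-defined bindings override the defaults.
function createBindings( userBindings ) {
    return Object.assign( {} , defaultBindings , userBindings ) ;
}

// Look up the key name first, then any alternative name listed in 'matches'.
function findAction( bindings , name , matches ) {
    var candidates = [ name ].concat( matches || [] ) ;

    for ( var i = 0 ; i < candidates.length ; i ++ ) {
        if ( Object.prototype.hasOwnProperty.call( bindings , candidates[ i ] ) ) {
            return bindings[ candidates[ i ] ] ;
        }
    }

    return null ;
}

var bindings = createBindings( { CTRL_M: 'validate' } ) ;
console.log( findAction( bindings , 'SHIFT_F1' , [ 'SHIFT_F1' ] ) ) ;                     // help
console.log( findAction( bindings , 'ENTER' , [ 'ENTER' , 'KP_ENTER' , 'CTRL_M' ] ) ) ;   // validate
```

Inside a 'key' event handler, the name and matches arguments can be passed to findAction() as they are.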

When you are done with user input, you can turn input grabbing off with .grabInput( false ). The terminal will leave the raw mode and return to the cooked mode.

A bit further: mouse handling!

Terminal-kit supports mouse handling. To turn mouse handling on, simply pass an object of options to .grabInput()!

Example:

var term = require( 'terminal-kit' ).terminal ;

term.grabInput( { mouse: 'button' } ) ;

term.on( 'mouse' , function( name , data ) {  
    console.log( "'mouse' event:" , name , data ) ;
} ) ;

The mouse option can take three values:

'button': report only button-event
'drag': report button-event and report motion-event only when a button is pressed (i.e. a mouse drag)
'motion': report button-event and all motion-event, use it only when needed, many escape sequences are sent from the terminal (e.g. keep that in mind for scripts running over SSH)

The mouse event is emitted with two arguments:

name string the name of the subtype of event
data Object provides the mouse coordinates and keyboard modifiers status, where:
- x number the column number where the mouse is
- y number the row number where the mouse is
- ctrl boolean true if the CTRL key is down
- alt boolean true if the ALT key is down
- shift boolean true if the SHIFT key is down

The argument 'name' can be:

MOUSE_LEFT_BUTTON_PRESSED: well... it is emitted when the left mouse button is pressed
MOUSE_LEFT_BUTTON_RELEASED: when this button is released.
MOUSE_RIGHT_BUTTON_PRESSED, MOUSE_RIGHT_BUTTON_RELEASED, MOUSE_MIDDLE_BUTTON_PRESSED, MOUSE_MIDDLE_BUTTON_RELEASED: self explanatory.
MOUSE_WHEEL_UP, MOUSE_WHEEL_DOWN: self explanatory
MOUSE_OTHER_BUTTON_PRESSED, MOUSE_OTHER_BUTTON_RELEASED: a fourth mouse button is sometimes supported by terminals.
MOUSE_BUTTON_RELEASED: a button was released, however the terminal does not tell us which one.
MOUSE_MOTION: if the option { mouse: 'motion' } is passed to grabInput(), every move of the mouse will fire this event; if { mouse: 'drag' } is given, it will be fired only if the mouse moves while a button is pressed.

Again, there are some issues to be aware of.

Firstly, do not expect all terminals to emit all *_RELEASED subtypes. You should not rely on them, or you should at least have some fallbacks. E.g. Gnome-terminal emits MOUSE_LEFT_BUTTON_RELEASED and MOUSE_RIGHT_BUTTON_RELEASED, but does not emit MOUSE_MIDDLE_BUTTON_RELEASED... don't ask me why... -_-'
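One possible fallback is to remember the last pressed button and to attribute a generic release to it. The tracker below is only a sketch under that assumption (createButtonTracker() is a hypothetical helper, not part of Terminal-kit), fed with hand-written event names:

```javascript
// Hypothetical helper: remembers which button was pressed last, so that a
// generic MOUSE_BUTTON_RELEASED event can still be attributed to a button.
function createButtonTracker() {
    var pressed = null ;

    return function track( name ) {
        var match = /^MOUSE_(LEFT|RIGHT|MIDDLE|OTHER)_BUTTON_(PRESSED|RELEASED)$/.exec( name ) ;

        if ( match && match[ 2 ] === 'PRESSED' ) {
            // Assumption: a new press means any previous button is no longer held
            pressed = match[ 1 ] ;
            return { button: pressed , action: 'pressed' } ;
        }

        if ( match || name === 'MOUSE_BUTTON_RELEASED' ) {
            // Generic release: assume it concerns the last pressed button
            var button = match ? match[ 1 ] : pressed ;
            pressed = null ;
            return { button: button , action: 'released' } ;
        }

        return null ;   // wheel, motion, or unknown subtype
    } ;
}

var track = createButtonTracker() ;
console.log( track( 'MOUSE_MIDDLE_BUTTON_PRESSED' ) ) ;   // { button: 'MIDDLE', action: 'pressed' }
console.log( track( 'MOUSE_BUTTON_RELEASED' ) ) ;         // { button: 'MIDDLE', action: 'released' }
```

When no release is reported at all, the next press simply replaces the remembered button, so the tracker never stays stuck on a stale state.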

Secondly, do not expect all terminals to support the option { mouse: 'motion' }. E.g. the KDE Konsole will only report the MOUSE_MOTION event-subtype when a button is pressed, the same way it works with the { mouse: 'drag' } mode.

Thirdly, some terminals intercept right click to display a context menu. Gnome-terminal used to do that, but it seems that newer versions (at least on my Fedora at time of writing) don't do that anymore when the terminal has switched to raw mode, which was done with .grabInput().

By the way, the good old Xterm works perfectly fine! Outdated UI/UX, but extremely reliable when it comes to raw features support.

The Ultimate Geek Touch: Terminal-kit even supports the mouse in the Linux Console by talking directly with the GPM driver if it is installed on your box. Seriously, I'm quite proud of that, since I almost had to reverse engineer the driver to provide it. Yay, there is no documentation for the GPM driver, so one has to: read the source code, watch inputs and outputs, guess how it works, repeat.

Putting it all together

Here is a small code sample that allows one to write anywhere on the screen, using arrow keys to move while other keys are echoed:

var term = require( 'terminal-kit' ).terminal ;

term.grabInput( { mouse: 'button' } ) ;

term.on( 'key' , function( key , matches , data ) {

    switch ( key )
    {
        case 'UP' : term.up( 1 ) ; break ;
        case 'DOWN' : term.down( 1 ) ; break ;
        case 'LEFT' : term.left( 1 ) ; break ;
        case 'RIGHT' : term.right( 1 ) ; break ;
        case 'CTRL_C' : process.exit() ; break ;
        default:   
            // Echo anything else
            term.noFormat(
                Buffer.isBuffer( data.code ) ?
                    data.code :
                    String.fromCharCode( data.code )
            ) ;
            break ;
    }
} ) ;

term.on( 'mouse' , function( name , data ) {  
    term.moveTo( data.x , data.y ) ;
} ) ;

Misc inputs with the 'terminal' event

The terminal event is a general purpose event for all things coming from your terminal that are not key or mouse events.

The terminal event is emitted with two arguments:

name string the name of the subtype of event
data Object provides some data depending on the event's subtype

The SCREEN_RESIZE subtype is emitted when the terminal is resized by the user. The data argument will contain the width and height properties: the new size of the screen expressed in characters.

Finally, if .grabInput() was called with the { focus: true } option, a terminal event will be emitted with the FOCUS_IN or FOCUS_OUT subtype when the terminal gains or loses focus. Not all terminals support that.
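As an illustration, the three subtypes described above could be folded into an application state object. The onTerminalEvent() function and the shape of the state object are assumptions made up for this sketch:

```javascript
// Hypothetical handler: update an application state object from the
// ( name , data ) arguments of a 'terminal' event.
function onTerminalEvent( state , name , data ) {
    switch ( name ) {
        case 'SCREEN_RESIZE' :
            state.width = data.width ;
            state.height = data.height ;
            break ;
        case 'FOCUS_IN' :
            state.focused = true ;
            break ;
        case 'FOCUS_OUT' :
            state.focused = false ;
            break ;
    }

    return state ;
}

// Hand-made events, so the sketch runs without a real terminal
var state = { width: 80 , height: 24 , focused: true } ;
onTerminalEvent( state , 'SCREEN_RESIZE' , { width: 120 , height: 40 } ) ;
onTerminalEvent( state , 'FOCUS_OUT' , {} ) ;
console.log( state ) ;   // { width: 120, height: 40, focused: false }
```

In a real application it would be called from term.on( 'terminal' , ... ) with the event's own arguments.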

Next time?

So, we have learned many interesting things, but we have not explored all the features Terminal-kit has.

Next time we will learn how to use higher level user-inputs methods, like .inputField().

I hope you enjoyed this tutorial!

Terminal-friendly application with Node.js - Part. II: Moving and Editing

Cédric Ronvel — Mon, 30 Mar 2015 13:11:05 GMT

This tutorial will focus exclusively on the terminal-kit lib for Node.js. Make sure to npm install terminal-kit before trying the given examples.

For those who have already done some C/C++ coding with Ncurses, you may know that there is a saying:

More than often, a programmer dealing with Ncurses will curse.

The goal of terminal-kit is to provide a simple and modern way to interact with the terminal. The design should be simple and intuitive, and yet give maximum power in the hand of the user. Hopefully, you will say goodbye to the good ol' Ncurses' days!

Moving the cursor

That cannot be easier!

To move the cursor relative to its position:

.up( n ): move the cursor up by n cells
.down( n ): move it down by n cells…
.left( n ): …
.right( n ): …
.nextLine(n): move the cursor to beginning of the line, n lines down
.previousLine(n): move the cursor to beginning of the line, n lines up
.column(x): move the cursor to column x

Example:

var term = require( 'terminal-kit' ).terminal ;  
term( "Hello" ) ;  
term.down( 1 ) ;  
term.left( 1 ) ;  
term( "^-- world!\n" ) ;  
/* It writes:
Hello  
    ^-- world!
*/

Also, you can one-line that, this way:

term( "Hello" ).down.left( 1 , 1 , "^-- world!\n" ) ;

Also you can use move() and moveTo() to achieve relative and absolute positioning:

.moveTo(x,y): move the cursor to the (x,y) coordinate (1,1 is the upper-left corner)
.move(x,y): relative move of the cursor
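For the curious, cursor movements boil down to short escape sequences. The sketch below builds the standard (ECMA-48) ones by hand; the cursor object is made up for this example, and the bytes Terminal-kit really emits may differ depending on the detected terminal:

```javascript
// Standard (ECMA-48) cursor control sequences. Terminal-kit may emit
// different bytes depending on the terminal it detects.
var CSI = '\x1b[' ;

var cursor = {
    up: function( n ) { return CSI + n + 'A' ; } ,
    down: function( n ) { return CSI + n + 'B' ; } ,
    right: function( n ) { return CSI + n + 'C' ; } ,
    left: function( n ) { return CSI + n + 'D' ; } ,
    column: function( x ) { return CSI + x + 'G' ; } ,
    // Note the order: the sequence expects the row first, then the column
    moveTo: function( x , y ) { return CSI + y + ';' + x + 'H' ; }
} ;

// Write "Hello", go one cell down and one cell left, then write the rest
process.stdout.write( 'Hello' + cursor.down( 1 ) + cursor.left( 1 ) + '^-- world!\n' ) ;
console.log( JSON.stringify( cursor.moveTo( 1 , 5 ) ) ) ;   // "\u001b[5;1H"
```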

With those methods, you can move it anywhere. Creating an app with a nice layout is almost at our fingertips...

One last couple of functions that can be useful:

.saveCursor(): save cursor position
.restoreCursor(): restore a previously saved cursor position

This is extremely useful in many cases, for example if you want to update a status bar:

var term = require( 'terminal-kit' ).terminal ;

term( "Mike says: " ) ;

term.saveCursor() ;  
term.moveTo.bgWhite.black( 1 , 1 ).eraseLine() ;  
term( "This is a status bar update!" ) ;  
term.white.bgBlack() ;  
term.restoreCursor() ;

term( '"Hey hey hey!"\n' ) ;  
// Cursor is back to its previous position
// Thus producing: Mike says: "Hey hey hey!"

This creates a status bar with a white background and black text color, on the top of the screen. The update code saves the cursor's position and restores it, so it doesn't disturb the main text flow, that's why it still writes Mike says: "Hey hey hey!" as if no cursor movement happened.
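If that save/draw/restore dance comes back often, it may be worth wrapping it. The withSavedCursor() helper below is a hypothetical sketch; it only relies on the .saveCursor() and .restoreCursor() methods, and a stand-in object is used here so that it runs without a real terminal:

```javascript
// Hypothetical helper: run some drawing code, then put the cursor back,
// even if the drawing code throws.
function withSavedCursor( term , draw ) {
    term.saveCursor() ;

    try {
        draw( term ) ;
    }
    finally {
        term.restoreCursor() ;
    }
}

// Stand-in object recording calls, so the sketch runs without a real terminal.
var calls = [] ;
var fakeTerm = {
    saveCursor: function() { calls.push( 'saveCursor' ) ; } ,
    restoreCursor: function() { calls.push( 'restoreCursor' ) ; }
} ;

withSavedCursor( fakeTerm , function() { calls.push( 'draw status bar' ) ; } ) ;
console.log( calls.join( ' > ' ) ) ;   // saveCursor > draw status bar > restoreCursor
```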

Editing the screen

Now that we've learned how to move the cursor, let's see how to edit what is already displayed.

Let's see what we can do:

.clear(): clear the screen and move the cursor to the upper-left corner
.eraseDisplayBelow(): erase everything below the cursor
.eraseDisplayAbove(): erase everything above the cursor
.eraseDisplay(): erase everything
.eraseLineAfter(): erase current line after the cursor
.eraseLineBefore(): erase current line before the cursor
.eraseLine(): erase current line
.insertLine(n): insert n lines
.deleteLine(n): delete n lines
.insert(n): insert n char after (works like INSERT on the keyboard)
.delete(n): delete n char after (works like DELETE on the keyboard)
.backDelete(): delete one char backward (works like BACKSPACE on the keyboard), shorthand composed by a .left(1) followed by a .delete(1)

E.g., move the cursor and then erase anything below it to clean the area:

term.moveTo( 1 , 5 ) ;  
term.eraseDisplayBelow() ;

For all erase-like methods, note that the screen is erased using the current background color! Hence, if we wanted to erase the screen, and paint anything below the cursor with a red background, we can modify the above code this way:

term.moveTo( 1 , 5 ) ;  
term.bgRed() ;  
term.eraseDisplayBelow() ;

This applies to .clear() as well, so term.bgBlue.clear() will erase the whole screen, paint it blue, and move the cursor to the top-left corner.

Also there is the advanced method .fullscreen( true ) (not chainable) that clears the screen, moves the cursor to the top-left corner, and if the terminal supports it, it turns the alternate screen buffer on.

If alternate screen buffer is available, when you invoke .fullscreen( false ), the screen will be restored into the state it was before calling .fullscreen( true ).

This is really a key feature for writing cool terminal application! Now you can code something that behaves like the htop linux command, using the whole terminal display area, and when users quit your app, the command history of their shell will be restored. Neat!

If alternate screen buffer is not supported by your terminal, it will fail gracefully.

Last words

Now the cool factor: you can change your terminal window's title with the method .windowTitle():

term.windowTitle( "My wonderful app" ) ;    // set the title of the window to "My wonderful app"

Hehe ;)

Next time we will focus on input: keyboard and mouse!

I hope you have found this tutorial useful, and will be happy if you drop me a line or two!

Have a nice day!

Terminal-friendly application with Node.js - Part. I: Styles & Colors

Cédric Ronvel — Tue, 24 Mar 2015 15:24:44 GMT

Prologue

Have you ever wanted to make your CLI script shine? While cool scripts expose colors and styles, you are still stuck with black & white boring text? Hey! I've got something for you!

In this series of articles, we will explore how to build a great terminal application: how to move the cursor, edit the screen, handle keyboard input, handle mouse input, and even more goodies!

But let's start with something simpler... Life is really bland without colors, isn't it?... Let's add some spicy colors to our terminal application!

The hard way

You wonder how colors and styles are achieved? You guess that one has to deal with an obscure underlying driver? Actually, you're wrong: you can do that with a simple console.log().

So let's write a red "Hello world!":

console.log( '\x1b[31mHello world!' ) ;

Yes, you guessed it: \x1b[31m is an escape sequence that will be intercepted by your terminal and instructs it to switch to the red color. In fact, \x1b is the code for the non-printable control character escape... you know, that key on the top-left corner of your keyboard?

Eventually, if we want to produce a notice of this kind:
Warning: the error #105 just happened!
... you will have to code this:

console.log( '\x1b[31m\x1b[1mWarning:\x1b[22m \x1b[93mthe error \x1b[4m#105\x1b[24m just happened!\x1b[0m' ) ;

Just copy-paste it to the node's REPL console, to try it out!
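To see what that string is made of, here is the same message rebuilt from named codes. The sgr() helper and the constant names are made up for this sketch; the numbers are the ones used in the sequence above:

```javascript
// Build a "Select Graphic Rendition" escape sequence from numeric codes.
function sgr() {
    return '\x1b[' + Array.prototype.join.call( arguments , ';' ) + 'm' ;
}

var RED = 31 , BRIGHT_YELLOW = 93 , BOLD = 1 , BOLD_OFF = 22 ,
    UNDERLINE = 4 , UNDERLINE_OFF = 24 , RESET = 0 ;

var message =
    sgr( RED ) + sgr( BOLD ) + 'Warning:' + sgr( BOLD_OFF ) + ' ' +
    sgr( BRIGHT_YELLOW ) + 'the error ' +
    sgr( UNDERLINE ) + '#105' + sgr( UNDERLINE_OFF ) +
    ' just happened!' + sgr( RESET ) ;

console.log( message ) ;
```

Several codes can also be merged into a single sequence: sgr( RED , BOLD ) produces \x1b[31;1m.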

Okay, now if you want the full list of what you can achieve with control sequences, have a look at the official xterm's control sequences guide.

Isn't that nice? We don't have to deal with an obscure underlying driver! We just have to deal with obscure escape sequences!

But that's not all: complex sequences are not compatible between all terminals.

Ok, ok... Let's hunt for a good lib then...

Popular libs

Since the goal of this part is just to put some colors and styles to our terminal application, there are two popular packages with a very high download count in the npm's registry: colors and chalk.

Escape sequences dealing only with colors & styles are known as ANSI escape code, they are standardized and therefore they work everywhere. That's the focus of those two packages.

The first package, colors, is probably the oldest package around dealing with colors and styles. At time of writing, it has no less than 3.5 million downloads for the last month.

Some example of the syntax, taken directly from the package's documentation:

var colors = require('colors');

console.log('hello'.green); // outputs green text  
console.log('i like cake and pies'.underline.red) // outputs red underlined text  
console.log('inverse the color'.inverse); // inverses the color  
console.log('OMG Rainbows!'.rainbow); // rainbow  
console.log('Run the trap'.trap); // Drops the bass

This is the old syntax. While it looks really easy to use, note that it extends the native String object, which is a very bad thing.

Due to popular pressure, some of it coming directly from the author of the chalk package itself, colors can now be required without extending native objects... Of course, the syntax differs:

var colors = require('colors/safe');

console.log(colors.green('hello')); // outputs green text  
console.log(colors.red.underline('i like cake and pies')) // outputs red underlined text  
console.log(colors.inverse('inverse the color')); // inverses the color  
console.log(colors.rainbow('OMG Rainbows!')); // rainbow  
console.log(colors.trap('Run the trap')); // Drops the bass

The chalk package has recently surpassed colors: at time of writing, its download count reached 5.7 million.

Again, taken directly from the chalk's documentation:

var chalk = require('chalk');

// style a string 
console.log( chalk.blue('Hello world!') );

// combine styled and normal strings 
console.log( chalk.blue('Hello') + 'World' + chalk.red('!') );

// compose multiple styles using the chainable API 
console.log( chalk.blue.bgRed.bold('Hello world!') );

// pass in multiple arguments 
console.log( chalk.blue('Hello', 'World!', 'Foo', 'bar', 'biz', 'baz') );

// nest styles 
console.log( chalk.red('Hello', chalk.underline.bgBlue('world') + '!') );

// nest styles of the same type even (color, underline, background) 
console.log( chalk.green(  
    'I am a green line ' +
    chalk.blue.underline.bold('with a blue substring') +
    ' that becomes green again!'
) );

The outsider: terminal-kit

It's time to introduce my little gem: terminal-kit.

The basic part of terminal-kit, the part dealing with styles and colors was inspired by chalk. It's easy, you can combine styles and colors, and furthermore, you don't have to use console.log() anymore. Why? Because terminal-kit IS a terminal lib, not just an ANSI's style helper.

Just compare the chalk's way:

console.log( chalk.blue.bold( 'Hello world!' ) ) ;

... and the terminal-kit's way:

term.blue.bold( 'Hello world!' ) ;

Less cumbersome, right?

This is the spaceship demo! After doing an npm install terminal-kit, go to node_modules/terminal-kit/demo/ and run ./spaceship.js to see it alive!

Let's have a look at some of the terminal-kit's primer, from the documentation:

// Require the lib, get a working terminal 
var term = require( 'terminal-kit' ).terminal ;

// The term() function simply output a string to stdout, using current style 
// output "Hello world!" in default terminal's colors 
term( 'Hello world!\n' ) ;

// This output 'red' in red 
term.red( 'red' ) ;

// This output 'bold' in bold 
term.bold( 'bold' ) ;

// output 'mixed' using bold, underlined & red, exposing the style-mixing syntax 
term.bold.underline.red( 'mixed' ) ; 

// printf() style formatting everywhere: this will output 'My name is Jack, I'm 32.' in green
term.green( "My name is %s, I'm %d.\n" , 'Jack' , 32 ) ;

// Width and height of the terminal 
term( 'The terminal size is %dx%d' , term.width , term.height ) ;

// Move the cursor at the upper-left corner 
term.moveTo( 1 , 1 ) ;

// We can always pass additional arguments that will be displayed...
term.moveTo( 1 , 1 , 'Upper-left corner' ) ;

// ... and formatted
term.moveTo( 1 , 1 , "My name is %s, I'm %d.\n" , 'Jack' , 32 ) ;

// ... or even combined with other styles 
term.moveTo.cyan( 1 , 1 , "My name is %s, I'm %d.\n" , 'Jack' , 32  ) ;

Okay, what if you don't want to output a string directly on the terminal, but just put it into a variable like colors and chalk do? Easy: just add the str property into your chainable combo:

var myString = term.red.str( 'Hello world!\n' ) ;

It happens that at least 95% of time, we deal with color only when we are about to output something directly into the terminal, and don't care about storing a string containing obscure escape sequences into some variable... That's why the default behaviour of the lib is to print, not to return.

If you want to output to stderr instead, just add the error property into your chainable combo:

term.red.error( 'Some errors occurs...\n' ) ;

Here is the list of styles the lib supports:

styleReset: reset all styles and go back to default colors
bold: bold text
dim: faint color
italic: italic
underline: underline
blink: blink text, not widely supported
inverse: swap foreground and background colors
hidden: invisible, but can be copy/paste'd
strike: strike through

Note that your terminal may not support all of those features.

And now, here is the list of supported colors: black, red, green, yellow (actually brown or orange, most of the time), blue, magenta, cyan, white, brightBlack (grey, it sounds dumb, but in fact it's the combo of black and bright escape code), brightRed, brightGreen, brightYellow (true yellow), brightBlue, brightMagenta, brightCyan and brightWhite.

If you want to change the background color of characters about to be written, just prefix the above colors with 'bg' (and don't forget the camelCase rule, e.g. cyan becomes bgCyan).

For example, if you want to output 'Hello world!', 'Hello' in bold red over cyan and 'world!' in italic green over magenta for maximum contrast and bleeding eyes, just do that:

term.red.bgCyan.bold( 'Hello ' ).green.bgMagenta.italic('world!\n') ;

Hello world!

Ok, ok... Let's stop making that kind of insane example now...

Finally, if we are not bothered by very old terminals, we can use all of the 256 colors. Most terminals support them.

.color256(register): it chooses between one of the 256 colors directly
.colorRgb(r,g,b): pick the closest match for an RGB value (from a 16 or 256 colors palette or even the exact color if the terminal supports 24-bit colors), r,g,b are in the 0..255 range
.colorGrayscale(l): pick the closest match for a grayscale value (from a 16 or 256 colors palette or even the exact color if the terminal supports 24-bit colors), l is in the 0..255 range

Example:

term.colorRgb( 0x33 , 0xff , 0x88 , "This is teal!" )

They all have their bg* counter-part, e.g.:

term.bgColorRgb( 0x33 , 0xff , 0x88 , "The background color is teal!" )
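To give an idea of what "picking a match" in the 256 colors palette means, here is a deliberately naive sketch. It maps an RGB value onto the 6x6x6 color cube of the xterm palette (registers 16 to 231) with a linear rounding; this is an approximation made up for illustration, not the algorithm used by terminal-kit:

```javascript
// Naive RGB -> xterm 256-color mapping onto the 6x6x6 color cube (registers 16..231).
// This linear rounding is an approximation for illustration only.
function rgbToRegister( r , g , b ) {
    function level( value ) { return Math.round( value / 255 * 5 ) ; }
    return 16 + 36 * level( r ) + 6 * level( g ) + level( b ) ;
}

// Foreground color escape sequence for a given register
function fg256( register ) {
    return '\x1b[38;5;' + register + 'm' ;
}

console.log( rgbToRegister( 255 , 0 , 0 ) ) ;          // 196
console.log( rgbToRegister( 0x33 , 0xff , 0x88 ) ) ;   // 85
console.log( fg256( 85 ) + 'Register 85' + '\x1b[0m' ) ;
```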

It's worth noting that providing some text to a style method will apply the color *ONLY* for the given text, while calling a style method without text will turn the current style on, until another style contradicts that.

E.g:

term.red( "Red!\n" ) ;  
term( "This is white\n" ) ;

term.red()( "Red!\n" ) ;  // Note the parentheses
term( "This is red\n" ) ;

In practice, it's always good to define some handy shortcuts. For example, one may want to define a standard way to output errors. Putting it all together:

var logError = require( 'terminal-kit' ).terminal.error.bold.red ;

// Later...

logError( 'Oh my! A bad thing happens!\n' ) ;

I hope you liked this little tutorial, next time we will learn how to move the cursor and edit the screen.

See ya! ;)

Understanding Object Cloning in Javascript - Part. II

Cédric Ronvel — Mon, 12 Jan 2015 14:42:09 GMT

Shallow copy vs deep copy

A shallow copy will clone the top-level object, but nested objects are shared between the original and the clone. That is: an object reference is copied, not cloned. So if the original object contains nested objects, then the clone will not be a distinct entity.

A deep copy will recursively clone every object it encounters. The clone and the original object will not share anything, so the clone will be a fully distinct entity.

Shallow copies are faster than deep copies.

When it is ok to share some data, you may use a shallow copy. There are even use cases where it is the best way to do the job. But whenever you need to clone a deep and complex data structure, a tree, you will have to perform a deep copy. Keep in mind that on really big trees, it can be an expensive operation.
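A quick way to see the sharing in action, using Object.assign() as a stand-in for a shallow copy function (it only copies own enumerable properties, which is enough for this sketch):

```javascript
var original = { name: 'original' , nested: { value: 1 } } ;

// Object.assign() copies own enumerable properties: a shallow copy
var shallow = Object.assign( {} , original ) ;

shallow.name = 'copy' ;        // top-level property: independent
shallow.nested.value = 2 ;     // nested object: shared with the original

console.log( original.name ) ;                        // original
console.log( original.nested.value ) ;                // 2
console.log( shallow.nested === original.nested ) ;   // true
```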

How to perform a deep copy of an object in Javascript

Okay, so let's modify the shallowCopy() function of the previous article.

We need to detect properties containing objects, and recursively call the new naiveDeepCopy() function on them.

Here is the result:

function naiveDeepCopy( original )  
{
    // First create an empty object with
    // same prototype of our original source
    var clone = Object.create( Object.getPrototypeOf( original ) ) ;

    var i , descriptor , keys = Object.getOwnPropertyNames( original ) ;

    for ( i = 0 ; i < keys.length ; i ++ )
    {
        // Save the source's descriptor
        descriptor = Object.getOwnPropertyDescriptor( original , keys[ i ] ) ;

        if ( descriptor.value && typeof descriptor.value === 'object' )        
        {
            // If the value is an object, recursively naiveDeepCopy() it
            descriptor.value = naiveDeepCopy( descriptor.value ) ;
        }

        Object.defineProperty( clone , keys[ i ] , descriptor ) ;
    }

    return clone ;
}

By the way, if the property is a getter/setter, then descriptor.value will be undefined, so we won't perform recursion on them, and that's what we want. We actually don't care if the getter returns an object or not.
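This can be checked directly on a small object holding one plain property and one getter (the property names are made up for this example):

```javascript
var sample = {
    plain: { nested: true } ,
    get computed() { return { nested: true } ; }
} ;

var plainDescriptor = Object.getOwnPropertyDescriptor( sample , 'plain' ) ;
var accessorDescriptor = Object.getOwnPropertyDescriptor( sample , 'computed' ) ;

console.log( typeof plainDescriptor.value ) ;    // object
console.log( 'value' in accessorDescriptor ) ;   // false
console.log( accessorDescriptor.value ) ;        // undefined
console.log( typeof accessorDescriptor.get ) ;   // function
```

The accessor descriptor carries get and set instead of value, so the typeof test on descriptor.value naturally skips it.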

There are still unsolved issues:

Circular references will produce a stack overflow
Some native objects like Date or Array do not work properly
Design pattern emulating private members using a closure's scope cannot be truly cloned (e.g. the revealing pattern)
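The native objects issue comes from the very first step of naiveDeepCopy(): an empty object created with Object.create() gets the right prototype, but not the internal machinery of an Array or a Date. A short demonstration:

```javascript
// What naiveDeepCopy() starts from: an empty object with the same prototype.
var fakeArray = Object.create( Array.prototype ) ;
fakeArray[ 0 ] = 'a' ;

console.log( Array.isArray( fakeArray ) ) ;   // false
console.log( fakeArray.length ) ;             // 0: the length is not maintained

var fakeDate = Object.create( Date.prototype ) ;
var failed = false ;

try {
    fakeDate.getTime() ;
}
catch ( error ) {
    // TypeError: the object lacks the internal time value of a real Date
    failed = error instanceof TypeError ;
}

console.log( failed ) ;   // true
```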

What is this circular reference thing?

Let's look at that object:

var o = {  
    a: 'a',
    sub: {
        b: 'b'
    },
    sub2: {
        c: 'c'
    }
} ;

o.loop = o ;  
o.sub.loop = o ;  
o.subcopy = o.sub ;  
o.sub.link = o.sub2 ;  
o.sub2.link = o.sub ;

This object references itself.

That means that o.loop.a = 'Ha!' implies that console.log( o.a ) outputs "Ha!" rather than "a". You remember how object assignment works? o and o.loop simply point to the same object.

However, the naiveDeepCopy() method above does not check that fact and therefore is doomed, recursing through o.loop.loop.loop.loop.loop... until the stack overflows.

That is what is called a circular reference.

Even without the loop property, the original object wants the subcopy and sub properties to point to the same object. Here, again, the naiveDeepCopy() method would produce two different and independent clones.

A good clone method should be able to overcome that.
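One common way to get there is to remember every object already cloned, and to reuse its clone when it shows up again. The function below is a minimal sketch of that idea (deepCopyWithReferences() is a name made up here); it is still recursive, and it still does not handle Array or Date properly:

```javascript
// Recursive deep copy that remembers already cloned objects in a Map,
// so circular and shared references are reconnected instead of re-cloned.
function deepCopyWithReferences( original , seen ) {
    seen = seen || new Map() ;

    if ( seen.has( original ) ) { return seen.get( original ) ; }

    var clone = Object.create( Object.getPrototypeOf( original ) ) ;
    seen.set( original , clone ) ;   // register before recursing

    Object.getOwnPropertyNames( original ).forEach( function( key ) {
        var descriptor = Object.getOwnPropertyDescriptor( original , key ) ;

        if ( descriptor.value && typeof descriptor.value === 'object' ) {
            descriptor.value = deepCopyWithReferences( descriptor.value , seen ) ;
        }

        Object.defineProperty( clone , key , descriptor ) ;
    } ) ;

    return clone ;
}

var source = { a: 'a' , sub: { b: 'b' } } ;
source.loop = source ;
source.subcopy = source.sub ;

var copy = deepCopyWithReferences( source ) ;
console.log( copy !== source && copy.loop === copy ) ;   // true
console.log( copy.subcopy === copy.sub ) ;               // true
console.log( copy.sub !== source.sub ) ;                 // true
```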

The closure's scope hell

Okey, let's examine this code:

function myConstructor()  
{
    var myPrivateVar = 'secret' ;

    return {
        myPublicVar: 'public!' ,
        getMyPrivateVar: function() {
            return myPrivateVar ;
        } ,
        setMyPrivateVar: function( value ) {
            myPrivateVar = value.toString() ;
        }
    } ;
}

var o = myConstructor() ;

So... o is an object containing three properties, the first is a string, the two others are methods.

The methods are using a variable of the parent scope, the scope of myConstructor(). That variable (named myPrivateVar) is created when the constructor is called; while it is not part of the constructed object in any way, it still remains used by those methods.

Therefore, if we try to clone the object, methods of both the original and the clone will still refer to the same parent's scope variable.

It would not be a problem if this was not a common Javascript's pattern to simulate private members...

As far as I know, there is no way to alter the scope of a closure, so this is a dead-end: patterns using the parent scope cannot be cloned correctly.
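The problem is easy to reproduce. In the sketch below, createCounter() is a hypothetical constructor following the same pattern, and Object.assign() stands in for a clone function:

```javascript
function createCounter() {
    var count = 0 ;   // private: only reachable through the closures below

    return {
        increment: function() { count ++ ; } ,
        getCount: function() { return count ; }
    } ;
}

var original = createCounter() ;
var clone = Object.assign( {} , original ) ;   // copies the two function references

clone.increment() ;

console.log( clone.getCount() ) ;      // 1
console.log( original.getCount() ) ;   // 1: the "private" variable is shared
```

Whatever the cloning technique, the copied functions keep pointing to the scope they were created in.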

Next step: using a library

Okey, so far, we have done a good job hacking Javascript, and it was fun.

Now how about using a ready to use library?

The tree-kit library has a great clone() method, that works in most use cases.

It happens that I'm actually the author of this lib, probably some kind of coincidence! ;)

clone( original , [circular] )

original Object the source object to clone

circular boolean (default to false) if true then circular references are checked and identical objects are reconnected (referenced), if false then nested objects are blindly cloned

It returns a clone of the original object.

How to use it? That's pretty straightforward:

first run the command npm install tree-kit --save into your project directory
then use it like this:

var tree = require( 'tree-kit' ) ;  
var myClone = tree.clone( myOriginal ) ;

... where myOriginal is the object you want to clone.

Some optimization work has been done, so tree.clone() should be able to clone large structures efficiently.

One big step in optimization: removing recursion from the algorithm, it's all taking place in a loop. It avoids stack-overflow and function's call overhead. As a side-effect, depth-first search has been replaced by a breadth-first search algorithm.

Great news: this method is able to detect circular references and reconnect them if the circular option is set to true! Oooh yeah!

If you are interested, you can visit the related source code on github.

If you are that kind of lazy guy, here is the code as of version 0.4.1 (MIT license):

exports.clone = function clone( originalObject , circular )  
{
    // First create an empty object with
    // same prototype of our original source

    var propertyIndex ,
        descriptor ,
        keys ,
        current ,
        nextSource ,
        indexOf ,
        copies = [ {
            source: originalObject ,
            target: Object.create( Object.getPrototypeOf( originalObject ) )
        } ] ,
        cloneObject = copies[ 0 ].target ,
        sourceReferences = [ originalObject ] ,
        targetReferences = [ cloneObject ] ;

    // First in, first out
    while ( current = copies.shift() )
    {
        keys = Object.getOwnPropertyNames( current.source ) ;

        for ( propertyIndex = 0 ; propertyIndex < keys.length ; propertyIndex ++ )
        {
            // Save the source's descriptor
            descriptor = Object.getOwnPropertyDescriptor( current.source , keys[ propertyIndex ] ) ;

            if ( ! descriptor.value || typeof descriptor.value !== 'object' )
            {
                Object.defineProperty( current.target , keys[ propertyIndex ] , descriptor ) ;
                continue ;
            }

            nextSource = descriptor.value ;
            descriptor.value = Array.isArray( nextSource ) ?
                [] :
                Object.create( Object.getPrototypeOf( nextSource ) ) ;

            if ( circular )
            {
                indexOf = sourceReferences.indexOf( nextSource ) ;

                if ( indexOf !== -1 )
                {
                    // The source is already referenced, just assign reference
                    descriptor.value = targetReferences[ indexOf ] ;
                    Object.defineProperty( current.target , keys[ propertyIndex ] , descriptor ) ;
                    continue ;
                }

                sourceReferences.push( nextSource ) ;
                targetReferences.push( descriptor.value ) ;
            }

            Object.defineProperty( current.target , keys[ propertyIndex ] , descriptor ) ;

            copies.push( { source: nextSource , target: descriptor.value } ) ;
        }
    }

    return cloneObject ;
} ;
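
To see what the circular option changes in practice, here is a small usage sketch. So that it runs without installing anything, it relies on miniClone(), a hypothetical condensed stand-in written for this example (it uses a Map instead of the two reference arrays). It is not the library's code, only an illustration of the same breadth-first idea:

```javascript
// Hypothetical condensed stand-in for tree.clone(), for illustration only
function miniClone( original , circular )
{
    var root = Object.create( Object.getPrototypeOf( original ) ) ;
    var seen = new Map( [ [ original , root ] ] ) ;   // source -> target
    var queue = [ { source: original , target: root } ] ;
    var current ;

    // First in, first out: breadth-first traversal, no recursion
    while ( ( current = queue.shift() ) )
    {
        Object.getOwnPropertyNames( current.source ).forEach( function( key ) {
            var descriptor = Object.getOwnPropertyDescriptor( current.source , key ) ;
            var value = descriptor.value ;

            if ( value && typeof value === 'object' )
            {
                if ( circular && seen.has( value ) )
                {
                    // Already cloned: reconnect to the existing copy
                    descriptor.value = seen.get( value ) ;
                }
                else
                {
                    descriptor.value = Array.isArray( value ) ?
                        [] :
                        Object.create( Object.getPrototypeOf( value ) ) ;

                    if ( circular ) { seen.set( value , descriptor.value ) ; }
                    queue.push( { source: value , target: descriptor.value } ) ;
                }
            }

            Object.defineProperty( current.target , key , descriptor ) ;
        } ) ;
    }

    return root ;
}

// Sample data with a cycle: parent -> children -> child -> parent
var parent = { name: 'parent' , children: [] } ;
var child = { name: 'child' , parent: parent } ;
parent.children.push( child ) ;

var copy = miniClone( parent , true ) ;

console.log( copy !== parent ) ;                      // true
console.log( copy.children[ 0 ].parent === copy ) ;   // true
```

The cycle is rebuilt inside the clone rather than pointing back to the original. As far as I can tell, with circular left to false a structure like this one would never finish cloning, since each pass queues the cycle again.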

Happy hacking!

Understanding Object Cloning in Javascript - Part. I

Cédric Ronvel — Sat, 20 Dec 2014 16:47:22 GMT

Prerequisite: Understanding objects assignment in Javascript

As you know, assignment does not copy an object, it only assigns a reference to it. Therefore the following code:

var object = { a: 1, b: 2 } ;  
var copy = object ;  
object.a = 3 ;  
console.log( copy.a ) ;

... will output 3 rather than 1.

The two variables object & copy reference the same object, so whichever variable is used to modify it, you will get the same result.

If you come from a C/C++ background, you should read object.a in Javascript as object->a in C/C++; this helps to understand how copy = object works.

When it comes to objects, a Javascript variable behaves more like a kind of automatic pointer.

There is also a misleading saying commonly used in Javascript: “Objects are passed by reference”.
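
A minimal sketch to check this by hand: the strict equality operator compares object identity, so it tells us whether two variables point to the very same object.

```javascript
var object = { a: 1 , b: 2 } ;
var copy = object ;

// Both names point to one single object
console.log( copy === object ) ;   // true

// A mutation through either name is seen through the other
copy.b = 42 ;
console.log( object.b ) ;   // 42

// Two distinct objects are never identical, even with the same content
console.log( { a: 1 } === { a: 1 } ) ;   // false
```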

That's totally wrong.

If it was true, then the following code:

var object = { a: 1, b: 2 } ;

function fn( ob )  
{
    ob = { c: 3, d: 4 } ;
}

fn( object ) ;  
console.log( object ) ;

... would output { c: 3, d: 4 }, but actually object still references { a: 1, b: 2 }.

So what really happens at the function call?

Nothing unusual: each of the caller's arguments is assigned to the corresponding parameter of the callee, just as if you had manually used the = operator. There is no special case for objects.

When you pass a variable by reference in a language that supports this pass by reference feature, the caller's and callee's variables are identical, as if they were aliases of each other, so assigning to one changes the other.

Here in Javascript, we have two distinct variables that happen to point to the same object... until re-assignment happens.

That's why I prefer to say that a variable, after an object assignment, behaves like a pointer to that object. In a C/C++ fashion, object = { a: 1, b: 2 } should be understood as object = &( { a: 1, b: 2 } ).
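
The difference between mutating through the parameter and re-assigning the parameter can be sketched like this (mutate() and reassign() are names made up for the example):

```javascript
var object = { a: 1 , b: 2 } ;

function mutate( ob )
{
    // 'ob' points to the caller's object: the mutation is visible outside
    ob.a = 3 ;
}

function reassign( ob )
{
    // 'ob' now points to a brand new object, the caller's variable is untouched
    ob = { c: 3 , d: 4 } ;
    return ob ;
}

mutate( object ) ;
console.log( object.a ) ;   // 3

var other = reassign( object ) ;
console.log( object === other ) ;   // false
console.log( object.c ) ;   // undefined
```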

How to perform a shallow copy of an object in Javascript

At the time of writing, Javascript does not have built-in object-cloning facilities.

A quick and dirty way to clone an object would be to create a new empty object, then iterate over the original to copy properties one by one.

This naive function will do the trick:

function naiveShallowCopy( original )  
{
    // First create an empty object
    // that will receive copies of properties
    var clone = {} ;

    var key ;

    for ( key in original )
    {
        // copy each property into the clone
        clone[ key ] = original[ key ] ;
    }

    return clone ;
}

However, there are a few issues with this code:

The clone does not have the same prototype as the original: it is simply an instance of Object.
Inherited enumerable properties of the original (inherited from its prototype) are nonetheless copied into the clone as regular own properties.
Only enumerable properties are copied.
Property descriptors are not copied, e.g. a read-only property in the original will be writable in the clone.
Finally: if a property is an object, it will be shared between the clone and the original, their respective properties will point to the same object.

The 5th point is what makes it a shallow copy: only the surface of the object is cloned, deeper objects are shared.

A variant using Object.keys() can be used if we want to copy only own enumerable properties:
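
Each of these five points can be observed with a short test. The Point constructor and its properties are hypothetical sample data, and naiveShallowCopy() is repeated in a compact form so the snippet runs on its own:

```javascript
// Same logic as the naive function above, repeated to keep this self-contained
function naiveShallowCopy( original )
{
    var clone = {} , key ;
    for ( key in original ) { clone[ key ] = original[ key ] ; }
    return clone ;
}

// Hypothetical sample data
function Point( x , y ) { this.x = x ; this.y = y ; this.tags = [ 'origin' ] ; }
Point.prototype.kind = 'point' ;

var original = new Point( 0 , 0 ) ;
Object.defineProperty( original , 'id' , { value: 7 , enumerable: false } ) ;
Object.defineProperty( original , 'locked' , { value: true , enumerable: true , writable: false } ) ;

var clone = naiveShallowCopy( original ) ;

// 1. The prototype is lost
console.log( clone instanceof Point ) ;   // false

// 2. The inherited property became an own property
console.log( clone.hasOwnProperty( 'kind' ) ) ;   // true

// 3. The non-enumerable property was skipped
console.log( 'id' in clone ) ;   // false

// 4. The read-only property is now writable
console.log( Object.getOwnPropertyDescriptor( clone , 'locked' ).writable ) ;   // true

// 5. The nested array is shared, not copied
console.log( clone.tags === original.tags ) ;   // true
```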

function shallowCopyOfEnumerableOwnProperties( original )  
{
    // First create an empty object
    // that will receive copies of properties
    var clone = {} ;

    var i , keys = Object.keys( original ) ;

    for ( i = 0 ; i < keys.length ; i ++ )
    {
        // copy each property into the clone
        clone[ keys[ i ] ] = original[ keys[ i ] ] ;
    }

    return clone ;
}

If you want to copy non-enumerable properties as well, you can replace Object.keys() with Object.getOwnPropertyNames():

function shallowCopyOfOwnProperties( original )  
{
    // First create an empty object
    // that will receive copies of properties
    var clone = {} ;

    var i , keys = Object.getOwnPropertyNames( original ) ;

    for ( i = 0 ; i < keys.length ; i ++ )
    {
        // copy each property into the clone
        clone[ keys[ i ] ] = original[ keys[ i ] ] ;
    }

    return clone ;
}

Still, non-enumerable properties will become enumerable properties in the clone...
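
A small check of that side effect, using a made-up object with one hidden property (the function is repeated in a compact form so the snippet runs alone):

```javascript
// Same logic as shallowCopyOfOwnProperties() above
function shallowCopyOfOwnProperties( original )
{
    var clone = {} ;
    var i , keys = Object.getOwnPropertyNames( original ) ;
    for ( i = 0 ; i < keys.length ; i ++ ) { clone[ keys[ i ] ] = original[ keys[ i ] ] ; }
    return clone ;
}

// Hypothetical sample data: 'hidden' is non-enumerable
var original = { visible: 1 } ;
Object.defineProperty( original , 'hidden' , { value: 2 , enumerable: false } ) ;

var clone = shallowCopyOfOwnProperties( original ) ;

console.log( Object.keys( original ) ) ;   // [ 'visible' ]
console.log( Object.keys( clone ) ) ;      // [ 'visible', 'hidden' ]
```

The plain assignment clone[ key ] = value always creates an enumerable, writable and configurable property, whatever the original descriptor was.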

We can improve this function using Object.getOwnPropertyDescriptor() & Object.defineProperty(), so descriptors will be cloned properly.

And finally, if we create the clone with Object.create() and use the result of Object.getPrototypeOf( original ) as its argument, we can ensure that the clone will have the correct prototype.

function shallowCopy( original )  
{
    // First create an empty object with
    // same prototype of our original source
    var clone = Object.create( Object.getPrototypeOf( original ) ) ;

    var i , keys = Object.getOwnPropertyNames( original ) ;

    for ( i = 0 ; i < keys.length ; i ++ )
    {
        // copy each property into the clone
        Object.defineProperty( clone , keys[ i ] ,
            Object.getOwnPropertyDescriptor( original , keys[ i ] )
        ) ;
    }

    return clone ;
}

Okay, this is far better.

Next time we will go further: we will see how to perform a deep copy, and inspect issues that cannot be overcome easily.
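
A quick way to convince yourself, with a hypothetical Point constructor as sample data (shallowCopy() is repeated so the snippet is self-contained):

```javascript
// Same logic as shallowCopy() above
function shallowCopy( original )
{
    var clone = Object.create( Object.getPrototypeOf( original ) ) ;
    var i , keys = Object.getOwnPropertyNames( original ) ;

    for ( i = 0 ; i < keys.length ; i ++ )
    {
        Object.defineProperty( clone , keys[ i ] ,
            Object.getOwnPropertyDescriptor( original , keys[ i ] )
        ) ;
    }

    return clone ;
}

// Hypothetical sample data
function Point( x , y ) { this.x = x ; this.y = y ; }

var original = new Point( 1 , 2 ) ;

// Defined without flags: non-enumerable, read-only, non-configurable
Object.defineProperty( original , 'id' , { value: 7 } ) ;
original.tags = [ 'a' ] ;

var clone = shallowCopy( original ) ;

// The prototype is preserved
console.log( clone instanceof Point ) ;   // true

// 'id' is copied, and it is still hidden and read-only
console.log( clone.id ) ;   // 7
console.log( Object.keys( clone ) ) ;   // [ 'x', 'y', 'tags' ]
console.log( Object.getOwnPropertyDescriptor( clone , 'id' ).writable ) ;   // false

// It is still a shallow copy: nested objects are shared
console.log( clone.tags === original.tags ) ;   // true
```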