Better authentication for socket.io (no query strings!)

Introduction

This post describes an authentication method for socket.io that sends the credentials in a message after connection, rather than including them in the query string as is usually done. Note that the implementation is already packaged in the socketio-auth module, so you should use that instead of the code below.

The reason to use this approach is that putting credentials in a query string is generally a bad security practice (see this, this and this). Although some of the usual risks may not apply to the socket.io connection request, it should be avoided because there’s no general convention of treating URLs as sensitive information. Ideally such data should travel in a header, but that doesn’t seem to be an option for socket.io, as not all of the transports it supports (WebSocket being one) allow sending headers.

Needless to say, all of this should be done over HTTPS, otherwise no security level is to be expected.

Implementation

To authenticate socket.io connections, most tutorials suggest doing something like:

io.set('authorization', function (handshakeData, callback) {
  var token = handshakeData.query.token;
  //will call callback(null, true) if authorized
  checkAuthToken(token, callback);
});

Or, with the middleware syntax introduced in socket.io 1.0:

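io.use(function(socket, next) {
  //the query string parameters are exposed on the handshake object
  var token = socket.handshake.query.token;
  checkAuthToken(token, function(err, authorized){
    if (err || !authorized) {
      //return here so next() is not called a second time below
      return next(new Error("not authorized"));
    }
    next();
  });
});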

Then the client would connect to the server passing its credentials, which can be an authorization token, a user and password, or any other value that can be used for authentication:

socket = io.connect('http://localhost', {
  query: "token=" + myAuthToken
});

The problem with this approach is that it puts credential information in a query string, that is, as part of a URL. As mentioned, this is not a good idea since URLs can be logged and cached and are not generally treated as sensitive information.
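To make the concern concrete, here is a minimal sketch of how the token ends up inside the requested URL. The token value and the exact path are assumptions for illustration; a real socket.io request carries additional parameters.

```javascript
// Hypothetical token value, for illustration only.
var myAuthToken = 'secret-token';

// Build the kind of URL the client requests when the `query` option is used.
var handshakeUrl = new URL('http://localhost/socket.io/');
handshakeUrl.searchParams.set('token', myAuthToken);

// Anything that records the URL (access logs, proxies) records the token too.
var tokenIsInUrl = handshakeUrl.href.indexOf(myAuthToken) !== -1;
console.log(handshakeUrl.href, tokenIsInUrl);
```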

My workaround for this was to allow the clients to establish a connection, but force them to send an authentication message before they can actually start emitting and receiving data. Upon connection, the server marks the socket as not authenticated and adds a listener to an ‘authenticate’ event:

var io = require('socket.io').listen(app);

io.on('connection', function(socket){
  socket.auth = false;
  socket.on('authenticate', function(data){
    //check the auth data sent by the client
    checkAuthToken(data.token, function(err, success){
      if (!err && success){
        console.log("Authenticated socket ", socket.id);
        socket.auth = true;
      }
    });
  });

  setTimeout(function(){
    //If the socket didn't authenticate, disconnect it
    if (!socket.auth) {
      console.log("Disconnecting socket ", socket.id);
      socket.disconnect('unauthorized');
    }
  }, 1000);
});

A timeout is added to disconnect the client if it hasn’t authenticated within a second. The client will emit its auth data to the ‘authenticate’ event right after connection:

var socket = io.connect('http://localhost');
socket.on('connect', function(){
  socket.emit('authenticate', {token: myAuthToken});
});
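On the server side, the listener and the timeout can be bundled into a single function, which also makes it possible to cancel the timer once authentication succeeds. This is only a sketch: requireAuthentication and its timeoutMs parameter are hypothetical names (they are not part of socket.io or socketio-auth), and checkAuthToken is the same placeholder used above.

```javascript
// Hypothetical helper bundling the 'authenticate' listener and the disconnect timer.
// `checkAuthToken(token, callback)` is the same placeholder used in the post.
function requireAuthentication(socket, checkAuthToken, timeoutMs) {
  socket.auth = false;

  // Disconnect the socket if it is still unauthenticated when the timer fires.
  var timer = setTimeout(function () {
    if (!socket.auth) {
      socket.disconnect('unauthorized');
    }
  }, timeoutMs);

  socket.on('authenticate', function (data) {
    checkAuthToken(data && data.token, function (err, success) {
      if (!err && success) {
        socket.auth = true;
        clearTimeout(timer); // nothing left for the timer to do
      }
    });
  });

  return timer;
}
```

It would be called from the connection handler, for example requireAuthentication(socket, checkAuthToken, 1000) inside io.on('connection', ...).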

An extra step is required to prevent the client from receiving broadcast messages during that window where it’s connected but not authenticated. Doing that required fiddling a bit with the socket.io namespaces code; the socket is removed from the object that tracks the connections to the namespace:

var _ = require('underscore');
var io = require('socket.io').listen(app);

_.each(io.nsps, function(nsp){
  nsp.on('connect', function(socket){
    if (!socket.auth) {
      console.log("removing socket from", nsp.name)
      delete nsp.connected[socket.id];
    }
  });
});

Then, when the client does authenticate, we set it back as connected to those namespaces where it was connected:

socket.on('authenticate', function(data){
  //check the auth data sent by the client
  checkAuthToken(data.token, function(err, success){
    if (!err && success){
      console.log("Authenticated socket ", socket.id);
      socket.auth = true;

      _.each(io.nsps, function(nsp) {
        if(_.findWhere(nsp.sockets, {id: socket.id})) {
          console.log("restoring socket to", nsp.name);
          nsp.connected[socket.id] = socket;
        }
      });

    }
  });
});
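The bookkeeping in the last two snippets can be tried in isolation with plain objects standing in for the namespaces. This is a sketch under assumptions: hideSocket and restoreSocket are hypothetical helper names, and the objects only mimic the sockets array and connected map used above, not a real socket.io server.

```javascript
// Hypothetical helpers; `nsps` stands in for io.nsps (namespace name -> namespace).
function hideSocket(nsps, socket) {
  // Drop the socket from every namespace's connected map so broadcasts skip it.
  Object.keys(nsps).forEach(function (name) {
    delete nsps[name].connected[socket.id];
  });
}

function restoreSocket(nsps, socket) {
  // Put the socket back only in the namespaces it had actually joined.
  Object.keys(nsps).forEach(function (name) {
    var nsp = nsps[name];
    var joined = nsp.sockets.some(function (s) { return s.id === socket.id; });
    if (joined) {
      nsp.connected[socket.id] = socket;
    }
  });
}

// Plain objects standing in for two namespaces; the socket only joined '/'.
var fakeSocket = { id: 'abc' };
var fakeNsps = {
  '/': { sockets: [fakeSocket], connected: { abc: fakeSocket } },
  '/chat': { sockets: [], connected: {} }
};

hideSocket(fakeNsps, fakeSocket);
var hidden = !('abc' in fakeNsps['/'].connected);

restoreSocket(fakeNsps, fakeSocket);
var restored = fakeNsps['/'].connected.abc === fakeSocket;
```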

45 thoughts on “Better authentication for socket.io (no query strings!)”

  1. I did the exact same thing for Python. A nice way to avoid passing username and passwords would be to embed a security token generated from a login into javascript and use that variable in the connection and then invalidate that security token server side so that the session can not be restarted.

  2. This actually looks like a pretty interesting concept (especially for someone picking up socket.io recently). I have a quick question though: how does the “checkAuthToken()” function have access to the same socket object, where you’re setting socket.auth = false?

    • checkAuthToken is just a function you define somewhere that takes the data sent by the client (the token in this case) and tells if that client is a logged in user (in this case, it checks it’s a valid auth token). If you check the socketio-auth module, you’ll see that’s the work performed by the “authenticate” function you pass in the configuration.

  3. This is pretty neat, I’ve actually used this library. One question though: what if I have multiple namespaces in my app, such as io.of(‘my namespace’), and I want the user to be logged in to all of them?

    • That depends, if by “logged in all of them” you mean you want it to be logged in to access any of the namespaces, then that’s what the library currently does. If you want to separately authenticate the user on each namespace then that would require a bit of rewriting: perhaps set the auth flag at nsp.connected[socket.id] rather than in the global socket, listen to the ‘authenticate’ event in every namespace, etc.

  4. Hey there, maybe I am just looking at it the wrong way but I can’t seem to authenticate. Whenever I tried the emit from the client side, it didn’t trigger the authenticate function on the server side. After a while I figured it out. I was sending JSON to your module like so:
    socket.emit('authentication', {"username": "John", "password": "secret"});
    However, it only accepts the authentication emit if I use your syntax without the quotes for the keys, like so:
    socket.emit('authentication', {username: "John", password: "secret"});
    This triggers the function but doesn’t allow me to grab the username or password as you described:

    authenticate: function (data, callback) {
    //get credentials sent by the client
    var username = data.username;
    var password = data.password;

    console.log(username); //returns undefined

    Is it intended to receive a non-json object? Really confused about that. Can’t seem to get it working like that. As a client i am using the socket.io swift client for the iphone if that helps. Would really appreciate a suggestion on how to get it to work. Thanks and great job so far 🙂

    • I’ve only tested it with the javascript client, which implicitly handles converting a js object to JSON (so the server too gets an object). I guess you need to figure out how to properly send JSON with the swift client.

      If you still need help, please open a Github issue and we can troubleshoot there.

      Thanks!

  5. Hi,
    I liked it, but I am facing an issue: what if I want to reconnect? I am unable to reconnect. The next time I try to send the authentication, it is not accepted. Is there any way to do so? Please help me.
    thanks,
    Shireesh

    • You mean after an authentication error? You need to connect again before sending the authentication again. If you need more help please open an issue in github

  6. This is unnecessarily complicated. You should use HTTPS when authenticating anyway, so the query string approach is perfectly valid and very simple on both the client and the server side.

    • Yes, you HAVE to use HTTPS if you expect any degree of security (and you do, otherwise why bother with authentication, right?). But even with HTTPS, querystrings are not secure enough as stated in the post: the urls can be logged and cached. Check the OAuth specification link, where it warns against putting your auth token in the querystring. More on the subject here and here.

      • About the concerns pointed in the links:
        1. HTTP referrer leakage – since this is xhr request, the url is not in the browser address bar so referrer will not be populated with user or password
        2. Passwords will be stored in server logs – if you are in control of the server (which is true in almost all cases) you could simply not switch them on. If anyone else has control over your server then you couldn’t hide passwords from them anyway, regardless of whether they are sent in the url or POSTed as json
        3. History caches in browsers – I’m not really sure whether xhr requests are going to browser’s history (probably not) – this must be checked

      • You make some interesting points, but I still wouldn’t use the querystring.

        Being in control of the server is something relative: in a lot of cases the guy doing the programming is not the guy managing the server configuration. Taking the logs example, you usually have those enabled by default (which is good, as they’re useful) so someone has to take the responsibility of turning them off.

        The big picture is, I feel, that there is no general convention in treating the url as sensitive data. So when putting credentials there, you need to account for every component that may be touching it. As you pointed, the known vulnerabilities may not apply to xhr requests, but as a developer I’d rather not have to worry about every possible case and just go with the conventional approach.

  7. Thanks for this example.
    My question is why do you need to bother with using the underscore model.
    You can register the client to events only after they are authenticated and that way prevent them from sending or receiving data while they are not. (The client can send data but it will be ignored by socket-io engine).

    • I’m not sure what you mean by “the underscore model”. If you mean why I go over each nsp and manually remove the socket from the connected list, this is necessary to account for the broadcasting of messages.

      If your server code does something like io.emit(“myEvent”, data) then the unauthenticated sockets will receive the message unless you manually delete it from the connected list (or override the emit code somehow to account for those sockets).

  8. Can you please tell me if I’m missing something out – I still don’t understand what kind of “sensitive” information the token is? I mean, if I have a game that users must first login (traditional way) and then connect to socket client, why not use token? The token is issued on login, saved to session and passed to html once the user enters “play” section. Then the socket client sends it in query string to server. The token expires in 30 seconds, which means that the client should handshake in that time. The client also sends user ID, which is validated with hashed token, so no mistakes there. My main point here, is that I won’t give free connection to EVERYONE to my socket server, as if the token is wrong, the handshake will fail. Otherwise, if we use the socket-authorization option, it’s absolutely simple for someone to open 100k connections to my socket server, and just do nothing. They will be disconnected after a short period (because of not authorizing), but will just reconnect, and I can do nothing about that flood!

    • Sorry for spamming, but I must also mention, that Facebook also sends their bearer access token over http, as a query string (or at least they used to in a couple of months).

    • In a lot of common scenarios (for example, when using OAuth or something similar to secure access to a REST API), the access token is in a way as sensitive as the user credentials, since it grants access to the user resources that are protected by those credentials. So in terms of data access having the token is somewhat like having the password.

      We’re talking of scenarios where the token hangs around for hours or days (in some cases it doesn’t ever expire). If in your case the token is a one-off that expires after 30 seconds and won’t be used outside the scope of the socket.io connection, then probably there’s no real gain in using the setup described in this post.

      • I like your approach and it’s pretty effective, but my only concern is that someone can spam the server and block real users from connecting (as all connections can be taken by spam bots). Other than that, I fully agree that tokens are somewhat sensitive and not things that need to be saved for days! 🙂
        Thanks for the reply!

  9. This looks much better than the alternative handshake authentication. If the token is on the query string, hackers can see this despite using SSL right?

  10. We can also put something like a flag right. Something that tells if the socket is authenticated before emitting anything else.. But still better than the query string alternative

  11. Is the event listener registration guaranteed to happen before emit? ie, are these in order?:
    server fires “connection” and register “authentication”
    client fires “connect” and emit “authentication”
    server fires “authentication”
    server runs “postAuthenicate”, register user events
    client fires “authenticated”, emit user events

    • I’m not sure I follow your sequence. The order is:

      – client calls connect -> server receives ‘connection’
      – server marks the socket as non authenticated (client won’t receive broadcast messages)
      – server emits ‘connect’ -> client receives ‘connect’
      – client emits ‘authentication’ -> server receives ‘authentication’
      – server marks the socket as authenticated (client will receive broadcast messages)
      – server emits ‘authenticated’ -> client receives ‘authenticated’
      – server calls postAuthenticate

      Hope it helps.

      • > – server emits ‘authenticated’ -> client receives ‘authenticated’
        > – server calls postAuthenticate

        Suppose I have an app event called `foo`, which I should add to a socket after it is authenticated so the client can start receiving user stuff. Does this mean I need to call `socket.on("foo",…)` in the `authenticate` function, rather than in the `postAuthenticate` function? Since it is possible that the client emits “foo” after it receives “authenticated” but before the server calls `postAuthenticate`, it would be too late to call `socket.on("foo",…)` in postAuthenticate.

  12. Doing it in postAuthenticate should be safe since it gets called right after ‘authenticated’ is emitted (so it should be called before any new event from the client arrives).

    Another option you have is to just add the listener socket.on(‘foo’) on connection, and inside check that socket.auth is true (meaning making sure it has authenticated already)
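    That second option could be sketched roughly as below. The whenAuthenticated wrapper is a hypothetical name, not something provided by socket.io or socketio-auth; it only assumes the socket.auth flag described in the post.

```javascript
// Hypothetical wrapper: runs `handler` only once socket.auth has been set to true.
function whenAuthenticated(socket, handler) {
  return function () {
    if (!socket.auth) {
      return; // ignore events coming from unauthenticated sockets
    }
    return handler.apply(socket, arguments);
  };
}
```

    The listener would then be registered on connection with something like socket.on('foo', whenAuthenticated(socket, onFoo)), where onFoo is your own handler.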

  13. Hello, hope you could lend me a helping hand. I am new to node.js and socket.io but I have studied the basics on how to make a real time chat system using the aforementioned real time technologies. My chat system is hosted by apache running in port 80 and my node server runs on another port say 4000. I only used the node server for real time purposes, all the others are handled by apache. Now, I hope you could give me some idea or insights on how to allow only those authenticated users in my system to connect to the node server. Is the idea explained by your article enough to get me started? And sorry for my bad English.

    • This article covers the low level details of a specific way of authenticating. If you’re interested in just using it, I suggest you check the npm module that does the job. Also note this suggests an alternate way of doing authentication to the one proposed by socket.io documentation; If you don’t understand the details it’s probably better to stick to the standard stuff.

  14. > socket.disconnect(‘unauthorized’);

    What does the ‘unauthorized’ string parameter mean? I thought #disconnect took only a boolean parameter as of socket.io v1.4.5:

    Socket.prototype.disconnect = function(close){
      if (!this.connected) return this;
      if (close) {
        this.client.disconnect();
      } else {
        this.packet({ type: parser.DISCONNECT });
        this.onclose('server namespace disconnect');
      }
      return this;
    };

  15. Pingback: Socket.io in iOS | // Zoltan Szabados

  16. Don’t you think the 1 second interval puts load on the CPU (especially with lots of users connected, and a DOS attack could also be made)? I understand your concern about the querystring, but I don’t get why we can’t put this token into a header.

    • I guess the interval shouldn’t be a problem, but you can always configure it to be shorter (if you’re using the socketio-auth package). About the DOS attack, I’m no security expert but I imagine that you can take measures against it before the websockets come into play.

      About putting the token in the header, that’s indeed the best idea, but as mentioned in the article, socket.io doesn’t allow that since some of its transports don’t support headers.

  17. Socket.io allows headers now. You can set the header on the client and retrieve it in the middleware. You should update this article to reflect this. You could for example, have a sign in process with traditional REST, then return a token. After you have this token, set it in the socket.io request header, then initiate a socket.io connection.

    • I’m wrong. This only works in Node.js sorry. The damned protocol doesn’t support it. What I’m doing myself now is I am signing the request hmac sha 256 public private key pair with timestamp on the initial request. I suppose its possible to do repeat attack but the time is very short, like < 5 seconds where the allowed timestamp is ok. I'm using (new Date()).toISOString() to pass the timestamp around and use the same one in the signing.

  18. Hi dude, nice tutorial. I am using MD5 authentication for Socket.io, everything working fine but the Server is receiving the same data multiple times.
