Websockets with Angular, socket.io and Apache

Seeing as I just spent the better part of the afternoon trying to figure this out, I thought I’d write a quick blog post on how I eventually got this working – both for other people struggling with this and for my own archive since I’ll probably forget again and want to look it up next time I do such a project 🙂

socket.io is a pretty awesome implementation of HTML5 websockets with transparent fallbacks (polyfills) for non-supporting browsers. However, its documentation can be…sketchy. For instance, their examples all assume everything is handled by nodeJS (including your HTML pages), which no doubt some people actually do, but if you’re anything like me you’ll prefer to use Apache or nginx or the like in combination with a server-side language like PHP for serving up your HTML pages. That’s not documented, ehm, extensively (to put it mildly), and stuff like authentication isn’t really covered either. So this is my solution – many thanks to Google, Stackoverflow and the rest of the interwebz:

1. The node server

socket.io is in the end a node module, so there’s no escaping that. Get it installed, along with Express (the http module is built into Node, so it doesn’t need installing):

$ npm install --save socket.io express

The most basic server script would look something like this:


let http = require('http');
let express = require('express');

// The Express app is only here so socket.io has an HTTP server to piggyback on.
let app = express();
let server = http.createServer(app);
let io = require('socket.io').listen(server);

io.on('connection', socket => {
    socket.on('someEvent', function () {
        socket.emit('anotherEvent', {msg: 'Hi!'});
    });
});

server.listen(8080);

Port 8080 is arbitrary, anything not already in use will do. I usually go for something in a higher range, but this is good enough for the example.

To wrap your head around this: We have an HTTP server created using an Express app with socket.io plugged in, and it listens on port 8080. We can now run it using node path/to/server.js (in real life you’ll want to use something like PM2 for running it, but that’s not the issue at hand here).
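For completeness, under PM2 that boils down to something like this (the process name is just an example):

$ pm2 start path/to/server.js --name socket-server
$ pm2 logs socket-server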

2. The Angular client

There are a bunch of Angular modules for tying sockets into the Angular “digest flow”. I chose this one: https://github.com/btford/angular-socket-io, but others should also work. The client script will be auto-hosted under /socket.io/socket.io.js (let’s ignore our port 8080 for now), so make sure both that script and the Angular module are loaded in your HTML (in that order, I might add). Adding an Angular factory for your socket is then as simple as


app.factory('Socket', ['socketFactory', socketFactory => {
    if (window.io) {
        let ioSocket = window.io.connect('', {query: ''});
        return socketFactory({ioSocket});
    }
    // I'm sure you can handle this more gracefully:
    return {};
}]);

This uses the defaults (the name Socket is arbitrary, you can call it Gregory for all I care), and sets up the query parameter. This is going to come in handy later on.

That’s really all there is to it; e.g. in your controller you can now do something like this:


export default class Controller {
    constructor(Socket) {
        Socket.on('anotherEvent', data => {
            console.log(data); // {msg: 'Hi!'}
        });
    }
}

Controller.$inject = ['Socket'];
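Note that nothing emits 'someEvent' yet, so the server from part 1 never gets to answer. To complete the round trip, emit it somewhere – for instance right after registering the listener:

        // Trigger the server's 'someEvent' handler; it answers with 'anotherEvent'.
        Socket.emit('someEvent');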

3. Configuring the actual web server

This is the part that mainly had me banging my head. Of course, we could just instruct socket.io to use our “special” port 8080, but that leads to all sorts of problems with proxies, company networks, crappy clients etc. We just want to tunnel everything through port 80 (or 443 for secure sites). An important thing to understand here: A web socket connection is just a regular HTTP connection, but upgraded. That means we can pipe it through regular HTTP ports, as long as the server behind it handles the upgrade.

Apache comes with two modules (well, as of v2.4, but it’s been around for a while) to handle this: mod_proxy and mod_proxy_wstunnel. “ws” stands for “Web Socket” of course, so yay! Only, the documentation didn’t go much farther than “you can turn this on if you need it”. In any case, make sure both modules are enabled.

This is the configuration that finally got it working for me (the actual rules vary slightly between socket.io versions, but this works for 1.3.6):


RewriteEngine On
RewriteCond %{REQUEST_URI}  ^/socket.io/1/websocket  [NC]
RewriteRule /(.*)           ws://localhost:8080/$1 [P,L]

ProxyPass        /socket.io http://localhost:8080/socket.io
ProxyPassReverse /socket.io http://localhost:8080/socket.io

A breakdown:

  1. First, we check if ‘websocket’ is in the REQUEST_URI. socket.io provides fallbacks like JSON polling for older clients – which is cool – and it does that by requesting different URLs and seeing which one works. I think in my version it tries /socket.io/1/xhr-polling next if websockets fail. Anyway, the point is: if Apache gets a request for the websocket URL, we rewrite it to the ws:// scheme (on localhost, which is fine for now – if you’re running a gazillion apps this way you’ll probably want to handle this differently ;)) and proxy it straight through to the node server (the P flag), stopping rewrite processing there (the L flag). If that works, the request was successful and our client supports actual websockets. If not, the rewrite returns an invalid result and we move on to the next rule…
  2. For anything else under /socket.io, we now proxy to localhost:8080 via regular http and let the fallbacks handle it. This rule also makes sure that we can safely serve socket.io.js (since the URL doesn’t contain /websocket, it just gets forwarded).

If you’re using something other than Apache (e.g. nginx) similar rules and rewrites are available, but I’m not experienced enough in those to offer them here 🙂 The principle will be the same though.
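That said, the nginx equivalent you’ll commonly find quoted looks roughly like this – consider it an untested sketch rather than something I can vouch for:

location /socket.io/ {
    proxy_pass http://localhost:8080;
    # Pass the upgrade headers along so the websocket handshake survives the proxy.
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}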

4. Bonus: authentication

I rarely build apps without some form of authentication involved, and I’m guessing I’m not the only one. Regular authentication in a web app is usually something with cookies and sessions, but since socket.io runs in a separate node process that knows nothing about your PHP sessions – and doesn’t handle cookies for you either – this won’t work out of the box. That’s where that ‘query’ option when we connected earlier comes in.

The query parameter is just something that gets appended to the socket.io requests as a regular GET parameter, and which can be read on the server. Exactly how you implement this is up to you, but a very simple (and by the way not extremely secure) option would be to just pass the session ID:


let ioSocket = window.io.connect('', {query: 'session=' + my_session_id});

And then in your node app on the server you can do something like this:


io.on('connection', socket => {
    socket.session = socket.handshake.query.session;
    // perform some validation, perhaps including a query to a database that stores sessions?
});

Server-side languages like PHP store their sessions in flat files on disk by default, so setting the server up to store them in a way that your node script can reach them depends on your platform. At least in PHP it’s pretty trivial to use a database like MySQL for that instead.
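To make that a bit more concrete: assuming your PHP sessions end up in a MySQL table (the table and column names below are made up – adjust them to whatever your session handler actually writes), the node side could validate the handshake with a middleware along these lines:

let mysql = require('mysql');
let db = mysql.createConnection({
    host: 'localhost',
    user: 'myapp',
    password: 'secret',
    database: 'myapp'
});

// Runs for every incoming socket, before the 'connection' event fires.
io.use((socket, next) => {
    let sessionId = socket.handshake.query.session;
    db.query('SELECT data FROM sessions WHERE id = ?', [sessionId], (err, rows) => {
        if (err || !rows.length) {
            return next(new Error('not authenticated'));
        }
        socket.session = rows[0].data;
        next();
    });
});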

And that’s it really: you can now start building a real time application!

On software pricing

I’ve been giving some thought lately to software pricing, and I think I’ve come up with an analogy that works (no, it does not involve cars). The analogy concerns music, since that’s also something I feel comfortable writing about.

The central issue at hand is this: whenever I quote a client for (complicated project), their initial reaction is usually something like “SO MANY ZEROES AFTER THE COMMA???”. Sure, dude, you’ve just asked me to commit to 3 or 6 or whatever months of full-time work, what did you think? Still, for some reason non-technical people tend to think of “programming” as “well, you just change a few things in an Excel file, right?” Ehm, wrong.

Let’s rewrite that in terms of music. Your 50th birthday is coming up – you’re giving an awesome party, and you need a band to play there. Okay, can do. You go to an agency and state your budget. It’s EUR500,- (which is sort-of reasonable for a not-too-awesome band, but really pushing it already). The agency shows you some samples, and one coverband plays a lot of Queen. You happen to like Queen, but this particular singer isn’t a very good Freddie Mercury impersonator.

“You got a better one in the Queen department?” you ask.

“Sure,” says the agency, “we have AwesomeQueenCoverbandAlmostCantDistinguishFromTheRealThing, but they charge ten times as much.” Ouch. Okay, that’s WAY out of your budget.

“Ehm, anything in between?” you ask. “Well,” the agency tells you, “we also book The Kings, who are reasonable but only have two or three Queen songs in their repertoire.”

My point is this: you can’t expect Freddie Mercury to play your birthday bash unless you’re willing to pay top dollar. It’s that simple. You also can’t expect to have a perfect impersonator play there unless you’re willing to pay serious money – not millions, but still a serious amount. These people have skills and are in demand. It’s the same for good programmers – they won’t code your local webshop for peanuts (unless maybe they owe you a favour). These people have serious skills and would rather think about encryption schemes than your petty HTTPS connection. Learning and mastering an instrument takes a lot of time. The same goes for code.

I’ve seen McCartney a few times (which was awesome, by the way) and I’ve also seen good Beatles cover bands. I’m sure Macca got paid more, but the cover bands also didn’t come free. If you don’t care about quality – I’m a crappy singer – I’ll come and play the whole Beatles catalog at your party in exchange for a few beers. Pay peanuts, get monkeys – it’s as simple as that.

Nota bene: this is about custom proprietary software. I’ll write later on what I think is wrong with selling “software packages” like Word, Photoshop etc.

The poor man’s PHP daemon

We have this project lying around from a while back that’s based on PHP/AngularJS and also sports a socket.io server for (among other things) a real time chat. Pretty nifty (no, we didn’t do the design :)). The socket part is powered by a NodeJS process, and since we didn’t feel like (nor did the client have budget for) rewriting all our PHP code to Javascript, we used dNode-PHP (well, actually a fork with a few small project-specific adjustments) to let the Javascript code in the NodeJS process communicate with our existing PHP libraries. So now we had two processes, which was suboptimal but worked at the time.

As a poor man’s solution (time pressure, limited budget, etc.) these processes were simply kept running in a permanent screen on the server. In theory, that worked well enough for the time being – the idea was that once the client found new budget, we’d finally daemonize them properly. And our server doesn’t reboot that often anyway, so remembering to restart the screen and processes on those occasions wasn’t a big deal.

Of course, that day never came.

There was however one annoying issue: PHP’s database resource would go stale after a while. According to the docs it should reconnect automatically, only it didn’t. This meant we had to manually restart those processes every once in a while (we set the MySQL timeout to a week to alleviate the burden, but the exact moment was random, depending on the last moment of activity. Of course, one could argue that a site that’s regularly completely inactive for >7 days isn’t worth the effort, but a) that wasn’t our problem and b) the client was still working on his marketing plan. Fair enough).

Today I got fed up with it, had an hour to spare, and the marketing plan was ready as well. Time to bite the bullet; here’s what I came up with.

1. The NodeJS process

That part was easy; essentially I followed the instructions here. We use Debian rather than Ubuntu, but it should be similar on most *nix systems.

2. The dNode-PHP process

This is where it got interesting, and this was the actual bullet I’d been putting off biting. PHP isn’t very well suited to running as a daemon. It’s possible, but that doesn’t mean it’s desirable, in the same sense that writing out all your CSS in <script>document.write('<style>...<' + '/style>')</script> tags is only a theoretical option. But in this case it was still better than duplicating code in Javascript.

Now, there are ways to turn a PHP script into an actual daemon, but that was still overkill for our purposes, so I simply went with a cronjob that kills any existing process and restarts it every hour (sketched below). If the client ever needs an actual daemon, we’ll get to that then 🙂
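In case it helps anyone, the crontab entry boils down to something like this (the paths and file names are placeholders, obviously):

# Once an hour: kill the previous worker via its pidfile, then start a fresh one in the background.
0 * * * * kill `cat /tmp/dnode-worker.pid 2>/dev/null` 2>/dev/null; nohup php /path/to/dnode-worker.php >/dev/null 2>&1 & echo $! > /tmp/dnode-worker.pid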

Yes, this “daemon” is a beggar, but as they say: beggars can’t be choosers…

Graceful routing fallback in hybrid AngularJS apps

My current favourite way of working with the otherwise-awesome AngularJS framework is to only apply it to those sections of a page that actually need Angular behaviour, and let plain PHP handle the rest (what I call a “hybrid Angular app”, for lack of a better term). While Angular does offer stuff like ngRoute and uiRouter to handle “HTML5-mode routing”, this comes with some other problems, the most notable being search engine indexing. While there are solutions for that, they have their own drawbacks, so today I’d rather just have a regular site with some <element my-awesome-directive/> where needed.

I did however run into a problem: IE<10. (Yes, it’s Explorer again…) IE9 and before don’t support the HTML5 history API of course, so Angular has a fallback using hashed URLs. Fine. I assumed that as long as you’re not actually using routes, Angular would leave your URLs alone.

I was wrong.

For any regular URL /foo/, Angular insists on redirecting to /#/foo/ in older Explorers, which of course will just render the homepage with a useless hash URL since I’m not actually letting Angular handle the routing. Bummer. The solution turned out to be simple enough though: don’t force $location.html5Mode(true) if the browser doesn’t even support it. This will cause your IE<10 to just treat all URLs as regular links, will let Angular handle HTML5 URLs where you do define them (e.g. an image gallery) in supporting browsers, will still allow old IEs to run all other Angular code and thus provides the perfect graceful fallback.

Hence:

angular.module('mythingy', []).config(['$locationProvider', $locationProvider => {
    if (!!(window.history && window.history.pushState)) {
        $locationProvider.html5Mode(true);
    }
}]);

Dashing along

Although I’m a programmer and not a sysadmin, I do know a bit about the latter. Today, a client asked me to install a new SSL certificate for their website which we’re still hosting.

Sure, how hard can it be?

Thing is, we don’t really do hosting anymore so installing SSL certificates is something I’m asked to do about once a year. Because of that, I’m hardly an expert and I’m often required to do a quick Google on stuff like the exact syntax of things.

In this case though, we got the new certificate from the soon-to-be new provider for this client, so it should be just a question of a quick SSH and an /etc/init.d/apache reload, amirite?

Well, it ended up taking me over an hour and a fair amount of head-desking. The server was refusing to reload/start. The error log wasn’t particularly helpful:

[Mon Jun 15 15:59:51.844898 2015] [ssl:emerg] [pid 576] AH02561: Failed to configure certificate www.xxx.com:443:0, check /etc/ssl/private/www.xxx.com.crt
[Mon Jun 15 15:59:51.844980 2015] [ssl:emerg] [pid 576] SSL Library Error: error:0906D06C:PEM routines:PEM_read_bio:no start line (Expecting: CERTIFICATE) -- Bad file contents or format - or even just a forgotten SSLCertificateKeyFile?
[Mon Jun 15 15:59:51.845000 2015] [ssl:emerg] [pid 576] SSL Library Error: error:140AD009:SSL routines:SSL_CTX_use_certificate_file:PEM lib
AH00016: Configuration Failed

Hm, so a borked certificate maybe? Seemed unlikely, but that was what Googlestackoverflow was suggesting when I searched the error. It seemed to be confirmed when manually validating the certificate file:

root@(none):/etc/ssl/private# openssl x509 -hash -noout -in www.xxx.com.crt
unable to load certificate
140591264368272:error:0906D06C:PEM routines:PEM_read_bio:no start line:pem_lib.c:701:Expecting: TRUSTED CERTIFICATE

However, the file looked good to me – Unix line endings, definitely in the right format etc. I also compared it to their previous, now invalid certificate – at first glance, they looked alike (apart, obviously, from the actual certificate data).

Then, just as I was about to give up and put back the old certificate for now, I saw it: for reasons unknown, the .crt began with ----BEGIN CERTIFICATE----- instead of -----BEGIN CERTIFICATE-----. That’s right, one freakin’ dash just cost me an hour of my life. I checked: the error was in the file I’d gotten from the hosting provider. Ouch… but I suppose an easy enough mistake to make when quickly copy/pasting stuff.

So, for anyone else losing their minds over this: the tags are delimited by 5 dashes (and the number of thy counting shall be 5). Not an entirely obvious mistake to spot, so a good one to check. 🙂

Hello Betelgeuse!

Right, so I finally took the time to install blogging software here. Yay. The theme is just something quick and dirty I whipped up – me != designer.

I’ll try to start posting musings on software and tech in general soon.