Thursday, January 22, 2015

Breaking open https://web.whatsapp.com/

Today WhatsApp launched a web version of their hugely popular smartphone messaging app.

I was very interested in seeing the architecture of this app because, as far as I knew, they never stored messages on their servers; all the data was stored only on the user's phone. So I started to look under the hood of the web app, and what I saw was a beauty.

First, let me list the frameworks they used to create this app:


  1. React.js : A JavaScript library for building user interfaces, from Facebook.
  2. Underscore.js : Underscore is a JavaScript library that provides a whole mess of useful functional programming helpers without extending any built-in objects. It’s the answer to the question: “If I sit down in front of a blank HTML page, and want to start being productive immediately, what do I need?” … and the tie to go along with jQuery's tux and Backbone's suspenders.
  3. Velocity.js : Velocity is an animation engine with the same API as jQuery's $.animate(). It works with and without jQuery. It's incredibly fast, and it features color animation, transforms, loops, easings, SVG support, and scrolling. It is the best of jQuery and CSS transitions combined.
These are the major pieces. They are using secure WebSockets to communicate with your phone through their server. I wonder why they didn't use WebRTC's data channel there? Hmm, as I ask this question the answer becomes clear: only Android would have supported that.

They are using Chrome's FileSystem API, which makes their application Chrome-specific. In that case data channels could have been used after all, since this negates the previous argument. I think the reason for not using a WebRTC data channel is to avoid the difficulty of setting up the initial connection, which WebSockets solve by putting a server in between.

They seem to be using Google Material Design principles.


So, I see they have a modified form of XMPP in their chat protocol, and they are forwarding the stanzas which your phone receives to the web client. So, to use the web client, the phone must be on and connected. Every communication that happens on the web client actually goes via your phone. So the web client is just a proxy UI, a remote client for your phone.

What does this mean?
  • More data transfer over the phone. Check your data usage.
  • More battery consumption because of the data transfer.
Though the WhatsApp web client makes our life easier, it does come at a cost.

Going ahead, I will try to see if I can write some sort of Chrome plugin which can get some data out of the Wa JS object and store it on my server. Keeping my fingers crossed. I think I could still write a dumb plugin that parses the HTML to get the data, but let me first attempt a more elegant solution if possible.


Edit: How does the initial handshake take place?


So you open the web client and you see a QR code. How does that happen? What happens in the background?


See the above image; it explains the steps:


  1. It first sent the details of the current client: OS, browser and session ID.
  2. Then it sent the stored session details about the connected client.
  3. It got a 401 Unauthorized response from the server, saying that the current session is logged out and it needs to create a new one.
  4. I think the third frame is the TTL frame.
When you scan the QR code with your phone, the mobile client connects over WebSocket to the same channel as specified by the QR code and then sends its initial info.
See the frames received on the web client below:




The selected frame is the most interesting one, as it has all the data sent from the mobile client to the web client.

s1,["Conn",{"ref":"@4O46ffWiLT9bwxmLw4ilte86YFX6TKe+lCpNmN3J9bPQcc7/1Va2tr86","wid":"919844186612@c.us","connected":true,"isResponse":"false","serverToken":"GBH8CFMxtifHo2C6aFZN52C55HWRpALj+n/H4GQs/27y9okIPaCwIClP/M4rJe0ntzHY/fYAZYsIlnzxcu06qDGvve+o6W++FzHlrZb07SYko6wFcDAy6YRQSOm3w6zf","browserToken":"9qy+sI5tqMlxnbViLjET9gM/tt/wOzB23nHlySDPvn6RGFe6G4vj//ItYnU76gHTBPYx88oulI9ggI55L3XH7kpXarbYT1wNgUCMbUnRigTC7EdnfgUDMIFxbcy+rc/DxLe4pIl7cE9wQZV4V1WFeA==","clientToken":"RUZB+rzOCWc5TlxBgIUut5ligZSxKR99eJmtIxfrpFk=","lc":"US","lg":"en","is24h":true,"secret":"vDHtvSzOHNYp0luwrnnV6ycc50luz2sLCM7SDcCNyBZ+aUIQNHh80s6dePcFhfO1QTkYC2dZT9BmZ/GGCsT4NSbP3zdoZCqB3oLrCNbsi0PHGWx/jxuyr2qgrvpxxCcH+O/gqo5S5N356f6icOpuSjZfZDKZ+DbT28/pfycG0qAFaIZ2sDAorHendgtfVk6y","protoVersion":[0,3],"battery":42,"plugged":false,"platform":"android"}]

As you can see, this frame also includes the battery level and whether the phone is charging.
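
To make the structure of that frame easier to see, here is a minimal Python sketch that splits it apart. It assumes, based only on the single frame shown above, that a frame is a short tag ("s1"), a comma, and then a JSON array whose first element is a message type ("Conn") and whose second is an object of fields. The helper name parse_frame is my own, and the sample frame is a shortened copy with the long token fields left out.

```python
import json


def parse_frame(frame):
    """Split a '<tag>,<json>' frame into its tag and decoded JSON payload.

    Assumption: the tag never contains a comma, so the first comma
    separates it from the JSON body (true for the frame shown above).
    """
    tag, _, payload = frame.partition(",")
    return tag, json.loads(payload)


# Shortened copy of the captured frame; token and secret fields omitted.
frame = (
    's1,["Conn",{"connected":true,"protoVersion":[0,3],'
    '"battery":42,"plugged":false,"platform":"android"}]'
)

tag, (kind, info) = parse_frame(frame)

# The phone reports its battery level and charging state to the web client.
print(tag, kind, info["battery"], info["plugged"], info["platform"])
```

Other frame types may well be laid out differently, so treat this as a way to poke at the "Conn" frame rather than a description of the whole protocol.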
I hope I answered your question.