WebSockets on the ESP32

Sending data between an embedded device and something like a PC can sometimes be frustrating. Communication standards like UART/RS232 are usually used because they are easy to set up, while other standards like USB are difficult to handle and tend to be very complicated. I was playing with the ESP32 and wrote a basic WebSocket server. The advantage of WebSockets is their flexibility, combined with high data rates, low latency and the availability of WebSocket client modules in modern browsers as well as in .NET or Java.

This software is a PROTOTYPE version and is not designed or intended for use in production, especially not for safety-critical applications! The user represents and warrants that it will NOT use or redistribute the Software for such purposes. This prototype is for research purposes only. This software is provided “AS IS,” without a warranty of any kind.

WebSocket?

WebSockets are similar to HTTP connections. When you request a webpage from a server, a TCP connection is established and closed as soon as the content has been transferred from the server to the client (e.g. a browser). The difference between HTTP and WebSockets is that a WebSocket connection remains established, so bidirectional communication becomes possible. The advantage over e.g. AJAX is that there is no handshake overhead for each message, as the connection is already open, and thus the latency is lower.


WebSocket Handshake on the ESP32

The first thing we need is a WebSocket task. It is very similar to an HTTP server but may listen on a different port. You can also listen on port 80 for WebSocket connections, but then you need to distinguish between HTTP and WebSocket requests. I chose to listen on a dedicated port in order to reduce complexity.

void ws_server(void *pvParameters) {
	//connection references
	struct netconn *conn, *newconn;

	//set up new TCP listener
	conn = netconn_new(NETCONN_TCP);
	netconn_bind(conn, NULL, 9998);
	netconn_listen(conn);

	//wait for connections
	while (netconn_accept(conn, &newconn) == ERR_OK)
		ws_server_netconn_serve(newconn);

	//close connection
	netconn_close(conn);
	netconn_delete(conn);

	//a FreeRTOS task function must not return
	vTaskDelete(NULL);
}

This snippet creates a new TCP listener on port 9998. When a connection request arrives, ws_server_netconn_serve is called.

The WebSocket handshake is a little tricky 🙁. The client (e.g. a browser) sends a request which looks like this:

GET / HTTP/1.1
Host: 192.168.4.1:9998
Connection: Upgrade
Pragma: no-cache
Cache-Control: no-cache
Upgrade: websocket
Origin: file://
Sec-WebSocket-Version: 13
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36
Accept-Encoding: gzip, deflate, sdch
Accept-Language: de-DE,de;q=0.8,en-US;q=0.6,en;q=0.4
Sec-WebSocket-Key: Sb0llpkUl572foZxqBOxMw==
Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits

The server has to read Sec-WebSocket-Key, append the magic string “258EAFA5-E914-47DA-95CA-C5AB0DC85B11” to it, take the SHA1 hash of the result, and return the base64-encoded hash to the client:

HTTP/1.1 101 Switching Protocols 
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: FIGtZNMa70eYrHzrhGsJMsSf47w=

Yep… I know… but no worries, we will go through it step by step. Maybe it's now time to get a coffee or a bottle of wine 😉

We need the following strings:

const char WS_sec_WS_keys[]="Sec-WebSocket-Key:";
const char WS_sec_conKey[]="258EAFA5-E914-47DA-95CA-C5AB0DC85B11";
const char WS_srv_hs[]="HTTP/1.1 101 Switching Protocols \r\nUpgrade: websocket\r\nConnection: Upgrade\r\nSec-WebSocket-Accept: %.*s\r\n\r\n";

First there is the header name we are looking for, then our magic string, and finally the handshake response with a placeholder for our calculated hash.

Next we need to allocate some memory for the SHA1 input and the SHA1 result:

//allocate memory for SHA1 input
p_SHA1_Inp=pvPortMallocCaps(WS_CLIENT_KEY_L+sizeof(WS_sec_conKey),MALLOC_CAP_8BIT);

//allocate memory for SHA1 result
p_SHA1_result=pvPortMallocCaps(SHA1_RES_L,MALLOC_CAP_8BIT);

We copy the static magic string into the SHA1 input, right behind the space reserved for the client key, and look for the start of the Sec-WebSocket-Key header in the request:

//write static key into SHA1 Input
for(i=0;i<sizeof(WS_sec_conKey);i++)
 p_SHA1_Inp[i+WS_CLIENT_KEY_L]=WS_sec_conKey[i];

//find Client Sec-WebSocket-Key:
p_buf=strstr(buf, WS_sec_WS_keys);
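The lookup can also be wrapped in a small helper that tolerates optional whitespace after the colon and checks the length of the key before copying it. This is only a sketch to illustrate the idea: ws_find_client_key is a hypothetical name that is not part of the example project, and like strstr it assumes the client writes the header name exactly as "Sec-WebSocket-Key:".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the Sec-WebSocket-Key value from a raw,
 * NUL-terminated request into out.
 * Returns the key length, or -1 if the header is missing or does not fit. */
int ws_find_client_key(const char *request, char *out, size_t out_size) {
    static const char name[] = "Sec-WebSocket-Key:";

    const char *p = strstr(request, name);
    if (p == NULL)
        return -1;

    p += sizeof(name) - 1;                 /* skip the header name */
    while (*p == ' ' || *p == '\t')        /* skip optional whitespace */
        p++;

    size_t len = strcspn(p, " \t\r\n");    /* value ends at whitespace or line end */
    if (len == 0 || len >= out_size)
        return -1;

    memcpy(out, p, len);
    out[len] = '\0';
    return (int)len;
}
```

Called with the request shown earlier, the helper would copy Sb0llpkUl572foZxqBOxMw== into the output buffer.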

If we have found the Sec-WebSocket-Key header, we load the key into the SHA1 input, get the hash from the ESP32 SHA1 engine and base64-encode it:

//check if needle "Sec-WebSocket-Key:" was found
if(p_buf!=NULL){

//get Client Key (sizeof() counts the terminating 0, which skips the single space after the colon)
for(i=0;i<WS_CLIENT_KEY_L;i++)
 p_SHA1_Inp[i]=*(p_buf+sizeof(WS_sec_WS_keys)+i);

// calculate hash
esp_sha(SHA1,(unsigned char*)p_SHA1_Inp,strlen(p_SHA1_Inp),(unsigned char*)p_SHA1_result);

//binary hash to base64
p_buf =(char*)_base64_encode((unsigned char*)p_SHA1_result, SHA1_RES_L,(size_t*)&i);
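To check the calculation on a PC without the ESP32 SHA engine, the same steps can be sketched in portable C. This is a minimal sketch and not the code used in the project: sha1, base64_encode and ws_accept_key are hypothetical helper names, and the SHA-1 routine is a plain textbook implementation that stands in for esp_sha.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define WS_GUID "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rol32(uint32_t x, unsigned n) { return (x << n) | (x >> (32 - n)); }

/* Plain, unoptimized SHA-1 of len bytes; writes the 20-byte digest to out. */
void sha1(const uint8_t *msg, size_t len, uint8_t out[20]) {
    uint32_t h[5] = {0x67452301u, 0xEFCDAB89u, 0x98BADCFEu, 0x10325476u, 0xC3D2E1F0u};
    uint64_t bits = (uint64_t)len * 8;
    size_t total = ((len + 8) / 64 + 1) * 64;  /* length after padding */

    for (size_t off = 0; off < total; off += 64) {
        uint8_t blk[64];
        uint32_t w[80];

        /* build one padded block: message, 0x80, zeros, 64-bit bit count */
        for (size_t i = 0; i < 64; i++) {
            size_t p = off + i;
            if (p < len)             blk[i] = msg[p];
            else if (p == len)       blk[i] = 0x80;
            else if (p >= total - 8) blk[i] = (uint8_t)(bits >> (8 * (total - 1 - p)));
            else                     blk[i] = 0;
        }
        for (int i = 0; i < 16; i++)
            w[i] = (uint32_t)blk[4 * i] << 24 | (uint32_t)blk[4 * i + 1] << 16 |
                   (uint32_t)blk[4 * i + 2] << 8 | blk[4 * i + 3];
        for (int i = 16; i < 80; i++)
            w[i] = rol32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);

        uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
        for (int i = 0; i < 80; i++) {
            uint32_t f, k;
            if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999u; }
            else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1u; }
            else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDCu; }
            else             { f = b ^ c ^ d;                   k = 0xCA62C1D6u; }
            uint32_t t = rol32(a, 5) + f + e + k + w[i];
            e = d; d = c; c = rol32(b, 30); b = a; a = t;
        }
        h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
    }
    for (int i = 0; i < 20; i++)
        out[i] = (uint8_t)(h[i / 4] >> (24 - 8 * (i % 4)));
}

/* Base64-encode len bytes into out (NUL-terminated); returns the text length. */
size_t base64_encode(const uint8_t *in, size_t len, char *out) {
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t o = 0;
    for (size_t i = 0; i < len; i += 3) {
        uint32_t v = (uint32_t)in[i] << 16;
        if (i + 1 < len) v |= (uint32_t)in[i + 1] << 8;
        if (i + 2 < len) v |= in[i + 2];
        out[o++] = tbl[(v >> 18) & 63];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = (i + 1 < len) ? tbl[(v >> 6) & 63] : '=';
        out[o++] = (i + 2 < len) ? tbl[v & 63] : '=';
    }
    out[o] = '\0';
    return o;
}

/* Compute Sec-WebSocket-Accept for a NUL-terminated client key.
 * accept needs room for 28 characters plus the terminator. Returns 0 on success. */
int ws_accept_key(const char *key, char accept[29]) {
    char in[128];
    uint8_t digest[20];
    int n = snprintf(in, sizeof in, "%s%s", key, WS_GUID);
    if (n < 0 || (size_t)n >= sizeof in)
        return -1;
    sha1((const uint8_t *)in, (size_t)n, digest);
    base64_encode(digest, sizeof digest, accept);
    return 0;
}
```

With the sample key from RFC 6455, dGhlIHNhbXBsZSBub25jZQ==, ws_accept_key should produce s3pPLMBiTxaQ9kYGzzhZRbK+xOo=, which makes a convenient test vector for the ESP32 code as well.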

Now that we have the SHA1 result, we can send the handshake response:

//allocate memory for handshake
p_payload = pvPortMallocCaps(sizeof(WS_srv_hs)+i-WS_SPRINTF_ARG_L,MALLOC_CAP_8BIT);

//check if malloc succeeded
if(p_payload!=NULL){

 //prepare handshake
 sprintf(p_payload,WS_srv_hs, i-1, p_buf);

 //send handshake
 netconn_write(conn, p_payload, strlen(p_payload), NETCONN_COPY);
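The response can also be formatted with snprintf, which reports whether the output fits into the buffer, so the buffer size no longer has to be derived from the format string by hand. A minimal sketch, assuming the accept value is already a NUL-terminated string; ws_build_handshake is a hypothetical name and not part of the example project.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write the 101 response for the given
 * Sec-WebSocket-Accept value into out.
 * Returns the response length, or -1 if out is too small. */
int ws_build_handshake(char *out, size_t out_size, const char *accept) {
    int n = snprintf(out, out_size,
                     "HTTP/1.1 101 Switching Protocols\r\n"
                     "Upgrade: websocket\r\n"
                     "Connection: Upgrade\r\n"
                     "Sec-WebSocket-Accept: %s\r\n\r\n",
                     accept);

    /* snprintf returns the length it wanted to write, even when truncated */
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return n;
}
```

The returned length can be passed directly to netconn_write instead of calling strlen on the buffer.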

The connection is now open and we can wait for incoming messages:

//Wait for new data
while(netconn_recv(conn, &inbuf)==ERR_OK){
 
 //read data from inbuf
 netbuf_data(inbuf, (void**) &buf, &i);

 

Receive WebSocket frames

The format of WebSocket frames is defined in RFC 6455, section 5.2. I created a structure to help “parsing” WebSocket frames:

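//bit-field order is compiler-specific; this layout assumes LSB-first allocation (as GCC does on the ESP32)
typedef struct{
	uint8_t		opcode:4;		//frame type (text, binary, close, ...)
	uint8_t		reserved:3;		//RSV1-3, must be 0 without extensions
	uint8_t		FIN:1;			//1 = final fragment of a message
	uint8_t		payload_length:7;	//0-125 = length, 126/127 = extended length follows
	uint8_t		mask:1;			//1 = payload is masked
}WS_frame_header_t ;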

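For my application, payloads that fit into the 7-bit length field are sufficient, so only frames with a payload length of at most 125 bytes are handled. The values 126 and 127 are reserved to signal an extended length field.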
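If larger frames are needed, or if the code should not depend on how the compiler lays out bit-fields, the header can be decoded byte by byte. The following is a sketch of that alternative and not the method used in the example project: ws_header_info_t, ws_parse_header and ws_unmask are hypothetical names. It also handles the 16-bit and 64-bit extended lengths and the 4-byte masking key that clients put in front of the payload.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t  fin;          /* 1 if this is the final fragment */
    uint8_t  opcode;       /* 0x1 text, 0x2 binary, 0x8 close, ... */
    uint8_t  masked;       /* 1 if a 4-byte masking key precedes the payload */
    uint64_t payload_len;  /* decoded payload length in bytes */
    size_t   header_len;   /* bytes before the payload, masking key included */
} ws_header_info_t;

/* Decode a frame header from buf. Returns 0 on success, -1 if buf is too short. */
int ws_parse_header(const uint8_t *buf, size_t len, ws_header_info_t *h) {
    if (len < 2)
        return -1;

    h->fin    = buf[0] >> 7;
    h->opcode = buf[0] & 0x0F;
    h->masked = buf[1] >> 7;

    uint64_t n = buf[1] & 0x7F;
    size_t pos = 2;

    /* 126: a 16-bit length follows, 127: a 64-bit length follows (big-endian) */
    if (n == 126 || n == 127) {
        size_t ext = (n == 126) ? 2 : 8;
        if (len < pos + ext)
            return -1;
        n = 0;
        for (size_t i = 0; i < ext; i++)
            n = (n << 8) | buf[pos + i];
        pos += ext;
    }

    /* the masking key sits directly in front of the payload */
    if (h->masked) {
        if (len < pos + 4)
            return -1;
        pos += 4;
    }

    h->payload_len = n;
    h->header_len  = pos;
    return 0;
}

/* XOR the payload in place with the 4-byte masking key. */
void ws_unmask(uint8_t *payload, size_t len, const uint8_t key[4]) {
    for (size_t i = 0; i < len; i++)
        payload[i] ^= key[i % 4];
}
```

For a masked frame the key is found at buf + header_len - 4 and the payload starts at buf + header_len.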

The received frame is cast to the frame header structure. We check whether the client wants to close the connection, then extract the payload mask and unmask the payload. Once we have the payload, I call the function WS_process_in_data, which (in this example) loops the frame back:

//get pointer to header
p_frame_hdr=(WS_frame_header_t*)buf;

//check if clients wants to close the connection
if(p_frame_hdr->opcode==WS_OP_CLS)
	break;

//get payload length
if(p_frame_hdr->payload_length<=WS_STD_LEN){

	//get beginning of mask or payload
	p_buf=(char*)&buf[sizeof(WS_frame_header_t)];

	//check if content is masked
	if(p_frame_hdr->mask){

		//allocate memory for decoded message
		p_payload = pvPortMallocCaps(p_frame_hdr->payload_length+1,MALLOC_CAP_8BIT);

		//check if malloc succeeded
		if(p_payload!=NULL){

			//decode payload
			for (i = 0; i < p_frame_hdr->payload_length; i++)
				p_payload[i] = (p_buf+WS_MASK_L)[i] ^ p_buf[i % WS_MASK_L];

			//add 0 terminator
			p_payload[p_frame_hdr->payload_length]=0;
		}
	}
	else
		//content is not masked
		p_payload=p_buf;

	//do stuff
	if((p_payload!=NULL)&&(p_frame_hdr->opcode==WS_OP_TXT)){
		WS_process_in_data(p_payload,p_frame_hdr->payload_length);
	}

	//free payload buffer
	if(p_frame_hdr->mask&&p_payload!=NULL)
		free(p_payload);

}//p_frame_hdr->payload_length<126

//free input buffer
netbuf_delete(inbuf);

Send WebSocket frames

Once the connection is established, I save the connection reference in a static variable:

//stores open WebSocket connections
static struct netconn* WS_conn=NULL;

To send WebSocket frames, we simply need to send a WebSocket header followed by the payload (luckily, frames sent by the server are not masked). Again, this demo is limited to payloads of up to 125 bytes, but it can easily be extended.

err_t WS_write_data(char* p_data, size_t length){

	//check if we have an open connection
	if(WS_conn==NULL)
		return ERR_CONN;

	//currently only frames with a payload length <=WS_STD_LEN are supported
	if(length>WS_STD_LEN)
		return ERR_VAL;

	//netconn_write result buffer
	err_t	result;

	//prepare header
	WS_frame_header_t hdr;
	hdr.FIN=0x1;
	hdr.payload_length=length;
	hdr.mask=0;
	hdr.reserved=0;
	hdr.opcode=WS_OP_TXT;

	//send header
	result=netconn_write(WS_conn, &hdr, sizeof(WS_frame_header_t), NETCONN_COPY);

	//check if header was sent
	if(result!=ERR_OK)
		return result;

	//send payload
	return netconn_write(WS_conn, p_data, length, NETCONN_COPY);
}
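One possible way to lift the 125-byte limit is to write the header byte by byte and switch to the extended length fields when needed. This is a sketch under that assumption, not code from the example project; ws_build_header is a hypothetical name. The header it produces would be sent with netconn_write in the same way as hdr above, followed by the payload.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write an unmasked, final (FIN=1) frame header for
 * payload_len bytes into out, which must hold at least 10 bytes.
 * Returns the number of header bytes written. */
size_t ws_build_header(uint8_t *out, uint8_t opcode, uint64_t payload_len) {
    size_t n = 0;

    /* first byte: FIN bit plus the 4-bit opcode */
    out[n++] = (uint8_t)(0x80 | (opcode & 0x0F));

    if (payload_len <= 125) {
        /* length fits into the 7-bit field, mask bit stays 0 */
        out[n++] = (uint8_t)payload_len;
    } else if (payload_len <= 0xFFFF) {
        /* 126 announces a 16-bit big-endian length */
        out[n++] = 126;
        out[n++] = (uint8_t)(payload_len >> 8);
        out[n++] = (uint8_t)payload_len;
    } else {
        /* 127 announces a 64-bit big-endian length */
        out[n++] = 127;
        for (int shift = 56; shift >= 0; shift -= 8)
            out[n++] = (uint8_t)(payload_len >> shift);
    }
    return n;
}
```

A 5-byte text frame gets the two header bytes 0x81 0x05, while a 256-byte binary frame gets 0x82 0x7E 0x01 0x00.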

An example project with a WebSocket receive task can be found on GitHub.

5 thoughts on “WebSockets on the ESP32”

  1. Hello Thomas, thanks for this detailed guide. But I really got lost in the concepts of WebSocket communication and library exceptions with the GNU or C90/C89 standards. 🙂

    Can you tell me which IDE you are using to develop the ESP32 examples you publish? Which compiler, MinGW or some other GNU toolchain, are you using for it?
    I was using the Arduino IDE with https://github.com/espressif/arduino-esp32 .
    I downloaded your example project but can't figure out what I need to do to make it run in the Arduino IDE.

    The ESP32 and microcontroller programming are so new to me. Thanks again. 🙂

  2. Hey Thomas,
    I managed to set up the environment for ESP-IDF development with Eclipse and, as a second setup, the Atom IDE with PlatformIO.
    I built your example project from GitHub and am trying to make it work, but when I send a request from the Chrome browser for the first time, I only get these messages on the serial monitor:
    W (54134) wifi: noTIM!!
    I (54134) wifi: n:11 0, o:11 2, ap:255 255, sta:11 0, prof:1
    I (54254) wifi: n:11 2, o:11 0, ap:255 255, sta:11 2, prof:1
    W (54904) wifi: noTIM!!
    I (54904) wifi: n:11 0, o:11 2, ap:255 255, sta:11 0, prof:1
    I (54964) wifi: n:11 2, o:11 0, ap:255 255, sta:11 2, prof:1

    I can't find out what the problem is.
    Thanks.

  3. Hi Thomas,
    It was my bad, nothing was wrong. I had just written another C# application to test the WebSocket, and the “wrong port” was the key 🙂
    Thanks for the sample. Do you mind adding a softAP mode to it?

    Best Regards.

  4. Hello Thomas,
    I used a part of your WebSocket project, but I’m having some problems because my client sometimes gets turned off unexpectedly, so it does not perform the disconnect procedure; it is simply switched off. Because of that, the WS server doesn’t recognise that the client has disconnected and keeps running as normal. Do you know of any routine, flag or function that could help me monitor the connection all the time and then reset the WS?
    Thanks.

