Speed up Websock receive queue reads via DataView#2024
Speed up Websock receive queue reads via DataView #2024 — PekingSpades wants to merge 4 commits into novnc:master from their branch
Conversation
|
|
This looks interesting. It looks like a lot of testing was involved, I'm curious about the process here. How was this improvement discovered? And how did you go about testing? |
|
Hi! Happy to share the background and testing process. @samhed

**How this improvement was discovered:** I noticed a small detail while reading `core/websock.js`. That TODO caught my attention because my team is also building a high-performance remote control product, and our controller side is Web-based as well. I’ve been digging into VNC and noVNC to learn how the high-throughput message buffering is done.

**How I went about testing:** The benchmarking approach was straightforward:
Unfortunately I can’t locate the exact script anymore (I didn’t preserve it at the time), but the structure was roughly:
**Why so many devices/browsers?** The main reason I tested across so many machines/browsers was to rule out “this TODO exists for a reason” scenarios — e.g. historical browser compatibility issues, engine-specific slow paths, etc. In practice I didn’t see compatibility problems, and the improvement was consistently measurable. For older Safari versions, I didn’t have an old macOS machine available, so I used LambdaTest (their free quota) to cover those versions.

**Outcome:** Across the devices/browsers I tested, the change showed a clear performance improvement and I didn’t observe regressions or compatibility issues in the test matrix I ran. If it would help, I can recreate a new minimal benchmark script and share it in the PR so others can reproduce/extend the testing going forward. |
|
It sounds like you only tested JS code snippets out of context from the rest of the noVNC code? Or am I misunderstanding? Did you do any manual "real-life" testing as well? Yes, please share a benchmarking script, preferably similar to the one you used. |
(() => {
// Benchmark configuration: buffer size, repetitions per case, and the
// integer widths noVNC's receive queue reads (1, 2 and 4 bytes, big-endian).
const SIZE = 64 * 1024 * 1024; // 64MB
const ROUNDS = 10;
const BYTES_LIST = [1, 2, 4];
// Prefer the high-resolution monotonic timer; fall back to Date.now()
// in environments without the Performance API.
const hasPerf = typeof performance !== "undefined" && typeof performance.now === "function";
const now = hasPerf ? () => performance.now() : () => Date.now();
const timerName = hasPerf ? "performance.now()" : "Date.now()";
// One shared backing buffer, viewed both as a Uint8Array (byte-loop
// variant) and a DataView (fast-path variant), filled with a
// deterministic repeating byte pattern.
const buf = new ArrayBuffer(SIZE);
const u8 = new Uint8Array(buf);
const dv = new DataView(buf);
for (let i = 0; i < u8.length; i++) {
u8[i] = i & 0xFF;
}
// Simulated receive queue and read index, mirroring core/websock.js.
let _rQ = u8;
let _rQi = 0;
// Reference implementation (current noVNC behavior): assemble a
// big-endian unsigned integer by shifting successive queue bytes into
// place, advancing the shared read index `_rQi` as it goes.
function _rQshift_loop(bytes) {
    let value = 0;
    let shift = (bytes - 1) * 8;
    while (shift >= 0) {
        value += _rQ[_rQi++] << shift;
        shift -= 8;
    }
    // >>> 0 coerces any negative intermediate (from a set top bit shifted
    // by 24) back to an unsigned 32-bit result.
    return value >>> 0;
}
// Read index for the DataView variant, kept separate from _rQi so both
// implementations can walk the same buffer independently.
let _rQiDV = 0;
// Proposed fast path: one typed big-endian read per value instead of a
// byte-assembly loop. The explicit `false` selects big-endian, matching
// the RFB wire format.
function _rQshift_dataview(bytes) {
    let value;
    switch (bytes) {
        case 1:
            value = dv.getUint8(_rQiDV);
            break;
        case 2:
            value = dv.getUint16(_rQiDV, false);
            break;
        case 4:
            value = dv.getUint32(_rQiDV, false);
            break;
        default:
            throw new Error("only support 1/2/4 bytes");
    }
    _rQiDV += bytes;
    return value >>> 0;
}
// ===== benchmark =====
// Raw per-round timings; summarized into avg/min/max after all cases run.
const results = []; // { method, bytes, round, timeMs }
function bench(bytes) {
const iterations = (SIZE / bytes) | 0;
let dummy = 0;
for (let round = 1; round <= ROUNDS; round++) {
// loop
_rQi = 0;
let t0 = now();
for (let i = 0; i < iterations; i++) {
dummy ^= _rQshift_loop(bytes);
}
let t1 = now();
results.push({ method: "loop", bytes, round, timeMs: t1 - t0 });
// DataView
_rQiDV = 0;
t0 = now();
for (let i = 0; i < iterations; i++) {
dummy ^= _rQshift_dataview(bytes);
}
t1 = now();
results.push({ method: "DataView", bytes, round, timeMs: t1 - t0 });
}
globalThis.__benchmarkDummy = dummy;
}
// Run every configured read width through both implementations.
BYTES_LIST.forEach(bench);
// Collapse the raw result rows for one (method, bytes) case into a
// single summary record with average, minimum and maximum times.
function summarize(method, bytes) {
    const times = [];
    for (const row of results) {
        if (row.method === method && row.bytes === bytes) {
            times.push(row.timeMs);
        }
    }
    let total = 0;
    for (const t of times) {
        total += t;
    }
    return {
        method,
        bytes,
        rounds: times.length,
        avg: total / times.length,
        min: Math.min(...times),
        max: Math.max(...times),
    };
}
// One summary record per (method, read width) combination.
const summaries = [];
["loop", "DataView"].forEach(method => {
BYTES_LIST.forEach(bytes => {
summaries.push(summarize(method, bytes));
});
});
// Decide the faster method per read width by comparing average times;
// differences below 1e-6 ms are treated as a tie.
const winners = {}; // { [bytes]: "loop" | "DataView" | "tie" }
BYTES_LIST.forEach(bytes => {
const sLoop = summaries.find(s => s.bytes === bytes && s.method === "loop");
const sDV = summaries.find(s => s.bytes === bytes && s.method === "DataView");
if (!sLoop || !sDV) return;
if (Math.abs(sLoop.avg - sDV.avg) < 1e-6) {
winners[bytes] = "tie";
} else if (sLoop.avg < sDV.avg) {
winners[bytes] = "loop";
} else {
winners[bytes] = "DataView";
}
});
// Environment metadata rows for the report table. Pipe characters in
// values are escaped so they cannot break the Markdown column layout.
const envPairs = [];
function addEnvPair(key, value) {
    // `== null` deliberately matches both null and undefined; other
    // falsy values (0, "", false) are still recorded.
    if (value == null) {
        return;
    }
    const escaped = String(value).replace(/\|/g, "\\|");
    envPairs.push({ key, value: escaped });
}
// Config
addEnvPair("Buffer size (bytes)", SIZE);
addEnvPair("Buffer size (MB)", (SIZE / (1024 * 1024)).toFixed(1));
addEnvPair("Rounds per case", ROUNDS);
addEnvPair("Bytes tested", BYTES_LIST.join(", "));
addEnvPair("Timer", timerName);
// Client Info
// Each group below touches browser-only APIs (navigator, screen,
// performance.memory); try/catch lets the report degrade gracefully in
// Node or older engines — missing fields are simply skipped.
try {
addEnvPair("User agent", navigator.userAgent);
addEnvPair("Platform", navigator.platform);
addEnvPair("HW concurrency", navigator.hardwareConcurrency);
addEnvPair("Device memory (GB)", navigator.deviceMemory);
addEnvPair("Language", navigator.language);
addEnvPair("Languages", navigator.languages && navigator.languages.join(", "));
} catch (e) {}
try {
addEnvPair("Screen resolution", `${screen.width}x${screen.height}`);
addEnvPair("Screen pixel depth", screen.pixelDepth);
} catch (e) {}
try {
// performance.memory is a non-standard Chrome extension — hence the guard.
if (hasPerf && performance && performance.memory) {
addEnvPair("JS heap size limit (MB)", (performance.memory.jsHeapSizeLimit / (1024 * 1024)).toFixed(1));
addEnvPair("Total JS heap (MB)", (performance.memory.totalJSHeapSize / (1024 * 1024)).toFixed(1));
addEnvPair("Used JS heap (MB)", (performance.memory.usedJSHeapSize / (1024 * 1024)).toFixed(1));
}
if (hasPerf && performance && typeof performance.timeOrigin === "number") {
addEnvPair("Performance timeOrigin", performance.timeOrigin);
}
} catch (e) {}
// Build the Markdown report string, ready to paste into a PR comment.
let md = "";
// Config + Client Info
md += `## Config & Client Info\n\n`;
md += `| Key | Value | Key | Value |\n`;
md += `| --- | ----- | --- | ----- |\n`;
// Lay the env pairs out two per table row; an odd trailing pair leaves
// its partner cells empty.
for (let i = 0; i < envPairs.length; i += 2) {
const a = envPairs[i];
const b = envPairs[i + 1];
md += `| ${a.key} | ${a.value} | ${b ? b.key : ""} | ${b ? b.value : ""} |\n`;
}
md += `\n`;
md += `## Result\n\n`;
md += `| Bytes | Method | Rounds | Avg ms | Min ms | Max ms | Winner |\n`;
md += `| ----- | -------- | ------ | ------ | ------ | ------ | ------ |\n`;
BYTES_LIST.forEach(bytes => {
const sLoop = summaries.find(s => s.bytes === bytes && s.method === "loop");
const sDV = summaries.find(s => s.bytes === bytes && s.method === "DataView");
const winner = winners[bytes];
// Trophy marks the faster method; scales mark a tie (shown on both rows).
const loopWinEmoji =
winner === "loop" ? "🏆" :
winner === "tie" ? "⚖️" : "";
const dvWinEmoji =
winner === "DataView" ? "🏆" :
winner === "tie" ? "⚖️" : "";
if (sLoop) {
md += `| ${bytes} | loop | ${sLoop.rounds} | ${sLoop.avg.toFixed(3)} | ${sLoop.min.toFixed(3)} | ${sLoop.max.toFixed(3)} | ${loopWinEmoji} |\n`;
}
if (sDV) {
md += `| ${bytes} | DataView | ${sDV.rounds} | ${sDV.avg.toFixed(3)} | ${sDV.min.toFixed(3)} | ${sDV.max.toFixed(3)} | ${dvWinEmoji} |\n`;
}
});
md += `\n`;
console.log(md);
})(); |
|
|
My earlier numbers were mostly from isolated JS benchmarks, not a full live VNC-session benchmark. I’ve now added two browser-level checks to the PR that exercise noVNC itself rather than standalone snippets:
In the parser-focused configuration of that benchmark (display work stubbed so the measurement stays attributable to the receive path changed by this PR), I’m seeing about 30-34% improvement versus current master on repeated runs on my machine. I also ran the same protocol stream with display work enabled. There the total time was essentially unchanged, which is why I think this particular optimization is hard to validate with end-to-end “real-life session” timing alone: once rendering is included, the receive-path signal gets swamped by display cost. So the short answer is: I had not originally done a good live-session benchmark, but I have now added browser-level smoke/perf scripts to the PR that run noVNC in context rather than just isolated snippets. They can be rerun locally with |
Summary
Replaces the byte-assembly loop in `core/websock.js` `_rQshift()` with a DataView-backed fast path for 1/2/4 byte reads to cut CPU time in the receive queue.

Performance Summary
Average speed-up = mean reduction in the 1/2/4-byte benchmark cases (higher is better).
Testing
Benchmark Results
Windows Chrome 142
Windows Chrome 142(Machine 2)
Windows Chrome 101
Windows Chrome 92.0
Windows Chrome 83.0
Windows Chrome 71.0
Windows Edge 142
Windows Edge 142(Machine 2)
Windows Firefox 113
Windows Firefox 142
Windows Firefox 145.0
Safari 18
Karma Test
Can I use
https://caniuse.com/mdn-javascript_builtins_dataview