Use less memory for constraint lookup radix cache #860

Merged
phillip-stephens merged 2 commits into zmap:main from droe:droe/constraint-less-cache on Apr 29, 2024

droe (Contributor) commented Apr 26, 2024

Reduce the size of the precomputed array of prefixes from 4 MB to 1 MB so that it fits into the cache of CPUs with smaller caches. This improves the send rate on systems where the send rate is CPU- or memory-bound and cache is limited.
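
The 4 MB and 1 MB figures are consistent with a lookup array that holds one 4-byte entry per prefix and is indexed by the top 20 or 18 bits of the IPv4 address. Both the entry width and the prefix lengths are assumptions inferred from the quoted sizes, not values stated in this PR. A minimal sketch of the arithmetic:

```python
# Assumption (not stated in the PR): the radix array stores one 4-byte entry
# per prefix and is indexed by the top `prefix_len` bits of an IPv4 address.
ENTRY_BYTES = 4


def radix_array_bytes(prefix_len: int, entry_bytes: int = ENTRY_BYTES) -> int:
    """Return the size in bytes of an array with one entry per prefix."""
    return (1 << prefix_len) * entry_bytes


for prefix_len in (20, 18):
    size_mib = radix_array_bytes(prefix_len) / (1 << 20)
    print(f"/{prefix_len}: {1 << prefix_len} entries, {size_mib:.0f} MiB")
```

Under these assumptions, shortening the indexed prefix by two bits cuts the number of entries, and therefore the footprint, to one fourth.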

With the default blocklist, this changes the array/tree tradeoff from 3702243328 IPs in the radix array and 15104 IPs in the tree to 3702194176 IPs in the radix array and 64256 IPs in the tree. That seems like a reasonable price to pay for the performance boost of reducing the memory footprint to one fourth. On a system with an 8-core Intel Atom, 8*2 MB L2 cache, a 10 GbE NIC, netmap, and AES-NI, the send rate before this change was 11% below what the NIC can do; with this change, zmap pushes packets faster than the NIC can send them. You may want to test this change on higher-end systems before merging, to confirm that it does not perform substantially worse on other system configurations.
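
The tradeoff numbers can be cross-checked with a few lines of arithmetic. This is only a sketch: the /20 and /18 prefix lengths are assumptions inferred from the 4 MB and 1 MB array sizes, not values stated in this PR.

```python
# Counts quoted in the PR description (default blocklist).
before = {"radix": 3702243328, "tree": 15104}
after = {"radix": 3702194176, "tree": 64256}

# Assumption (not stated in the PR): the array covers whole /20 prefixes
# before the change and whole /18 prefixes after it.
ips_per_slash20 = 1 << (32 - 20)  # 4096 addresses per /20
ips_per_slash18 = 1 << (32 - 18)  # 16384 addresses per /18

total_before = before["radix"] + before["tree"]
total_after = after["radix"] + after["tree"]
moved_to_tree = after["tree"] - before["tree"]

print("total IPs covered is unchanged:", total_before == total_after)
print("IPs moved from array to tree:", moved_to_tree)
print("array holds whole /20 blocks before:", before["radix"] % ips_per_slash20 == 0)
print("array holds whole /18 blocks after:", after["radix"] % ips_per_slash18 == 0)
print(f"share of IPs resolved via the tree after: {after['tree'] / total_after:.6%}")
```

Both configurations cover the same total number of IPs; the change only moves 49152 of them from the array to the slower tree path, which remains a very small share of all lookups.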

Reduce size of precomputed array of prefixes from 4 MB to 1 MB in order
to fit into the cache of CPUs with lower amounts of cache.  Improves
send rate on systems where send rate is CPU/memory-bound.

droe (Contributor, Author) commented Apr 26, 2024

The test failures look unrelated to the change under test.

zakird (Member) commented Apr 27, 2024

Sounds fairly reasonable. @phillip-stephens can you confirm whether there are any performance implications on better-resourced systems (e.g., one of our boxes)?

zakird added this to the ZMap 4.2 milestone on Apr 27, 2024
phillip-stephens (Contributor) commented

@zakird There don't appear to be any negative performance implications on my VM, which has plenty of cores and RAM and a high-bandwidth uplink.
Test command: `sudo ./src/zmap -p 80 -t 20 -T 6 -B 10G -o /dev/null`

main Branch Results

  1. 0:29 100% (0s left); send: 52595072 done (2.63 Mp/s avg); recv: 695316 0 p/s (24.0 Kp/s avg); drops: 0 p/s (0 p/s avg); hitrate: 1.32%
  2. 0:29 100% (0s left); send: 56374656 done (2.82 Mp/s avg); recv: 743188 0 p/s (25.6 Kp/s avg); drops: 0 p/s (0 p/s avg); hitrate: 1.32%

Pull Request #860 Results

  1. 0:28 100% (0s left); send: 54433902 done (2.72 Mp/s avg); recv: 719076 3 p/s (25.6 Kp/s avg); drops: 0 p/s (0 p/s avg); hitrate: 1.32%
  2. 0:28 100% (0s left); send: 59560768 done (2.98 Mp/s avg); recv: 785961 5 p/s (28.0 Kp/s avg); drops: 0 p/s (0 p/s avg); hitrate: 1.32%
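
To compare the two branches, the average send rates can be pulled out of the final status lines above. This is a rough sketch: the regular expression and the helper `avg_send_rate` are illustrative and not part of ZMap, and with only two runs per branch the difference may well be within run-to-run noise.

```python
import re
from statistics import mean

# Final status lines copied from the results above, grouped by branch.
runs = {
    "main": [
        "0:29 100% (0s left); send: 52595072 done (2.63 Mp/s avg); recv: 695316 0 p/s (24.0 Kp/s avg); drops: 0 p/s (0 p/s avg); hitrate: 1.32%",
        "0:29 100% (0s left); send: 56374656 done (2.82 Mp/s avg); recv: 743188 0 p/s (25.6 Kp/s avg); drops: 0 p/s (0 p/s avg); hitrate: 1.32%",
    ],
    "PR #860": [
        "0:28 100% (0s left); send: 54433902 done (2.72 Mp/s avg); recv: 719076 3 p/s (25.6 Kp/s avg); drops: 0 p/s (0 p/s avg); hitrate: 1.32%",
        "0:28 100% (0s left); send: 59560768 done (2.98 Mp/s avg); recv: 785961 5 p/s (28.0 Kp/s avg); drops: 0 p/s (0 p/s avg); hitrate: 1.32%",
    ],
}

# Matches the "send: <count> done (<rate> Mp/s avg)" part of a status line.
SEND_RE = re.compile(r"send: (\d+) done \(([\d.]+) Mp/s avg\)")


def avg_send_rate(lines):
    """Mean of the reported average send rates (Mp/s) across the given runs."""
    return mean(float(SEND_RE.search(line).group(2)) for line in lines)


main_avg = avg_send_rate(runs["main"])
pr_avg = avg_send_rate(runs["PR #860"])
relative_change = pr_avg / main_avg - 1

print(f"main: {main_avg:.3f} Mp/s, PR #860: {pr_avg:.3f} Mp/s")
print(f"relative change: {relative_change:+.1%}")
```

The PR runs come out a few percent faster on average, which supports the conclusion that the smaller array does not hurt throughput on this VM.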

phillip-stephens self-requested a review on April 29, 2024 15:40
phillip-stephens (Contributor) left a comment

Tested on a better-resourced VM (not using netmap); the change doesn't seem to negatively impact performance there.

phillip-stephens merged commit 3e5f387 into zmap:main on Apr 29, 2024
droe deleted the droe/constraint-less-cache branch on June 8, 2024 09:31