Martin Grigorov
Jul 13, 2020


Hi Willy,

Thank you for reading my article and for the advice!

I will try to answer each point:

  1. number of threads: I’ve removed the nbthread and cpu-map settings (see the config sketch after this list) and indeed the results (requests/sec) are now better than before: aarch64, HTTP: 16688.53 (value from Update 3) -> 19446.49; x86_64, HTTP: 25908.50 -> 32309.90; aarch64, HTTPS: 16821.60 -> 19049.35; x86_64, HTTPS: 30376.95 -> 31555.68.
  2. same physical CPUs: yes, this is what I’ve been doing in the first and second run. Since Update 3 above I’ve started using backend servers from the other VM, i.e. the HAProxy on the aarch64 VM uses the backends running on the x86_64 VM and the HAProxy on x86_64 uses the backends running on aarch64. I have just these two VMs, so I cannot give the backends their own VMs. But the performance of the Golang servers is really good (120–160K reqs/sec), so I think this setup should be OK-ish. (A minimal sketch of such a backend is after this list.)
  3. keep-alive: WRK does not send a Connection: Keep-Alive request header, but since it uses HTTP/1.1, keep-alive is implied. I’ve added Connection: Keep-Alive explicitly (see the wrk command after this list), but this didn’t change the results. conntrack is not installed on the VMs, but it is installed/used on the hypervisor (KVM)!
  4. x86_64 versus aarch64: I used the very same VMs to run performance tests for Apache Tomcat and Memcached. For Tomcat, x86_64 was again the winner, but for Memcached the aarch64 VM performed better. I run those tests regularly and the results are consistent.
  5. openssl speed: I will write a separate article for this. The output is not easy to read and compare, so I’ll have to color-code it. (An example invocation is after this list.)
  6. option http-server-close: facepalm! I’m not sure how this one got into my settings! Removing it improved the results some more: aarch64, HTTP: 19446.49 (value after removing nbthread and cpu-map, p.1) -> 25046.17; x86_64, HTTP: 32309.90 -> 45398.78; aarch64, HTTPS: 19049.35 -> 25003.79; x86_64, HTTPS: 31555.68 -> 41769.52.
  7. option prefer-last-server: this setting didn’t help. The results are either slightly better or slightly worse than p.6.
  8. pool-low-conn 16: the results are still the same as p.6.
  9. tune.fd.edge-triggered: the results are still the same as p.6, plus or minus a few hundred requests/sec.
  10. option http-use-htx: Thanks! I’ve removed it!
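
For reference, here is a minimal haproxy.cfg sketch reflecting points 1 and 6–9: no nbthread/cpu-map, keep-alive instead of http-server-close, prefer-last-server, and pool-low-conn on the servers. The ports, backend addresses and timeouts are placeholders, not my actual configuration, and whether tune.fd.edge-triggered is accepted depends on the HAProxy version in use:

```
global
    # point 1: no nbthread / cpu-map -- let HAProxy size its own threads
    # tune.fd.edge-triggered on     # point 9; depends on the HAProxy version

defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    option http-keep-alive          # point 6: instead of "option http-server-close"
    option prefer-last-server       # point 7

frontend fe_main
    bind :8080                      # placeholder port
    default_backend be_golang

backend be_golang
    # placeholder addresses for the Golang backends on the other VM
    server s1 10.0.0.2:8081 pool-low-conn 16   # point 8
    server s2 10.0.0.2:8082 pool-low-conn 16
```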
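
The backends (point 2) are plain Go HTTP servers; something along these lines, although the port and response body here are only illustrative, not the exact code from my tests:

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Minimal HTTP backend; net/http keeps connections alive by default.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "Hello, World!")
	})
	log.Fatal(http.ListenAndServe(":8081", nil))
}
```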
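
For point 3, the explicit header can be passed to wrk like this (the thread/connection/duration values and the URL are placeholders):

```
wrk -t 8 -c 200 -d 60s -H "Connection: keep-alive" http://haproxy-host:8080/
```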
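
For point 5, the comparison could be based on runs like the following on each VM (the selection of algorithms here is just an example, not the final set for the article):

```
openssl speed -evp aes-128-gcm
openssl speed rsa2048 ecdsap256
```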

To summarize: points 1 and 6 improved the performance on both VMs by 50–80%! But the x86_64 VM still performs 65–80% better than my aarch64 VM!


