Being able to handle ~70 connections a second so far isn’t bad, but can we do better?
Yes.
Let’s continue where we left off.
- From your WordPress dashboard, select Plugins. This time we’re going to install the WordPress Nginx proxy cache integrator. As before, search for it and install it.
- Now edit the main nginx configuration file under /etc/nginx and add
proxy_cache_path /dev/shm/nginx levels=1:2 keys_zone=czone:16m max_size=32m inactive=10m;
proxy_temp_path /dev/shm/nginx/tmp;
proxy_cache_key "$scheme$host$request_uri";
proxy_cache_valid 200 302 10m;
directly after the line that has server_names_hash_bucket_size 64;
- Edit /etc/hosts, modifying the entry for localhost to read
127.0.0.1 localhost backend frontend
- Now edit /etc/nginx/conf.d/default.conf, and change everything before
# BEGIN W3TC Page Cache cache
to
server {
    server_name frontend;
    location / {
        proxy_pass http://backend:8080;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_cache czone;
    }
}
server {
    server_name backend;
    root /var/www/;
    listen 8080;
    index index.html index.htm index.php;
    include conf.d/drop;
    location / {
        # This is cool because no php is touched for static content
        try_files $uri $uri/ /index.php?q=$uri&$args;
    }
    location ~ \.php$ {
        fastcgi_buffers 8 256k;
        fastcgi_buffer_size 128k;
        fastcgi_intercept_errors on;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass unix:/dev/shm/php-fpm-www.sock;
    }
- Restart services with
service php5-fpm restart
service nginx restart
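To make the proxy_cache_key and levels=1:2 settings above a little more concrete, here is a minimal Python sketch of how a request could map to a file under /dev/shm/nginx. It assumes nginx names each cache file after the MD5 hex digest of the cache key and takes the directory levels from the end of that digest (one character, then the two before it). The function name and the example hostname are only for illustration.

```python
import hashlib


def cache_file_path(scheme, host, request_uri, cache_root="/dev/shm/nginx"):
    """Approximate the on-disk cache path for levels=1:2 (illustrative helper)."""
    # proxy_cache_key "$scheme$host$request_uri" simply concatenates the three values.
    key = f"{scheme}{host}{request_uri}"
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    # Assumption: levels=1:2 uses the last hex character as the first directory
    # and the two characters before it as the second directory.
    level1 = digest[-1]
    level2 = digest[-3:-1]
    return f"{cache_root}/{level1}/{level2}/{digest}"


if __name__ == "__main__":
    # Hostname taken from the test machine used in this post; the URI is the front page.
    print(cache_file_path("http", "linaro-server.testdomain.com", "/"))
```

Because the scheme, host, and URI all go into the key, two sites served from the same box would not share cache entries, which is presumably why the key is built this way.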
Now we’re ready for more speed tests with ApacheBench.
That’s quite a speedup. This configuration is able to handle about 200 concurrent connections a second before performance starts to drop off. Even at 300 connections per second, the system is still able to handle requests faster than they are coming in; however, the latency for each request is starting to build. At 400, the system is able to process just as many requests as are coming in.
Here’s the dstat output from a run with 400 concurrent connections. There’s an interesting behavior that occasionally shows up here.
root@linaro-server:/etc/nginx/conf.d# dstat
You did not select any stats, using -cdngy by default.
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  1   0  97   1   0   0|3069B   21k|    0     0|   0     0 | 144    66
  0   1  99   0   0   0|    0     0| 150B 1082B|   0     0 | 207    44
  0   0 100   0   0   0|    0     0|  52B  366B|   0     0 | 106    56
  0   0 100   0   0   0|    0     0| 104B  492B|   0     0 |  96    50
  0   0 100   0   0   0|    0     0| 104B  732B|   0     0 | 104    54
  3   0  88   1   0   8|    0   40k| 101k   43k|   0     0 | 639    76
 48  33   0   0   0  19|    0     0| 196k  375k|   0     0 |2578   705
 47  33   0   0   0  21|    0     0| 306k  539k|   0     0 |3391   593
 45  33   0   0   0  22|    0     0| 404k 4112k|   0     0 |4545   696
 63  16   0   0   0  22|    0     0| 458k 7742k|   0     0 |6007  1656
 96   4   0   0   0   0|    0     0|2600B   66k|   0     0 | 525   760
 94   6   0   0   0   0|    0   48k|3894B   98k|   0     0 | 605   886
 94   6   0   0   0   0|    0   24k|3940B   98k|   0     0 | 591   809
 93   7   0   0   0   0|    0   16k|3675B   87k|   0     0 | 564   833
 97   3   0   0   0   0|    0     0|3062B   76k|   0     0 | 565   829
 96   3   0   0   0   0|    0     0|3432B   87k|   0     0 | 547   828
 97   3   0   0   0   0|    0   24k|3796B   98k|   0     0 | 561   843
 96   4   0   0   0   0|    0     0|3016B   76k|   0     0 | 543   855
 95   5   0   0   0   0|    0     0|4078B   98k|   0     0 | 574   809
 97   3   0   0   0   1|    0     0|3227B   76k|   0     0 | 563   802
 96   4   0   0   0   0|    0     0|3848B   98k|   0     0 | 573   839
 97   3   0   0   0   0|    0   24k|2600B   66k|   0     0 | 554   750
 94   5   0   0   0   1|    0     0|3848B   98k|   0     0 | 581   881
 97   3   0   0   0   0|    0     0|3016B   76k|   0     0 | 567   791
 95   5   0   0   0   0|    0     0|3432B   87k|   0     0 | 564   840
 97   3   0   0   0   0|    0     0|3940B   98k|   0     0 | 567   782
 38   2  59   0   0   1|    0   48k|  18k   99k|   0     0 | 559   216
  0   1  99   0   0   0|    0     0|  52B  366B|   0     0 | 153    62
  0   1  99   0   0   0|    0     0| 104B  492B|   0     0 | 112    48
  0   1  99   0   0   0|    0     0|  52B  366B|   0     0 | 138    61
  0   1  99   0   0   0|    0     0|  52B  366B|   0     0 | 101    46
  0   0 100   0   0   0|    0     0|  52B  366B|   0     0 | 110    47
  0   0 100   0   0   0|    0     0|  52B  366B|   0     0 | 110    55
These last runs at 400 concurrent connections seem to stick at the end, waiting for some last requests. We can see this in the ab report.
Server Software: nginx/1.2.1
Server Hostname: linaro-server.testdomain.com
Server Port: 80
Document Path: /
Document Length: 172 bytes
Concurrency Level: 400
Time taken for tests: 20.397 seconds
Complete requests: 3000
Failed requests: 1175
(Connect: 0, Receive: 0, Length: 1175, Exceptions: 0)
Write errors: 0
Non-2xx responses: 1825
Total transferred: 12837398 bytes
HTML transferred: 12277473 bytes
Requests per second: 147.08 [#/sec] (mean)
Time per request: 2719.560 [ms] (mean)
Time per request: 6.799 [ms] (mean, across all concurrent requests)
Transfer rate: 614.63 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 31 105.1 1 1076
Processing: 1 842 2772.7 82 20243
Waiting: 1 842 2772.6 81 20242
Total: 2 873 2771.4 146 20264
Percentage of the requests served within a certain time (ms)
50% 146
66% 194
75% 330
80% 347
90% 2003
95% 2206
98% 13677
99% 17262
100% 20264 (longest request)
It doesn’t always stick, but I sense there’s a software issue here.
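The headline figures in the ab report above can be re-derived from its raw counts, which is a handy sanity check when comparing runs. The Python sketch below uses only numbers copied from that report; the function names are made up for illustration. It also computes the share of non-2xx responses, since the report lists 1825 of them, and the 1175 length failures plus the 1825 non-2xx responses add up to all 3000 requests. ab flags a length failure when a response differs in size from the first one it received, so it may be that the first response in this run was a small error page, though that is a guess rather than something the report states.

```python
# Raw figures copied from the ab report above.
CONCURRENCY = 400
ELAPSED_S = 20.397
COMPLETED = 3000
NON_2XX = 1825
LENGTH_FAILURES = 1175


def requests_per_second(completed, elapsed_s):
    """Mean throughput over the whole run."""
    return completed / elapsed_s


def time_per_request_ms(concurrency, completed, elapsed_s):
    """Mean time per request as seen by one of the concurrent clients."""
    return concurrency * elapsed_s / completed * 1000


def time_per_request_overall_ms(completed, elapsed_s):
    """Mean time per request across all concurrent requests."""
    return elapsed_s / completed * 1000


if __name__ == "__main__":
    print(f"{requests_per_second(COMPLETED, ELAPSED_S):.2f} requests/s")
    print(f"{time_per_request_ms(CONCURRENCY, COMPLETED, ELAPSED_S):.1f} ms per request")
    print(f"{time_per_request_overall_ms(COMPLETED, ELAPSED_S):.3f} ms across all requests")
    print(f"{NON_2XX / COMPLETED:.1%} of responses were non-2xx")
```

If a large share of the responses really were errors, the requests-per-second figure overstates how many pages were actually served, so it is worth reading alongside the non-2xx count.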
Regardless, for a very cheap ARM-based server, this seems good. It would be worthwhile to compare it to an Intel setup.
Does this apply to real-world needs? Well, the other day I tweeted to @gruber asking what his Daring Fireball stats were, and he was kind enough to respond with a set of 24-hour stats from his site, which is one of the more popular blogs out there. I don’t know whether he uses WordPress, but regardless, he responded with this:
Presuming requests are evenly distributed over an hour, so that we can divide by 3600, the busiest hour’s 15815 requests yield ~4.4 requests a second. It seems like this setup would be able to handle that.
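As a quick check of that estimate, here is the same arithmetic as a small Python sketch. The 15815 figure is the busiest-hour count quoted above, and the 147.08 requests per second is the mean from the 400-connection ab report earlier (a figure that includes the non-2xx responses listed there). The even-distribution assumption is the same one made in the text, so real peaks within the hour would be higher than this average.

```python
BUSIEST_HOUR_REQUESTS = 15815   # busiest-hour count quoted in the text
SECONDS_PER_HOUR = 3600
MEASURED_RPS = 147.08           # mean requests/s from the ab report above


def average_rate(requests, seconds=SECONDS_PER_HOUR):
    """Average requests per second, assuming an even spread over the period."""
    return requests / seconds


def headroom(measured_rps, needed_rps):
    """How many times over the measured throughput covers the needed rate."""
    return measured_rps / needed_rps


if __name__ == "__main__":
    needed = average_rate(BUSIEST_HOUR_REQUESTS)
    print(f"needed: {needed:.1f} requests/s")
    print(f"headroom: about {headroom(MEASURED_RPS, needed):.0f}x")
```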
Update #1: Doing a little digging around last night, I discovered that in the 400 concurrent request range the kernel is throwing page allocation failures, yet it’s not really out of memory. I’ve posted a question about this to the linaro-dev list, and Andy Green mentioned it might be related to a problem observed on the Raspberry Pi which seems to involve the Ethernet driver. Stay tuned.


Just to make sure… Are the stats for daringfireball requests or visitors? Stats software often reports visits and unique visitors, so if that’s the case, the number of requests/hits (is a hit the same as a request?) would be much higher.
That’s a good question. On the right-hand side of that chart, those are unique visit counts, while on the left it’s a count of visits. It is unclear to me whether the number of page hits per visitor per session is combined in some fashion. Given the way John constructs his site, the average reader is more likely to just read the front page, which includes the past couple of days of posts, rather than comb through past archives. Regardless, this is conjecture. I’ll see if John might comment.
The next steps, as I see it, are to do some profiling and to get to the bottom of the Ethernet driver issue.