Tuesday, January 27, 2009

slow lstat, slow php, slow drupal

If you are short on time go to bottom line.

Ok, the lstat() and stat() system calls system call are not really slow per se. But take a look at an apache strace -r dump from a drupal installation I was testing:

[user@host ~]# strace -o strace.load -r -s 256 -p <PID>
...
0.000081 getcwd("/mnt/var/www/html/drupal", 4096) = 32
0.000048 lstat("/mnt", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
0.000805 lstat("/mnt/var", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
0.001596 lstat("/mnt/var/www", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
0.000104 lstat("/mnt/var/www/html", {st_mode=S_IFDIR|0777, st_size=4096, ...}) = 0
0.000362 lstat("/mnt/var/www/html/drupal", {st_mode=S_IFDIR|0755, st_size=8192, ...}) = 0
0.000804 lstat("/mnt/var/www/html/drupal/sites", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
0.057575 lstat("/mnt/var/www/html/drupal/sites/all", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
0.000297 lstat("/mnt/var/www/html/drupal/sites/all/modules", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
0.014762 lstat("/mnt/var/www/html/drupal/sites/all/modules/Internet", {st_mode=S_IFDIR|0755, st_size=4096,
0.000214 lstat("/mnt/var/www/html/drupal/sites/all/modules/Internet/date", {st_mode=S_IFDIR|0755, st_size=4
0.000112 lstat("/mnt/var/www/html/drupal/sites/all/modules/Internet/date/date.module", {st_mode=S_IFREG|064
0.010816 open("/mnt/var/www/html/drupal/sites/all/modules/Internet/date/date.module", O_RDONLY) = 20
0.000835 fstat(20, {st_mode=S_IFREG|0644, st_size=44118, ...}) = 0
0.000069 lseek(20, 0, SEEK_CUR) = 0
0.000063 stat("././sites/all/modules/Internet/date/date.module", {st_mode=S_IFREG|0644, st_size=44118, ...}) = 0
0.000159 close(20) = 0
...

As you can see, for every file included in the system there are around 10-15 lstat calls. this means that if we have ~200 files included we have ~2500 lstat calls per page. A simple sum on the strace results gave me ~8 seconds per page of lstat time, which was over a third of my test page time. see?

[user@host ~]# cat strace.single | awk '{printf " " $1 "\n " $2}' |sed 1d | sed \$d |awk '{printf $2 "+"}' | sed s/.$/\\n/ | bc
21.204960
[user@host ~]# cat strace.single | awk '{printf " " $1 "\n " $2}' |sed 1d | sed \$d | fgrep lstat | awk '{printf $2 "+"}' | sed s/.$/\\n/ | bc
8.361994
if you really want to understand this shell script let me know. It deserves a post by itself

Now we have 3 options: a) make lstat() faster. b) make php stop doing all those lstats(). c) make drupal include less files.

Well, (a) is not an option as our infrastructure team (aka sysadmins) insist there is nothing wrong with the filesystem and nothing can be done to enhance it's performance. Since I gave up being a sysadmin when it was still call 'sysadmin' it would be hard for me to prove them wrong (especially since they might be are right).

As (c) would require us to refactor most of our code in a very very 'not drupal way' and create very large php files, I was hoping to find a better solution.

By a rare chance of good fortune i came to learn about realpath_cache_size. Once I changed the foolish default to a reasonable 1M in php.ini, I lost most of the lstat() calls and over 10 seconds of processing time.

[user@host ~]# cat strace.2.single | awk '{printf " " $1 "\n " $2}' |sed 1d | sed \$d |awk '{printf $2 "+"}' | sed s/.$/\\n/ | bc
9.752309
[user@host ~]# cat strace.2.single | awk '{printf " " $1 "\n " $2}' |sed 1d | sed \$d | fgrep lstat | awk '{printf $2 "+"}' | sed s/.$/\\n/ | bc
.040935

Here is what to do in php.ini

; ...php.ini...
; Determines the size of the realpath cache to be used by PHP. This value should
; be increased on systems where PHP opens many files to reflect the quantity of
; the file operations performed.
realpath_cache_size=1M

; Duration of time, in seconds for which to cache realpath information for a given
; file or directory. For systems with rarely changing files, consider increasing this
; value.
realpath_cache_ttl=300


For normal pages under normal load I was able to get a 10%-25% performance improvment. I think that's quite a bit for a configuration change.