PHP Scales

The news that Friendster migrated to PHP from JSP for scalability reasons has triggered much-needed discussion in the Java community.

Chris Shiflett at O’Reilly had this to say (source: rc3.org):

…how does scalability apply to the Web? First, you should ask yourself whether the Web’s fundamental architecture is scalable. The answer is yes. Some people will describe HTTP’s statelessness in a derogatory manner. The more enlightened people, however, understand that this is one of the key characteristics that make HTTP such a scalable protocol. What makes it scalable? With every HTTP transaction being completely independent, the amount of resources necessary grows linearly with the amount of requests received. In a system that does not scale (where “does not scale” means that it scales poorly), the amount of resources necessary would increase at a higher rate than the number of requests. While HTTP has its flaws (the proper spelling of referrer being one), there’s no arguing that it scales, and this is one of the things that made the Web’s early explosive growth less painful than it would have otherwise been.

The present discussion is about developing Web applications that scale well, and whether particular languages, technologies, and platforms are more appropriate than others. My opinion is that some things scale more naturally than others, and Rasmus’s explanation above touches on this. PHP, when compiled as an Apache module (mod_php), fits nicely into the basic Web paradigm. In fact, it might be easier to imagine PHP as a new skill that Apache can learn. HTTP requests are still handled by Apache, and unless your programming logic specifically requires interaction with another source (database, filesystem, network), your application will scale as well as Apache (with a decrease in performance based upon the complexity of your programming logic). This is why PHP naturally scales. The caveat I mention is why your PHP application may not scale.

A common (and somewhat trite) argument being tossed around is that scalability has nothing to do with the programming language. While it is true that language syntax is irrelevant, the environments in which languages typically operate can vary drastically, and this makes a big difference. PHP is much different than ColdFusion or JSP. In terms of scalability, PHP has an advantage, but it loses a few features that some developers miss (which is why there are efforts to create application servers for PHP). The PHP versus JSP argument should focus on environment, otherwise the point gets lost.

I actually disagree with George’s statement, “PHP doesn’t magically scale ‘naturally’”. Of course, I understand and agree with the spirit of what he’s trying to say, which is that using PHP isn’t going to make your applications magically scale well, but I do believe that PHP has a natural advantage, as I just described. Rasmus seems to agree with me, and George might also agree, despite his statement.

I think PHP scales well because Apache scales well because the Web scales well. PHP doesn’t try to reinvent the wheel; it simply tries to fit into the existing paradigm, and this is the beauty of it.

When he quotes Rasmus Lerdorf, I think he gets to the heart of the matter:

A typical Java application will make use of the fact that it is running under a JVM in which you can store session and state data very easily and you can effectively write a web application very much the same way you would write a desktop application. This is very convenient, but it doesn’t scale. To scale this you then have to add other mechanisms to do intra-JVM message passing which adds another level of complexity and performance issues. There are of course ways to avoid this, but the typical first Java implementation of something will fall into this trap.

PHP has no scalability issues of this nature. Each request is completely sandboxed from every other request and there is nothing in the language that leads people towards writing applications that don’t scale.
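To make the trap Rasmus describes concrete, here is a minimal sketch. It is written in Python rather than Java or PHP purely for brevity, and every class and function name in it is hypothetical. It assumes two server processes behind a plain round-robin load balancer, and it compares keeping session data in each server's own memory against keeping it in a store outside the servers (a dict stands in for a database or session server).

```python
"""Toy model of two ways to keep session state behind a load balancer.

All names here are hypothetical and exist only for illustration.
"""
from itertools import cycle


class StatefulServer:
    """Keeps sessions in its own process memory, like state held inside one JVM."""

    def __init__(self):
        self.sessions = {}  # visible only to this server instance

    def handle(self, session_id):
        # Count the requests this server has seen for the session.
        count = self.sessions.get(session_id, 0) + 1
        self.sessions[session_id] = count
        return count


class SharedNothingServer:
    """Keeps nothing between requests; all state lives in an external store."""

    def __init__(self, store):
        self.store = store  # stands in for a database or session server

    def handle(self, session_id):
        # Read and write the counter in the shared store on every request.
        count = self.store.get(session_id, 0) + 1
        self.store[session_id] = count
        return count


def round_robin(servers, session_id, requests):
    """Send `requests` requests for one session to the servers in turn."""
    picker = cycle(servers)
    return [next(picker).handle(session_id) for _ in range(requests)]


# Two servers that each remember sessions privately.
stateful = round_robin([StatefulServer(), StatefulServer()], "alice", 4)

# Two servers that share one external store.
store = {}
shared = round_robin(
    [SharedNothingServer(store), SharedNothingServer(store)], "alice", 4
)

print(stateful)  # [1, 1, 2, 2]  each server sees only half of the session
print(shared)    # [1, 2, 3, 4]  any server can serve any request
```

In the first case, adding a second server breaks the session unless you pin users to a server or replicate state between servers, which is the extra complexity Rasmus mentions. In the second case, servers can be added freely, but the external store becomes the part you have to scale, which is the caveat Shiflett raises about databases, filesystems, and networks.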

It’s been my experience that because of Java’s abundance of riches when it comes to application design, many people using it concentrate too much on tuning the Java code on the application tier, instead of concentrating on all the other areas of opportunity.

Because PHP offers far fewer options for caching or passing data within an application, there is far less chance that a developer or project manager will think the scalability problem can be solved entirely there. It forces you to look at the other subsystems across your architecture, and to make sure you have the resources to do so.

I have a perfect example from work experience that I’ll share with you sometime.