Monday, May 26, 2008

NGINX... why?!

Anyone who has any relationship with Rails development has, at this point, heard of Nginx. The point of Nginx is to replace Apache, the definitive global webserver that Rails devs feel is simply too slow for their lightning-fast development framework. It's not the first time the Rails community has snubbed Apache, nor will it be the last. Those Rails devs are simply fickle folks.

So, fine, let the Rails devs frolic with their uberfast webserver... what about the rest of us mere mortals? Is Nginx a good route for you? Let me say here and now, the answer to that question is almost always a strong, resilient, and durable no. The reasons for the rejection are many, so let's start with the funny ones first and proceed to the more technical ones.

First, it behaves in inexplicable ways for different browsers. Check out this screenshot of Penny-Arcade loaded in Firefox (on the top) and Konqueror (on the bottom) at the same time.

This happened with multiple reloads (cache disabled)... it always worked with Firefox, always "failed" with Konqueror. Oh, and that "Bad Gateway" message is something you should get used to if you are thinking about deploying Nginx, because it's an all too common sight (more about that later on).

Second, the primary documentation is in Russian. Yes, Русский. From what I can gather, the primary developers are Russian, which is great... yay global open source development! But a webserver is a complicated beast, hence the great forests that are clear-cut each year to produce the necessary library of books on Apache and MS Internet Information Server. Let me be clear that when I say primary, I do mean to imply there is secondary documentation. This is secondary documentation in the same way that warning labels will list sixteen life-threatening things you could do, written in English, followed by a single warning in Spanish that translates to "Danger."

Third, Nginx does not support .htaccess files. Anyone who spends much time building custom websites knows the power of these magic little files that alter the way Apache treats a particular folder. Securing a folder with basic authentication takes two simple lines and a password file. Nginx takes a different approach, where different means stop bugging us to add .htaccess support. Instead, every directive, for every folder, regardless of its scope, must go into a master configuration file. You can split the conf file into many smaller files, but they are all loaded when the server starts and given global effect. The common approach here is to split each hosted domain into a conf file... but that only helps keep things organized, because at the end of the day, every conf file has global implications.

Third and a half, Nginx requires you to have Apache support tools lying around to do stuff. This really isn't worth a whole new point, because everyone already has Apache lying around... but let's say you wanted to create a password file for basic authentication. There is no Nginx utility to generate those handy hash values; you have to use htpasswd, available from your Apache distribution.
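
For what it's worth, the hash in a basic-auth password file can be produced without Apache installed. Below is a minimal Python sketch, assuming an Nginx build whose password file accepts RFC 2307-style {SSHA} entries (the Nginx documentation lists this scheme for newer releases; nothing in this post establishes that the version discussed here accepts it). The helper name ssha_entry and the user "alice" are made up for illustration.

```python
import base64
import hashlib
import os
from typing import Optional


def ssha_entry(user: str, password: str, salt: Optional[bytes] = None) -> str:
    """Return one 'user:{SSHA}...' line for a basic-auth password file.

    Hypothetical helper. The {SSHA} value is
    base64(sha1(password + salt) + salt).
    """
    if salt is None:
        salt = os.urandom(8)  # fresh random salt for each entry
    digest = hashlib.sha1(password.encode("utf-8") + salt).digest()
    encoded = base64.b64encode(digest + salt).decode("ascii")
    return f"{user}:{{SSHA}}{encoded}"


if __name__ == "__main__":
    # Append the printed line to the password file used by the server.
    print(ssha_entry("alice", "s3cret"))
```

Salted SHA-1 is weak by current standards, so treat this as an illustration of the file format rather than a recommendation.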

Fourth, Nginx doesn't actually do anything beyond serve static HTML and binary assets... which is to say, it doesn't run PHP or Perl or any of the other P's that you might find in the LAMP stack. What it does is take requests and proxy them to other servers that do know how to execute that code. This is great in the Rails world, which long ago decided to have Rails be its own little server that you submit requests to and get responses back from. Even under Apache, the standard approach is to run Rails as a cluster of Mongrel servers that Apache talks to via a proxy connection. In the world of PHP and Perl, this approach is somewhat counter-intuitive. Apache's mod_php loads a PHP interpreter into Apache, allowing Apache to do all the heavy lifting for you... ditto with mod_perl. Even Ruby has a mod_ruby (although it's still immature). With Nginx, everything is its own standalone server.

So, what if your PHP project needs to know something about the webserver (like the root folder, or a basic auth username)? Well, you need to know that ahead of time and set up the proxy (which you defined in that global conf file I mentioned in #3) to pass those variables to your application server, otherwise they won't be around for you to use. Better yet, what if the proxied server is down? Nginx will greet you with a handy "Bad Gateway" message and no further information. Good luck debugging the underlying server, since it really only knows how to talk in HTTP requests... perhaps you can code your own debugger with LWP.
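
To make the "Bad Gateway" behavior concrete: a reverse proxy answers 502 when it cannot get a usable response from the server behind it. The following is a rough Python sketch of that decision. The names forward and dead_backend are hypothetical, and this models the idea only; it is not how Nginx itself is written.

```python
from typing import Callable, Tuple

Response = Tuple[int, str]  # (status code, body)


def forward(call_backend: Callable[[], Response]) -> Response:
    """Relay the backend's response, or answer 502 if it is unreachable."""
    try:
        return call_backend()
    except OSError:
        # Connection refused, reset, timed out, and so on: the proxy has
        # nothing to relay, so the client only sees a generic 502.
        return 502, "Bad Gateway"


def dead_backend() -> Response:
    """Hypothetical backend that is not accepting connections."""
    raise ConnectionRefusedError("backend is down")


if __name__ == "__main__":
    print(forward(lambda: (200, "hello")))
    print(forward(dead_backend))
```

The detail of why the backend failed stays on the proxy side (in a real deployment, in its error log), which is why the browser shows so little.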

Finally, I am left with the question: why? The ostensible reason is that it's faster and can therefore handle more requests. Even if we accept that as true (*grumble, grumble*), it only accomplishes that speed by passing the buck off to other servers. When you find a non-responsive site, it's not because the static assets like images and HTML text are being served slowly... it's because the dynamic content generated by php/perl/python/ruby/whatever and the underlying database from which the data is drawn cannot keep up. Nginx suffers that same failing... while requiring just as many resources, because you now have to run a separate server for each of the languages you want to code in.

If you are developing Rails, then by all means, enjoy this flavor of the month until some new exciting technology comes along and all the little Ruby lemmings go marching off in a new direction. For everyone else writing applications that are meant to stand the test of time, stay with Apache, it hasn't let us down yet.

14 comments:

Sur said...

Though the most important reason I would be biased toward Apache over Nginx, not mentioned here, is Phusion's Passenger.

Sean Kellogg said...

modrails is an awfully cool technology, in my opinion, and I'm game to try it out. But I have doubts about whether it will catch on with the Rails community, as it is a significant departure from the current approach of running Rails apps via Mongrel clusters. If modrails catches on in the Rails world, I will certainly have to tip my hat to their ability to change directions in such a fundamental way.

Having said that, I wasn't attempting to list the benefits of apache vs. nginx, just pointing out the failings of nginx independent of the alternatives. (Though, the discussion of .htaccess files suggests I failed in that). But yes, there are a ton of things apache does that nginx doesn't...

Evan Miller said...

The "point" of Nginx is not to replace Apache. It is foremost a proxy server, useful primarily in large-scale installations. Most of your argument is attacking a straw man.

1. Penny Arcade has probably mis-configured its servers.

2. It's true that the English-language documentation was originally translated from Russian. However, the English docs have evolved over the years and are quite reliable and up-to-date.

3. Since Nginx was never designed to be an Apache replacement, no effort has been expended to support Apache configuration syntax, including .htaccess.

4. You are right that set-ups that include Nginx are generally more complicated than a basic PHP/Apache set-up. Nginx is solving a different set of problems.

4.5. Nginx is fairly easy to debug if you take advantage of its error logs.

5. Nginx does "pass the buck" to other servers. That is the point of Nginx. It is the man behind the curtain of scalable web applications.

Sean Kellogg said...

Such excitement over this post... who would have thought one of my top five posts would be about Nginx!

Anyway, Evan, I think you make some good points. If one considers Nginx as a proxy server, it makes more sense. My concern is that there are those who are looking to eke out just a bit more performance from their web server and might just turn to Nginx as an Apache replacement.

Unfortunately, if you are dealing with situations where you are mixing content from a secondary server with static content (as we are where I work), you suddenly are using Nginx as an Apache replacement. I'm not saying that's the best design, but you gotta be flexible in this business. Which, in the end, is probably my biggest issue with Nginx... it's just not as flexible as Apache.

Your Mileage May Vary :)

Cliff Wells said...

Actually I consider Nginx a perfectly acceptable (and in fact, superior) replacement for Apache.

As Evan mentioned, most of your arguments aren't valid:

1) There is extensive English documentation (I know because I helped set it up over *two years* ago).

2) The errors you mention are certainly the result of either misconfiguration or extremely poor performance/overload of the Penny Arcade backends that Nginx proxies to (Apache would probably just time out).

3) Nginx configuration is about a thousand times saner than the Apache config (I was able to convert a shared hosting box with literally dozens of specialized setups from a combination of Apache and Lighttpd in just over three days using only the Russian documentation - and I don't read Russian). Frankly all the documentation in the world wouldn't make Apache's configuration nearly as comprehensible. The reason Apache is often easier to deploy is because most distros ship Apache pre-configured to a large degree, not because Apache is easier to understand.

Finally, there is one, and only one place where Apache is a better solution than Nginx: if you need an esoteric module (such as mod_svn) to support a particular application. Nginx scales better both up *and* down (and not just by a little). It's easier to understand, has far better performance and a much, much, much smaller memory footprint.

The only times I recommend Apache over Nginx are if someone wants to run mod_svn or if they already have so much invested in Apache that switching makes little sense.

However, if you're setting up a new server or looking to eke a bit (or a lot) more performance out of an existing server, you can't go wrong with Nginx.

Cliff Wells said...

Oh, and as an aside, I absolutely despise Rails, so rest assured I'm not simply a dittohead echoing the rest of the Rails crowd ;-)

Cliff Wells said...

"The ostensible reason is that it's faster and can therefore handle more requests. Even if we accept that as true, it only accomplishes that speed by passing the buck off to other servers."

I just caught this assertion. I'm afraid you are really betraying your complete and utter lack of understanding of how Apache or Nginx do their jobs.

Nginx does exactly what Apache does in this respect. Both of them hand off processing of dynamic content to a programming language interpreter. The only difference is the protocol which they use to communicate with this other process. Apache tends to use a binary API (e.g. mod_php) whereas Nginx uses HTTP or FastCGI (which is also the method many Apache users prefer, for various reasons).

At the end of the day, Nginx is much faster and scales MUCH better not because it "passes the buck", but because it's asynchronous rather than process/thread based (and uses far less RAM and CPU to boot) and because it ditches things like .htaccess and CGI that are absolute performance killers.

Sean Kellogg said...

Well, Cliff, seems I've touched a nerve. Doesn't exactly help your credibility that you've clearly been involved with the project, but we'll let that overarching problem slide and just get right down to brass tacks. Let's go top to bottom.

Fairly certain I'm not "betraying your complete and utter lack of understanding of how Apache or Nginx do their jobs." When running PHP or Perl under Apache, the interpreter is within the Apache process itself... they share the memory and have access to the complete request cycle through the powers of mod_php and mod_perl. I don't know enough about mod_ruby (pathfinder) to understand the mechanics there, but I'm fairly certain it's the same sort of deal. Nginx, on the other hand, does not have mod_LANGUAGE support for any of the scripting languages, and thus passes stuff off to some other server. Happy to provide links about this upon request.

The memory footprint of nginx is smaller because it does less. No surprise there, and maybe exactly what you are looking for if you don't want all the bells and whistles. Speaking of bells and whistles, it's very classy that nginx doesn't ship a copy of htpasswd, so that you have to use Apache's in order to set up basic-auth password files!

Not sure what to say about the configuration being simpler... I've worked with Nginx a lot more since I first wrote that post, and some of it is very nice, I will admit. I still find the rewrite documentation poor, particularly on when something must be handled with straight equivalence and when it must be a regex. Though, the at-startup syntax sanity check is really outstanding.

On the overall documentation question, here's an interesting experiment. Google for "nginx rewrite permanent" and then Google for "apache rewrite permanent". With nginx, you don't see anything from the official documentation about how to set up a permanent redirect for at least the first four pages (gave up once I saw how to do it in a pastie!). With Apache, it's results #3 and #4, for v1.3 and v2 respectively. So, you may say that the nginx documentation is great, but I find the organization to be awful, the examples to be poor, and the Google indexing to be useless.

And finally... on the Konqueror thing, how could that possibly be?! What series of configuration options might possibly cause a server to error out on one browser and not another? I don't have an answer to that, but it's behavior I've observed with more than a few of our sites running on nginx and never before on an Apache server.

Cliff Wells said...

Yes, it's true I'm involved with Nginx: I host the English wiki and helped bootstrap the translation process. You can call it bias if you like, but I didn't undertake this rather large task because I was paid to do so, I did it because I was fed up with the complexity and bloat of Apache and found Lighttpd to be far too buggy for production use. At the very least I can claim I have a reason to defend Nginx whereas I cannot see what reason you have to attack it.

Anyway, to your points:

"When running PHP or Perl under Apache, the interpreter is within the Apache process itself... they share the memory and have access to the complete request cycle through the powers of mod_php and mod_perl".

I think you miss my point: my point is that whether PHP or Perl runs within the Apache process is irrelevant (and bad architecture, IMO, although I understand the historical reasons it was done this way). The fact remains that Apache simply passes the bulk of the request for dynamic content to another program, just as Nginx does.

If you doubt that mod_lang is a bad architecture, ask yourself why many people avoid those modules and use FastCGI to serve PHP and Perl under Apache (hint: try using mod_suexec to address security concerns when you realize that your mod_php process runs as the "apache" user).

This isn't the main reason Nginx doesn't have mod_lang but it's a good reason to not be concerned about it.

"The memory footprint of nginx is smaller because it does less."

Absolutely untrue. Unload every mod_lang, mod_svn, etc. from Apache you like and it will still be much larger than Nginx. Have it serve a request or two and it becomes larger still. The reason is that Apache is process/thread based and Nginx is async. Every request Apache serves requires a separate thread, which consumes a significant amount of RAM. Nginx, on the other hand, handles all requests in a single thread. This fundamentally different architecture is what distinguishes Nginx from Apache more than anything else (and yet you failed to note this even once). Also, once again you conflate what mod_php (aka the PHP interpreter) does with what Apache does.
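
A toy illustration of the event-driven model described above, written with Python's asyncio rather than anything Nginx-specific: many simulated connections are multiplexed onto one thread instead of each getting its own. The names handle and serve_all are hypothetical, and the short sleep stands in for waiting on network I/O.

```python
import asyncio
import threading


async def handle(conn_id: int) -> int:
    """Pretend to serve one connection; return the thread that served it."""
    await asyncio.sleep(0.01)  # stand-in for waiting on a socket
    return threading.get_ident()


async def serve_all(n: int) -> set:
    """Serve n simulated connections concurrently on one event loop."""
    thread_ids = await asyncio.gather(*(handle(i) for i in range(n)))
    return set(thread_ids)


if __name__ == "__main__":
    threads_used = asyncio.run(serve_all(1000))
    # Number of distinct threads that did the serving.
    print(len(threads_used))
```

Because waiting connections cost only a small coroutine object rather than a thread stack, memory use grows slowly with the number of open connections in this model.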

As far as your Google experiment with searching for similar terms, I find it humorous that I get the Nginx wiki as the fourth item, but the Apache docs don't show up on the first page at all (1.3 docs show up as fourth item on second page, I got tired of clicking before I got to the 2.0 docs). Getting results in a vastly different order depending on your physical location is a common phenomenon with Google.

In any case, compare Nginx's "if" and "location" syntax to Apache's RewriteCond/RewriteRule system. Even if the documentation were twice as hard to find, the software is twice as easy to use once you do (and they both use PCRE, so the underlying engine is equivalent).

"What series of configuration options might possibly cause a server to error out on one browser and not another?"

Hard to say, as your image link is broken, so I'm unable to see what the original error even was. I've certainly never had any such errors (and I've used Konqueror extensively as a test substitute for Safari). You should ask the Penny Arcade folks. Possibly they were doing some sort of browser detection that breaks Konqueror. There's no indication of whether the problem was with Nginx or with their backend software, but you *assume* it was an Nginx issue because it forwards your argument.

In any case, you didn't provide any examples from your own sites that you claim were broken, and empty speculation about someone else's site that we have no information on is a waste of time.

Anoop Alias said...

Nginx is an asynchronous web server. It uses less memory and solves the C10K problem, whereas Apache does not scale well when there is a large number of requests (even if you scale up the server with more RAM and processing power).

jim said...

I know this is an old post but I felt compelled to comment. Try running a vBulletin board with 100-150 concurrent users on a small VPS (128MB with burst to 256MB) along with about twenty mainly static sites and you will have your answer as to "Nginx... why?!". Sure, it proxies out PHP. And php-fpm is quite a bit faster than mod_php, which is what most "out of the box" Apache users go with. If you need certain functionality you can use spawn-fcgi without lighty's memory leak.

The rewrites are extremely similar to Apache's but they go in the site configuration file. Like most users, I use a "sites-enabled" directory and use an "include" line in nginx.conf. Big whoop! If .htaccess support is your main reason for staying with crapache... erm Apache, good luck with it. I doubt that you would ever see it in a high performance server like Nginx. Igor seems to favor performance.

And this is from a non-rails guy, just a guy who runs some sites.

Anonymous said...

You don't qualify as a geek I'm afraid. More of a retard really. You also have some crazy blackheads going on.

Anonymous said...

"The memory footprint of nginx is smaller because it does less"

Bozo bit.....flipped.

No, r3tard, it's because Apache uses threads and assigns a comparatively huge stack to each thread/connection.

You are just pulling horse sh1t out of thin air and asserting it like you actually have a clue, which you very clearly don't.

I suggest you read more and talk less.