[web-devel] CGI

Curt Sampson cjs at starling-software.com
Sat Feb 21 07:00:33 EST 2009


On 2009-02-21 12:23 +0100 (Sat), Johan Tibell wrote:

> On Fri, Feb 20, 2009 at 3:17 PM, Manlio Perillo
> <manlio_perillo at libero.it> wrote:
>
> > I have some doubts about the name WAI. It means "Web application
> > interface", but I think that the word "Gateway" should appear in it.
> 
> I don't have a strong attachment to the name. I'm not sure what the
> word "gateway" means in this context, probably because I haven't
> studied CGI in depth.

It's fairly meaningless. CGI stands for "common gateway interface," and
is basically:

    * a web server invoking another program,
    * with a specifically defined set of environment variables set to
      various values related to the request and the program's location
      in the filesystem,
    * communicating further information via sending it to the program's
      stdin, and
    * receiving a header and body response on the other end of the program's
      stdout, which it then may process before returning it to the HTTP
      client.
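
As a rough illustration of the four points above, here is a minimal
sketch of the program's side of that exchange in Python. The function
name handle_cgi is made up for this example; REQUEST_METHOD, PATH_INFO
and CONTENT_LENGTH are standard CGI variable names, but a real server
sets many more.

```python
def handle_cgi(environ, stdin, stdout):
    """Handle one CGI-style request (hypothetical helper, not a library API).

    environ: mapping of CGI variables set by the web server
    stdin:   binary stream carrying the request body
    stdout:  binary stream that receives the header and body
    """
    method = environ.get("REQUEST_METHOD", "GET")
    path = environ.get("PATH_INFO", "")

    # CONTENT_LENGTH says how many body bytes to read from stdin.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = stdin.read(length) if length > 0 else b""

    reply = "{} {} ({} body bytes)\n".format(method, path, len(body)).encode("utf-8")

    # Header lines, a blank line, then the body. The server may parse and
    # modify this header before returning the response to the HTTP client.
    stdout.write(b"Content-Type: text/plain\r\n")
    stdout.write(b"Content-Length: " + str(len(reply)).encode("ascii") + b"\r\n")
    stdout.write(b"\r\n")
    stdout.write(reply)
```

An actual CGI script would call this as
handle_cgi(os.environ, sys.stdin.buffer, sys.stdout.buffer).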

There are some further tricks here: if you want to avoid having the server
parse and possibly modify the header you send back, you should start your
program's name with "nph-".

It's generally not possible to reconstruct the original HTTP request
that generated the CGI request because various things in the environment
variables change based on the configuration of the server, which the
client generally has no way of knowing. Even how the various bits of the
request are converted to environment variables is not very well defined.
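
One small, concrete case of that information loss is the header-name
mapping. The sketch below assumes the usual CGI convention (upper-case
the name, turn "-" into "_", prefix "HTTP_", with Content-Type and
Content-Length carried unprefixed); the function names are invented for
illustration, and individual servers may drop or rewrite headers beyond
this.

```python
def cgi_variable_name(header_name):
    """Map an HTTP header name to the CGI variable name a server would use."""
    upper = header_name.upper().replace("-", "_")
    # These two are conventionally passed without the HTTP_ prefix.
    if upper in ("CONTENT_TYPE", "CONTENT_LENGTH"):
        return upper
    return "HTTP_" + upper


def to_environ(headers):
    """Fold a list of (name, value) header pairs into CGI-style variables.

    Later headers overwrite earlier ones with the same variable name, so
    distinct headers on the wire can collapse into a single entry.
    """
    environ = {}
    for name, value in headers:
        environ[cgi_variable_name(name)] = value
    return environ


# Two different header names as sent by a client...
sent = [("X-Request-Id", "abc"), ("x_request_id", "def")]
# ...end up in one variable; the spelling, the case, and the first value
# cannot be recovered by the program.
received = to_environ(sent)
```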

All these issues translate to FastCGI and SCGI as well, with the
addition that the "programs" don't always have names, meaning that
you can't prefix the name with "nph-" to avoid the headers in your
response being parsed and modified.

I've spent more time than I care to over the last four or five years
dealing with FastCGI on the application (i.e., opposite of web server)
side, both as a client of existing libraries and a writer of new
ones, and it's caused me much pain.

The one real advantage that FastCGI has is that it's supported across
a lot of web servers, and some have some nice special features. For
example, with lighttpd, you can avoid a lot of shoveling of data around
by returning an "x-send-file: /some/path" header and no body; lighttpd
will then efficiently use sendfile or a similar system call to send that
file across the network, avoiding a lot of interprocess I/O for large
files. (We use this in our QWeb framework for our "docroot" servlet,
which just serves files from disk, and also for servlets where we're
sending the cached copy of previously generated content.)
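
To make the shape of such a response concrete, here is a minimal sketch
in Python. The header name is the one quoted above; the function name
and the default content type are assumptions for this example, this is
not QWeb's code, and whether the server honours the header depends on
how it is configured.

```python
def sendfile_response(path, content_type="application/octet-stream"):
    """Build a header-only response asking the web server to send a file.

    Hypothetical helper: the application emits only the header, and the
    server (if configured to honour x-send-file) supplies the body itself.
    """
    headers = [
        ("Content-Type", content_type),
        ("x-send-file", path),
    ]
    head = "".join("{}: {}\r\n".format(name, value) for name, value in headers)
    # The blank line ends the header; no body bytes follow.
    return (head + "\r\n").encode("utf-8")
```

The application never opens or reads the file, so none of its contents
pass through the pipe or socket between the two processes.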

But overall, I'd be much happier with a protocol that just handed me
the raw HTTP request, and allowed me to send back a raw HTTP response.
The issue is, of course, getting the protocol implemented in various
web servers. I could, I suppose, design a new protocol and write a
C implementation for lighttpd. But then, given that I'd hope to be
talking to a Haskell application server anyway, GHC has great support
for building efficient, multiprocessing, highly concurrent servers, and
Haskell is about a hundred times nicer to program in than C, I think I'd
just write a high-performance web server in Haskell. (Except that it's
already been done, anyway.)

Oh, right, where does "gateway" come in? Nowhere, really. Perhaps
the word came from someone steeped in the days of MS-DOS BBSs (I ran
one myself in the early '90s), many of which had "doors," which were
programs to which control would be handed from the BBS software, with
the BBS software doing some tricks to take care of the I/O over the
modem, a la RCP/M. (That stood for "Remote CP/M," a system so old that
Wikipedia doesn't even have an entry for it.)

cjs
-- 
Curt Sampson       <cjs at starling-software.com>        +81 90 7737 2974
           Functional programming in all senses of the word:
                   http://www.starling-software.com

