When web servers and websites first came into being, they only served static files stored on the server. When you type a URL into a browser, it translates to a path on that server, and the file located at that path is sent back to the requester. With CGI (Common Gateway Interface) it becomes possible to execute the file rather than simply return its contents. Our mental model of a web server as a file server still works, though; the only nuance is that the file might be executed. This model is a great starting point for a rudimentary understanding of what a web server does.
For more advanced topics, however, this model falls short, and we need a somewhat more complicated one. The full URL that is sent to the web server refers to a resource rather than a file. It is up to the web server to translate that URL into some response, usually in a text-based format, sometimes binary (e.g. an image). Effectively, a web server is a mapper that maps requests to (possibly dynamic) responses.
The most basic example of such a mapping is, as we've seen, returning the file that exists at the same relative path. But Apache may also call a script to generate a response, ask another server application on the same machine (e.g. Tomcat) to generate it, or even forward the request to another machine, possibly one running another Apache instance. To the caller it seems as though all the content is hosted by this single Apache instance. It looks like a simple, single-server portal.
As I said, Apache maps resources based on the URL (and possibly additional information sent with the request, such as cookies and headers). A handler, then, is not simply a file that is executed for one URL; it is a file that is executed for a range of URLs. This makes it possible to manipulate the HTML that is served without redefining the logic for every static file, for example by adding headers and footers, or by replacing or removing content depending on security roles. It is also possible to call a handler for a path that is not backed by a file (for example, to serve a page from a database).
It makes sense to be able to configure the distinction between the two in the Apache configuration. Some scripts need the backing file to work properly (such as the footer example). Others must be called regardless, because they cannot fulfill their purpose otherwise (such as serving a page from a database). Apache makes this possible: handlers that should work without a backing file are called "virtual" handlers.
Lastly, a handler does not have to modify the response at all. Handlers can also satisfy non-functional requirements, such as logging or auditing. Note: Apache has more specific modules for most things you would want to do, so check those first. The examples in this section merely illustrate what handlers make possible.
Later in this article, I will give an example where a handler is mapped to a MIME type rather than to a URL. So what is a MIME type? When the browser sends a request to a server, it does not know in advance what format the response will have. For URLs such as www.example.com/index.html it can be obvious, but this is not always the case. URLs ending in .php or .cgi could return HTML, binary data (such as images) or plain text. Some URLs do not have a file extension at all. The server generating the response tells the browser what kind of data it is sending by setting the Content-Type header in the response. This header contains a MIME type such as text/html, text/plain or image/jpeg. These MIME types are standardized and universally understood. If you have scripts that sometimes generate HTML and sometimes binary files, then mapping handlers based on MIME types might make more sense. For more info on MIME types: https://en.wikipedia.org/wiki/MIME
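To make the Content-Type header concrete, here is a minimal sketch of how a CGI script written in shell could pick a MIME type from a file name. The function names mime_type_for and print_headers and the small extension table are my own illustration, not something Apache provides; a real setup would normally rely on Apache's own MIME configuration.

```shell
#!/bin/sh
# Hypothetical helper: choose a MIME type based on the file extension.
mime_type_for() {
    case "$1" in
        *.html|*.htm) echo "text/html" ;;
        *.txt)        echo "text/plain" ;;
        *.jpg|*.jpeg) echo "image/jpeg" ;;
        *)            echo "application/octet-stream" ;;  # generic binary fallback
    esac
}

# A CGI response starts with headers, followed by a blank line, then the body.
print_headers() {
    echo "Content-Type: $(mime_type_for "$1")"
    echo ""
}

# Example: the headers a script would send before the body of index.html.
print_headers "index.html"
```

The browser only looks at the Content-Type header, so a URL ending in .cgi can still announce text/html or image/jpeg.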
I will be using mod_actions for calling the scripts. To enable this module:
sudo a2enmod actions
Restart Apache to process the changes:
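sudo service apache2 restart
The CGI handler (https://httpd.apache.org/docs/2.4/handler.html) we define here consists of 2 parts: an action, which names the script to run, and a rule, which determines when the action is invoked. Open the site configuration:
sudo vi /etc/apache2/sites-enabled/000-default.conf
The general format for defining an action: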
Action [name] [exec] virtual?
When creating the rule, you will need the [name] to link the rule to the action. Under [exec] you specify which CGI script is executed for this action. The virtual keyword is optional and makes it possible to run handlers for URLs not backed by a file.
The following action is named hello-all and runs the hello.cgi file we created in the previous article.
Action hello-all /cgi-bin/hello.cgi
The following rule would invoke the hello-all action for all HTML pages:
AddHandler hello-all .html
There are alternative ways to specify when the handler should be invoked; please refer to the manual: https://httpd.apache.org/docs/2.4/mod/mod_actions.html#action
To process the changes:
sudo service apache2 force-reload
Try to reach the index: http://localhost/index.html
If all went well, then the output of the hello.cgi script will be returned. In other words, the index.html will NOT be rendered. The handler is responsible for returning the original file contents and I will show an example of this after some additional considerations.
The script will NOT be called for the URL http://localhost/idonotexist.html. This is because the idonotexist.html file is not present on the filesystem. To "fix" this problem:
Action hello-all /cgi-bin/hello.cgi virtual
After reloading Apache, the handler will be called for any URL ending with .html, regardless of whether it is backed by a file or not. Mind you, the index might still be available at http://localhost.
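To check which requests actually reach a handler, and what Apache tells it about them, a small debugging script can help. The sketch below assumes it is installed as the action's script in place of hello.cgi; the name describe_request is hypothetical. mod_actions passes the URL and file path of the original request in the standard CGI variables PATH_INFO and PATH_TRANSLATED.

```shell
#!/bin/sh
# Print what the handler knows about the original request as plain text.
describe_request() {
    echo "Content-Type: text/plain"
    echo ""
    # Both variables are set by Apache; fall back to a marker when missing.
    echo "PATH_INFO=${PATH_INFO:-<unset>}"
    echo "PATH_TRANSLATED=${PATH_TRANSLATED:-<unset>}"
    # Report whether the requested URL is backed by a file on disk.
    if [ -f "${PATH_TRANSLATED:-}" ]; then
        echo "backing file: present"
    else
        echo "backing file: missing"
    fi
}

describe_request
```

With the virtual keyword in place, requesting a page that does not exist should report the backing file as missing instead of producing an error page.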
To map the action to a MIME type instead of a handler name:
Action text/html /cgi-bin/hello.cgi
You don't need to specify an AddHandler rule for this one, because the mapping is already in the action declaration.
To have a handler return the original file contents, create a new script:
sudo vi /var/www/cgi-bin/serve-file.cgi
with contents:
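#!/bin/sh
# Send the response header, then a blank line to end the headers
echo "Content-type: text/html"
echo ""
# PATH_TRANSLATED holds the filesystem path of the originally requested file
cat "$PATH_TRANSLATED" | sed "s/It works/It's broken/g"
Replace the configuration with: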
Action text/html /cgi-bin/serve-file.cgi
Don't forget to:
sudo chmod +x /var/www/cgi-bin/serve-file.cgi
sudo service apache2 force-reload
And now you will see that the index page shows up again at http://localhost, but the red bar has been modified to say that it is broken.
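The footer example mentioned earlier follows the same pattern as serve-file.cgi. Below is a minimal sketch, assuming the page contains a lowercase closing body tag on a single line; the FOOTER text and the add_footer function are hypothetical names of my own. Because this handler needs the backing file, it only reports a problem when the file is missing.

```shell
#!/bin/sh
# Hypothetical footer markup; adjust to taste.
FOOTER='<p>Served through a handler</p>'

# Insert the footer just before the closing </body> tag of the input.
add_footer() {
    sed "s|</body>|${FOOTER}</body>|"
}

echo "Content-type: text/html"
echo ""
if [ -f "${PATH_TRANSLATED:-}" ]; then
    # Serve the original file with the footer added
    add_footer < "$PATH_TRANSLATED"
else
    # No backing file: this handler cannot do its job, so say so
    echo "<html><body><p>No backing file</p></body></html>"
fi
```

It would be wired up with the same Action line as above, pointing at this script instead of serve-file.cgi.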
An alternative to a handler is to pass responses through an external filter. You need to enable the ext_filter module first:
sudo a2enmod ext_filter
To run all content in your VirtualHost through a simple script:
ExtFilterDefine test mode=output cmd=/var/www/cgi-bin/test.cgi
SetOutputFilter test
The output of the shell script goes to the browser, and the original HTML (or other content) is available on the script's standard input. The filter is applied to all requests, but my quick test shows that it is only called when there is a valid response. Apache actually discourages using external filters in production for performance reasons and recommends using native compiled filters instead. Refer to the original documentation for more information.
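The test.cgi script referenced in the filter definition is not shown in this article. A minimal sketch of what such a filter could look like follows, assuming it only needs to rewrite some text; the temporary path and the replacement text are made up for trying it by hand. As far as I understand, a filter only sees the response body, so unlike the CGI scripts above it should not print a Content-type header.

```shell
#!/bin/sh
# Hypothetical location for trying the filter by hand; the configuration
# in this article expects the script at /var/www/cgi-bin/test.cgi.
FILTER="${TMPDIR:-/tmp}/test-filter.sh"

cat > "$FILTER" <<'EOF'
#!/bin/sh
# Read the response body on standard input, write the result to standard output.
sed 's/It works/It is filtered/g'
EOF
chmod +x "$FILTER"

# Simulate Apache feeding a response body to the filter.
printf '<h1>It works!</h1>\n' | sh "$FILTER"
# prints: <h1>It is filtered!</h1>
```

Testing the script with a pipe like this before hooking it into Apache makes it easier to tell script errors apart from configuration errors.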