Web server programs
CGI programs are executed by a web server program in response to a user request. Examples of web server programs include Apache (http://www.apache.org/), which runs on all major operating systems, and Internet Information Server (IIS), which runs on Microsoft Windows only. Apache is the most popular web server by a very large margin, running 63.22% of over 14 million active Internet domains, with IIS running 26.14% of active domains (figures from http://www.netcraft.com/survey for Feb 2002).
Web server programs also serve static HTML format web pages. When the web server detects that a requested URL (Uniform Resource Locator) is a CGI program, then instead of sending the program file as text to the web browser, it loads and runs the program, supplying input if there is any (see below) and sending the output of the program to the browser which requested the URL. The web server implements a set of rules to decide whether or not a requested file is a CGI program, for example that it has a file extension of .pl or .cgi and is installed in a particular folder or directory, such as one named cgi-bin.
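The kind of rule described above can be sketched in a few lines of Python. This is only an illustration, not how Apache or IIS actually make the decision: the function name looks_like_cgi, the extension list and the directory name are assumptions taken from the examples in the text, and the sketch is written for Python 3, whereas the program listings in this article use Python 2.

```python
import posixpath

# Assumed rule set, taken from the examples in the text above.
CGI_EXTENSIONS = {".pl", ".cgi"}
CGI_DIRECTORY = "cgi-bin"


def looks_like_cgi(url_path):
    """Return True if url_path has a CGI extension and lives under cgi-bin."""
    directory, filename = posixpath.split(url_path)
    extension = posixpath.splitext(filename)[1].lower()
    in_cgi_directory = CGI_DIRECTORY in directory.split("/")
    return extension in CGI_EXTENSIONS and in_cgi_directory


print(looks_like_cgi("/cgi-bin/responseform.cgi"))  # a script the server would run
print(looks_like_cgi("/docs/index.html"))           # a static page sent as-is
```

A real server reads rules like these from its configuration, so the details will differ from site to site.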
CGI program input
A CGI program can obtain input from part of a URL after a ? mark e.g:
This URL can be input by the user, linked from a web page, or generated automatically.
A CGI program can also obtain similar input posted to the program URL from a user-submitted HTML form. Here is an example <form> tag which specifies the CGI program that will process the submitted form data:
<form action="../cgi-bin/responseform.cgi" method="post">
This URL is relative to the location of the static HTML page containing the form, but absolute URLs will work just as well. The CGI program doesn't have to be on the same web site as the form that provides it with input, though it often will be.
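How a browser turns the relative action above into a full address can be checked with urljoin from the Python 3 standard library. This is a small sketch: the page address forms/mailinglist.html is a made-up assumption, and only the relative action comes from the form tag above.

```python
from urllib.parse import urljoin

# Hypothetical address of the static page that contains the form.
page_url = "http://127.0.0.1/forms/mailinglist.html"

# The relative action from the <form> tag above.
action = "../cgi-bin/responseform.cgi"

resolved = urljoin(page_url, action)
print(resolved)

# An absolute action is left unchanged, whatever page the form is on.
absolute = urljoin(page_url, "http://127.0.0.1/cgi-bin/responseform.cgi")
print(absolute)
```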
CGI program output
In the simplest case, CGI programs can start with a suitable HTTP (HyperText Transfer Protocol) header and send their output as plain text. This CGI program is about as simple as they get:
#!C:/python21/python
# Simplest Python CGI Program
print "Content-type: text/plain\n"
print "hello CGI world!"
When installed as python/simple.cgi relative to the site root (in this case http://127.0.0.1/) it displays the plain text: hello CGI world! in the browser.
There are 3 active lines of code in this program.
#!C:/python21/python
This tells a (Windows based) Apache server to interpret the rest of the script using a Python interpreter installed in c:\python21\python (note the use of Unix/Internet style forward slashes to delimit the path). The #! characters must be the first 2 characters in the script file, so that the web-server program interprets the rest of the first line as an interpreter path. This first line of the program isn't Python code: it tells the web-server that the rest of the program is Python code. On a Unix type system this line might be #!/usr/bin/python, or whatever the path of the Python interpreter is.
print "Content-type: text/plain\n"
This outputs the HTTP header stating that the rest of the output will be plain text. Normally we will be using "Content-type: text/html\n" in order to send formatted HTML output. HTTP expects the headers to be followed by a blank line, which means 2 newlines after the last header. The Python print statement outputs one newline by default (most of the time this is the only one you need), so here we have to specify an extra \n ourselves.
print "hello CGI world!"
The output of this print statement goes straight to the browser and is displayed directly in the browser window.
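The split between header and body described above can be made visible with a short sketch. It is written for Python 3 and builds the response as a string instead of printing it, and the helper name build_response is our own invention, not part of any library.

```python
def build_response(body, content_type="text/plain"):
    """Return the text a CGI program would write to standard output."""
    header = "Content-type: %s\n" % content_type
    # The second newline produces the blank line that ends the headers.
    return header + "\n" + body + "\n"


response = build_response("hello CGI world!")

# Everything before the first blank line is header, the rest is body.
headers, separator, body = response.partition("\n\n")
print(repr(headers))
print(repr(body))
```

If the extra newline were left out, the first line of the intended body would be read as part of the headers.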
Of course for something as simple as this you would probably prefer to use a static file rather than a Python CGI program. We can also get more attractive formatting by making the CGI program create its output in HTML format.
The reason for using CGI programs instead of static HTML files is that the information sent to the browser is generated when the user makes the request, using data current at that time. Our next example tells the user the local time at the server at the time the request is serviced.
#!/usr/local/bin/python
# Change top line to reflect where Python is installed on your system
# Python local time CGI Program
import time
print "Content-type: text/html\n"
print "<html><Head><Title>Hello Python CGI World</Title></Head>"
print "<Body><H1>Hello Python CGI World !</H1>"
to=time.localtime(time.time()) # time.time() parameter not needed Python >= 2.1
print "<p>The local time on this server is: %s </p>" % time.asctime(to)
print "</body></html>"
Pointing our browser at the URL for this CGI gave the output:
The local time on this server is: Mon Mar 18 08:11:38 2002
Requesting the same URL a few seconds later gave an updated time.
The Python cgi module will parse the usual kinds of input to CGI programs. This works for additional information passed within the URL and for form data.
The URL encoding approach is useful where the URL itself is automatically generated, or for test purposes. For example, a coded URL can be sent as part of a CGI-generated web page or email, so that the recipient can confirm subscription to a mailing list simply by clicking on the link. In the email case, the server sends a subscription confirmation request to the address that was requested, containing a URL with a special randomly generated code. If the confirmation message is suitably worded and the user clicks on the generated confirmation link, this establishes:
a. That the person reading email at the address receiving this URL is consenting to join the mailing list.
b. That the person who provided this address to the server using a web form or other means is likely either to be the owner of this email address or to be acting upon their request.
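A confirmation link of the kind just described could be generated along the following lines. This is a sketch under several assumptions: it is written for Python 3, the program name confirm.cgi and the parameter name code are made up, and the pending dictionary stands in for whatever storage a real application would use.

```python
import secrets
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical address of the CGI program that handles confirmations.
CONFIRM_URL = "http://127.0.0.1/cgi-bin/confirm.cgi"

# Hypothetical in-memory store mapping each code to the address it was sent to.
pending = {}


def make_confirmation_link(email):
    """Create a random code for email and return a URL containing it."""
    code = secrets.token_urlsafe(16)
    pending[code] = email
    return CONFIRM_URL + "?" + urlencode({"code": code})


def confirm(url):
    """Return the address confirmed by url, or None if the code is unknown."""
    query = parse_qs(urlsplit(url).query)
    code = query.get("code", [""])[0]
    return pending.pop(code, None)


link = make_confirmation_link("someone@example.com")
print(link)
```

Because each code is removed from the store once it has been used, following the same link a second time confirms nothing.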
The additional information part of the URL is much used in search engine queries and appears after a question mark (?) within the URL, so this is sometimes called a "query string". This consists of name=value pairs separated from each other using ampersands (&).
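The structure of a query string can be seen by taking one apart. The sketch below is written for Python 3 and uses urllib.parse from the standard library. The URL, field names and values are made up for illustration.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical URL; the names and values are made up for illustration.
url = "http://127.0.0.1/cgi-bin/inputfields.cgi?Name=Ann+Other&mailing_list=on"

# Everything after the question mark is the query string.
query_string = urlsplit(url).query
print(query_string)

# Splitting by hand shows the name=value pairs separated by ampersands.
raw_pairs = [pair.split("=", 1) for pair in query_string.split("&")]
print(raw_pairs)

# parse_qsl does the same split and also decodes '+' and %XX escapes.
decoded_pairs = parse_qsl(query_string)
print(decoded_pairs)
```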
The following program, inputfields.cgi, uses the FieldStorage class within the cgi module.
#!/usr/local/bin/python
# Python CGI Program to get URL or Form data
import cgi
input=cgi.FieldStorage()
print "Content-type: text/html\n"
print "<html><Head><Title>CGI Input Fields</Title></Head>"
print "<Body><H1> CGI Input Fields</H1>"
print "<ul>"
for key in input.keys():
    print "<li>%s: %s</li>" % (key,input[key].value)
print "</ul>"
print "</body></html>"
This program is simple and works well enough for now, but we will need to develop it further to handle the situation where a key is used to access more than one value, which can happen with multiple selection form dialog boxes. Note that we need to use the value attribute of the object stored in the dictionary in which the cgi.FieldStorage class stores the form response. This value attribute will be a string if only one value for the relevant key was available.
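One way of coping with a key that has several values is sketched below. It does not use the cgi module at all: it is written for Python 3, where recent releases no longer include that module, and uses parse_qs from urllib.parse, which always returns a list of values for each key. The query string and the helper name field_items are made up for illustration.

```python
from html import escape
from urllib.parse import parse_qs

# Hypothetical query string in which the key "topic" appears twice.
query_string = "Name=Ann&topic=python&topic=cgi"

# parse_qs always maps each key to a list of values.
fields = parse_qs(query_string)


def field_items(fields):
    """Return one HTML list item per key, joining multiple values."""
    items = []
    for key in sorted(fields):
        values = ", ".join(escape(value) for value in fields[key])
        items.append("<li>%s: %s</li>" % (escape(key), values))
    return items


for item in field_items(fields):
    print(item)
```

Escaping the keys and values before they go into the page is an addition of ours. It keeps any HTML typed into a field from being interpreted by the browser.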
When called using the URL:
This program displayed the following output as a HTML list:
CGI Form Fields
The same CGI program can be made to give the same output by posting the data using an HTML form. This approach is easier than hand-crafting URLs for applications requiring the end user to input the data to send to the CGI program.
Here is the form:
The following HTML code was used for this form:
<HTML>
<HEAD>
<TITLE>test form</TITLE>
</HEAD>
<BODY bgcolor="#ffffff">
<H1>Test Form</H1>
<form action="inputfields.cgi" method="post">
<p>Please enter your name. * <INPUT TYPE="text" NAME="Name" SIZE="25" MAXLENGTH="40"></p>
<p>Please enter your email. * <INPUT TYPE="text" NAME="Email" SIZE="25" MAXLENGTH="40"></p>
<p><INPUT TYPE="checkbox" NAME="mailing_list">Please include me on your mailing list</p>
<p><INPUT TYPE="submit"></p>
</form>
</BODY>
</HTML>
and the following program output was obtained:
CGI Input Fields
The first 2 fields were provided by the text form fields. The mailing_list: on entry was provided by the <INPUT TYPE="checkbox" NAME="mailing_list"> form field.
These 2 approaches to handling CGI data input using the form tag method="post" attribute and URL encoding are complementary.
In a real mailing list application we would need to confirm that the address input by the user was correct. If a mischievous user submitted someone else's address, or someone mistyped their own address, the wrong person could be sent mail through the mailing list and be annoyed by it.
To confirm that an email address is correct we will get our CGI program to send a message to the mail address, asking the owner of the address to confirm the subscription or apologising for the error and asking them to ignore the message.
To keep this as simple as possible for now, we will obtain the confirmations using a manual exchange of email. For this, the mailing list owner needs to be emailed the request from the web submission form.
Sending mail from a Python program can be achieved using the smtplib module. Here SMTP stands for Simple Mail Transfer Protocol, which is the protocol used to relay email over the Internet.
Here is a simple (non CGI) Python program that sends an email:
import smtplib
fromaddress='webmaster@some_host.net' # make this an address to reply to
toaddress='email@example.com' # change this to your own address
message="Subject: a message from Python smtplib\n\nHello mail world!"
server = smtplib.SMTP('localhost') # address of SMTP server
server.set_debuglevel(1) # get mail server debug messages
server.sendmail(fromaddress, toaddress, message) # send the message
server.quit() # close mail server connection
Note the use of 2 newlines (\n\n) between the Subject: header and the body of the mail message. You need a blank line after the message headers, so mail client programs can tell where the mail header ends and the message body begins. You could alternatively add ordinary strings together to construct the message, and put a pair of \n newlines between the headers (e.g. the From: and Subject: lines) and the message body itself.
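As an alternative to building the string by hand, the Python 3 standard library can lay the message out for us. The sketch below only constructs and prints a message and sends nothing. The addresses are placeholders, and the email.message module is our choice of tool, not something the script above uses.

```python
from email.message import EmailMessage

# Placeholder addresses; replace them with addresses you are entitled to use.
message = EmailMessage()
message["From"] = "webmaster@example.com"
message["To"] = "listowner@example.com"
message["Subject"] = "a message from Python smtplib"
message.set_content("Hello mail world!")

text = message.as_string()
print(text)

# The first blank line separates the headers from the body.
headers, separator, body = text.partition("\n\n")
```

The printed text contains a few extra headers added by the library, but the blank line between headers and body is in the same place as in the hand-built string.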
Sending debug messages to the console or Python command line is useful for this test example, so we can see whether the SMTP server is handling our message correctly. In a real application we would turn this debug output off once we know that the outgoing SMTP server is accepting our mail correctly.
The above script only works if the localhost (address: 127.0.0.1) happens to have an SMTP mail server able to relay outgoing messages running on it. This works on my PC at home. However, it won't work on yours unless you have installed and configured an SMTP mail server to run on it. If you haven't, you will need to replace this address with either the numeric Internet address or, better, the domain name of your Internet Service Provider's outgoing mail server. Your Internet service provider or network administrator should be able to give you the required address.
Having demonstrated how to generate and send mail automatically, some do's and don'ts must be stated, so you know how to avoid annoying other mail users:
a. Don't send email to people who have not given you their consent to receive these messages. If you do this, your Internet account is likely to be cancelled.
b. Do obtain their positive consent before you put someone onto a mailing list. To do this you can send a single message to a CGI user who has given details on a form saying they want to go onto a mailing list. This can contain an acceptance code for addition to a mailing list which they have to return to indicate consent. Having a confirmation step protects against a mistyped address or malicious use.
c. Don't send forged email. It is easy to send messages with any made-up outgoing address. However, you should only send such messages to your own mail addresses or to people who know in advance whom the mail is coming from, to educate yourself or them about how easy this is in practice. Otherwise, only use outgoing addresses which you are entitled to use and to which replies can find you. In many countries forging letters with intent to deceive their recipients is a criminal offence.
d. When setting up automated mail facilities of any kind do give some thought to the effect of mail loops. Consider what could happen if you are a member of a mailing list and set up a badly-configured automatic holiday mail responder which replies to the list posting address that you are temporarily away from your office. Whenever a message from this list is received a message will be sent back out to the list which the list will send back to you, and which your program will reply to etc. until someone breaks this loop. Mail loops like this could subject every other list member to thousands of unwanted messages before someone finds out who you are and throws you off this list.
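One defence against the mail loop described in the last point is for an automatic responder to look at the headers of each incoming message before replying. The sketch below, written for Python 3, shows the idea. The function name should_auto_reply and the choice of headers to test are our own assumptions and not a complete or authoritative rule set.

```python
from email import message_from_string


def should_auto_reply(raw_message):
    """Return False for messages that look automated or list-delivered."""
    message = message_from_string(raw_message)
    # Auto-Submitted is defined in RFC 3834; any value except "no" marks
    # a message that was itself generated automatically.
    if message.get("Auto-Submitted", "no").strip().lower() != "no":
        return False
    # Precedence is a widely used convention for bulk and list mail.
    if message.get("Precedence", "").strip().lower() in ("bulk", "list", "junk"):
        return False
    # List-Id is added by most mailing list software.
    if message.get("List-Id") is not None:
        return False
    return True


personal = "From: friend@example.com\nSubject: lunch?\n\nAre you free today?"
from_list = ("From: member@example.com\nPrecedence: list\n"
             "Subject: list posting\n\nHello everyone.")

print(should_auto_reply(personal))
print(should_auto_reply(from_list))
```

A careful responder would also remember whom it has already answered and reply to each sender only once in a given period.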
We will now make a CGI program process the data received from our mailing list submission form, and send it in an email to the list owner. We can do this by combining the form processing program with the email sending program:
#!/usr/local/bin/python
# Send Mailing list request data to listowner
import cgi,smtplib

# get details from form
form=cgi.FieldStorage()

def html_header():
    # print HTML header to browser
    print "Content-type: text/html\n"
    print "<html><Head><Title>CGI Form Submission</Title></Head>"
    print "<Body><H1> CGI Form Submission</H1>"

def make_message():
    # Create message from form data
    fromaddress='firstname.lastname@example.org' # give an address for a human reply
    toaddress='email@example.com' # the address form data is sent to
    message="From: %s\n" % fromaddress # one \n ends this header line
    message+="Subject: Mailing list subscription request\n\n" # extra \n ends the headers
    message+="The following form submission was received:\n"
    for key in form.keys():
        record= "%s: %s\n" % (key,form[key].value)
        message+=record
    return (fromaddress,toaddress,message)

def send_mail(fromaddress,toaddress,message):
    # send message to listowner
    server = smtplib.SMTP('localhost') # create server object
    server.sendmail(fromaddress, toaddress, message) # send message
    server.quit() # close mail server connection

html_header()
(fromaddress,toaddress,message)=make_message()
send_mail(fromaddress,toaddress,message)
# thank user for input
print """<p>Thanks for your request. Your details have been mailed to the list owner.</p>"""
print "</body></html>" # end html
This works and is useful and simple, but it is not very robust. For example, if the user accidentally clicks the form submit button before entering valid data, invalid mail will be sent. This application could also send almost any email to the destination address if the CGI program is given data encoded as part of the URL. In a more practical form processing application we will want to check the validity of the data input by the user as much as possible and either give the user a valid confirmation through the browser or an error response asking them to resubmit the data. We should only send email if the form data validates correctly.
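The kind of checking meant here might start out like the following sketch. It is written for Python 3 and works on a plain dictionary of field values, so it can be tried without a web server. The function name validate_submission, the error messages and the deliberately loose address pattern are all assumptions of ours. The field names Name and Email are those of the test form shown earlier.

```python
import re

# Deliberately loose pattern: something@something.something, no whitespace.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_submission(fields):
    """Return a list of error messages; an empty list means the data is usable."""
    errors = []
    name = fields.get("Name", "").strip()
    email = fields.get("Email", "").strip()
    if not name:
        errors.append("Please enter your name.")
    if not EMAIL_PATTERN.match(email):
        errors.append("Please enter a valid email address.")
    # Reject embedded newlines, which have no place in a one-line form field.
    for key, value in fields.items():
        if "\n" in value or "\r" in value:
            errors.append("Field %s must be a single line." % key)
    return errors


print(validate_submission({"Name": "Ann", "Email": "ann@example.com"}))
print(validate_submission({"Name": "", "Email": "not-an-address"}))
```

The CGI program would then send mail only when the returned list is empty, and otherwise show the messages to the user and ask for the form to be resubmitted. A pattern match cannot prove that an address exists, which is why the confirmation step is still needed.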
The following program can be used to obtain the environment variables which the web-server program makes available to a CGI program. Some of these variables will be the same as if you run the program interactively, while others will be special to the CGI environment.
#!/usr/local/bin/python
# Python CGI Program to get host environment
import os
print "Content-type: text/html\n"
print "<html><Head><Title>CGI Host Environment</Title></Head>"
print "<Body><H1> CGI Host Environment </H1>"
env=os.environ
print "<ul>"
for key in env.keys():
    print "<li>%s: %s</li>" % (key,env[key])
print "</ul>"
print "</body></html>"
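To pick out only the variables that are special to the CGI environment, the program can look for the names defined by the CGI specification. The sketch below is written for Python 3 and checks for a handful of them. The selection of names and the sample environment are our own, and a real server will usually set more variables than these.

```python
import os

# A few of the meta-variables defined by the CGI specification (RFC 3875).
CGI_VARIABLES = ("REQUEST_METHOD", "QUERY_STRING", "CONTENT_TYPE",
                 "CONTENT_LENGTH", "SCRIPT_NAME", "REMOTE_ADDR")


def cgi_variables(environ=os.environ):
    """Return the CGI-specific variables that are present in environ."""
    return {name: environ[name] for name in CGI_VARIABLES if name in environ}


# Hypothetical environment, as a web server might supply it for a GET request.
sample = {"PATH": "/usr/bin", "REQUEST_METHOD": "GET",
          "QUERY_STRING": "Name=Ann&mailing_list=on"}

print(cgi_variables(sample))
print(cgi_variables())  # usually empty when run from the command line
```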