Assignment 4

CS3357b Winter 2009

Short description: Write a web server.
Due date: Thursday, April 2, 2009 at 11:59 p.m.
Value towards final grade: 10%

You are going to write a minimal web server. This web server will not strictly conform to the RFC (that would be too much work), but will implement enough of the RFC that it can communicate with modern web browsers.

Brief introduction to HTTP

We'll be discussing HTTP in class, but it's worth describing here if you want to get a jump on it. HTTP is an application-level protocol. Its primary purpose is to transfer "Hypertext" (HTML) from a server to a client, though in actuality it is used to transfer all sorts of files, not just HTML. In its simplest form, an HTTP server reads a static file from disk and transfers that static file over the network. Fully implemented HTTP servers do a lot more (such as generating content dynamically), but your web server will deal only with serving static content.

A simple HTTP connection consists of the client (browser) connecting to the server, making a request for a file, the server serving that file to the client, then closing the connection. Fully implemented HTTP servers can be more complex (for instance allowing for connections to be "kept alive"), but your web server will not be dealing with any of that nonsense.

HTML

You aren't really required to know HTML for this assignment (since, strictly speaking, HTTP isn't required to deal with HTML), but it may be handy if you want pretty looking testing data/error messages :). Here is a bare bones HTML file:

<html>
<head>
<title>Your file was not found!</title>
</head>
<body>
<h1>404 file not found</h1>
<p>The file you requested was not found. You may want to check if 
you're an idiot.</p>
</body>
</html>

The above is an example of a string you may wish to hardcode into your server to handle requests for files which don't exist. An HTML file consists of an opening <html> tag, content, then a closing </html> tag. Inside, the file is broken into two parts: the head (which, for a simple file, has nothing but the title) and a body (which is where all the content of the document is). If you want to know more about HTML, you can ask me, but you don't really need to know much about it for this assignment.

Port assignments

HTTP (Hypertext Transfer Protocol, used for the WWW) properly runs on port 80. For security reasons, non-privileged users (that means you!) are prohibited from providing servers on "standard" ports (port numbers less than 1024). To keep you guys from stepping on each other's toes, please use the port assignments used for assignment 3.

Note that the port number should not be hardcoded into your server. Your server should take a command-line argument specifying which port to use. These port numbers are what you should use during testing.

Surname            Port range
Antrobus           6000–6009
Beaton             6010–6019
Blay               6020–6029
Bouteiller         6030–6039
Brookfield         6040–6049
Cruise             6050–6059
Davison            6060–6069
De Angelis         6070–6079
Dizazzo            6080–6089
E                  6090–6099
Eggleton           6100–6109
Eidt               6110–6119
Eineke             6120–6129
Ewer               6130–6139
Favaro             6140–6149
Ferguson           6150–6159
Fernihough         6160–6169
Foster             6170–6179
Galloway           6180–6189
Gao                6190–6199
Holik              6200–6209
Kalawon            6210–6219
Kapp               6220–6229
Lepp               6230–6239
Litvinov           6240–6249
Loh                6250–6259
Lu                 6260–6269
Maloney-Chumney    6270–6279
Niu                6280–6289
Rao                6290–6299
Rumas              6300–6309
Sham               6310–6319
Simpson            6320–6329
Sinnamon           6330–6339
Sisco              6340–6349
Smith              6350–6359
Tsang              6360–6369
Tsotsos            6370–6379
Wong               6380–6389
Zhang              6390–6399

Server specifications

Your server must take in one command-line argument, a number representing which port to listen on. It is not required to produce any output to the screen, but it can (note: it would be extremely useful for your server to print stuff to stdout during development).
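As a rough illustration of reading the port from the command line, here is a minimal sketch. The function name parse_port and the choice to reject anything outside 1 to 65535 are assumptions made for this example, not part of the assignment.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: convert a command-line argument to a port number.
 * Returns the port (1..65535) on success, or -1 if the argument is not
 * a valid port. */
int parse_port(const char *arg)
{
    char *end;
    long value;

    if (arg == NULL || *arg == '\0')
        return -1;

    errno = 0;
    value = strtol(arg, &end, 10);

    /* Reject overflow, trailing junk ("80abc"), and out-of-range values. */
    if (errno != 0 || *end != '\0' || value < 1 || value > 65535)
        return -1;

    return (int)value;
}
```

In main you would typically check that argc is 2, call parse_port(argv[1]), and print a usage message and exit if it returns -1.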

Your server is only required to accept one connection at a time (i.e., you don't have to use poll anymore, booya!). It should accept a connection from a web browser, handle that connection, close that connection, then go back to the top of its infinite loop to handle more connections.
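The accept/handle/close loop described above might be sketched as follows. The names make_listener and serve_forever, the IPv4-only socket, the backlog of 5, and the use of SO_REUSEADDR are all assumptions chosen for illustration; your own structure may differ.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket listening on the given port.
 * Returns the listening descriptor, or -1 on failure. */
int make_listener(int port)
{
    struct sockaddr_in addr;
    int yes = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    /* Assumption: allow rebinding right after a restart during testing. */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons((unsigned short)port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* One connection at a time: accept, handle, close, repeat.
 * `handle` is a hypothetical callback that reads the request and
 * sends the response on the client socket. */
void serve_forever(int listener, void (*handle)(int client))
{
    for (;;) {
        int client = accept(listener, NULL, NULL);
        if (client < 0)
            continue;   /* accept failed: go back and try again */
        handle(client);
        close(client);
    }
}
```

Because each connection is fully handled before the next accept call, no poll or select is needed.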

Upon accepting a connection, the server can immediately accept a request from the web browser (web browsers are designed to make a request immediately upon connecting). Your web server is only required to respond to GET requests (requests for files). The request will look like:

GET /path/to/a/file HTTP/1.1
User-agent: Mozilla/blah blah
Accept: blah blah

Your web server need only look at the first line since that's all it needs to care about. Every line in HTTP is terminated with a carriage return followed by a newline ("\r\n" in C). You should read in the first line (so you know which file you're going to give the browser); you should then read (and discard) until you see "\r\n\r\n". "\r\n\r\n" (a blank line) signifies the end of a message in HTTP.
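To make the two steps above concrete (pull the path out of the first line, then wait for the blank line), here is a small sketch. Both function names are hypothetical, and it assumes the request text has already been read into a NUL-terminated buffer.

```c
#include <string.h>

/* Hypothetical helper: returns 1 once the buffered request text contains
 * the blank line ("\r\n\r\n") that ends the request, 0 otherwise. */
int request_complete(const char *buf)
{
    return strstr(buf, "\r\n\r\n") != NULL;
}

/* Hypothetical helper: copy the path out of a request line such as
 * "GET /index.html HTTP/1.1\r\n". Returns 0 on success, -1 if the
 * request is not a GET or the path does not fit in `path`. */
int parse_get_path(const char *request, char *path, size_t path_size)
{
    const char *start;
    size_t len;

    if (strncmp(request, "GET ", 4) != 0)
        return -1;

    start = request + 4;

    /* The path runs up to the next space; stop at the end of the
     * first line so later header lines are never scanned. */
    len = strcspn(start, " \r\n");
    if (len == 0 || start[len] != ' ' || len >= path_size)
        return -1;

    memcpy(path, start, len);
    path[len] = '\0';
    return 0;
}
```

A server loop would keep reading from the socket and appending to the buffer until request_complete returns 1, then call parse_get_path on the buffer.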

After getting a GET command, your server should try to open path/to/a/file on the local filesystem. Note that the file you read must be relative to your web server's working directory and thus you should strip off the leading / character. (Note: stripping off leading characters is extremely efficient in C if you use pointer arithmetic. If filename is pointing to "/path/to/a/file", then filename + 1 is pointing to "path/to/a/file"!)
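The pointer-arithmetic trick can be written as a one-line helper. The name relative_path is made up for this sketch; it also tolerates a path with no leading slash, which is an assumption beyond what the assignment asks for.

```c
/* Hypothetical helper: skip the leading '/' so that "/path/to/a/file"
 * becomes "path/to/a/file". No copying happens; the result points into
 * the same string, one character further along. */
const char *relative_path(const char *url_path)
{
    return (url_path[0] == '/') ? url_path + 1 : url_path;
}
```

The returned pointer can be passed straight to fopen or open, and it stays valid only as long as the original string does.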

If you can successfully open the file requested, you must send the client "HTTP/1.1 200 OK\r\n\r\n" (a header indicating success, followed by a blank line) followed by the contents of the file. You should then close the socket (and the file).
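One way to send the success header followed by the file contents is sketched below. The names send_all and send_file and the 4096-byte chunk size are assumptions for illustration; the sketch uses write, which works on sockets as well as other descriptors, and retries when fewer bytes are written than requested.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes, retrying on short
 * writes. Returns 0 on success, -1 on error or if no progress is made. */
int send_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: send "HTTP/1.1 200 OK" plus a blank line, then
 * the contents of an already opened file. Returns 0 on success. */
int send_file(int fd, FILE *fp)
{
    static const char header[] = "HTTP/1.1 200 OK\r\n\r\n";
    char chunk[4096];
    size_t n;

    /* sizeof header - 1 leaves out the terminating NUL byte. */
    if (send_all(fd, header, sizeof header - 1) < 0)
        return -1;

    /* fread returns 0 at end of file (or on error), ending the loop. */
    while ((n = fread(chunk, 1, sizeof chunk, fp)) > 0) {
        if (send_all(fd, chunk, n) < 0)
            return -1;
    }
    return ferror(fp) ? -1 : 0;
}
```

Opening the file in binary mode ("rb") and sending it in chunks means images and other non-text files go through unchanged, and the whole file never has to fit in memory.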

If you cannot successfully open the file requested, you must check the value of errno to determine if it was due to the file not existing or due to insufficient permissions. If the file does not exist, you must send "HTTP/1.1 404 File not found\r\n\r\n" followed by a (optionally HTML-formatted) short human-readable message (to be displayed in the browser) indicating the file was not found. If the web server has insufficient permissions to read the file, you must send "HTTP/1.1 403 Permission denied\r\n\r\n" followed by a short human-readable message indicating permission was denied. You must then close the socket to the web browser.
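The errno check might look something like this sketch. The function name error_status_line is hypothetical, and it assumes ENOENT and EACCES are the errno values that correspond to "file does not exist" and "insufficient permissions". Returning NULL for any other error is a choice made here, since the assignment text does not say what to send in that case.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper: map the errno left by a failed open to the
 * status line the assignment asks for. Returns NULL for errors the
 * assignment text does not cover. */
const char *error_status_line(int err)
{
    switch (err) {
    case ENOENT:    /* no such file or directory */
        return "HTTP/1.1 404 File not found\r\n\r\n";
    case EACCES:    /* permission denied */
        return "HTTP/1.1 403 Permission denied\r\n\r\n";
    default:
        return NULL;
    }
}
```

It is safest to copy errno into a local variable immediately after the failed open, because later library calls (even printf) may overwrite it.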

Testing

To test correct operation of your server, you can fire up a web browser (such as Firefox). Supposing that your web server is running on port 9001 and you wish to request the file index.html, you would type the address http://obelix.gaul.csd.uwo.ca:9001/index.html into your web browser.

Further specifications

Commentary

Bonuses: up to an extra 20%

First of all, you probably won't get 20% bonus unless you really go nuts with it :). Getting 10% bonus is pretty doable, though.

If you want bonus marks, mention why you think you deserve it in your README.TXT. You can do whatever you like (that you think deserves bonus marks), but here are a few ideas off the top of my head:

What to hand in

Electronic submission

All files that are necessary for determining the proper working of your program should be submitted electronically. They should be put in a directory called asn4 and can then be submitted on GAUL via the command:
submit cs3357 asn4

You should submit:

Paper submission

In an unsealed 9" by 12" envelope, submit, in the CS3357 locker: