Description
Overview:
MyWebServer Checklist
Firefox Browser tools (Quick: Ctrl-Shift-E to raise console. Network / Inspector tabs | drag top up for larger console window.)
All MyWebserver programs MUST communicate with the Firefox browser.
In this program you will follow through the steps of capturing the HTTP stream between existing clients and servers, and write a web server that supports this same protocol. It builds on the JokeServer, an application that does much of the same work. While the text of the assignment is quite long, the application itself is quite straightforward, and you might be surprised at how easily it can be written.
There are four+ phases in the development process:
Capture the HTTP protocol first-hand by developing some hacking / debugging skills (hacking in the good sense).
Return simple, static files on request from a browser client.
Return dynamically created HTML (build a directory HTML page dynamically)
Accept FORM input from the user and do back-end processing on the server to return computed values in (simple!) dynamically-created HTML.
Add features of your own choosing, if you like.
See the MyWebServer Tips file for some suggestions once you get coding.
Run at port 2540 in the server directory!
In all cases the following specifications take precedence: The web server must run at port 2540 (http://localhost:2540). It must, by default, serve files from the directory in which the web server is started, including dog.txt and cat.html. The source code should be contained in a single, stand-alone file named MyWebServer.java, ready to compile and run. Subdirectories should be recursively traversed from the default directory in which the server is started.
Grading procedure:
Run our various plagiarism checkers on your submission.
Extract your zip file into a directory, and run a script file that:
Executes > javac MyWebServer.java
Populates the new directory with .txt files, .html files and .java files such as dog.txt, cat.html, MyWebServer.java and the file addnums.html (with an action statement that points to port 2540 on localhost), then creates subdirectories and populates those with .txt files and .html files.
Executes “> java MyWebServer” to start your webserver at port 2540.
In Firefox, read your directory listing for the directory where the server is running, using port 2540.
Select checklist-mywebserver.html from your listing and read it.
Browse the .txt .java (treated like .txt) and .html files with which we have populated your directory.
Select the addnums.html file and submit data through it.
Select http-streams.txt and read it.
Select serverlog.txt and read it.
Select MyWebServer.java, read your source code, and look at the comments. Note: you should display .java files the same as .txt files by sending the data as text/plain.
Navigate to the subdirectories and read .txt, .java and .html files there.
Special Security Note:
I expect that you will find that in its most basic form this is not a particularly difficult assignment. If so, you will soon have a viable, running webserver of your own creation. If you are developing on a machine that is also connected to the Internet this means that you might well expose all of the files on your local machine (or any remote machine where you might be running) to evil hackers from around the world who are anxious to steal information from your files. In the worst case this information would allow them write access to your disk, and/or put financial/personal information in their hands. So—be careful. Hard-code into your server that you only return files from your root server directory of unimportant files, keep your firewall on, etc. Be careful about the “../..” form of URLs, which would allow someone to retrieve files from above your server’s directory. For particularly sensitive machines you can always simply unplug your Internet connection while running your server.
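One way to guard against the “../..” form of URLs is to resolve every requested path against the server root and refuse anything that ends up outside it. The following is a minimal sketch under that assumption, not a complete security measure (it does not follow symbolic links, for example). The class and method names are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class SafePath {
    /**
     * Resolves a URL path against the server root. Returns null when the
     * normalized result would fall outside the root, as happens with a
     * request such as "/../../secret.txt".
     */
    static Path resolveInsideRoot(Path root, String urlPath) {
        Path base = root.toAbsolutePath().normalize();
        // Strip leading slashes so the URL path is treated as relative to the root.
        String relative = urlPath.replaceFirst("^/+", "");
        Path candidate = base.resolve(relative).normalize();
        return candidate.startsWith(base) ? candidate : null;
    }

    public static void main(String[] args) {
        Path root = Paths.get("");  // the directory in which the server was started
        System.out.println(resolveInsideRoot(root, "/dog.txt"));
        System.out.println(resolveInsideRoot(root, "/../../secret.txt"));
    }
}
```

A server using this sketch would answer with an error page whenever the method returns null, rather than opening the file.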
Server Directories
For this assignment your server must serve files from the directory where the server is started. Place all of your submission files in this same directory.
Administration:
Submission files: MyWebServer.java, http-streams.txt, serverlog.txt, checklist-mywebserver.html You MUST use these exact names.
Copy the checklist for this programming assignment. Fill in the blanks. Update it as you make progress. NEVER change no to yes, unless you have completed the work. Turn it in to D2L along with your assignment.
Zip your files into one flat directory (no subdirectories!) and submit to D2L. Verify that your submission has not been corrupted.
Concatenate MyWebServer.java, http-streams.txt, serverlog.txt into a single text file and submit to MyWebserverTII at D2L
“javac *.java” must work to compile your source code.
Make sure that you are familiar with the assignment submission rules (see assignment one, which covers this in detail). Programs that do not precisely conform to the rules will not be graded. Please do not ask for an exception to this policy.
Your webserver must, by default, serve directories and files from the directory in which it runs so that we can test it. If you also want to implement something more sophisticated, such as a default webserver directory, then pass a flag as an argument to your webserver, but keep the default as the current directory.
Refer to the InetServer PDF document, and the lecture, along with your JokeServer if you have completed it, for the basic program on which you build. Most of you will have completed this assignment, and extended it, well in advance of the MyWebServer program.
Capturing HTTP:
Goal: Be hackers in the good sense… See what a Web browser, and a webserver are saying to one another for simple browser requests, so that you can later copy that functionality into your own server program.
Note that you can use WireShark (see the labs) to capture these streams, as an alternative to the hacking methods that follow. You probably can also capture the streaming data directly in the Firefox Browser (search on inspection tools). Any method is valid.
IF YOU WANT TO DO (PART OF) THIS YOURSELF USING JAVA:
Use the given MyListener.java code, based on Inet. Modify, and simplify, the code as desired so that it runs at port 2540 and simply displays everything sent to it on the console, optionally writing it to a log file as well. If you want, have it send back a valid text/plain response to the client, acknowledging receipt of the “request” (but note that this is just some minor elegance, not really needed).
That is, if some simple client were to send the message, “ABC Hello there in Server land! ” then the server would display the message “ABC Hello there in Server land! ” on the server console, and optionally might send some message such as “Got your request” back to the client. (If you don’t return a message to a browser, the browser will just hang, but we don’t care.)
You now have a simple “listener” program which echoes all input on the server console.
If you want to be fancy, your MyListener program can, in addition to the console display, also send all of the information back to the client as HTML-formatted (or plain text) data. This is not required but could be generally useful as an echo-server showing the full format of requests. Note that you will have to send back the correct MIME type for HTML: “Content-Type: text/html [cr/lf] [cr/lf]” (see below).
Start MyListener and connect to it with Firefox as follows: Make valid webserver requests of MyListener by entering URLS such as http://localhost:2540/dog.txt, and http://localhost:2540/cat.html. Notice, and record, what the browser sends your MyListener program in each case (it is displayed on the server console). This is the HTTP stream that the browser sends when it is requesting files from a web server. You have now hacked it.
Capture the console output from MyListener into some file as well (or simply copy it from the console window and paste into a file), for submission as part of the assignment.
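The listener described above can be sketched roughly as follows. This is not the given MyListener.java, only a minimal single-threaded illustration under the assumptions stated in the text: it listens at port 2540, echoes each request line to the console until the blank line that ends the headers, and sends back a small text/plain acknowledgement. The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;

final class ListenerSketch {
    /** Copies request lines to the log until the blank line ending the headers; returns the line count. */
    static int echoRequest(BufferedReader in, PrintStream log) {
        int count = 0;
        try {
            String line;
            while ((line = in.readLine()) != null && !line.isEmpty()) {
                log.println(line);
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    /** Accepts connections one at a time and echoes each request on the console. */
    static void serve(int port) throws IOException {
        try (ServerSocket server = new ServerSocket(port)) {
            System.out.println("Listener sketch running at " + port + ".");
            while (true) {
                try (Socket sock = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(sock.getInputStream()));
                     PrintStream out = new PrintStream(sock.getOutputStream())) {
                    echoRequest(in, System.out);
                    // Optional acknowledgement so that the browser does not hang.
                    String body = "Got your request\r\n";
                    out.print("HTTP/1.1 200 OK\r\n"
                            + "Content-Length: " + body.length() + "\r\n"
                            + "Content-Type: text/plain\r\n\r\n"
                            + body);
                    out.flush();
                }
            }
        }
    }
}
```

Calling ListenerSketch.serve(2540) from a main method and then browsing to http://localhost:2540/dog.txt should print the browser's request on the console.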
For example, following the above procedure, while running my listener at port 2540, I get the following information for a request of dog.txt in the root web server directory.
C:\dp\435\java>java MyListener
Clark Elliott’s Port listener running at 2540.
GET /dog.txt HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, application/x-shockwave-flash, */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)
Host: localhost:2540
Connection: Keep-Alive
(Note: you may wish to experiment with “Connection: close” with your
webserver if you are having buffering problems.)
Put this captured output into your http-streams.txt file for submission with the assignment. Copy and paste from the console is fine. We ONLY need the data from the http streams you’ve captured.
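The first line of the captured stream, “GET /dog.txt HTTP/1.1”, is the one a server needs in order to know which file is being requested. As a hedged illustration, the following sketch pulls the path out of such a line. It only recognizes GET, which is an assumption made for simplicity, and the names are hypothetical.

```java
final class RequestLine {
    /**
     * Returns the path from a request line such as "GET /dog.txt HTTP/1.1",
     * or null if the line does not have the three expected parts.
     */
    static String pathOf(String requestLine) {
        if (requestLine == null) {
            return null;
        }
        String[] parts = requestLine.trim().split("\\s+");
        if (parts.length != 3 || !parts[0].equals("GET")) {
            return null;
        }
        return parts[1];
    }

    public static void main(String[] args) {
        System.out.println(pathOf("GET /dog.txt HTTP/1.1"));
    }
}
```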
We are now going to use the HTTP stream we have just captured to manually retrieve files from a web server. As an example, you can retrieve files from my faculty account at:
condor.depaul.edu/elliott/dog.txt and condor.depaul.edu/elliott/cat.html (But note: Tech support regularly moves my directories around. If elliott does not work, try it with a tilde [~elliott].) Or, if you have a webserver on your PC, you can just use that. Or you can install the SourceForge Uniform Server and run that, which is a version of the Apache server that runs on every Unix machine. Or you can start the Apache web server that runs on your Mac (sudo apachectl start?). But in all cases be careful, because you are now serving files from your file system to the network!
Either use Wireshark, or use/modify a MyTelnetClient.java program by modifying your InetClient, or JokeClient, so that it allows you to type in an arbitrary text string, and send this (via port 80) to some webserver. Note that while telnet is disabled on Windows by default it is still there and can be activated.
Use your MyTelnetClient program to manually enter into a dialog with the condor.depaul.edu (or some other) web server. Write the appropriate input and output for your MyTelnetClient program to a log file (or copy it from your console window, or capture it in Wireshark), for later submission to D2L as part of your http-streams.txt file. I don’t need to see your source code for this simple program.
We are working with condor.depaul.edu for convenience because that is where we put our files. However, we could just as easily manually get files from the web server at www.cnn.com if our files were on that machine.
You will connect at port 80 instead of the default telnet port of 23, because you want to talk to the web server, instead of the telnet server.
Do this by entering the shell command,
MyTelnetClient condor.depaul.edu 80 <– or whichever server you are using
The condor.depaul.edu web server is now waiting for input from you. You can use the following static files in the step below, or similar files that you have created on your own webserver:
http://condor.depaul.edu/elliott/cat.html
http://condor.depaul.edu/elliott/dog.txt
Enter the valid HTTP request stream that you captured using your listener, for retrieving the file dog.txt from a web server. Note that you will have to be careful to include all of the necessary information, including carriage return / linefeeds (cr/lfs), and that you will have to make changes as needed for different servers. You could probably use copy and paste if you are clever, but unless you connect many times it is probably not worth it. Hint: some of the information, such as “Accept” and “User-Agent”, is not needed by the web server, and you can find what you can leave out through experimentation.
If you enter the HTTP correctly the web server will now send your requested file back to you as a text stream response to your MyTelnetClient program. If you enter it incorrectly you will still usually get some kind of valid response, albeit one containing an error message.
Here is a sample session; yours will be similar, but may differ in some of the details, depending on which webserver you are using, on which machine. (Note: server configurations change, so you may have to vary what you send to get a response. Follow what your browser sends. My account on condor moves all the time and you may only get a (valid!) error message.)
> java MyTelnetClient condor.depaul.edu
Clark Elliott’s MyTelnet Client, 1.0.
Using server: condor.depaul.edu, Port: 80
Enter text to send to the server, to end: GET /elliott/dog.txt HTTP/1.1
Enter text to send to the server, to end: Host: condor.depaul.edu:80
Enter text to send to the server, to end:
Enter text to send to the server, to end:
Enter text to send to the server, to end: stop
HTTP/1.1 200 OK
Date: Wed, 03 Oct 2018 20:40:45 GMT
Server: Apache/2.2.3 (Red Hat)
Last-Modified: Wed, 07 Oct 2015 20:29:55 GMT
ETag: "8a1bfc-30-521899bff76c0"
Accept-Ranges: bytes
Content-Length: 48
Content-Type: text/plain
Connection: close
This is Elliott’s dog file on condor. Good job!
Note: you may get a different response. What we are looking for is SOME HTTP / HTML response from the webserver. For example, if the file has been moved somewhere else, you might get back a well-formed error message. This is fine. In either case you are successfully talking with the webserver.
Put your captured output into http-streams.txt for submission with the assignment.
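For reference, the request typed by hand in the sample session can also be assembled in code. The sketch below is not the MyTelnetClient program. It is a small, hypothetical illustration that builds the same minimal request (request line, Host header, blank line) and prints whatever the server returns. It adds “Connection: close” as an assumption, so that the server closes the socket after responding and the read loop ends.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class ManualGet {
    /** Builds a minimal HTTP/1.1 GET request; every line ends in cr/lf and a blank line ends the headers. */
    static String buildRequest(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "Connection: close\r\n"
                + "\r\n";
    }

    /** Sends the request and prints the raw response (headers and body) on the console. */
    static void fetch(String host, int port, String path) throws IOException {
        try (Socket sock = new Socket(host, port)) {
            OutputStream out = sock.getOutputStream();
            out.write(buildRequest(host, path).getBytes(StandardCharsets.US_ASCII));
            out.flush();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(sock.getInputStream(), StandardCharsets.ISO_8859_1));
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 2) {
            fetch(args[0], 80, args[1]);   // e.g. java ManualGet condor.depaul.edu /elliott/dog.txt
        } else {
            System.out.print(buildRequest("condor.depaul.edu", "/elliott/dog.txt"));
        }
    }
}
```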
[Note: You can use Wireshark, and also the Firefox browser console to see network traffic. In the past Firefox has allowed you to download and install a plug-in called HTTPFox (tools -> add-ons -> get add-ons). After HTTPFox is installed you’ll see a small icon in the bottom right corner of your browser window. With HTTPFox you will be able to see all outgoing traffic from your web browser, as well as all of the server responses coming back. (Similar to Fiddler for IE) (Thanks Arkadiusz)]
So, in summary: Create the simple files dog.txt, cat.html, in your home web directory (or use my files). Verify that they can be reached from the web. Retrieve your files manually using MyTelnetClient to port 80, or WireShark, or HTTPFox and add these to http-streams.txt along with your MyListener data.
You have now captured both the request coming from a web client, and the response coming from a web server. Ta-duh.
MIME headers
For this assignment we will use two MIME types: Content-Type: text/plain and Content-Type: text/html. These must be followed by two cr/lf and then your data.
MIME types are determined by the server from the file extension of the files that are requested. .html will use text/html, and .txt and .java files will both use text/plain. (This is just a trick so we can view your java source code through your webserver.)
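The extension-to-MIME-type rule above is small enough to sketch directly. The class and method names below are hypothetical, and the fallback to text/plain for unknown extensions is an assumption rather than a requirement of the assignment.

```java
import java.util.Locale;

final class MimeTypes {
    /** Maps a requested file name to the Content-Type value used in this assignment. */
    static String contentTypeFor(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".html")) {
            return "text/html";
        }
        // .txt and .java are both sent as plain text. Anything else also falls
        // back to text/plain here, which is an assumption made for this sketch.
        return "text/plain";
    }

    public static void main(String[] args) {
        System.out.println(contentTypeFor("cat.html"));
        System.out.println(contentTypeFor("MyWebServer.java"));
    }
}
```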
Modify your MultiThreaded server so that it becomes a simple web server.
Goal: Your web server must correctly respond to requests for files with extensions of .txt and .html [and also .java, which is treated the same as .txt]. This means that it must return the correct MIME headers (that is, the Content-Length and Content-Type headers, with the final header followed by two cr/lf), as well as the data. This is a server that operates on static data.
Copy your MyListener.java source into a file called MyWebServer.java.
Copy over your files dog.txt, cat.html to your local machine into the directory where you are developing your web server, for later use.
Using the manual responses you captured from the web server (see above), which contains ALL of the information that the web server sends back to a client, including, specifically the MIME type information (Content-Type:) and Content-Length:, modify your listener so that it becomes a valid web server by sending back a valid text stream, including headers, to the web client. See HTTP protocol for some hints.
In practice you need not send back all of the responses. You WILL want to include:
HTTP/1.1 200 OK
Content-Length: 47 [Where 47 is changed to the real length of the data —
but note that you might make initial tests by just setting this value high]
Content-Type: text/plain [Where text/plain might also be: text/html]
[followed by two carriage return / linefeeds (crlf), and then the data.]
Modern browsers handle requests for the mini favicon files (the tiny logo that can appear in the URL window) in different ways. If your Firefox browser sends a request for a favicon, you should write code to ignore it. That is, for this assignment we just want those requests to go away any way we can manage it. If you put a favicon.ico file in your server’s root directory it may solve problems for you. Here is the Wikipedia article on favicons.
The following end of line hints might be useful:
static final byte[] EOL = {(byte) '\r', (byte) '\n'};
or:
outstream.writeBytes("Content-Type: " + ConType + "\r\n\r\n");
or:
outstream.print("\r\n\r\n");
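Putting the three required headers and the end-of-line hints together, a response can be assembled as bytes before it is written to the socket. The following is a minimal sketch with hypothetical names. It computes Content-Length from the byte length of the data rather than hard-coding a value.

```java
import java.nio.charset.StandardCharsets;

final class ResponseBuilder {
    /** Builds a complete 200 OK response: status line, two headers, a blank line, then the data. */
    static byte[] build(String contentType, byte[] body) {
        String headers = "HTTP/1.1 200 OK\r\n"
                + "Content-Length: " + body.length + "\r\n"
                + "Content-Type: " + contentType + "\r\n"
                + "\r\n";   // the second cr/lf ends the header section
        byte[] head = headers.getBytes(StandardCharsets.US_ASCII);
        byte[] out = new byte[head.length + body.length];
        System.arraycopy(head, 0, out, 0, head.length);
        System.arraycopy(body, 0, out, head.length, body.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] body = "This is a dog file.\n".getBytes(StandardCharsets.US_ASCII);
        System.out.print(new String(build("text/plain", body), StandardCharsets.US_ASCII));
    }
}
```

Counting bytes rather than characters matters once a file contains non-ASCII text, because the browser reads exactly Content-Length bytes of data.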
Configure your server so that it sends back the correct MIME type headers for .txt, and .html files [text/plain, and text/html, respectively].
Use your MyListener, and the MyTelnetClient tricks, or WireShark, for debugging as needed.
Extend your server to include directories:
Goal: Extend your server so that it sends back dynamically constructed data: in this case the HTML-formatted current contents of a directory. This will now be a server that operates on dynamic data.
[Intermediate step: If you are struggling with this assignment, you might want to first simply create some dynamically created HTML, by sending back a very simple HTML file with dynamic data in it, such as the current time. This way you can at least say you have written back dynamic HTML to the client. Then once you are getting the text/html mime type working with dynamic data, go on to creating a directory listing.]
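As an illustration of that intermediate step, the sketch below builds a tiny HTML page containing the current time. The names are hypothetical. The returned string would be sent with a text/html Content-Type and a Content-Length equal to its byte length.

```java
import java.time.LocalDateTime;

final class TimePage {
    /** Renders a very simple HTML page whose content depends on the time passed in. */
    static String render(LocalDateTime now) {
        return "<html><body>\r\n"
                + "<h1>Server time</h1>\r\n"
                + "<p>" + now + "</p>\r\n"
                + "</body></html>\r\n";
    }

    public static void main(String[] args) {
        System.out.print(render(LocalDateTime.now()));
    }
}
```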
Note: Most webservers no longer allow the promiscuous display of a directory’s contents. But we will provide it from our server as an exercise.
See the ReadFiles.java program for hints on how to read the contents of a directory in Java. [Note: a directory is simply a more-or-less regular file that contains the names of other files in it, along with some associated information.]
Modify your webserver so that it correctly returns a promiscuous display of the server’s directory as requested by the client. Note that you may want to include some security here, since you WILL be writing a valid, albeit simple, web server. For example, you might want to restrict access to a certain subdirectory of where the server is running.
The first step is to simply send back a plain text listing of the files in the directory, along with a text/plain MIME header, and the length of your data.
The second step is to send back some kind of formatted HTML with a text/html MIME header.
The third step (really not that hard) is to send back the names of the files as hot-link references such that “clicking-on” them in the browser will cause your server to send back the contents of that file.
Using our MyTelnetClient hack we used to be able to see what a regular server would send back as an html listing of hot-links for files. (For security reasons, most servers no longer give directory listings.) For example, for the condor.depaul.edu request “GET /elliott/435/.xyz/” we used to get back the following:
[…]
Index of /elliott/435/.xyz
Name Last modified Size Description
Parent Directory –
dog.txt 16-Sep-2005 14:09 39
cat.html 16-Sep-2005 14:09 67
MyWebServer.class 16-Sep-2005 14:09 222
z-directory/ 16-Sep-2005 15:08 –
Which displays as:
Index of /elliott/435/.xyz
Icon Name Last modified Size Description
[DIR] Parent Directory –
[TXT] dog.txt 16-Sep-2005 14:09 39
[TXT] cat.html 16-Sep-2005 14:09 67
[TXT] MyWebServer.class 16-Sep-2005 14:09 222
[DIR] z-directory/ 16-Sep-2005 15:08 –
We can simplify this as follows:
Index of /elliott/435/.xyz
Parent Directory
dog.txt
cat.html
MyWebServer.class
z-directory/
Which displays as:
Index of /elliott/435/.xyz
Parent Directory
dog.txt
cat.html
MyWebServer.class
z-directory/
Lastly, modify the return from your server so that it sends back links to subdirectories as subdirectory URL hot links, if you have not already done so. The only hard part is identifying a file as a directory, and typically you can look for a trailing slash (“/”). For grading we will use the convention that if the URL ends in a slash (“/”) then the server will look for a subdirectory with that name. Thus, when listing subdirectories, you should send subdirectory hotlinks back to the web client with trailing slashes in your prepared URL.
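The simplified listing and the trailing-slash convention can be sketched as follows. This is one possible shape for the HTML, not the output of any particular server. The class and method names are hypothetical, and file names are written into the page without HTML or URL escaping to keep the sketch short.

```java
import java.io.File;
import java.util.Arrays;

final class DirectoryPage {
    /**
     * Builds an HTML index for dir. urlPath is the URL the directory was
     * requested under and is assumed to end in "/" (for example "/" or "/sub/").
     */
    static String render(File dir, String urlPath) {
        StringBuilder html = new StringBuilder();
        html.append("<html><body>\r\n");
        html.append("<h1>Index of ").append(urlPath).append("</h1>\r\n");
        if (!urlPath.equals("/")) {
            html.append("<a href=\"../\">Parent Directory</a><br>\r\n");
        }
        File[] entries = dir.listFiles();   // null if dir is not a readable directory
        if (entries != null) {
            Arrays.sort(entries);
            for (File entry : entries) {
                // Subdirectories get a trailing slash so the server can recognize them later.
                String name = entry.getName() + (entry.isDirectory() ? "/" : "");
                html.append("<a href=\"").append(urlPath).append(name).append("\">")
                    .append(name).append("</a><br>\r\n");
            }
        }
        html.append("</body></html>\r\n");
        return html.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(new File("."), "/"));
    }
}
```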
For some browsers, and browser settings, you may have some difficulties with the directories—e.g., you might have to send your request twice. We may also have trouble translating between the directory systems of Unix, Mac, and Windows operating systems. So be sure to show us that your directory traversal works in your serverlog.txt file. Also, you might want to experiment with: Connection: Keep-Alive / Connection: close.
Also, you may want to experiment with the socket.close() method if your browser is not displaying the data but all else is working.
You should now have a relatively complete, working, web server, that can return correct MIME types for different types of files, recurse subdirectories, and return dynamically-created html. Because it is multi-threaded it should be able to handle many hundreds of requests. Good work!
Server-Side scripting and program execution.
Goal: write simple code to run arbitrary program code on the server processing user input from the web, and send the results back to the web client.
In this section we add back-end programming capability to your server, or at least simulate it. We create a simple addnums web form, accept input from a user, pass this to our webserver, process the information, and return a computed response based on the input.
For those who are more ambitious you might look into Java’s JNI, which allows us to call native code, by loading it into the virtual machine, and then running it. In this way we might write programs that actually run arbitrary scripts/programs under the web server.
Alternatively, for those writing in C, the “system()” function will execute any executables as subprocesses, making the running of programs and scripts trivial. Note: be very security conscious of running user-input shell commands with the “system()” call, because, e.g., they might have you execute a command to erase all of your files!
Neither method is required. Instead, to keep the programming scope reasonable, we will only simulate the running of back-end scripts.
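One way to simulate such a back-end script is to treat the request path as a query string, pull out the form values, and compute the response in ordinary Java. The sketch below assumes the form is submitted with GET, so that the values arrive after a “?” in the request path. The path and the field names person, num1 and num2 are hypothetical placeholders, not the names used by the given form. The person value is echoed without HTML escaping, which a real server would want to add.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class FakeCgi {
    /** Splits "path?key=value&key=value" into a map of decoded parameters. */
    static Map<String, String> parseQuery(String pathAndQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        int q = pathAndQuery.indexOf('?');
        if (q < 0) {
            return params;
        }
        for (String pair : pathAndQuery.substring(q + 1).split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                params.put(decode(pair.substring(0, eq)), decode(pair.substring(eq + 1)));
            }
        }
        return params;
    }

    /** Decodes "+" and %XX escapes as sent by a browser form. */
    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);   // UTF-8 is always available
        }
    }

    /** Simulated script: adds the two numbers and returns a small HTML page. */
    static String addnums(Map<String, String> params) {
        try {
            int a = Integer.parseInt(params.get("num1"));
            int b = Integer.parseInt(params.get("num2"));
            return "<html><body><p>Dear " + params.get("person") + ", the sum of "
                    + a + " and " + b + " is " + (a + b) + ".</p></body></html>\r\n";
        } catch (NumberFormatException e) {
            return "<html><body><p>Please enter two whole numbers.</p></body></html>\r\n";
        }
    }

    public static void main(String[] args) {
        String path = "/cgi/addnums.fake-cgi?person=Ann&num1=4&num2=5";   // hypothetical URL
        System.out.print(addnums(parseQuery(path)));
    }
}
```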
CGI (the Common Gateway Interface) has been around since the beginning of the web, so there are thousands of references on how to use it.
Use the given web form that accepts a name and two numbers. On the “action” statement, using the GET method, call a program with the extension “.fake-cgi” with a URL that points to your MyWebServer program. E.g., you might have…