HTTP is probably the most popular application protocol in the TCP/IP world. Every day billions of people use it whenever they surf the web.
In this article I am going to explain how an HTTP server works, for those who need or want to understand the mechanism.
(An HTTP server has to be installed on any computer that holds HTML pages for browsers to display.)
The HTTP server opens a "listening" socket and waits for incoming connections. When a browser (the HTTP server's client) sends a request, the server processes the request and sends back an answer. A browser request looks like this:
"GET /index.html HTTP/1.1
Host: qms.siptele.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2.10) Firefox/3.6.10
Accept: image/png,image/*;q=0.8,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Referer: "
The HTTP server looks for a file named "index.html" in its HTTP root directory and sends it back if it exists. Otherwise it reports that the file does not exist, using the error code "404" that many of us have seen.
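A minimal sketch in Python of the steps described so far: accept a connection on a listening socket, parse the request, look the file up under a root directory, and answer with the file or a 404. The function names, the port 8080 and the choice to answer every non-GET request with 404 are assumptions made for illustration; a real server handles many more cases.

```python
import os
import socket


def parse_request(raw: bytes):
    """Split a raw HTTP request into (method, path, version, headers)."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            # Header names are case-insensitive, so store them lowercased.
            headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


def build_response(raw: bytes, root_dir: str) -> bytes:
    """Answer a GET request with the file's bytes, or with a 404."""
    method, path, _version, _headers = parse_request(raw)
    # Drop any query string and map the URL path onto the root directory.
    rel = path.split("?", 1)[0].lstrip("/") or "index.html"
    root = os.path.realpath(root_dir)
    full = os.path.realpath(os.path.join(root, rel))
    # Refuse paths such as /../etc/passwd that resolve outside the root.
    inside_root = full.startswith(root + os.sep)
    if method == "GET" and inside_root and os.path.isfile(full):
        with open(full, "rb") as f:
            body = f.read()
        status = "200 OK"
    else:
        # Simplification: everything else is reported as "not found".
        body = b"Not Found"
        status = "404 Not Found"
    head = (f"HTTP/1.1 {status}\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Connection: close\r\n\r\n")
    return head.encode("ascii") + body


def serve_once(root_dir: str, host: str = "127.0.0.1", port: int = 8080):
    """Open a listening socket, serve a single request, then return."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind((host, port))
        listener.listen()
        conn, _addr = listener.accept()
        with conn:
            # Simplification: assume the whole request arrives in one read.
            raw = conn.recv(65536)
            conn.sendall(build_response(raw, root_dir))
```

Calling serve_once("/path/to/root") and pointing a browser at port 8080 on the same machine would serve one file and exit; a real server loops over accept() and handles connections concurrently.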
Several lines of the request are important:
The "Host" line tells the HTTP server which host the request refers to. This field allows one HTTP server to handle several hosts (or domains). The server just picks up this value and turns to the proper root directory for that host.
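As a rough illustration of that lookup, the sketch below maps the "Host" value to a root directory. The table of hosts and the directory paths are hypothetical, and IPv6 literal hosts are not handled.

```python
# Hypothetical table of hosted domains and their root directories.
VHOST_ROOTS = {
    "qms.siptele.com": "/var/www/qms",
    "www.example.test": "/var/www/example",
}


def root_for_host(host_header, vhost_roots, default_root):
    """Pick the root directory that belongs to the requested host."""
    # The header may carry a port ("example.test:8080") and host names
    # are case-insensitive, so strip the port and lowercase the name.
    hostname = host_header.split(":", 1)[0].strip().lower()
    return vhost_roots.get(hostname, default_root)
```

An unknown host falls back to the default root, which is one common choice; a server could equally refuse such requests.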
The "User-Agent" line tells the server which browser is in use. In our case it is the Firefox browser. This field has no special importance; it just lets us gather statistics about the share of each browser in use.
The "Accept" fields inform the server about the browser's capabilities: the content types, languages, encodings and character sets it can handle, each with an optional preference weight (the "q" value). The server attempts to send back content that the browser can handle.
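To make the "Accept" line from the request above concrete, here is a simplified sketch that parses its comma-separated entries, reads the optional "q" weight of each (1.0 when absent), and picks the first available content type that matches. The function names are my own, and the full HTTP rules (for example, preferring more specific entries) are deliberately left out.

```python
from fnmatch import fnmatchcase


def parse_accept(value):
    """Return (media_range, q) pairs, highest preference first."""
    ranges = []
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not wanted"
        ranges.append((parts[0], q))
    # sorted() is stable, so equal weights keep their original order.
    return sorted(ranges, key=lambda r: r[1], reverse=True)


def choose_type(accept_value, available):
    """Pick the first available type matching the most preferred range."""
    for media_range, q in parse_accept(accept_value):
        if q <= 0:
            continue
        for candidate in available:
            # "image/*" and "*/*" behave like shell-style wildcards here.
            if fnmatchcase(candidate, media_range):
                return candidate
    return None
```

With the header from the example request, a server that can offer either "text/html" or "image/jpeg" would choose "image/jpeg", because "image/*" (q=0.8) outranks "*/*" (q=0.5).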
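The "Keep-Alive" line, together with "Connection: keep-alive", tells the server that the browser wants to keep the current socket open and reuse it for further requests and responses. The value 115 is the number of seconds the browser is willing to keep an idle connection open, not a count of requests.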
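A small sketch of how a server might decide whether to leave the socket open after sending a response. It assumes the header names were stored lowercased in a dictionary (my own convention) and follows the standard defaults: HTTP/1.1 connections stay open unless "close" is requested, while HTTP/1.0 connections stay open only if "keep-alive" is requested.

```python
def wants_keep_alive(version, headers):
    """Return True if the connection should stay open after the response."""
    # The Connection header is a comma-separated, case-insensitive list.
    tokens = [t.strip().lower()
              for t in headers.get("connection", "").split(",")]
    if "close" in tokens:
        return False
    if version == "HTTP/1.1":
        return True  # persistent by default in HTTP/1.1
    return "keep-alive" in tokens  # opt-in for HTTP/1.0
```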
The "Referer" field (the header name really is spelled this way) is the most important information for Internet marketers. It tells the server which page the browser came from. This information is logged and tells us things like:
a. What search phrases were used in a search engine (like Google) to find us.
b. Which of our ads gets clicked.
c. Which article or page linking to our site generated this visit.
This information is priceless. It tells us how our marketing efforts are doing. If we run ads in search engines, for example, we can see which ad performs better than the others and focus on it.
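As an illustration of item (a), the sketch below pulls the referring host and the search phrase out of a logged "Referer" value. It assumes the search engine puts the phrase in a "q" query parameter; the URL in the usage note is a made-up example, and many referring pages carry no such parameter at all.

```python
from urllib.parse import parse_qs, urlparse


def referring_host(referer):
    """Return the host name of the page the visitor came from."""
    return urlparse(referer).hostname if referer else None


def search_phrase(referer, param="q"):
    """Return the search phrase carried in the referring URL, if any."""
    if not referer:
        return None
    query = parse_qs(urlparse(referer).query)
    values = query.get(param)
    # parse_qs decodes "+" and %XX escapes, so "sip+phone" -> "sip phone".
    return values[0] if values else None
```

For a logged value such as "http://www.google.com/search?q=sip+phone&hl=en", search_phrase() returns "sip phone" and referring_host() returns "www.google.com".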
The first HTTP servers were only capable of locating files and sending them to browsers. Later on the need to access databases arose and led to the creation of "CGI" (Common Gateway Interface) programs. A CGI program is basically a native program that is run by the HTTP server in a separate process with a special environment; it gets the request parameters from the server and processes them.
After the processing it returns its output to the HTTP server, which sends it back to the browser.
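The CGI contract can be sketched in a few lines. The server passes request details in environment variables such as QUERY_STRING, and the program writes a header block, a blank line and then the body to its standard output. The example is written in Python rather than as a compiled native program, and the "name" parameter and the greeting are invented for illustration.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def cgi_response(environ):
    """Build the CGI output (headers, blank line, body) for one request."""
    # The server hands over the part of the URL after "?" in QUERY_STRING.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["world"])[0]
    # Escape user input before placing it inside HTML.
    body = f"<html><body><h1>Hello, {html.escape(name)}!</h1></body></html>\n"
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # When run by the HTTP server, the real environment carries the request.
    sys.stdout.write(cgi_response(os.environ))
```

Keeping the logic in a function that takes the environment as a parameter makes it possible to try the program without a web server, for example cgi_response({"QUERY_STRING": "name=Ann"}).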
Having a native program running on the server opens many options to the programmer, who can access and process information in databases, create dynamic behavior and open up whole new system capabilities.
Opening the system also increased the vulnerability of the computer to hacking...
After several penetration incidents, a new, restrictive set of rules was developed for the server. The server now runs with the privileges of a restricted account and group, so it can only work in the predefined directories allocated to it and cannot access the entire system. Having a restricted account also makes it much harder for an intruder who gains shell access (for example after crashing the HTTP server) to see and use system information to gain control over the computer itself.
There was also a demand for a scripting language to shorten development time. This demand was answered by "PHP", a scripting language created by Rasmus Lerdorf whose name originally stood for "Personal Home Page" (today it is read as "PHP: Hypertext Preprocessor"). The company "Zend" later developed the engine at the core of PHP. When I say "scripting language" I mean a language that is interpreted at execution time rather than compiled into a native program in advance. Such languages take more time to parse and execute compared to native programs (which just need to be run), but the rapid increase in computer performance makes the difference largely irrelevant.
PHP gained a huge user base and is one of the top scripting languages in use today. To run it, the HTTP server needs a PHP interpreter. When the HTTP server is asked to handle a PHP program, it runs the PHP interpreter as a CGI program, and the interpreter reads the PHP script and processes it.
Another mechanism, called "cookies", was invented to keep information in the browser. Cookies are short pieces of information sent by the HTTP server and kept by the browser. The browser sends them back every time it accesses that HTTP server. This allows state information to be kept for a long time. The information often contains a username and a session ID, so people don't have to fill in their username and password every time they access the HTTP server. This is how Gmail "remembers" users and their sessions and lets them open the proper email page without asking for credentials every time.
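A minimal sketch of the cookie round trip using Python's standard http.cookies module. The server issues a "Set-Cookie" line holding a random session ID, and later looks the user up from the "Cookie" value the browser sends back. The in-memory SESSIONS table, the cookie name "session_id" and the one-hour lifetime are assumptions for the example; a real site would also protect and expire sessions far more carefully.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical in-memory session table: session ID -> username.
SESSIONS = {}


def start_session(username):
    """Create a session and return the Set-Cookie header line for it."""
    session_id = secrets.token_hex(16)  # 32 random hex characters
    SESSIONS[session_id] = username
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["path"] = "/"
    cookie["session_id"]["max-age"] = 3600  # keep for one hour
    cookie["session_id"]["httponly"] = True  # hide from page scripts
    return "Set-Cookie: " + cookie["session_id"].OutputString()


def user_from_cookie(cookie_header):
    """Return the username for the Cookie header value, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    if morsel is None:
        return None
    return SESSIONS.get(morsel.value)
```

Only the random session ID travels in the cookie here; the username stays on the server, which is a common design choice because the browser's copy can be read or altered by the user.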
Nowadays HTTP servers are very sophisticated. Web 2.0 techniques allow a page in the browser to send many requests to the server and get responses without refreshing the whole screen. This makes it easy to process information inside the page without affecting the rest of it, and to exchange information quickly and interactively on sites like Facebook.
I have explained here the operation and evolution of the HTTP server. This description should give a bird's-eye view of the way an HTTP server works and help programmers understand why things were built the way they are.
Hello. My name is Michael. I am a veteran computer programmer in Internet telecom. I have developed a SIP phone and a proper iPBX for it. I am also an Internet entrepreneur, always on the lookout for new ventures. My main business is SipTelecom.
Article Source: http://EzineArticles.com/?expert=Michael_Koroy