Here is a link to a description of all of the HTTP headers
The only important request type is GET. Other request types defined in HTTP/1.0 (RFC 1945, 1996) include HEAD and POST.
HTTP/1.1 (RFC 2616, 1999) added more request types, including DELETE, TRACE, and PUT, which are rarely used.
The POST request is used to send data from a client to a server. The data itself follows the headers, after the blank line formed by two consecutive CRLFs.
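To make the placement of the POST data concrete, here is a minimal sketch that assembles a raw request by hand. The helper name build_post, the path /cgi-bin/form, and the form data are invented for this illustration; a real client would normally use a library instead.

```python
def build_post(path: str, body: bytes) -> bytes:
    """Assemble a raw HTTP/1.0 POST request (illustrative only)."""
    header_lines = [
        f"POST {path} HTTP/1.0",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body)}",
    ]
    # Every header line ends with CRLF. One more CRLF (an empty line)
    # marks the end of the headers, and the data follows immediately.
    head = "\r\n".join(header_lines) + "\r\n\r\n"
    return head.encode("ascii") + body


# Hypothetical path and form data, chosen only for the example.
request = build_post("/cgi-bin/form", b"name=Ann&team=jets")
print(request.decode("ascii"))
```

Splitting the request at the first CRLF CRLF pair gives the headers on one side and the data on the other, which is how a server finds the start of the body.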
HTTP 1.0 vs 1.1
HTTP 1.0 defines 16 headers, though none are required. HTTP 1.1 defines 46 headers, and one (Host:) is required in requests. The Host request-header field specifies the Internet host and port number of the resource being requested, as obtained from the original URI given by the user or referring resource (generally an HTTP URL). This allows two or more host names (virtual hosts) to share the same IP address, because the server can tell from the Host header which site the request is for.
For example, a request on the origin server for <http://www.w3.org/pub/WWW/> would properly include:
GET /pub/WWW/ HTTP/1.1
Host: www.w3.org
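The mapping from a URL to the request line and Host header can be sketched in a few lines. The function name build_get is made up for this example, and the sketch ignores details such as user information in the URL.

```python
from urllib.parse import urlsplit


def build_get(url: str) -> str:
    """Turn an http URL into a minimal HTTP/1.1 GET request."""
    parts = urlsplit(url)
    path = parts.path or "/"          # an empty path is requested as "/"
    if parts.query:
        path += "?" + parts.query
    # netloc is the host name, plus ":port" if the URL gave one explicitly.
    return f"GET {path} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


print(build_get("http://www.w3.org/pub/WWW/"))
```

Only the path goes on the request line; the host name travels separately in the Host header.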
HTTP 1.1 must support both persistent and non-persistent connections. With a non-persistent connection, each document requested opens its own TCP connection, which is closed after that single document is sent, although the browser usually opens multiple parallel connections to download multiple documents. With a persistent connection, which is the default in HTTP 1.1, several documents are sent over the same TCP connection. The client can send a Connection: close header if it does not want a persistent connection.
HTTP/1.1 also allows chunking, in which a large, dynamically generated document can be sent before the server knows its size.
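As a rough sketch of what chunking looks like on the wire: each chunk is sent as its length in hexadecimal, a CRLF, the data, and another CRLF, and a chunk of length zero ends the body. The function name encode_chunked is an assumption for this example, and optional features such as chunk extensions and trailers are left out.

```python
from typing import Iterable


def encode_chunked(pieces: Iterable[bytes]) -> bytes:
    """Encode a sequence of byte strings as an HTTP/1.1 chunked body."""
    out = bytearray()
    for piece in pieces:
        if not piece:
            continue  # a zero-length chunk would signal the end of the body
        # Chunk size in hex, CRLF, chunk data, CRLF.
        out += f"{len(piece):X}".encode("ascii") + b"\r\n" + piece + b"\r\n"
    out += b"0\r\n\r\n"  # the last chunk has size 0, followed by an empty line
    return bytes(out)


body = encode_chunked([b"Hello, ", b"world!"])
print(body)
```

The server sends Transfer-Encoding: chunked instead of Content-Length, so it never needs to know the total size in advance.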
URL encoding
Data sent as part of a GET request has to be URL encoded.
HTML form data is usually URL-encoded to package it in a GET or POST submission. In a nutshell, here's how you URL-encode the name-value pairs of the form data:
name1=value1&name2=value2&name3=value3
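A short sketch using Python's standard library shows the encoding in action. The field names and values are invented for the example. In this encoding, spaces become + and characters that have a special meaning in a URL (such as & and %) are replaced by % followed by two hexadecimal digits.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical form fields, chosen to include characters that need escaping.
form = {"name": "Ann Smith", "team": "jets & giants", "discount": "50%"}

encoded = urlencode(form)
print(encoded)

# The receiving side reverses the encoding to recover the original values.
decoded = parse_qs(encoded)
print(decoded)
```

In a GET submission the encoded string is appended to the URL after a question mark; in a POST submission it is sent as the data that follows the headers.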
Proxy Servers
A proxy server is a network entity that satisfies HTTP requests on behalf of an origin web server.
It caches recent documents. If the document is cached, it returns it immediately. Otherwise it passes the request on to the server. The server sends the document back to the proxy. The proxy stores a copy of the document and passes it on to the requesting client.
Note that it is both a client and a server at the same time.
In practice, hit rates range from 0.2 to 0.7, which reduces network traffic dramatically.
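The caching behavior described above can be sketched as a small class. All names here (CachingProxy, fake_origin) are made up for the illustration, and the sketch deliberately ignores expiry and revalidation, which a real proxy has to handle.

```python
from typing import Callable, Dict, List


class CachingProxy:
    """Toy model of a caching proxy: serve from the cache or ask the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._cache: Dict[str, bytes] = {}
        self.requests = 0
        self.hits = 0

    def get(self, url: str) -> bytes:
        self.requests += 1
        if url in self._cache:
            self.hits += 1                # cached: return it immediately
            return self._cache[url]
        document = self._fetch(url)       # here the proxy acts as a client
        self._cache[url] = document       # keep a copy for later requests
        return document

    def hit_rate(self) -> float:
        return self.hits / self.requests if self.requests else 0.0


origin_calls: List[str] = []


def fake_origin(url: str) -> bytes:
    """Stand-in for the origin server; records which URLs reach it."""
    origin_calls.append(url)
    return f"<html>{url}</html>".encode("ascii")


proxy = CachingProxy(fake_origin)
for url in ["/a", "/b", "/a", "/a"]:
    proxy.get(url)

print(proxy.hit_rate())   # 2 hits out of 4 requests
print(origin_calls)       # only the misses reached the origin
```

Only two of the four requests reach the origin server, which is the traffic reduction a hit rate of 0.5 implies.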
Proxy servers also allow filtering. A company can block access to certain web sites.
Here is the demonstration of HTML forms, and here is the C code which is compiled to generate the output.
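Cookie technology has four components: a Set-cookie header in the server's reply, a Cookie header in the client's subsequent requests, a cookie database kept by the browser, and a database at the web site.
Cookies are used to identify users. IP addresses don't work as well for this because of NAT, DHCP, etc.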
A user contacts a commercial web site for the first time. The web site creates a unique id for her and creates an entry in the database. In its initial reply, it has a header
Set-cookie: ID=123456
Her browser creates a new line in the client cookie database
Each subsequent request to the same site contains this header
Cookie: ID=123456
If the user returns to the site a week later, the browser will continue to send the Cookie header. This allows the server to recommend merchandise, provide one-click shopping, or otherwise customize the window.
Cookies can also be used by portal designers to see how many visitors go to which pages and where they come from, counting distinct visitors rather than simply hits.
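The exchange described above can be sketched as a small simulation with no real networking. The class names Site and Browser, the site name shop.example, and the page paths are all invented for this illustration; the starting id 123456 is taken from the example in the text.

```python
import itertools
from typing import Dict, List, Optional


class Site:
    """Toy web site: assigns ids and records visits in its database."""

    def __init__(self) -> None:
        self._ids = itertools.count(123456)
        self.database: Dict[str, List[str]] = {}   # id -> pages requested

    def handle(self, path: str, cookie_header: Optional[str]) -> Dict[str, str]:
        """Process one request and return the response headers."""
        user_id = None
        if cookie_header and cookie_header.startswith("ID="):
            user_id = cookie_header[len("ID="):]
        response_headers: Dict[str, str] = {}
        if user_id not in self.database:
            # First visit: create an id, a database entry, and a Set-cookie header.
            user_id = str(next(self._ids))
            self.database[user_id] = []
            response_headers["Set-cookie"] = f"ID={user_id}"
        self.database[user_id].append(path)
        return response_headers


class Browser:
    """Toy browser: keeps one cookie per site and sends it back."""

    def __init__(self) -> None:
        self.cookie_file: Dict[str, str] = {}      # site name -> cookie

    def request(self, site_name: str, site: Site, path: str) -> Dict[str, str]:
        headers = site.handle(path, self.cookie_file.get(site_name))
        if "Set-cookie" in headers:
            self.cookie_file[site_name] = headers["Set-cookie"]
        return headers


shop = Site()
browser = Browser()
first = browser.request("shop.example", shop, "/")
second = browser.request("shop.example", shop, "/cameras")
print(first, second, shop.database)
```

Only the first reply carries Set-cookie. Every later request carries the Cookie header, so the site can tie the visits together in its database.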
A cookie can contain up to five fields:
Domain (www.yahoo.com)
Path (/)
Content (UserID=123456;team=jets)
Expires (28-2-05 23:59)
Secure (yes or no)
If the Expires field is absent, the cookie expires when the browser exits (a non-persistent cookie).
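As a sketch of how these fields end up in a header, Python's standard http.cookies module can build one. The values come from the example above; reading the Expires value as 28 February 2005 and writing it in the usual HTTP date format is an assumption made for this illustration.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["UserID"] = "123456"                       # Content
cookie["UserID"]["domain"] = "www.yahoo.com"      # Domain
cookie["UserID"]["path"] = "/"                    # Path
# Assumed interpretation of "28-2-05 23:59" as an HTTP date.
cookie["UserID"]["expires"] = "Mon, 28 Feb 2005 23:59:00 GMT"
cookie["UserID"]["secure"] = True                 # Secure

# Prints the Set-Cookie header line a server would send.
print(cookie.output())
```

Leaving out the expires assignment would produce a non-persistent cookie that the browser discards when it exits.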
The shopping cart info can be stored in the cookie itself (a list of things bought).
There is a lot of misinformation about cookies. Cookies cannot contain viruses, cannot erase anything on your hard drive, and cannot read files on your hard drive. However, they are stored and used without the user's consent or knowledge.
Cookies allow sites to track not only their own users, but also visitors to other sites. The technology for this is sometimes called a Web Beacon or a tracking bug. The company which is best known for this is DoubleClick.
Other commercial sites are DoubleClick enabled. When you go to such a site, the page contains an invisible image which causes your browser to send a request to DoubleClick along with the DoubleClick cookie. DoubleClick is then able to track your web browsing on many sites, and this information about your shopping preferences is shared among DoubleClick customers. All of this is invisible to the user.
If you and I log into the same web site and we have never been there before, we might see different ads. I could see cameras, you could see sports paraphernalia.
Required Reading
Here is a good on-line tutorial on HTTP. I probably should have asked you to read this before doing the first two assignments.
Here is a good description of cookies. Read "The Cookie Concept" and "The Dark Side". The other articles listed on the left side of the page make good reading too.