How the Web Works
Lawrence Abrams
- April 2, 2004
Introduction
For many of us, the Internet and web browsing have become a daily activity. Whether it is for checking stock prices, buying food, doing work, ordering books and music, or just browsing a favorite site, web browsing has become an institution in our lives much the way television is. Have you ever wondered how this whole web thing works, though? This tutorial explains the history and concepts of the Web and how it works technically. By the end, you will understand what actually happens when you browse to a site and how your computer retrieves the information. Our first stop is the history of the Web.
History of the Web
The Web finds its roots at CERN, the European Organization for Nuclear Research, where Tim Berners-Lee wrote a program called ENQUIRE in 1980 and, in 1989, proposed a larger system that he went on to develop with Robert Cailliau. This system would allow documents to have links between different pieces of data, whether they were files on the local computer or stored on a remote computer. The main motivation is said to have been the ability to access library information that was spread across multiple servers at CERN.
On November 12th, 1990, Tim Berners-Lee and Robert Cailliau published a formal proposal called "WorldWideWeb: Proposal for a HyperText Project", building on Berners-Lee's 1989 document "Information Management: A Proposal". It outlined the World Wide Web as we know it today, using HyperText, a way of linking information whose underlying idea was first described in 1945 by Vannevar Bush, to link documents into a large-scale information pool. The following day, November 13th, 1990, Tim Berners-Lee created the first web page, and that December he wrote the first web browser and web server. The browser was called WorldWideWeb, which gives us the name we use today.
As development of the WorldWideWeb continued, more people from around the world started to get involved. In 1992, Pei-Yuan Wei introduced Viola, one of the first web browsers that supported graphics. This led to Marc Andreessen of NCSA releasing a program for UNIX called Mosaic in 1993. Mosaic was the spark that set off the rise in popularity of the World Wide Web and carried it beyond academic circles. Marc Andreessen went on to form Mosaic Communications, which then evolved into Netscape Communications. Netscape became the first mainstream commercial graphical web browser.
As time went on, more features started to be added to the browser, more companies got on the Internet, and personal homepages started springing up everywhere, and the Web as we know it was created.
The Technology behind the Web
The web works on three standards. These standards are generally adhered to by all companies that make products that work with the World Wide Web.
These standards are:
URL (Uniform Resource Locator): These are the addresses that you enter into your web browser to connect to a web site. A URL is broken up into four parts: the protocol, the hostname, the port number, and the path that you are requesting.
- Protocol:
- The protocol part of a URL is the funny string of characters that you see before the hostname. Examples are http, https, ftp, and telnet. The protocol is separated from the hostname by a colon and two forward slashes ( :// ). It tells your browser what type of service to use when it connects to the hostname. If you leave the protocol off your address, the web browser will assume you want a web site and use the HTTP protocol (or its secure variant, HTTPS), so there is no need to type in the http:// every time you go to a web site. If you specify another protocol like ftp, then the browser will act as an ftp client that lets you connect to an ftp server to download files.
- Hostname:
- The hostname is the address you are going to. For example, if you are going to the address https://www.bleepingcomputer.com, then www.bleepingcomputer.com is the hostname.
- Port Number:
- The port number is a number that you can append to the hostname with a colon ( : ) between them. For example, https://www.bleepingcomputer.com:443. If you leave the port number off, which almost everyone does, then the browser will automatically use the default port for the protocol: port 80 for http and port 443 for https.
- Path:
- This is the path on the server, culminating with the filename you are trying to reach. For example, in the URL https://www.bleepingcomputer.com/examples/example1.html, the path is /examples/example1.html. In the simplest case, this path corresponds to an actual directory structure on the web server. So on the web server there is a root directory, an examples directory underneath that root directory, and a file called example1.html underneath that.
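To make the four parts concrete, here is a small sketch in Python that splits a URL using the standard library's urlsplit function. The helper name split_url, the table of default ports, and the second example URL with port 8080 are assumptions made for illustration only.

```python
from urllib.parse import urlsplit

# Default ports for a few common protocols (illustrative, not exhaustive).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def split_url(url):
    """Break a URL into protocol, hostname, port number, and path."""
    parts = urlsplit(url)
    # parts.port is None when the URL does not name a port explicitly,
    # so fall back to the default port for the protocol.
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    # An empty path means the root of the site.
    path = parts.path or "/"
    return parts.scheme, parts.hostname, port, path


print(split_url("https://www.bleepingcomputer.com/examples/example1.html"))
# Hypothetical URL with an explicit port and no path.
print(split_url("http://www.bleepingcomputer.com:8080"))
```

The first call reports the https protocol, the hostname, the default port 443, and the path /examples/example1.html. The second shows an explicit port overriding the default.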
HTTP (Hypertext Transfer Protocol): This is a defined process for transferring information between a web browser and a web server. All web browsers and web servers follow this process.
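As a rough illustration of what that process looks like on the wire, the sketch below builds the text of a very small HTTP request and picks apart the first line of a server's reply. The function names build_request and parse_status_line are made up for this example, and real browsers send many more header lines than this.

```python
def build_request(hostname, path):
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # method, path, and protocol version
        f"Host: {hostname}",     # which site on the server we want
        "Connection: close",     # ask the server to hang up when done
        "",                      # a blank line ends the headers
        "",
    ]
    # HTTP separates lines with a carriage return and a line feed.
    return "\r\n".join(lines)


def parse_status_line(line):
    """Split a reply's first line, such as 'HTTP/1.1 200 OK'."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


print(build_request("www.bleepingcomputer.com", "/examples/example1.html"))
print(parse_status_line("HTTP/1.1 404 Not Found"))
```

The status code in the reply is how the server says whether the request worked: 200 means the document was found, and 404 means it was not.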
HTML (Hypertext Markup Language): This is the language used in web pages to format text, images, and page layout. This language is plain text and is saved in a file whose name ends in .html. It is possible to put HTML in documents that do not end in .html, but for the purpose of this tutorial, we are only focusing on pure HTML documents. The text in these documents contains special codes, called tags, that tell the web browser how to format the text when it reads the file. Let's try an example below.
If you were to create a file called helloworld.html and save it on your hard drive, you could then open this file with your browser and have it displayed. The contents of this file will have the following text:
<b>Hello World!!!!</b>
If you were to open up this document in your browser you would see the following:
Hello World!!!!
As you can see, the text Hello World!!!! is shown in bold print. This is because we enclosed the words in the tags <b> and </b>. The opening <b> means any text after it will be bold, and the closing </b> means this is the end of the bold formatting. Most tags in HTML have a beginning tag that starts the formatting and an ending tag that stops it. There are many more tags available in HTML, the bold ( <b> ) tag being just one of them.
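To get a feel for how a browser reads tags, here is a minimal sketch that uses Python's built-in html.parser module to pick out the text between the bold tags <b> and </b>. It assumes helloworld.html contains Hello World!!!! wrapped in those tags. The class name BoldFinder and the function find_bold are invented for this example, and a real browser does far more than this.

```python
from html.parser import HTMLParser


class BoldFinder(HTMLParser):
    """Collect the text that appears between <b> and </b> tags."""

    def __init__(self):
        super().__init__()
        self.bold_depth = 0   # how many <b> tags are currently open
        self.bold_text = []   # pieces of text seen while bold is on

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self.bold_depth += 1   # beginning tag: bold starts

    def handle_endtag(self, tag):
        if tag == "b" and self.bold_depth > 0:
            self.bold_depth -= 1   # ending tag: bold stops

    def handle_data(self, data):
        if self.bold_depth > 0:
            self.bold_text.append(data)


def find_bold(html):
    """Return all of the bold text found in a piece of HTML."""
    finder = BoldFinder()
    finder.feed(html)
    finder.close()
    return "".join(finder.bold_text)


print(find_bold("<b>Hello World!!!!</b>"))
```

The parser calls one method when it meets a beginning tag, another for an ending tag, and a third for the text in between, which mirrors the start and stop behavior described above.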
Web Browser and Web Servers
In order for the Web to work you need web browsers and web servers which work hand in hand. The web browser is a piece of software that is used to interpret the information found in an HTML document and display the content of that document based upon the HTML tags found within it. A web server is a computer that stores HTML documents, otherwise known as web pages, and waits for connections from web browsers. When a web browser connects to a web server, the web server sends the requested document, if it exists, back to the web browser for display.
Actually Browsing a Web Site
Now that you understand the basics behind how the Web works, let's walk through the actual process of how your computer goes to a web site and displays it in your browser.
The first step, of course, is to open your web browser, whether that be Netscape, Internet Explorer, or Mozilla. When your browser opens, you have the option of connecting to another web site. In the address field, type the location of where you would like to go. For this example, let's go to www.bleepingcomputer.com.
You type https://www.bleepingcomputer.com, or just www.bleepingcomputer.com since the protocol prefix is optional, in the address field and press Enter or Go. The diagram below explains what happens:
As you can see, when you try to connect to a site, your web browser opens an Internet connection and tries to connect to the web server specified in the host portion of the URL. If it connects, the web browser sends the web server the path portion of the URL. If that path exists on the web server, the web server sends the content of the HTML file back to your browser. Your browser reads through the HTML of the document, following the instructions found there as it displays the information on your screen.
That is all there is to retrieving a web page from a remote computer.
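The steps above can be sketched in Python. This is only an illustration under several assumptions: the PAGES table stands in for the files on a web server, the tiny server runs on your own computer instead of a remote one, and the names PageHandler, fetch, and demo are invented for this example.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

# A stand-in web site: one path mapped to one HTML document.
PAGES = {"/helloworld.html": b"<b>Hello World!!!!</b>"}


class PageHandler(BaseHTTPRequestHandler):
    """Play the web server: send the document if the path exists."""

    def do_GET(self):
        body = PAGES.get(self.path)
        if body is None:
            self.send_error(404)  # the path does not exist
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch(url):
    """Play the browser: connect, send the path, read the reply."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Step 1: open a connection to the host named in the URL.
    with socket.create_connection((parts.hostname, parts.port or 80)) as conn:
        # Step 2: send the path portion of the URL to the server.
        request = f"GET {path} HTTP/1.0\r\nHost: {parts.hostname}\r\n\r\n"
        conn.sendall(request.encode("ascii"))
        # Step 3: read until the server closes the connection.
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    # Headers and the HTML document are separated by a blank line.
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    status = int(head.decode("ascii").split(" ", 2)[1])
    return status, body.decode("utf-8")


def demo():
    """Start the local server, then request one real and one missing page."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: pick any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        found = fetch(f"http://127.0.0.1:{port}/helloworld.html")
        missing = fetch(f"http://127.0.0.1:{port}/nothing.html")
    finally:
        server.shutdown()
        server.server_close()
    return found, missing


if __name__ == "__main__":
    print(demo())
```

Running demo() requests one page that exists and one that does not, so you can see both the HTML coming back and the 404 status a server sends when the path is missing. A real browser would then read the returned HTML and draw it on the screen.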
Conclusion
I hope you have enjoyed this tutorial. As always, if you have any questions, do not hesitate to ask us in the forums.
--
Lawrence Abrams
Bleeping Computer Basic Internet Concepts Series
BleepingComputer.com: Computer Support & Tutorials for the beginning computer user.