Computing and the Internet


Summary

This document is the outline of a short workshop given at Northwestern University on 3/14/97 at the AT Techfest. The workshop gave an introduction to networks, networking, the distinction between local area networks (LANs) and wide area networks (WANs), and the Internet. An overview of network protocols and services was given. A brief history of the microprocessor in honor of its 25th anniversary and a discussion of trends in computing concluded the workshop. All links used in the workshop can be found in this document. Some of the discussions are included as well.

Warren A. Kibbe, Ph.D., © 1997 Northwestern University

Welcome and Introduction

What I would like everyone to leave this talk with is a better understanding of what makes a network, how information is sent on a network, how common Internet protocols like the web operate, and how the Internet can be useful to you in your research and your teaching. First off, this presentation is given via the web, and one of the major advantages of giving a presentation over the web is that the talk is accessible afterward. That is, you can go back to the web, find the presentation, follow along with it at your own pace, and even expand upon points that there was no time for in the formal presentation. As I am sure you are all aware, the World Wide Web, or simply "the Web", has become an incredibly pervasive part of American culture in the past two years. Part of that is because of the flexibility of the web, the fact that the web is platform independent (it does not depend on a particular operating system or software), and its ability to convey information graphically. With most browsers, you can either "Open" a location, meaning you type in the uniform resource locator, or URL, or click on a "link" that already contains the URL. During this talk I will rely on existing links, since that is much quicker than typing them in.

For instance, the Medical School's Integrated Graduate Program brochure is on the web:

IGP Bulletin

Northwestern also makes a great deal of information about its own networks available on the web:

NUNet Maps

I make some additional information available that is specific to the Medical School:

NUMS Subnets

Finally, there are lots of fun services that are freely available on the web:

Greeting Card

Networks

So, how does all this work? At the most basic level, for information to travel between two computers there must be either a dedicated connection between the computers, or a "virtual" connection. A dedicated connection would be a single cable connecting two devices, where each computer would be sure that anything it sent on that cable would be received by the other computer. This isn't very flexible, and if you tried to hook up more than just a few computers to each other with dedicated connections, you would quickly end up with an enormous number of cables and ports on each computer. What makes more sense is to send out each bit of information with an address embedded in it, serving to open a virtual connection between two computers rather than a physically exclusive connection. On the Internet, the address of each device is known as an IP number, or Internet Protocol number.
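
To put a number on how quickly dedicated connections get out of hand, here is a minimal sketch. It assumes every computer is wired directly to every other computer, so n computers need n(n-1)/2 cables and n-1 ports each. The function names are made up for this illustration.

```python
def dedicated_links(n_computers: int) -> int:
    """Cables needed if every pair of computers gets its own cable."""
    return n_computers * (n_computers - 1) // 2


def ports_per_computer(n_computers: int) -> int:
    """Each computer needs one port for every other computer."""
    return n_computers - 1


# Print cable and port counts for a few network sizes.
for n in (2, 5, 50, 500):
    print(f"{n} computers: {dedicated_links(n)} cables, "
          f"{ports_per_computer(n)} ports on each computer")
```

The cable count grows with the square of the number of computers, which is why addressed packets on a shared network scale so much better.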

The current IP system uses 4 bytes to describe the address of the sending computer (or device, since printers and other pieces of hardware can send and receive information on a network) and 4 bytes for the address of the receiving device. IP numbers are normally written in decimal as xxx.xxx.xxx.xxx, where each set of three x's is a number from 0 to 255 (256 possible values, or 2 to the 8th power, since one byte is 8 bits, and a bit can have a value of either 0 or 1). In the early days of the Internet, when it was developed with funding from the government agency ARPA (and called ARPANET, which later evolved into the Internet), 4 bytes, or 2 to the 32nd power (4,294,967,296 addresses), was thought to be more than enough for any conceivable network.
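
The relationship between the dotted decimal form and the underlying 4 bytes can be sketched in a few lines of Python. The helper name dotted_to_int is invented for this illustration. The address is the one this document uses as its own example, and Python's standard ipaddress module is used only as a cross-check.

```python
import ipaddress


def dotted_to_int(address: str) -> int:
    """Pack the four decimal fields of an IP number into one 32-bit integer."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a valid 4-byte IP number: {address!r}")
    value = 0
    for p in parts:
        value = (value << 8) | p  # shift left one byte, then add the next field
    return value


example = "165.124.225.182"
print(dotted_to_int(example))
# Cross-check against the standard library's own conversion.
print(dotted_to_int(example) == int(ipaddress.IPv4Address(example)))
print(2 ** 32)  # total number of distinct 4-byte addresses
```

Each field fills exactly one byte, so the largest possible address, 255.255.255.255, corresponds to 2 to the 32nd power minus 1.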

Now there are plans underway to upgrade the IPv4 system to IPv6 (16 bytes, or 2 to the 128th power addresses), and include a security layer at the packet level. I don't really want to discuss security and the Internet, but one downside of the Internet, particularly for businesses, is that any computer on a physical subnet or LAN can see all the traffic destined for any device on that subnet. That means that someone could easily write a program to "sniff" the packets on a subnet, and get information being sent on it.

To get back to the main point, on the Internet, information is directed by IP address. A single message sent between two devices on the Internet is sent inside a packet. A packet contains the IP address and routing information in a header. Next comes a description of the information included in the packet, and finally the actual information in the packet. Without muddying the waters too much, packets are sent to ports on a computer. This port concept allows a single computer to run multiple network services simultaneously. For instance, some common services (also called protocols, just to confuse you) are Telnet, FTP, gopher, and of course the World Wide Web. By common convention, Telnet uses port 23, FTP port 21, gopher port 70, PH port 105, and the web (http) port 80.
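
The header-then-contents idea can be illustrated with a toy packet. This is a deliberately simplified layout invented for this sketch, not the real IP or TCP header format: 4 bytes of source address, 4 bytes of destination address, a 2-byte destination port, a 2-byte payload length, and then the payload. The source address 192.0.2.10 is a placeholder from the range reserved for documentation, and the payload is made up.

```python
import socket
import struct

# Port numbers as listed in the text above.
WELL_KNOWN_PORTS = {"ftp": 21, "telnet": 23, "gopher": 70, "ph": 105, "http": 80}

# Toy header (an assumption for illustration, not the real IP header):
# source IP, destination IP, destination port, payload length.
HEADER = struct.Struct("!4s4sHH")


def build_packet(src: str, dst: str, service: str, payload: bytes) -> bytes:
    """Prefix the payload with a header carrying addresses and a port."""
    header = HEADER.pack(
        socket.inet_aton(src),       # dotted decimal -> 4 raw bytes
        socket.inet_aton(dst),
        WELL_KNOWN_PORTS[service],
        len(payload),
    )
    return header + payload


def parse_packet(packet: bytes):
    """Split a toy packet back into (source, destination, port, payload)."""
    src, dst, port, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    return socket.inet_ntoa(src), socket.inet_ntoa(dst), port, payload


packet = build_packet("192.0.2.10", "165.124.225.182", "http", b"GET /bookmarks.html")
print(parse_packet(packet))
```

The receiving computer reads the port number from the header to decide which service gets the payload, which is how one machine can run several services at once.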

Cisco has a very complete description of IP addresses and subnets at http://www.cisco.com/univercd/data/doc/cintrnet/idg3/idgvlsm.htm.

IP numbers to names

For those of you who have used the web, FTP, Telnet, or any other service on the Internet, you have realized that when I give an example of a URL, it uses a "human readable" form, rather than four numbers. For instance, the IP number for the computer where this document resides is 165.124.225.182, but when I write the URL for the server, I write www.basic.nwu.edu. The reason I can use the name and the number interchangeably is something called the Domain Name System, or DNS. There is a big look-up table for all the computers on the Internet that maps IP numbers to DNS entries, and vice versa. At NU, many services, such as the Windows FTP server nuns.acns.nwu.edu and many library services, require that computers logging into these services have valid DNS entries and come from the nwu.edu domain, for security and also licensing reasons.
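
The look-up in both directions can be sketched with an ordinary dictionary. This is only a toy table holding the single name and number quoted above, not how DNS itself is implemented. In a real Python program, socket.gethostbyname and socket.gethostbyaddr ask the actual DNS to do the same job.

```python
# Toy name table holding the one entry quoted in this document.
FORWARD = {"www.basic.nwu.edu": "165.124.225.182"}

# Build the reverse table (number -> name) from the forward one.
REVERSE = {ip: name for name, ip in FORWARD.items()}


def name_to_ip(name: str) -> str:
    """Forward look-up; host names are not case sensitive."""
    return FORWARD[name.lower()]


def ip_to_name(ip: str) -> str:
    """Reverse look-up, as used by services that check where you come from."""
    return REVERSE[ip]


print(name_to_ip("www.basic.nwu.edu"))
print(ip_to_name("165.124.225.182"))
# A service restricted to one domain might check the reverse entry like this:
print(ip_to_name("165.124.225.182").endswith(".nwu.edu"))
```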

WANs and LANs

The main differences between WANs and LANs are the number of computers connected and the bandwidth of the connection. For instance, our campus backbone, NUNet, is a WAN with multiple high bandwidth segments. Most LANs are a single subnet, and connect a department, or a floor of a single building. At NU, our subnets and NUNet allow the transmission of TCP/IP packets (the full name for Internet packets), AppleTalk (Macintosh) packets, IPX (Netware) packets, and NetBIOS (Windows) packets. If your computer supports one of those protocols, then you can "talk" to other computers using those protocols anywhere on NUNet. With TCP/IP, you can talk to computers anywhere in the world, since we have a connection between NUNet and the Internet.
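
As a rough illustration of what "being on the same subnet" means for TCP/IP, the sketch below treats the first part of an IP number as the subnet and the rest as the device. The 24-bit split, the function name, and the second and third addresses are assumptions made for this example only. They do not describe how NUNet is actually divided.

```python
import ipaddress


def same_subnet(a: str, b: str, prefix_len: int = 24) -> bool:
    """True if both addresses share the first prefix_len bits.

    prefix_len=24 is an assumed, illustrative subnet size.
    """
    # strict=False lets us build the network from a host address.
    network = ipaddress.ip_network(f"{a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(b) in network


server = "165.124.225.182"
print(same_subnet(server, "165.124.225.7"))      # hypothetical neighbor
print(same_subnet(server, "165.124.224.7"))      # hypothetical other subnet
print(same_subnet(server, "165.124.224.7", 16))  # same larger network
```

Traffic between two devices on the same subnet stays on that LAN. Anything else has to be passed along to the backbone.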

Map of NUNet WAN


Service protocols over TCP/IP

I know I have run through a lot of material in a very short period of time. To recap, the Internet standard for sending information between two computers is TCP/IP. Two devices (like two computers, or a computer and a printer) can send information to each other in TCP/IP packets if they have known and unique IP addresses. Built into TCP/IP is the concept of virtual ports, where each port can have different services bound to it. For instance, the web is generally bound to port 80, so all http: requests that are sent out go to port 80 by default.
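
The "port 80 by default" rule can be shown with Python's standard URL parser. This is a minimal sketch: the default-port table repeats the conventions given earlier in this document, the helper name is made up, and the URL with an explicit port is a hypothetical example.

```python
from urllib.parse import urlsplit

# Conventional ports for a few services, as listed earlier in this document.
DEFAULT_PORTS = {"http": 80, "ftp": 21, "gopher": 70, "telnet": 23}


def host_and_port(url: str):
    """Return the host name and the port a request for this URL would go to."""
    parts = urlsplit(url)
    # Use the explicit port if the URL has one, else the service's default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port


print(host_and_port("http://www.basic.nwu.edu/bookmarks.html"))
# Hypothetical URL that names a port explicitly, overriding the default:
print(host_and_port("http://www.basic.nwu.edu:8080/bookmarks.html"))
```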

The role of computers: Coping with Information in our world

We've talked about networking issues so far. Why should you care? One reason for investing in computers and computer infrastructure is for the access, management and manipulation of information. As I am sure that everyone today is aware, there is an increasing amount of published material in practically every discipline. How much information is available? According to a study by Ernst and Young, the volume of published material doubled from 1880 to 1930, again from 1930 to 1960, doubled from 1960 to 1970, and is currently doubling every 18 months. Keeping track of the mass of publication, let alone accessing and analyzing information in this sea of data, is difficult at best. To help manage this overwhelming mass of data, computing solutions using search engines and "intelligent agents" have emerged to let us skim and peruse huge databases of data, looking for the few kernels of relevant information from the warehouses full of data fodder. The promise of these new technologies is great - they should allow us to have access to just the information we are looking for, without having to wade through enormous amounts of similar or related material for the fact that we are seeking. Along with this promise is a danger - we do not want to filter out related or tangential information that may give a key insight into a problem.

The sheer volume of information available

One of the benefits of life after World War II in the United States is that we have experienced 50 years of uninterrupted prosperity and growth. As a part of that growth, we have much better access to information, and as the cost of publication plummets, our information output skyrockets. For instance, the amount of information available in printed and electronic form is shown in the graph below.

The only technology that we currently have that has the promise of dealing with this explosion of information in the near future is digital computing. Fortunately, along with the exponential growth of information, we are experiencing an exponential growth of digital storage and computational power. The cost of storage and computer memory is plotted below:

New storage technologies, such as DVD, should continue to push the cost of storage down, with storage costs less than a penny per megabyte by the turn of the century. Similarly, the cost of computation, or computational power, has dramatically declined. The computing ability of a commodity desktop computer today approaches that of a midrange ($50K-100K) workstation of just a few years ago. Likewise, the graphical display capabilities of today's Macintosh and Windows computers rival those of a high-end Evans and Sutherland graphics workstation of a decade ago. All of this computational and imaging horsepower on your desk for about $3,000!

Driving the revolution: the personal computer

Since the first microprocessor was fabricated by Intel in 1971, we have seen a tremendous revolution in the way that computers are used for research, education, and business. In 1971, the mainframe was the way you acquired computing power. Mainframes were expensive, large, and required a lot of maintenance. Today, for less than $3,000 you can have a computer on your desktop that has more computational power than a 1972 IBM 370, and has a crisp 17" color monitor. Processing power is generally not the bottleneck for desktop computers today - I/O throughput and storage are more of the issue, especially for modelling and imaging applications.

CPU advances, both in manufacturing practices and architecture, are driving much of the increase in performance seen in this graph. According to "Moore's Law", the complexity and performance of integrated circuits will progress at an exponential rate, roughly doubling every 18 months. This observation was made in 1965 by Gordon Moore while at Fairchild Semiconductor, based on watching the early years of the semiconductor industry. In 1968, Moore, Bob Noyce and Andy Grove left Fairchild Semiconductor (where Noyce had invented the integrated circuit in 1959) and formed Intel. Grove is currently the CEO of Intel, and 30 years later Moore's predictions have been borne out by 25 years of growth in the microprocessor industry. As you can see from the following graph, processing power continues to increase, and the cost of computing continues to decline.
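
To get a feel for what "doubling every 18 months" implies, here is a small sketch of the arithmetic. It takes the 18-month figure quoted above at face value and computes only the growth multiple. It is not a model of any particular processor, and the function name is made up for this illustration.

```python
def growth_factor(years: float, doubling_period_years: float = 1.5) -> float:
    """Multiple by which a quantity grows if it doubles every period."""
    return 2.0 ** (years / doubling_period_years)


# Print the growth multiple over a few time spans.
for years in (1.5, 3, 15, 30):
    print(f"after {years} years: about {growth_factor(years):,.0f} times")
```

The same function applies to the 18-month doubling of published material mentioned earlier, since the arithmetic of exponential growth is identical.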

Links to a few cpu manufacturers

Searching the Internet

Searching & Info servers

NU: Searching the Internet
Alta Vista: Main Page
Infoseek Home Page
Internet Search
Lycos Search Form
WebCrawler Searching
Yahoo

Resources for the Life Sciences

I hope that this presentation helps you better understand the nature of the Internet, and at least introduces you to a few of the advantages that the Internet offers. For a compilation of my favorite hundred or so sites, see:

http://www.basic.nwu.edu/bookmarks.html

Disclaimer:
This article does not constitute an official Northwestern University policy or represent university views. No guarantee, warranty or claim is made or should be implied by the content or absence of content.

Any services, facilities, or other sites listed in these pages are strictly listed as a convenience to the reader. No endorsement of the content of listed sites or identity of any published organization is implied or intended.

last modified 3/13/97

Please contact Warren Kibbe if you have comments on the content of this page or others at www.basic.nwu.edu.