Monday, May 29, 2006

Computer Virus: The Threat is Real

It is not overstating the case to say that viruses could interrupt the free flow of information that has been built up by the personal computer in the last 10 years. Indeed, the prevalence of viruses has ushered in a new era of safe computing, to the point where those who ignore the guidelines run grave risks. Considering the extreme warnings of danger and the incidents already on record, it is a mystery that there are those in the computing industry who claim news reports of viruses are exaggerated.

A computer virus is a program designed to replicate and spread on its own, mostly without your knowledge. Computer viruses spread by attaching themselves to another program (such as your word processing or spreadsheet programs) or to the boot sector of a diskette. When an infected file is executed, or the computer is started from an infected disk, the virus itself is executed. Often, it lurks in memory, waiting to infect the next program that is run, or the next disk that is accessed. In addition, many viruses also perform a trigger event, such as displaying a message on a certain date, or deleting files after the infected program is run a certain number of times. While some of these trigger events are benign (such as those that display messages), others can be detrimental. The majority of viruses are harmless, displaying messages or pictures, or doing nothing at all. Other viruses are annoying, slowing down system performance, or causing minor changes to the screen display of your computer. Some viruses, however, are truly menacing, causing system crashes, damaged files and lost data.

A virus is inactive until the infected program is run or the boot record is read. When the virus is activated, it loads into the computer's memory, where it can perform a triggered event or spread itself. Disks used in an infected system can then carry the virus to another machine. Programs downloaded from bulletin boards can also spread a virus. Ordinary data files generally cannot transfer a virus, though they can become damaged (macro viruses, discussed below, are the notable exception, since they travel inside documents).

Viruses spread when you launch an infected application or start up your computer from a disk that has infected system files. For example, if a word processing program contains a virus, the virus activates when you run the program. Once a virus is in memory, it usually infects any application you run, including network applications (if you have write access to network folders or disks).

Different viruses behave differently. Some viruses stay active in memory until you turn off your computer. Other viruses stay active only as long as the infected application is running. Turning off your computer or exiting the application removes the virus from memory, but does not remove the virus from the infected file or disk. Hence, the virus will activate again the next time you run the application.

Virus attacks are growing rapidly these days. According to Business Week, 76,404 attacks were reported in the first half of 2003, nearly matching the previous year's entire tally. As new anti-virus tools emerge, virus writers are also getting smarter, finding newer and more creative ways to clog and bring down networked systems. Some common types of viruses are discussed below:

1. Boot viruses: These viruses infect floppy disk boot records or master boot records on hard disks. They replace the boot record program (which is responsible for loading the operating system into memory), either copying it elsewhere on the disk or overwriting it. Boot viruses load into memory if the computer tries to read the disk while it is booting. Some of the examples of this type of virus include: Disk Killer, Michelangelo, and Stone virus.

2. Program viruses: These viruses infect executable program files, such as those with extensions like .BIN, .COM, .EXE, .OVL, .DRV (driver) and .SYS (device driver). These programs are loaded into memory during execution, taking the virus with them. The virus becomes active in memory, making copies of itself and infecting files on disk. Some of the examples of this type of virus include: Sunday and Cascade.

3. Multipartite viruses: These viruses are hybrids of boot and program viruses. They infect program files, and when the infected program is executed, they also infect the boot record. The next time you boot the computer, the virus loads from the boot record into memory and then starts infecting other program files on the disk. Some of the examples of this type of virus include: Invader, Flip, and Tequila.

4. Stealth viruses: These viruses use certain techniques to avoid detection. They may either redirect the disk head to read another sector instead of the one in which they reside or they may alter the reading of the infected file’s size shown in the directory listing. For instance, the Whale virus adds 9216 bytes to an infected file; then the virus subtracts the same number of bytes (9216) from the size given in the directory. Some of the examples of this type of virus include: Frodo, Joshi and Whale.

5. Polymorphic viruses: These viruses can encrypt their code in different ways so that they appear different in each infection, which makes them more difficult to detect. Some of the examples of this type of virus include: Involuntary, Stimulate, Cascade, Phoenix, Evil, Proud and Virus 101.

6. Macro Viruses: These viruses infect the macros within a document or template. When you open a word processing or spreadsheet document, the macro virus is activated and it infects the Normal template file that stores default document format settings. Every document you open refers to the Normal template, and hence gets infected with the macro virus. Since this virus attaches itself to documents, the infection can spread if such documents are opened on other computers. Some of the examples of this type of virus include: DMV, Nuclear and Word Concept.

Some of the symptoms commonly reported after a virus attack are listed below:
• "My program takes longer to load suddenly."
• "The program size keeps changing."
• "My disk keeps running out of free space."
• "I keep getting 32 bit errors in Windows."
• "The drive light keeps flashing when I'm not doing anything."
• "I can't access the hard drive when booting from the A: drive."
• "I don't know where these files came from."
• "My files have strange names I don't recognize."
• "Clicking noises keep coming from my keyboard."
• "Letters look like they are falling to the bottom of the screen."
• "My computer doesn't remember CMOS settings, the battery is new."

In order to combat viruses, software vendors should focus on making their products less vulnerable. This may require a trade-off between user-friendliness and security. In specific cases it may require line-by-line inspection, code retooling and even systems automation to bulletproof installed programs.
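As a small illustration of the kind of inspection anti-virus and integrity tools perform, the sketch below records a baseline of cryptographic hashes for a set of program files and flags any file whose size or contents later change (one of the symptoms listed above). It is only a minimal sketch in Python; the watched directory and baseline file name are invented for the example, and real scanners do far more than hash comparison.

# Minimal sketch of file integrity monitoring (illustrative only).
# A baseline of SHA-256 hashes is recorded once, then later scans are
# compared against it to flag files whose size or contents have changed.
import hashlib
import json
import os

BASELINE = "baseline.json"        # hypothetical location of the stored baseline
WATCHED_DIR = "C:/Programs"       # hypothetical directory of executables to watch

def hash_file(path):
    """Return the SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

def scan(directory):
    """Map every file under 'directory' to its hash and size."""
    snapshot = {}
    for root, _, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            snapshot[path] = {"hash": hash_file(path), "size": os.path.getsize(path)}
    return snapshot

if __name__ == "__main__":
    current = scan(WATCHED_DIR)
    if not os.path.exists(BASELINE):
        with open(BASELINE, "w") as f:
            json.dump(current, f, indent=2)   # first run: record the baseline
    else:
        with open(BASELINE) as f:
            baseline = json.load(f)
        for path, info in current.items():
            old = baseline.get(path)
            if old is None:
                print("New file:", path)
            elif old != info:
                print("Changed (possible infection):", path)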

Super Computers

There are a few different definitions of exactly what constitutes a supercomputer, but they all have one thing in common: "supercomputer" is a broad term for a mainframe computer that is among the largest, fastest, or most powerful of those available at a given time.

Supercomputers, just like any other typical computer, have two basic parts. The first is the CPU, which executes the commands it is given. The other is the memory, which stores data. The main difference between an ordinary computer and a supercomputer is that a supercomputer's CPUs operate at far faster speeds; the length of the machine's clock cycle determines the exact speed at which a CPU can work. By using complex, state-of-the-art materials connected as circuits, supercomputer designers optimize the functions of the machine. They also try to keep the circuit paths as short as possible so that information from the memory reaches the CPU in less time.

There are effectively three types of supercomputer: vector-based architectures, bus-based multiprocessors and parallel computers. All supercomputers make use of parallelism or vector processing, either separately or combined, to enhance work-rate. An increased demand for even higher rates of calculation brought about the advent of MPP (Massively Parallel Processing) machines such as the Thinking Machine and Intel's Hypercube. Vector-based machines suit tasks that cannot easily be split up, whereas parallelism and clustering suit tasks that can be broken down into components (e.g. particle simulations where each computer can emulate a single particle). With the advent of faster and more efficient processors for home users, people can effectively build fairly cheap supercomputers in their own homes. An example of this is a Beowulf cluster based on the Linux operating system, which can harness parallel processing between computers with standard IBM-PC architectures.
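The parallelism idea behind MPP machines and clusters can be sketched in a few lines: the particle-simulation example above maps naturally onto a pool of worker processes, each advancing one particle independently. The sketch below is a toy illustration in Python on a single machine, not a real Beowulf cluster; the trivial particle model, step count and particle count are invented for the example.

# Toy illustration of the parallelism idea behind MPP machines and clusters:
# each worker process advances one "particle" independently, and the results
# are gathered at the end. The physics here is deliberately trivial.
from multiprocessing import Pool

def simulate_particle(args):
    """Advance a single particle with constant velocity for n steps."""
    position, velocity, steps = args
    for _ in range(steps):
        position += velocity * 0.01   # fixed time step of 0.01 units
    return position

if __name__ == "__main__":
    particles = [(float(i), 1.5, 1000) for i in range(8)]  # 8 independent particles
    with Pool() as pool:                                   # one worker per CPU core
        final_positions = pool.map(simulate_particle, particles)
    print(final_positions)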

The speed of computer processors is often quoted in megahertz (MHz) or gigahertz (GHz), but processing power is measured by the number of FLOPS (floating-point operations per second) a computer can perform. The power of home computers is usually expressed in MegaFLOPS, whereas the power of supercomputers is expressed in GigaFLOPS. To put this in simple terms, the Cray T3E with 256 parallel processors puts out 153.4 GigaFLOPS, that is, 153,400,000,000 mathematical calculations every second. This is equivalent to roughly 25 times the world's entire population each doing one calculation per second.
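The arithmetic behind that comparison can be checked directly, assuming a world population of roughly six billion (the approximate figure when the Cray T3E number was quoted):

# Check of the Cray T3E comparison: 153.4 GigaFLOPS versus a world
# population of roughly 6 billion people doing one calculation per second.
cray_t3e_flops = 153.4e9          # 153.4 GigaFLOPS
world_population = 6.0e9          # assumed approximate population at the time
print(cray_t3e_flops / world_population)   # -> about 25.6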

Supercomputers are typically used for high-end number crunching, which encompasses tasks such as:
• Scientific simulations
• Graphics & Animation
• Analysis of geological or geographical data
• Structural analysis
• Fluid dynamics
• Physics calculations
• Chemistry modelling
• Electronic design & research
• Nuclear energy research
• Meteorology.

The best known and one of the longest-standing supercomputer manufacturers is Cray Research. Cray Research is the market leader for supercomputers and is especially well known because it makes no entry-level computers; it focuses only on supercomputers.

Seven-Eleven Japan Co. Ltd.: Integrating E-Commerce With Traditional Retailing – A Case Study*

Introduction
Electronic Commerce is about doing business electronically. It is based on the electronic processing and transmission of the data, including text, sound, and video. It encompasses many diverse activities including electronic trading of goods and services, online delivery of digital content, electronic fund transfers, electronic share trading, electronic bills of lading, commercial auctions, collaborative design and engineering, online sourcing, public procurement, direct consumer marketing, and after sales service. It involves both products (e.g. consumer goods, specialized medical equipment) and services (e.g. information services, financial and legal services); traditional activities (e.g. health care, education) and new activities (e.g. virtual malls)
– A definition by European Commission.

E-Commerce is still in its nascent stages in India. Meanwhile, we have already witnessed the dot-com debacle. The biggest reason for the dot-com debacle was the widespread misapplication of a deeply flawed idea – New Growth Theory (NGT). NGT argued that people possess an almost infinite ability to combine physical resources into value-creating ideas and that these recipes are a key source of economic growth. NGT went wrong when companies bet trillions of dollars to build the very expensive first copies of these recipes on NGT's failed premise that companies would reap tens of trillions in profits as demand soared while the cost of incremental copies converged on zero. Another reason for the dot-com debacle was that many entrepreneurs and investors did not consider principles of consumer behaviour when they were developing their Internet-based businesses.

In spite of all this, we have learned some precious lessons from it. One of the most important is that the use of technology by itself cannot guarantee a successful business venture; it has to be backed by a sustainable business model as well.

In these testing times, when most e-commerce ventures have failed, the world has also witnessed a few success stories. Seven Eleven Japan Co. Ltd. is one of them. We can learn a lot from its experience when developing sustainable e-commerce models in the Indian context. With this end in view, the e-tailing model adopted by Seven-Eleven Japan is discussed below.

About Japan
Comparing Japan with India in absolute terms would not be correct, and the lessons learned there cannot be applied strictly in the Indian context. However, since Japan resembles India to a great extent in terms of consumer behaviour, way of life, culture, the high cost of connecting to the Internet and so on, it is not completely out of context to discuss a Japanese e-tailing model. But before we delve deeper into the topic, let us first take a brief look at the history of Japan.

After its defeat in World War II, Japan recovered to become an economic superpower and a staunch ally of the US. While the emperor retains his throne as a symbol of national unity, actual power rests in networks of powerful politicians, bureaucrats, and business executives. The economy experienced a major slowdown starting in the 1990s following three decades of unprecedented growth. Today Japan is among the world's largest and most technologically advanced producers of motor vehicles, electronic equipment, machine tools, steel and nonferrous metals, ships, chemicals, textiles and processed foods.

Japanese consumers, just like their Indian counterparts, like to visit a number of shops before buying a product. Moreover, they like to get a feel of the product in person before buying, and they generally do not trust Internet sites with their credit card numbers and personal information. Last but not least, they love to walk into a convenience store, as can be seen from what Makoto Usui, Director, Seven-Eleven Japan, had to say:

“The Japanese person who does not pass a convenience store on the way back home from the train station does not exist.”

In May 2000, the Economist Intelligence Unit (EIU) surveyed 60 countries and ranked them on a ten-point scale on the basis of their readiness for e-business. The countries were divided into four categories namely:

E-business leaders
These countries already have most of the elements of "e-readiness" in place, though there are still some concerns about regulatory safeguards.

E-business contenders

These countries have both a satisfactory infrastructure and a good business environment. But parts of the e-business equation are still lacking.

E-business followers

These countries--the largest group in our rankings--have begun to create an environment conducive to e-business, but have a great deal of work still to do.

E-business laggards

These countries risk being left behind, and face major obstacles to e-business growth, primarily in the area of connectivity.

The top ten countries according to the survey were the US, Australia, the UK, Canada, Norway, Sweden, Singapore, Finland, Denmark and the Netherlands. Japan was placed in the 18th position and was categorized as an e-business contender, a very poor performance for a G7 country. India, on the other hand, was placed in the 45th position and was categorized as an e-business follower. The same survey conducted in July 2002 placed Japan in the 25th position, i.e. seven positions down from its previous ranking, while India moved two positions up, from 45th to 43rd. The survey conducted in 2004 across 64 countries ranked Japan at the 25th position and India at the 46th.

The dismal performance of Japan and India in the e-readiness rankings can partly be attributed to high telecom and ISP charges. According to eMarketer, in the year 2000 Japan had the world's highest combined telecom and ISP charges, at $67.12 per 20 hours, compared to the next highest, India, at $42.30, and the US at $30.05. Governments around the world have since realised that the Internet can act as an important catalyst for economic growth and development, and as a result policies aimed at increasing Internet penetration have been framed and implemented, bringing down overall telecom and ISP charges.

7-Eleven, Inc. USA
7-Eleven, Inc. of the US is the world's largest operator, franchisor and licensor of convenience stores, with more than 27,500 stores worldwide. The company's name was changed from The Southland Corporation after approval by shareholders in 1999. Founded in Dallas, Texas in 1927 as an ice company, 7-Eleven pioneered the convenience store concept during its early years when its ice docks began selling milk, bread and eggs as a convenience to customers. The name 7-Eleven originated in 1946, when the stores were open from 7 a.m. until 11 p.m. Today, offering customers 24-hour convenience, seven days a week, is the cornerstone of 7-Eleven's business.

Approximately 5,800 7-Eleven and other convenience stores are operated and franchised in the United States and Canada. Together, these stores serve approximately 6 million customers daily. Every store focuses on meeting the needs of convenience-oriented customers by providing fresh, high-quality products and services at everyday fair prices, speedy transactions and a clean, safe and friendly shopping environment. Each store's selection of up to 2,500 different products and services is tailored to meet the preferences of local customers. Stores typically vary in size from 2,400 to 3,000 square feet and are most often located on corners for the greatest visibility and easiest access. In addition, 7-Eleven offers a number of convenient services, including automated money orders, copiers, fax machines, ATMs, phone cards and, where available, lottery tickets.

Approximately 3,200 of 7-Eleven, Inc.'s 5,800 stores in North America are operated by franchisees, and approximately 485 are operated by licensees. The remainder are company-operated stores. 7-Eleven, Inc.'s licensees and affiliates operate more than 20,000 7-Eleven and other convenience stores in Japan, Australia, Mexico, Taiwan, Singapore, the Philippines, the United Kingdom, Sweden, Denmark, South Korea, Thailand, Norway, Turkey, Malaysia, China and Guam.

Since 1991, IYG Holding Company, along with Seven-Eleven Japan Co. Ltd, has held around 74% of the outstanding shares of 7-Eleven, Inc. IYG Holding Company is the parent company of Seven-Eleven Japan and therefore also the parent company of 7-Eleven, Inc. Meanwhile, Seven-Eleven Japan alone owns 36.2 percent of the outstanding shares of 7-Eleven, Inc., so 7-Eleven, Inc. can be considered an affiliate of Seven-Eleven Japan.

Seven-Eleven Japan
Seven-Eleven Japan was set up in 1973. Seven-Eleven Japan cut the tape for its first store in Toyosu, Koto-ku, Tokyo, in May 1974. By the mid-1980s it had already replaced old-fashioned cash registers with point-of-sale (POS) systems that monitor customer purchases. By 1992 it had overhauled its information-technology systems four times. But the biggest overhaul of all took place in 1995. The new system that Seven-Eleven installed was based on proprietary technology—albeit state of the art—rather than on the still barely tested open structure of the Internet.

The new system allows Seven-Eleven to transfer multimedia content at high speeds and interconnect all its stores scattered across Japan. It built the system by procuring hardware from NEC and software from Microsoft. By 1998, the overhaul, which cost ¥60bn ($490m), was complete. The new system replaced the ragbag of systems used before. A pipeline to Microsoft’s offices in Seattle provided instant support. The software backup constantly monitored and automatically rebooted the system when it crashed, and alerted local maintenance firms if such errors occurred more than twice.

All Seven-Eleven stores now have a satellite dish. The company chose satellite dishes because they are cheaper than ground cables, and they are often the only option for shops in rural areas. In earthquake-prone Japan, the satellite dish also provides an extra layer of safety on top of two sets of ISDN telephone lines and separate mainframes in Tokyo and Osaka.

Seven-Eleven’s new technology gave it the following four advantages:
• The first was in monitoring customer needs, which were changing as deregulation made shoppers more demanding.
• Second, Seven-Eleven used sales data and software to improve quality control, pricing and product development. The company can collect sales information from all its stores three times a day, and analyse it in roughly 20 minutes.
• Third, technology has helped to predict daily trends. As customers become more fickle, product cycles are shortening.
• Finally, Seven-Eleven’s electronic investment has also improved the efficiency of its supply chain. Orders flow quickly and are electronically processed in less than seven minutes. They are sent to 230 distribution centres that work exclusively for Seven-Eleven. Truck drivers carry cards with bar codes that are scanned into store computers when they arrive with a delivery. If a driver is often late, the operator will review his route and might add another truck to lighten the load.

In the same way, Seven-Eleven helps vendors and manufacturers to control their inventories. It uses its database to instruct them on all sorts of small details, such as what sauce to put into its ready-made noodles in order to maximise sales. It is, however, hedging its bets by studying how international rivals such as Wal-Mart use the Internet for global product procurement.

Seven-Eleven is already using the Internet to lower its annual overhead costs of around ¥70bn. It plans to install an e-commerce software package offered by the Japanese arm of Ariba, an American e-procurement company, to bulk-buy goods and services such as office equipment and insurance policies for its employees.

The Payment Mechanism
The company has increased its customer traffic by turning shops into payment and pick-up points for Internet shoppers. This was a clever move in a country in which people are still wary of using credit cards over the Internet, preferring instead to pay cash at a store. Seven-Eleven's stores now sell almost 50% more on average every day than those of its closest rival. Its Internet site, 7dream.com, was launched last July with seven other companies, including NEC and Nomura Research Institute. The site offers a wide range of goods and services, including books, CDs, concert tickets and travel. A customer can log on to the website and place an order, specifying the Seven-Eleven store where he would like to pick up the merchandise if he does not want the company to ship it to his address. The Order Centre notifies the customer of the pick-up date via email. The customer can either pay online for the merchandise or print out the payment slip and pay in person at the store.

When it comes to running such online businesses, Seven-Eleven seems likely to have just as much difficulty as others have done: lots of costs, few customers. For the convenience store, as for other businesses, the real savings are likely to come from deploying the Internet as a management tool. It already knows how to cut costs by replacing paper with electronic delivery: it has trimmed ¥300m a year over the past decade by becoming a “paperless” business.

Conclusion
In only three decades since its establishment in 1973, Seven Eleven Japan Co. Ltd. has successfully popularized the convenience store business model in Japan and positioned itself as the convenience store sector’s undisputed leader.

The Indian retail sector, which is estimated to be around $180 billion, is witnessing tremendous growth with the changing demographics and an increase in the quality of life of urban people. However, 98 per cent of the sector consists of "traditional retailing", with much of the business handled by local kirana stores. Owing to this fragmented structure, the sector suffers from limited access to capital, labour and suitable real estate options. Therefore, at this moment, it is still premature to say that the Indian retail market will replicate the success stories of names such as Wal-Mart, Sainsbury and Tesco; but at least the sector has shown signs of aligning itself with global trends.

Secondary Storage Devices

Secondary storage devices are auxiliary storage devices that are used to store data and programs when they are not being processed. Secondary storage is more permanent than main memory, as data and programs are retained even when the power is turned off. The need for secondary storage can vary greatly between users. A personal computer might only require 20 megabytes of secondary storage, but large companies may require secondary storage devices that can store billions of characters. Because of such a variety of needs, a variety of storage devices are available in the market. Some of these devices are discussed below:

(i) Magnetic tapes
Magnetic tape is a one-half inch or one-quarter inch ribbon of plastic material on which data is recorded. The tape drive is an input/output device that reads, writes and erases data on tapes. Magnetic tapes are erasable, reusable and durable. They are made to store large quantities of data inexpensively and therefore are often used for backup. Magnetic tape is not suitable for data files that are revised or updated often because it stores data sequentially.

(ii) Magnetic disks
Magnetic disks are the most widely used storage medium for computers. A magnetic disk offers high storage capacity, reliability, and the capacity to directly access stored data. Magnetic disks hold more data in a small place and attain faster data access speeds. Types of magnetic disks include diskettes, hard disks, and removable disk cartridges.

(a) Diskettes: The diskette was introduced in the early 1970s by IBM as a new type of secondary storage. Originally they were eight inches in diameter and were thin and flexible, which gave them the name floppy disks, or floppies. Diskettes are used as the principal medium of secondary storage for personal computers. They are available in two different sizes: 3 1/2 inch and 5 1/4 inch.

(b) Hard disks: Hard disks provide larger and faster secondary storage capabilities than diskettes. Usually hard disks are permanently mounted inside the computer and are not removable like diskettes. On minicomputers and mainframes, hard disks are often called fixed disks. They are also called direct-access storage devices (DASD).

(c) Disk Cartridges: Removable disk cartridges are another form of disk storage for personal computers. They offer the storage and fast access of hard disks and the portability of diskettes. They are often used when security is an issue since, when a person has finished using the computer, the disk cartridge can be removed and locked up leaving no data on the computer.

(d) Removable-Pack Disk Systems: These consist of hard disks stacked into a pack, an individual unit that can be mounted or removed as a whole. They are typically found on mainframe and minicomputer systems. A typical disk pack has 11 disks, each with two surfaces. Only 20 of the surfaces can be used for recording data; the top and bottom surfaces are not used. Each surface is divided into tracks, where the data is recorded.

(e) Winchester Disk Systems: These disks are hermetically sealed units that cannot be removed from the disk drive. They are typically used in microcomputers and have capacities in the range of 20 to 30 GB.

(f) Zip Disks: These are high-capacity floppy disk drives developed by Iomega Corporation. They are slightly larger than the conventional floppy disks, and are about twice as thick. They can hold 750 MB of data.

(g) Jaz Disks: These are removable disks. The Jaz drive has a 12-ms average seek time and a transfer rate of 5.5 Mbps, and the disks can hold 1 GB of data.

(h) REV Drive: REV drives are the latest secondary storage devices launched by Iomega Corporation. They provide removable storage with hard disk performance and can store up to 90 GB of compressed data.

(i) USB Drives: USB drives use the USB port on the computer for data transfer. Mini Drives launched by Iomega Corporation can hold up to 1 GB of data and are extremely small: a Mini drive measures about 2.22 cm in width, 7.30 cm in length and 1.11 cm in height.

(j) Optical Disks: Optical disk is a disk in which light is the medium used to record and read data. The disk is made of clear polycarbonate plastic, covered with a layer of dye, a thin layer of gold, which reflects the laser beam, and a protective layer over that. A recording is made by sending pulses from a laser beam, which make a pattern in the layer of dye. The recording is read later by directing a laser beam at the disk and interpreting the pattern of reflected light. CDs, CD-ROMs, and video discs, are commercially recorded optical disks and are not rewritable.

Recordable optical disks include WORM (write once read many) disks and CD-Rs (CD-recordables), which can be written only once, and CD-Es (CD-erasables), which can be rewritten many times. A Digital Video Disk, or DVD, has a much larger capacity than a CD, even though both are physically the same size. They both read and write data in similar ways, and all recordable DVD drives can also record CDs. But a single DVD disc can store up to 13 times the amount of data contained on a CD on one side alone. Since both sides of a DVD can be used for data storage, DVDs can offer up to 26 times the storage of a compact disc.

Currently, there are five different recordable formats: DVD-R, DVD-RW, DVD-RAM, DVD+R, and DVD+RW. The following chart shows the capacities of the various recordable media:

CD-R and CD-RW - 0.65GB to 0.7GB
DVD-R - 3.95GB or 4.7GB
DVD-RW - 4.7GB
DVD+R - 4.7GB
DVD+RW - 4.7GB
DVD-RAM - 2.6GB to 9GB

Search Engines

A search engine is a program that searches documents for specified keywords and returns a list of the documents where the keywords were found. Although a search engine is really a general class of programs, the term is often used specifically to describe systems like Google, Alta Vista and Excite that enable users to search for documents on the World Wide Web and USENET newsgroups.

Typically, a search engine works by sending out a robot, spider or crawler program to fetch as many documents as possible. A robot is a piece of software that automatically follows hyperlinks from one document to the next around the Web. Another program, called an indexer, then reads these documents and creates an index based on the words contained in each document. Each search engine uses a proprietary algorithm to create its indices so that, ideally, only meaningful results are returned for each query (a small sketch of this crawl-and-index loop follows the list below). Broadly, there are two types of search engines:

1. Individual: Individual search engines compile their own searchable databases on the web.
2. Meta: Metasearchers do not compile databases. Instead, they search the databases of multiple sets of individual engines simultaneously.
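The crawl-and-index loop described above can be sketched in a few lines of Python. The sketch below fetches pages with the standard library, follows hyperlinks up to a fixed page limit, and builds an inverted index mapping each word to the URLs on which it appears; the seed URL, page limit and crude tokenisation are simplifying assumptions, and real engines use far more sophisticated parsing, ranking and politeness rules.

# Minimal sketch of a crawler ("robot") plus indexer: fetch pages, follow
# hyperlinks, and build an inverted index mapping each word to the URLs
# on which it appears. Real search engines are vastly more sophisticated.
import re
import urllib.request
from collections import defaultdict

def crawl_and_index(seed_url, max_pages=10):
    index = defaultdict(set)              # word -> set of URLs containing it
    to_visit, visited = [seed_url], set()
    while to_visit and len(visited) < max_pages:
        url = to_visit.pop(0)
        if url in visited:
            continue
        try:
            html = urllib.request.urlopen(url, timeout=5).read().decode("utf-8", "ignore")
        except Exception:
            continue                      # skip pages that cannot be fetched
        visited.add(url)
        text = re.sub(r"<[^>]+>", " ", html)          # strip HTML tags
        for word in re.findall(r"[a-z]{3,}", text.lower()):
            index[word].add(url)                      # indexer step
        for link in re.findall(r'href="(https?://[^"]+)"', html):
            to_visit.append(link)                     # robot step: follow hyperlinks
    return index

if __name__ == "__main__":
    idx = crawl_and_index("https://example.com")      # placeholder seed URL
    print(idx.get("example", set()))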

Searching Techniques
Keywords are the words or phrases that search engines use to search for relevant sites. In addition to keywords, the user can use various techniques or operators for more accurate results.

1. Use quotes (“ ”): When more than one keyword is entered for a search, the search engine treats them as different and unrelated words, so the results displayed may not be accurate. The quotes operator can be used to search for the exact phrase, with the words in the same order, and thereby filter the information sought.

2. Use (+) or AND: The (+) or AND operators can be used to search for more than one word appearing on a webpage, not necessarily in the same order. A space must, however, be left after the first word as a matter of syntax.

3. Use (-) or NOT: The (-) or NOT operators can be used to search for webpages where the first word appears without the second word. A space must, however, be left after the first word as a matter of syntax.

4. Use OR: The OR operator, if placed between the keywords, will display sites containing either of the words alone. A space must, however, be left after the first word as a matter of syntax.

5. Use Wildcards (*): The wildcard (*) operator, if preceded by at least four characters, will display sites containing words whose initial characters match the characters entered before the wildcard.

6. Use (~): The (~) operator can be used to search not only for a particular keyword, but also for its synonyms.

7. Use (Site: Site name): If one knows the website one wants to search but is not sure where the information is located within that site, one can use a search engine to search only that domain. This can be done by typing the keyword and following it by the word "site" and a colon followed by the domain name in which the search is to be performed.

The most popular search engine today is www.google.com. It has a database of over 8 billion webpages. Some of the facilities provided by the search engine are discussed below:

1. Cached Links: Google takes a snapshot of each page examined as it crawls the web and caches these as a backup in case the original page is unavailable.

2. Calculator: Google offers a built-in calculator function which can be used to solve math problems involving basic arithmetic, more complicated math, units of measure and conversions, and physical constants.

3. Definitions: Google also provides the facility to see the definition for a word or phrase. This can be done by simply typing the word "define," then a space, and then the word(s) one wants to be defined.

4. File Types: Google provides file type search in 12 file formats other than the HTML file format. Google now searches Microsoft Office, PostScript, Corel WordPerfect, Lotus 1-2-3, and other file formats. The new file types will simply appear in Google search results whenever they are relevant to the user query. Google also offers the user the facility to "View as HTML", allowing users to examine the contents of these file formats even if the corresponding application is not installed. The "View as HTML" option also allows users to avoid viruses, which are sometimes carried in certain file formats.

5. Froogle: Froogle is the product search service provided by Google to search the information regarding particular products. These product search results are linked to the sites of merchants who participate in Froogle.

6. Local Search: Google Local enables one to search the entire web for just those stores and businesses in a specific neighborhood. This can be done by including a city or zip code in the search and Google displays relevant results from that region at the top of the search results.

7. News Headlines: When searching on Google one may see links at the top of the results marked "News". These links connect one to reports culled from the numerous news services Google continuously monitors. The links appear if the terms one enters are words currently in the news, and clicking on them takes one directly to the news service provider's website.

8. Spell Checker: The Google spell checker automatically analyses the keyword(s) entered and suggests common spellings for them.

9. Webpage Translation: Google breaks the language barrier with this translation feature. Using machine translation technology, Google gives English speakers access to a variety of non-English web pages. This feature is currently available for pages published in Italian, French, Spanish, German, and Portuguese. If the search has non-English results, there will be a link to a version of that page translated into English.

10. Submit your site: Google allows users to submit their websites to Google's index. One may also add comments or keywords that describe the content of the particular page or site.

Microsoft has also recently entered the search engine market and has launched its own web search technology to challenge Google's long dominance of the field, with results tailored to a user's location and answers from its Encarta Encyclopedia. The Microsoft search engine, offered in 11 languages, is available on the "test" site (http://search.msn.com). Redmond-based Microsoft has long offered a search engine on its MSN website, but the technology behind it was powered by subsidiaries of Yahoo. The company has admitted that it erred by not developing its own search technology earlier, but it has now devoted $100 million to an aggressive catch-up effort. The company is also committed to clearly separating paid search results from those based purely on relevancy. Microsoft's site indexes more than five billion web pages at present.

With two giants fighting for market leadership, netizens are sure to emerge as the final beneficiaries.

Related terms in Database Management Systems

Aggregate operator: A function that produces a single value from multiple rows of a table. ANSI SQL supports the following operators: avg (average of values), count (number of occurrences), max (highest value), min (lowest value) and sum (sum of all values). Aggregate operators are applied to columns, and queries using them usually produce a single-row output.
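As an illustration, the sketch below uses Python's built-in sqlite3 module to show the aggregate operators collapsing several rows into a single-row result; the table and figures are invented for the example.

# Illustration of aggregate operators (AVG, COUNT, MAX, MIN, SUM) collapsing
# many rows into a single-row result. Table and data are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)",
                 [(120.0,), (75.5,), (210.0,), (60.0,)])
row = conn.execute(
    "SELECT AVG(amount), COUNT(*), MAX(amount), MIN(amount), SUM(amount) FROM orders"
).fetchone()
print(row)   # one row summarising all four order rows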

Attribute: An attribute is a part of the description of the entity. The entity itself is described by one or more attributes; together, they describe all things of importance about the entity. Example: Typical attributes for a customer would be name, address, telephone, etc.

Binary relationship: Relationship between two entities.

Business Rule: Specific business-related information that is associated with database objects. The information can be business restrictions (allowable values), facts, or calculation rules for given business situations, e.g. VAT shall be added to all products. Business rules shall be applied in the completed database, either as triggers/stored procedures, or implemented in the application code.

Candidate key: An attribute or set of attributes that uniquely identifies individual occurrences of an entity type.

Cardinality: The number of tuples that an entity or an attribute may generate. The cardinality of entities indicates if a relationship is a One-To-One, One-To-Many or Many-To-Many relationship. The cardinality of attributes indicates if it is optional or mandatory and if it is single or multi-valued.

Composite attribute: An attribute composed of multiple components, each with an independent existence.

Concurrency: With respect to the management of multiple users concurrently interacting with the system, the system should offer the same level of service as current database systems provide. It should therefore ensure harmonious coexistence among users working simultaneously on the database. The system should therefore support the standard notion of atomicity of a sequence of operations and of controlled sharing. Serializability of operations should at least be offered, although less strict alternatives may be offered.

Constraint: Rules applied to validate the data.

Data model: A representation of the data manipulated by a system, consisting of three parts: a structural part (a definition of how the database is to be constructed), a manipulative part (a definition of the types of operations that are allowed on the data) and a rules part (to ensure that the data is accurate). These three parts correspond to DDL, DML and integrity constraints respectively.

Database: A shared collection of logically related data, designed to meet the information needs of multiple users in an organization. The term database is often erroneously referred to as a synonym for a “database management system (DBMS)”. They are not equivalent. A database is a store of data that describe entities and the relationships between the entities. A database management system is the software mechanism for managing that data.

Database Management System: A DBMS is a collection of computer programs and software for organizing the information in a database. A DBMS supports the structuring of the database in a standard format and provides tools for data input, verification, storage, retrieval, query, and manipulation.

Degree of a relationship: The number of participating entities in a relationship.

Derived attribute: An attribute that gets a value that is calculated or derived from the database.

DDL or Data Definition Language: Set of SQL commands used to support structure definition of databases. It is used to create, alter and delete (drop) tables (logically and physically). These commands are responsible for specifying attributes, types, constraints etc.

DML or Data Manipulation Language: Set of SQL commands used to insert, update and extract data from databases. The queries for these operations can be used with additional clauses in order to order or group data.

Domain: A set of all possible values that an attribute can assume. An attribute "gender", for example, has only two possible values: male or female. We say that the domain for the "gender" attribute is: {male, female}.

Encapsulation: This is the scheme used for defining objects in the object-oriented approach. Encapsulation hides the detailed internal specification of an object and publishes only its external interfaces; users of an object need only adhere to these interfaces. Through encapsulation, the internal data and methods of an object can be changed without changing how the object is used.

Entity: "Something" in the real world that is of importance to a user and that needs to be represented in a database so that information about the entity can be recorded. An entity may have physical existence (such as a student or building) or it may have conceptual existence (such as a course).

Entity set: A collection of all entities of a particular entity type.

Entity type: A set of entities of the same type.

First Normal Form (1NF): Requires that the domain of every attribute in a table include only atomic (simple, indivisible) values, and that the value of any attribute in a tuple (or row) be a single value from the domain of that attribute.

Foreign Key: An attribute that is a primary key of another relation (table). A foreign key is how relationships are implemented in relational databases.

Full participation: Where all of one entity set participates in a relationship.

Functional dependency: A relationship between two attributes in a relation. Attribute Y is functionally dependent on attribute X if attribute X identifies attribute Y. For every unique value of X, the same value of Y will always be found.

Generalization: The process of minimizing the differences between entities by identifying their common features and removing the common features into a superclass entity.

Identifying owner: The strong entity upon which a weak entity is dependent.

Index: An index is a physical mechanism applied to one column (or a combination of columns). Its purpose is to let the database system use the index as a look-up mechanism instead of reading the whole row. Indexes are a prime resource for optimisation (and thereby increased speed) of searches in the database.

Join: A query that uses data from more than one table. These tables must have at least one common attribute (also known as a linking attribute).

Join Relationship: A join relationship is a collection of information from two or more tables. The join is performed by relating foreign key columns in one table with the equivalent primary key columns in the other table.
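A small sketch, again using Python's sqlite3 module with invented tables, shows a join relating the foreign key column of one table to the primary key column of another:

# Illustration of a join: the foreign key orders.customer_id is related to
# the primary key customers.id. Tables and data are invented for the example.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),   -- foreign key
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO orders VALUES (10, 1, 250.0), (11, 2, 90.0), (12, 1, 40.0);
""")
for name, amount in conn.execute(
    "SELECT c.name, o.amount FROM customers c JOIN orders o ON o.customer_id = c.id"
):
    print(name, amount)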

Key: An attribute or data item that uniquely identifies a record instance or tuple in a relation.

Mandatory relationship: Same as full participation; where all of one entity set participates in a relationship.

Many-to-many: Where many tuples (rows) of one relation can be related to many tuples (rows) in another relation.

Many-to-one: Where many tuples (rows) of one relation can be related to one tuple (row) in another relation.

Mapping: The process of choosing a logical model and then moving to a physical database file system from a conceptual model (the ER diagram).

Meta Data: 'Data about Data'. This is the documentation stored in the database repository, and which holds information about your database objects. In Oracle, for example, the table USER_TABLES holds vital information about your tables.

Multi-valued attribute: An attribute that may have multiple values for a single entity.

Normal forms: Rules applied to a table's structure. The goal of these rules is to reduce data redundancy and to improve the performance of the database. If the tables follow at least the first three normal forms, we say that the data model is normalized and consider it a relational model.

One-to-many: A relationship where one tuple (or row) of one relation can be related to more than one tuple (row) in another relation.

One-to-one: A relationship where one tuple (or row) of one relation can be related to only one tuple (row) in another relation.

Optional participation: A constraint that specifies whether the existence of an entity depends on its being related to another entity via a relationship type.

Open Database Connectivity (ODBC): A general interface for communication with different vendor-specific Relational Database Systems.

Partial key: The unique key in a dependent entity.

Partial participation: Where part of one entity set participates in a relationship.

Participation constraints (also known as optionality): Determines whether all or some of an entity occurrence is related to another entity.

Primary key: A column (or combination of columns) whose value(s) uniquely identify a row in a table. This is one of the most vital concepts in relational theory, and crucial to both identification and performance. A table should never be created before its primary key is decided.

Query: A "question" to be made to the database. It is a SQL command that returns a subset of data in the database.

Record: It is the same as one line of the table.

Recursive relationship: Relationships among entities in the same class.

Relation: A table containing single-value entries and no duplicate rows. The meaning of the columns is the same in every row, and the order of the rows and columns is immaterial. Often, a relation is defined as a populated table.

Relationship: Link between entities. It is usually represented by a set of tuples containing all the data related among "N" entities. A relationship may define constraints.

Relational integrity: Relational integrity refers to the integrity of the foreign key references in a database. All foreign keys should refer to valid primary keys in other tables.

Referential Integrity: Referential integrity deals with governing data consistency. We mostly think of it as keeping the relations between tables valid; that is, an order may not have a customer id that does not exist; a transaction can not be posted for an illegal (non-existent) account.

Second Normal Form: A relation that is in first normal form and in which each non-key attribute is fully, functionally dependent on the primary key.

Simple attribute: Attribute composed of a single value.

Specialization: The process of maximizing the differences between members of a superclass entity by identifying their distinguishing characteristics.

Strong entity: An entity that is not dependent on another entity for its existence.

Structural constraints: Indicate how many of one type of record is related to another and whether the record must have such a relationship. The cardinality ratio and participation constraints, taken together, form the structural constraints.

Stored Procedure: A stored procedure is SQL (and procedural code, in most cases), placed in the database itself. It masks the business logic from the programmer. In addition, stored procedures represent a powerful tool to let all programmers have a generic interface to different access mechanisms to each table in the database.

Subclass: An entity type that has a distinct role and is also a member of a superclass.

Superclass: An entity type that includes distinct subclasses required to be represented in a data model.

SQL, "sequel" or Structured Query Language: The standard language for dealing with relational database systems. Its statements are used to specify and modify database schemas (DDL) as well as to manipulate the contents of the database (DML).

Table: The standard way to represent data in a relational database system. Real-world objects with common properties (entities) are placed in the same table, each one represented by one record (line, row, tuple), with its properties represented by the columns of the table.

Third Normal Form: A relation that is in second normal form and in which no non-key attribute is functionally dependent on another non-key attribute (i.e., there are no transitive dependencies in the relation).

Trigger: A trigger is a stored procedure assigned to a given table. It ‘fires’ whenever you do an operation on that table (BEFORE/AFTER INSERT/UPDATE/DELETE etc.) Triggers are powerful, performance-enhancing mechanisms in the database.
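SQLite supports simple triggers, so the same sqlite3 module can sketch the idea: a trigger attached to a table fires automatically after each insert. The audit-log scenario below is invented for the example.

# Illustration of a trigger: after every insert into 'accounts', the trigger
# automatically writes a row to 'audit_log'. The scenario is invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL);
    CREATE TABLE audit_log (account_id INTEGER, note TEXT);
    CREATE TRIGGER log_new_account AFTER INSERT ON accounts
    BEGIN
        INSERT INTO audit_log VALUES (NEW.id, 'account created');
    END;
""")
conn.execute("INSERT INTO accounts VALUES (1, 500.0)")
print(conn.execute("SELECT * FROM audit_log").fetchall())   # [(1, 'account created')]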

Tuple: The formal way to represent the elements in a relation. A tuple may represent an element of an entity set or of a relationship; it is an ordered set of attribute values corresponding to one row of the relation.

Unique identifier: Any combination of attributes and/or relationships that serves to uniquely identify an occurrence of an entity.

View: An imaginary table. A view may be constructed to give the user/programmer access to a limited result set from one or more tables. It is often used for security reasons, restricting access through views; however, it may also be a sign of insufficient design.

Waterfall model: A series of steps that software undergoes, from concept exploration through final retirement.

Weak entity: An entity that is dependent on some other entity for its existence.

Related Terms in Cyber Crime

Cyber crime can be defined as criminal activity committed on the Internet. This is a broad term that includes electronic hacking, denial-of-service attacks, stealing a person's identity, selling contraband, stalking victims, disrupting operations with malevolent programs, and so on. Some of the common terms used in relation to cyber crime are discussed below:

Dumpster Diving: Dumpster diving, or trashing, is a name given to a very simple type of security attack i.e. scavenging through materials that have been thrown away. All kinds of sensitive information turns up in the trash, and industrial spies through the years have used this method to get information about their competitors.

Wiretapping: There are a number of ways that physical methods can breach networks and communications. Telephone and network wiring is often not protected as well as it should be, both from intruders who can physically damage it and from wiretaps that can pick up the data flowing across the wires. Criminals sometimes use wiretapping methods to eavesdrop on communications. It's unfortunately quite easy to tap many types of network cabling.
Eavesdropping on Emanations: Electronic emanations from computer equipment are a risk one needs to be aware of, although this is mainly a concern for military and intelligence data. Computer equipment, like every other type of electrical equipment from hairdryers to stereos, emits electromagnetic impulses. Whenever one strikes a computer key, an electronic impulse is sent into the immediate area. Foreign intelligence services and commercial enterprises may take advantage of these electronic emanations by monitoring, intercepting, and decoding them. This may sound highly sophisticated, but there have been some embarrassingly easy cases.
Denial or Degradation of Service: There are many ways to disrupt service, including such physical means as shutting off power, air conditioning, or water (needed by air conditioning systems); or performing various kinds of electromagnetic disturbances. Natural disasters, like lightning and earthquakes, can also disrupt service. Actually, there are two quite different types of attacks in this category. Some cases of electronic sabotage involve the actual destruction or disabling of equipment or data. Turning off power or sending messages to system software telling it to stop processing are examples of the first type of attack. The other type of attack is known as flooding. In this type of attack, instead of shutting down service, the attacker puts more and more of a strain on the systems' ability to service requests, so eventually they can't function at all. Denial of service doesn't have to be a complex technical attack. Sometimes, it even occurs by accident. Suppose a new user starts printing a PostScript file as text on the company's only printer, and doesn't know how to stop the job.
Masquerading: Masquerading occurs when one person uses the identity of another to gain access to a computer. This may be done in person or remotely.
There are both physical and electronic forms of masquerading. In person, a criminal may use an authorized user's identity or access card to get into restricted areas where he will have access to computers and data. This may be as simple as signing someone else's name to a sign-in sheet at the door of a building. It may be as complex as playing back a voice recording of someone else to gain entry via a voice recognition system.
Social Engineering: Social engineering is the name given to a category of attacks in which someone manipulates others into revealing information that can be used to steal data or subvert systems. Such attacks can be very simple or very complex. For example, an attacker posing as the Managing Director of the company simply asks for the password for logging on to the Managing Director's computer, pretending to have forgotten it. He then uses that password to steal important files from the system.
Harassment: Harassment is a particularly nasty kind of personnel breach that has been witnessed lately on the Internet. Sending threatening email messages and slandering people on bulletin board systems and newsgroups is a common example.
Software Piracy: Software piracy is an issue that spans the category boundaries and may be enforced in some organizations and not in others. Pirated computer programs are big business. Copying and selling off-the-shelf application programs in violation of the copyrights costs software vendors dearly. The problem is an international one, reaching epidemic proportions in some countries.
Data Attacks: There are many types of attacks on the confidentiality, integrity, and availability of data. Confidentiality keeps data secret from those not authorized to see it. Integrity keeps data safe from modification by those not authorized to change it. Availability on the other hand keeps data available for use. The theft, or unauthorized copying, of confidential data is an obvious attack that falls into this category. Espionage agents steal national defense information. Industrial spies steal their competitors' product information. Crackers steal passwords or other kinds of information by breaking into systems. Two terms commonly known in the context of data attacks are inference and leakage. With inference, a user legitimately views a number of small pieces of data, but by putting those small pieces together he is able to deduce some piece of non-obvious and secret data. With leakage, a user gains access to a flow of data via an unauthorized access route (e.g., through eavesdropping).
Traffic Analysis: Sometimes, the attacks on data might not be so obvious. Even data that appears quite ordinary may be valuable to a foreign or industrial spy. For example, travel itineraries for generals and other dignitaries help terrorists plan attacks against their victims. Accounts payable files tell outsiders what an organization has been purchasing and suggest what its future plans for expansion may be. Even the fact that two people are communicating may give away a secret. Traffic analysis is the name given to this type of analysis of communications.
Covert Channels: One somewhat obscure type of data leakage is called a covert channel. A clever insider can hide stolen data in otherwise innocent output. For example, a filename or the contents of a report could be changed slightly to include secret information that is obvious only to someone who is looking for it. A password, a launch code, or the location of sensitive information might be conveyed in this way. Even more obscure are the covert channels that convey information based on a system clock or other timed event. Information could, in theory, be conveyed by someone who controls system processing in such a way that the elapsed time of an event itself conveys secret information.
Logic Bombs: Logic bombs may also find their way into computer systems by way of Trojan horses. A typical logic bomb tells the computer to execute a set of instructions at a certain date and time or under certain specified conditions. The instructions may tell the computer to display "It is safe to shut down your computer now" on the screen, or it may tell the entire system to start erasing itself. Logic bombs often work in tandem with viruses. Whereas a simple virus infects a program and then replicates when the program starts to run, the logic bomb does not replicate, it merely waits for some pre-specified event or time to do its damage. Time is not the only criterion used to set off logic bombs. Some bombs do their damage after a particular program is run a certain number of times. Others are more creative. In several cases we've heard about, a programmer told the logic bomb to destroy data if the company payroll is run and his name is not on it; this is a sure-fire way to get back at the company if he is fired! The employee is fired, or may leave on his own, but does not remove the logic bomb. The next time the payroll is run and the computer searches for but doesn't find the employee's name, it crashes, destroying not only all of the employee payroll records, but the payroll application program as well.

Trap Doors: A trap door is a quick way into a program; it allows program developers to bypass all of the security built into the program now or in the future. If a programmer needs to modify the program sometime in the future, he can use the trap door instead of having to go through all of the normal, customer-directed protocols just to make the change. Trap doors of course should be closed or eliminated in the final version of the program after all testing is complete, but, intentionally or unintentionally, some are left in place. Other trap doors may be introduced by error and only later discovered by crackers who are roaming around, looking for a way into system programs and files.
Session Hijacking: Session hijacking is a relatively new type of attack in the communications category. Some types of hijacking have been around for a long time. In the simplest type, an authorized user gets up from his terminal to go get a cup of coffee. Someone lurking nearby, probably a coworker who isn't authorized to use this particular system, sits down to read or change files that he wouldn't ordinarily be able to access.
Tunneling: Tunneling uses one data transfer method to carry data for another method. Tunneling is an often legitimate way to transfer data over incompatible networks, but it is illegitimate when it is used to carry unauthorized data in legitimate data packets.
Timing Attacks: Timing attacks are another technically complex way to get unauthorized access to software or data. These include the abuse of race conditions and asynchronous attacks. In race conditions, there is a race between two processes operating on a system; the outcome depends on who wins the race. Although such conditions may sound theoretical, they can be abused in very real ways by attackers who know what they're doing. On certain types of UNIX systems, attackers could exploit a problem with files known as setuid shell files to gain superuser privileges. They did this by establishing links to a setuid shell file, then deleting the links quickly and pointing them at some other file of their own. If the operation is done quickly enough, the system can be made to run the attacker's file, not the real file. Asynchronous attacks are another way of taking advantage of dynamic system activity to get access. Computer systems are often called upon to do many things at the same time. In these cases, the operating system simply places user requests into a queue, then satisfies them according to a predetermined set of criteria; for example, certain users may always take precedence, or certain types of tasks may come before others. "Asynchronous" means that the computer doesn't simply satisfy requests in the order in which they were performed, but according to some other scheme. A skilled programmer can figure out how to penetrate the queue and modify the data that is waiting to be processed or printed. He might use his knowledge of the criteria to place his request in front of others waiting in the queue. He might change a queue entry to replace someone else's name or data with his own, or to subvert that user's data by replacing it. Or he could disrupt the entire system by changing commands so that data is lost, programs crash, or information from different programs is mixed as the data is analyzed or printed.
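The race-condition idea can be seen in a harmless setting. The following minimal Python sketch (an illustration written for this text, not an attack) lets two threads update a shared counter without any locking; because the read and the write are separate steps, one thread can overwrite the other's update and the final total usually comes out short.

import threading

counter = 0  # shared state with no lock protecting it

def worker():
    global counter
    for _ in range(100_000):
        tmp = counter        # read the shared value
        counter = tmp + 1    # write it back; may clobber another thread's update

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Expected 200000; the printed value is often smaller because of lost updates.
print("expected 200000, got", counter)

Wrapping the read-modify-write step in a threading.Lock removes the race; attackers exploit exactly the kind of timing window that such a lock would close.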
Trojan Horses: Trojan horses, viruses, worms, and their kin are all attacks on the integrity of the data that is stored in systems and communicated across networks. Because there should be procedures in place for preventing and detecting these menaces, they overlap with the operations security category as well. In the computer world, a Trojan horse is a method for inserting instructions in a program so that the program performs an unauthorized function while apparently performing a useful one. Trojan horses are a common technique for planting other problems in computers, including viruses, worms, logic bombs, and salami attacks. Trojan horses are a commonly used method for committing computer-based fraud and are very hard to detect.
Viruses and Worms: The easiest way to think of a computer virus is in terms of a biological virus. A biological virus is not strictly alive in its own right, at least in the sense that lay people usually view life. It needs a living host in order to operate. Viruses infect healthy living cells and cause them to replicate the virus. In this way, the virus spreads to other cells. Without the living cell, a virus cannot replicate. In a computer, a virus is a program which modifies other programs so that they replicate the virus. In other words, the infected program plays the role of the healthy living cell, and the virus affects the way the program operates. If a virus infects a program which is copied to a disk and transferred to another computer, it can also infect programs on that computer. This is how a computer virus spreads. The spread of a virus is simple and predictable, and it can be prevented. Viruses are mainly a problem with PCs and Macintoshes; virus infection is fortunately hard to accomplish on UNIX systems and mainframes. Unlike a virus, a worm is a standalone program in its own right. It exists independently of any other programs and does not need them in order to run. A worm simply replicates itself on one computer and tries to infect other computers that may be attached to the same network.
Salamis: The Trojan horse is also a technique for creating an automated form of computer abuse called the salami attack, which works on financial data. This technique causes small amounts of assets to be removed from a larger pool. The stolen assets are removed one slice at a time (hence the name salami). Usually, the amount stolen each time is so small that the victim of the salami fraud never even notices. One theoretical financial salami attack involves rounding off balances and crediting the rounded-off remainder to a specific account. Suppose that savings accounts in a bank earn 2.3%. Obviously, not all of the computations result in two-place decimals. In most cases, the new balance, after the interest is added, extends out to three, four, or five decimals. What happens to the remainders? Consider a bank account containing Rs. 22,500 at the beginning of the year. A year's worth of interest at 2.3% is Rs. 517.50, but after the first month the accumulated interest is Rs. 43.125. Is the customer credited with Rs. 43.12 or Rs. 43.13? Would most customers notice the difference? What if someone were funneling off this extra fraction of a paisa from thousands of accounts every month? A clever thief can use a Trojan horse to hide a salami program that puts all of the rounded-off values into his account. A fraction of a rupee may not sound like much until one adds up thousands of accounts, month after month.
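The interest arithmetic in the example above can be checked directly. The short Python sketch below (a worked illustration of the rounding, not code from any real banking system) computes one month's interest on Rs. 22,500 at 2.3% per annum, truncates it to two decimal places as a statement would show it, and prints the leftover fraction that a salami program would siphon off.

from decimal import Decimal, ROUND_DOWN

principal = Decimal("22500")
annual_rate = Decimal("0.023")

monthly_interest = principal * annual_rate / 12                     # Rs. 43.125
credited = monthly_interest.quantize(Decimal("0.01"), ROUND_DOWN)   # Rs. 43.12
remainder = monthly_interest - credited                             # Rs. 0.005

print(monthly_interest, credited, remainder)
# Across 10,000 such accounts the leftovers add up to Rs. 50 every month:
print(remainder * 10_000)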
Data Diddling: Data diddling, sometimes called false data entry, involves modifying data before or after it is entered into the computer. Consider situations in which employees are able to falsify time cards before the data contained on the cards is entered into the computer for payroll computation. A timekeeping clerk in a 300-person company noticed that, although the data entered into the company's timekeeping and payroll systems included both the name and the employee number of each worker, the payroll system used only the employee's number to process payroll checks. There were no external safeguards or checks to audit the integrity of the data. She took advantage of this vulnerability and filled out forms for overtime hours for employees who usually worked overtime. The cards had the hardworking employees' names, but the time clerk's number. Payment for the overtime was credited to her.
IP Spoofing: IP stands for Internet Protocol, one of the communications protocols that underlie the Internet. Certain UNIX programs grant access based on IP addresses; essentially, the system running the program is authenticated, rather than the individual user. The attacker forges the addresses on the data packets he sends so they look as if they came from inside a network on which systems trust each other. Because the attacker's system looks like an inside system, he is never asked for a password or any other type of authentication. In fact, the attacker is using this method to penetrate the system from the outside.
Password Sniffing: Password sniffers are able to monitor all traffic on areas of a network. Crackers install them on networks that they especially want to penetrate, like telephone systems and network providers. Password sniffers are programs that simply collect the first 128 or more bytes of each network connection on the network that's being monitored. When a user types in a user name and a password as required when using certain common Internet services like FTP (which is used to transfer files from one machine to another) or Telnet (which lets the user log in remotely to another machine) the sniffer collects that information. Additional programs sift through the collected information, pull out the important pieces (e.g., the user names and passwords), and cover up the existence of the sniffers in an automated way.
Scanning: Scanning, also called war dialing, is a technique often used by novice crackers and one that ought to be prevented by good operations security. With scanning, a program known as a war dialer or demon dialer works through a series of sequentially changing information, such as a list of telephone numbers, passwords, or telephone calling card numbers, and tries each one in turn to see which ones succeed in getting a positive response.

Some of the technical terms used in cyber crime are as under:
• Arson - Targeting a computer center for damage by fire.
• Extortion - Threatening to damage a computer to obtain money.
• Burglary - Break-ins to steal computer parts.
• Conspiracy - People agreeing to commit an illegal act on computer.
• Espionage/Sabotage - Stealing secrets or destroying competitors’ records.
• Forgery - Issuing false documents or information via computer.
• Larceny/Theft - Theft of computer parts.
• Malicious destruction of property - Destroying computer hardware or software.
• Murder - Tampering with computerized life-sustaining equipment.
• Receiving stolen property - Accepting known stolen goods or services via computer.
• Internet fraud - False advertising, credit card fraud, wire fraud, money laundering.
• Industrial espionage - Theft of proprietary information or trade secrets.
• National intelligence - Attempts by foreign governments to steal economic, political, or military secrets.
• Infowarfare - Cyber attacks by anyone on the nation's infrastructure to disrupt economic or military operations.

OSI Model and Networking Devices

Introduction
Until quite recently, there was a lack of sufficient standards for the interface between the hardware, software and communications channel of data communication networks. In response, computer manufacturers have developed network architectures to support the development of advanced data communications networks.

The goal of network architectures is to promote an open, simple, flexible and efficient telecommunications environment. This is accomplished by the use of standard protocols, standard communications hardware and software interfaces and the design of a standard multi-level interface between end users and computer systems.

The International Standards Organisation (ISO) has developed a seven layer Open Systems Interconnection (OSI) model to serve as a standard model for network architectures. Examples of network architectures include IBM’s System Network Architecture (SNA) and DECnet by the Digital Equipment Corporation.

An important suite of protocols that has become so widely used that it is equivalent to a network architecture is the Internet’s Transmission Control Protocol/Internet Protocol also known as TCP/IP. Another example is the local area network architecture for automated factories sponsored by General Motors and other manufacturers called the Manufacturing Automation Protocol (MAP).

OSI Model
The function and operation of each layer of the OSI model is discussed hereunder:
Layer 1: The Physical Layer
This layer is concerned with transmitting an electrical signal representation of data over a communication link. Typical conventions would be: voltage levels used to represent a “1” and a “0”, duration of each bit, transmission rate, mode of transmission, and functions of pins in a connector.

Layer 2: The Data Link Layer
This layer is concerned with error-free transmission of data units. The term data unit is short for the official name, data-link-service-data-unit; it is sometimes called the data frame. The function of the data link layer is to break the input data stream into data frames, transmit the frames sequentially, and process the acknowledgement frames sent back by the receiver. Data frames passed from this layer to layer 3 are assumed to be error free.
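As a rough illustration of the framing idea (a simplified sketch, not the behaviour of any particular data link protocol), the Python fragment below breaks a byte stream into sequence-numbered frames, attaches a toy checksum to each, and reassembles the stream in order while checking for corruption.

def make_frames(data: bytes, frame_size: int = 8):
    """Split a byte stream into sequence-numbered frames with a toy checksum."""
    frames = []
    for seq, start in enumerate(range(0, len(data), frame_size)):
        payload = data[start:start + frame_size]
        frames.append({"seq": seq,
                       "payload": payload,
                       "checksum": sum(payload) % 256})   # toy check, not a real CRC
    return frames

def reassemble(frames):
    """Rebuild the stream in sequence order, verifying each frame."""
    out = b""
    for frame in sorted(frames, key=lambda f: f["seq"]):
        assert sum(frame["payload"]) % 256 == frame["checksum"], "corrupted frame"
        out += frame["payload"]
    return out

message = b"error-free transmission of data units"
assert reassemble(make_frames(message)) == message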

Layer 3: The Network Layer
This layer is the network control layer, and is sometimes called the communication subnet layer. It is concerned with intra-network operation such as addressing and routing within the subnet. Basically, messages from the source host are converted to packets. The packets are then routed to their proper destinations.

Layer 4: The Transport Layer
This layer is a transport end-to-end control layer (i.e. source-to-destination). A program on the source computer communicates with a similar program on the destination computer using the message headers and control messages, whereas all the lower layers are only concerned with communication between a computer and its immediate neighbours, not the ultimate source and destination computers. The transport layer is often implemented as part of the operating system. The data link and physical layers are normally implemented in hardware.

Layer 5: The Session Layer
The session layer is the user's interface into the network. This layer supports the dialogue between users through session control, provided the necessary services can be allocated. A connection between users is usually called a session. A session might be used to allow a user to log into a system or to transfer files between two computers. A session can only be established if the user provides the remote address to be connected to. The difference between session addresses and transport addresses is that session addresses are intended for users and their programs, whereas transport addresses are intended for transport stations.

Layer 6: The Presentation Layer
This layer is concerned with transformation of transferred information. The controls include message compression, encryption, peripheral device coding and formatting.

Layer 7: The Application Layer
This layer is concerned with the application and system activities. The content of the application layer is up to the individual user.

Networking devices
Networking devices are used to connect the segments of a network together or to connect networks to create an internetwork. These devices are classified into five categories namely switches, repeaters, bridges, routers and gateways. Each of these devices except the first one (switches) interacts with protocols at different layers of the OSI model.

Switches
A switched network consists of a series of interlinked switches. Switches are hardware/software devices capable of creating temporary connections between two or more devices that are linked to the switch but not to each other. Switching mechanisms are generally classified into three methods: circuit switching, packet switching and message switching.

(a) Circuit switching creates a direct physical connection between two devices, such as telephones or computers. Once the connection is made, circuit switching provides a dedicated path between the two end users, who can use the path for as long as they want.

(b) Packet switching offers a practical alternative for data transmission.
In a packet-switched network, data are transmitted in discrete units of variable-length blocks called packets. Each packet contains not only data, but also a header with control information. The packets are sent over the network node to node. At each node, the packet is stored briefly before being routed according to the information in its header.

In the datagram approach to packet switching, each packet is treated independently of all others, as though it exists alone. In the virtual circuit approach, a single route is chosen between sender and receiver at the beginning of the session, and all packets travel one after another along that route. Although the virtual circuit approach may seem the same as circuit switching, there is a fundamental difference between them. In circuit switching, the path between the two end users consists of only one dedicated channel. In the virtual circuit approach, the line is not dedicated to two users; it is divided into channels, and each virtual circuit uses one of the channels on a link. (A minimal sketch of the datagram approach follows the description of message switching below.)

(c) Message switching is known as the store-and-forward method. In this approach, a computer (or a node) receives a message, stores it until the appropriate route is free, and then sends it out. This method has now been phased out.
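To make the datagram idea concrete, the following minimal Python sketch (an illustration, not a protocol implementation) cuts a message into variable-length packets, each carrying its own header, delivers them out of order, and reassembles them at the receiver by sequence number.

import random

def packetize(message: str, src: str, dst: str, max_len: int = 10):
    """Cut a message into variable-length packets, each with its own header."""
    packets, seq = [], 0
    while message:
        cut = random.randint(1, max_len)
        packets.append({"src": src, "dst": dst, "seq": seq, "data": message[:cut]})
        message, seq = message[cut:], seq + 1
    return packets

packets = packetize("Packets are routed from node to node.", "HostA", "HostB")
random.shuffle(packets)   # datagrams may take different routes and arrive out of order
received = "".join(p["data"] for p in sorted(packets, key=lambda p: p["seq"]))
print(received)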

Repeaters
A repeater is an electronic device that operates on the physical layer only of the OSI model. A repeater boosts the transmission signal from one segment and continues the signal to another segment. Thus, a repeater allows us to extend the physical length of a network. Signals that carry information can travel a limited distance within a network before degradation of the data integrity due to noise. A repeater receives the signal before it becomes too weak or corrupted, regenerates the original bit pattern and puts the restored copy back on to the link.

Bridges
Bridges operate in both the physical and the data link layers of the OSI model. A bridge connects network segments together and promotes interconnectivity between them. Bridges divide a large network into smaller segments. Unlike repeaters, bridges contain logic that allows them to keep separate the traffic for each segment. Bridges are intelligent enough to relay a frame only towards the segment of the intended recipient, so that traffic can be filtered. In fact, the filtering operation makes bridges useful for controlling congestion, isolating problem links and promoting security through the partitioning of traffic.

A bridge can access the physical addresses of all stations connected to it. When a frame enters a bridge, the bridge not only regenerates the signal but also checks the address of the destination and forwards the new copy only to the segment to which that address belongs. When a bridge encounters a frame, it reads the address contained in the frame and compares it with a table of all the stations on both segments. When it finds a match, it discovers to which segment the station belongs and relays the frame to that segment only.

Bridges can be programmed to reject packets from particular networks. Bridges forward all broadcast messages. Bridges do not normally allow connection of networks with different architectures. Only a special bridge called a translation bridge will allow two networks of different architectures to be connected.
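The filtering behaviour described above can be sketched in a few lines of Python. This is an illustration under assumed addresses and segment names, not vendor firmware: the bridge consults a table mapping station addresses to segments and relays a frame only when the destination sits on a different segment.

# Station address -> segment (learned or configured); the addresses are made up.
forwarding_table = {
    "AA:AA:AA:AA:AA:01": "segment-1",
    "AA:AA:AA:AA:AA:02": "segment-1",
    "BB:BB:BB:BB:BB:01": "segment-2",
}

def relay(frame: dict, arrived_on: str) -> str:
    dst_segment = forwarding_table.get(frame["dst"])
    if dst_segment is None:
        return "flood to all other segments"                      # unknown destination
    if dst_segment == arrived_on:
        return "filter (drop): destination is on the same segment"
    return f"forward to {dst_segment}"

print(relay({"dst": "AA:AA:AA:AA:AA:02"}, arrived_on="segment-1"))   # filtered
print(relay({"dst": "BB:BB:BB:BB:BB:01"}, arrived_on="segment-1"))   # forwarded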

Routers
Routers operate in the physical, data link and network layers of the OSI model. The
Internet is a combination of networks connected by routers. When a datagram (a TCP/IP packet containing data and a source and destination address) goes from a source to a destination, it passes through many routers until it reaches the router attached to the destination network. Routers determine the path a packet should take. Routers relay packets among multiple interconnected networks. In particular, an IP router forwards IP datagrams among the networks to which it connects.
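A similarly simplified Python sketch of the forwarding decision (the networks and interface names are assumptions made up for the example) shows how a router matches a datagram's destination address against the networks it knows and picks the most specific match.

import ipaddress

routing_table = {
    ipaddress.ip_network("192.168.1.0/24"): "interface eth0",
    ipaddress.ip_network("10.0.0.0/8"): "interface eth1",
}
DEFAULT_ROUTE = "send to upstream gateway"

def next_hop(destination: str) -> str:
    addr = ipaddress.ip_address(destination)
    matches = [net for net in routing_table if addr in net]
    if not matches:
        return DEFAULT_ROUTE
    best = max(matches, key=lambda net: net.prefixlen)   # longest-prefix match
    return routing_table[best]

print(next_hop("10.1.2.3"))      # interface eth1
print(next_hop("172.16.0.1"))    # default route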

Gateways
Gateways operate across all seven layers of the OSI model. Internet routing devices have traditionally been called gateways. A gateway is a protocol converter which connects two or more heterogeneous systems and translates among them. The term gateway thus refers to a device that performs protocol translation between devices. A gateway can accept a packet formatted for one protocol and convert it to a packet formatted for another protocol before forwarding it. The gateway understands the protocols used by each of the networks linked to it and is therefore able to translate from one to another.

Comparison of Linux and Windows

Operating System (OS)
An operating system is software which interacts with the hardware of the computer in order to manage and direct computer resources. The basic objective of an operating system is to maximise the productivity of a computer system by operating it in the most efficient manner. An operating system can manage a computer system in various modes, i.e. batch processing mode, time-sharing mode or real-time mode. Linux and Windows are two common operating systems.
The main differences between the two are discussed as under:
1. Versions: Both Windows and Linux come in many versions. All the versions of Windows come from Microsoft; the various distributions of Linux come from different companies and communities, e.g. Lindows, Lycoris, Red Hat, SuSE, Mandrake, Knoppix and Slackware. Windows has two main lines: "Win9x", which consists of Windows 95, 98, 98 Second Edition and Me, and the "NT class", which consists of Windows NT, 2000 and XP. The versions of Linux are referred to as distributions (often shortened to "distros"). Linux distributions released around the same timeframe generally use the same kernel version. Both Linux and Windows come in desktop and server editions.
2. Graphical User Interface: Both Linux and Windows provide a GUI and a command line interface. The Windows GUI has changed from Windows 3.1 to Windows 95 (drastically) to Windows 2000 (slightly) to Windows XP (fairly large) and is slated to change again with the next version of Windows (Code name: Longhorn). Linux typically provides two GUIs, KDE and Gnome. Of the major Linux distributions, Lindows has made their user interface look more like Windows than the others.
3. Text Mode Interface: This is also known as a command interpreter. Windows users sometimes call it a DOS prompt. Linux users refer to it as a shell. Each version of Windows has a single command interpreter, but the different flavors of Windows have different interpreters. In general, the command interpreters in the Windows 9x series are very similar to each other and the NT class versions of Windows (NT, 2000, XP) also have similar command interpreters. Linux, like all versions of Unix, supports multiple command interpreters, but it usually uses one called BASH (Bourne Again Shell). Others are the Korn shell, the Bourne shell, ash and the C shell (pun, no doubt, intended).
4. Cost: For desktop or home use, Linux is very cheap or free, Windows is expensive. For server use, Linux is very cheap compared to Windows. Microsoft allows a single copy of Windows to be used on only one computer. Starting with Windows XP, they use software to enforce this rule. In contrast, once one has purchased Linux, one can run it on any number of computers for no additional charge.

5. Bugs: All software has and will have bugs (programming mistakes). Linux has a reputation for fewer bugs than Windows, but it certainly has its fair share. This is a difficult thing to judge and finding an impartial source on this subject is also difficult. The difference in OS development methodologies may explain why Linux is considered more stable. Windows is developed by faceless programmers whose mistakes are hidden from the outside world because Microsoft does not publish the underlying code for Windows. They consider it a trade secret. In contrast, Linux is developed by hundreds of programmers all over the world. They publish the source code for the operating system and any interested programmer, anywhere in the world can review it.

6. Software restrictions: A program written for Linux will not run under Windows and vice versa. This is the rule, but there are a fair number of exceptions. The most ambitious exceptions allow for installing one operating system under another. For example, on a computer running Linux (referred to as the host or native OS), one can install a copy of Windows (referred to, in this case, as the guest OS). In the Windows OS running under Linux, one can install any and all Windows programs.

7. Hardware devices supported by the OS: More hardware works with Windows than works with Linux. This is because hardware vendors write drivers for Windows more often than they do for Linux. When Windows XP came out however, many existing peripherals would not work with it because XP required new drivers and the vendors had little motivation to write drivers for old hardware.

8. Hardware the OS runs on: Linux runs on many different hardware platforms, not so with Windows. For example, Windows NT used to run on MIPS CPUs until Microsoft changed their mind. It also used to run on Alpha CPUs, again, until Microsoft changed their mind. No one gets to change their mind with Linux. It runs on a very wide range of computers, from the lowest of the low to the highest of the high. The supported range of computers is all but stunning. Because of its ability to run without a GUI, Linux can run on very old personal computers, 486 based machines for example. On the high end, Linux runs natively on IBM mainframes (the Z series) and on other high end IBM servers. On the small side, Debian Linux can run on a computer the size of a deck of playing cards (100mm by 55mm) with an ARM cpu. IBM's upcoming family of "Blue Gene" supercomputers, which will be used by Lawrence Livermore National Laboratory for nuclear weapons simulations, will run on Linux. Sony and Matsushita (parent company of Panasonic) will use Linux to build increasingly 'smart' microwave ovens, TVs and other consumer gizmos. Likewise MontaVista Software will release a version of its embedded Linux for use in consumer electronics devices. NEC is working on Linux-based cell phones and Motorola is going to make Linux its primary operating system for smart phones.

9. Multiple Users: Linux is a multi-user system, Windows is not. That is, Windows is designed to be used by one person at a time. Databases running under Windows allow concurrent access by multiple users, but the Operating System itself is designed to deal with a single human being at a time. Linux, like all Unix variants, is designed to handle multiple concurrent users. Windows, of course, can run many programs concurrently, as can Linux. There is a multi-user version of Windows called Terminal Server but this is not the Windows pre-installed on personal computers.

10. Networking: Both use TCP/IP. Linux can do Windows networking, which means that a Linux computer can appear on a network of Windows computers and share its files and printers.
11. Hard disk partitions: Windows must boot from a primary partition. Linux can boot from either a primary partition or a logical partition inside an extended partition. Windows must boot from the first hard disk. Linux can boot from any hard disk in the computer.

12. Swap files: Windows uses a hidden file for its swap file. Typically this file resides in the same partition as the OS (advanced users can opt to put the file in another partition). Linux uses a dedicated partition for its swap file (advanced users can opt to implement the swap file as a file in the same partition as the OS).

13. File Systems: Windows uses FAT12, FAT16, FAT32 and NTFS. Linux also has a number of its own native file systems. The default file system for Linux used to be ext2; now it is typically ext3. File systems can be either journaled or not. Non-journaled systems are subject to problems when stopped abruptly. All the FAT variants and ext2 are non-journaled. After a crash, they should be examined by their respective health check utilities (ScanDisk, Check Disk or fsck). In contrast, when a journaled file system is stopped abruptly, recovery is automatic at the next reboot. NTFS is journaled. Linux supports several journaled file systems: "ext3", "reiserfs" and "jfs". All the file systems use directories and subdirectories. Windows separates directories with a back slash, Linux uses a normal forward slash. Windows file names are not case sensitive; Linux file names are. For example, "abc" and "aBc" are different files in Linux, whereas in Windows they would refer to the same file (as the short sketch below illustrates). As for crossing over, Linux can read/write FAT16 and FAT32. Some Linux distributions can read NTFS partitions, others cannot. No version of Linux can write to NTFS. On its own, Windows cannot read partitions formatted with any Linux file system.
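A small Python sketch of the case-sensitivity difference (run in a temporary directory; the behaviour described assumes a typical case-sensitive Linux file system and a case-insensitive Windows one):

import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as d:
    Path(d, "abc").write_text("lower-case name")
    Path(d, "aBc").write_text("mixed-case name")
    # On Linux this prints two entries, ['aBc', 'abc'];
    # on Windows the second write reuses the first file, so only one entry appears.
    print(sorted(os.listdir(d)))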

14. File Hierarchy: Windows and Linux use different concepts for their file hierarchy. Windows uses a volume-based file hierarchy, Linux uses a unified scheme. Windows uses letters of the alphabet to represent different devices and different hard disk partitions. Under Windows, one needs to know what volume (C:, D:, ...) a file resides on to select it; the file's physical location is part of its name. In Linux all directories are attached to the root directory, which is identified by a forward slash, "/". For example, below are some second-level directories:
/bin/ --- system binaries, user programs with normal user permissions
/sbin/ --- executables that need root permission
/data/ --- a user defined directory
/dev/ --- system device tree
/etc/ --- system configuration
/home/ --- users' subdirectories
/home/{username} --- akin to the Windows My Documents folder
/tmp/ --- system temporary files
/usr/ --- applications software
/usr/bin/ --- executables for programs with user permission
/var/ --- system variables
/lib/ --- libraries needed for installed programs to run
Every device and hard disk partition is represented in the Linux file system as a subdirectory of the lone root directory. For example, the floppy disk drive in Linux might be /mnt/floppy. The root directory lives in the root partition, but other directories (and the devices they represent) can reside anywhere. Removable devices and hard disk partitions other than the root are attached (i.e., "mounted") to subdirectories in the directory tree. This is done either at system initialization or in response to a mount command. There is no firm standard in Linux for which subdirectories are used for which devices. This contrasts with Windows, where the A disk is always the floppy drive and the C disk is almost always the boot partition.

15. Hidden Files: Both operating systems support the concept of hidden files, which are files that, by default, are not shown to the user when listing files in a directory. Linux implements this with a filename that starts with a period. Windows tracks this as a file attribute in the file metadata (along with things like the last update date). In both OSs the user can override the default behavior and force the system to list hidden files, as the small sketch below shows.
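A minimal Python sketch of the Linux dotfile convention (the directory scanned is just the current one; nothing here is specific to any distribution):

import os

def split_hidden(path: str = "."):
    """Separate dotfiles (hidden by convention on Linux) from visible entries."""
    entries = os.listdir(path)
    hidden = [name for name in entries if name.startswith(".")]
    visible = [name for name in entries if not name.startswith(".")]
    return hidden, visible

hidden, visible = split_hidden()
print("hidden: ", hidden)    # e.g. .bashrc, .profile in a home directory
print("visible:", visible)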

16. User Data: Windows allows programs to store user information (files and settings) anywhere. This makes it very hard to back up user data files and settings and to switch to a new computer. In contrast, Linux stores all user data in the home directory, making it much easier to migrate from an old computer to a new one. If home directories are segregated in their own partition, one can even upgrade from one version of Linux to another without having to migrate user data and settings.

17. Shutting Down: Both operating systems have to be told to shut down. One shuts down Windows through the Start button and then selecting Shut Down. In both the KDE and Gnome GUIs for Linux, one can shut the system down after first logging out (equivalent to logging off in Windows). Further, Linux can also be shut down from a command prompt using the shutdown command. The shutdown command can either shut the system down immediately or be told to shut down at some time in the future, an option not available in Windows.

Internet 2: The Next Generation Internet

Introduction
Internet2 is a not-for-profit consortium led by 207 universities of the United States of America, working in partnership with over 60 leading companies and government agencies to develop and deploy advanced network applications and technologies, accelerating the creation of tomorrow's Internet.

Internet2’s mission is to develop and deploy advanced network applications and technologies for research and higher education, accelerating the creation of tomorrow's Internet. The primary goal of Internet2 is to ensure the transfer of new network technologies and applications to the broader education and networking communities. Some of the major goals of Internet2 are to:
• Create a leading edge network capability for the national research community
• Enable revolutionary Internet applications
• Ensure the rapid transfer of new network services and applications to the broader Internet community.
Internet2 is not a separate physical network and will not replace the Internet. Internet2 brings together institutions and resources from academia, industry and government to develop new technologies and capabilities that can then be deployed in the global Internet. Close collaboration with Internet2 corporate members will ensure that new applications and technologies are rapidly deployed throughout the Internet. Just as email and the World Wide Web are legacies of earlier investments in academic and federal research networks, the legacy of Internet2 will be to expand the possibilities of the broader Internet.

Internet2 and its members are developing and testing new technologies, such as IPv6, multicasting and quality of service (QoS), that will enable revolutionary Internet applications. These applications require performance not possible on today's Internet. More than a faster Web or email, the new technologies will enable completely new applications such as digital libraries, virtual laboratories, distance-independent learning and tele-immersion.

The History of the Internet
With the invention of nuclear weapons and the beginning of the Cold War, the American authorities faced a strategic problem: How could they ensure the reliability and stability of communications after a nuclear attack?
A research organization known as RAND was entrusted with finding an appropriate solution to this problem. RAND's research determined the necessity of building a decentralized communication network. The principles were simple: each node in the network would have the capability to originate, pass and receive messages. At a specified source node, the message would be divided into packets; each packet would be separately addressed and would make its way through the network. At the specified destination node, the packets would be reassembled to rebuild the original message. This message-delivery technique, called packet switching, proved vital to the construction of a computer network.

In 1969, ARPA (Advanced Research Project Agency), which is part of the US Department of Defense, started to build a leased network, which they called ARPANET (Advanced Research Project Agency Network). This network linked four nodes (four American Universities) that were using supercomputers. The impact of ARPANET was tremendous. Using NCP (Network Control Protocol), it allowed the transfer of information between nodes running on the same network, so scientists and researchers were able to access remote computer systems, and share computer resources at a rate of 50 kbps.
In 1972, Ray Tomlinson of BBN (Bolt Beranek and Newman) created the first e-mail program, which enabled electronic messages to be sent over decentralized networks. E-mail rapidly gained popularity, and became the most common means of communication over networks.
In the early 1970s, a group from ARPA began to develop a new protocol, called TCP/IP (Transmission Control Protocol/Internet Protocol), which allowed different computer networks to interconnect and communicate with each other; the term Internet was used for the first time in the paper describing TCP. The first commercial packet network based on ARPANET technology, called Telenet, was also launched during this era. Soon after, ARPANET expanded to connect universities and research centers in Europe, and eventually became a global network. By the late 1970s, people were able to participate in discussions over networks through newsgroup services, such as USENET.
When other networks, such as CSNET (Computer Science Network), BITNET (Because It's Time Network) and NSFnet (National Science Foundation network), started to offer e-mail and FTP (File Transfer Protocol) services in the 1980s, inter-network connections became prevalent, and every network had to use the TCP/IP suite of protocols, which replaced NCP completely. The term Internet, which refers to the group of computers communicating via TCP/IP, started to become popular.
As time passed, more and more nodes were built. Data transmission speed became faster, especially when dedicated lines, such as T1 carriers, were introduced. All of these developments contributed to the expansion of the Internet and triggered the formation of different organizations, such as the IAB (Internet Activities Board) and the IETF (Internet Engineering Task Force), which sought to develop the Internet further.
Internet2 and Next Generation Internet (NGI)
Most research work in the US today is funded either by NASA or by the US Department of Defense. In October 1996, in order to maintain their supremacy in the field of science and technology, US authorities started two parallel projects along the lines of the ARPANET, namely Internet2 and the Next Generation Internet (NGI). The university-led Internet2 and the federally led NGI are parallel and complementary initiatives.
Internet2 universities are required to provide high-performance networks on their campuses and to commit $60 million per year in investments, with corporate members investing another $30 million over the lifetime of the project. In addition, Internet2 member institutions may receive funding in the form of competitively awarded grants from the federal agencies participating in the federal Next Generation Internet initiative.

Internet2 is systematically swallowing up the National Science Foundation's very high-speed Backbone Network Service (vBNS). More than 50 Internet2 institutions have received competitively awarded vBNS grants under the NSF's High Performance Connections program.

In fact, vBNS could be considered the heart of Internet2, or at least its substantive launch pad. Begun in 1995, with an investment of $50 million under a five-year cooperative project with MCI, the service links six NSF supercomputer centers and was initially implemented to design and support "gigabit testbeds" for R&D of advanced networking technologies.

What does it offer?
Requiring state-of-the-art infrastructure, Internet2 universities are connected to the Abilene network backbone, which uses regional network aggregation points called gigaPoPs. Abilene supports transfer rates between 2.4 gigabits per second and 9.6 gigabits per second. The rate of data transfer on Internet2 can be appreciated from the fact that if it takes a 56 Kbps modem 171 hours to download a particular file, it would take 74 hours to download the same file on an ISDN connection, 25 hours on a DSL/cable line, 6.4 hours on a T1 network and just 30 seconds on Internet2.
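The comparison can be checked with a back-of-the-envelope calculation. In the Python sketch below the file size is an assumption chosen so that the 56 Kbps figure comes out at roughly 171 hours (about 4.3 GB), and the DSL/cable and Internet2 rates are likewise inferred from the quoted times rather than taken from the text.

FILE_SIZE_BITS = 56_000 * 171 * 3600     # about 34.5 billion bits, roughly 4.3 GB

rates_bps = {
    "56 Kbps modem": 56_000,
    "128 Kbps ISDN": 128_000,
    "384 Kbps DSL/cable (assumed)": 384_000,
    "1.544 Mbps T1": 1_544_000,
    "1.15 Gbps Internet2 path (assumed)": 1_150_000_000,
}

for name, rate in rates_bps.items():
    seconds = FILE_SIZE_BITS / rate
    print(f"{name:36s} {seconds / 3600:8.2f} hours ({seconds:,.0f} s)")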

Some of the major applications of Internet2 would comprise:
1. Tele-immersion, which enables users at geographically distributed sites to collaborate in real time in a shared, simulated, hybrid environment as if they were in the same physical room.
2. New services and capabilities envisioned for Internet2 offer important opportunities to move the Digital Libraries program into new areas. Very high-bandwidth and bandwidth reservation will allow currently exotic materials such as continuous digital video and audio to move from research use to much broader use. Images, audio and video can, at least from a delivery point of view, move into the mainstream currently occupied almost exclusively by textual materials. This will also facilitate more extensive research in the difficult problems of organizing, indexing and providing intellectual access to these classes of materials.
3. A virtual laboratory, which enables a group of researchers located around the world to work together on a common set of projects. As with any other laboratory, the tools and techniques are specific to the domain of the research, but the basic infrastructure requirements are shared across disciplines. Although related to some of the applications of tele-immersion, the virtual laboratory does not assume a priori the need for a shared immersive environment.
4. Virtual museums can be created as a result of the high bandwidth available. With curators digitizing their collections, the wealth of assembled artifacts can be made available to anyone with a high-speed connection.
5. Real-time access/visualization of simulation results. Faster access to remote computers that perform data analysis and simulations.
6. Access and control of remote precious devices, such as MRIs, telescopes etc.
7. Advanced network quality video conferencing and distant learning.
8. Setting up a War Information Network, synchronising forces on land, air and water and providing them real time satellite images/videos of the ground position.

Conclusion
Indian universities and corporates can definitely take a leaf out of their US counterparts' book, especially when India has a vision of becoming an IT superpower by the year 2020. The Govt. of India should also play a proactive role in this regard.