Learning HTML

Last updated on August 24, 2000.


HTML

A Quick Example

What's new? HTML 4.01

Resources For Learning HTML

More Sample Pages


HTML stands for HyperText Markup Language. HTML is an application of the Standard Generalized Markup Language (SGML). There are three characteristics of HTML which are important to understand.

  1. It is a markup language that focuses on tagging the content of a page so that it can be displayed appropriately on different platforms by different software. Since the intention is to be platform independent, it does not attempt to provide font selections or highly detailed layout specifications.
  2. It is intended to be readable by both humans and machines. Thus it is a formal language with precise rules that humans can easily get wrong. Most browsers will ignore tags they don't understand and will try to do something sensible with documents which are not syntactically correct. Unfortunately, this means that your document may look fine on your machine with your browser, yet be unintelligible on a different machine or in a different browser. There are editors, HoTMetaL PRO and PageMill for example, which automatically generate tags and ensure conformity to the HTML standard. There are also programs which verify HTML documents for conformance to the standard.
  3. Hypertext is the primary feature of HTML which distinguishes it from other SGML applications. An anchor tag can be used to indicate that a word or phrase should be displayed differently (typically underlined and colored blue) and then treated as a link to another document. Two key specifications were necessary to make this work: the Uniform Resource Locator (URL) and the Hypertext Transfer Protocol (HTTP).
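
As a rough illustration of the third point, an anchor might be marked up as shown here. The URL is a placeholder chosen for this sketch and does not come from the original page.

```html
<!-- The href attribute holds the URL of the target document.
     When the reader selects the link, the browser requests
     that URL, here over HTTP. -->
<p>
  See the <a href="http://www.example.org/intro.html">Intro to HTML</a>
  page for a brief overview.
</p>
```

Only the text between the opening and closing anchor tags is displayed as the link. The URL itself stays hidden in the attribute.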

A Quick Example

The following page, Intro to HTML, provides a very brief overview of the main features of HTML which are needed to produce documents that effectively convey information on the web. To see the HTML, select the View Page Source command in your browser. This document was one of the slides presented at the 1996 ICTCM Workshop, Internet Technologies and The Classroom.
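
For readers who cannot open that page, here is a minimal sketch of a complete HTML 4.01 document. It is not the Intro to HTML slide itself. The title and text are invented for illustration.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <!-- The title appears in the browser's title bar, not in the page body -->
  <title>A minimal page</title>
</head>
<body>
  <h1>A minimal page</h1>
  <p>Headings, paragraphs, and lists tag the <em>content</em> of a page.</p>
  <ul>
    <li>First list item</li>
    <li>Second list item</li>
  </ul>
</body>
</html>
```

The markup says what each piece of content is (a heading, a paragraph, a list) and leaves the choice of fonts and spacing to the browser.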

What's new? XHTML 1.0

On January 26, 2000, the World Wide Web Consortium (W3C) released the XHTML 1.0 specification as a W3C Recommendation. This new specification represents cross-industry and expert community agreement on the importance of XHTML 1.0 as a bridge to the Web of the future. A W3C Recommendation indicates that a specification is stable, contributes to Web interoperability, and has been reviewed by the W3C membership, who favor its adoption by the industry.

XHTML 1.0 Builds the Web of the Future, Now

HTML currently serves as the lingua franca for millions of people publishing hypertext on the Web. While that is the case today, the future of the Web is written in W3C's Extensible Markup Language (XML). XML is bringing the Web forward as an environment that better meets the needs of all its participants, allowing content creators to make structured data that can be easily processed and transformed to meet the varied needs of users and their devices.

In designing XHTML 1.0, the W3C HTML Working Group faced a number of challenges, including one capable of making or breaking the Web: how to design the next generation language for Web documents without obsoleting what's already on the Web, and how to create a markup language that supports device-independence. The answer was to take HTML 4, and rewrite it as an XML application. The first result is XHTML 1.0.

"XHTML 1.0 connects the present Web to the future Web," said Tim Berners-Lee, W3C Director. "It provides the bridge to page and site authors for entering the structured data, XML world, while still being able to maintain operability with user agents that support HTML 4."

XHTML 1.0 Combines the Familiarity of HTML with the Power of XML

XHTML 1.0 allows authors to create Web documents that work with current HTML browsers and that may be processed by XML-enabled software as well. Authors writing XHTML use the well-known elements of HTML 4 (to mark up paragraphs, links, tables, lists, etc.), but with XML syntax, which promotes markup conformance.
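
The difference in syntax can be sketched with a small fragment written both ways. The image file name is a placeholder.

```html
<!-- HTML 4: uppercase names, omitted end tags, and unquoted
     attribute values are all tolerated. -->
<P>First paragraph
<P>Second paragraph<BR>
<IMG SRC=logo.gif ALT=Logo>

<!-- XHTML 1.0: the same content in XML syntax. Names are lowercase,
     every element is closed, and attribute values are quoted. -->
<p>First paragraph</p>
<p>Second paragraph<br />
<img src="logo.gif" alt="Logo" /></p>
```

Because the second form is well-formed XML, a generic XML parser can read it without any knowledge of HTML's tag-omission rules.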

The benefits of XML syntax include extensibility and modularity. With HTML, authors had a fixed set of elements to use, with no variation. With XHTML 1.0, authors can mix and match known HTML 4 elements with elements from other XML languages, including those developed by W3C for multimedia (Synchronized Multimedia Integration Language - SMIL), mathematical expressions (MathML), two dimensional vector graphics (Scalable Vector Graphics - SVG), and metadata (Resource Description Framework - RDF).
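
As a hedged sketch of this mixing, the fragment below places a MathML expression inside an XHTML page by using XML namespaces. The formula is invented for illustration. Whether it renders depends on MathML support in the browser, and the result is well-formed XML rather than a document that validates against the XHTML 1.0 DTD alone.

```html
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>XHTML with MathML</title>
</head>
<body>
  <p>The area of a circle is</p>
  <!-- Elements inside <math> belong to the MathML namespace -->
  <math xmlns="http://www.w3.org/1998/Math/MathML">
    <mi>&#x3C0;</mi>
    <mo>&#x2062;</mo>
    <msup><mi>r</mi><mn>2</mn></msup>
  </math>
</body>
</html>
```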

W3C provides instruction and tools for making the transition from HTML 4 to XHTML 1.0. The "HTML Compatibility Guidelines" section of the XHTML 1.0 Recommendation explains how to write XHTML 1.0 that will work with nearly all current HTML browsers. W3C offers validation services for both HTML and XHTML documents. W3C's Open Source software "Tidy" helps Web authors convert ordinary HTML 4 into XHTML and clean document markup at the same time.
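
A minimal sketch of an XHTML 1.0 document that follows a few of those compatibility guidelines might look like this. The page content is invented. The comments point out the guideline being applied.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- Both lang and xml:lang are given so HTML and XML software agree -->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>XHTML 1.0 for HTML browsers</title>
</head>
<body>
  <!-- A space before "/>" helps older HTML browsers parse empty elements -->
  <p>Line one<br />Line two</p>
  <!-- Both id and name are given so older browsers can find the fragment -->
  <p><a id="top" name="top"></a>Start of the page.</p>
</body>
</html>
```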

XHTML 1.0 Provides a Foundation for Device-Independent Web Access

In addition to its extensibility, moving from HTML to XML via XHTML 1.0 lays the foundation for making Web content available to millions more users. People browsing the Web with cell phones or other mobile devices want Web content tailored to their needs. People with disabilities need ways to transform content into accessible formats.

XML documents can already be transformed using Extensible Stylesheet Language Transformations (XSLT), and rendered using independent style sheets such as CSS style sheets. XHTML 1.1, already under development, coupled with device-specific style sheets and Composite Capability/Preference Profiles (CC/PP) - a protocol which allows a user to describe both user preferences and device capabilities - will bring mobile and other devices to the Web as full participants.
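
The style sheet half of this picture can be sketched with the media attribute that HTML 4 already defines. The file names below are hypothetical. The sketch does not cover XSLT or CC/PP.

```html
<head>
  <title>Device-specific style sheets</title>
  <!-- Each link applies only to the named media type; the browser
       picks the style sheet that matches the device in use. -->
  <link rel="stylesheet" type="text/css" media="screen" href="screen.css" />
  <link rel="stylesheet" type="text/css" media="handheld" href="handheld.css" />
  <link rel="stylesheet" type="text/css" media="aural" href="aural.css" />
</head>
```

The document content stays the same for every device. Only the presentation rules change.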

HTML 4.01

HTML 4.01, which was released on December 24, 1999, was the W3C's final specification for HTML. HTML 4 is specified in three flavors: Strict, Transitional, and Frameset.
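
A document states which flavor it follows in its document type declaration. The three HTML 4.01 declarations are listed together here for comparison. A real document would begin with only one of them.

```html
<!-- Strict: excludes deprecated presentational elements and attributes -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">

<!-- Transitional: Strict plus the deprecated elements and attributes -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">

<!-- Frameset: Transitional plus frames -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
    "http://www.w3.org/TR/html4/frameset.dtd">
```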

The complete HTML 4.01 specification is available in English in several formats, including HTML, plain text, PostScript, and PDF.

For more information about the World Wide Web Consortium's work related to HTML, check the W3C Markup page.

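Relative to the HTML 3.2 standard, HTML 4.0 added the following key features: style sheets, scripting, frames, embedded objects, improved support for right-to-left and mixed-direction text, richer tables, enhancements to forms, and improved accessibility for people with disabilities.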

Relative to the HTML 2.0 standard, HTML 3.2 added widely deployed features such as tables, applets, text flow around images, superscripts, and subscripts.

A description of HTML 3.2 elements includes a few minor changes relative to HTML 2.0 as well.

W3C is continuing to work with vendors on extensions for multimedia objects, scripting, style sheets, layout, forms and math. W3C plans on incorporating this work in further versions of HTML. See The W3C Activity Statement on HTML for details.

What happened to HTML 3.0? HTML 3.0 was a proposal for extending HTML published in March 1995. The Arena browser was a testbed implementation, and a few other experimental implementations were developed (see: the Yahoo list of browsers, including UdiWWW, Emacs-W3, etc.). However, the difference between HTML 2.0 and HTML 3.0 was so large that standardization and deployment of the whole proposal proved unwieldy. The HTML 3.0 draft has expired, and is not being maintained.

Resources For Learning HTML

A Beginner's Guide to HTML
This is an excellent document for anyone who is starting to do serious web page authoring. It is maintained by NCSA and reflects the most current specification, HTML Version 4.0, plus some additional features that have been widely and consistently implemented in browsers.
The World Wide Web Consortium
The W3 Consortium exists to develop common standards for the evolution of the World Wide Web. It is an industrial consortium run by the Laboratory for Computer Science at the Massachusetts Institute of Technology. In Europe, MIT collaborates with CERN, the originators of the web, and INRIA, the European W3C center. W3C works with the global community to produce specifications and reference software. W3C is funded by industrial members, but its products are freely available to all.
The HTML 2.0 Standard
This site is maintained by the World Wide Web Consortium and contains the HTML 2.0 standard in several forms. It also includes the SGML source which defines the language, along with tools for translating between SGML and other formats such as TeX and RTF.
Document Type Definition
This defines the syntax of HTML as an SGML application. It is a draft which represents the consensus of an editorial review board within the W3C as of April 1996. It is still under revision, subject to W3C member review and public review.
HTML Validation Service
This link connects to the W3C site that provides HTML validation services.
An Instantaneous Introduction to CGI Scripts and HTML Forms
This document provides a quick introduction to HTML forms and the CGI scripts needed to process them. It is maintained by Michael Grobe, Academic Computing Services, The University of Kansas.

More Sample Pages

 
