Introduction to HTML
HTML stands for "HyperText Markup Language" and is the simple set of codes used to define the appearance and functioning of a World Wide Web page. HTML is a constantly evolving set of codes, which are also known as tags.
Do I have to learn this HTML stuff?
If you're not used to computers, some of the information presented here may seem a little daunting. But don't worry! You only need to learn HTML if you decide that your organization wants to write its own Web pages.
If you simply want your group's basic information on the CommunityNet without learning HTML, just register for an account and we will automatically generate a web page with your own URL, name, description of what your group does, address, phone number, fax number and email button from your registration. You don't have to do anything. This single web page can then be edited later if you want to.
However, if you want to build and maintain your own set of pages, this section of the manual is for you! Learning HTML is not much more difficult than learning a word processing program, and can be quite fun. Of course, don't forget that there are also lots of skilled HTML consultants in town to whom you can contract out Web page creation services.
About HTML files.
HTML files are always plain, unformatted text (ASCII) files. Such files can be created with any text editor or word processor or a web editor such as Arachnophilia. If you have a Macintosh you can use SimpleText or TeachText, which come free with your Mac, to edit text files. If you use a Windows PC you can use the Windows Notepad. Other computers have similar free text editors available.
By convention an HTML file will end with the suffix ".html". Also by convention, HTML file names are usually in lower-case. Note that Windows and DOS systems, which have severe file name restrictions, truncate this suffix to ".htm", and often have upper-case file names.
The HTML codes themselves are enclosed in angle brackets <like so>. Anything within angle brackets is assumed to be an HTML tag by the browser and thus is normally not displayed on-screen for the user to see. HTML codes are thus analogous to the formatting codes used by word processors such as WordPerfect (the codes shown by its "Reveal Codes" function).
This raw HTML - a document marked up with these tags - is called the "source" file. Your browser takes this source file and creates the final formatted copy for you to read.
There are three basic ways to create an HTML file.
The first is to open a word processor or text editor and type in the tags manually. This gives you complete control over the appearance of the document, but obviously requires familiarity with HTML codes.
The second is to use a special HTML editing program such as Arachnophilia. These programs show you the HTML code that is being produced but help you create lists, paragraphs and links by putting in the correct tags at the click of a button. These programs require basic knowledge of HTML but produce clean pages that are easy to view regardless of browser and also simple to update.
Finally, there are "WYSIWYG" editors, an acronym for "what you see is what you get". These programs let you manipulate text on-screen, just like a word processor. The editor then generates the HTML code. You have probably noticed that newer word processors can also save documents as HTML. These tools are easy to use but tend to be somewhat restricting.
Viewing the Source.
If you use Lynx, pressing the "\" key lets you view the raw HTML source for a document, which is a great learning tool. Netscape and Explorer and most other graphical browsers have similar "View Source" functions that let you see the unformatted HTML that makes up a given Web page. One of the best ways to learn how Web pages work is to look at a page you like and examine its source.
What is a CGI?
CGI stands for Common Gateway Interface. Basically a CGI is a software program or tool that links a Web server with some external service. For example, you might have a little program that keeps track of the number of times your page is viewed. This page counter could be a CGI.
Here at the CommunityNet we make extensive use of CGIs on our system, and provide a small library of free CGIs for the use of any organization that wants one. The CGIs are documented in our Cool Stuff file, at:
HTML tags generally have two parts - the opening tag and the closing tag. Take the example below.
This is <B>boldface</B> text!
On a browser this HTML code would look like this:
This is boldface text!
Notice how this works. The opening tag, <B>, tells the browser to make the text boldface. The closing tag, </B>, tells the browser to stop using boldface. In other words, the opening tag defines the attributes of the text between the opening and closing tags.
HTML is not case sensitive. Therefore <B> and <b> are equivalent tags. HTML is often typed in uppercase for legibility, however. Uppercase tags are easier to spot in a document than lowercase or mixed case tags.
The <HTML> tag indicates that the document is an HTML document. Note that at the very end of the document is a </HTML> tag. This is a closing tag. Most HTML tags are followed by a closing tag. The text between the opening and closing tags is controlled by the opening tag. Think of the text as being within a container.
You don't have to include the <HTML> tag at the start and end of a document, but it's good form to do so.
The second tag, <HEAD>, indicates the header portion of the document. This is also optional, and is generally the section of the document used to store information about the file that's normally not directly viewed by the user, such as the document's title.
Each document should have an informative title explaining its purpose in life. The title of a document is usually displayed at the top of the screen or in the title bar of the browser window, separate from the text of the document itself. Any text enclosed in <TITLE> tags is the title. Titles should only be a brief line of text.
Now the actual document begins! The <BODY> tag is yet another optional tag, but is used to indicate the actual text of the document. This actual text of the document is displayed by the browser.
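Put together, the four tags described above give every page the same skeleton. The following is only a minimal sketch; the title and the body text are placeholders, not something taken from the CommunityNet.

```html
<HTML>
<HEAD>
<!-- Information about the page; not shown in the page itself -->
<TITLE>Our Community Group</TITLE>
</HEAD>
<BODY>
<!-- Everything the browser displays goes here -->
Welcome to our group's home page.
</BODY>
</HTML>
```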
<COMMENT>…</COMMENT> or <!-- … -->
At times you might want to put a comment into a document that you don't want others to see. For example, you may write the current date into every document so you know when you last changed it. There's no need to make this date visible to viewers, so you could put it in as an HTML comment. Text placed between <!-- and --> has the same effect, and this is the standard form that browsers understand. Readers can only see the comment text when they view the source for the file.
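As a rough illustration, with made-up comment text, only the middle line of this fragment would appear on the reader's screen:

```html
<!-- Last changed March 3 by the Web volunteer -->
This sentence is displayed by the browser.
<!-- Reminder: update the meeting times next month -->
```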
The <META> tag embeds information about the document. You can use it with the following attributes, provided you place it within the <HEAD> element:
<META NAME = "Description" CONTENT = "a description of page">
This gives a search engine a description to use.
<META NAME = "Keywords" CONTENT = "comma separated keywords">
This gives a search engine keywords to help with indexing.
HTML supports six levels of headings, written <H*>…</H*>. These headings are used to set important headings apart from the body text of a document. To choose a heading level, replace the * with a number from 1 to 6. Here is a top level heading, Heading 1.
<H1>This is the first level heading</H1>
This heading will appear differently depending on what browser you have. Generally, level 1 headings are shown in large bold type onscreen. The remaining 5 headings are shown in correspondingly less dramatic type.
<H2>This is a level 2 heading</H2>
<FONT SIZE = "*">…</FONT>
You may specify an absolute size such as <FONT SIZE = "4">, or you may specify a size relative to the base font using increments, such as <FONT SIZE = "+2">.
Like the <FONT> tag, this tag enlarges the base font size.
This is similar to our friend <B>. However the <I> tag italicizes text. At least, it does so in browsers capable of italicizing text. Some browsers can't display italics and so underline the text.
The <P> tag is a paragraph marker, and is a bit different from the preceding tags.
Normally HTML documents flow on and on and ignore any carriage returns you may put in. This is totally unlike most word processors. Not only that, but your browser will collapse multiple spaces or blank lines into a single space. You can put all the blank lines (carriage returns) you want into your HTML source file, and they will appear as a single space when viewed by a browser. HTML source files are often spaced out generously and indented with tabs to make them easier to understand when the time comes to change them.
If you want to indicate a blank line you must use the <P> tag. The closing </P> tag is not essential.
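The behaviour described above can be seen in a small sketch like this one (the wording is only filler text). The first three source lines run together as one line of text in the browser, and the <P> tag is what produces the break.

```html
This line     has extra   spaces
and a line break in the source,
but the browser shows it as one run of text.
<P>
This sentence starts a new paragraph, set off by a blank line.
```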
You've probably noticed that inserting a paragraph break puts in a whole blank line. This is obviously undesirable in some instances, where you simply want a new line to start. The line break tag is used for this purpose. Thus:
Vancouver Community Network<BR>
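A slightly longer sketch of the same idea follows. The street address is a placeholder, not a real one.

```html
Vancouver Community Network<BR>
123 Example Street<BR>
Vancouver, B.C.<BR>
<P>
The address above appears on three consecutive lines,
with no blank lines between them.
```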
Sometimes you'd like to be able to indent text in from the left side of the screen, much as this document does throughout. Unfortunately basic HTML has no provisions for tabbing and so on, but does have a tag called "blockquote".
Block quotes are intended to be used when you quote a sizeable portion of someone else's work. Normally when you do that in a print document the text is indented from the left side. So, by using the blockquote tag you can move things in from the left margin. This is not using the code precisely as it was intended, but it achieves an effect on virtually all browsers, so why not?
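As a minimal sketch of this indenting trick (the sentences are filler text), only the middle paragraph is moved in from the margin:

```html
This paragraph starts at the left margin.
<BLOCKQUOTE>
This paragraph is indented by most browsers,
even though it is not really a quotation.
</BLOCKQUOTE>
This paragraph is back at the left margin.
```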
<HR> … Horizontal Rules
Most browsers can draw horizontal lines across the screen. This is one of those tags with no closing tag.
<PRE>…</PRE> Preformatted Text.
Another useful trick is putting in preformatted text. As noted above, your browser will collapse any lines of spaces to a single space, and any blank lines of text to a single space. This can be a problem if you want to display something like a table of text that has been formatted with spaces to make things line up.
The answer? The <PRE> tag, which allows you to insert preformatted text. When you put in preformatted text the browser won't collapse any spaces. Graphical browsers also usually display preformatted text with a typewriter-style font.
<PRE> This is preformatted text. Notice how text lines up? </PRE>
A common element of many Web pages is a list of items marked with bullets (dots). HTML can do this for you automatically.
This code is displayed by the browser like this:
The <UL> tag specifies an "unordered list." The <LI> tag specifies a "list item". And the closing </UL> tag marks the end of the list.
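A minimal sketch of such a list, with placeholder items, might look like this. Each <LI> starts a new bulleted item, and no closing tag is needed on the individual items.

```html
<UL>
<LI>Apples
<LI>Oranges
<LI>Bananas
</UL>
```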
Sometimes you want numbered lists, though. HTML supports them too!
1. Combine ingredients
2. Bake at 375 degrees until brown
3. Cool before eating
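The markup for a numbered list of the three steps above is sketched below. It assumes the <OL> ("ordered list") tag, which takes <LI> items in the same way as <UL>; the browser supplies the numbers itself.

```html
<OL>
<LI>Combine ingredients
<LI>Bake at 375 degrees until brown
<LI>Cool before eating
</OL>
```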
Maybe you want a list of words with their definitions indented and following:
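One way to do this is with a definition list, sketched here using the <DL>, <DT> (term) and <DD> (definition) tags. The terms and definitions are placeholders.

```html
<DL>
<DT>Browser
<DD>A program used to view Web pages.
<DT>Tag
<DD>An HTML code enclosed in angle brackets.
</DL>
```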
Since the angled brackets < and > and the quotation mark " have special meaning in HTML, how can you use these characters within a document? Obviously if you were to type an angled bracket into a document the browser would assume you were trying to put in a tag and things would break down.
HTML solves this problem by using "escape sequences". An escape sequence starts with the ampersand (&) character and ends with a semicolon (;). Unlike the rest of HTML, these escape sequences are case sensitive!
In other words, uppercase differs from lowercase.
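As a short sketch, the four most common escape sequences are &lt; for <, &gt; for >, &amp; for & and &quot; for the quotation mark. The sentences below are only filler text.

```html
To make text bold, wrap it in &lt;B&gt; and &lt;/B&gt; tags.
<BR>
She said &quot;hello&quot; to Smith &amp; Jones.
```

A browser would show the first line with the literal <B> and </B> visible instead of treating them as tags.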
Accents and Diacriticals.
One very useful feature for a "World Wide" Web is the ability to support non-English accents and diacritical characters. Since different computers - PCs, Macs, UNIX systems - all handle accents differently, HTML has its own way. HTML accents are similar to special characters. Here are a few HTML diacriticals used in French.
&ccedil; is a lowercase c with a cedilla (ç). Typing Fran&ccedil;ais displays Français.
For a complete list of all the non-English diacritical characters, check this URL:
HTML - Hyperlinks.
One of the most useful aspects of HTML is the ability to embed links to other documents and servers within a file. Links are marked by the <A> tag, which indicates a hypertext anchor.
Linking to another document.
Say the document you want to link to is in the same directory ("folder" on a Mac) as your current file, and is named "page2.html". A link to it would look like this:
Let's go to <A HREF="page2.html">the second page.</A>
Here the attribute, HREF, stands for hypertext reference. It indicates the name of the reference to which this anchor is pointing. When viewed with a text-only browser the "hot text," or text between the <A></A> tags, is highlighted. When viewed with a graphical browser the hot text is usually a different colour and underlined. In the example above the hot text is the phrase "the second page."
Linking to a URL.
An anchor can also point to a full URL, as in this example.
The <A HREF="http://www.vcn.bc.ca/">Vancouver CommunityNet</A> Web site.
When pointing to a local page or a site on the other side of the world, always remember to close the anchor with a </A> tag! If you don't then the rest of your document will be an anchor. Also, be absolutely certain to enclose the filename or URL in double quotation marks. If you don't then the link will not work with many browsers!
A common application of a link is a hot piece of text that allows the user to send an email message to the author of the page. You simply use the A HREF technique as above, only you use a "mailto:" URL.
Send fan mail to <A HREF="mailto:email@example.com">the CommunityNet Web administrator.</A>
Plain Text Links - A quick and easy shortcut.
If you have a number of text files you want to make available on the Web, it can be annoying to have to convert those files to HTML first. Here's a quick shortcut.
All Web browsers are capable of displaying ordinary text files as well as HTML files. Make sure the text file you want displayed ends in the suffix ".txt" and that it has line breaks (returns) inserted into it every 70 columns or so. Link the file in as you would any other file.
Display a <A HREF="plaintext.txt">plain text</A> file.
Your browser will then display the plain text file as-is. It's essential that you have line breaks already inserted into the file as browsers will not wrap long lines of text.
One of the most compelling features of HTML is the ability to embed small picture files that can be viewed using a graphical browser. This is done using the <IMG> tag, as shown below.
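In its simplest form the tag is a single line with no closing tag, sketched here with the file name picture.gif:

```html
<IMG SRC="picture.gif">
```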
Notice, first of all, that this is a tag containing various options. In this case the SRC attribute specifies the source file, or the graphic to be displayed. Here the source is a picture named "picture.gif" that is in the same directory as the HTML file itself.
There are two common file formats for pictures that you are likely to see on the Web - GIF and JPEG.
GIF stands for Graphics Interchange Format. This is the most popular format for storing graphics on the Web at present. The inventors of the format pronounce its name "jiff" but some people say it "giff".
GIF files are compressed, so they take up a minimal amount of space on disk. A GIF picture can have up to 256 colours or as few as 2. Generally, GIF files bear the suffix ".gif".
JPEG is another standard for graphics. This is not as popular as GIF, but is growing rapidly in popularity. JPEG stands for Joint Photographic Experts Group and, as its name implies, is suited to large, high-quality photographic images. A JPEG file can contain thousands or even millions of colours, and uses sophisticated compression methods to make the picture as small as possible. JPEG files are suffixed as ".jpeg" or ".jpg" files.
A cautionary note about graphics.
It's very possible that some people will view your web page in a non-graphical way. They could be visually impaired, using a text browser or simply have turned off graphics to speed up their browsing. Whatever the case it is good form to always include a value for the "ALT" attribute of the "IMG" tag:
<IMG SRC="picture.gif" ALT="Welcome!">
People viewing this code with a graphical browser get to see a graphic, and people using a text-only browser see the text "Welcome!". Try to keep the text relevant to the image. Another useful trick is putting in blank or "null" ALT field. If you do this:
<IMG SRC="picture.gif" ALT="">
people with text-only browsers won't see anything at all! This is useful if the graphic is purely decorative and serves no other function. You have to put in an ALT attribute with an empty quoted string - nothing between the quotation marks. If you omit the ALT attribute then text-only browser users will see the message "[IMAGE]", which is rather ugly.
ALIGN: Positions the image relative to the surrounding text. LEFT and RIGHT place the image at that margin and wrap text around it; TOP, MIDDLE, and BOTTOM line the image up with the current line of text.
<IMG SRC = "picture.gif" ALIGN = "LEFT">
BORDER: Specifies the width in pixels of the border that surrounds the image.
<IMG SRC = "picture.gif" BORDER = "10">
HEIGHT: Specifies the height of the image in pixels.
<IMG SRC = "picture.gif" HEIGHT = "202">
HSPACE: Specifies the horizontal margin around the image in pixels.
<IMG SRC = "picture.gif" HSPACE = "5">
VSPACE: Specifies the vertical margin around the image in pixels.
<IMG SRC = "picture.gif" VSPACE = "5">
WIDTH: Specifies the width of the image in pixels.
<IMG SRC = "picture.gif" WIDTH = "202">
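Several of these attributes can be combined in one tag. The sketch below is only an illustration; the ALT text and the sentence after the image are placeholders.

```html
<!-- Placeholder ALT text; sizes match the examples above -->
<IMG SRC="picture.gif" ALT="Our group at the spring fair"
     WIDTH="202" HEIGHT="202" HSPACE="5" VSPACE="5"
     BORDER="0" ALIGN="LEFT">
This text wraps down the right-hand side of the picture.
```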
You can create a table to neatly display information on a Web page. Keep in mind that a table cell may contain any markup that can go inside the <BODY> tag, including other tables. You could therefore make the contents of a cell bold or italic.
Table Basic Tags
<TABLE>…</TABLE> Specifies an HTML table. (By default the table will have no borders.)
<TR>…</TR> Specifies a Table Row.
<TH>…</TH> Specifies a Table Header Cell.
<TD>…</TD> Specifies a Table Data Cell.
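The four basic tags fit together as in the following sketch of a table with one header row and two data rows. The programs and days are made-up sample data.

```html
<TABLE>
<TR>
<TH>Program</TH>
<TH>Day</TH>
</TR>
<TR>
<TD>Drop-in computer help</TD>
<TD>Tuesday</TD>
</TR>
<TR>
<TD>Volunteer orientation</TD>
<TD>Saturday</TD>
</TR>
</TABLE>
```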
The <TABLE> tag comes with several attributes to make the table look nicer. These attributes define the table as a whole; further attributes define individual table cells (inside the <TH> or <TD> tags) and table rows (inside the <TR> tags).
ALIGN: On <TABLE>, specifies the position of the table with respect to the document. Valid values are LEFT, RIGHT, and CENTER. Within the <TH> or <TD> tags, ALIGN specifies the alignment of text within the cell. Within the <TR> tag, ALIGN specifies the alignment of text within the cells of that row.
BGCOLOR: On <TABLE>, colors the table background. Within the <TH> or <TD> tags, specifies the background color for the table cell. Within the <TR> tag, specifies the background color for the table cells in that row.
BORDER: <TABLE> only. Specifies the pixel width of the border that divides table cells and surrounds the table itself.
CELLPADDING and CELLSPACING: <TABLE> only. CELLPADDING specifies the amount of space between the border of a cell and the actual data in it, whereas CELLSPACING specifies the amount of space inserted between table cells.
HEIGHT: On <TABLE>, specifies the height of the table in pixels or as a % of available space. Within the <TH> or <TD> tags, HEIGHT specifies the height of a cell.
WIDTH: <TABLE> only. Specifies the width of the table in absolute pixels or as a % of available space.
COLSPAN: <TH> or <TD> only. Specifies the number of columns that a single cell should span.
NOWRAP: <TH> or <TD> only. Specifies that the text within the cell should not be word-wrapped.
ROWSPAN: <TH> or <TD> only. Specifies the number of rows the cell should span.
VALIGN: <TR> only. Specifies the vertical alignment of text within the cells in the row. Valid values are TOP, MIDDLE, or BOTTOM.
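A sketch that combines several of these attributes is shown below. The values and the cell contents are arbitrary choices for illustration.

```html
<!-- A bordered table, 80% of the window wide, centred on the page -->
<TABLE BORDER="1" CELLPADDING="4" CELLSPACING="0" WIDTH="80%" ALIGN="CENTER">
<TR BGCOLOR="#CCCCCC" VALIGN="TOP">
<!-- One header cell stretched across both columns -->
<TH COLSPAN="2">Weekly schedule</TH>
</TR>
<TR>
<TD NOWRAP>Drop-in computer help</TD>
<TD ALIGN="RIGHT">Tuesday</TD>
</TR>
</TABLE>
```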
Web Building Links
Below is a comprehensive list of links that will help you take Web design to the next level. If you're just starting out in building your Web site, check out Webmonkey and Builder.com, two of the best sites for general Web site tips.
HTML Tips and Tricks
Advanced HTML Tips
Size does matter (screen size, that is)
Browser Compatibility Chart
Cascading Style Sheet Compatibility Chart
Stuck with old browsers until 2003
HTML 3.2 Specifications from W3C
HTML Text Tips
Dynamic HTML Tips
Web Usage Statistics
Angus Reid: Canadians on the Internet
AC Neilson Canadian Internet Survey 1998
XML, Style Sheets, Java and CGI
Style Sheet Centre on Builder.com
20 Questions about XML
Introduction to XML
The CGI Resource Index: Programs and Scripts: Perl
Drag 'n' Drop CD-ROM Home Page
Matt's Script Archive
Writing for the Web
Love your labels
How to write links
How to write a Web Style Guide
Images and Colours
Web Graphics 101
Usability and Information Architecture
Why frames suck
Squishy's Crash Course in Information Architecture
The Navigation and Usability Guide
Organizing your site from A-Z
Web Site Promotion
On interpreting access statistics
Brian D. Davison's Web Caching Resources
Appendix A - Basic Template.
The following HTML code is a basic template for creating a single Web page on the CommunityNet. It's the basic code that's generated by our Instant Web Page Kits. If you want to create a simple Web page on our system, just take this text and copy and paste in the appropriate information for your organization. Everything that's in italics should be replaced by your information.
<HTML>
<HEAD>
<TITLE>Your IP's name</TITLE>
<LINK REV="owner" HREF="mailto:Your email address">
<!-- created the date by your email address -->
<!-- modified the date by your email address -->
</HEAD>
<BODY>
<CENTER>
<!--#include virtual="/includes/ip-hosted1.html"-->
<HR>
<IMG SRC="sample.gif" ALT="" WIDTH=190 HEIGHT=76 VSPACE=10><BR>
<FONT SIZE="+2">- <B>Your IP Name</B> -</FONT>
<P>
<HR>
</CENTER>
<BLOCKQUOTE>
<B>Mission:</B><BR>
Your organization's mission statement.
<P>
<B>Services:</B><BR>
Services, if appropriate.
</BLOCKQUOTE>
<P>
<HR>
<P>
<BLOCKQUOTE>
<B>Contact:</B><BR>
Contact name, if appropriate.
<P>
<B>Your IP's name</B><BR>
Street address<BR>
City, Province. Postal code<BR>
Canada.
<P>
<B>Phone:</B> Your phone number.
<BR>
<B>Fax:</B> Your fax number.
<P>
<B>Email:</B> <A HREF="mailto:Your email address">Your email</A>
<P>
This page last updated the date.
<P>
Copyright © 1996 Your IP's name.
<P>
</BLOCKQUOTE>
<!--#include virtual="/includes/footer.html"-->
</BODY>
</HTML>
First, be very careful with the lines that begin with <!--#include. If you insert an additional space in the wrong place in those lines then they won't work. Sorry, but computers can be very fussy at times. Note also that the #include information will only appear once your file has been installed on the VCN computer.
Second, we have a couple of variants on the "We're hosted by CommunityNet" graphic. If you don't like the default one, try the other option:
Third, you can have our system automatically timestamp your files with the most recently modified date. That saves you the bother of having to update the date manually each time you make a change. Just put in the following text into your page:
This document was last modified on
<!--#config timefmt="%B %e, %Y"-->
Once again, be very careful if you use this code - it must be typed in precisely as shown above, or it won't work. For more details on this, check out the Cool Stuff page.
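For reference, on servers that use Apache-style server-side includes the timefmt line only sets the date format, and a second directive prints the date itself. The sketch below assumes the CommunityNet server follows that convention, which this manual does not confirm, so check the Cool Stuff page before relying on it.

```html
This document was last modified on
<!--#config timefmt="%B %e, %Y"-->
<!-- Assumed Apache-style directive; not confirmed for the VCN server -->
<!--#echo var="LAST_MODIFIED"-->
```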
Appendix B - Coding Style
Coding Style by Example
Here is an example of "bad" code.
<HTML ><head><Title>Hello World</TITLE></Head><Body bgcolor=white text=black> This is a very simple web page. Notice that the browser does not pay attention to spaces that we add to our document unless you specify what type of spacing you want<p>Like when you use a paragraph tag or a <br> break line tag. <A HREF = "next.html"> click here to go on to the next page </A></BODY>
Here is an example of good code:
<HTML>
<HEAD>
<TITLE>Hello World</TITLE>
</HEAD>
<BODY BGCOLOR = "#FFFFFF" TEXT = "#000000"
VLINK = "#AAAAAA" ALINK = "#564345">
<!-- CREATED NOVEMBER 25, 1999 -->
This is a very simple web page. Notice that the
browser does not pay attention to spaces that we
add to our document unless you specify what
type of spacing you want
<P>
Like when you use a paragraph tag or a
<BR>break line tag.
Continue <A HREF = "next.html">on</A> to go to the next page.
<!-- ENDING OF PAGE BODY -->
</BODY>
</HTML>
Appendix C - Top Ten Tips for Designing a Usable Web Site
Appendix D - Uploading/Downloading files.
Well, once you've created your splendid HTML documents, how do you get them to the CommunityNet? Here are the steps:
1. You can upload the files using our dialup lines.
Log into our dialup system and transfer the files using WS_FTP95 LE or another FTP (File Transfer Protocol) program.
a) Log into the system using your VCN dialup network connection.
b) Once you have logged onto the VCN system, load your FTP program (instead of Netscape Communicator). You will then have to connect to our webserver using the address, username, and password given to you by us.
When you click on Connect, your screen should look like the diagram below:
c) Type in the following on the general menu as it appears above:
Profile Name: Vancouver CommunityNet
Host Name/Address: vcn.bc.ca
Host Type: Automatic detect
User ID: your vcn login ID
Password: your vcn password (click on save Pwd so you don't have to type this again)
d) Before connecting, click on the startup menu and type in the following:
Initial Remote Host Directory: webdata
Initial Host Directory: c:\mywebdata (or whatever directory you have set up on your local system to hold your website files)
e) Click on the connect button at the bottom left of the screen to make the connection.
Your screen should look like the diagram below:
f) Downloading (from the webserver to your personal computer):
Highlight the file(s) you want to download by clicking on them in the right column. Click on the left arrow button ( ← ) in the middle column to transfer the files to your c:\mywebdata directory.
You can now edit this file using either a webpage editor or a text based program.
g) Uploading (from your personal computer to the webserver)
Highlight the file(s) you want to upload by clicking on them in the left column. Click on the right arrow button ( → ) in the middle column to transfer the files to your webdata directory on the server.
h) Close your FTP connection by simply clicking on the close button (bottom left of screen).