LIS901N: HTML

LIS901N: HTML Thomas Krichel 2003-01-05


Page 1: LIS901N: HTML

LIS901N: HTML

Thomas Krichel

2003-01-05

Page 2: LIS901N: HTML

Structure of talk

• HTML
• HTML standards and standards adherence, much of which is based on a paper by Brian Kelly, at http://www.ariadne.ac.uk/issue33/web-focus/

Page 3: LIS901N: HTML

HTML and XHTML

• HTML is the hypertext markup language
• HTML is a markup language that is widely used on the World Wide Web (WWW)
• The latest, and probably last, version of HTML is at http://www.w3.org/TR/html4/
• The W3C, the standards-making body for the WWW, has issued XHTML, a replacement for HTML that is compatible with XML.

• We will ignore XHTML for the rest of the course.

Page 4: LIS901N: HTML

What is Markup?

• Everything in a document that is not content. It can be given in two ways:
• 1: Procedural
– Codes identify point size, style, font, etc.
– Usually understood by the defining tool
– Example: M$ Word
• 2: Descriptive
– Describes purpose of text within the document
– Chapter head, Paragraph, Section Head, TOC
– Structure and Style are kept separate
– Example: LaTeX, SGML

Page 5: LIS901N: HTML

Procedural vs Descriptive

Page 6: LIS901N: HTML

SGML

• Standard Generalized Markup Language
• Descriptive approach with three separate layers
– structure: types of information in document
– content: the information itself
– style: matches typesetting with structure
• Document Type Definition (DTD)
– Defines the structure

• Developed for the publishing industry by a group around Goldfarb.

• So complicated that no software implements it fully

Page 7: LIS901N: HTML

SGML Document Type Definition

• Describes information the document handles
– e.g. Title, TOC, Chapter, Section
• Relationships between fields
– e.g. A Chapter contains Sections
• Consistency
• Logical structure
• Information defined by tags

Page 8: LIS901N: HTML

HTML

• HyperText Markup Language
• Defined by an SGML DTD
– Head, Title, Body, Paragraph, etc.
– Headings, Bold, Italic, etc.
– Table, List, Image, etc.
– Links to other documents
– Forms
• Style applied by Web Browser
– User has some control

Page 9: LIS901N: HTML

HTML Tags

• HTML markup is written as tags. Tags are written as pairs (typically)
– begin with <atag>
– end with </atag>
– atag is the tag name
• Can be nested
• Can contain non-markup data
• Tag names are case-insensitive, but it is best to use the same case, consistently, for human readability.

Page 10: LIS901N: HTML

attributes to tags

• <atag attribute_name_one="value_one" attribute_name_two="value_two">
• Here attribute_name_one and attribute_name_two are attribute names, and value_one and value_two are attribute values.

Page 11: LIS901N: HTML

Common Tags

• Always include the <HTML>…</HTML> tags
• Comments start with <!-- and end with -->
• HTML documents
– <HEAD> section
  • Info about the document
  • Info in header not generally rendered in display window
  • TITLE element names your Web page
– <BODY> section
  • Page content
  • Includes text, images, links, forms, etc.
  • Elements include backgrounds, link colors and font faces
  • P element forms a paragraph, blank line before and after

Page 12: LIS901N: HTML

common frame for pages

• Put the following in your pages:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
    "http://www.w3.org/TR/REC-html40/loose.dtd">
<HTML>
<HEAD><TITLE></TITLE></HEAD>
<BODY></BODY>
</HTML>
• The first two lines are the SGML document type declaration, which says which kind of HTML it is; we use version 4.0

• Close nested tags properly.

Page 13: LIS901N: HTML

Headers

• Headers <H1> to <H6>
– Simple form of text formatting
– Vary text size based on the header’s “level”
– Actual size of text of header element is selected by browser
– Can vary significantly between browsers
• <CENTER> element
– Centers material horizontally
– Most elements are left adjusted by default

Page 14: LIS901N: HTML

Text Styling

• Underline style
– <U>…</U>
• Emphasis (italics) style
– <EM>…</EM>
• Strong (bold) style
– <STRONG>…</STRONG>
• <B> and <I> tags discouraged (though not formally deprecated in HTML 4)
– Overstep boundary between content and presentation
• Strikethrough with <DEL>
• Superscript: <SUP> element
• Subscript: <SUB> element

Page 15: LIS901N: HTML

Line break

• Use <p> to create a new paragraph
• Use <br> to create a line break
• To have several empty <P> in a row, put &nbsp; inside each one
• Align elements with ALIGN attribute
– right, left or center
• Example: <p align="center"> </p>
• The closing </p> is optional, and <br> has no closing tag

Page 16: LIS901N: HTML

Linking

• Links inserted using the A (anchor) element
– Requires HREF attribute which specifies the URL you would like to link to
• <A HREF = "address">…</A>
• Can link to email addresses, using
• <A HREF = "mailto:emailaddress">…</A>
• Note quotation mark placement
• Example:
<a href="http://openlib.org/home/krichel/">Thomas Krichel</a>

Page 17: LIS901N: HTML

Uniform Resource Locator (URL)

http://arcano.openlib.org/~krichel/sae.html

Parts of the URL: scheme (http), server name (arcano.openlib.org), path (~krichel), file name (sae.html)

URL can be
• Absolute – contains all parts of the URL;
• Relative – gives the path and file name relative to the current file.
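The parts of the example URL can also be pulled apart by a program. The following is a minimal sketch using Python's standard urllib.parse module; Python is not part of the talk, and the variable names are only illustrative.

```python
from urllib.parse import urlsplit

url = "http://arcano.openlib.org/~krichel/sae.html"
parts = urlsplit(url)

scheme = parts.scheme    # the scheme, "http"
server = parts.netloc    # the server name, "arcano.openlib.org"

# Split the path into the folder part and the file name at the last slash.
folder, _, file_name = parts.path.rpartition("/")

print(scheme, server, folder, file_name)
```
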

Page 18: LIS901N: HTML

Scheme

• http – Hypertext Transfer Protocol, to access Web pages
• ftp – File Transfer Protocol, to download a file from the net
• mailto – to send electronic mail
• file – to access a file on a local hard disk (the file scheme uses ///)
• and others…

Page 19: LIS901N: HTML

Relative URL (examples)

• A file from the same folder as current file: "file.html"
• A file from a subfolder of current folder: "images/picture.gif"
• A file from another folder at the same hierarchical level: "../info/data.html"
• Same conventions as in UNIX!
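To see how a browser would resolve these relative URLs, here is a small sketch with urljoin from Python's standard library. The base URL is borrowed from the earlier URL slide and is assumed, for illustration only, to be the address of the current file.

```python
from urllib.parse import urljoin

# Assumed address of the current file (taken from the URL slide).
base = "http://arcano.openlib.org/~krichel/sae.html"

same_folder = urljoin(base, "file.html")
subfolder = urljoin(base, "images/picture.gif")
sibling_folder = urljoin(base, "../info/data.html")

for resolved in (same_folder, subfolder, sibling_folder):
    print(resolved)
```

The ".." step climbs out of the ~krichel folder before descending into info, just as it would in a UNIX path.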

Page 20: LIS901N: HTML

Links inside document: anchors

• Place the cursor at the part of the page where the link should bring visitors
• Create an anchor
<A NAME="anchor_name">Label text</A>
• Label text is the text or image that should be referenced, i.e. where the link should bring the visitor to.
• To link to the anchor, use
– <A HREF="#anchor_name">Label text</A> or
– <A HREF="URL#anchor_name">Label text</A>

Page 21: LIS901N: HTML

Images

• Insert image into page with the <IMG> tag, attributes:
– SRC = location
– BORDER (in pixels, black by default)
– ALT (text description for browsers that have images turned off or cannot view images, required)
• location can be any URL, or a file name on the server machine
• Pixel
– Stands for “picture element”
– Each pixel represents one addressable dot of color on the screen

Page 22: LIS901N: HTML

Color

• Preset colors (white, black, blue, red, etc.)
• Hexadecimal code
– First two characters for amount of red
– Second two characters for amount of green
– Last two characters for amount of blue
– 00 is the weakest a color can get
– FF is the strongest a color can get
– Ex. black = #000000
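The hexadecimal scheme can be checked with a few lines of Python. This is only a sketch: parse_hex_color is a hypothetical helper name, not something defined by HTML.

```python
def parse_hex_color(code):
    """Split a '#RRGGBB' color code into (red, green, blue) integers, each 0-255."""
    digits = code.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hexadecimal digits")
    # Characters 0-1 are red, 2-3 are green, 4-5 are blue.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

black = parse_hex_color("#000000")
red = parse_hex_color("#FF0000")
white = parse_hex_color("#FFFFFF")
print(black, red, white)
```
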

Page 23: LIS901N: HTML

background

• Image background
– <BODY BACKGROUND="background_image_file">
– Image does not need to be large because the browser tiles the image across and down the screen
• Color background
– <body bgcolor="color">
– color is an indication of color as previously explained.

Page 24: LIS901N: HTML

Formatting Text With <FONT>

• <FONT> lets you change the font if the browser allows it. <FONT> attributes:
– COLOR="color"
– SIZE
  • To make text larger, set SIZE="+x"
  • To make text smaller, set SIZE="-x"
  • x is the number of font point sizes
  • x is between 1 and 3
– FACE
  • Font of the text you are formatting
  • Be careful to use common fonts like Times, Arial, Courier and Helvetica
  • Browser will display default if unable to display specified font

Page 25: LIS901N: HTML

Special Characters

• Inserted as an entity reference
– Format can be &code;
  • Ex. &amp; inserts an ampersand
– Codes are often abbreviated forms of the character name
– Codes can also be numeric, in decimal or hex form
  • Ex. &#38; (decimal) or &#x26; (hex) to insert an ampersand

http://www.w3.org/TR/REC-html40/sgml/entities.html has the list
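As a quick check that the named, decimal and hexadecimal forms all stand for the same character, here is a sketch using unescape from Python's standard html module. The variable names are illustrative.

```python
from html import unescape

named = unescape("&amp;")         # named entity reference
decimal = unescape("&#38;")       # decimal character reference
hexadecimal = unescape("&#x26;")  # hexadecimal character reference

# All three forms decode to the same ampersand character.
print(named, decimal, hexadecimal)
```
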

Page 26: LIS901N: HTML

Horizontal Rules

• <HR> tag inserts a line break directly below it
• HR attributes:
– WIDTH
  • Adjusts the width of the rule
  • Either a number (in pixels) or a percentage
– SIZE
  • Determines the height of the horizontal rule
  • In pixels
– ALIGN
  • Either left, right or center
– NOSHADE
  • Eliminates default shading effect and displays horizontal rule as a solid-color bar

Page 27: LIS901N: HTML

Slides prepared by K.Clarck

Tables

A table is a matrix formed by the intersection of a number of horizontal rows and vertical columns.

        Column 1   Column 2   Column 3
Row 1
Row 2
Row 3

Page 28: LIS901N: HTML

Slides prepared by K.Clark

Tables (continued)

The intersection of a column and row is called a cell. Cells in the same row or column are usually logically related in some way.

        Column 1   Column 2   Column 3
Row 1   Cell       Cell       Cell
Row 2   Cell       Cell       Cell
Row 3   Cell       Cell       Cell

Page 29: LIS901N: HTML

Tables (continued)

Container
<TABLE> … </TABLE>

Attributes:
BORDER=n – the border thickness in pixels
WIDTH=x – width of the table, or of a cell within the table, in pixels or as a percentage of the screen display (0% to 100%)

Page 30: LIS901N: HTML

Tables (continued)

• A table is formed row by row
• To define a row, <TR>…</TR> is used
• Within a row, cells with data are defined by <TD>…</TD> and header cells by <TH>…</TH>

Page 31: LIS901N: HTML

Simple Table (example)

<TABLE>
<TR> <TH>Month</TH> <TH>Quantity</TH> </TR>
<TR> <TD>January</TD> <TD>130</TD> </TR>
<TR> <TD>February</TD> <TD>125</TD> </TR>
<TR> <TD>March</TD> <TD>135</TD> </TR>
</TABLE>
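When the data already lives in a program, a table like this one can be generated row by row rather than typed by hand. The sketch below is one possible way to do it in Python; render_table is a hypothetical helper, and cell text is passed through html.escape so that characters such as & and < stay valid.

```python
from html import escape

def render_table(header, rows):
    """Build a simple HTML table: one TH row, then one TD row per data row."""
    lines = ["<TABLE>"]
    head_cells = "".join(f"<TH>{escape(str(h))}</TH>" for h in header)
    lines.append(f"<TR>{head_cells}</TR>")
    for row in rows:
        data_cells = "".join(f"<TD>{escape(str(c))}</TD>" for c in row)
        lines.append(f"<TR>{data_cells}</TR>")
    lines.append("</TABLE>")
    return "\n".join(lines)

table = render_table(
    ["Month", "Quantity"],
    [["January", 130], ["February", 125], ["March", 135]],
)
print(table)
```
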

Page 32: LIS901N: HTML

Tables (more complicated)

• To span a cell across several columns, use the attribute COLSPAN=n, where n is the number of columns
• To span a cell across several rows, use the attribute ROWSPAN=n, where n is the number of rows

Page 33: LIS901N: HTML

Cell Attributes

• FONT – establishes the font of a cell (in HTML 4 this is an element placed inside the cell, not an attribute of the cell)
• ALIGN – determines horizontal alignment of cell content, accepts values: “left”, “center”, or “right”
• VALIGN – determines vertical alignment of cell content, accepts values: “top”, “middle”, “bottom”, or “baseline”

Page 34: LIS901N: HTML

Purposes to use tables

• To present tabular data
• To create multicolumn text
• To create captions for images
• To create side bars

Cells may contain various HTML containers: Images, Hyperlinks, Text, Objects, even Tables

Page 35: LIS901N: HTML

why standards matter

• Avoiding Browser Lock-in
• Maximize Access To Browsers
• Maximize Accessibility
• Enhance Interoperability
• Enhance Performance
• Facilitate Debugging
• Facilitate Migration

“Arguing that a resource is almost compliant is like describing someone as almost a virgin!”

Page 36: LIS901N: HTML

Components to add

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<title></title>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
</head>
<body>
</body>
</html>

Page 37: LIS901N: HTML

use validator service

• http://validator.w3.org/
• or do it on wotan; Thomas has a validator installed there.

Page 38: LIS901N: HTML

problem: Ampersand in URL

• <!-- This is invalid! --> <a href="foo.cgi?chapter=1&section=2">...</a>

• This example generates an error for "unknown entity section" because the "&" is assumed to begin an entity.

• To avoid problems with both validators and browsers, always use &amp; in place of &:

• <a href="foo.cgi?chapter=1&amp;section=2">...</a>
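If links are written out by a program, the replacement of & by &amp; can be left to a library. A minimal sketch with Python's standard html.escape follows; the variable names are illustrative.

```python
from html import escape

raw_url = "foo.cgi?chapter=1&section=2"

# escape() turns & into &amp; (and also < > and, with quote=True, quote marks),
# which makes the value safe to place inside an href="..." attribute.
safe_url = escape(raw_url, quote=True)
link = f'<a href="{safe_url}">...</a>'
print(link)
```
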

Page 39: LIS901N: HTML

problem: incorrect nesting

• Elements in HTML cannot overlap each other. The following is invalid:

• <B><I>Incorrect nesting</B></I>

• The following is valid:

• <B><I>Correct nesting</I></B>
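The nesting rule is easy to test mechanically with a stack: every end tag must match the most recently opened element. The sketch below uses Python's standard html.parser module; NestingChecker and is_properly_nested are hypothetical names. It assumes every element has both a start and an end tag, so it would wrongly flag <br> or an unclosed <p>. It is no substitute for a real validator.

```python
from html.parser import HTMLParser

class NestingChecker(HTMLParser):
    """Track open elements on a stack and record end tags that do not match."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # A valid end tag closes the most recently opened element.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.errors.append(tag)

def is_properly_nested(markup):
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return not checker.errors and not checker.open_tags

print(is_properly_nested("<B><I>Incorrect nesting</B></I>"))
print(is_properly_nested("<B><I>Correct nesting</I></B>"))
```
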

Page 40: LIS901N: HTML

problem: casing in doctype

• In a doctype, the formal public identifier--the quoted string that appears after the PUBLIC keyword--is case sensitive. A common error is to use the following:

• <!doctype html public "-//w3c//dtd html 4.0 transitional//en">

• The correct form keeps the mixed case of the identifier:
• <!doctype html public "-//W3C//DTD HTML 4.0 Transitional//EN">

Page 41: LIS901N: HTML

problem: name attribute

• The HTML 4.0 Specification did not allow a NAME attribute for a FORM or IMG element. However, the HTML 4.01 Specification allows them. Thus, you can now use the following document type declaration if you use those attributes:
• <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
• Using a href as src for an image is a stupid idea.

Page 42: LIS901N: HTML

Using special characters

Compose the document entirely with US-ASCII characters.

Represent non-ASCII characters using character references of the form &#number; where number is the code number of the character in ISO 10646 (Unicode) in decimal notation.

Configure things so that the Web server sends the document with the HTTP header Content-type: text/html;charset=utf-8
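The conversion of non-ASCII characters to decimal character references can be automated. Here is a minimal sketch using the "xmlcharrefreplace" error handler that comes with Python's built-in encoders; to_ascii_html is a hypothetical helper name, and the sample text is made up for illustration. It does not escape &, < or >, which still need their own entity references.

```python
def to_ascii_html(text):
    """Replace every non-ASCII character with a decimal &#number; reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

encoded = to_ascii_html("Zürich café")
print(encoded)
```

Here ü is character number 252 and é is number 233 in Unicode, so they come out as &#252; and &#233;.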

Page 43: LIS901N: HTML

http://openlib.org/home/krichel

Thank you for your attention!