ECT 250: Survey of e-commerce technology
Searching, images, frames, and markup languages
2
Searching the WWW
• Exploring the Web can be very time-consuming.
• Search engines and directories enable you to locate relevant web pages more quickly and efficiently.
• A search engine is software that allows you to type in keywords. The engine scans a database of Web pages and displays a list of pages that meet your criteria.
• A directory organizes Web pages into categories. You can click on appropriate categories until you find a Web page that matches your chosen topic.
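The keyword lookup described above can be sketched in a few lines. This is only an illustration under stated assumptions: the three pages, their text, and the helper names `build_index` and `search` are hypothetical, and real search engines are far more elaborate.

```python
# Hypothetical miniature page database: page address -> page text.
PAGES = {
    "climb.example/illinois": "rock climbing routes in illinois",
    "climb.example/wisconsin": "climbing and hiking in wisconsin",
    "cook.example/recipes": "recipes for climbing beans",
}

def build_index(pages):
    """Map each word to the set of pages that contain it."""
    index = {}
    for url, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
    return index

def search(index, *keywords):
    """Return the pages that contain every keyword, sorted by address."""
    results = None
    for keyword in keywords:
        hits = index.get(keyword.lower(), set())
        results = hits if results is None else results & hits
    return sorted(results or [])

INDEX = build_index(PAGES)
print(search(INDEX, "climbing"))               # all three pages, one of them irrelevant
print(search(INDEX, "climbing", "wisconsin"))  # a single, more relevant page
```

Adding a second keyword shrinks the result list, which is the effect the searching tips rely on.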
3
Search engines/directories
• Altavista (http://www.altavista.com)
• Excite (http://www.excite.com)
• DirectHit (http://www.directhit.com/)
• Fast Search (http://www.ussc.alltheweb.com/)
• Go (http://www.go.com)
• Google (http://www.google.com)
• HotBot (http://www.hotbot.com)
• Northern Light (http://www.northernlight.com)
• Yahoo (http://www.yahoo.com)
• Web Crawler (http://www.webcrawler.com)
4
Naïve searches
• A single keyword search can yield thousands of sites, many of which are irrelevant.
  Example: A search for climbing yields 2,400,000 hits.
• Multiple keywords can help.
  Example: Illinois, Wisconsin, climbing yields only 32,500 hits.
• To save time and effort, it pays to construct a more sophisticated search that will yield fewer hits with a higher percentage of relevant pages.
5
Searching tips
• Use a directory to find information on a general topic. Use keywords in a search engine for specific information or narrow topics.
• Use the searching tips to construct a precise query.
• Use multiple, specific keywords and synonyms.
• Use advanced search features to make your query more focused.
• Try multiple search engines/directories or use a meta-search engine (e.g., DogPile).
• Use a specialized search engine (e.g., a business search engine).
6
Advanced search options
• Special operators (and, or, not, near)
• Search for phrases, not just keywords
• Domain-specific searches: include or exclude pages based on their domain
• Specify the language of the search
• Page-specific searches: pages that link to or are similar to a given page
• Set a bound on the date of the most recent update
• Specify whether the site contains images, audio, or visual information
Example: www.google.com
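The and, or, and not operators correspond to set intersection, union, and difference over the pages that match each keyword. The sketch below assumes a hypothetical index of page numbers; it does not model the near operator or any particular engine's query syntax.

```python
# Hypothetical index: keyword -> numbers of the pages that contain it.
INDEX = {
    "climbing": {1, 2, 3, 4},
    "illinois": {1, 5},
    "wisconsin": {2, 5},
    "gym": {3},
}

def pages(word):
    """Pages containing the word (empty set if the word is unknown)."""
    return INDEX.get(word, set())

# climbing AND (illinois OR wisconsin)
and_or = pages("climbing") & (pages("illinois") | pages("wisconsin"))

# climbing NOT gym
and_not = pages("climbing") - pages("gym")

print(sorted(and_or))   # [1, 2]
print(sorted(and_not))  # [1, 2, 4]
```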
7
Limitations
Search engines examine only a fraction of the web pages available on the World Wide Web.
A study released in 1998 estimated that the best engines indexed only 33% of the publicly indexable Web. The 1999 follow-up study found the coverage had decreased to only 16%.
More important are the techniques used by the search engine in ranking and updating pages.
8
Loading efficiency
• Most Web pages contain graphical images to add interest, make navigation easier, and convey necessary information.
• Most Web users will wait only a short time for a page to load, so efficiency considerations are important.
9
Graphic formats
• Graphic formats are usually referred to by their file extensions, such as .tif, .bmp, .gif, .jpg, and .png.
• Web page images are commonly in the .gif, .jpg, or .png format.
• Graphic formats are usually compressed. File compression can be either lossless, which does not decrease image quality, or lossy, which does lose image quality.
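The difference between lossless and lossy compression can be seen on a few bytes of made-up pixel data. This is a toy sketch, not how GIF or JPEG actually encode images: the pixel values are hypothetical, `zlib` stands in for a lossless compressor, and rounding each value to a multiple of 16 stands in for a lossy step.

```python
import zlib

# Hypothetical 8-bit grayscale pixel values standing in for image data.
pixels = bytes([10, 10, 10, 12, 200, 201, 199, 200] * 100)

# Lossless: decompressing returns exactly the original bytes.
packed = zlib.compress(pixels)
restored = zlib.decompress(packed)

# Lossy (toy illustration): round each value down to a multiple of 16.
# The fine detail thrown away here cannot be recovered later.
quantized = bytes((p // 16) * 16 for p in pixels)

print(restored == pixels)        # True: nothing was lost
print(quantized == pixels)       # False: detail was discarded
print(len(pixels), len(packed))  # original size versus compressed size
```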
10
GIF
• The Graphics Interchange Format (GIF) is the standard format for Web page images and is supported by all browsers that display images.
• It is an efficient, compressed format that allows up to 256 colors. It uses lossless compression.
• GIF images are always rectangular, but a transparent background can be used to make the images appear to be non-rectangular.
• GIF images can be interlaced, which means that the image is displayed initially at low resolution and its quality is increased as it downloads.
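Interlacing works by storing the rows of the image out of order, in four passes, so that a coarse version of the whole picture arrives first. The sketch below lists the row order used by the GIF89a format; the function name `gif_interlace_order` is hypothetical.

```python
def gif_interlace_order(height):
    """Return image row numbers in the order an interlaced GIF stores them."""
    # (first row, step) for each of the four interlace passes.
    passes = [(0, 8), (4, 8), (2, 4), (1, 2)]
    order = []
    for start, step in passes:
        order.extend(range(start, height, step))
    return order

print(gif_interlace_order(16))
# [0, 8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15]
```

After the first pass only every eighth row is available, which is why the image first appears at low resolution.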
11
JPEG
• The Joint Photographic Experts Group (JPEG) format is supported by most browsers that display images.
• JPEG images use lossy compression. The amount of compression ranges from 0% to 100%. The higher the compression, the smaller the file size and the lower the image quality.
• JPEG cannot be made transparent, but it can be specified as a progressive JPEG, which is loaded the same way as an interlaced GIF.
12
PNG
• The Portable Network Graphics (PNG) format is a new(ish) format created for Web page images.
• It is expected that it will eventually replace GIF.
• PNG images use a lossless compression that is more efficient than GIF's.
• It can use a color palette of 256 colors or fewer like GIF or support true color like JPEG images.
• PNG images can be interlaced and transparent.
13
Selecting a format
• The GIF or PNG format is usually used for line art such as clip art, logos, etc.
• JPEG is chosen for photographs because true color is desirable and selecting the amount of compression can result in smaller files.
• One approach is to save an image in several formats and choose the one with the smallest file size that produces acceptable quality.
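The "save in several formats and compare" approach amounts to a small selection rule. The sketch below assumes you have already saved the image and judged each result by eye; the file sizes, the quality verdicts, and the function name `choose_format` are hypothetical.

```python
def choose_format(candidates):
    """Pick the smallest file among those judged acceptable.

    candidates: list of (format_name, size_in_bytes, acceptable) tuples.
    Returns the format name, or None if nothing is acceptable.
    """
    acceptable = [c for c in candidates if c[2]]
    if not acceptable:
        return None
    return min(acceptable, key=lambda c: c[1])[0]

# Hypothetical measurements for one logo saved three ways.
logo = [("gif", 4200, True), ("jpeg", 3100, False), ("png", 3900, True)]
print(choose_format(logo))  # png
```

Here the JPEG is the smallest file but was judged unacceptable, so the next smallest acceptable file wins.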
14
Size considerations
• GIF, JPEG, and PNG images are all bitmapped formats, which means that the images are made of a rectangular grid of pixels.
• Web images are measured in pixels.
  Example: 500 x 55
• Do not make images too wide. Images that do not fit into a single screen will force scrolling.
• For efficiency considerations, you may choose to create a thumbnail image. This is a smaller version of an image that allows a preview of the picture. Example: LLBean
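Thumbnail dimensions are usually computed by shrinking both sides by the same factor so the picture keeps its shape. A minimal sketch, assuming a hypothetical limit of 100 pixels on the longer side and using the 500 x 55 example from above:

```python
def thumbnail_size(width, height, max_side):
    """Scale (width, height) so the longer side is at most max_side pixels."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough
    # Integer arithmetic keeps the proportions; never go below 1 pixel.
    return (max(1, width * max_side // longest),
            max(1, height * max_side // longest))

print(thumbnail_size(500, 55, 100))  # (100, 11)
```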
15
Frames
Frames allow more than one Web page to be displayed within the browser window at a time.
When frames are used, the page opened in the browser is a special page containing instructions about how the browser window is to be divided into separate regions and which page should be initially displayed in each region. This special page is called the frames page or frameset.
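To make the idea of a frames page concrete, the sketch below reads a small one and reports how the window is divided and which page opens in each region. The frames page itself, its file names (`contents.html`, `welcome.html`), and the class name `FrameLister` are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical frames page: a narrow navigation column and a main region.
FRAMES_PAGE = """
<html>
<frameset cols="25%,75%">
  <frame src="contents.html" name="nav">
  <frame src="welcome.html" name="main">
</frameset>
</html>
"""

class FrameLister(HTMLParser):
    """Collect the window layout and the page first shown in each region."""

    def __init__(self):
        super().__init__()
        self.layout = None
        self.frames = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "frameset":
            self.layout = attrs.get("cols") or attrs.get("rows")
        elif tag == "frame":
            self.frames.append((attrs.get("name"), attrs.get("src")))

lister = FrameLister()
lister.feed(FRAMES_PAGE)
print(lister.layout)  # 25%,75%
print(lister.frames)  # [('nav', 'contents.html'), ('main', 'welcome.html')]
```

Note that the frames page holds no visible content of its own, only the layout and the addresses of the pages to load.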
16
Navigating with frames
When frames are used, clicking on a link in one frame can:
• Change the contents of that frame
• Change the contents of a different frame
• Display a page without using the frames page
An application of frames is for a table of contents or a navigation bar. Frames allow the contents or navigation bar to be visible at all times.
17
Examples
Sites that use frames:
• Macromedia: www.macromedia.com
• National Discount Brokers: www.ndb.com
• XSL Tutorial: http://www.zvon.org/xxl/XSLTutorial/Books/Book1/index.html
• A personal page: Jim Jacobson
Some sites that do not use frames:
• Amazon: www.amazon.com
• DePaul CTI: www.cs.depaul.edu
• Gap: www.gap.com
• NY Times: www.nytimes.com
18
Frames: good or evil?
There is a significant controversy about whether the use of frames is a good or bad thing.
What are some of the issues surrounding frames?
For a longer discussion of some of the issues see:
• Aren’t frames bad?
  http://www.gooddocuments.com/techniques/areframesbad.htm
• Web design: frames – good or bad?
  http://dionaea.com/web/frames.html
19
Some problems with frames
• Search engines do not deal well with frames
• Printing becomes more difficult
• Saving pages is more complicated
• Creating browser bookmarks may not work
• Frames can require a high screen resolution
Why use frames at all?
20
Benefits of frames
• Navigation can be easier
• Easier updating of pages
Many of the problems given on the previous page are technology issues. Once a solution is found, frames may become more attractive.
Example: MS IE 5.0 supports frames better than previous versions.
21
Conclusions about frames
• Use frames only when the benefits outweigh the disadvantages.
• Tables or shared borders can be used instead of frames to place a navigation bar, table of contents, or other item on the edge of the page.
• Frames have become much less popular at large web sites.
22
Markup languages
• FrontPage is an HTML editor.
• HTML stands for hypertext markup language.
• It is an example of a markup language.
• Historically, markup referred to the annotations and handwritten notes found on manuscript pages that tell a typist how a particular page should be laid out or typeset.
• Electronic markup languages use tags to govern the display, formatting, and organization of text elements.
23
Three markup languages
Three markup languages are of particular interest:
1. SGML (Standard Generalized Markup Language) is the parent language from which the other two are derived. It is a meta language used to define other markup languages.
2. HTML (Hypertext Markup Language)
3. XML (Extensible Markup Language) is another descendant of SGML. It defines data structures important for a wide range of data exchange activities.
24
HTML
An HTML document contains both document content and tags.
• The content consists of all the information that appears in the browser window, including text, graphics, and video.
• Tags are the HTML codes that specify how the document should be formatted.
Example: http://facweb.cs.depaul.edu/asettle/
25
HTML tags
• Each HTML tag is enclosed in angle brackets.
• Two-sided HTML tags come in pairs.
  The general form of a two-sided tag is:
    <tagname properties>Content</tagname>
  The opening tag is <tagname properties>.
  The closing tag is </tagname>.
• Some HTML tags are one-sided, requiring only the opening tag.
• Tags are not case-sensitive.
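These rules can be observed with the HTML parser in Python's standard library. In the sketch below the HTML fragment and the class name `TagLogger` are made up for illustration; the parser reports tag names in lower case no matter how they were typed, and the one-sided `<BR>` tag produces an opening event with no matching close.

```python
from html.parser import HTMLParser

class TagLogger(HTMLParser):
    """Record every opening and closing tag the parser encounters."""

    def __init__(self):
        super().__init__()
        self.opened = []
        self.closed = []

    def handle_starttag(self, tag, attrs):
        self.opened.append(tag)

    def handle_endtag(self, tag):
        self.closed.append(tag)

logger = TagLogger()
logger.feed('<P ALIGN="center">First line<BR>second line</P>')
print(logger.opened)  # ['p', 'br']
print(logger.closed)  # ['p']
```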
26
Types of tags
There are a large number of tags. Some examples:
• Document tags: specify the parts of the document such as the heading, title, body.
  <title></title>, <html></html>
• Text structure tags: determine the layout of the text found in the body of the document.
  <h1></h1>, <p></p>, <br>
• Style tags: specify how text will be shown by the browser.
  <center></center>, <em></em>
• Image tag: <img src="name" other-attributes>
• Anchor tag: <a href="URL"></a>
27
The meta tag
Search engines catalog sites by following links from page to page and saving identification information for each page visited.
The main HTML element that interacts with search engines is the Meta tag.
Using the Meta tag you can list information about your page that allows a search engine to better classify the contents of your page.
28
Attributes of the meta tag
The Meta tag has two attributes that should always be used:
1. The Name attribute identifies the type of Meta tag you are including.
2. The Content attribute provides information the search engine will be cataloging about your site.
Example:
<Meta Name="keywords" Content="algorithms, complexity, quantum, information, retrieval, kolmogorov, security, arrays, cryptography, faculty, combinatorics">
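A program that catalogs pages might read the Name and Content attributes roughly as follows. This is a sketch under stated assumptions: the page is a shortened, hypothetical version of the example above, the class name `MetaReader` is made up, and real search engines may treat keywords quite differently.

```python
from html.parser import HTMLParser

class MetaReader(HTMLParser):
    """Collect name/content pairs from the Meta tags of a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attrs = dict(attrs)
            name = attrs.get("name")
            content = attrs.get("content")
            if name and content is not None:
                self.meta[name.lower()] = content

# Hypothetical page with a shortened keyword list.
PAGE = ('<html><head><Meta Name="keywords" '
        'Content="algorithms, complexity, cryptography"></head></html>')

reader = MetaReader()
reader.feed(PAGE)
keywords = [word.strip() for word in reader.meta["keywords"].split(",")]
print(keywords)  # ['algorithms', 'complexity', 'cryptography']
```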
29
History of HTML
• HTML 1.0: Introduced in 1991 by Berners-Lee. At that time there was no standard for HTML.
• HTML 2.0: Released in 1995. Began to move to a standard. Released at the same time were MS IE 2.0 and Netscape’s Navigator 2.0.
Recall that the World Wide Web Consortium (W3C) serves as a leader in maintaining Web standards and common protocols. It was founded in 1994.
30
History of HTML
• HTML 3.2: Introduced in 1997 by the W3C. Supported tables, applets, and text flow around images.
• HTML 4.0: Released by W3C in 1997. Included support for cascading style sheets, and added international features such as the ability to render text right to left.
• HTML 4.01: Released by W3C in 1999. Supported more multimedia options and scripting languages, and made documents more accessible to users with disabilities.
31
History of HTML
• XHTML 1.0: Released by W3C in January 2000 as a reformulation of HTML 4 in XML.
• XHTML Basic: Released in December 2000 by W3C, incorporating elements of XML into HTML to allow development on a wider set of devices such as TVs, PDAs, pagers, and cellular phones.
32
SGML
• Work on the definition of a Generalized Markup Language for describing electronic documents and their format was begun in the 1960s.
• In 1986, the International Organization for Standardization (ISO) adopted a version of the standard called Standard Generalized Markup Language.
• SGML includes a standard that defines device-independent and machine-independent methods for representing electronic documents.
33
Advantages of SGML
• SGML is good for organizations with special or complex requirements for the management of documents. Examples: U.S. DOD, HP
• It is stable since it was standardized in 1986.
• It is platform independent and will outlive most current applications.
• It supports user-defined tags and architecture.
Why is SGML not used by everyone?
34
Disadvantages of SGML
• SGML tools are relatively expensive compared to HTML tools.
• SGML has a steep learning curve.
• It is costly to set up and maintain, requiring extensive training and expertise.
• Creating document type definitions with SGML can be expensive in terms of human labor.
35
XML
• Extensible Markup Language is also derived from SGML, although it is newer than HTML.
• It represents an effort to define what information is on a Web page. This contrasts with HTML where the emphasis is on the format of the data.
• XML allows designers to easily describe and deliver structured data from any application in a standard, consistent way.
36
Idea behind XML
• XML is both a markup language and meta markup language.
• XML allows you to create new tags for each type of document you are storing.
• In this way, XML stores information in a structured manner.
• It is also interoperable with both HTML and SGML. This allows data stored in XML to be displayed (using HTML) and integrated with SGML documents.
37
XML example I
<article>
  <title>Some XML</title>
  <date>April 25, 2001</date>
  <author>
    <mname>Amber</mname>
    <lname>Settle</lname>
  </author>
  <summary>Sample XML</summary>
  <content>XML is not for displaying information but for managing information.</content>
</article>
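Because the tags describe what each piece of data is, a program can pull out individual fields by name. The sketch below reads the article example with Python's standard XML library; the variable names are hypothetical and nothing here is specific to any particular XML tool.

```python
import xml.etree.ElementTree as ET

ARTICLE = """
<article>
  <title>Some XML</title>
  <date>April 25, 2001</date>
  <author>
    <mname>Amber</mname>
    <lname>Settle</lname>
  </author>
  <summary>Sample XML</summary>
  <content>XML is not for displaying information but for managing information.</content>
</article>
"""

root = ET.fromstring(ARTICLE.strip())

# Each field is looked up by the tag name the document author invented.
print(root.tag)                       # article
print(root.findtext("title"))         # Some XML
print(root.findtext("author/lname"))  # Settle
```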
38
XML example II
<list>
  <employee>
    <fname>Simone</fname>
    <lname>Settle</lname>
    <ssn>123-00-5454</ssn>
    <salary>70000</salary>
    <position>network administrator</position>
    <hire-year>1999</hire-year>
  </employee>
  <employee>
    <fname>Joon</fname>
    <lname>Elam</lname>
    <ssn>456-88-7654</ssn>
    <salary>62000</salary>
    <position>web designer</position>
    <hire-year>2000</hire-year>
  </employee>
</list>
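A list of repeated records like this one behaves much like a small database table. The sketch below, with hypothetical variable names and a shortened copy of the data (the ssn and position fields are left out), turns each employee into a dictionary and then filters and totals the records.

```python
import xml.etree.ElementTree as ET

STAFF = """<list>
  <employee>
    <fname>Simone</fname><lname>Settle</lname>
    <salary>70000</salary><hire-year>1999</hire-year>
  </employee>
  <employee>
    <fname>Joon</fname><lname>Elam</lname>
    <salary>62000</salary><hire-year>2000</hire-year>
  </employee>
</list>"""

root = ET.fromstring(STAFF)

# One dictionary per employee: tag name -> text inside the tag.
employees = [{field.tag: field.text for field in emp}
             for emp in root.findall("employee")]

names = [e["fname"] + " " + e["lname"] for e in employees]
total_salary = sum(int(e["salary"]) for e in employees)
hired_since_2000 = [e["fname"] for e in employees if int(e["hire-year"]) >= 2000]

print(names)             # ['Simone Settle', 'Joon Elam']
print(total_salary)      # 132000
print(hired_since_2000)  # ['Joon']
```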