Embedded Systems



Preface

Embedded Systems are dedicated systems that do specialized tasks and often

contain hardware, software and many other subsystems. These computing

components are embedded in many applications and situations that cover virtually

every aspect of our daily lives, ranging from the use of simple household appliances

(such as alarm clocks, toaster ovens, room temperature control panels, microwave

ovens) to transportation systems (e.g., automobiles, traffic lights, airplanes), and many

communication, recreation and entertainment products (e.g., cellular phones,

electronic organizers, exercise machines, video games, TV set-top boxes). To

appreciate these advances, and to productively contribute to future advances of such

systems, a critical appreciation of the underlying principles of embedded systems is

necessary. The goal of this course is to develop a comprehensive understanding of the

basic concepts involving the development of embedded systems. The students will

develop an appreciation of the technology capabilities and limitations of the software

and hardware components for building embedded systems.

Unit 1 gives an introduction to embedded systems and explores the

characteristics, functions, classification and elements of embedded systems. Unit 2

takes up one of the elements of embedded systems, namely microcontrollers, and deals with

the characteristics and applications of microcontroller systems. Unit 3 introduces the

basic tools required to build an embedded program along with an example program.

Unit 4 deals with embedded operating system elements and also one subtype of an

embedded operating system, namely the real-time operating system. This course is

introductory in nature and deals with the basic topics of embedded systems. The students

are encouraged to explore each of the topics in detail.



UNIT 1

Introduction to Embedded systems

Contents

1.1 Introduction

1.2 Objectives

1.3 Embedded Systems and their Characteristics

1.3.1 What is an Embedded System?

1.3.2 History and Future

1.3.3 Functions of Embedded Systems

1.3.4 Embedded System Types

1.3.5 Example Embedded Systems

1.3.6 Characteristics of Embedded Systems

1.4 Real Time Systems

1.5 Classification and Requirements of Embedded Systems

1.5.1 Classification of Embedded Systems

1.5.2 Typical Embedded System Constraints

1.5.3 Embedded Processor Types

1.5.4 Programming Languages and Environments

1.5.5 Variations in Embedded Systems

1.5.6 Implementations of Embedded Systems

1.6 Embedded Systems Design World - a View

1.7 Major Components of an Embedded System

1.8 Summary

1.9 Self Test

1.10 Questions



1.1 Introduction

An Embedded system is a specialized computer system that is part of a larger

system or machine. It is embedded into some device for some specific purpose other

than to provide general purpose computing. A typical embedded system consists of a

single-board microcomputer with software in ROM, which starts running some

special purpose application program as soon as it is turned on and will not stop until it

is turned off (if ever).

An embedded system may include some kind of operating system but often it

will be simple enough to be written as a single program. It will not usually have any

of the normal peripherals such as a keyboard, monitor, serial connections, mass

storage, etc. or any kind of user interface software unless these are required by the

overall system of which it is a part. Often it must provide real-time response.

Many of the electronic devices in our kitchens (bread machines, food

processors, and microwave ovens), living rooms (televisions, stereos, and remote

controls), and workplaces (fax machines, pagers, laser printers, cash registers, and

credit card readers) are embedded systems.

1.2 Objectives

In this unit you will learn

What are embedded systems and their characteristics?

Functions and classification of embedded systems

Types of embedded systems

Real time systems

Embedded system design requirements

Embedded system implementation types



1.3 Embedded systems and their characteristics

1.3.1 What is an Embedded System?

An embedded system is a combination of computer hardware and software,

and perhaps additional mechanical or other parts designed to perform a specific

function. A good example is the microwave oven. Almost every household has one,

and tens of millions of them are used every day, but very few people realize that a

processor and software are involved in the preparation of their lunch or dinner.

A general-purpose definition of embedded systems is that they are devices

used to control, monitor or assist the operation of equipment, machinery or plant.

"Embedded" reflects the fact that they are an integral part of the system. In

many cases their embeddedness may be such that their presence is far from obvious to

the casual observer and even the more technically skilled might need to examine the

operation of a piece of equipment for some time before being able to conclude that an

embedded control system was involved in its functioning. At the other extreme a

general-purpose computer may be used to control the operation of a large complex

processing plant, and its presence will be obvious.

Embedded can mean:

Instructions (code or logic) are permanently loaded into the processor.

Dedicated purpose (not general purpose).

The components of the system are internal to the device that contains them,

and not necessarily visible, even to experts.

System can mean:

A set of one or more microelectronic devices with some, or all, of the

capacities of a computer.

A processor (such as a mainframe computer, minicomputer or PC) dedicated

to one purpose, such as controlling a plant's operations.



An embedded system suits a product that does not need a full computer system but

does need some computing. That computing capability can be programmed onto a chip

and fitted into the existing system. For example, in a washing machine an embedded

system counts the timer ticks that have elapsed in order to stop the dryer automatically.

Because resources such as RAM and ROM are limited in an embedded system, the

entire program logic is often implemented on a single processor chip. When the

computational job becomes complex, a Real Time Operating System (RTOS) is usually

made part of the embedded system. Examples: aircraft turbulence systems, printers,

barcode scanners. In some cases, these embedded systems are connected by some sort

of communication network, but that is certainly not a requirement.
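
The washing-machine example can be sketched as a simple tick counter. The following is only an illustrative simulation written in Python for readability (real firmware would more likely be written in C and driven by a hardware timer); the class name DryerTimer, the method tick, and the tick count are assumptions made for this sketch.

```python
class DryerTimer:
    """Counts timer ticks and switches the dryer off after a preset number."""

    def __init__(self, ticks_to_run):
        self.remaining = ticks_to_run      # ticks left before the dryer stops
        self.dryer_on = ticks_to_run > 0

    def tick(self):
        """Called once per timer tick (in hardware: from a timer interrupt)."""
        if self.dryer_on:
            self.remaining -= 1
            if self.remaining == 0:
                self.dryer_on = False      # stop the dryer automatically
        return self.dryer_on


timer = DryerTimer(ticks_to_run=3)
states = [timer.tick() for _ in range(5)]
print(states)  # prints [True, True, False, False, False]
```

The whole "program" is a counter and one comparison, which is why such logic fits comfortably on a single small chip.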

This is in direct contrast to the personal computer. It too is comprised of

computer hardware and software and mechanical components (disk drives, for

example). However, a personal computer is not designed to perform a specific

function. Rather, it is able to do many different things. Many people use the term

general-purpose computer to make this distinction clear. As shipped, a general-

purpose computer is a blank slate; the manufacturer does not know what the customer

will do with it. One customer may use it for a network file server, another may use it

exclusively for playing games, and a third may use it to write and execute some

programs.

A general-purpose computer is itself made up of numerous embedded systems.

For example, the computer consists of a keyboard, mouse, video card, modem, hard

drive, floppy drive, and sound card, each of which is an embedded system. Each of

these devices contains a processor and software and is designed to perform a specific

function. For example, the modem is designed to send and receive digital data over an

analog telephone line. That's it. And all of the other devices can be summarized in a

single sentence as well.

If an embedded system is designed well, the existence of the processor and

software could be completely unnoticed by a user of the device. Such is the case for a

microwave oven, VCR, or alarm clock. In some cases, it would even be possible to



build an equivalent device that does not contain the processor and software. This

could be done by replacing the combination with a custom integrated circuit that

performs the same functions in hardware. However, a lot of flexibility is lost when a

design is hard-coded in this way. It is much easier, and cheaper, to change a few lines

in software than to redesign a piece of custom hardware.

A definition of an embedded system is as follows:

An embedded system is a system whose principal function is not computational, but

which is controlled by a computer embedded within it.

The computer is likely to be a microprocessor or microcontroller. The word

embedded implies that it lies inside the overall system, hidden from view, forming an

integral part of a greater whole. One consequence of this is that the user may be

unaware of the computer's existence. Another is that the computer is usually

purpose-designed, or at least customized, for the single function of controlling its system. If

removed from the system it would be an odd assortment of printed circuit boards

and/or integrated circuits, recognizable only to the specialist as something that

might be called a computer.

Applying this definition tells us that a personal computer, even though it

contains a microprocessor, is not an embedded system. Its end function is to

compute. Even if the same computer were connected to a set of instruments that it

then controlled, that would not be an embedded system. If, however, the same

computer was built permanently into an identifiable system, and customized so that its

sole purpose was to control the one system (which may mean losing such apparently

essential features as its case, keyboard, screen, or disk drives), then it would form part

of an embedded system.

Embedded systems come in many forms. They are extremely common in the

home, the motor vehicle and the workplace. Most modern domestic appliances –

washing machines, dishwashers, ovens, central heating and burglar alarms – are

embedded systems. The motorcar is full of them, in engine management, security (for

example locking and anti-theft devices), air-conditioning, brakes, radio, and so on.

They are found across industry and commerce, in machine control, factory



automation, robotics, electronic commerce and office equipment. The list has almost

no end, and it continues to grow.

Figure 1.1 re-expresses the embedded system definition as a simple block

diagram. There is a set of inputs from the controlled system. Based on information

supplied from these inputs, the controller computes certain outputs, which are

connected to actuators within the system. There may be interaction with a user, e.g.

via keypad and display, and there may be interaction with other sub-systems

elsewhere, though neither of these is essential to the general concept.

Figure 1.1: The essence of the embedded system.
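
The loop described for Figure 1.1 (inputs from the controlled system, a controller that computes outputs, outputs that drive actuators) can be sketched as follows. This is a minimal Python simulation, assuming a hypothetical on/off room-temperature controller; the names controller_step and run, the setpoint, and the readings are all invented for illustration.

```python
def controller_step(inputs, setpoint):
    """Compute actuator commands from the latest sensor inputs."""
    heater_on = inputs["temperature"] < setpoint
    return {"heater": heater_on}


def run(readings, setpoint):
    """Run one controller step per sensor reading and collect the commands."""
    commands = []
    for temperature in readings:
        inputs = {"temperature": temperature}         # inputs from the controlled system
        outputs = controller_step(inputs, setpoint)   # controller computes the outputs
        commands.append(outputs["heater"])            # outputs drive the actuator
    return commands


print(run([18.0, 19.5, 21.0, 20.5], setpoint=20.0))  # prints [True, True, False, False]
```

User interaction and links to other sub-systems are left out of the sketch because, as noted above, neither is essential to the general concept.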

Embedded System Organization

Figure 1.2 shows one possible organization for an embedded system.



Figure 1.2: An embedded system encompasses the CPU as well as many other resources.

In addition to the CPU and memory hierarchy, there are a variety of interfaces

that enable the system to measure, manipulate, and otherwise interact with the

external environment. Some differences with desktop computing may be:

The human interface may be as simple as a flashing light or as complicated as

real-time robotic vision.

The diagnostic port may be used for diagnosing the system that is being

controlled -- not just for diagnosing the computer.

Special-purpose Field Programmable Gate Array (FPGA), Application

Specific IC (ASIC), or even non-digital hardware may be used to increase

performance or safety.

Software often has a fixed function, and is specific to the application.

In addition to the emphasis on interaction with the external world, embedded

systems also provide functionality specific to their applications. Instead of executing

spreadsheets, word processing and engineering analysis, embedded systems typically

execute control laws, finite state machines, and signal processing algorithms. They

must often detect and react to faults in both the computing and surrounding

electromechanical systems, and must manipulate application-specific user interface

devices.
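
A finite state machine of the kind mentioned above can be written as a transition table. The sketch below is a Python illustration only; the states and events are assumed for a hypothetical microwave oven controller and are not taken from any real product.

```python
# Transition table: (current state, event) -> next state.
TRANSITIONS = {
    ("idle", "start"): "cooking",
    ("cooking", "door_open"): "paused",
    ("cooking", "timer_done"): "idle",
    ("paused", "door_close"): "cooking",
    ("paused", "cancel"): "idle",
}


def next_state(state, event):
    """Return the next state; events with no table entry leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run_events(events, state="idle"):
    """Feed a sequence of events through the machine and return the visited states."""
    visited = [state]
    for event in events:
        state = next_state(state, event)
        visited.append(state)
    return visited


print(run_events(["start", "door_open", "door_close", "timer_done"]))
# prints ['idle', 'cooking', 'paused', 'cooking', 'idle']
```

Keeping the behaviour in a table rather than in nested conditionals makes it easy to check that every state handles every event that matters.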



1.3.2 History and Future

Given the definition of embedded systems earlier in this unit, the first such

systems could not possibly have appeared before 1971. That was the year Intel

introduced the world's first microprocessor. This chip, the 4004, was designed for use

in a line of business calculators produced by the Japanese company Busicom. In 1969,

Busicom asked Intel to design a set of custom integrated circuits, one for each of their

new calculator models. The 4004 was Intel's response. Rather than design custom

hardware for each calculator, Intel proposed a general-purpose circuit that could be

used throughout the entire line of calculators. This general-purpose processor was

designed to read and execute a set of instructions (software) stored in an external

memory chip. Intel's idea was that the software would give each calculator its unique

set of features.

The microprocessor was an overnight success, and its use increased steadily

over the next decade. Early embedded applications included unmanned space probes,

computerized traffic lights, and aircraft flight control systems. In the 1980s,

embedded systems quietly rode the waves of the microcomputer age and brought

microprocessors into every part of our personal and professional lives.

Very early in their development, and certainly by the end of the 1970s, two

trends were emerging for these remarkable devices. One was to scale down, in size if

not computing power, the general-purpose computer; this led quickly to the first

desktop machines. The other, much more revolutionary, was to place the

microprocessor in products that apparently had nothing to do with computing.

They began to find their way into photocopiers, grocery scales, washing machines,

and a host of other products, wherever there was a requirement to exercise some

control function. While the first trend led to an inexorable demand for faster and

bigger processors with increasingly sophisticated mathematical capability, the second

placed lower demands on computational power and speed. It wanted physically small

and cheap devices, with as much functionality of the system as possible squeezed onto

one integrated circuit.



It seems inevitable that the number of embedded systems will continue to

increase rapidly. Already there are promising new embedded devices that have

enormous market potential: light switches and thermostats that can be controlled by a

central computer, intelligent air-bag systems that don't inflate when children or small

adults are present, palm-sized electronic organizers and personal digital assistants

(PDAs), digital cameras, and dashboard navigation systems. Clearly, individuals who

possess the skills and desire to design the next generation of embedded systems will

be in demand for quite some time.

1.3.3 Functions of Embedded Systems

By now you must have realized the importance of the Embedded systems and

their functions. The following are some of the functions expected of embedded

systems:

• Monitoring of the environment

–Reading the sensor inputs

–Processing of the inputs

–Displaying

• Controlling the environment

–Generating and transmitting commands for the actuators

• Transforming the information

–e.g. Data compression/decompression


Types of Embedded System Functions

Control Laws

• PID (Proportional-Integral-Derivative) control

• Fuzzy logic

Sequencing logic

• Finite state machines

• Switching modes between control laws

Signal processing

• Multimedia data compression

• Digital filtering



Application-specific interfacing

• Buttons, bells, lights

• High-speed I/O

Fault response

• Detection & reconfiguration

• Diagnosis
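
As an illustration of the first control law in the list above, a discrete PID controller can be sketched in a few lines. This is a textbook-style Python sketch, not a tuned or production controller; the class name PID, the gains, and the sample values are assumptions chosen only to make the arithmetic easy to follow.

```python
class PID:
    """Discrete PID: u = Kp*e + Ki*sum(e*dt) + Kd*(e - e_prev)/dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0       # running sum of error * dt
        self.prev_error = 0.0     # error from the previous sample

    def update(self, setpoint, measurement):
        """Return the control output for one sample period."""
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


pid = PID(kp=2.0, ki=0.5, kd=0.1, dt=1.0)
u1 = pid.update(setpoint=10.0, measurement=8.0)   # error 2: 4.0 + 1.0 + 0.2
u2 = pid.update(setpoint=10.0, measurement=9.0)   # error 1: 2.0 + 1.5 - 0.1
print(round(u1, 3), round(u2, 3))  # prints 5.2 3.4
```

The proportional term reacts to the present error, the integral term to accumulated past error, and the derivative term to the rate of change of the error.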

1.3.4 Embedded System Types

1. General Computing

• Applications similar to desktop computing, but in an embedded package

• Video games, set-top boxes, wearable computers, automatic tellers

2. Control Systems

• Closed-loop feedback control of real-time system

• Vehicle engines, chemical processes, nuclear power, flight control

3. Signal Processing

• Computations involving large data streams

• Radar, Sonar, video compression

4. Communication & Networking

• Switching and information transmission

• Telephone system, Internet

1.3.5 Example Embedded Systems

Figure 1.3: Block diagram of a DVD player



Figure 1.3 shows the block diagram of a DVD player, which is an embedded

system. Figure 1.4 shows the general block diagram of a typical embedded system.

Figure 1.4: A typical embedded system

Table 1.1: Four example embedded systems with approximate attributes.



In order to make the discussion more concrete, we shall discuss four example

systems (Table 1.1). Each example portrays a real system in current production, but

has been slightly genericized to represent a broader cross-section of applications as

well as protect proprietary interests. The four examples are a Signal Processing

system, a Mission Critical control system, a Distributed control system, and a Small

consumer electronic system. The Signal Processing and Mission Critical systems are

representative of traditional military/aerospace embedded systems, but in fact are

becoming more applicable to general commercial applications over time.

Using these four examples to illustrate points, the following sections describe

the different areas of concern for embedded system design: computer design, system-

level design, life-cycle support, business model support, and design culture

adaptation.

Desktop computing design methodology and tool support is to a large degree

concerned with initial design of the digital system itself. To be sure, experienced

designers are cognizant of other aspects, but with the recent emphasis on quantitative

design, life-cycle issues that aren't readily quantified could be left out of the

optimization process. However, such an approach is insufficient to create embedded

systems that can effectively compete in the marketplace. This is because in many

cases the issue is not whether design of an immensely complex system is feasible, but

rather whether a relatively modest system can be highly optimized for life-cycle cost

and effectiveness.

While traditional digital design (CAD, Computer Aided Design) tools can

make a computer designer more efficient, they may not deal with the central issue --

embedded design is about the system, not about the computer. In desktop computing,

design often focuses on building the fastest CPU, then supporting it as required for

maximum computing speed. In embedded systems the combination of the external

interfaces (sensors, actuators) and the control or sequencing algorithms is of primary

importance. The CPU simply exists as a way to implement those functions. The

following experiment should serve to illustrate this point: ask a roomful of people

what kind of CPU is in the personal computer or workstation they use. Then ask the



same people which CPU is used for the engine controller in their car (and whether the

CPU type influenced the purchasing decision).

In high-end embedded systems, the tools used for desktop computer design are

invaluable. However, many embedded systems both large and small must meet

additional requirements that are beyond the scope of what is typically handled by

design automation. These additional needs fall into the categories of special computer

design requirements, system-level requirements, life-cycle support issues, business

model compatibility, and design culture issues.

Embedded Systems Products

Computer Peripheral Devices

–printer, multimedia subsystems, graphical subsystems…

Communication

–modems, fax, cellular phones…

Home Appliances

–CD player, VCR, microwave oven…

Control Systems

–automobile, robotics, satellite control…

1.3.6 Characteristics of embedded systems

Constituents of the embedded computer: hardware and software

As with all computer systems, the embedded computer is made up of hardware

and software, as symbolized in Figure 1.1. In the early days of microprocessors much

of the design time was spent on the hardware, in defining address decoding, memory

map, input/output and so on. When the hardware design was completed, a

comparatively simple program was developed, limited in size and complexity by

restricted program memory size, and the development tools available. Since then there

have been huge strides in hardware development. Much of the hardware system is

now contained on a single chip, in the form of a microcontroller, and developments in

memory technology allow the use of much longer and more sophisticated programs.

Hardware design of the computing core of the embedded system is now in many cases



viewed as a comparatively straightforward affair. The design attention has shifted to

some extent towards software development, with advanced languages and tools

available to develop sophisticated programs.

Timeliness

The microcontroller in an embedded system must be able to respond fast

enough to keep its operation within a safe region. This is a characteristic of operating

in 'real time'; the controller must be able to respond to inputs as they happen and

make responses within the time frame set by the controlled system. This style of

operation is different from the mode of operation, for example, of a personal

computer. While it may be annoying, you can tolerate waiting for your computer to

refresh the graphics display or complete a computation. You cannot tolerate waiting

while your car's antiskid braking system decides whether or not to apply the brakes!

Some embedded systems operate within absolutely rigid time demands; for others the

demands are less stringent. They all, however, exhibit the characteristics of

timeliness: a need for the designer to understand fully the time demands of the

controlled system and be responsive to them.

System interconnection

While some embedded systems clearly need only one controller, others are

likely to use several or many, each to control one sub-system. Necessary shared

information is then passed between them by a simple network, devised to suit the

needs of the overall system. A good example of this is the modern motor car. Though

each of the 'embedded sub-systems' in it may be controlled by one microcontroller,

they can all be linked together to form one overall interconnected system. This

approach is made more attractive due to the extremely low cost of most commercial

microcontrollers. A network of low-cost microcontrollers is often cheaper, and

simpler to develop, than a single complex computer undertaking many tasks. With the

advent of the Internet, a generation of Internet-compatible embedded systems is

emerging. The cooker, television and washing machine may soon be communicating

together! It is anticipated that within a few years even the most simple of devices may

be Internet-linked. The truly standalone device will then exist in a dwindling minority.



Reliability

Suppliers of software packages designed to run on Personal Computers release

them on the market knowing that they are likely to contain software errors (bugs). It is

vitally important to get them to market early, and fixes can always be distributed after

the faults have been discovered. Suppliers of most embedded systems cannot afford

this luxury. One significant software error in a car model could destroy the reputation

of the manufacturer for ever. Therefore the embedded system designer must develop a

good grasp of reliability issues, and how a reliable system can be achieved. This

implies good design procedures in both hardware and software, coupled with

systematic testing and commissioning.

The market-place

The market that the embedded system sells into is very competitive. As with

other 'hi-tech' markets, the challenge is increased greatly by the very rapid advances

of technology. New products may quickly be rendered obsolete by technological

change, and thus potentially have very short life cycles. This lays the stress on

excellent design and development strategy.

Adding all these features together, a second definition of the embedded system

now follows:

An embedded system is a microcontroller-based, software-driven, reliable, real-time

control system, autonomous, or human or network interactive, operating on diverse

physical variables and in diverse environments, and sold into a competitive and

cost-conscious market.

Thus embedded systems are application specific systems, meant for mass

production, usually with a static structure. They may be DSP systems, Real Time

Systems, Reactive systems or Distributed systems.

Summarizing the above features, we can say that an embedded application

should exhibit the following features:

Efficiency: fast and compact.

Reusability: encapsulate hardware-specific code; keep it modular and easy to

read.



Reliability: use simple and well-understood algorithms and structures.

Timing predictability: you can assure that deadlines are met.

1.4 Real Time Systems

One subclass of embedded systems is worthy of an introduction at this point.

As commonly defined, a real-time system is a computer system that has timing

constraints. In other words, a real-time system is partly specified in terms of its ability

to make certain calculations or decisions in a timely manner. The calculations are said

to have deadlines for completion. And for all practical purposes, a missed deadline is

just as bad as a wrong answer.

A real-time system is a system with performance deadlines on computations

and actions; that is, system correctness depends on the timeliness of the results. An

embedded system is a system that exists within a larger system. Real-Time Service is

the service to be delivered within a time interval dictated by the environment. In a

real-time computer system the correctness of system behavior depends NOT ONLY on the

logical results of computations, BUT ALSO on the physical time.

The issue of what happens if a deadline is missed is a crucial one. For

example, if the real-time system is part of an airplane's flight control system, it is

possible for the lives of the passengers and crew to be endangered by a single missed

deadline. However, if instead the system is involved in satellite communication, the

damage could be limited to a single corrupt data packet. The more severe the

consequences, the more likely it will be said that the deadline is "hard" and, thus, the

system a hard real-time system. Real-time systems at the other end of this continuum

are said to have "soft" deadlines.

All of the topics presented in this book are applicable to the designers of real-

time systems. However, the designer of a real-time system must be more diligent in

his work. He must guarantee reliable operation of the software and hardware under all

possible conditions. And, to the degree that human lives depend upon the system's

proper execution, this guarantee must be backed by engineering calculations and

descriptive paperwork.



Real-Time vs. On-Line

                         Real-Time         On-Line
Response time            strict            soft
Pacing                   environment       computer
Peak-load performance    predictable       degraded
Error detection          system            user
Safety                   often critical    non-critical
Redundancy               active            standby
Data integrity           short term        long term

Components of a real-time system

Generally a real-time system consists of:

Hardware

Operating System

Static data structure (in ROM)

–application program code

–initialization data

Dynamic data structure (in RAM)

–information about current and past computations

– other dynamic data about the computing environment

Typical Real-Time Applications

•Digital Controllers

–Automotive Controllers

–Industrial Automation

•High-Level Controllers

–Command and Control Systems

–Air Traffic Control Systems

•Signal Processing

•Real-Time Databases and Multimedia



Real Time System Development

Figure 1.5 shows the real time systems development method. Usually the

applications are developed on a host and deployed in the target system.


Figure 1.5: The real time system development

What is a Real Time Operating System?

A real-time operating system (RTOS) is an operating system that guarantees specific

response times to events (internal and external). The response usually falls within

some small upper limit of response time (typically milliseconds or microseconds). Usually

the functions of an RTOS include:

• task management

• memory management

• time management

• device drivers

• interrupt service

RTOS Examples:

POSIX, VxWorks, OS-9, pSOSystem, Linux, Eonics, Windows CE, QNX Neutrino

It is important to distinguish between a real-time system and a real-time

operating system (RTOS). The real-time system represents the set of all system

elements - the hardware, operating system, and applications - that are needed to meet

the system requirements. The RTOS is just one element of the complete real-time

system and must provide sufficient functionality to enable the overall real-time

system to meet its requirements.

[Figure 1.5 labels. Target system: application tasks, real-time OS (pOSEK), hardware (C167CR). Host: compiler, debugger, loader, simulator, shell, vxSim, etc.; WinNT OS (or Solaris); Pentium PC (SUN workstation). Other labels: Input, Output, RS-232, Ethernet.]

1.5 Classification and Requirements of Embedded Systems

1.5.1 Classification of Embedded Systems

A State is a condition that persists for an interval of real time, a section of the

timeline. An Event is an occurrence at a particular point in real time, at a cut of the

timeline. Any State Change is an Event. State information describes the state at a

given point of observation (itself an event). Event information describes the

differences of the states before and after the event, and the time of occurrence. Only

consequences of an event can be observed.

A deadline is a point in time at which a result should have been made available

by the computer system.

Soft deadline: if result still has some utility after the deadline

Strict deadline: if result has no utility after the deadline

Hard deadline: if missing of a strict deadline can have catastrophic consequences
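
The three deadline types above differ only in what a late result is worth. The Python sketch below makes that concrete under stated assumptions: the function name result_utility, the numeric scores, and the linear decay used for the soft case are all invented for illustration (the text only says a late soft result "still has some utility").

```python
def result_utility(kind, finish_time, deadline):
    """Return a rough utility score for a result.

    1.0 = full value, 0.0 = worthless, -1.0 stands in for a catastrophic outcome.
    """
    if finish_time <= deadline:
        return 1.0                                   # on time: full value for every kind
    if kind == "soft":
        lateness = finish_time - deadline
        return max(0.0, 1.0 - lateness / deadline)   # assumed linear decay after the deadline
    if kind == "strict":
        return 0.0                                   # late result has no utility
    if kind == "hard":
        return -1.0                                  # late result may be catastrophic
    raise ValueError(f"unknown deadline kind: {kind}")


for kind in ("soft", "strict", "hard"):
    print(kind, result_utility(kind, finish_time=12, deadline=10))
```

All three kinds score the same when the deadline is met; they diverge only once it has been missed.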

Hard vs. Soft systems

Hard Systems = Deadlines must not be violated

–A system is called a hard system when it has to meet at least one hard

deadline. In such systems, outputs produced after their deadlines are useless, and

system failure occurs if a deadline is missed.

More precisely: if all deadlines are satisfied, system failure will not occur; otherwise

system failure may occur.

Examples: fly-by-wire, drive-by-wire, nuclear reactor control

For hard systems performance measures are Boolean (on-time / failed) and

worst-case execution time is the critical factor



Soft Systems = Deadlines may occasionally be violated

–A system is called a soft system when it has to meet ONLY soft

deadlines. In such systems the outputs produced after their deadlines may still be

useful or have reduced usefulness or validity. System Failure does not occur if a

deadline is missed.

Examples: multimedia systems, elevator controller

For soft systems, performance measures such as mean response time and number of

frames lost are continuous functions. The mean response time is usually the major

factor.
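
The two styles of performance measure can be compared on the same set of response times. This Python sketch uses made-up sample values and hypothetical function names. Note one assumption in particular: it checks the worst observed response time, which is only a lower bound on the true worst-case execution time and therefore cannot by itself prove that a hard deadline is always met.

```python
def hard_performance(response_times, deadline):
    """Boolean measure: True only if the worst observed response meets the deadline."""
    return max(response_times) <= deadline


def soft_performance(response_times, deadline):
    """Continuous measures: mean response time and fraction of missed deadlines."""
    mean = sum(response_times) / len(response_times)
    missed = sum(1 for t in response_times if t > deadline)
    return mean, missed / len(response_times)


samples_ms = [4, 6, 5, 12, 3]        # assumed response times in milliseconds
print(hard_performance(samples_ms, deadline=10))   # prints False
print(soft_performance(samples_ms, deadline=10))   # prints (6.0, 0.2)
```

The same data set fails outright as a hard system yet looks acceptable as a soft one, with a 6 ms mean and one late response in five.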

Hard real-time vs. soft real-time

Characteristic           Hard RT            Soft RT (on-line)
Response time            hard - required    soft - desired
Peak-load performance    predictable        degraded
Control of pace          environment        computer
Safety                   often critical     non-critical
Size of data files       small/medium       large
Redundancy type          active             checkpoint-recovery
Data integrity           short-term         long-term
Error detection          autonomous         user-assisted

Fail-safe vs. Fail operational

Fail-Safe = There exists a default state that is safe

Fail-safe systems usually have a safe state in the environment that can be

reached in case of a system failure. In such systems error detection is more

important than diagnosis. If the system gets confused, it enters the fail-safe

state. In such systems the computer needs to have a high error detection

coverage.

Examples: ABS, Traffic lights

Nuclear reactor: shut-down & keep core covered

Elevator controller: stop and set the brake
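The fail-safe idea, that detecting an error matters more than diagnosing it, can be sketched as a small state machine. This is only an illustration under stated assumptions: the state names, the `control_step` function and the plausibility limits `READING_MIN` and `READING_MAX` are hypothetical and are not taken from any particular system.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical controller states; STATE_SAFE is the default safe state. */
typedef enum { STATE_RUNNING, STATE_SAFE } ctrl_state;

/* Assumed plausibility limits for a sensor reading (illustrative values). */
#define READING_MIN 0
#define READING_MAX 1000

/* Error detection: a reading outside the plausible range counts as a fault.
   No attempt is made to diagnose what caused it. */
bool reading_plausible(int reading)
{
    return reading >= READING_MIN && reading <= READING_MAX;
}

/* One control step: any detected error forces the safe state, and the
   controller stays there until something outside this sketch resets it. */
ctrl_state control_step(ctrl_state current, int reading)
{
    assert(current == STATE_RUNNING || current == STATE_SAFE);
    if (current == STATE_SAFE || !reading_plausible(reading))
        return STATE_SAFE;
    return STATE_RUNNING;
}
```

In an elevator controller, for instance, entering `STATE_SAFE` would correspond to stopping and setting the brake.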

Fail-Operational = There exists no known fail-safe state

Such systems are required when no safe state can be reached in case of a

system failure. The computer system has to provide a minimum level of

service, even after the occurrence of a fault. The main characteristic of such

systems is that they must be able to keep operating despite faults. Usually in

fail-operational systems, diagnosis and isolation of faulty components are

critical.

Examples: Fly by wire systems in airplanes, Computer controlled chemical

reactors.

Guaranteed-response vs. Best-effort

Safety-critical systems must guarantee response time:

- a priori (by design)

- in the worst peak-load case

- within the worst-case fault hypothesis

Given a specified load and fault hypothesis:

•Guaranteed response

–correctness substantiated by analytical arguments

–Adequate resources for peak load and faults.

–Required for hard real-time systems

•Best effort

–correctness based upon probabilistic arguments

–No adequate resources for peak load and faults

In Guaranteed Response systems the response to certain critical events is

guaranteed under all conditions. Such a design is usually expensive, because the design

analysis is very detailed and rigorous.

The cheaper alternative to Guaranteed Response systems is the Best Effort

system. Here the system tries its best to give the correct response, but the response is not

guaranteed. The design of such systems avoids rigorous analysis of the fault

hypothesis, so the systems are not guaranteed to work. Such a

design is acceptable for low-consequence systems.

Resource-adequate vs. Resource-inadequate

Resource Adequate

Resource adequate systems have sufficient resources to handle the worst-case

peak load under the worst case hypothesized fault scenario. It is necessary in

hard systems.



Resource inadequate

Resource inadequate systems have sufficient resources to handle the vast

majority of run-time conditions, but may not be able to handle the peak load.

Such a design is acceptable for soft systems.

Event-triggered vs. Time-triggered

Instant = one point on the real-time axis

Event = an occurrence at one instant on the time axis

Interval = the part of the time axis spanned by two events

Duration = the length of an interval

Trigger = an event that causes the controller to take some action

So, an Event is an occurrence at a particular point in real time, at a cut of the

timeline.

An embedded system is

•Time Triggered (TT) if the control signals, such as

–sending and receiving of messages

–recognition of an external state change

are derived from a global clock.

•Event Triggered (ET) if the control signals are derived from the occurrence of

events such as

–termination of a task

–receipt of a message

–occurrence of an external interrupt

In an Event Triggered system the controller action is triggered by an event such

as the arrival of a signal, which generally causes the processor to be interrupted.

This can result in very fast response times. It can also induce a great deal

of non-determinism: interrupts tend to be asynchronous to processes,

response time can be very load-dependent, and the peak load response can be

difficult to predict.

In a Time Triggered system, the only allowed trigger is a timer time-out. It

generally employs cyclic execution of all tasks. The response time to some

event depends on the frequency of the task execution cycle and the instant within the

cycle when the event occurred. Such a system can produce very deterministic

responses, as the worst response time is load-independent and the peak load response

can be predicted a priori.
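The dependence of the time-triggered response on the cycle period and on the instant within the cycle can be made concrete with a little arithmetic. The sketch below is hedged: it assumes timer ticks at 0, period, 2*period and so on, that an event arriving exactly on a tick is seen by that tick, and that the worst-case execution time (`wcet`) of the task is known. The function names are hypothetical.

```c
#include <assert.h>

/* Time (in the same units as period) from an event at event_time until the
   next timer tick, assuming ticks at 0, period, 2*period, ... and that an
   event arriving exactly on a tick is noticed by that tick. */
unsigned long tt_wait_until_tick(unsigned long event_time, unsigned long period)
{
    assert(period > 0);
    return (period - event_time % period) % period;
}

/* Load-independent upper bound on the response time: at most one full
   period of waiting plus the (assumed known) worst-case execution time. */
unsigned long tt_worst_case_response(unsigned long period, unsigned long wcet)
{
    assert(period > 0);
    return period + wcet;
}
```

With a period of 10 time units, an event at time 25 waits 5 units for the tick at 30, while an event at time 1 waits 9 units. The bound returned by `tt_worst_case_response` does not depend on how many events arrive, which is the sense in which the worst response time is load-independent.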

1.5.2 Typical Embedded System Constraints

Embedded systems pose a lot of challenges for any embedded system

developer due to the following constraints:

Small Size, Low Weight

• Hand-held electronics

• Transportation applications -- weight costs money

Low Power

• Battery power for 8+ hours (laptops often last only 2 hours)

• Limited cooling may limit power even if AC power is available

Harsh environment

• Heat, vibration, shock

• Power fluctuations, RF interference, lightning

• Water, corrosion, physical abuse

Safety-critical operation

• Must function correctly

• Must not function incorrectly

Extreme cost sensitivity

• $0.05 per unit adds up to $50,000 over 1,000,000 units

1.5.3 Embedded Processor Types

8-bit microcontroller

» low cost applications

» include on-chip memory and I/O controllers

16-bit microcontroller

» applications requiring longer word length

» include off-chip memory and I/O controllers

32-bit RISC microprocessor

» high performance computationally intensive applications

» include on-chip caches

Digital signal processors (DSPs)

» MAC (Multiply-ACcumulate) intensive applications

» ROM for code, RAM for streaming data

Embedded Microprocessors

Intel 8051

Motorola 68HC11

Motorola Coldfire

Intel x86, i960

IBM Power PC

MIPS

ARM, Thumb

Intel StrongARM

TI TMS320C10, C60

Hitachi SH2, SH3

SHARC DSP

Fujitsu FR-V

Sun Ultra-Sparc

National Geode SC1400

DSP Group Teak, Oak

SandCraft SR1-GX

1.5.4 Programming Languages and Environments

Some Major Programming Languages

One of the few constants across most of the embedded systems is the use of

the C programming language. More than any other, C has become the language of

embedded programmers. This has not always been the case, and it will not continue to

be so forever. However, at this time, C is the closest thing there is to a standard in the

embedded world.

Because successful software development is so frequently about selecting the

best language for a given project, it is surprising to find that one language has proven

itself appropriate for both 8-bit and 64-bit processors; in systems with bytes,

kilobytes, and megabytes of memory; and for development teams that consist of from

one to a dozen or more people. Yet this is precisely the range of projects in which C

has thrived.

Of course, C is not without advantages. It is small and fairly simple to learn,

compilers are available for almost every processor in use today, and there is a very

large body of experienced C programmers. In addition, C has the benefit of processor

independence, which allows programmers to concentrate on algorithms and

applications, rather than on details of a particular processor architecture. However,

many of these advantages apply equally to other high-level languages. So why has C

succeeded where so many other languages have largely failed?

Perhaps the greatest strength of C - and the thing that sets it apart from

languages like Pascal and FORTRAN - is that it is a very "low-level" high-level

language. C gives embedded programmers an extraordinary degree of direct hardware

control without sacrificing the benefits of high-level languages. The "low-level"

nature of C was a clear intention of the language's creators. In fact, Kernighan and

Ritchie included the following comment in the opening pages of their book The C

Programming Language:

C is a relatively "low level" language. This characterization is not pejorative; it

simply means that C deals with the same sort of objects that most computers do.

These may be combined and moved about with the arithmetic and logical operators

implemented by real machines.

Few popular high-level languages can compete with C in the production of

compact, efficient code for almost all processors. And, of these, only C allows

programmers to interact with the underlying hardware so easily.
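The kind of direct hardware control meant here is typically bit-level access to device registers. The following sketch illustrates the idiom with hypothetical names: `LED_BIT`, `MOTOR_BIT` and `simulated_control_reg` are assumptions made for this example. On real hardware the register would sit at a fixed address taken from the device data sheet; here an ordinary variable stands in for it so the code runs on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions in an 8-bit control register. */
#define LED_BIT    0u
#define MOTOR_BIT  3u

/* Stand-in for a memory-mapped register. 'volatile' tells the compiler
   that every read and write must really happen, as it must for hardware. */
volatile uint8_t simulated_control_reg = 0;

/* Set one bit, leaving the other seven unchanged. */
void reg_set_bit(volatile uint8_t *reg, unsigned bit)
{
    assert(bit < 8);
    *reg |= (uint8_t)(1u << bit);
}

/* Clear one bit, leaving the other seven unchanged. */
void reg_clear_bit(volatile uint8_t *reg, unsigned bit)
{
    assert(bit < 8);
    *reg &= (uint8_t)~(1u << bit);
}

/* Return 1 if the bit is set, 0 otherwise. */
int reg_test_bit(const volatile uint8_t *reg, unsigned bit)
{
    assert(bit < 8);
    return (int)((*reg >> bit) & 1u);
}
```

A call such as `reg_set_bit(&simulated_control_reg, MOTOR_BIT)` changes a single bit with one read-modify-write, which is the sort of operation that maps closely onto what the processor itself does.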

Of course, C is not the only language used by embedded programmers. At

least three other languages (assembly, C++, and Ada) are worth mentioning in greater

detail.

In the early days, embedded software was written exclusively in the assembly

language of the target processor. This gave programmers complete control of the

processor and other hardware, but at a price. Assembly languages have many

disadvantages, not the least of which are higher software development costs and a

lack of code portability. In addition, finding skilled assembly programmers has

become much more difficult in recent years. Assembly is now used primarily as an



adjunct to the high-level language, usually only for those small pieces of code that

must be extremely efficient or ultra-compact, or cannot be written in any other way.

C++ is an object-oriented superset of C that is increasingly popular among

embedded programmers. All of the core language features are the same as C, but C++

adds new functionality for better data abstraction and a more object-oriented style of

programming. These new features are very helpful to software developers, but some

of them do reduce the efficiency of the executable program. So C++ tends to be most

popular with large development teams, where the benefits to developers outweigh the

loss of program efficiency.

Ada is also an object-oriented language, though it is substantially different

from C++. Ada was originally designed by the U.S. Department of Defense for the

development of mission-critical military software. Despite being twice accepted as an

international standard (Ada83 and Ada95), it has not gained much of a foothold

outside of the defense and aerospace industries, and it has been losing ground there in recent

years. This is unfortunate because the Ada language has many features that would

simplify embedded software development if used instead of C++.

Assembly

- Processor specific, small and fast (compared to high-level languages)

- High development time and cost

C/C++

- Popular, and C's efficiency is close to that of assembly

- Well established and suitable for general embedded applications

Ada

- Was mandated by the DoD for many of its projects

- Chosen by companies for safety-critical applications, e.g., Airbus and Boeing 777 flight control

- Tries to help enforce good software engineering practices

- Not widely supported

Java

- Simple, portable, multithreaded

- Gets rid of some of the worst "features" of C

- Real-time garbage collection issues are being solved

Programming Environment

Open standard RTOS: POSIX.4, e.g., Lynx OS and now Solaris.



Commercial RTOS: VxWorks, Windows CE

Language specific runtimes: embedded Java runtime (being developed), Ada

95 runtime

Real time scheduling analysis tools, e.g. PERTS and TimeSys

Distributed system integration tools: CORBA real time extensions

Self-hosting: the development machine and the target machine are the same

Host + target: developed and compiled on the host, then downloaded to and executed on the

target.

In real-world embedded applications, you often develop and test in a self-hosted

environment and then download the program to a target machine for real use.

1.5.4 Variations in Embedded Systems

Unlike software designed for general-purpose computers, embedded software

cannot usually be run on other embedded systems without significant modification.

This is mainly because of the incredible variety in the underlying hardware. The

hardware in each embedded system is tailored specifically to the application, in order

to keep system costs low. As a result, unnecessary circuitry is eliminated and

hardware resources are shared wherever possible. In this section you will learn what

hardware features are common across all embedded systems and why there is so much

variation with respect to just about everything else.

By definition all embedded systems contain a processor and software, but

what other features do they have in common? Certainly, in order to have software,

there must be a place to store the executable code and temporary storage for runtime

data manipulation. These take the form of ROM and RAM, respectively; any

embedded system will have some of each. If only a small amount of memory is

required, it might be contained within the same chip as the processor. Otherwise, one

or both types of memory will reside in external memory chips.

All embedded systems also contain some type of inputs and outputs. For

example, in a microwave oven the inputs are the buttons on the front panel and a

temperature probe, and the outputs are the human-readable display and the microwave

radiation. It is almost always the case that the outputs of the embedded system are a



function of its inputs and several other factors (elapsed time, current temperature,

etc.). The inputs to the system usually take the form of sensors and probes,

communication signals, or control knobs and buttons. The outputs are typically

displays, communication signals, or changes to the physical world. See Figure 1.6 for

a general example of an embedded system.

Figure 1.6: A generic embedded system

With the exception of these few common features, the rest of the embedded

hardware is usually unique. This variation is the result of many competing design

criteria. Each system must meet a completely different set of requirements, any or all

of which can affect the compromises and tradeoffs made during the development of

the product. For example, if the system must have a production cost of less than $10,

then other things, like processing power and system reliability, might need to be

sacrificed in order to meet that goal.

Of course, production cost is only one of the possible constraints under which

embedded hardware designers work. Other common design requirements include the

following:

Processing power

The amount of processing power necessary to get the job done. A common

way to compare processing power is the MIPS (millions of instructions per second)

rating. If two processors have ratings of 25 MIPS and 40 MIPS, the latter is said to be

the more powerful of the two. However, other important features of the processor

need to be considered. One of these is the register width, which typically ranges from

8 to 64 bits. Today's general-purpose computers use 32- and 64-bit processors

exclusively, but embedded systems are still commonly built with older and less costly 8-

and 16-bit processors.

Memory

The amount of memory (ROM and RAM) required to hold the executable

software and the data it manipulates. Here the hardware designer must usually make

his best estimate up front and be prepared to increase or decrease the actual amount as

the software is being developed. The amount of memory required can also affect the

processor selection. In general, the register width of a processor establishes the upper

limit of the amount of memory it can access (e.g., an 8-bit address register can select

one of only 256 unique memory locations).
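The relation between address width and memory size is a power of two, which a short function makes explicit. This is a minimal sketch; the function name `addressable_locations` is hypothetical, and the width is restricted to 1..63 bits only so that the count fits in a 64-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct locations an address register of the given width can
   select: 2 raised to the number of address bits. A width of 64 bits or
   more would overflow a 64-bit count, so this sketch accepts 1..63. */
uint64_t addressable_locations(unsigned address_bits)
{
    assert(address_bits >= 1 && address_bits <= 63);
    return (uint64_t)1 << address_bits;
}
```

An 8-bit address register gives 256 locations, as stated above, and a 16-bit one gives 65,536.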

Development Cost

The cost of the hardware and software design processes. This is a fixed,

one-time cost, so it might be that money is no object (usually for high-volume

products) or that this is the only accurate measure of system cost (in the case of a

small number of units produced).

Number of units

The tradeoff between production cost and development cost is affected most

by the number of units expected to be produced and sold. For example, it is usually

undesirable to develop your own custom hardware components for a low-volume

product.

Expected lifetime

How long must the system continue to function (on average)? A month, a

year, or a decade? This affects all sorts of design decisions from the selection of

hardware components to how much the system may cost to develop and produce.

Reliability

How reliable must the final product be? If it is a children's toy, it doesn't

always have to work right, but if it's a part of a space shuttle or a car, it had sure better

do what it is supposed to each and every time.



In addition to these general requirements, there are the detailed functional

requirements of the system itself. These are the things that give the embedded system

its unique identity as a microwave oven, pacemaker, or pager.

Table 1.2 illustrates the range of possible values for each of the previous

design requirements. These are only estimates and should not be taken too seriously.

In some cases, two or more of the criteria are linked. For example, increases in

processing power could lead to increased production costs. Conversely, we might

imagine that the same increase in processing power would have the effect of

decreasing the development costs by reducing the complexity of the hardware and

software design. So the values in a particular column do not necessarily go together.

Table 1.2: Common Design Requirements for Embedded Systems

Criterion           Low                      Medium                    High
Processor           4- or 8-bit              16-bit                    32- or 64-bit
Memory              < 16 KB                  64 KB to 1 MB             > 1 MB
Development cost    < $100,000               $100,000 to $1,000,000    > $1,000,000
Production cost     < $10                    $10 to $1,000             > $1,000
Number of units     < 100                    100-10,000                > 10,000
Expected lifetime   days, weeks, or months   years                     decades
Reliability         may occasionally fail    must work reliably        must be fail-proof

1.5.5 Implementations of ES

ASIC (Application Specific IC) based: hard-wired approach

–fast

–highly integrated

–difficult to design (rigidity)

Microprocessor based

–Control functions defined by S/W

–flexible: re-programmable, upgradable

–Slow but may be improved by using multiprocessor



Hardware/Software Co-Design

System-on-a-Chip

1.6 Embedded System Design World – a View

Embedded system design is a complex set of tradeoffs. There is a need to

optimize the systems for more than just speed. There are elements other than the

computer to be considered. Many things have to be taken into consideration while

designing an embedded system. The figure below shows the complexity involved in

designing an embedded system.

The skills of the embedded system designer

It is becoming clear that embedded systems have enormous variety, and call

upon many technical disciplines. This is indeed one of the attractions of working with

them. This multi-disciplinary nature is illustrated in Fig. 1.7. A full understanding of

the microcontrollers we will work with only comes with some knowledge of computer

architecture and integrated circuit design and manufacture. The need for control,

which inevitably implies measurement and actuation, leads us into further branches of

electrical and electronic engineering. Associated with the measurement, we find a

need for analogue as well as digital electronics. One could go on adding further

disciplines, for example Digital Signal Processing or Electromagnetic Compatibility,

to the diagram. These are also important to the embedded system.



Figure 1.7: Embedded system design calls on many disciplines.

1.7 Major Components in ES

Data Acquisition and Processing

Generally embedded systems have different data collecting devices or

subsystems whose main function is to collect the raw data, in either analog or

digital form, and submit it to the system control unit. Sensors and ADCs come

under this unit.

Communication

The communication system provides the interconnection between

different subsystems or different components of the embedded system. An

efficient communication system plays an important role in the functioning of

an embedded system, especially in a networked environment. RS232 is one

such system.

System Logic and Control Algorithm

This subsystem is the main component of an embedded system and is

responsible for controlling the other components and coordinating the

activities between them. It is also responsible for taking critical decisions.

Embedded processors like microcontrollers are usually part of this unit.



Interface

This subsystem is responsible for providing an interface to the outside

world and may include devices like keyboards and monitors. How an

embedded system interfaces with the outside world, or with other embedded

systems in a networked environment, plays a crucial role. Different I/O

devices are part of this unit.

Auxiliary Units

Some auxiliary units, like timer units and auxiliary memory systems, are

necessary for the smooth functioning of the embedded system.

1.8 Summary

An embedded system incorporates a computing element, typically a

microprocessor or microcontroller, to perform a control function. Many embedded

systems are small and low-cost, and are aimed towards the volume market. They

apply recognized hardware and software principles to meet the particular

requirements of the embedded environment. An embedded system consists of

different subsystems: Communication, Data Acquisition and Processing, System

Logic and Control Algorithm, Interface, and Auxiliary Units.

1.9 Self Test

1. A general-purpose computer is made up of numerous embedded systems

(True/False)

2. Which of the following is not an RTOS?

a) Linux b) Eonics

c) Windows CE d) QNX Neutrino e) none of these

3. Which of the following languages is popularly used in embedded systems

development?

a) C b) Java c) Assembly d) C++

4. In ……….. embedded systems there exists a default state that is safe.



5. In medium sized embedded systems the memory size is usually…..

6. In hard real time systems the size of data files is

a) small b) medium

c) small/medium d) large

7. In …….. implementation type the control functions are defined by the software.

8. In real-time systems the peak load performance is ………

a) predictable b) unpredictable

c) within safe limits d) always under control

Answers

1. True

2. e

3. a,c

4. Fail-Safe

5. 64 KB to 1MB

6. c

7. microprocessor based

8. a

1.10 Questions

1. What is an Embedded System? Give examples

2. What are the functions of Embedded Systems?

3. What are the characteristics of Embedded Systems?

4. Explain any 3 classifications of Embedded Systems

5. What are the challenges of Embedded Systems design and development?

6. What are different implementation types of Embedded Systems?



UNIT 2

Microprocessor and Microcontroller Basics

Contents

2.1 Introduction

2.2 Objectives

2.3 Basics of Microprocessor

2.3.1 The microprocessor reviewed

2.3.2 More on instructions and the ALU

2.4 Some microprocessor design options

2.4.1 Von Neumann and Harvard

2.4.2 Instruction sets- CISC and RISC

2.4.3 Instruction pipelining

2.5 The microcontroller Basics

2.5.1 The microcontroller

2.5.2 Microcontroller memory

2.5.3 Input/output

2.5.4 Timer subsystems

2.5.5 Digital input/output

2.5.6 Analog to digital converters

2.5.7 Serial input/output

2.6 Microcontroller Characteristics & applications

2.7 Some example microcontrollers

2.8 Programming microcontrollers

2.9 Summary

2.10 Self Test

2.11 Questions

2.1 Introduction



A microprocessor is an integrated circuit on a tiny silicon chip that contains

thousands or millions of tiny on/off switches, known as transistors. The transistors are

laid out along microscopic lines made of superfine traces of aluminum that store or

manipulate data. These circuits manipulate data in certain patterns, patterns that can

be programmed by software to make machines do many useful tasks. One of the

biggest tasks microprocessors perform is acting as the brains inside a personal

computer.

A Microcontroller is a single chip computer system. It contains a

microprocessor core, typically RISC with on chip RAM, EEPROM and a useful array

of I/O. The study of the microcontroller will rely on having a reasonable knowledge

of microprocessors. We will briefly review this knowledge, to ensure a defined

starting point.

2.2 Objectives

In this unit you will learn:

Basics of microprocessor

Fundamental choices in microprocessor design

Features of a general-purpose microcontroller

Applications of a microcontroller

Different subsystems like timer, memory…

Examples of microcontrollers

2.3 Basics of microprocessor



2.3.1 The microprocessor reviewed

A microprocessor is a simple computer, contained more or less in one

integrated circuit (IC, also colloquially called a 'chip'). Like any computer it follows

a sequence of instructions, known as a program. Each instruction causes a very simple

action to take place, generally either a computation, a transfer of data or a decision.

The microprocessor can perform each instruction extremely fast, so that by building

on these very simple actions much more complex tasks can be undertaken.

A diagram of the hardware of a simple microprocessor-based system is shown

in Figure 2.1. The essential features are:

the microprocessor

a section of memory to store the program

another section to store temporary data

some contact with the outside world (through the input/output port)

a means of interconnecting these elements (i.e. data and address bus, together

with some control lines)

Figure 2.1: A simple microprocessor system.

Program memory is usually stored in a form of memory called ROM – Read-

Only Memory. Data memory is usually stored in a type of memory called RAM –

Random Access Memory. ROM retains its contents when the system is powered



down; RAM does not. Memories are defined according to size, generally in terms of

numbers of bytes. For this the prefixes K- and M- (or Mega) have gained ubiquitous

customary usage. These differ from the conventional decimal multipliers (e.g. the

kilo- of kilometre or kilogram). K- indicates a multiplier of 2^10, i.e. 1024, while

Mega is actually 2^20, i.e. 1 048 576. A memory of 4 Kbytes contains 4096 byte-sized

locations.

A block diagram of a 'typical' imaginary microprocessor appears in Figure

2.2. The computing function takes place in the Arithmetic Logic Unit (ALU), where

arithmetic and logical operations take place. Part of the ALU is the accumulator. This

is the register where the operand, the number on which the operation is being

performed, is held. The size of the accumulator, in number of bits, determines the size

of number that the processor can operate on. It is reflected across the whole

microcomputer system, for example in the size of the data bus and memory locations.

The ALU, together with the control section around it, is known as the Central

Processing Unit (CPU).

Figure 2.2: A typical microprocessor

The action of the microprocessor is synchronized to the clock generator, often

based on a quartz crystal oscillator. Any microprocessor can only operate within a

certain range of clock frequencies, whose limits are set by the fabrication technology



of the device and specified by the manufacturer. Each has a maximum (for

microcontrollers usually in the range 4 MHz to around 30 MHz). Those based on

dynamic logic have a minimum as well. Those which can operate down to DC are

known as 'fully static'.

The clock oscillator frequency is divided down within the microprocessor

(generally by a factor between 4 and 12, depending on the microprocessor), giving a

lower internal operating frequency. One period of this internal frequency is sometimes

called a machine cycle, or an instruction cycle. All instruction execution is made up of

integer numbers of machine or instruction cycles.
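The relationship between oscillator frequency, divide factor and instruction time is simple arithmetic, sketched below. The function names are hypothetical, and the values used in the usage note are illustrative figures within the ranges quoted in the text rather than the figures of any specific device.

```c
#include <assert.h>

/* Internal (machine-cycle) frequency in Hz: the oscillator frequency
   divided by the processor's fixed divide factor. */
unsigned long machine_cycle_hz(unsigned long osc_hz, unsigned divide_by)
{
    assert(divide_by > 0);
    return osc_hz / divide_by;
}

/* Duration of one machine cycle in nanoseconds (integer division, so the
   result is rounded down). */
unsigned long machine_cycle_ns(unsigned long osc_hz, unsigned divide_by)
{
    unsigned long hz = machine_cycle_hz(osc_hz, divide_by);
    assert(hz > 0);
    return 1000000000UL / hz;
}

/* Execution time of an instruction that takes the given whole number of
   machine cycles. */
unsigned long instruction_time_ns(unsigned long osc_hz, unsigned divide_by,
                                  unsigned machine_cycles)
{
    return machine_cycle_ns(osc_hz, divide_by) * machine_cycles;
}
```

With a 12 MHz oscillator divided by 12, the internal frequency is 1 MHz and one machine cycle lasts 1000 ns. With a 20 MHz oscillator divided by 4, a two-cycle instruction takes 400 ns.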

In normal system operation the processor works down the list of instructions

which make up the program. It fetches each one from program memory, decodes it

with its Instruction Decode circuit, and then executes it. The instruction is in many

cases accompanied by further pieces of code, also stored in program memory, which

are treated as operand data, or addresses where the operand data may be found.

The microprocessor 'keeps its place' in the program by means of the Program

Counter, which always holds the address of the next instruction to be executed. In

order to fetch the next instruction, the processor places the value held in the Program

Counter on the address bus, and signals through the control lines that it wishes to read

data. Memory corresponding to that address will, upon receiving the address and

control signals, place the instruction word on the data bus, which the processor can

then read. As each word is read from program memory, the Program Counter is

incremented.

Figure 2.3 illustrates this sequence of activities, for the processor of Figure 2.2

and for a certain instruction, as a timing diagram. It can be seen that there are four

clock cycles in each machine cycle. The first cycle shown is an 'instruction fetch'

cycle. The address of the instruction to be fetched is placed on the address bus, and

the R/W line indicates that the data transfer is to be a 'read'. In response the addressed

memory places data onto the bus. This is received by the microprocessor and decoded

by the Instruction Decode circuit.



Figure 2.3: The microprocessor fetch/execute cycle.

In the second machine cycle the instruction is executed; the example illustrates

a data move from processor to memory. The processor sets values on the address and

data buses, and signals a write by setting the R/W line low. The DAV line goes high

to indicate that the bus data is valid. The falling edge of this signal is used to latch the

data into memory. This particular instruction has taken two machine cycles to

complete. It is then followed by the Instruction Fetch cycle of the next instruction. It

follows that simple microprocessor operation can be seen as a relentless cycle of

instruction fetch, decode and execute.
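The fetch, decode and execute cycle can be imitated in a few lines of C. The machine below is invented purely for illustration: its four op codes, the two-byte instruction format and the `toy_cpu` structure are assumptions and do not describe the processor of Figure 2.2 or any real device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A made-up machine: each instruction is one op code byte followed by one
   operand byte. */
enum { OP_HALT = 0x00, OP_LOAD = 0x01, OP_ADD = 0x02, OP_STORE = 0x03 };

typedef struct {
    size_t  pc;         /* Program Counter: index of the next byte to fetch */
    uint8_t acc;        /* accumulator */
    uint8_t data[16];   /* data memory (RAM) */
} toy_cpu;

/* Runs until OP_HALT, an unknown op code, or the end of program memory.
   Returns the number of instructions executed, including the one that
   stopped the run. */
unsigned run_program(toy_cpu *cpu, const uint8_t *program, size_t length)
{
    unsigned executed = 0;
    assert(cpu != NULL && program != NULL);
    cpu->pc = 0;
    while (cpu->pc + 1 < length) {
        uint8_t opcode  = program[cpu->pc++];   /* fetch op code, PC advances */
        uint8_t operand = program[cpu->pc++];   /* fetch operand, PC advances */
        executed++;
        switch (opcode) {                       /* decode, then execute */
        case OP_LOAD:
            cpu->acc = operand;
            break;
        case OP_ADD:
            cpu->acc = (uint8_t)(cpu->acc + operand);
            break;
        case OP_STORE:
            cpu->data[operand % 16] = cpu->acc; /* data move to memory */
            break;
        case OP_HALT:
        default:
            return executed;
        }
    }
    return executed;
}
```

Running the program `LOAD 5, ADD 7, STORE 2, HALT` leaves 12 in the accumulator and in data location 2. Note how the Program Counter is incremented as each byte is read, exactly as described for the real processor.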

2.3.2 More on instructions, and the ALU

A typical 8-bit ALU is able to perform the operations shown in Table 2.1.

Using combinations of these very simple operations, almost any other mathematical

function can be implemented, albeit sometimes laboriously.

Table 2.1 What an ALU can do.



Each processor (or processor family) has its own instruction set, from which

the program is written. Each instruction is a binary word, known individually as the

op code (operation code), or collectively as machine code. The processor CPU can

recognize and respond to these codes. The instruction set is the collection of all these

op codes. It uses the basic ALU operations listed earlier, and adds to these certain data

transfer and branch instructions. This gives an instruction set the following typical

instruction categories:

Data transfer: instructions which move data from one register or memory

location to another.

Arithmetic: instructions which perform arithmetic operations between

specified data words.

Logical: instructions which perform logical functions between specified data

bits or words, for example INVERT, AND, OR, Rotate.

Program branch: instructions which cause a program to deviate from simple

sequential execution of instructions held in program memory, for example as a

subroutine call or return, or conditional branch. (A conditional branch

instruction tests a certain condition of the microprocessor system, for example



a register bit. It transfers program operation to a different program section if

the test condition is met, and continues the program in sequence if it is not.)

The result of an operation undertaken in an accumulator frequently exceeds the

range of the number which can be held in the accumulator. Therefore associated with

the ALU is a 'Flag Register'; this contains a number of bits which give further

information about the result of the previous instruction. It is known as the Status

Register (Microchip Inc.), Condition Code Register (Motorola), or Program Status

Word (Intel and Philips). These bits may include:

a zero bit, indicating whether the result was zero

a carry bit, indicating whether there was a carry from the most significant bit

(msb) of the accumulator, also used as a ‘borrow’ in subtraction

a sign, or negative bit, indicating whether the result was negative (interpreting

the result in two’s complement arithmetic, a means of expressing negative

numbers in binary) – hence this bit is simply set to the msb of the result

a half-carry bit, indicating whether there was a carry between the lower and

higher nibbles of the result – this is useful for Binary Coded Decimal (BCD)

arithmetic

an overflow bit, indicating whether the two’s complement range has been

exceeded. It is set if there has been a carry out of bit 7 but not bit 6, or a carry

out of bit 6 but not bit 7

a parity flag, indicating whether an odd or even number of 1 bits are in the

accumulator

As there are not usually enough ‘condition code’ flags to fill an 8-bit register,

many processors use the remaining few bits for other purposes, for example interrupt

mask bits or register bank address bits.
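
As a minimal sketch of how the flags listed above could be derived, the following Python function models an 8-bit addition. The function name `add8`, the single-letter flag names, and the parity convention (1 meaning an odd number of 1 bits) are assumptions made for illustration; real processors differ in which flags they provide and in how parity is defined.

```python
def add8(a, b, carry_in=0):
    """Add two 8-bit values and derive the condition flags described above."""
    total = a + b + carry_in
    result = total & 0xFF
    # Carry out of bit 6 into bit 7, found by adding only the low 7 bits.
    carry6 = ((a & 0x7F) + (b & 0x7F) + carry_in) >> 7
    carry7 = total >> 8                      # carry out of the msb
    flags = {
        "Z": int(result == 0),               # zero
        "C": carry7,                         # carry
        "N": result >> 7,                    # sign: a copy of the msb
        "H": ((a & 0x0F) + (b & 0x0F) + carry_in) >> 4,  # carry between nibbles
        "V": carry6 ^ carry7,                # overflow: exactly one of the two carries
        "P": bin(result).count("1") % 2,     # parity: 1 = odd number of 1 bits (assumed)
    }
    return result, flags
```

In this model, adding 0x7F and 0x01 gives 0x80 with N and V set, because the sum exceeds the positive two's complement range even though no carry leaves the msb.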



Figure 2.4: Example addition of two binary numbers.

2.4 Some microprocessor design options

We go on to consider some aspects of microprocessor design which go beyond

the basic structure assumed so far. These aspects are discussed at a level appropriate

to the small-scale microprocessor or controller; their application in larger computers

is far more sophisticated.

2.4.1 Von Neumann and Harvard

In the conventional von Neumann architecture, program and data memory

share the same address and data buses, and are hence both within the same memory

map. This is illustrated in very simple form in Figure 2.5(a). This approach is simple,

robust and practical, and has been widely and successfully applied. If data memory is

being accessed, program memory lies idle, and vice versa.



Figure 2.5: (a) The conventional von Neumann structure; (b) the Harvard structure.

Once the overall memory space is defined, it is up to the user to decide which area is

allocated to data, and which to program. The structure does however lead to the ‘von

Neumann bottleneck’; time-sharing the data bus between both instruction and data

means that maximum speed of executing a program will always be limited, as each

has to use the bus in turn.

It is, however, possible to have more than one address and data bus, and hence

to place data and program memory in different memory maps. This approach,

sometimes called a Harvard structure, is shown in simple form in Figure 2.5(b).

Instructions can now be fetched independently of, and if necessary simultaneously

with, instruction execution, thereby eliminating the von Neumann bottleneck. The two

data buses can now be of different sizes, as can the two address buses. This allows

each to be optimized for its own use, and has important implications in certain

processor structures. The structure facilitates pipelining (see below), and also

enhances program security. It is less likely that an errant processor will attempt to

overwrite its own program, or jump into data memory and start interpreting data as

instructions.

With its multiplicity of buses, this architecture does lead to a more complex

hardware realization than conventional von Neumann. Moreover, not every memory

use is clearly divided into ‘data’ or ‘program’. Look-up tables (i.e. tables of constant

data, defined within the program), for example, may be embedded in program

memory, but required for use as data.



2.4.2 Instruction sets – CISC and RISC

In simple terms, the operations that the designer of the microprocessor has at

her initial disposal are those listed in Table 2.1. The microprocessor instruction set

could be based on these, and thus they would be available to the programmer in their

‘raw’ form. Alternatively, it is possible to group them together in simple

combinations, so that an instruction from the instruction set is actually interpreted by

the CPU as a sequence of perhaps two or three of these primitive instructions. This

practice is known as microcoding, and the task of interpreting each program

instruction into instruction primitives is done by a ROM internal to the CPU.

Many early microprocessor designers adopted this practice, and tried to create

instructions for every possible eventuality. This appeared to approach a sophisticated

and ‘ideal’ machine. A processor of this type gained the name Complex Instruction

Set Computer (CISC). One of its more obvious characteristics is that code for

different instructions can be of quite different lengths, and have widely differing

execution times. The CISC processor also occupies more space on the IC, due to the

requirement for internal ROM.

Studies of CISC instruction usage revealed, however, that in a ‘typical’

program most of the instructions were not being used for most of the time (e.g. 80%

of programs were made up of 20% of the instruction set). It was therefore reasoned

that if the most-used instructions were optimized in terms of speed, and the others

removed (but with their function still achievable by combinations of those that

remained), then program execution time could be reduced and the CPU design

simplified. In parallel with this, technological advances, especially in the area of high-

density memory, meant that the pressure to minimize program code length was no

longer so great.

The result was a ‘back to basics’ move, leading to the simpler but faster

Reduced Instruction Set Computer (RISC). This has the following characteristics:

1. The CPU does not make use of microcoding.

2. Memory is accessed via load and store instructions only; the requirement to

operate on memory contents is achieved by multiple instructions.



3. All instructions are executed in one machine cycle; this means that each instruction

must be represented by one word only – hence all op codes must be equal to or within

the instruction bus size, and must include the operand within them.
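
To illustrate the second characteristic, the sketch below models a hypothetical load/store machine in Python. The mnemonics `LOAD`, `ADDI` and `STORE` and the register name `r0` are invented for this example and do not belong to any particular processor. Incrementing a memory location, which a CISC design might offer as a single instruction, takes three instructions here.

```python
def run(program, memory):
    """Execute a list of (mnemonic, operands...) tuples on a toy load/store machine."""
    reg = {}
    for op, *args in program:
        if op == "LOAD":        # LOAD rd, addr: the only way to read memory
            reg[args[0]] = memory[args[1]]
        elif op == "ADDI":      # ADDI rd, constant: operand held inside the instruction
            reg[args[0]] = (reg[args[0]] + args[1]) & 0xFF
        elif op == "STORE":     # STORE rs, addr: the only way to write memory
            memory[args[1]] = reg[args[0]]
    return memory

# Increment the byte at address 0x20 using three register-based instructions.
increment = [("LOAD", "r0", 0x20), ("ADDI", "r0", 1), ("STORE", "r0", 0x20)]
```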

RISC machines have the advantages of simplicity and speed, but carry the

apparent disadvantage that their program code is almost invariably longer and more

complex. With memory becoming ever cheaper and of higher density, and with more

efficient compilers for program code generation, this disadvantage is diminishing.

2.4.3 Instruction pipelining

As Figure 2.3 showed, conventional microprocessor program execution is a

relentless sequential cycle of instruction fetch, decode and execute. For a given

processor the only way of speeding up this operation is by speeding up the clock.

Consider an alternative: that as one instruction is being executed, the next is

already being fetched. If this is done, the instruction throughput can be dramatically

increased without reducing the actual instruction execute time. This is the basis of

pipelining – it’s a simple idea which can make processors run much faster, but it does

place certain strict requirements on the nature of the instructions.

Figure 2.6: Pipelined instruction execution

In order to work, all the instructions of the processor must have the same

duration of execution, and it must be possible to split the fetch–decode–execute cycle



for all individual instructions into a number of stages of equal duration. Then, as any

one instruction enters its second stage, the following instruction enters its first. This is

illustrated in Figure 2.6, for instructions divided into just two stages (i.e. fetch and

execute). As one instruction is executing, the next is already being fetched. Once the

pipeline is full, one instruction completes in every stage time, so the instruction

throughput approaches twice that of Figure 2.3. If the instructions had been broken

into three stages, it would approach three times that of Figure 2.3, and so on.

Simple pipelining fails at conditional program branches. When the processor

is executing a branch instruction, it is already fetching the next instruction in the

program, but if the branch does take place that next instruction is no longer needed.

So it must ‘flush out’ that instruction, and fetch the one where the branch starts. This

is why branch instructions often take longer in a pipelined architecture. The example

of Figure 2.6 shows two instructions being successfully fetched and executed. The

third is a branch, and the fourth instruction, though fetched, is never executed, and

one machine cycle is lost. The fifth instruction shown is from the start of the program

section to where the branch has taken place.
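
The effect of pipelining and of a taken branch on execution time can be estimated with a simple cycle count. The Python sketch below is a simplified model under stated assumptions: every stage takes one cycle, and each taken branch discards the instructions fetched behind it (one instruction in the two-stage case). The function names are illustrative only.

```python
def cycles_unpipelined(n_instructions, stages=2):
    # Every instruction passes through all stages before the next one starts.
    return n_instructions * stages

def cycles_pipelined(n_instructions, stages=2, taken_branches=0):
    # The first instruction needs `stages` cycles to fill the pipeline; after
    # that one instruction completes per cycle. In this simplified model each
    # taken branch wastes (stages - 1) cycles on instructions that are flushed.
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1) + taken_branches * (stages - 1)
```

For a sequence like the one described above (two ordinary instructions, a taken branch, and the branch target, so four executed instructions), the model gives six cycles with the pipeline against eight without it. For long runs without branches the ratio approaches the number of stages.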

2.5 The Microcontroller Basics

2.5.1 The microcontroller

A microcontroller differs from a microprocessor in several important ways.

The early name for a microcontroller was microcomputer. The big difference between

a microprocessor and a microcomputer/microcontroller is the completeness of the

machine each represents. A microprocessor is simply the “heart” of a computer. To

put a microprocessor into use, the designer required memory, peripheral chips, and

serial and parallel ports to make a completely functional computer. By contrast, the

microcomputer was designed to be a complete computer on a single chip. Necessary

memory and peripheral components were integrated onto the chip so that a complete

computer-based system could be built with a minimum of external components.



We can say that a microcontroller is a particular type of microprocessor, plus

some additional components optimized to perform control functions for the lowest

cost and at the smallest size possible. Generally microcontrollers are used in a

recognisably ‘embedded system’ environment. A basic microcontroller is shown in

block diagram form in Figure 2.7.

Figure 2.7: A Typical Microcontroller Block Diagram

The central control unit of the microcontroller is the arithmetic logic unit

(ALU). Figure 2.7 shows that the ALU is connected to three different blocks. The first

is the input/output block (I/O), the second is the program memory, and the third is the

data memory. Most microcontrollers combine these three blocks into one memory space. The

architecture shown in the figure is known as a Harvard architecture, as opposed to the

more common Von Neumann architecture. The Harvard architecture is a computer

configuration in which the memory area that contains the program instructions for the

computer is separated from the memory area in which data are stored. By contrast, the

Von Neumann architecture has just one memory space where both program and data

are stored.

The main functional difference between Harvard and Von Neumann

architectures is in their ultimate operating speeds. Both architectures require that the

ALU access memory once each instruction to get the next instruction to execute.

Often the instruction being executed will also require an access to memory. Reading

data into a register, storing data in a memory address, and accessing a location in

memory that is in fact an input/output register are examples of operations that require

memory accesses in addition to the normal memory fetches. As seen in Figure 2.7, the



Harvard architecture has two or more internal data busses over which these different

accesses can take place. There are usually two such internal busses: one for

instruction access, and one for other data access. The processor can easily tell which

data bus to use. If the access is to fetch an instruction, it is relative to the program

counter. These accesses will go to the program memory area. All other memory

accesses will go to the data memory area. It is entirely possible to have two or more

memory accesses simultaneously with a Harvard architecture.

The Von Neumann architecture is somewhat simpler than the Harvard

architecture. A Von Neumann processor has only one memory bus. All memory

accesses must go through this single path on the system. With such a system, the

processor can never process more than one memory access at a time and all memory

accesses—instruction, data, or input/output—must pass through a single data bus.

This is the origin of the term “Von Neumann bottleneck.” The multiple accesses to

memory for each instruction ultimately limit the maximum speed of a Von Neumann

architecture processor. However, the speed of such processors can be many millions

of instructions per second, so there are numerous excellent, fast microcontrollers

constructed with the Von Neumann architecture. The Von Neumann architecture has

been the mainstay of some microcontroller families (e.g. Motorola), but it is not the

only configuration available: Harvard-architecture microcontrollers are also common.
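
The speed difference between the two architectures can be made concrete by counting bus cycles. The following Python sketch is an idealised model, not a description of any real part: it assumes one bus cycle per memory access and, for the Harvard case, that an instruction fetch always overlaps perfectly with a data access on the other bus.

```python
def bus_cycles(program, shared_bus):
    """program: number of data accesses needed by each instruction, e.g. [0, 1, 2]."""
    cycles = 0
    for data_accesses in program:
        if shared_bus:
            # Von Neumann: the fetch and every data access take turns on one bus.
            cycles += 1 + data_accesses
        else:
            # Harvard: the fetch proceeds in parallel with a data access.
            cycles += max(1, data_accesses)
    return cycles
```

For a five-instruction program needing 0, 1, 1, 0 and 2 data accesses, this model counts nine cycles on a shared bus and six with separate buses.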

A microcontroller has its program stored internally, and the ALU reads an

instruction from memory. This instruction is decoded by the ALU and executed. At

the completion of the execution of the instruction, the next instruction is fetched from

memory and it is executed. This procedure is repeated until the end of the program is

found, or the program gets into a loop where it is instructed to branch back to a

beginning point. In this case, the machine will stay in the loop forever or until

something happens to release it from the never-ending loop.

There are three ways for a machine locked in a loop to be removed from the

loop so it can execute code outside of the loop. These operations are called

exceptions. The first is to reset the part with a reset signal. A reset signal usually

requires connecting the reset pin of the part to a logic low signal. A logic low is

usually ground. When this condition is detected, several internal registers are set to



predetermined values, and the microcontroller fetches the address of the reset routine

from a specific memory location. This address is placed in the program counter, and

the program starts to execute. There is a table in memory that contains the addresses

of several routines accessed when exceptions occur. These are the addresses of the

interrupt service routines, reset routines, etc. This table is called the vector table, and

the addresses are called vectors.

A second means of forcing the part out of the loop is for the part to detect an

external interrupt. An external interrupt occurs when the interrupt request (IRQ) pin

on the part is set low. This pin is tested at the beginning of the execution of each

instruction. Therefore, if an instruction is being executed when an IRQ is asserted, the

instruction will complete before the IRQ signal is processed. Processing for the IRQ

consists of first determining if IRQs are enabled. If they are, the status of the machine

is saved. All interrupts are disabled by setting the interrupt mask bit in the status

register of the microcontroller. Then the address stored in the IRQ vector location is

fetched. This address, the address of the interrupt service routine (ISR), is placed in

the program counter. The ISR then executes.

The process of saving the status of the machine is to push the contents of all

machine registers onto the machine stack. Therefore, the ISR can safely use any of the

central machine resources without disrupting the operation of the main line of code

when control is returned. When exiting an ISR, it is necessary to use a special

instruction called a return from interrupt or a return from exception. This instruction

restores the status of the machine from the stack and picks up execution of the code

from the instruction following the one where the interrupt occurred.

The third means for exiting the main loop of the program is from internal

interrupts. The microcontroller peripherals can often cause interrupts to occur. An

internal interrupt causes exactly the same sequence of operations to occur as an

external interrupt. Different interrupt vectors are used for each of the several internal

peripheral parts so the cause of the interrupt is generally known and control is directed

to the specific ISR for each of the several possible internal interrupts.
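
The interrupt sequence described above (check the mask, save the machine status, mask further interrupts, load the vector, and later return) can be sketched as a small Python model. The class name `TinyCPU`, the register names, and the vector addresses are hypothetical and chosen only for illustration.

```python
class TinyCPU:
    def __init__(self, vectors):
        self.pc = 0
        self.regs = {"A": 0, "X": 0}
        self.irq_masked = False
        self.stack = []
        self.vectors = vectors          # vector table: interrupt source -> ISR address

    def interrupt(self, source):
        if self.irq_masked:
            return False                # interrupts disabled: request is not taken
        # Save the machine status: return address, registers and mask bit.
        self.stack.append((self.pc, dict(self.regs), self.irq_masked))
        self.irq_masked = True          # mask further interrupts during the ISR
        self.pc = self.vectors[source]  # load the ISR address into the program counter
        return True

    def return_from_interrupt(self):
        # Restore the saved status and resume the interrupted code.
        self.pc, self.regs, self.irq_masked = self.stack.pop()
```

Because every register is restored on return, the ISR in this model can change `A` or `X` freely without disturbing the main line of code.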



Data are transferred, information is passed, or events are handled either

synchronously or asynchronously. The difference between these two methods of data

transfer has mainly to do with how the clocking of the data is handled. The most

common form of synchronous data transfer is with a three-wire serial link. One of the

wires is a clock, and the other two are input data and output data, respectively. For a

synchronous transfer, the value of the input is usually sampled at one edge of the

clock signal (such as the fall of the clock) and the value of the bit to be sent out is

guaranteed to be correct at the fall of the clock signal. Any synchronous system must

set its output at such a time that it will be stable while the clock is high, and hold it in

that condition until the clock signal falls. To receive a bit, the condition of the input

line must be latched into the system as the clock signal falls from high to low.

Within the computer, there is another distinction for synchronous. Often an

input is allowed to set a bit when it occurs. If this happens, the program will not

necessarily observe at once that the bit is set. Instead, the program will test the state

of the bit according to the program timing requirements. This type of operation is also

called synchronous because the test is synchronized with the program.

Asynchronous operation, on the other hand, usually depends on a prearranged

series of events to cause the data transfer. Serial data communications is a common

example of asynchronous data transfer. Here, an input line can have two states: mark

and space. A line is held at the mark state whenever no data are being transferred.

When data are to be transferred, the data line is transitioned to the space state and held

there for a specified time. This period is called the start bit. From that time onward,

the data bits are placed on the line a bit at a time so that at the specified time intervals

the receiving device can examine the data line and determine the bit sequence.
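
A minimal Python sketch of this framing follows. The text specifies only the idle mark state and the start bit; the eight data bits sent least significant bit first, the single stop bit, and the 16 samples per bit used by the receiver are common conventions assumed here for illustration.

```python
def frame_byte(value):
    """Return the line states for one character: 1 = mark, 0 = space."""
    data_bits = [(value >> i) & 1 for i in range(8)]   # LSB first (assumed)
    return [0] + data_bits + [1]                       # start bit, data, stop bit

def receive(line, samples_per_bit=16):
    """Recover one byte from an oversampled line that idles at mark (1)."""
    start = line.index(0)                  # first space sample: start bit begins
    centre = start + samples_per_bit // 2  # middle of the start bit
    value = 0
    for i in range(8):
        # Sample each data bit in the middle of its bit period.
        sample = line[centre + (i + 1) * samples_per_bit]
        value |= sample << i
    return value
```

The receiver has no clock line to rely on: it times every sample from the leading edge of the start bit, which is why both ends must agree on the bit rate in advance.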

Asynchronous operation means that there is no computer clock related

specification as to the time that events will occur. Another example of asynchronous

transfer occurs within the computer. Generally, events and data transfers that are

initiated by interrupts are considered to be asynchronous. Most of the peripheral

devices that are found on Motorola microcontrollers will allow either synchronous or

asynchronous notification of the program that the peripheral business is completed.



2.5.2 Microcontroller Memory

In a microcontroller, the program instructions are usually stored in a memory

type called read-only memory (ROM). ROM is usually programmed by a special

mask during the manufacture of the microcontroller and is called masked ROM. ROM

is the least expensive means of storing a program in a microcontroller, especially for

high volume manufacturing.

There are at least two means for the end user of the microcontroller to place

the program into the chip. The first is called erasable programmable read-only

memory (EPROM). EPROM is a memory technology that can be erased by

exposing it to high-energy ultraviolet light. The EPROM requires the application of a

high voltage to be programmed. The memory can be programmed with either a

development system or a special programming board designed specifically to program

the microcontroller.

Packages that contain EPROM have a quartz glass window through which the

ultraviolet light can pass with minimum attenuation. These packages are quite

expensive, and therefore, microcontrollers with EPROM are usually too expensive to

use. EPROM was used for development purposes in the past, but it is just too

expensive in light of more recent developments to be used for that purpose today.

For limited production purposes, a less expensive version of the EPROM chip

is available. This is the one time programmable (OTP) chip. An OTP chip has the

exact same silicon component as an EPROM, but it is packaged in a standard plastic

package. This package is much less expensive than the windowed package discussed

previously. However, once the chip has been programmed, the program contents

cannot be changed.

There is yet another means of storing programs or, in some instances, data in a

microcontroller. This technique is called electrically erasable programmable read-only

memory (EEPROM). EEPROM is programmable from instructions within the

microcontroller. EEPROM also requires a high programming voltage. If there are



large blocks of EEPROM on the chip, the programming voltage is usually applied

during the programming cycle through a pin connected to an external voltage source.

In cases where the amount of EEPROM to be programmed is relatively small, a

charge pump on the microcontroller chip will allow the EEPROM to be programmed

with no externally applied voltage. The amount of EEPROM that can be programmed

with an on-board charge pump is usually so small that it is not useful for storing

program instructions. But onboard EEPROM can be quite useful in the storage of data

generated by the program that must be saved through a power-down cycle. Sometimes

in the execution of the program, some data are generated that must be saved for later

use. These data are called volatile data or variables, and are usually stored in random

access memory (RAM). Careful design of a program will usually result in the need

for much less RAM than ROM. In most microcontrollers, the amount of RAM is

usually 60 to at most a few hundred bytes. The amount of ROM, EPROM or

EEPROM usually runs from 1000 bytes upwards to a few tens of thousands of bytes.

EEPROM is quite expensive, and has been replaced by a newer technology

called FLASH memory. FLASH programs in a manner similar to EEPROM and it is

inexpensive enough to allow rather large amounts of programmable memory on a

microcontroller chip. You will find chips with 30,000 bytes and more of FLASH and

the intent is to use these chips for production runs. The FLASH is programmed as part

of the production cycle.

The architecture of some (e.g. Motorola) microcontrollers is strictly Von

Neumann. That is, within the microcontroller chip, there is only one data bus over

which all program, data, and input/output must pass. In a Harvard architecture system,

each of these different data types will have a dedicated bus over which the

information will pass. Therefore, the Harvard architecture microcontroller is able to

access data, program, and I/O simultaneously. The simultaneous availability of these

different data paths can result in a significant increase in overall processor speed. It

also increases the area of the microcontroller die and, hence, the cost of the

microcontroller. In general, most of the applications to which the microcontrollers are

directed do not require extreme speed. Thus, the Von Neumann architecture is

completely satisfactory.



2.5.3 Input/Output

Usually microcontrollers use an architecture called memory-mapped I/O. Each

I/O device's input and output registers, its control registers, and status registers are

mapped into memory locations. I/O transactions require no special computer

instructions. It is merely necessary to know the memory locations of the pertinent

registers and the uses of the register bits to be able to handle any I/O function. Listed

below are brief descriptions of several microcontroller I/O peripherals found on most

of the microcontrollers. Not all of these peripheral systems are found on each

microcontroller. It is possible to pick and choose between needs for the several

peripheral systems and select a microcontroller that has exactly those peripherals

required.

2.5.4 Timer Subsystems

There are four popular timer systems that you will find on different

microcontrollers. The first is a general-purpose timer. Motorola refers to the general-

purpose timers as either 8- or 15-bit timers. These timers are different. The 8-bit

system contains a prescaler that divides down the system clock. The output from the

prescaler is fed into a counter that counts down from the value stored in it. When

the counter underflows, a flag is set, and an interrupt can be executed. The 15-bit

timer is a strictly Motorola name, and it is even simpler than the 8-bit timer. This

timer has a 15-bit minimally programmable prescaler. An interrupt can be taken from

two locations in this ripple counter.

A second class of timer is the 16-bit timer. This timer is often called a general

purpose timer. These timers contain a 16-bit counter that is clocked by the system

clock. There are two associated subsystems: the first is called an input capture

system, and the second is the output compare.

The input capture system simply captures the value of the system timer

counter when an input occurs. These inputs can set a flag or request an interrupt so the

input can be processed either synchronously or asynchronously. The important fact is



that the exact time of the input relative to the 16-bit clock is saved when the input

occurs. Applications for input capture systems are interpulse period measurements or

frequency measurements.
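
As a sketch of an interpulse period measurement, the Python functions below subtract two captured counter values. The function names and the timer clock frequency used in the note are assumptions for illustration; the modular subtraction is valid provided the counter wraps at most once between the two captures.

```python
def period_ticks(first_capture, second_capture, counter_bits=16):
    # Modular subtraction gives the right answer even if the free-running
    # counter rolled over once between the two captures.
    return (second_capture - first_capture) % (1 << counter_bits)

def frequency_hz(first_capture, second_capture, timer_clock_hz):
    # One period of the input signal lasts period_ticks / timer_clock_hz seconds.
    return timer_clock_hz / period_ticks(first_capture, second_capture)
```

With an assumed 2 MHz timer clock, captures of 1000 and 3000 correspond to a period of 2000 ticks, that is, a 1 kHz input.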

The output compare system allows the programmer to specify a time relative

to the 16-bit counter when an output is to occur. This time is calculated by adding the

time offset value to the current value of the 16-bit counter. This result is stored in the

output compare register. When the 16-bit counter counts to the value in the output

compare register, the output occurs, a bit is set, and an interrupt can be processed if

desired.
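
The calculation of the compare value can be sketched in Python as follows. The function names are illustrative; the key point is that the sum is truncated to the counter width, so the compare value wraps around in the same way as the counter itself.

```python
def output_compare_value(current_count, offset_ticks, counter_bits=16):
    # Add the desired delay to the current count, wrapping like the counter.
    return (current_count + offset_ticks) % (1 << counter_bits)

def ticks_until_match(current_count, compare_value, counter_bits=16):
    # Number of counter ticks before the counter reaches the compare value.
    return (compare_value - current_count) % (1 << counter_bits)
```

For instance, a delay of 0x0200 ticks requested when the counter reads 0xFF00 gives a compare value of 0x0100, which the counter reaches after it has rolled over.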

Input capture and output compare functions are sometimes called high-speed

inputs and outputs. The number of input capture and output compare systems varies

from as few as one each to as many as 16 programmable timers, each of which can be

either input capture or output compare.

There is another style of timer subsystem that is used on high-end

microcontrollers. This system is called the timer processor unit (TPU). In most

conventional computers, the contents of a memory location are called an operand, and

the processor has built-in operators that operate on the operands. A TPU is also a

computer, but rather than using memory location contents as operands, time is the

main operand used by the TPU. Most TPUs contain many complex systems to

implement their operation. The TPU of the M68300 family and the M68HC16 family

contains sixteen registers, each of which can be operated as either an input capture or

an output compare. Each output compare can have its events coupled to other registers

to control intricate timing events with fine time resolution. We will not see the direct

programming of a TPU in this text, but we will see some of the types of events that

are controlled by the TPU programmed with the usual 16-bit timer.

On the newer computers, such as the M·CORE architecture, a time-of-day

(TOD) clock has been introduced. This clock is based on a 32768-Hz watch crystal.

These crystals are readily available, small, very accurate, and quite inexpensive. Their



only problem is that they are slow, and are not very good for fine time measurements

unless the crystal is used as a time base to a frequency synthesizer.
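
The arithmetic behind the choice of a 32768-Hz crystal, and behind its limited resolution, can be checked in a few lines of Python. The variable names are illustrative.

```python
CRYSTAL_HZ = 32768   # watch crystal frequency quoted in the text

# 32768 is 2**15, so a 15-stage binary divider yields exactly one tick per second.
divider_stages = CRYSTAL_HZ.bit_length() - 1

# Finest time step the undivided crystal can resolve, in microseconds.
resolution_us = 1_000_000 / CRYSTAL_HZ
```

The resolution works out to about 30.5 microseconds, which is coarse compared with a timer clocked at several megahertz.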

Another timer function found on most microcontrollers is the computer

operating properly (COP) or watchdog timer. Most microcontrollers are placed in

embedded controls. That is, the microcontroller is a part of a larger system, and

usually an operator never deals directly with the microcontroller. Even though great

care has been taken in the design of the microcontroller, it is possible to cause these

devices to get lost from the program that they are executing. The power might dip, or

a large transient magnetic field might cause the part to go into abnormal operation. In

such a case, the easiest way to restore normal operation is to send the part through a

reset sequence. Such a sequence will restore all of the initial internal status of the

microcontroller, execute the initialization code procedure of the program, and restart

the execution of the application loop. A COP timer provides just this function. A COP

timer is a timer with a relatively long period. Once the COP timer is started, it is

necessary for the main program to reset the COP periodically prior to the expiration of

the COP period. The COP timer is never allowed to time out. If the computer gets

lost, the program no longer resets the COP, so the timer will eventually overflow, and

this operation causes the microcontroller to reset. Therefore, if the part ever gets lost

from its normal program sequence, the COP will force a reset and restore the normal

operation of the system.
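
The behaviour of a COP timer can be sketched as a small Python model. The class and method names are hypothetical, and the reset is represented only by a counter; on a real part the timeout would restart the microcontroller.

```python
class CopTimer:
    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.count = 0
        self.resets_forced = 0

    def service(self):
        # Called by a healthy main loop before the COP period expires.
        self.count = 0

    def tick(self):
        self.count += 1
        if self.count >= self.timeout_ticks:
            # Timed out: the program is assumed lost, so force a reset.
            self.resets_forced += 1
            self.count = 0
```

As long as the main loop calls `service()` more often than once per timeout period, `resets_forced` stays at zero; if the calls stop, the next timeout forces a reset.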

2.5.5 Digital Input/Output

Most microcontrollers have several digital I/O ports. Usually a port consists of

eight or fewer bits, and the bits in these ports can be outputs, inputs, or often bit

programmable as either input or output bits. If an input/output port has programmable

I/O, it will have an associated data direction register—DDRA, DDRB, and so forth. The

ports are usually named PORTA, PORTB, and so forth. DDRA is associated with

PORTA. Each bit in DDRA has a corresponding bit in PORTA. If a bit in DDRA is set,

the corresponding bit in PORTA is an output. The same is true for PORTB, PORTC,

PORTD, PORTE, and so forth if these ports exist on the part.



A port pin can be made into an output. When this occurs, this pin becomes a

latched output. In other words, when this bit is set it will remain set until it is reset by

the program, and vice versa. Just because a port pin is designated to be an output does

not mean that its state cannot be read by the computer. When a port is read in, the

state of all of the outputs as well as the state of the inputs will be shown in the result.
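
A minimal Python model of one bit-programmable port follows. The class `Port` and its attribute names are assumptions for illustration; on a real microcontroller the data direction register and the port would be memory-mapped registers at addresses given in the data sheet.

```python
class Port:
    def __init__(self):
        self.ddr = 0x00        # data direction register: 1 = output, 0 = input
        self.latch = 0x00      # output latch written by the program
        self.pins_in = 0x00    # levels driven onto the pins from outside

    def write(self, value):
        # The written value is latched and held until the program changes it.
        self.latch = value & 0xFF

    def read(self):
        # Output bits read back from the latch; input bits reflect the pins.
        return (self.latch & self.ddr) | (self.pins_in & ~self.ddr & 0xFF)
```

With the lower four bits set as outputs, writing 0xA5 while the external pins present 0x3C reads back as 0x35: the low nibble comes from the latch and the high nibble from the pins.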

Some I/O pins are multiplexed and serve multiple functions. For example,

microcontrollers with analog-to-digital converters, ADC, usually allow the ADC pins

to serve as digital input pins as well. In that case you need merely read the input port,

and those pins that are above the high threshold will indicate one, and those below the

low threshold will indicate zero. Reading the port does not affect the ADC operation

at all.

2.5.6 Analog-to-Digital Converters

The ADC subsystem on most microcontrollers consists of a single successive

approximation analog-to-digital converter preceded by an analog multiplexer that can

switch the converter to any of several input pins. The program controls this switching.

The electromagnetic environment of the surface of a microcontroller die is about as

bad as can be found anywhere. Therefore, attempts to do fine resolution

measurements of analog voltages in these parts are fraught with problems. Most ADCs

use a resistive ladder to act as a digital-to-analog converter. The inputs to this ladder

are sequenced in a prescribed manner to build a voltage that matches the voltage

being measured. The input to the D-to-A is then the digital equivalent to the voltage

being measured.
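
The successive approximation procedure can be sketched in Python as below. This is an idealised model with a perfect D-to-A converter and comparator; the function name, the reference voltage parameter and the default 8-bit width are assumptions for illustration.

```python
def sar_convert(v_in, v_ref, bits=8):
    """Successive approximation: decide one bit per step, msb first."""
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)
        v_dac = v_ref * trial / (1 << bits)   # ideal D-to-A output for the trial code
        if v_in >= v_dac:                     # comparator: keep the bit if not too high
            code = trial
    return code
```

Each pass through the loop settles one bit, so an 8-bit conversion needs eight comparisons regardless of the input voltage. With a 5 V reference, an input of 2.5 V converts to code 128 in this model.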

Precision resistors are very difficult to manufacture on silicon, and even

precision matching between resistors is extremely difficult. While making precision

capacitors is very difficult on a silicon die, it is possible to make several capacitors

with highly accurate ratios between the capacitor values. Therefore, the approach is to

use a set of matched capacitors and a charge balance technique to accomplish the

successive approximation of the analog voltage. This method works well, and 8- and



10-bit systems are available on microcontrollers with plus or minus one-half bit

accuracy.

2.5.7 Serial Input/Output

Where would the computer be without serial input/output? The serial system

was probably the first direct human interface with any computer system. It has

expanded, and today, relatively low-speed asynchronous serial interfaces are used for

terminal and modem and network interfaces. High-speed synchronous serial links are

used for all of the above plus inter-computer connections, hardware peripheral

communications, and other types of devices where high-speed, secure communication

is required.

Many microcontrollers have both asynchronous and synchronous

communications peripherals built in. Usually, an asynchronous interface is called a

serial communications interface (SCI) while the synchronous interface is called a

serial peripheral interface (SPI).

Typically SCI systems can communicate at any of the popular asynchronous

serial bit rates. These systems have built-in baud rate generators, double buffered

input and output registers, and all of the error detection found on a universal

asynchronous receiver-transmitter (UART) chip. These I/O devices can be either

polled or interrupt driven by the computer portion of the microcontroller.

The SPI is designed to communicate at high speeds with other

microcontrollers or perhaps with hardware devices with a synchronous serial

interface. These devices typically run at megabit per second rates. Since synchronous

systems require a system clock, each microcontroller SPI can act as either a master

or a slave. The main difference between the master and the slave is which chip

generates the system clock. The master generates the system clock, and the data are

clocked into and out of the slave by the system clock. Communications with the

microcontroller and the SPI can be either polled (synchronous) or via interrupt

controller (asynchronous).
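The master/slave relationship can be illustrated with a small host-side simulation. This is a sketch only: the SpiRegs type and the spi_exchange function are illustrative names, not part of any microcontroller's register set, and the model assumes 8-bit shift registers clocked most significant bit first.

```c
#include <assert.h>
#include <stdint.h>

/* Contents of the two 8-bit shift registers after a transfer. */
typedef struct {
    uint8_t master;
    uint8_t slave;
} SpiRegs;

/* Simulated SPI transfer: the master generates eight clock pulses; on each
 * pulse one bit moves from master to slave and one from slave to master. */
SpiRegs spi_exchange(uint8_t master, uint8_t slave)
{
    SpiRegs r = { master, slave };

    for (int clock = 0; clock < 8; clock++) {
        uint8_t mosi = (uint8_t)((r.master >> 7) & 1u);  /* master out */
        uint8_t miso = (uint8_t)((r.slave >> 7) & 1u);   /* slave out  */
        r.master = (uint8_t)((r.master << 1) | miso);
        r.slave  = (uint8_t)((r.slave << 1) | mosi);
    }
    assert(r.master == slave && r.slave == master);  /* bytes have swapped */
    return r;
}
```

After eight clocks the two bytes have changed places, which is why a synchronous transfer always sends and receives at the same time.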



Different Controllers

Not all of these peripheral systems are found on each microcontroller. It is

possible to pick and choose between needs for the several peripheral systems and

select a microcontroller that has exactly those peripherals required. The smallest

microcontroller has only a 15-bit timer, and the most complete MC68HC05 part has

everything but a SPI system. All varieties in between these extremes exist.

2.6 Microcontroller characteristics & applications

Arising from their 'embedded control' environment, microcontrollers usually

have the following features:

input/output intensive, i.e. they are capable of direct interface to a significant

number of sensors and actuators

a high level of integration, with many peripheral devices included 'on-chip'

physically small

comparatively simple program and data storage requirements

ability to operate in the real-time environment

an instruction set optimized for the embedded environment, e.g. yielding

compact code, limited arithmetic and addressing capability, strong in bit

manipulation

low cost

In many microcontroller applications either or both of the following features

are also essential:

an ability to operate in hostile environments, for example of high or low

temperature, or high electromagnetic radiation;

a low power capability, and features which ease the use of battery power.

Microcontroller Applications

There is a huge range of microcontroller applications. Some are drawn from

volume markets – the motor car, domestic appliances, mobile phones and toys. These

applications are sold in such high volume that dedicated controllers are frequently

developed for them. Others, like medical or scientific instruments, are sold in smaller

numbers, and are more likely to make use of the wide variety of general-purpose



controllers that are available. At one extreme of complexity, simple (and very cheap)

controllers are used to replace 'glue logic' in a digital system. At the other extreme,

advanced 32-bit controllers perform sophisticated signal processing activities.

2.7 Some example controllers

The manufacturer usually develops a family of microcontrollers all based

around one core, where the core contains the CPU and its surrounding control

features. The core defines the instruction set, and hence keeping the core design

constant ensures software compatibility between different members of a processor

family. To the core, and on the same IC, can be added the peripheral devices which

seem best to meet a particular need. Once a company has committed itself to

designing with a particular microcontroller family, it is reluctant to change, but looks

to the manufacturer to supply it with the necessary technological advances, based

around a familiar core. Infrequently, the manufacturer makes a step change by

introducing a new core.

Every microcontroller is different, and each has its own unique combination of

core and peripherals. Figure 2.8 shows, in block diagram form and with no

interconnections, the features that might be found in a simple general-purpose

controller. The core is the element that remains constant for the whole family built

around it. Ideally all memory is on-chip, and several different memory technologies

may be applied to meet the differing needs of program and data storage.

Interconnection to the outside world is through a number of parallel and serial ports.

A counter/timer is available for event counting, or to measure or generate timing

intervals.

Figure 2.8: An example microcontroller block diagram.



Microchip Inc. and the PIC microcontroller

It was the General Instrument Corporation, back in the late 1970s, that first

produced the PIC microcontroller. In its early years it did not make a wide impact.

The design was later taken over by Microchip Inc., and PICs are now one of the

fastest moving families in the 8-bit arena, in more senses than one. First, they run very

fast; second, the family is growing at a tremendous rate. PICs cover a very wide range

of 8-bit operation. At the lower end, they are simpler, cheaper and smaller than most

devices that the competition can offer, and are thus used in situations where

controllers would not be thought of as the right solution, even down to simple glue

logic applications. At the high end, however, they are quite ready to take on the best

of the 8-bit competition, with sophisticated devices equipped with excellent

peripherals. PICs have made themselves particularly attractive to the student and low-

budget developer. Development tools (both hardware and software) are cheap and

readily available. Microchip offers five closely related families of microcontroller, as

shown in Table below. All PIC controllers use a RISC-like structure, with Harvard

architecture and pipelined instruction execution. This leads to one of the strengths of

the PIC family: a very high instruction throughput.

The Philips 80C552 microcontroller

As Intel was the first company to produce a microprocessor, it seems right that

it was also the first to produce a microcontroller. It did this in 1976 with the MCS-48

(appearing in three versions, the 8035, 8048 and 8748). In its time the MCS-48 was

revolutionary. The 8748 had on-chip ultraviolet erasable programmable read-only

memory (EPROM), 64 bytes of RAM, and three input/output ports. It attracted many



adherents. In 1980 Intel launched its successor, the 8051. The 8051 took up where the

48 left off, and has also become firmly embedded, in more senses than one, in the

microcontroller world. Though the 8051 itself is now an old device, many companies

(for example Atmel, Dallas, Philips and Siemens) offer controllers based on the 51

core, and further developments are repeatedly being produced. As one of the

manufacturers who have adopted the 8051 design, Philips has developed many

variants. The 80C51 is a CMOS version of the 8051, and Philips has extended this

into a wide-ranging family. Its practice has been to use the whole of the 80C51 as

core, and to add further peripherals to this.

The Motorola 68HC05/08 microcontrollers

Motorola was early in the microprocessor field, but was not the first. By the

time it entered, with the 6800, it was able to offer a device which enjoyed remarkable

longevity. From the 6800 it developed further 8-bit conventional microprocessors

(e.g. the 6809), and also a number of single-chip controllers, starting with the 6801.

These led to the 68HC11, a sophisticated and widely used microcontroller. The HC

infix indicates the new high-speed CMOS (Complementary Metal Oxide

Semiconductor) technology with which it is made. From the 68HC11 the 68HC12 and

68HC16, both 16-bit controllers, have been developed.

An indirect development of the 6800 family was the 6805 (M146805 in full),

available initially in HMOS (High-Density N-Channel MOS) and CMOS versions.

Here the CPU was simplified, for example by the removal of the second (B)

accumulator of the 6800, reduction in addressing capability, and consequent reduction

of certain register sizes. As one of the earlier CMOS controllers, the 6805 had a great

impact on low-power applications. The 6805 was subsequently upgraded and reissued

using 'HC' CMOS technology. This has enjoyed very widespread use as a simple and

low-cost microcontroller. Motorola claims that over 2 billion (2 × 10^9) units of the

68HC05 have been sold. The variants are too numerous to list, but include

devices targeted specifically for automotive, computer, consumer, industrial,

telecommunications, TV and video applications.

Since the late 1990s the 68HC05 has been in the process of being replaced by

the 68HC08, which provides a direct upgrade. Both the '05 and the '08 use the '7'



infix to indicate EPROM or OTP (One-Time Programmable) memory version (e.g.

the 68HC705P) and the '9' infix to indicate Flash memory. All of the 68HCXX

microcontroller families have some similarity in architecture and instruction sets, so it

is a comparatively easy task to move from one to another, selecting the device most

appropriate for the job.

2.8 Programming Microcontrollers

Most programmers are used to having an operating system that handles such

mundane things as I/O, memory management, time management, program loading,

error processing, interdevice or intertask communications, and so forth. Be prepared

for a giant step backwards when you address the microcontroller. There is usually no

operating system, no libraries of useful functions, no I/O handling, nothing but a bare-

bones computer with a bunch of hard-to-tame peripheral components onboard the

single-chip device.

C compilers for the microcontrollers have been available long enough that

they are thoroughly tested and do a good job of creating proper code. Anyone who has

programmed a microcontroller in assembly language knows that the programs must

be very direct and have no fancy overhead. Memory is strictly limited, and the

compiler must generate assembly code that is as resourceful as can be created by any

thoroughly qualified assembly language programmer for the machine.

The development environment, while quite sophisticated in terms of how it

works, does little for the programmer in terms of direct help in debugging a program.

There are two different types of development systems that are in common use. Both

of these systems require a host computer to run the device. The simplest of these

systems goes by names like evaluation module, evaluation system, or evaluation

board. These devices are usually board-level products that require a power supply in

addition to a host computer.

The software to run the development boards is merely a good terminal

emulator. Assemblers and linkers for the different chips are provided as part of the

development board. The programmer writes the code for the part in the host



computer. This code is assembled, compiled, and linked in the host computer. The

code is then down-loaded to the development board through either a serial or a

parallel link depending upon the individual system.

The development board has the microcontroller that is to be emulated on

board. This microcontroller sometimes operates in a nonuser mode that allows

internal bus access. A second computer on the development board controls the

operation of the microcontroller. Code delivered from the host is put into memory

accessed by the microcontroller, and the microcontroller can operate as if the code

were contained within its internal memory. All of the I/O lines associated with the

microcontroller are brought to a header on the development board, and a cable can be

attached to this header to a plug-in device that plugs into a target board. This target

system then operates as if it had a programmed microcontroller plugged into its

socket.

The microcomputer on the development board has a complete monitor system

in its firmware. This monitor provides communications with the host, down-loading

and up-loading capability and, most important, complete debugging firmware for the

microcontroller.

There is a single line assembler and disassembler in the firmware. This

package allows the programmer to examine and change memory in assembly

mnemonics. The microcontroller program can be single stepped, run, address

breakpointed, and the memory can be displayed in normal hexadecimal format. The

microcontroller runs at full speed when emulating operation in a target board. An

experienced programmer will be able to debug code in a microcontroller with the help

of such a development board. There is additional software available that provides a

nice display of all pertinent information in a single screen on the host computer. In

this area, you will also find that the microcontroller can be controlled from a display

of C source code on the host computer. This technique is called source level

debugging.

On later chips, another feature is incorporated to help the development

environment. This feature is called Background Debug Mode, or ONCE. Both of



these similar operations allow debug to take place in an external computer without

any access to the microcontroller resources such as interrupts or memory. When a

chip is put into BDM, certain pins become a special serial input/output port. There are

several commands that can be delivered to this port from an external computer. These

commands allow the computer to set memory, examine memory, examine registers,

set and clear registers, execute code, set and clear break points, and so forth. All of the

operations normally needed to debug a program can be executed through this special

serial port. There is no need for an on-board monitor on the microcontroller, and it is

not necessary to make use of the chip interrupts by the debugger during the debug

operation. All of the programming needed for the debugger can reside in a host

computer. Most modern chips have this type of interface, which greatly simplifies

debugging of microcontrollers.

All of the above capabilities are available with the development boards.

Another level of capability is available. These devices are box level, and usually have

a built-in power supply. Most development systems require a host computer, and

usually they come with special software to interface with the host computer. These

systems have all of the capabilities outlined above plus some significant

improvements. The breakpoint capability of these systems is much improved over the

simple address breakpoint above. Here a complicated breakpoint can be employed that will break the

program operation on read or write, at any data or address location, on access of data or program, or access

of a range of data or address locations. Also, the breakpoint can occur after a specified number of

occurrences of the breakpoint conditions.
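As a rough illustration of what such a complex breakpoint has to evaluate on every bus cycle, the following sketch models one in C. The Breakpoint structure, its field names and breakpoint_check are hypothetical; a real development system does this in hardware.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { ACCESS_READ = 1, ACCESS_WRITE = 2 } AccessType;

/* Hypothetical description of one complex breakpoint. */
typedef struct {
    uint16_t lo, hi;          /* inclusive address range to watch     */
    unsigned access_mask;     /* ACCESS_READ, ACCESS_WRITE, or both   */
    unsigned trigger_count;   /* break on this many matching accesses */
    unsigned hits;            /* matching accesses seen so far        */
} Breakpoint;

/* Called once per bus cycle; returns 1 when execution should stop. */
int breakpoint_check(Breakpoint *bp, uint16_t addr, AccessType type)
{
    assert(bp != 0 && bp->lo <= bp->hi && bp->trigger_count >= 1);

    if (addr < bp->lo || addr > bp->hi)
        return 0;                         /* outside the watched range */
    if ((bp->access_mask & (unsigned)type) == 0)
        return 0;                         /* wrong kind of access      */
    bp->hits++;
    return bp->hits >= bp->trigger_count;
}
```

A breakpoint set up as { 0x0100, 0x010F, ACCESS_WRITE, 2, 0 } ignores reads and out-of-range writes, and stops the program on the second write that falls inside the range.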

Another major difference in the development systems is the trace buffer. A

trace buffer is a memory that is as many as 48 or 64 bits wide. Each clock cycle of the

microcontroller, the condition of all address bits, the data bus, the internal

microcontroller control bus, and as many as 16 external test point lines are captured in

the trace buffer.

Usually, the trace buffer is 4 to 16 kilowords deep, so it can hold a significant

number of microcontroller clock cycles. Even if the microcontroller is running slowly,

one million clock cycles per second, such a trace buffer represents an insignificant

execution time. To help make the data contained in the trace buffer useful, trace buffer



capture can be controlled by a system that is the same as the breakpoint operation.

Therefore, the portion of the program that is traced is under the detailed control of the

programmer.
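To put numbers on the trace depth, the time window covered by a buffer that stores one entry per clock cycle is simply the depth divided by the clock rate. The helper below is a minimal sketch; trace_window_us is an illustrative name, and the one-entry-per-cycle assumption is taken from the description above.

```c
#include <assert.h>

/* Time span, in microseconds, covered by a trace buffer that stores one
 * entry per clock cycle. */
unsigned long trace_window_us(unsigned long depth_words, unsigned long clock_hz)
{
    assert(clock_hz > 0);
    /* A 64-bit intermediate avoids overflow for deep buffers. */
    return (unsigned long)((unsigned long long)depth_words * 1000000ull
                           / clock_hz);
}
```

At one million clock cycles per second, a 16-kiloword buffer (16384 entries) covers 16384 microseconds, a little over 16 ms of execution, which is why selective capture matters.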

The data in the trace buffer can be displayed in several different manners. The

simplest, of course, is to print to the computer screen the I/O pattern of all the lines

captured. This type of display is extremely difficult to interpret, but it is useful in

some cases. To help the programmer determine where the microcontroller is

operating, it is possible to read the data bits and display a disassembled version of the

code being read into the microcontroller. This display is also quite useful in

debugging the code.

Yet another display is called a logic analyzer. A logic analyzer is an

oscilloscope display that shows the logical status of the various lines captured in the

trace buffer. A logic analyzer is a separate device, but it can have a built-in

disassembler that displays the disassembled code along with the condition of the

designated lines. The devices with logic analyzers and trace buffers are quite a bit

more expensive than the development boards discussed earlier. Some of the

development systems provide source level debugging capability for high-level

languages like C.

Another approach to development systems has been made available in some of

the newer microcontrollers. The microcontrollers from the MC68HC16 family

and those from the MC68300 family all have a background mode of operation.

When operating in the background mode, these chips stop their normal

computing and start

serial communications with an external computer. The background mode can be entered as the result of an

internal command or an external signal. There are enough debug commands that can be communicated over

this port to allow complete debug of any program that the microcontroller might be running. Minimum

external circuitry is needed to support the debug mode, so these high-powered chips can operate as their own

development environment. Here, the development support is mostly software contained within the host

computer, and the deliverable system can contain all of the essential components of a development system.

2.9 Summary



The microcontroller is a microprocessor intended for small-scale control

applications. It integrates a conventional microprocessor core and a range of

peripheral devices on a single IC, at the smallest size and lowest cost possible. A

family of controllers is based around the same core, but with different peripherals and

IC packaging, optimized for different applications.

While all microprocessors differ, there are some fundamentally different

options in processor design, which have major significance for the final performance.

These options include RISC vs. CISC, conventional von Neumann vs. Harvard, and

the option of pipelining. The PIC, 80C51 and MC68HC05/08 series of

microcontrollers are all successful and well-established 8-bit controllers, each with

their own unique attributes and advantages.

2.10 Self Test

1. The microprocessor keeps track of the instructions to execute using……….

a. Program counter

b. Accumulator

c. Instruction Counter

d. All the above

2. A microprocessor based system consists of ……………

a. ALU

b. ROM

c. Keyboard

d. All the above

3. A microcontroller consists of ……………….

a. Microprocessor

b. Memory

c. Peripherals

d. All the above

4. In the ……………, program and data memory share the same address and

data buses.

5. RISC computers have fewer instructions in their instruction set compared to the

CISC computers. (True/False)



6. In a microcontroller, the program instructions are usually stored in Read only

Memory. (True/False)

7. Usually microcontrollers use what is called memory-mapped I/O. (True/False)

8. All microcontrollers have the same combination of core and peripherals. (True/False)

9. The 8051 is a ----bit microcontroller.

10. In instruction pipelining, the instructions of the processor may have

different durations of execution. (True/False)

Answers

1. a

2. d

3. d

4. von Neumann architecture

5. True

6. True

7. True

8. False

9. 8

10. False

2.11 Questions

1. Explain the working of a microprocessor-based system.

2. Briefly explain the different microprocessor design options. Which one will

you choose? Why?

3. With a block diagram explain a microcontroller.

4. Explain the different subsystems of microcontrollers.

5. What are the characteristics of a microcontroller?

6. How is microcontroller programming different from conventional

programming?

UNIT 3



Embedded Programming Basics

Contents

3.1 Introduction

3.2 Objectives

3.3 Example Embedded Program

3.4 Compiling, Linking and Locating

3.5 Downloading and Debugging

3.6 Summary

3.7 Self Test

3.8 Questions

3.1 Introduction

Embedded systems programming is the development of programs, intended to

be part of a larger operating system or, in a somewhat different usage, to be

incorporated on a microprocessor that can then be included as part of a variety of



hardware devices. Embedded systems are usually programmed in a high-level language

that is compiled (and/or assembled) into executable ("machine") code. This

machine code is loaded into Read Only Memory (ROM) and is called "firmware",

"microcode" or a "microkernel".

In this unit we'll discuss the basic tools required for embedded programming.

Also we will look at an example program to have a feel of what embedded

programming is all about. The program we'll look at is similar in spirit to the "Hello,

World!" example found in the beginning of most other programming books. Unlike

most other programming, the execution of a particular program in embedded systems

depends on the target hardware. We will discuss the parts of the example program that

are dependent on the target hardware.

3.2 Objectives


What are the problems associated with embedded programming?

How to build a simple embedded program?

What are the tools used in building an embedded program?

Downloading and Debugging an Embedded program

3.3 An Example Embedded Program

Embedded systems are among the most difficult computer platforms for

programmers to work with. In some embedded systems, it might even be impossible

to implement the "Hello, World!" program. And in those systems that are capable of

supporting it, the printing of text strings is usually more of an endpoint than a

beginning. Most embedded systems lack a monitor or analogous output device. And



those that do have one typically require a special piece of embedded software, called a

display driver, to be implemented first.

Embedded programmers must be self-reliant. They must always begin each

new project with the assumption that nothing works - that all they can rely on is the

basic syntax of their programming language. Even the standard library routines might

not be available to them. These are the auxiliary functions - like printf and scanf - that

most other programmers take for granted. In fact, library routines are often as much a

part of the language standard as the basic syntax. However, that part of the standard is

more difficult to support across all possible computing platforms and is occasionally

ignored by the makers of compilers for embedded systems.

So you won't find an actual "Hello, World!" program in this unit. Every

embedded system will usually have at least one LED that could be controlled by

software. So, as a substitute for the "Hello, World!" program, we can think of a

program that blinks an LED at a rate of 1 Hz (one complete on-off cycle per second).

We will assume that only the basic syntax of C is available for our example.

The blinking LED program is shown below. This part of the program is

hardware-independent. However, it relies on the hardware-dependent functions

toggleLed and delay to change the state of the LED and to handle the timing,

respectively.

/*************************************************************
 * Function:    main()
 * Description: Blink the green LED once a second.
 * Notes:       This outer loop is hardware-independent. However,
 *              it depends on two hardware-dependent functions.
 * Returns:     This routine contains an infinite loop.
 **************************************************************/
void
main(void)
{
    while (1)
    {
        toggleLed(LED_GREEN);    /* Change the state of the LED. */
        delay(500);              /* Pause for 500 milliseconds.  */
    }
}   /* main() */

toggleLed

We will consider the Arcom board to explain a few hardware dependent

concepts. An Arcom board will usually have two LEDs: one red and one green. The

state of each LED is controlled by a bit in a register called the Port 2 I/O Latch

Register (P2LTCH, for short). This register is located within the very same chip as the

CPU and takes its name from the fact that it contains the latched state of eight I/O

pins found on the exterior of that chip. Collectively, these pins are known as I/O Port

2. And each of the eight bits in the P2LTCH register is associated with the voltage on

one of the I/O pins. For example, bit 6 controls the voltage going to the green LED:

#define LED_GREEN 0x40 /* The green LED is controlled by bit 6. */

By modifying this bit, it is possible to change the voltage on the external pin

and, thus, the state of the green LED. As shown in Figure 3.1, when bit 6 of the

P2LTCH register is 1 the LED is off; when it is 0 the LED is on.

Figure 3.1: LED wiring on the Arcom board



The P2LTCH register is located in a special region of memory called the I/O

space, at offset 0xFF5E. Registers within the I/O space of an 80x86 processor can be

accessed only by using the assembly language instructions IN and OUT. The C

language has no built-in support for these operations. Its closest replacements are the

library routines inport and outport, which are declared in the PC-specific header file

dos.h. Ideally, we would just include that header file and call those library routines

from our embedded program. However, because they are part of the DOS

programmer's library, we'll have to assume the worst: that they won't work on our

system. At the very least, we shouldn't rely on them in our example program.

An implementation of the toggleLed routine that is specific to the Arcom

board and does not rely on any library routines is shown below. The actual algorithm

is straightforward: read the contents of the P2LTCH register, toggle the bit that

controls the LED of interest, and write the new value back into the register. Although

this routine is written in C, the functional part is actually implemented in assembly

language. This is a handy technique, known as inline assembly, that separates the

programmer from the intricacies of C's function calling and parameter passing

conventions but still gives the full expressive power of assembly language.

The exact syntax of inline assembly varies from compiler to compiler. In this

example, the format used is the one preferred by the Borland C++ compiler. Borland's

inline assembly format is one of the best because it supports references to variables

and constants that are defined within the C code.

#define P2LTCH 0xFF5E    /* The offset of the P2LTCH register. */

/*******************************************************
 * Function:    toggleLed()
 * Description: Toggle the state of one or both LEDs.
 * Notes:       This function is specific to Arcom's Target188EB board.
 * Returns:     None defined.
 *********************************************************/
void
toggleLed(unsigned char ledMask)
{
    asm {
        mov dx, P2LTCH      /* Load the address of the register.  */
        in  al, dx          /* Read the contents of the register. */
        mov ah, ledMask     /* Move the ledMask into a register.  */
        xor al, ah          /* Toggle the requested bits.         */
        out dx, al          /* Write the new register contents.   */
    };
}   /* toggleLed() */
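Because the inline assembly only runs on the target, it can help to check the read-modify-write logic on a host PC first. The following is a minimal sketch under stated assumptions: simulatedP2LTCH is an ordinary variable standing in for the real register, its initial value of 0xFF (both LEDs off) is assumed, and toggleLedSim and ledIsOn are hypothetical helper names.

```c
#include <assert.h>

#define LED_GREEN 0x40   /* Bit 6, as in the text. */

/* Stand-in for the P2LTCH register so the logic can run on a host PC. */
static unsigned char simulatedP2LTCH = 0xFF;   /* assumed: all LEDs off */

/* Same read-modify-write sequence as the inline assembly. */
void toggleLedSim(unsigned char ledMask)
{
    unsigned char value = simulatedP2LTCH;     /* in  al, dx */
    value ^= ledMask;                          /* xor al, ah */
    simulatedP2LTCH = value;                   /* out dx, al */
}

/* The LED is wired active-low: a 0 bit means the LED is lit. */
int ledIsOn(unsigned char ledMask)
{
    assert(ledMask != 0);
    return (simulatedP2LTCH & ledMask) == 0;
}
```

Calling toggleLedSim(LED_GREEN) once clears bit 6 and lights the simulated LED; calling it again sets the bit and turns the LED off, without disturbing the other seven bits.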

delay

We also need to implement a half-second (500 ms) delay between LED

toggles. This is done by busy-waiting within the delay routine shown below. This

routine accepts the length of the requested delay, in milliseconds, as its only

parameter. It then multiplies that number by the constant CYCLES_PER_MS to

obtain the total number of while-loop iterations required to delay for the requested

time period.

/**********************************************************
 * Function:    delay()
 * Description: Busy-wait for the requested number of milliseconds.
 * Notes:       The number of decrement-and-test cycles per millisecond
 *              was determined through trial and error. This value is
 *              dependent upon the processor type and speed.
 * Returns:     None defined.
 ************************************************************/
void
delay(unsigned int nMilliseconds)
{
    #define CYCLES_PER_MS 260    /* Number of decrement-and-test cycles. */

    /* Widen before multiplying so the product cannot overflow a 16-bit int. */
    unsigned long nCycles = (unsigned long) nMilliseconds * CYCLES_PER_MS;

    while (nCycles--);
}   /* delay() */

The hardware-specific constant CYCLES_PER_MS represents the number of

decrement-and-test cycles (nCycles-- != 0) that the processor can perform in a single



millisecond. To determine this number we can use trial and error. We have used the

value 260 on the assumption that it is close to the ideal value, the one that would

make the LED blink at a rate of 1 Hz.
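The multiplication inside delay deserves care when the program is ported. On a compiler whose int is 16 bits wide, as is typical of 16-bit 80x86 compilers, the product of a 500 ms delay and 260 cycles per millisecond does not fit in 16 bits. The sketch below models both cases on a host PC; cyclesNarrow and cyclesWide are hypothetical names used only for this comparison.

```c
#include <assert.h>
#include <stdint.h>

#define CYCLES_PER_MS 260u

/* Loop count as a compiler with 16-bit int would compute it: the product
 * is truncated to 16 bits before being stored. */
uint32_t cyclesNarrow(uint16_t nMilliseconds)
{
    return (uint16_t)(nMilliseconds * CYCLES_PER_MS);
}

/* Loop count with the operand widened first, so the full product survives. */
uint32_t cyclesWide(uint16_t nMilliseconds)
{
    uint32_t n = (uint32_t)nMilliseconds * CYCLES_PER_MS;
    assert(n / CYCLES_PER_MS == nMilliseconds);   /* no wraparound */
    return n;
}
```

For a 500 ms request the widened product is 130000 iterations, while the 16-bit product wraps to 64464, which would make the delay roughly half as long as intended.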

That's all there is to the Blinking LED program. The three functions main,

toggleLed, and delay do the whole job. If you want to port this program to some other

embedded system, you should read the documentation that came with your hardware,

rewrite toggleLed as necessary, and change the value of CYCLES_PER_MS. In the

next few sections we will see how to build and execute this example program. Before

that, we will look at the role of infinite loops in embedded systems.

The Role of the Infinite Loop

One of the most fundamental differences between programs developed for

embedded systems and those written for other computer platforms is that the

embedded programs almost always end with an infinite loop. Typically, this loop

surrounds a significant part of the program's functionality - as it does in the Blinking

LED program. The infinite loop is necessary because the embedded software's job is

never done. It is intended to be run until either the world comes to an end or the board

is reset, whichever happens first.

In addition, most embedded systems have only one piece of software running

on them. And although the hardware is important, it is not a digital watch or a cellular

phone or a microwave oven without that embedded software. If the software stops

running, the hardware is rendered useless. So the functional parts of an embedded

program are almost always surrounded by an infinite loop that ensures that they will

run forever.

3.4 Compiling, Linking, and Locating

In this section, we'll examine the steps involved in preparing your software,

the example program, for execution on an embedded system. We'll also discuss the

associated development tools and see how to build the Blinking LED program.

Actually, embedded systems programming is not substantially different from the



programming you've done before. The only thing that has really changed is that each

target hardware platform is unique. Unfortunately, that one difference leads to a lot of

additional software complexity, and it's also the reason you'll need to be more aware

of the software build process than ever before.

The Build Process

There are a lot of things that software development tools can do automatically

when the target platform is well defined. This automation is possible because the

tools can exploit features of the hardware and operating system on which your

program will execute. For example, if all of your programs will be executed on IBM-

compatible PCs running DOS, your compiler can automate - and, therefore, hide from

your view - certain aspects of the software build process. Embedded software

development tools, on the other hand, can rarely make assumptions about the target

platform. Instead, the user must provide some of his/her own knowledge of the system

to the tools by giving them more explicit instructions.

The process of converting the source code representation of your embedded

software into an executable binary image involves three distinct steps. First, each of

the source files must be compiled or assembled into an object file. Second, all of the

object files that result from the first step must be linked together to produce a single

object file, called the relocatable program. Finally, physical memory addresses must

be assigned to the relative offsets within the relocatable program in a process called

relocation. The result of this third step is a file that contains an executable binary

image that is ready to be run on the embedded system.

The embedded software development process just described is illustrated in

Figure 3.2. In this figure, the three steps are shown from top to bottom, with the tools

that perform them shown in boxes that have rounded corners. Each of these

development tools takes one or more files as input and produces a single output file.



Figure 3.2: The embedded software development process

Each of the steps of the embedded software build process is a transformation

performed by software running on a general-purpose computer. To distinguish this

development computer (usually a PC or Unix workstation) from the target embedded

system, it is referred to as the host computer. In other words, the compiler, assembler,

linker, and locator are all pieces of software that run on a host computer, rather than

on the embedded system itself. Yet, despite the fact that they run on some other

computer platform, these tools combine their efforts to produce an executable binary

image that will execute properly on the target embedded system. This split of

responsibilities is shown in Figure 3.3.

Figure 3.3: The split between host and target



In this section and the next we will be using the GNU tools (compiler,

assembler, linker, and debugger) as examples, to explain the build process. These

tools are extremely popular with embedded software developers because they are

freely available (even the source code is free) and support many of the most popular

embedded processors. We will use features of these specific tools as illustrations for

the general concepts discussed. Once understood, these same basic concepts can be

applied to any equivalent development tool.

Compiling

The job of a compiler is mainly to translate programs written in some human

readable language into an equivalent set of opcodes for a particular processor. In that

sense, an assembler is also a compiler (you might call it an "assembly language

compiler") but one that performs a much simpler one-to-one translation from one line

of human-readable mnemonics to the equivalent opcode. Everything in this section

applies equally to compilers and assemblers. Together these tools make up the first

step of the embedded software build process.

Of course, each processor has its own unique machine language, so you need

to choose a compiler that is capable of producing programs for your specific target

processor. In the embedded systems case, this compiler almost always runs on the host

computer. It simply doesn't make sense to execute the compiler on the embedded system

itself. A compiler such as this - that runs on one computer platform and produces code

for another - is called a cross-compiler. The use of a cross-compiler is one of the

defining features of embedded software development.

The GNU C/C++ compiler (gcc) and assembler (as) can be configured as

either native compilers or cross-compilers. As cross-compilers, these tools support

an impressive set of host-target combinations. Table 3.1 lists some of the most popular

of the supported hosts and targets. Of course, the selections of host platform and

target processor are independent; these tools can be configured for any combination.

Table 3.1: Hosts and Targets Supported by the GNU Compiler



Host Platforms              Target Processors

DEC Alpha Digital Unix      AMD/Intel x86 (32-bit only)

HP 9000/700 HP-UX           Fujitsu SPARClite

IBM Power PC AIX            Hitachi H8/300, H8/300H, H8/S

IBM RS6000 AIX              Hitachi SH

SGI Iris IRIX               IBM/Motorola Power PC

Sun SPARC Solaris           Intel i960

Sun SPARC SunOS             MIPS R3xxx, R4xx0

X86 Windows 95/NT           Mitsubishi D10V, M32R/D

X86 Red Hat Linux           Motorola 68k

                            Sun SPARC, MicroSPARC

                            Toshiba TX39

Regardless of the input language (C/C++, assembly, or any other), the output

of the cross-compiler will be an object file. This is a specially formatted binary file

that contains the set of instructions and data resulting from the language translation

process. Although parts of this file contain executable code, the object file is not

intended to be executed directly. In fact, the internal structure of an object file

emphasizes the incompleteness of the larger program.

The contents of an object file can be thought of as a very large, flexible data

structure. The structure of the file is usually defined by a standard format like the

Common Object File Format (COFF) or Executable and Linkable Format (ELF). If you'll be

using more than one compiler (i.e., you'll be writing parts of your program in

different source languages), you need to make sure that each is capable of producing

object files in the same format. Although many compilers (particularly those that run

on Unix platforms) support standard object file formats like COFF and ELF (gcc

supports both), there are also some others that produce object files only in proprietary

formats. If you're using one of the compilers in the latter group, you might find that

you need to buy all of your other development tools from the same vendor.

Most object files begin with a header that describes the sections that follow.

Each of these sections contains one or more blocks of code or data that originated

within the original source file. However, these blocks have been regrouped by the



compiler into related sections. For example, all of the code blocks are collected into a

section called text, initialized global variables (and their initial values) into a section

called data, and uninitialized global variables into a section called bss.

There is also usually a symbol table somewhere in the object file that contains

the names and locations of all the variables and functions referenced within the source

file. Parts of this table may be incomplete, however, because not all of the variables

and functions are always defined in the same file. These are the symbols that refer to

variables and functions defined in other source files. And it is up to the linker to

resolve such unresolved references.
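A small C source file makes this grouping concrete. The names below are invented for illustration, and the comments describe where a typical compiler would place each item; the exact section names and placement depend on the toolchain.

```c
/* Initialized global variable: its storage and initial value
 * go into the data section. */
int boot_count = 3;

/* Uninitialized global variable: space is reserved in the bss
 * section, which is zeroed before main runs. */
int error_count;

/* Declared here but defined in some other source file (hypothetical).
 * Any use of it in this file would leave an unresolved entry in this
 * object file's symbol table for the linker to fix up. */
extern int sensor_value;

/* Function body: the generated instructions go into the text section. */
int increment_errors(void)
{
    return ++error_count;
}
```

The symbol table for this file would list boot_count, error_count, and increment_errors with locations inside their sections.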

Linking

All of the object files resulting from step 1 must be combined in a special way

before the program can be executed. The object files themselves are individually

incomplete, most notably in that some of the internal variable and function references

have not yet been resolved. The job of the linker is to combine these object files and,

in the process, resolve all of the unresolved symbols.

The output of the linker is a new object file that contains all of the code and

data from the input object files and is in the same object file format. The linker does this by

merging the text, data, and bss sections of the input files. So, when the linker is

finished executing, all of the machine language code from all of the input object files

will be in the text section of the new file, and all of the initialized and uninitialized

variables will reside in the new data and bss sections, respectively.

While the linker is in the process of merging the section contents, it is also on

the lookout for unresolved symbols. For example, if one object file contains an

unresolved reference to a variable named foo and a variable with that same name is

declared in one of the other object files, the linker will match them up. The

unresolved reference will be replaced with a reference to the actual variable. In other

words, if foo is located at offset 14 of the output data section, its entry in the symbol

table will now contain that address.



The GNU linker (ld) runs on all of the same host platforms as the GNU

compiler. It is essentially a command-line tool that takes the names of all the object

files to be linked together as arguments. For embedded development, a special object

file that contains the compiled startup code must also be included within this list. (See

the sidebar "Startup Code" later in this section.) The GNU linker also has a scripting

language that can be used to exercise tighter control over the object file that is output.

If the same symbol is declared in more than one object file, the linker is unable

to proceed. It will likely appeal to the programmer - by displaying an error message -

and exit. However, if a symbol reference instead remains unresolved after all of the

object files have been merged, the linker will try to resolve the reference on its own.

The reference might be to a function that is part of the standard library, so the linker

will open each of the libraries described to it on the command line (in the order

provided) and examine their symbol tables. If it finds a function with that name, the

reference will be resolved by including the associated code and data sections within

the output object file (i.e., static linking).

Unfortunately, the standard library routines often require some changes before

they can be used in an embedded program. The problem here is that the standard

libraries provided with most software development tool suites arrive only in object

form. So you rarely have access to the library source code to make the necessary

changes yourself.

Startup Code

One of the things that traditional software development tools do automatically is to insert startup code. Startup code is a small block of assembly language code that prepares the way for the execution of software written in a high-level language. Each high-level language has its own set of expectations about the runtime environment. For example, C and C++ both utilize an implicit stack. Space for the stack has to be allocated and initialized before software written in either language can be properly executed. That is just one of the responsibilities assigned to startup code for C/C++ programs.

Most cross-compilers for embedded systems include an assembly language file called startup.asm, crt0.S (short for C runtime), or something similar. The location and contents of this file are usually described in the documentation supplied with the compiler.

Startup code for C/C++ programs usually consists of the following actions, performed in the order described:

1. Disable all interrupts.

2. Copy any initialized data from ROM to RAM.

3. Zero the uninitialized data area.

4. Allocate space for and initialize the stack.

5. Initialize the processor's stack pointer.

6. Create and initialize the heap.

7. Execute the constructors and initializers for all global variables (C++ only).

8. Enable interrupts.

9. Call main.

Typically, the startup code will also include a few instructions after the call to main. These instructions will be executed only in the event that the high-level language program exits (i.e., the call to main returns). Depending on the nature of the embedded system, you might want to use these instructions to halt the processor, reset the entire system, or transfer control to a debugging tool.

Because the startup code is not inserted automatically, the programmer must usually assemble it himself and include the resulting object file among the list of input files to the linker. He might even need to give the linker a special command-line option to prevent it from inserting the usual startup code. Working startup code for a variety of target processors can be found in a GNU package called libgloss.

After merging all of the code and data sections and resolving all of the symbol

references, the linker produces a special "relocatable" copy of the program. In other

words, the program is complete except for one thing: no memory addresses have yet

been assigned to the code and data sections within. If you weren't working on an

embedded system, you'd be finished building your software now.

But embedded programmers aren't generally finished with the build process at

this point. Even if your embedded system includes an operating system, you'll

probably still need an absolutely located binary image. In fact, if there is an operating

system, the code and data of which it consists are most likely within the relocatable

program too. The entire embedded application - including the operating system - is

almost always statically linked together and executed as a single binary image.

Locating

The tool that performs the conversion from relocatable program to executable

binary image is called a locator. It takes responsibility for the easiest step of the three.

In fact, you will have to do most of the work in this step yourself, by providing



information about the memory on the target board as input to the locator. The locator

will use this information to assign physical memory addresses to each of the code and

data sections within the relocatable program. It will then produce an output file that

contains a binary memory image that can be loaded into the target ROM.
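The core of what a locator does can be sketched in a few lines of C. This is a simplified model, not how any particular tool is implemented: the struct section type and the locate function are hypothetical, and a real locator must also patch every address reference inside the code and data.

```c
#include <stddef.h>
#include <stdint.h>

/* One section of the relocatable program (hypothetical model). */
struct section {
    const char *name;
    uint32_t    size;      /* in bytes, known after linking          */
    uint32_t    address;   /* physical address, assigned by locate() */
};

/* Place the sections one after another inside a memory region that
 * starts at origin and is length bytes long.
 * Returns 0 on success, or -1 if the sections do not fit. */
int locate(struct section *sections, size_t count,
           uint32_t origin, uint32_t length)
{
    uint32_t next = origin;

    for (size_t i = 0; i < count; i++) {
        uint32_t used = next - origin;

        if (sections[i].size > length - used) {
            return -1;   /* region is full */
        }
        sections[i].address = next;
        next += sections[i].size;
    }
    return 0;
}
```

For a board with RAM at 0x00000 and ROM at 0x80000, the data and bss sections would be passed in with the RAM origin and the text section with the ROM origin.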

In many cases, the locator is a separate development tool. However, in the

case of the GNU tools, this functionality is built right into the linker. Whether you are

writing software for a general-purpose computer or an embedded system, at some

point the sections of your relocatable program must have actual addresses assigned to

them. In the first case, the operating system does it for you at load time. In the second,

you must perform the step with a special tool. This is true even if the locator is a part

of the linker.

The memory information required by the GNU linker can be passed to it in the

form of a linker script. Such scripts are sometimes used to control the exact order of

the code and data sections within the relocatable program. But here, we want to do

more than just control the order; we also want to establish the location of each section

in memory.

What follows is an example of a linker script for a hypothetical embedded

target that has 512 KB each of RAM and ROM:

MEMORY
{
    ram : ORIGIN = 0x00000, LENGTH = 512K
    rom : ORIGIN = 0x80000, LENGTH = 512K
}

SECTIONS
{
    data ram :                  /* Initialized data. */
    {
        _DataStart = . ;
        *(.data)
        _DataEnd = . ;
    } >rom

    bss :                       /* Uninitialized data. */
    {
        _BssStart = . ;
        *(.bss)
        _BssEnd = . ;
    }

    _BottomOfHeap = . ;         /* The heap starts here. */
    _TopOfStack = 0x80000;      /* The stack ends here. */

    text rom :                  /* The actual instructions. */
    {
        *(.text)
    }
}

This script informs the GNU linker's built-in locator about the memory on the

target board and instructs it to locate the data and bss sections in RAM (starting at

address 0x00000) and the text section in ROM (starting at 0x80000). However, the

initial values of the variables in the data segment will be made a part of the ROM

image by the addition of >rom at the end of that section's definition.

All of the names that begin with underscores (_TopOfStack, for example) are

variables that can be referenced from within your source code. The linker will use

these symbols to resolve references in the input object files. So, for example, there

might be a part of the embedded software (usually within the startup code) that copies

the initial values of the initialized variables from ROM to the data section in RAM.

The start and stop addresses for this operation can be established symbolically, by

referring to the integer variables _DataStart and _DataEnd.
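The copy operation just described, together with the zeroing of the bss area, might look like the following in C. This is a sketch under stated assumptions: the function name init_ram_sections and its parameter rom_image are invented here, and the way linker-defined symbols are declared in C varies between toolchains. The function takes plain pointers so that it can be tried on a host computer as well.

```c
#include <stddef.h>
#include <string.h>

/* Prepare the RAM sections before main runs.
 *   data_start, data_end : bounds of the data section in RAM
 *   rom_image            : where the initial values are stored in ROM
 *   bss_start, bss_end   : bounds of the bss section in RAM
 * On a target, the bounds would be taken from linker symbols such as
 * _DataStart, _DataEnd, _BssStart, and _BssEnd. */
void init_ram_sections(unsigned char *data_start, unsigned char *data_end,
                       const unsigned char *rom_image,
                       unsigned char *bss_start, unsigned char *bss_end)
{
    /* Copy the initial values of the initialized variables. */
    memcpy(data_start, rom_image, (size_t)(data_end - data_start));

    /* Uninitialized variables must start out as zero. */
    memset(bss_start, 0, (size_t)(bss_end - bss_start));
}
```

In real startup code this work is usually done in assembly language, before the stack and the C runtime are ready, but the logic is the same.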

The result of this final step of the build process is an absolutely located binary

image that can be downloaded to the embedded system or programmed into a read-

only memory device. In the previous example, this memory image would be exactly 1

MB in size. However, because the initial values for the initialized data section are

stored in ROM, the lower 512 kilobytes of this image will contain only zeros, so only

the upper half of this image is significant. You'll see how to download and execute

such memory images in the next sections.

Building the Example Program

The blinking LED program can be built using Borland's C++ Compiler and

Turbo Assembler. These tools can be run on any DOS or Windows-based PC. The

following commands are used:



The first step in the build process is to compile the source files. The

command-line options we'll need are -c for "compile, but don't link," -v for "include

symbolic debugging information in the output," -ml for "use the large memory

model," and -1 for "the target is an 80186 processor." Here is the actual command:

bcc -c -v -ml -1 source_file

This command is repeated once for each source module (i.e., each source module is

compiled separately). The result of each such command is

the creation of an object file that has the same prefix as the .c file and the extension

.obj.

We must also include some startup code for the C program. The following

command can be used for that purpose:

tasm /mx startup_code_file

The command that's actually used to link all object files together is shown

here. Beware that the order of the object files on the command line does matter in this

case: the startup code must be placed first for proper linkage.

tlink /m /v /s startup_code_object_file sourcemodule_object_files,

new_exe_file_name, map_file_name

As a result of the tlink command, Borland's Turbo Linker will produce two

new files: new_exe_file_name and map_file_name in the working directory. The first

file contains the relocatable program and the second contains a human-readable

program map. The map file provides information similar to the contents of the linker

script described earlier. However, these are results and, therefore, include the lengths

of the sections and the names and locations of the public symbols found in the

relocatable program.

One more tool must be used to make the Blinking LED program executable: a

locator. The locating tool we'll be using is provided by Arcom, as part of the

SourceVIEW development and debugging package included with the board. Because



this tool is designed for this one particular embedded platform, it does not have as

many options as a more general locator. In fact, there are just three parameters: the

name of the relocatable binary image, the starting address of the ROM (in

hexadecimal) and the total size of the destination RAM (in kilobytes):

tcrom new_exe_file_name ROM_starting_address total_size_of_RAM

The tcrom locator massages the contents of the relocatable input file - assigning

base addresses to each section - and outputs the file new_exe_file_name.rom. This file

contains an absolutely located binary image that is ready to be loaded directly into

ROM. But rather than load it into the ROM with a device programmer, we'll create a

special ASCII version of the binary image that can be downloaded to the ROM over

a serial port. For this we will use a utility provided by Arcom, called bin2hex. Here is

the syntax of the command:

bin2hex new_exe_file_name.rom port

This extra step creates a new file, called new_exe_file_name.hex, that contains

exactly the same information as new_exe_file_name.rom, but in an ASCII

representation called Intel Hex Format.
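To give a feel for this representation, the sketch below formats a single Intel Hex data record. It is not the bin2hex utility, only an illustration of the record layout: a colon, a byte count, a 16-bit address, a record type (00 for data), the data bytes, and a checksum chosen so that all the bytes of the record sum to zero modulo 256. The function name ihex_data_record is an assumption made here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Write one Intel Hex data record (record type 00) into out as a
 * NUL-terminated string. Returns the number of characters written,
 * or -1 if the data or the output buffer has an unusable size. */
int ihex_data_record(uint16_t address, const uint8_t *data, size_t len,
                     char *out, size_t out_size)
{
    /* ':' + 2 (count) + 4 (address) + 2 (type) + data + 2 (checksum) */
    if (len > 255 || out_size < 11 + 2 * len + 1) {
        return -1;
    }

    /* The checksum covers count, address, type (0), and data bytes. */
    uint8_t sum = (uint8_t)(len + (address >> 8) + (address & 0xFF));

    int n = snprintf(out, out_size, ":%02X%04X00",
                     (unsigned)len, (unsigned)address);
    for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + data[i]);
        n += snprintf(out + n, out_size - (size_t)n, "%02X",
                      (unsigned)data[i]);
    }

    /* Two's complement of the running sum. */
    n += snprintf(out + n, out_size - (size_t)n, "%02X",
                  (unsigned)(uint8_t)(0x100 - sum));
    return n;
}
```

A complete file consists of many such records followed by an end-of-file record; records of other types (for example, those that extend the address range beyond 16 bits) are not covered by this sketch.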

3.5 Downloading and Debugging

Once you have an executable binary image stored as a file on the host

computer, you will need a way to download that image to the embedded system and

execute it. The executable binary image is usually loaded into a memory device on the

target board and executed from there. And if you have the right tools at your disposal,

it will be possible to set breakpoints in the program or to observe its execution in less

intrusive ways. This section describes various techniques for downloading, executing,

and debugging embedded software.

When in ROM

One of the most obvious ways to download your embedded software is to load

the binary image into a read-only memory device and insert that chip into a socket on



the target board. Obviously, the contents of a truly read-only memory device could

not be overwritten. Embedded systems commonly employ special read-only memory

devices that can be programmed (or reprogrammed) with the help of a special piece of

equipment called a device programmer. A device programmer is a computer system

that has several memory sockets on the top - of varying shapes and sizes - and is

capable of programming memory devices of all sorts.

In an ideal development scenario, the device programmer would be connected

to the same network as the host computer. That way, files that contain executable

binary images could be easily transferred to it for ROM programming. After the

binary image has been transferred to the device programmer, the memory chip is

placed into the appropriately sized and shaped socket and the device type is selected

from an on-screen menu. The actual device programming process can take anywhere

from a few seconds to several minutes, depending on the size of the binary image and

the type of memory device you are using.

After you program the ROM, it is ready to be inserted into its socket on the

board. Of course, this shouldn't be done while the embedded system is still powered

on. The power should be turned off and then reapplied only after the chip has been

carefully inserted. As soon as power is applied to it, the processor will begin to fetch

and execute the code that is stored inside the ROM. However, beware that each type

of processor has its own rules about the location of its first instruction. For example,

when the Intel 80188EB processor is reset, it begins by fetching and executing

whatever is stored at physical address FFFF0h. This is called the reset address and the

instructions located there are collectively known as the reset code. We must always

ensure that the binary image we've loaded into the ROM satisfies the target

processor's reset rules.
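The reset address quoted above follows from the segmented addressing used by this processor family, in which a 20-bit physical address is formed as segment times 16 plus offset. Assuming the usual reset values for these parts (code segment FFFFh, instruction pointer 0000h), the arithmetic can be checked with a short C function; the name physical_address is chosen here for illustration.

```c
#include <stdint.h>

/* Form a 20-bit physical address from a 16-bit segment and a 16-bit
 * offset. The result is masked to 20 bits because the address bus
 * has only 20 lines, so larger values wrap around. */
uint32_t physical_address(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}
```

With segment FFFFh and offset 0000h the function yields FFFF0h, which leaves only 16 bytes before the top of the address space. That is why the reset code normally consists of little more than a jump to the real startup code located elsewhere in ROM.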

LED Debugging: One of the most primitive debugging techniques available is the use

of an LED as an indicator of success or failure. The basic idea is to slowly walk the LED

enable code through the larger program. In other words, you first begin with the LED

enable code at the reset address. If the LED turns on, then you can edit the program,

moving the LED enable code to just after the next execution milestone, rebuild, and

test. This works best for very simple, linearly executed programs like the startup code.



The Arcom board includes a special in-circuit programmable memory, called

Flash memory, that does not have to be removed from the board to be reprogrammed.

In fact, software that can perform the device programming function is already

installed in another memory device on the board. The Arcom board actually has two

read-only memory devices - one (a true ROM) contains a simple program that allows

the user to in-circuit program the other (a Flash memory device).

All the host computer needs to talk to the monitor program is a serial port and

a terminal program. Instructions for loading an Intel Hex Format file into the Flash

memory device are usually provided in the user manuals given for that hardware.

The biggest disadvantage of this download technique is that there is no easy

way to debug software that is executing out of ROM. The processor fetches and

executes the instructions at a high rate of speed and provides no way for you to view

the internal state of the program. This might be fine once you know that your software

works and you're ready to deploy the system, but it's not very helpful during software

development. Of course, you can still examine the state of the LEDs and other

externally visible hardware but this will never provide as much information and

feedback as a debugger.

Remote Debuggers

If available, a remote debugger can be used to download, execute, and debug

embedded software over a serial port or network connection between the host and

target. The frontend of a remote debugger looks just like any other debugger that you

might have used. It usually has a text or GUI-based main window and several smaller

windows for the source code, register contents, and other relevant information about

the executing program. However, in the case of embedded systems, the debugger and

the software being debugged are executing on two different computer systems.

A remote debugger actually consists of two pieces of software. The frontend

runs on the host computer and provides the human interface just described. But there

is also a hidden backend that runs on the target processor and communicates with the

frontend over a communications link of some sort. The backend provides for low-



level control of the target processor and is usually called the debug monitor. Figure 3.4

shows how these two components work together.

Figure 3.4: A remote debugging program

The debug monitor resides in ROM - having been placed there in the manner

described earlier (either by you or at the factory) - and is automatically started

whenever the target processor is reset. It monitors the communications link to the host

computer and responds to requests from the remote debugger running there. Of

course, these requests and the monitor's responses must conform to some predefined

communications protocol and are typically of a very low-level nature. Examples of

requests the remote debugger can make are "read register x," "modify register y,"

"read n bytes of memory starting at address," and "modify the data at address." The

remote debugger combines sequences of these low-level commands to accomplish

high-level debugging tasks like downloading a program, single-stepping through it,

and setting breakpoints.

One such debugger is the GNU debugger (gdb). Like the other GNU tools, it

was originally designed for use as a native debugger and was later given the ability to

perform cross-platform debugging. So you can build a version of the GDB frontend

that runs on any supported host and yet understands the opcodes and register names of

any supported target. Source code for a compatible debug monitor is included within

the GDB package and must be ported to the target platform. However, beware that

this port can be tricky, particularly if you only have LED debugging at your disposal.

Communication between the GDB frontend and the debug monitor is byte-

oriented and designed for transmission over a serial connection. The command format



and some of the major commands are shown in Table 3.2. These commands

exemplify the type of interactions that occur between the typical remote debugger

frontend and the debug monitor.

Table 3.2: GDB Debug Monitor Commands

Command                        Request Format          Response Format

Read registers                 g                       data

Write registers                Gdata                   OK

Read data at address           maddress,length         data

Write data at address          Maddress,length:data    OK

Start/restart execution        c                       Ssignal

Start execution from address   caddress                Ssignal

Single step                    s                       Ssignal

Single step from address       saddress                Ssignal

Reset/kill program             k                       no response
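On the serial line, each of these requests travels inside a small packet. In the GDB remote protocol a packet has the form $data#checksum, where the checksum is the sum of the data characters modulo 256, written as two hexadecimal digits. The following C sketch builds such a packet; the function name rsp_frame is an assumption made here, and acknowledgements, escaping, and retransmission are left out.

```c
#include <stdio.h>
#include <string.h>

/* Wrap a request such as "g" or "m4000,10" in a remote-protocol
 * packet: '$' + payload + '#' + two-digit hexadecimal checksum.
 * Returns the packet length, or -1 if out is too small. */
int rsp_frame(const char *payload, char *out, size_t out_size)
{
    size_t len = strlen(payload);
    unsigned char sum = 0;

    /* '$' + payload + '#' + 2 checksum digits + terminating NUL */
    if (out_size < len + 5) {
        return -1;
    }

    for (size_t i = 0; i < len; i++) {
        sum = (unsigned char)(sum + (unsigned char)payload[i]);
    }

    return snprintf(out, out_size, "$%s#%02x", payload, (unsigned)sum);
}
```

For example, the "read registers" request g has the character code 67 hexadecimal, so the packet sent to the debug monitor is $g#67.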

Remote debuggers are one of the most commonly used downloading and

testing tools during development of embedded software. This is mainly because of

their low cost. Embedded software developers already have the requisite host

computer.

In addition, the price of a remote debugger frontend does not add significantly

to the cost of a suite of cross-development tools (compiler, linker, locator, etc.).

Finally, the suppliers of remote debuggers often desire to give away the source code

for their debug monitors, in order to increase the size of their installed user base.

Emulators

Remote debuggers are helpful for monitoring and controlling the state of

embedded software, but only an in-circuit emulator (ICE) allows you to examine the

state of the processor on which that program is running. In fact, an ICE actually takes

the place of - or emulates - the processor on your target board. It is itself an embedded

system, with its own copy of the target processor, RAM, ROM, and its own embedded

software. As a result, in-circuit emulators are usually pretty expensive - often more



expensive than the target hardware. But they are powerful tools, and in a tight

debugging spot nothing else will help you get the job done better.

Like a debug monitor, an emulator uses a remote debugger for its human

interface. In some cases, it is even possible to use the same debugger frontend for

both. But because the emulator has its own copy of the target processor it is possible

to monitor and control the state of the processor in real time. This allows the emulator

to support such powerful debugging features as hardware breakpoints and real-time

tracing, in addition to the features provided by any debug monitor. With a debug

monitor, you can set breakpoints in your program. However, these software

breakpoints are restricted to instruction fetches - the equivalent of the command "stop

execution if this instruction is about to be fetched." Emulators, by contrast, also

support hardware breakpoints. Hardware breakpoints allow you to stop execution in

response to a wide variety of events. These events include not only instruction

fetches, but also memory and I/O reads and writes, and interrupts. For example, you

might set a hardware breakpoint on the event "variable foo contains 15 and register

AX becomes 0."

Another useful feature of an in-circuit emulator is real-time tracing. Typically,

an emulator incorporates a large block of special-purpose RAM that is dedicated to

storing information about each of the processor cycles that are executed. This feature

allows you to see in exactly what order things happened, so it can help you answer

questions, such as, did the timer interrupt occur before or after the variable bar

became 94? In addition, it is usually possible to either restrict the information that is

stored or post-process the data prior to viewing it in order to cut down on the amount

of trace data to be examined.
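One way to picture the trace memory is as a circular buffer that always holds the most recent cycles, with the oldest entries overwritten as new ones arrive. The C sketch below models that idea only. The circular organization, the tiny depth, and the decision to store just an address per cycle are assumptions made for illustration; a real emulator implements its trace memory in hardware and records far more per cycle.

```c
#include <stddef.h>
#include <stdint.h>

#define TRACE_DEPTH 4u   /* tiny on purpose; real trace RAM is large */

struct trace_buffer {
    uint32_t entries[TRACE_DEPTH];  /* one recorded address per cycle */
    size_t   next;                  /* slot that will be written next */
    size_t   count;                 /* valid entries, up to TRACE_DEPTH */
};

/* Record one processor cycle, overwriting the oldest entry when full. */
void trace_record(struct trace_buffer *t, uint32_t address)
{
    t->entries[t->next] = address;
    t->next = (t->next + 1) % TRACE_DEPTH;
    if (t->count < TRACE_DEPTH) {
        t->count++;
    }
}

/* Look back in time: age 0 is the most recent cycle, age 1 the one
 * before it, and so on. The caller must keep age below t->count. */
uint32_t trace_get(const struct trace_buffer *t, size_t age)
{
    return t->entries[(t->next + TRACE_DEPTH - 1 - age) % TRACE_DEPTH];
}
```

Reading the entries from the highest valid age down to age 0 replays events in the order in which they happened, which is what makes "before or after" questions answerable.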

ROM Emulators

A ROM emulator is a device that emulates a read-only memory device. Like

an ICE, it is an embedded system that connects to the target and communicates with

the host. However, this time the target connection is via a ROM socket. To the

embedded processor, it looks like any other read-only memory device. But to the

remote debugger, it looks like a debug monitor.



ROM emulators have several advantages over debug monitors. First, no one

has to port the debug monitor code to your particular target hardware. Second, the

ROM emulator supplies its own serial or network connection to the host, so it is not

necessary to use the target's own, usually limited, resources. And finally, the ROM

emulator is a true replacement for the original ROM, so none of the target's memory is

used up by the debug monitor code.

Simulators and Other Tools

Of course, many other debugging tools are available to you, including

simulators, logic analyzers, and oscilloscopes. A simulator is a completely host-based

program that simulates the functionality and instruction set of the target processor.

The human interface is usually the same as or similar to that of the remote debugger.

In fact, it might be possible to use one debugger frontend for the simulator backend as

well, as shown in Figure 3.5. Although simulators have many disadvantages, they are

quite valuable in the earlier stages of a project when there is not yet any actual

hardware for the programmers to experiment with.

Figure 3.5: The ideal situation: a common debugger front-end

By far, the biggest disadvantage of a simulator is that it only simulates the

processor, and embedded systems frequently contain one or more other important

peripherals. Interaction with these devices can sometimes be imitated with simulator

scripts or other workarounds, but such workarounds are often more trouble to create



than the simulation is valuable. So you probably won't do too much with the simulator

once you have the actual embedded hardware available to you.

Once you have access to your target hardware - and especially during the

hardware debugging - logic analyzers and oscilloscopes can be indispensable

debugging tools. They are most useful for debugging the interactions between the

processor and other chips on the board. Because they can only view signals that lie

outside the processor, however, they cannot control the flow of execution of your

software like a debugger or an emulator can. This makes these tools significantly less

useful by themselves. But coupled with a software-debugging tool like a remote

debugger or an emulator, they can be extremely valuable.

A logic analyzer is a piece of laboratory equipment that is designed specifically

for troubleshooting digital hardware. It can have dozens or even hundreds of inputs,

each capable of detecting only one thing: whether the electrical signal it is attached to

is currently at logic level 1 or 0. Any subset of the inputs that you select can be

displayed against a timeline as illustrated in Figure 3.6. Most logic analyzers will also

let you begin capturing data, or "trigger," on a particular pattern. For example, you

might make this request: "Display the values of input signals 1 through 10, but don't

start recording what happens until inputs 2 and 5 are both zero at the same time."

Figure 3.6: A typical logic analyzer display
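The trigger request quoted above can be imitated in software. The following is only a sketch of the idea, not how any particular logic analyzer works: it assumes each captured sample is stored as one word with one bit per input, and the names sample_t and find_trigger are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One captured sample: bit 0 holds input 1, bit 1 holds input 2, and so on
 * (this packing is an assumption made for the sketch). */
typedef uint16_t sample_t;

/* Hypothetical helper: returns the index of the first sample in which
 * inputs 2 and 5 are both zero, or n if the pattern never appears. */
size_t find_trigger(const sample_t *samples, size_t n)
{
    const sample_t mask = (sample_t)((1u << 1) | (1u << 4)); /* inputs 2 and 5 */

    assert(samples != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if ((samples[i] & mask) == 0) {
            return i;   /* recording would start at this sample */
        }
    }
    return n;           /* trigger condition never met */
}
```

Everything before the returned index would be discarded, and the samples from that point on would be displayed against the timeline.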

An oscilloscope is another piece of laboratory equipment for hardware

debugging. But this one is used to examine any electrical signal, analog or digital, on

any piece of hardware. Oscilloscopes are sometimes useful for quickly observing the

voltage on a particular pin or, in the absence of a logic analyzer, for something

slightly more complex. However, the number of inputs is much smaller (there are



usually about four) and advanced triggering logic is not often available. As a result,

an oscilloscope will be useful to you only rarely as a software debugging tool.

Most of the debugging tools described in this chapter will be used at some

point or another in every embedded project. Oscilloscopes and logic analyzers are

most often used to debug hardware problems, simulators during early stages of the

software development, and debug monitors and emulators during the actual software

debugging.

3.6 Summary

Embedded systems are among the most difficult computer platforms for

programmers to work with. Most embedded systems lack a monitor or analogous

output device. Embedded programmers must be self-reliant. They must always begin

each new project with the assumption that nothing works - that all they can rely on is

the basic syntax of their programming language.

The process of converting the source code representation of your embedded

software into an executable binary image involves three distinct steps. First, each of

the source files must be compiled or assembled into an object file. Second, all of the

object files that result from the first step must be linked together to produce a single

object file, called the relocatable program. Finally, physical memory addresses must

be assigned to the relative offsets within the relocatable program in a process called

relocation. The result of this third step is a file that contains an executable binary

image that is ready to be run on the embedded system.

3.7 Self Test

1. A compiler that runs on one computer platform and produces code for another

is called …………………..

2. The output of the cross compiler is ……………..

3. The object file contains

a) Text

b) Data



c) Bss

d) All the above

4. A simulator simulates the functionality and instruction set of …………….

5. A remote debugger consists of……..

a) frontend only

b) backend only

c) either front end or backend

d) both frontend and backend

6. …………….allows you to examine the state of the processor on which the

program is running.

a) In-circuit Emulator

b) Remote debugger

c) Logic analyzer

d) None of these

7. The tool that performs the conversion from relocatable program to executable

binary image is called a …………

8. Infinite loops are not important in embedded systems programming.

(True/False)

9. The process of converting the source code into a binary image does not

involve……

a) Compilation

b) Linking

c) Relocating

d) Simulating

e) All of these

10. A logic analyzer is used to troubleshoot digital hardware. (True/False)

Answers

1. cross compiler

2. object file

3. d) All the above

4. target processor



5. d) both frontend and backend

6. a) In-circuit emulator

7. locator

8. False

9. d) Simulating

10. True

3.8 Questions

1. Briefly explain the steps involved in building an embedded program.

2. What is an Emulator? What are the advantages of a ROM emulator?

3. What are the challenges an embedded programmer has to face?

4. List out the tools used to build an embedded program along with their

functionalities.

5. What is the role of the infinite loop in embedded systems?

6. Explain the process of linking and locating.

UNIT 4

Embedded Operating System Features

Contents

4.1 Introduction

4.2 Objectives

4.3 Embedded Operating System

4.3.1 Tasks

4.3.2 Scheduler

4.3.3 Synchronization

4.4 RTOS design issues

4.4.1 What is a RTOS?



4.4.2 Applications of RTOSs

4.4.3 Design options

4.4.3.1 Polled Loop Systems

4.4.3.2 Phase/State-Driven Systems

4.4.3.3 Interrupt Driven Systems

4.4.3.3.1 Preemptive Priority System

4.4.3.3.2 Hybrid System

4.4.3.4 Foreground/Background Systems

4.4.3.4.1 Background Processing

4.4.3.4.2 Initialization

4.5 Real time kernels

4.6 Performance Characteristics of RTOSs

4.7 An example RTOS: VxWorks

4.8 Summary

4.9 Self Test

4.10 Questions

4.1 Introduction

All but the most trivial of embedded programs will benefit from the inclusion

of an operating system. This can range from a small kernel to a full-featured

commercial operating system. Either way, you'll need to know what features are the

most important and how their implementation will affect the rest of your software.

An embedded operating system runs on any embedded platform. Embedded

platforms typically function without human intervention. They consist of a single-

board microcomputer with an OS and software loaded in ROM. Embedded

applications start running special purpose programs at power on and will not stop

until turned off. These applications will not usually have any peripheral support

(keyboard, monitor, serial connections, mass storage, etc.) or a user interface. Often

an embedded OS must provide real-time response to perform its requirements.

A real-time operating system (RTOS) is an embedded operating system that

guarantees a certain capability within a specified time constraint. A real-time kernel is



software that manages the time of a microprocessor, allows multitasking and provides

services to your application.

4.2 Objectives


What is an Embedded OS?

What are the elements of an Embedded OS?

What is RTOS?

Design Issues of RTOS

Real Time Kernel Working

Performance Characteristics of RTOSs

VxWorks

4.3 Embedded Operating System

In the early days of computing there was no such thing as an operating system.

Application programmers were completely responsible for controlling and monitoring

the state of the processor and other hardware. In fact, the purpose of the first operating

system was to provide a virtual hardware platform that made application programs

easier to write. To accomplish this goal, operating system developers needed to only

provide a loose collection of routines - much like a modern software library - for

resetting the hardware to a known state, reading the state of the inputs, and changing

the state of the outputs.

Modern operating systems add to this the ability to execute multiple software

tasks simultaneously on a single processor. Each such task is a piece of the software

that can be separated from and run independently of the rest. A set of embedded

software requirements can usually be decomposed into a small number of such

independent pieces.



Tasks provide a key software abstraction that makes the design and

implementation of embedded software easier and the resulting source code simpler to

understand and maintain. By breaking the larger program up into smaller pieces, the

programmer can more easily concentrate on the unique features of the system under

development.

Actually embedded operating systems are easier to write than their desktop

counterparts - the required functionality is smaller and better defined. Embedded

operating systems are small because they lack many of the things you would expect to

find on your desktop computer. For example, embedded systems rarely have disk

drives or graphical displays, and hence they need no file system or graphical user

interface in their operating systems. In addition, there is only one "user" (i.e., all of

the tasks that comprise the embedded software cooperate), so the security features of

multiuser operating systems do not apply. All of these are features that could be part

of an embedded operating system but are unnecessary in the majority of cases.

4.3.1 Tasks

A task is a piece of software that can be executed independently of others. In

multitasking, many tasks appear to execute simultaneously. An operating system makes it

possible to execute multiple programs at the same time. In actuality, the tasks are not

executed at the same time. Rather, they are executed in pseudoparallel. They merely

take turns using the processor.

The operating system is responsible for deciding which task gets to use the

processor at a particular moment. In addition, it maintains information about the state

of each task. This information is called the task's context. A task's context records the

state of the processor just prior to another task's taking control of it. This usually

consists of a pointer to the next instruction to be executed (the instruction pointer), the

address of the current top of the stack (the stack pointer), and the contents of the

processor's flag and general-purpose registers. On 16-bit 80x86 processors, these are

the registers CS and IP, SS and SP, Flags, and DS, ES, SI, DI, AX, BX, CX, and DX

respectively.

In order to keep tasks and their contexts organized, the operating system

maintains a bit of information about each task. Operating systems often keep this



information in a data structure called the task control block. The collection of task

control blocks is stored in one or more data structures, such as linked lists. Figure

4.1 below shows the structure of a task control block:

Program counter

Task Status

Task ID #

Contents of register 0

Pointer to next TCB

...

Contents of register n

Other context

Figure 4.1: The structure of a task control block
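The fields in Figure 4.1 map naturally onto a C structure. The sketch below is one possible layout, not the layout of any particular operating system: the register count NUM_REGS, the field widths, and the helper names tcb_push and tcb_find are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS 8  /* assumption: number of general-purpose registers saved */

typedef enum { TASK_READY, TASK_RUNNING, TASK_WAITING } task_status_t;

typedef struct tcb {
    uintptr_t     program_counter;  /* where the task will resume */
    task_status_t status;           /* task status */
    uint8_t       id;               /* task ID # */
    uintptr_t     regs[NUM_REGS];   /* contents of register 0 .. n */
    struct tcb   *next;             /* pointer to next TCB */
} tcb_t;

/* Push a TCB onto the front of a singly linked list of TCBs. */
void tcb_push(tcb_t **head, tcb_t *t)
{
    assert(head != NULL && t != NULL);
    t->next = *head;
    *head = t;
}

/* Walk the list looking for a task ID; returns NULL if it is not present. */
tcb_t *tcb_find(tcb_t *head, uint8_t id)
{
    for (tcb_t *p = head; p != NULL; p = p->next) {
        if (p->id == id) {
            return p;
        }
    }
    return NULL;
}
```

A real kernel would usually add more context (stack pointer, priority, flags) under "other context", and would fill in the saved registers from the context switch routine.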

Task states

Only one task could actually be using the processor at a given time. That task

is said to be the "running" task, and no other task can be in that same state at the same

time. Tasks that are ready to run, but are not currently using the processor, are in the

"ready" state, and tasks that are waiting for some event external to themselves to

occur before going on are in the "waiting" state. Figure 4.2 below shows the

relationships between these three states.

Figure 4.2: Relationship between task states

A transition between the ready and running states occurs whenever the

operating system selects a new task to run. The task that was previously running

becomes ready, and the new task (selected from the pool of tasks in the ready state) is

promoted to running. Once it is running, a task will leave that state only if it is forced

to do so by the operating system or if it needs to wait for some event external to itself

to occur before continuing. In the latter case, the task is said to block, or wait, until




that event occurs. And when that happens, the task enters the waiting state and the

operating system selects one of the ready tasks to be run. So, although there may be

any number of tasks in each of the ready and waiting states, there will never be more

(or fewer) than one task in the running state at any time.

It is important to note that only the scheduler – the part of the operating

system that decides which task to run – can promote a task to the running state. Newly

created tasks and tasks that have finished waiting for their external events are placed

into the ready state first. The scheduler will then include these new ready tasks in its

future decision-making.
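The state rules described above can be written down as a small table of legal transitions. This is a sketch for illustration only; the type and function names are invented here and do not come from any specific operating system.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { STATE_READY, STATE_RUNNING, STATE_WAITING } task_state_t;

/* Returns true if the text above permits a task to move from one state
 * to the other. */
bool transition_allowed(task_state_t from, task_state_t to)
{
    switch (from) {
    case STATE_READY:    /* only the scheduler promotes a task to running */
        return to == STATE_RUNNING;
    case STATE_RUNNING:  /* preempted by the OS, or blocks on an event */
        return to == STATE_READY || to == STATE_WAITING;
    case STATE_WAITING:  /* event occurred: goes to ready first */
        return to == STATE_READY;
    }
    return false;
}

/* Change a task's state, trapping illegal transitions during development. */
void set_state(task_state_t *state, task_state_t next)
{
    assert(transition_allowed(*state, next));
    *state = next;
}
```

Note that a waiting task can never go directly to running: it must pass through the ready state so that the scheduler makes the decision.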

Sometimes part of the code of an important task needs to be executed without interruption. We can put such code in a critical section. A critical section is a part of the program that must be executed atomically. That is, the instructions that

make up that part must be executed in order and without interruption. Because an interrupt can occur at any time, the only way to make such a guarantee is to disable

interrupts for the duration of the critical section.
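A critical section of this kind usually has the shape shown below. Disabling interrupts is processor-specific, so this sketch replaces the real instructions with two stand-in functions that merely toggle a flag; disable_interrupts, enable_interrupts, tick_count and take_ticks are all hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the processor-specific instructions (assumption):
 * on real hardware these would mask and unmask interrupts. */
volatile bool interrupts_enabled = true;
void disable_interrupts(void) { interrupts_enabled = false; }
void enable_interrupts(void)  { interrupts_enabled = true; }

/* Imagined to be incremented by a timer interrupt service routine. */
volatile unsigned long tick_count;

/* Read and clear the counter as one atomic step. */
unsigned long take_ticks(void)
{
    unsigned long ticks;

    assert(interrupts_enabled);   /* this simple sketch does not nest */
    disable_interrupts();         /* critical section begins */
    ticks = tick_count;
    tick_count = 0;
    enable_interrupts();          /* critical section ends */
    return ticks;
}
```

Without the critical section, an interrupt arriving between the read and the clear could increment the counter and that tick would be lost. The section is kept as short as possible because no interrupt can be serviced while it runs.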

4.3.2 Scheduler

An important element of any operating system is its scheduler. This is the

piece of the operating system that decides which of the ready tasks has the right to use

the processor at a given time. Some of the more common scheduling algorithms used

are: first-in-first-out, shortest job first, and round robin. These are simple scheduling

algorithms that are used in non-embedded systems.

First-in-first-out (FIFO) scheduling describes an operating system like DOS,

which is not a multitasking operating system at all. Under FIFO, each task runs until it is

finished, and only after that is the next task started. However, in DOS a task can

suspend itself, thus freeing up the processor for the next task. And that's precisely how

older versions of the Windows operating system permitted users to switch from one

task to another. True multitasking wasn't a part of any Microsoft operating system

before Windows NT.

Shortest job first describes a similar scheduling algorithm. The only difference

is that each time the running task completes or suspends itself, the next task selected



is the one that will require the least amount of processor time to complete. Shortest

job first was common on early mainframe systems.

Round robin is the only scheduling algorithm of the three in which the running

task can be preempted, that is, interrupted while it is running. In this case, each task

runs for some predetermined amount of time. After that time interval has elapsed, the

running task is preempted by the operating system and the next task in line gets its

chance to run. The preempted task doesn't get to run again until all of the other tasks

have had their chances in that round.
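A small simulation makes the round robin behaviour concrete. The function below is a sketch under simplifying assumptions: each task is represented only by the amount of processor time it still needs, the time slice is fixed, and the name round_robin and its parameters are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Simulate round robin scheduling.
 * remaining[i]  - processor time task i still needs (updated in place)
 * slice         - fixed time each task may run before it is preempted
 * finish_order  - receives task indices in the order the tasks complete
 * Returns the total processor time consumed. */
unsigned round_robin(unsigned remaining[], size_t n, unsigned slice,
                     size_t finish_order[])
{
    unsigned elapsed = 0;
    size_t finished = 0;

    assert(slice > 0);
    for (size_t i = 0; i < n; i++) {
        if (remaining[i] == 0) {
            finish_order[finished++] = i;   /* nothing to do */
        }
    }
    while (finished < n) {
        for (size_t i = 0; i < n; i++) {    /* one round: each task gets a turn */
            if (remaining[i] == 0) {
                continue;
            }
            unsigned run = remaining[i] < slice ? remaining[i] : slice;
            remaining[i] -= run;            /* task runs, then is preempted */
            elapsed += run;
            if (remaining[i] == 0) {
                finish_order[finished++] = i;
            }
        }
    }
    return elapsed;
}
```

For three tasks needing 5, 2 and 4 units of time with a slice of 2, the second task finishes first, then the third, then the first, after 11 units in total.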

Unfortunately, embedded operating systems cannot use any of these simplistic

scheduling algorithms. Embedded systems (particularly real-time systems) almost

always require a way to share the processor that allows the most important tasks to

grab control of the processor as soon as they need it. Therefore, most embedded

operating systems utilize a priority-based scheduling algorithm that supports

preemption.

This means that at any given moment the task that is currently using the

processor is guaranteed to be the highest-priority task that is ready to do so. Lower-

priority tasks must wait until higher-priority tasks are finished using the processor

before resuming their work. The word preemptive adds that any running task can be

interrupted by the operating system if a task of higher priority becomes ready. The

scheduler detects such conditions at a finite set of time instants called scheduling

points.

When a priority-based scheduling algorithm is used, it is also necessary to

have a backup policy. This is the scheduling algorithm to be used in the event that

several ready tasks have the same priority. The most common backup scheduling

algorithm is round robin.
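The selection rule, highest priority first with round robin among equals, can be sketched as follows. This is an illustration rather than a real scheduler: it assumes that a larger number means a higher priority, that tasks live in a small array, and that the function is called when a time slice expires. The names task_t and pick_next are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    int  priority;   /* assumption: larger number = higher priority */
    bool ready;
} task_t;

/* Choose the next task to run, or return -1 if no task is ready.
 * 'last' is the index of the task that ran most recently (-1 if none).
 * Scanning starts just after 'last', so tasks that share the highest
 * priority take turns (the round robin backup policy). */
int pick_next(const task_t tasks[], int n, int last)
{
    int best = -1;

    assert(n >= 0 && last >= -1 && last < n);
    for (int k = 1; k <= n; k++) {
        int i = (last + k) % n;
        if (tasks[i].ready &&
            (best < 0 || tasks[i].priority > tasks[best].priority)) {
            best = i;   /* strictly higher priority, or first ready task seen */
        }
    }
    return best;
}
```

Because only a strictly higher priority replaces the current candidate, the first ready task found after 'last' wins any tie, which is what rotates the processor among tasks of equal priority.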

Scheduling points

Simply stated, the scheduling points are the set of operating system events that

result in an invocation of the scheduler. Two such events are task creation and task

deletion. During these events the next task to run is selected. If the currently executing



task still has the highest priority of all the ready tasks, it will be allowed to continue

using the processor. Otherwise, the highest priority ready task will be executed next.

Of course, in the case of task deletion a new task is always selected: the currently

running task is no longer ready, by virtue of the fact that it no longer exists!

A third scheduling point is called the clock tick. The clock tick is a periodic

event that is triggered by a timer interrupt. The clock tick provides an opportunity to

wake tasks that are waiting for a software timer to expire. In fact, support for

software timers is a common feature of embedded operating systems. During the

clock tick, the operating system decrements and checks each of the active software

timers. When a timer expires, all of the tasks that are waiting for it to complete are

changed from the waiting state to the ready state. Then the scheduler is invoked to see

if one of these newly awakened tasks has a higher priority than the task that was

running prior to the timer interrupt.
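The work done during a clock tick might look like the sketch below. To keep it short it assumes that exactly one task waits on each software timer and that task readiness is kept in a plain array of flags; sw_timer_t and clock_tick are names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool     active;
    unsigned ticks_left;
    size_t   waiting_task;   /* assumption: one waiting task per timer */
} sw_timer_t;

/* Called once per clock tick. Decrements every active software timer and
 * moves the waiting task to the ready state when its timer expires.
 * Returns the number of tasks made ready. */
size_t clock_tick(sw_timer_t timers[], size_t n_timers,
                  bool task_ready[], size_t n_tasks)
{
    size_t woken = 0;

    for (size_t i = 0; i < n_timers; i++) {
        sw_timer_t *t = &timers[i];
        if (!t->active) {
            continue;
        }
        if (t->ticks_left > 0) {
            t->ticks_left--;
        }
        if (t->ticks_left == 0) {               /* timer expired */
            assert(t->waiting_task < n_tasks);
            task_ready[t->waiting_task] = true; /* waiting -> ready */
            t->active = false;
            woken++;
        }
    }
    return woken;
}
```

A nonzero return value tells the caller that the scheduler should now be invoked, since one of the newly awakened tasks may have a higher priority than the task that was interrupted.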

Ready list

The scheduler uses a data structure called the ready list to track the tasks that

are in the ready state. Usually, the ready list is implemented as an ordinary linked list

that is kept ordered by priority, so the head of this list is always the highest-priority task that is

ready to run. Following a call to the scheduler, this will be the same as the currently

running task. In fact, the only time that won't be the case is during a reschedule.

Figure 4.3 below shows what the ready list might look like while the operating

system is in use.

Figure 4.3: The ready list in action
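Keeping a linked list in priority order only requires care at insertion time. The sketch below assumes that a larger number means a higher priority; the names task_t and ready_list_insert are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct task {
    int          priority;   /* assumption: larger number = higher priority */
    struct task *next;
} task_t;

/* Insert a task into the ready list so that the list stays sorted from
 * highest to lowest priority. A new task goes behind existing tasks of
 * the same priority, so equal-priority tasks are served in arrival order. */
void ready_list_insert(task_t **head, task_t *t)
{
    task_t **link = head;

    assert(head != NULL && t != NULL);
    while (*link != NULL && (*link)->priority >= t->priority) {
        link = &(*link)->next;   /* skip tasks that should run before t */
    }
    t->next = *link;
    *link = t;
}
```

With this ordering the scheduler never has to search: the task to run next is simply the one at the head of the list.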

Idle task

If there are no tasks in the ready state when the scheduler is called, the idle

task will be executed. The idle task looks the same in every operating system. It is

simply an infinite loop that does nothing. Usually in an OS, the idle task is completely hidden from

the application developer. It does, however, have a valid task ID and priority

(both of which are zero, by the way). The idle task is always considered to be in the



ready state (when it is not running), and because of its low priority, it will always be

found at the end of the ready list. That way, the scheduler will find it automatically

when there are no other tasks in the ready state. Those other tasks are sometimes

referred to as user tasks to distinguish them from the idle task.

Context Switch

The actual process of changing from one task to another is called a context

switch. Because contexts are processor-specific, so is the code that implements the

context switch. That means it must always be written in assembly language.

4.3.3 Task Synchronization

Though we frequently talk about the tasks in a multitasking operating system

as completely independent entities, that portrayal is not completely accurate. All of

the tasks are working together to solve a larger problem and must occasionally

communicate with one another to synchronize their activities. For example, in the

printer-sharing device the printer task doesn't have any work to do until new data is

supplied to it by one of the computer tasks. So the printer and computer tasks must

communicate with one another to coordinate their access to common data buffers.

One way to do this is to use a data structure called a mutex.

Mutexes are provided by many operating systems to assist with task

synchronization. They are not, however, the only such mechanism available. Others

are called semaphores, message queues, and monitors. However, if you have any one

of these data structures, it is possible to implement each of the others. In fact, a mutex

is itself a special type of semaphore called a binary, or mutual-exclusion, semaphore.

A mutex can be thought of as being nothing more than a multitasking-aware binary

flag. The meaning associated with a particular mutex must, therefore, be chosen by

the software designer and understood by each of the tasks that use it. For example, the

data buffer that is shared by the printer and computer task would probably have a

mutex associated with it. When this binary flag is set, the shared data buffer is

assumed to be in use by one of the tasks. All other tasks must wait until that flag is

cleared (and then set again by themselves) before reading or writing any of the data

within that buffer.



Mutexes are multitasking-aware because the processes of setting and clearing

the binary flag are atomic. That is, these operations cannot be interrupted. A task can

safely change the state of the mutex without risking that a context switch will occur in

the middle of the modification. If a context switch were to occur, the binary flag

might be left in an unpredictable state and a deadlock between the tasks could result.

The atomicity of the mutex set and clear operations is enforced by the operating

system, which disables interrupts before reading or modifying the state of the binary

flag.

Critical sections

The primary use of mutexes is for the protection of shared resources. Shared

resources are global variables, memory buffers, or device registers that are accessed

by multiple tasks. A mutex can be used to limit access to such a resource to one task

at a time. It is like the stoplight that controls access to an intersection. In a

multitasking environment you generally don't know in which order the tasks will be

executed at runtime. One task might be writing some data into a memory buffer when

it is suddenly interrupted by a higher-priority task. If the higher-priority task were to

modify that same region of memory, then bad things could happen. At the very least,

some of the lower-priority task's data would be overwritten.

Pieces of code that access shared resources contain critical sections. We've

already seen something similar inside the operating system. There, we simply

disabled interrupts during the critical section. But tasks cannot (wisely) disable

interrupts. If they were allowed to do so, other tasks - even higher-priority tasks that

didn't share the same resource - would not be able to execute during that interval. So

we want and need a mechanism to protect critical sections within tasks without

disabling interrupts. And mutexes provide that mechanism.

Deadlock and Priority Inversion

Mutexes are powerful tools for synchronizing access to shared resources.

However, they are not without their own dangers. Two of the most important

problems to watch out for are deadlock and priority inversion.



Deadlock can occur whenever there is a circular dependency between tasks

and resources. The simplest example is that of two tasks, each of which requires two

mutexes: A and B. If one task takes mutex A (for resource 1) and waits for mutex B

(for resource 2) while the other takes mutex B (for resource 2) and waits for mutex A

(for resource 1), then both tasks are waiting for an event that will never occur. This

essentially brings both tasks to a halt and, though other tasks might continue to run for

a while, could bring the entire system to a standstill eventually. The only way to end

the deadlock is to reboot the entire system.

Priority inversion occurs whenever a higher-priority task is blocked, waiting

for a mutex that is held by a lower-priority task. This might not sound like a problem

(after all, the mutex is just doing its job of arbitrating access to the shared resource)

because the higher-priority task is written with the knowledge that sometimes the

lower-priority task will be using the resource they share. However, consider what

happens if there is a third task that has a priority somewhere between those two.

This situation is illustrated in Figure 4.4. Here there are three tasks: High

priority, Medium priority and Low priority. Low becomes ready first (indicated by the

rising edge) and, shortly thereafter, takes the mutex. Now, when High becomes ready,

it must block (indicated by the shaded region) until Low is done with their shared

resource. The problem is that Medium, which does not even require access to that

resource, gets to preempt Low and run even though it will delay High's use of the

processor. Many solutions to this problem have been proposed, the most common of

which is called "priority inheritance." This solution has Low's priority increased to

that of High as soon as High begins waiting for the mutex. Some operating systems

include this "fix" within their mutex implementation.



Figure 4.4: An example of priority inversion
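The priority inheritance fix can be sketched in a few lines. This is an illustration only: it assumes a larger number means a higher priority, it ignores atomicity and the blocking of the waiting task, and the names task_t, pi_mutex_t, pi_mutex_take and pi_mutex_release are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int base_priority;   /* priority assigned by the designer */
    int priority;        /* current, possibly inherited, priority */
} task_t;

typedef struct {
    task_t *owner;       /* NULL when the mutex is free */
} pi_mutex_t;

/* Returns true if task t acquired the mutex. Otherwise the owner inherits
 * t's priority (if higher) and the caller is expected to block. */
bool pi_mutex_take(pi_mutex_t *m, task_t *t)
{
    assert(m != NULL && t != NULL);
    if (m->owner == NULL) {
        m->owner = t;
        return true;
    }
    if (m->owner->priority < t->priority) {
        m->owner->priority = t->priority;   /* Low now runs at High's priority */
    }
    return false;
}

/* Release the mutex and drop any inherited priority. */
void pi_mutex_release(pi_mutex_t *m)
{
    assert(m != NULL);
    if (m->owner != NULL) {
        m->owner->priority = m->owner->base_priority;
        m->owner = NULL;
    }
}
```

While Low holds the boosted priority, Medium can no longer preempt it, so Low finishes with the shared resource sooner and High waits only as long as the critical section lasts.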

The basic elements of a simple embedded operating system are the scheduler and

scheduling points, context switch routine, definition of a task, and a mechanism for

intertask communication. Every useful embedded operating system will have these same

basic elements. However, you don't always need to know how they are implemented, if

you are an application programmer. You can usually just treat the operating system as a

black box on which you, as an application programmer, rely. You simply write the code

for each task and make calls to the operating system when and if necessary. The operating

system will ensure that these tasks run at the appropriate times relative to one another.

4.4 RTOS Design Issues

4.4.1 What is a real time operating system (RTOS)?

A real-time operating system (RTOS) is an operating system that guarantees a

certain capability within a specified time constraint. A real time operating system

must perform within time constraints and can be thought of as either a soft, hard or

firm RTOS. Almost every system built can have real time characteristics. For

example, suppose you are writing a paper and it takes 3 seconds after you hit a key before

the character appears on the monitor. This is not a very favorable event but is far from

saying that your system has failed. This is probably something the designers thought

might happen but chose to ignore because it is not a disastrous event or may only

happen when the system is low on resources. However, if you are controlling a motor

and it does not stop in time, resulting in disaster or death, then you have a major

system failure. This is the difference between hard and soft real time systems. A firm

real time system has some characteristics of both hard and soft systems and usually

results in some sort of priority given to a process that needs to be performed now and

one that is not as important. A high priority process will always run before the low

priority process allowing the system to run processes when it needs to. The need for a

time responsive system gives rise to many practical applications.

General RTOS requirements

The OS behavior must be predictable

The OS must be multithreaded and preemptive.



The OS must support thread priority.

The OS must support predictable thread synchronization mechanisms.

The maximum time during which interrupts are masked by the OS and by

device drivers must be known.

The maximum time that device drivers use to process an interrupt, and specific

IRQ (Interrupt ReQuest) information relating to those device drivers, must be

known.

The interrupt latency must be predictable.

4.4.2 Applications of a RTOS

There are many applications for a RTOS. If a system is responsible for

controlling anything mechanical then there are going to be some time constraints

depending on the control. A guidance system for the military needs to be under very

strict time constraints. If a system is responsible for controlling a high priced piece of

artillery then it better hit whatever it is aimed at, otherwise the money spent is simply

not worth it if it is just as accurate as something cheaper. Aircraft pilots need certain

data now or the results could be disastrous. A nuclear power plant's system better be

able to give accurate data and detect a possibility for a core meltdown. If any of these

systems were to fail, the result would be disaster. The airplane may not crash, but is

this a chance the developer wants to take? Three seconds to a pilot flying a plane is

much different than the three seconds it may take for a character to appear on a PC

monitor. These kinds of constraints require the design of the OS to be thorough and

highly responsive.

4.4.3 Design Options

There are many commercial operating systems available, but the most well

known are designed for PCs, networks, or mainframes. The complexity of these systems

would not work well for real time operating systems. A good example is that some PC

OSs do not have any way to handle a deadlock. This is fine for a system that you can

reboot, but if you have an automated robot that you are monitoring remotely, if there

is a deadlock someone would have to manually go out and fix the problem. This



may not be the most feasible solution. A good example is one of the robots used to go

into an area where a bomb threat is possible. Nobody in their right mind would go

into a building that may explode at any minute to reset the robot because of a

deadlock. This is just one downfall of commercial OSs and many real time systems

have to be scaled down or possibly built from scratch to adapt to the application.

There are commercially available RTOS but the designer of the application must

decide if they can live with the downfalls of each, or if they need to build it from

scratch to handle their specific application. A few of these options will be discussed

starting with the polled-loop system.

4.4.3.1 Polled-Loop Systems

A polled loop system is as simple as it gets. The system allows for fast

response to a single process and that is it. The polled loop simply runs an infinite loop

that constantly tests for an event. There is no need for a scheduler or dispatcher

because there is only one process that runs. An example of what the polled loop may

look like is given below in C++. A flag is set when a process is ready

(IsProcessReady), and when the process is run the flag is reset until the process is

ready again.

for (;;)                                //Do forever
{
    if (IsProcessReady == true) {       //If the process is ready
        RunProcess( );                  //Run the process
        IsProcessReady = false;         //Reset the polling flag
    }                                   //End if
}                                       //End do forever

Polled loop is not the way to go if more than one process needs to be executed.

However, some systems may have to wait a fixed amount of time before the switching

of the flag becomes stable. This can be handled via an interrupt, and may not be

needed depending on the bounce of the switch from false to true. If an interrupt is

used inside of the polled loop it is now a polled loop with interrupt.

A polled-loop system has both advantages and limitations. It is simple and fast, but it is only useful if a

processor is dedicated to handling one process and the event does not occur at a high



frequency. A polled loop wastes CPU time, and if the event did occur at a high

frequency the system would fail. Also, if the event occurs at very long intervals, then

the waste of CPU time becomes an issue.

4.4.3.2 Phase/State Driven Systems

The use of a state driven system allows performing a context switch only at

certain points of each process. A process will be broken up into states and when the

process changes state a context switch will occur. The state of the process will be

saved and when the process is executed again it will be at its next state. This is

accomplished using a case statement that runs a state of the process and then

performs a context switch to the next process. The case statement may look like the

following example:

void Process( ) {
    switch (State) {
    case 1:
        State1( );      //Run state one of the process
        State = 2;      //Set state to next state
        Switch( );      //Switch to next process
        break;
    case 2:
        State2( );      //Run state two of the process
        State = 3;      //Set state to next state
        Switch( );      //Switch to next process
        break;
    default:            //If state is indeterminate
        State = 1;      //Reset process
        Switch( );      //Switch to next process
        break;
    }
}

This method will work as long as the states are kept small and can run in a

reasonable amount of time. It would also be difficult to implement more than two

processes without also implementing a kernel to schedule and dispatch the processes.

Some processes cannot be broken down into states, and this could result in

some processes taking longer than others. With no method of preemption, some high

priority processes may not execute in time resulting in system failure. To implement



preemption, there will need to be some sort of interrupt routine to allow the process to

stop. This will extend the idea of a state-driven process by thinking of the interrupt

routine as a variable state allowing a context-switch to occur anywhere in the process.
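To make the cooperative nature of the state-driven approach concrete, here is a minimal sketch in C of a dispatcher that hands out one state at a time. It is an illustration under stated assumptions, not the method of any particular kernel: the names dispatch and run_one_state, the process count and the number of states are all hypothetical, and returning from run_one_state stands in for the context switch.

```c
#define NUM_PROCESSES 2   /* hypothetical process count */
#define NUM_STATES    3   /* hypothetical number of states per process */

int state[NUM_PROCESSES];      /* saved state of each process, starts at 0 */
int steps_run[NUM_PROCESSES];  /* how many states each process has executed */

/* Run exactly one state of process 'p', then return to the dispatcher.
   Returning here plays the role of the context switch. */
void run_one_state(int p)
{
    /* application-specific work for state[p] would go here */
    steps_run[p]++;
    state[p] = (state[p] + 1) % NUM_STATES;   /* remember where to resume */
}

/* Cooperative dispatcher: every pass gives each process one state. */
void dispatch(int passes)
{
    for (int i = 0; i < passes; i++)
        for (int p = 0; p < NUM_PROCESSES; p++)
            run_one_state(p);
}
```

Because nothing can interrupt run_one_state, a single long state delays every other process, which is the weakness the text describes.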

4.4.3.3 Interrupt Driven Systems

Interrupt driven systems are implemented using an interrupt service routine to

handle the dispatching of processes. It is up to the designer to decide on how

often and when the interrupts will occur to perform the context switch between

two processes. One option is a round-robin system, where an interrupt occurs at a

fixed rate; this can be seen as each process having a time slice in which to execute.

Interrupts can also occur at variable times, with the priority of the interrupt deciding which process runs, which solves the

problem of preemption. If a system allows interrupts to

occur at both fixed and variable rates, it is a hybrid system. A hybrid system

can be thought of as a round-robin system allowing preemption. The first of

the interrupt driven systems discussed will be a preemptive priority system.

4.4.3.3.1 Preemptive Priority Systems

Preemptive priority systems work by assigning each process a priority. This

priority is then used whenever an interrupt occurs to decide which process will run. If

a process is of higher priority than the currently running process, then the high

priority process will preempt the low priority process. If the new process has a lower

priority then it is stored in a priority queue and must wait till the higher priority

processes are finished. This could result in a pile of unfinished processes

that have been preempted by higher priority processes. This can be dangerous if there

are a lot of processes. Some processes may never be able to execute because they are

always preempted. This is known as starving a process.

Some systems will allow processes to change their priority depending on how

often they need to be executed. If a process executes more often than another process,

then the process that executes more often will be assigned a higher priority. This works well

as long as the highest priority processes are executed the most frequently. Otherwise,

a low priority process may preempt a high priority process that does not occur very

often and could result in failure.



4.4.3.3.2 Hybrid Systems

Hybrid systems are a combination of preemption and a fixed interrupt

scheduler. They assign each process a priority but only a few processes have a higher

priority. Most processes have the same priority and the system runs a round-robin

type of scheduling to handle equal priorities. The processes that have the highest

priority must be selected carefully. An error handler is a good example. If an error

occurs the error handler will have the power to preempt any process and take care of

the error that occurred. A hybrid system takes advantage of both the fixed interrupt

system (round-robin) and the preemptive interrupt system. This allows the designer to

take the advantages from both systems while cutting down on some of the negative

side effects of each. All of the interrupt systems discussed are a special case of a

foreground/background system which allows each type of system discussed thus far to

be used.
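The selection rule of a hybrid system can be sketched in a few lines of C. This is only an illustration: the Proc structure, the pick_next function and the convention that a larger number means a more important process are assumptions made for the example. The highest priority ready process always wins, and processes of equal priority take turns because the search starts just after the process that ran last.

```c
typedef struct {
    int priority;   /* larger number = more important (an assumption) */
    int ready;      /* nonzero when the process can run */
} Proc;

/* Pick the next process to run.
   'last' is the index of the process that ran most recently (-1 at start).
   Returns the index chosen, or -1 if no process is ready. */
int pick_next(const Proc *p, int n, int last)
{
    int best = -1;
    for (int k = 1; k <= n; k++) {
        int i = (last + k) % n;          /* start just after 'last' */
        if (p[i].ready && (best < 0 || p[i].priority > p[best].priority))
            best = i;                    /* strictly higher priority wins */
    }
    return best;
}
```

A fixed-rate timer interrupt would call pick_next to rotate among equal priorities, while any interrupt that readies a high priority process (such as an error handler) would call it to preempt immediately.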

4.4.3.4 Foreground/Background Systems

Foreground/Background systems involve a set of interrupt driven processes as

the foreground and a set of non-interrupt driven processes as the background. All of

the systems discussed can be thought of as a special case of foreground/background.

The polled loop and phase/state systems would be a background with no foreground, and the

interrupt driven systems would be a foreground with no background. This type of

system will use a polled loop as the background, but the loop will accomplish

something other than waste CPU time. The foreground will behave just as an interrupt

driven system would. However, the background task will be preempted by any

foreground task, making all the background processes of lower priority than the

foreground.
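A minimal sketch in C of this split is given below. It is an illustration only: adc_isr, background_pass and the variables are hypothetical names, and on real hardware the ISR would be attached to an interrupt vector in a compiler-specific way rather than called as an ordinary function. The foreground captures data and sets a flag; the background loop does the non-critical processing when it gets the CPU.

```c
#include <stdint.h>

/* Shared between foreground and background. 'volatile' because the ISR
   changes these outside the normal flow of the background loop. */
static volatile uint8_t  sample_ready;   /* set by the ISR, cleared by background */
static volatile uint16_t latest_sample;  /* data captured by the ISR */
static uint32_t          running_sum;    /* background bookkeeping */

/* Foreground: hypothetical ISR body for a time-critical capture. */
void adc_isr(uint16_t value)
{
    latest_sample = value;
    sample_ready = 1;
}

/* Background: one pass of the polled loop; main() would call this forever. */
void background_pass(void)
{
    if (sample_ready) {
        sample_ready = 0;
        running_sum += latest_sample;    /* non-critical processing */
    }
    /* other time-independent work, such as a self-test, would go here */
}

uint32_t get_running_sum(void)
{
    return running_sum;
}
```

If the ISR fires twice before the background runs, the first sample is lost in this sketch; a real design would size a buffer to the worst-case background delay.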

4.4.3.4.1 Background Processing

The background processes need to be all processes that are time independent.

Since the foreground has the ability to preempt all background processes, all time

critical processes need to be in the foreground. If someone wanted to launch a guided

missile, it may be frustrating if the missile's target was missed because it was



interrupted to perform a self-test. This is why it is important to make sure all

background processes are not critical to the application. A test of the system should be

performed in the background along with any sort of error handling. It may be feasible

to have a counter for each process in the background and if one of these counters

exceeds a timeout limit, then a system failure is indicated. To accomplish this, the

first background process should be the initialization phase.

4.4.3.4.2 Initialization

To initialize the system, a background process should be run first that sets up

all the data structures for each process, sets up any necessary registers and vector

jump tables needed for interrupts. Finally, interrupts should be enabled and the system

should begin running processes.

The foreground/background idea can be very useful if designed correctly. It

allows the designer to take advantage of a variety of different systems. However,

because foreground processes can preempt any background process, there is the

possibility of starving the background completely. This may not be a problem if the

background is simply used to detect fatal errors, but if there is a problem that only the

background can take care of, the system may run with corrupt data for a significant

amount of time before the background is allowed to fix it. One of the biggest

problems of all the systems discussed is that there must be a fixed number of

processes.

This problem can be solved by giving each process its own unique set of data

that completely defines the process. This block is referred to as a PCB or process

control block (similar to task control block). This block should have a unique ID

number for each process, a pointer to the process's stack, and the state of the process.

The PCB idea can be extended to become a node for a list. Then, a pointer to the next

node of the list will also need to be in the PCB. This structure will allow the OS to

skip processes or execute processes depending on the state they are in.
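The PCB described above maps naturally onto a C structure. The following is a sketch under assumptions: the field names, the three example states and the helper functions pcb_add and next_ready are hypothetical and chosen only to mirror the fields listed in the text (ID, stack pointer, state, next pointer).

```c
#include <stddef.h>

typedef enum { PROC_READY, PROC_BLOCKED, PROC_DONE } ProcState;  /* example states */

typedef struct PCB {
    int         id;      /* unique process ID */
    void       *stack;   /* pointer to the process's stack */
    ProcState   state;   /* current state of the process */
    struct PCB *next;    /* next node in the process list */
} PCB;

/* Add a process at the head of the list; returns the new head.
   Because the list can grow, the number of processes is no longer fixed. */
PCB *pcb_add(PCB *head, PCB *node)
{
    node->next = head;
    return node;
}

/* Walk the list from 'from' and return the first READY process,
   skipping processes in any other state. Returns NULL if none is ready. */
PCB *next_ready(PCB *from)
{
    for (PCB *p = from; p != NULL; p = p->next)
        if (p->state == PROC_READY)
            return p;
    return NULL;
}
```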



Any application that requires very strict time constraints should have a

specialized OS built for it. Commercially available RTOSs can increase overhead

significantly, resulting in slower response times. Another drawback is misleading

specifications for a given system. Manufacturers usually release theoretical or

best-case response times. A developer also needs to decide if the features in a specific

system are useful or if they merely waste memory. Ultimately, the designer of

the application needs to decide whether it is worth building an

application-specific OS or whether the drawbacks of the commercial model are acceptable.

4.5 Inside Real-Time Kernels

This section introduces you to real-time kernels by showing you how a

real-time kernel works.

A kernel is software that manages the time of a microprocessor to ensure that

all time critical events are processed as efficiently as possible. A kernel

simplifies the design process of the system because it allows the system to be

divided into multiple independent elements called tasks. A task is a simple

program which thinks it has the microprocessor all to itself. Each task is given

a priority based on its importance. The design process for a real-time system

consists of splitting the work to be done into tasks which are responsible for a

portion of the problem. The microprocessor still has the same amount of work

to do but now the work can be prioritized. A kernel also provides valuable

services to your application such as time delays, system time, message

passing, synchronization, mutual-exclusion and more.

Kernels come in two flavors: non-preemptive and preemptive. Non-preemptive

kernels require that each task does something to explicitly give up control of the CPU

(i.e., the microprocessor). To maintain the illusion of concurrency, this process must be

done frequently. Asynchronous events are handled by ISRs (Interrupt Service

Routines). An ISR can make a higher-priority task ready to run but the ISR always

returns to the interrupted task. The new higher-priority task will gain control of the

CPU only when the current task gives up the CPU. Non-preemptive kernels are

seldom used in real-time applications. On the other hand, a preemptive kernel is used



when system responsiveness is important. With a preemptive kernel, the kernel itself

ensures that the highest-priority task ready to run is always given control of the CPU.

When a task makes a higher-priority task ready to run, the current task is preempted

(suspended) and the higher-priority task is immediately given control of the CPU. If

an ISR makes a higher-priority task ready, when the ISR completes, the interrupted

task is suspended and the new higher-priority task is resumed. The execution profile

of a system designed using a preemptive kernel is shown in figure 4.5. As shown, a

low priority task is executing (1). An asynchronous event interrupts the microprocessor

(2). The microprocessor services the event (3), which makes a higher priority task ready

for execution. Upon completion, the ISR invokes the kernel, which decides to run the

more important task (4). The higher priority task executes to completion unless it also

gets interrupted (5). At the end of the task, the kernel resumes the lower priority task (6).

The lower priority task continues to execute (7).

A preemptive kernel can ensure that time critical tasks are performed first. In

other words, without a kernel we may not get to perform a task that is time critical

because we would be working on something that is not. Furthermore, execution of

time critical tasks is deterministic and is almost insensitive to software changes.

Figure 4.5: Execution profile in a preemptive kernel

TASKS

A task is basically an infinite loop and looks as shown below. In order for the

kernel to allow other tasks to perform work, you must invoke a service provided by



the kernel to wait for some event to occur. The event can be either the expiry of a time delay or

the occurrence of a signal from either another task or an ISR.

void Task(void)
{
    while (1) {
        /* Perform some work (Application specific) */
        /* Wait for event by calling a service provided by the kernel */
        /* Perform some work (Application specific) */
    }
}

At any given time, a task is in one of six states: Dormant, Ready, Running,

Delayed, Waiting for an event or Interrupted. The ‘dormant’ state corresponds to a

task which resides in memory but has not been made available to the multitasking

kernel. A task is ‘ready’ when it can execute but its priority is lower than that of the currently

running task. A task is ‘running’ when it has control of the CPU. A task is ‘delayed’

when the task is waiting for time to expire. A task is ‘waiting for an event’ when it

requires the occurrence of an event: waiting for an I/O operation to complete, a shared

resource to be available, a timing pulse to occur, etc. Finally, a task is ‘interrupted’

when an interrupt occurred and the CPU is in the process of servicing the interrupt.

You must invoke a service provided by the kernel to have the kernel manage

each task. This service creates a task by taking it from its ‘dormant’ state to the

‘ready’ state. Each task requires its own stack and, as far as the task is concerned, it has

access to most CPU registers. The function prototype for a task creation function is

shown below.

void OSTaskCreate(void (*task)(void), void *stack, UBYTE priority)

‘task’ is the function name of the task to be managed by the kernel. ‘stack’ is a

pointer to the top-of-stack of the stack to be used by the task. ‘priority’ is the task's

relative priority - its importance with respect to all other tasks in the system.

The kernel keeps track of the state of each task by using a data structure called

the Task Control Block (TCB). The TCB contains the status of the task (Ready,

Delayed or Waiting for an event), the task priority, the pointer to the task's

top-of-stack and other kernel related information. Figure 4.6 shows the relationship between

the task stacks, the TCBs and the CPU registers. Based on events, the kernel will



switch between tasks. This process basically involves saving the contents of the CPU

registers (i.e. the CPU context) onto the current task stack, saving the stack pointer

into the current task's TCB, loading the stack pointer from the new task's TCB and

finally, loading the CPU registers with the context of the more important task. This

process is called a context switch (or task switch).

Figure 4.6: The relationship between the task stacks, the TCBs and

the CPU registers.
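A real context switch is written in assembly language for a specific processor, so it cannot be shown portably. The four steps described above can nevertheless be simulated in plain C. In the sketch below everything is an assumption made for illustration: the "CPU" is an array of four registers plus a stack pointer variable, each task stack is a small array, and the names tcb_init, start_task and context_switch are hypothetical.

```c
#include <string.h>

#define NUM_REGS   4     /* register count of the simulated CPU */
#define STACK_SIZE 16    /* words per task stack */

typedef struct {
    int  stack[STACK_SIZE];
    int *sp;             /* saved top-of-stack; the stack grows downward */
} TCB;

int  cpu_regs[NUM_REGS];  /* stands in for the real CPU registers */
int *cpu_sp;              /* stands in for the real stack pointer */

/* Give a new task an initial context of all-zero registers. */
void tcb_init(TCB *t)
{
    memset(t->stack, 0, sizeof t->stack);
    t->sp = t->stack + STACK_SIZE;
    for (int i = 0; i < NUM_REGS; i++)
        *--t->sp = 0;
}

/* Load the context of the very first task. */
void start_task(TCB *t)
{
    cpu_sp = t->sp;
    for (int i = NUM_REGS - 1; i >= 0; i--)
        cpu_regs[i] = *cpu_sp++;
}

void context_switch(TCB *cur, TCB *next)
{
    for (int i = 0; i < NUM_REGS; i++)        /* 1. push registers on current stack */
        *--cpu_sp = cpu_regs[i];
    cur->sp = cpu_sp;                         /* 2. save SP in the current TCB */
    cpu_sp = next->sp;                        /* 3. load SP from the new TCB */
    for (int i = NUM_REGS - 1; i >= 0; i--)   /* 4. pop registers of the new task */
        cpu_regs[i] = *cpu_sp++;
}
```

The cost of the switch grows with NUM_REGS, which is why context switch times are processor-specific.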

THE READY LIST

One of the kernel's functions is to maintain a list of all the tasks that are

ready-to-run in order of priority as shown in figure 4.7. This particular list is called the

Ready List. When the kernel decides to run a more important task it basically picks

the TCB at the head of the ready list. This process is called Scheduling. As you can

see, finding the most important task in the list is fairly simple. The problem with

using a singly-linked list as shown in figure 4.7 is that the kernel may have to go

through the whole list in order to insert a low priority TCB. This actually happens



every time a high priority task preempts a lower priority task. Linked lists are only

used here for purpose of illustration. There are actually better techniques to insert and

remove TCBs from a list instead of using linked lists.

Figure 4.7: The ready list
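The singly-linked ready list can be sketched in C as follows. The structure name ReadyTCB, the functions ready_insert and ready_pop, and the convention that a lower number means a higher priority are assumptions made for this example. Scheduling is a constant-time removal from the head, while insertion may walk the whole list, which is the cost the text points out.

```c
#include <stddef.h>

typedef struct ReadyTCB {
    int              priority;   /* lower number = higher priority (an assumption) */
    struct ReadyTCB *next;
} ReadyTCB;

/* Insert 'node' so the list stays sorted, most important task first.
   A task is placed behind tasks of equal priority.
   Worst case: the whole list is traversed for a low priority task. */
void ready_insert(ReadyTCB **head, ReadyTCB *node)
{
    ReadyTCB **pp = head;
    while (*pp != NULL && (*pp)->priority <= node->priority)
        pp = &(*pp)->next;
    node->next = *pp;
    *pp = node;
}

/* Scheduling: remove and return the TCB at the head of the list,
   or NULL if no task is ready. */
ReadyTCB *ready_pop(ReadyTCB **head)
{
    ReadyTCB *t = *head;
    if (t != NULL)
        *head = t->next;
    return t;
}
```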

EVENT MANAGEMENT

The kernel provides services to allow a task to suspend execution until an

event occurs. The most common type of event to wait for is the semaphore. A

semaphore is used to either control access to a shared resource (mutual exclusion),

signal the occurrence of an event or allow two tasks to synchronize their activities. A

semaphore generally consists of a value (an unsigned 16-bit variable) and a waiting

list (see figure 4.8).

Figure 4.8: A semaphore

The semaphore must be initialized to a value through a service provided by the

kernel before it can be used. Because the kernel provides multitasking, a resource

such as a printer must be protected from simultaneous access by two or more tasks.

Because you only have one printer, you initialize the semaphore with a value of 1. A

task desiring access to the printer performs a WAIT operation on the semaphore. If

the semaphore value is 1, the semaphore value is decremented to 0 and the task

continues execution. At this point, the task ‘owns’ the printer. If another task needs

access to the printer, it must also perform a WAIT operation. This time, however, the

semaphore value is 0, which indicates that the printer is in use. In this case, the task is



removed from the ready list and placed in the semaphore waiting list. When the first

task is done with the printer, it performs a SIGNAL operation which frees up the

printer. At this point, the task waiting for the printer is placed in the ready list. If the

task that was waiting for the printer is more important than the current task, the

current task will be preempted and the task waiting for the printer will be given

control of the CPU. The pseudo code for the semaphore management functions is

shown below.

UBYTE OSSemWait(OS_EVENT *sem, UWORD to)
{
    Disable Interrupts;
    if (sem->Value > 0) {
        sem->Value--;
        Enable Interrupts;
        return (no error);
    } else {
        Remove task's TCB from ready list;
        Place task's TCB in the waiting list for the semaphore;
        Wait for semaphore to be 'signaled' (execute next task);
        if (timeout occurred) {
            Enable Interrupts;
            return (timeout error code);
        } else {
            Enable Interrupts;
            return (no error);
        }
    }
}

void OSSemSignal(OS_EVENT *sem)
{
    Disable Interrupts;
    if (any tasks waiting for the semaphore) {
        Remove highest priority task waiting for the sem. from wait list;
        Place the task in the ready list;
        Enable Interrupts;
        Call the scheduler to run the highest priority task;
    } else {
        sem->Value++;
        Enable Interrupts;
    }
}
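The bookkeeping in the pseudo code can be exercised with a small single-threaded simulation in C. This is a sketch under stated assumptions, not a kernel implementation: there is no real blocking, no timeout and no interrupt masking, tasks are identified only by their priority (lower number = more important), and the names Sem, sem_init, sem_wait and sem_signal are hypothetical.

```c
#define MAX_WAITERS 8   /* hypothetical limit on waiting tasks */

typedef struct {
    unsigned value;                 /* semaphore count */
    int      waiting[MAX_WAITERS];  /* priorities of the waiting tasks */
    int      nwaiting;              /* number of entries in 'waiting' */
} Sem;

void sem_init(Sem *s, unsigned value)
{
    s->value = value;
    s->nwaiting = 0;
}

/* WAIT: returns 1 if the caller got the resource, 0 if it was placed on
   the waiting list, -1 if the waiting list is full. */
int sem_wait(Sem *s, int task_prio)
{
    if (s->value > 0) {
        s->value--;
        return 1;
    }
    if (s->nwaiting == MAX_WAITERS)
        return -1;
    s->waiting[s->nwaiting++] = task_prio;
    return 0;
}

/* SIGNAL: returns the priority of the task made ready, or -1 if no task
   was waiting (in which case the count is incremented). */
int sem_signal(Sem *s)
{
    if (s->nwaiting == 0) {
        s->value++;
        return -1;
    }
    int best = 0;
    for (int i = 1; i < s->nwaiting; i++)
        if (s->waiting[i] < s->waiting[best])
            best = i;                             /* most important waiter */
    int prio = s->waiting[best];
    s->waiting[best] = s->waiting[--s->nwaiting]; /* remove it from the list */
    return prio;
}
```

With the printer example, sem_init(&printer, 1) models the single printer: the first sem_wait succeeds, later callers are queued, and each sem_signal hands the printer to the most important waiter.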

Kernels generally support other types of events such as message mailboxes,

message queues, event flags, pipes, etc. The operation of these other types of events is

very similar to semaphores. A message mailbox is a data structure containing a

pointer to a user definable message along with a waiting list. Tasks are placed in the

waiting list until messages are sent by either another task or an ISR. A message queue

is a data structure containing a list of pointers to messages arranged as a FIFO (First-

In-First-Out) along with a waiting list. Tasks are placed in the waiting list until

another task or an ISR deposits one or more messages in the FIFO. Event flags are



groupings of single-bit flags (on or off) and a waiting list. Tasks are placed in the

waiting list until one or more bits are set. Pipes are character size FIFOs with a

waiting list. Again, tasks are placed in the pipe's waiting list until characters are

received.
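The FIFO part of a message queue can be sketched in C as a ring buffer of message pointers. The names MsgQueue, mq_post and mq_pend and the capacity of four are assumptions for illustration. The waiting list is left out: where a real kernel would suspend the caller on an empty queue, this sketch simply returns NULL.

```c
#include <stddef.h>

#define QUEUE_SIZE 4   /* hypothetical capacity */

typedef struct {
    void *msg[QUEUE_SIZE];  /* pointers to messages, not copies of them */
    int   head;             /* index of the oldest message */
    int   count;            /* messages currently stored */
} MsgQueue;

/* Deposit a message pointer; returns 0 on success, -1 if the queue is full.
   Could be called by a task or by an ISR. */
int mq_post(MsgQueue *q, void *m)
{
    if (q->count == QUEUE_SIZE)
        return -1;
    q->msg[(q->head + q->count) % QUEUE_SIZE] = m;
    q->count++;
    return 0;
}

/* Remove the oldest message (first in, first out).
   Returns NULL if the queue is empty. */
void *mq_pend(MsgQueue *q)
{
    if (q->count == 0)
        return NULL;
    void *m = q->msg[q->head];
    q->head = (q->head + 1) % QUEUE_SIZE;
    q->count--;
    return m;
}
```

A message mailbox is the special case of this structure with room for a single pointer.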

SYSTEM TICK

Every kernel provides a mechanism to keep track of time. This is handled by a

hardware timer which interrupts the CPU periodically. The ISR for this timer invokes

a service provided by the kernel which is responsible for updating internal

time-dependent variables. Specifically, a number of services rely on ‘timeouts’. This ISR is

generally called the clock tick ISR. The clock tick interrupt is usually set to occur

between 10 and 100 times per second. A task can thus suspend its execution for an

integral number of ‘clock ticks’. A special list is used to keep track of tasks that are

waiting for time to expire. This list is called the delayed task list. The delayed task list

can be set up as a delta list which basically orders delayed tasks so that the first task

in the list has the least amount of time remaining to expire. For example, if you had

five tasks with 10, 14, 21, 32 and 39 tenths of a second remaining until timeout,

then the list would contain 10, 4, 7, 11 and 7. The total time before the first task

expires is 10, the second is 10+4, the third is 10+4+7, the fourth is 10+4+7+11 and

finally, the fifth task would be 10+4+7+11+7.
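The delta list arithmetic above can be checked with a short C sketch. The function names to_delta and delta_tick are hypothetical, and an array stands in for the linked list a kernel would normally use. The benefit of the delta form is visible in delta_tick: on each clock tick only the head entry is decremented, no matter how many tasks are delayed.

```c
/* Convert sorted absolute timeouts into delta-list form, in place.
   Example: 10, 14, 21, 32, 39 becomes 10, 4, 7, 11, 7. */
void to_delta(int *t, int n)
{
    for (int i = n - 1; i > 0; i--)
        t[i] -= t[i - 1];
}

/* Process one clock tick. '*first' is the index of the current head.
   Only the head entry is decremented; every entry that reaches zero is
   removed by advancing '*first'. Returns the number of tasks that
   expired on this tick. */
int delta_tick(int *d, int n, int *first)
{
    int expired = 0;
    if (*first < n)
        d[*first]--;
    while (*first < n && d[*first] <= 0) {
        (*first)++;
        expired++;
    }
    return expired;
}
```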

INTERRUPT PROCESSING

Kernels provide services that are accessible from ISRs to notify tasks about

the occurrence of events. In order to use these services, however, your ISRs must save

all CPU registers and notify (through a function) the kernel that an ISR has been

entered. In this case, the kernel simply increments a nesting counter (generally an

8-bit variable). The nesting counter is used to determine the nesting level of interrupts

(how many interrupts were interrupted). Upon completion of the ISR code, your ISR

must invoke another service provided by the kernel to notify it that the ISR has

completed. This service basically decrements the nesting counter. When all interrupts

have returned to the first interrupt nesting level, the kernel determines if a higher

priority task has been made ready-to-run by one of the ISRs. If the interrupted task is



still the most important task to run, the CPU registers are restored and the interrupted

task is resumed. If a higher priority task is ready-to-run, the kernel saves the stack

pointer of the interrupted task into its TCB, obtains the stack pointer of the new task,

restores the CPU registers for the new task and resumes execution of the new task.

This process is shown in figure 4.9.

Figure 4.9: Interrupt Processing
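The role of the nesting counter can be sketched in C as follows. The function names os_int_enter, os_isr_make_ready and os_int_exit are hypothetical stand-ins for the kernel services the text refers to, and the actual task switch is replaced by a counter so that the logic can be run on any machine.

```c
#include <stdint.h>

uint8_t int_nesting;        /* nesting counter kept by the kernel */
int     higher_prio_ready;  /* set when an ISR readies a more important task */
int     switches;           /* counts task switches, for illustration only */

/* Called at the start of every ISR, after the CPU registers are saved. */
void os_int_enter(void)
{
    int_nesting++;
}

/* Called from within an ISR when it makes a higher priority task ready. */
void os_isr_make_ready(void)
{
    higher_prio_ready = 1;
}

/* Called at the end of every ISR. Only the outermost ISR, the one that
   brings the counter back to zero, is allowed to switch tasks. */
void os_int_exit(void)
{
    if (int_nesting > 0)
        int_nesting--;
    if (int_nesting == 0 && higher_prio_ready) {
        higher_prio_ready = 0;
        switches++;   /* a real kernel would switch to the new task here */
    }
}
```

Deferring the switch until the counter reaches zero ensures that a nested ISR never abandons the ISR it interrupted.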

A kernel allows real-time applications to be easily designed and expanded;

other functions could be added without requiring major changes to the software. A

large number of applications can benefit from the use of a kernel. Kernels can ensure

that all time critical events are handled as quickly and efficiently as possible. A

minimum kernel requires only about 1 to 3 KB of ROM and a few hundred bytes of

RAM. Some kernels even allow you to specify the size of each task's stack on a task-

by-task basis. This feature helps reduce the amount of RAM needed in your

application.

4.6 Performance characteristics of RTOSs

We often use the term real-time to describe computing problems for which a

late answer is as bad as a wrong one. These problems are said to have deadlines, and

embedded systems frequently operate under such constraints. For example, if the

embedded software that controls your anti-lock brakes misses one of its deadlines,

you might find yourself in an accident. So it is extremely important that the designers

of real-time embedded systems know everything they can about the behavior and



performance of their hardware and software. In this section we will discuss the

performance characteristics of real time operating systems, which are a common

component of real-time systems.

The designers of real-time systems spend a large amount of their time

worrying about worst-case performance. They must constantly ask themselves

questions like the following: What is the worst-case time between the human operator

pressing the brake pedal and an interrupt signal arriving at the processor? What is the

worst-case interrupt latency, the time between interrupt arrival and the start of the

associated interrupt service routine (ISR)? And what is the worst-case time for the

software to respond by triggering the braking mechanism? Average or expected-case

analysis simply will not suffice in such systems.

Most of the commercial embedded operating systems available today are

designed for possible inclusion in real-time systems. In the ideal case, this means that

their worst-case performance is well understood and documented. To earn the

distinctive title "Real-Time Operating System" (RTOS), an operating system should

be deterministic and have guaranteed worst-case interrupt latency and context switch

times. Given these characteristics and the relative priorities of the tasks and interrupts

in your system, it is possible to analyze the worst-case performance of the software.

An operating system is said to be deterministic if the worst-case execution

time of each of the system calls is calculable. An operating system vendor that takes

the real-time behavior of its RTOS seriously will usually publish a data sheet that

provides the minimum, average, and maximum number of clock cycles required by

each system call. These numbers might be different for different processors, but it is

reasonable to expect that if the algorithm is deterministic on one processor, it will be

so on any other.

Interrupt latency is the total length of time from an interrupt signal's arrival at

the processor to the start of the associated interrupt service routine. When an interrupt

occurs, the processor must take several steps before executing the ISR. First, the

processor must finish executing the current instruction. That probably takes less than

one clock cycle, but some complex instructions require more time than that. Next, the



interrupt type must be recognized. This is done by the processor hardware and does

not slow or suspend the running task. Finally, and only if interrupts are enabled, the

ISR that is associated with the interrupt is started.

Of course, if interrupts are ever disabled within the operating system, the

worst-case interrupt latency increases by the maximum amount of time that they are

turned off. But as we have just seen, there are many places where interrupts are

disabled. These are the critical sections we talked about earlier, and there are no

alternative methods for protecting them. Each operating system will disable interrupts

for a different length of time, so it is important that you know what your system's

requirements are. One real-time project might require a guaranteed interrupt response

time as short as 1 microsecond, while another requires only 100 microseconds.

The third real-time characteristic of an operating system is the amount of time

required to perform a context switch. This is important because it represents overhead

across your entire system. For example, imagine that the average execution time of

any task before it blocks is 100 microseconds but that the context switch time is also

100 microseconds. In that case, fully one-half of the processor's time is spent within

the context switch routine! Again, there is no magic number and the actual times are

usually processor-specific because they are dependent on the number of registers that

must be saved and where. Be sure to get these numbers from any operating system

vendor you are thinking of using.
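The two pieces of arithmetic in this section can be written down as small C helpers. They are a sketch of the simplified model used in the text, and the function names are hypothetical: every task is assumed to run for the same time before blocking, and the hardware latency and the longest interrupts-disabled period are assumed to simply add.

```c
/* Fraction of processor time spent in the context switch routine when each
   task runs for 'exec_us' microseconds before it blocks and each context
   switch costs 'switch_us' microseconds. */
double switch_overhead(double exec_us, double switch_us)
{
    return switch_us / (exec_us + switch_us);
}

/* Worst-case interrupt latency under the simplified model: the hardware
   latency plus the longest time the operating system keeps interrupts
   disabled. */
double worst_case_latency(double hw_latency_us, double max_disabled_us)
{
    return hw_latency_us + max_disabled_us;
}
```

For the example in the text, switch_overhead(100.0, 100.0) evaluates to 0.5, that is, one-half of the processor's time.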

4.7 An Example RTOS: VxWorks

VxWorks is one of the most popular commercial real-time operating systems

available in the market. VxWorks is a product of Wind River Systems Inc. This

section provides detailed information on the main features of the operating system.

VxWorks is the premier development and execution environment for complex

real-time and embedded applications on a wide variety of target processors. It is a

networked real-time operating system designed to be used in a distributed

environment (VxWorks was the real-time operating system kernel that was used in the



famous Mars Pathfinder mission in 1997). It has very high performance along with

sophisticated networking facilities. Also, VxWorks is flexible, scalable, reliable, has

an open architecture and an industry standard support that makes it easy for users to

design efficient multi-vendor systems and migrate to different processors with

minimal effort. The latest version of VxWorks 5.X series is VxWorks 5.4.

Wind River Systems Inc. is a leading provider of embedded software and

services. Its product range includes software development tools, real-time operating

systems and advanced connectivity software used in industries such as

telecommunications, digital imaging, networking, computing and aerospace. Wind

River Systems Inc. was founded in 1983, with its headquarters in Alameda, California,

and operates in sixteen countries worldwide.

In embedded systems programming, the program is developed on one

system (the host) other than the one (the target) on which the software will eventually run.

VxWorks is a real-time operating system that runs time-critical and embedded

applications on target boards.

The VxWorks operating system consists of a small kernel (which controls the

execution of tasks) and surrounding facilities provided for application use. Many

surrounding facilities (and some parts of the kernel) may be scaled out of the

VxWorks image if the application does not need them. Some facilities (e.g.

networking) may be useful during development even if they are not needed in the

final product; it is easy to remove such facilities from the shipping image.

Figure 4.10: VxWorks operating system (blocks shown: scheduling, system clock facilities, mutual exclusion, synchronization and ITC; memory management; device support; file systems; I/O system; networking support)



Tornado Environment

Tornado is an integrated development environment, running on the host machine, for

cross-development of real-time and embedded applications.

Tornado includes:

The VxWorks real-time operating system, running on the target computer.

The Tornado Tools, which allow developers to download, test, and debug

their applications.

The GNU compiler tool chain (compiler, assembler, linker, make, and binary

utilities). These tools run on the host, and produce code for the target.

The Target Server (host) and WDB agent (target), which together allow the

tools to communicate with and control the target.

Figure 4.11: Tornado environment (a development system with a standard OS and RTOS tools, connected by RS-232 or a network connection to the real-time OS based target hardware)



After booting the target, you must start a Target Server (on the host) to access

the target using the Tornado tools. All Tornado tools use the Wind River Tool Exchange

(WTX) protocol to communicate with the target server. The target server manages target

resources from the host by:

Communicating with debug agent on target.

Managing host-resident symbol table for target.

Loading and unloading modules dynamically.

Allocating memory on target for host tools.

Maintaining a cache of target program text segment memory.

The WDB agent acts on the target on behalf of the target server and Tornado

tools. It does the following:

Reading or modifying memory.

Setting or clearing breakpoints.

Creating, starting, stopping and deleting tasks.

Calling functions.

The WDB agent interprets the request and calls the appropriate functionality.

The result of calling that functionality is then packaged and sent back to the Target

Server on the host, which then passes the results to the tool that initiated the

transaction. The target server and WDB agent communicate via the Wind

Debug (WDB) protocol.

Wtxregd is the Wind Registry daemon, which tracks available target servers. It

must be running on the host. The wtxregd manages the information a tool needs to

connect to a target server. The wtxregd must be started before the target server and tools.

Tools initially contact the registry to find out how to contact a particular target server.



The cross compiler is an executable on the host, generating an image for the

target hardware. Using cross compilers and linkers, compile and link the two sources,

and generate an “image”. A mechanism is provided (by the OS and your

hardware) to move the image from the host machine to the target hardware.

The Operating System Source tree consists of the source shipped by the RTOS

vendor, and usually includes OS primitives (task/thread libraries and IPC support) and

framework and library support that lets developers support newer devices. The

board-specific code for initializing and managing a board's hardware is called the BSP (board support package). The

BSP provides VxWorks with standard hardware interface functions which allow it to

run on a board.

Tornado includes an integrated target simulator, which allows application

development prior to hardware availability. The simulator offers support for all

VxWorks facilities except hardware-specific facilities and Networking support.

Important features of VxWorks

High Performance micro-kernel: VxWorks comes with the “wind” kernel, which provides

efficient task management and a multitasking environment with an unlimited number of

tasks. Its multitasking environment creates a group of independent tasks for a real-time

application. Each task has a separate thread of execution and its own set of

system resources. The “wind” kernel provides intertask synchronization and

communication facilities that allow the independent tasks of the real-time

application to coordinate their actions within the system.



The “wind” kernel also provides message queues, pipes, sockets, and signals

for intertask communication. It comes with an optional component called “VxMP”

that provides shared-memory objects as a communication mechanism for tasks

executing on different CPUs. The “wind” kernel uses interrupt-driven, preemptive

priority-based task scheduling with 256 priority levels, combined with round-robin

scheduling among tasks of equal priority.

Any subroutine can be spawned as a separate task, with its own context and

stack. A context switch saves the context of the running task and restores that of

the next task whenever the scheduler transfers control between tasks. The “wind”

kernel features fast, deterministic context switching with low interrupt latency. Other

task control facilities provided by the kernel are: suspend, resume, delete, delay, and

move. Task synchronization and mutual exclusion are done using semaphores. The

kernel also provides interrupt handling support, watchdog timers, and memory

management.
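The scheduling policy described above can be sketched in plain C. This is a minimal model and not VxWorks code: the names ReadyQueue, ready_add(), ready_pick() and ready_rotate(), and the limit MAX_PER_LEVEL, are assumptions made for illustration. It follows the VxWorks convention that priority 0 is the highest and 255 the lowest.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PRIORITIES 256   /* 0 is the highest priority, 255 the lowest */
#define MAX_PER_LEVEL  8     /* hypothetical limit, for this sketch only */

/* One FIFO of ready task ids per priority level. */
typedef struct {
    int    ids[MAX_PER_LEVEL];
    size_t head;             /* index of the task at the front of the level */
    size_t count;
} ReadyLevel;

typedef struct {
    ReadyLevel level[NUM_PRIORITIES];
} ReadyQueue;

/* Append a task to the tail of its priority level. Returns 0 on success. */
int ready_add(ReadyQueue *q, int task_id, int priority)
{
    ReadyLevel *lv;
    assert(priority >= 0 && priority < NUM_PRIORITIES);
    lv = &q->level[priority];
    if (lv->count == MAX_PER_LEVEL)
        return -1;
    lv->ids[(lv->head + lv->count) % MAX_PER_LEVEL] = task_id;
    lv->count++;
    return 0;
}

/* Return the task that should run now: the front of the highest-priority
 * non-empty level, or -1 if no task is ready. */
int ready_pick(const ReadyQueue *q)
{
    int p;
    for (p = 0; p < NUM_PRIORITIES; p++) {
        const ReadyLevel *lv = &q->level[p];
        if (lv->count > 0)
            return lv->ids[lv->head];
    }
    return -1;
}

/* Round robin: when the running task's time slice expires, move it to the
 * tail of its own level so that a task of equal priority runs next. */
void ready_rotate(ReadyQueue *q, int priority)
{
    ReadyLevel *lv = &q->level[priority];
    if (lv->count > 1) {
        int id = lv->ids[lv->head];
        lv->head = (lv->head + 1) % MAX_PER_LEVEL;
        lv->ids[(lv->head + lv->count - 1) % MAX_PER_LEVEL] = id;
    }
}
```

Preemption follows from calling ready_pick() at every scheduling point: as soon as a higher-priority task becomes ready, it is the one returned.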

POSIX Compatibility: The purpose of POSIX (Portable Operating System Interface)

compatibility is to make an application portable across various operating systems. As

the name suggests, it makes moving applications from one operating system to another

much easier. VxWorks supports the final approved standard for the POSIX 1003.1b

Real-Time Extensions specification, including POSIX-compliant asynchronous I/O,

counting semaphores, message queues, signals, memory management, clocks and

timers, and scheduling control. It provides most interfaces specified by the 1003.1b

standard (formerly the 1003.4 standard), simplifying your ports from other

conforming systems. It also supports the basic system calls in the 1003.1

specification, including process primitives, files and directories, I/O primitives,

language services, and directory handling.

Using the VxWorks POSIX API, developers can write new applications for

VxWorks targets and port existing applications or application components to VxWorks

systems from other development and target platforms. The POSIX functions invoke

the VxWorks kernel and other routines to communicate between tasks and routines

and to coordinate the use of system resources.

I/O System: VxWorks has seven basic I/O routines: creat(), remove(), open(),

close(), read(), write(), and ioctl(). Higher-level I/O routines such as the ANSI C

compatible printf() and scanf() routines are also provided. It also has a standard

buffered I/O package (stdio) that includes ANSI C-compatible routines such as

fopen(), fclose(), fread(), fwrite(), getc(), and putc(). The VxWorks® I/O system also

includes POSIX-compliant asynchronous I/O: a library of routines that

perform I/O operations concurrently with a task’s other activities. VxWorks includes device

drivers for serial communication, disks, RAM disks, SCSI tape devices, intertask

communication devices (called pipes), and devices on a network. It also provides

facilities for additional drivers to be written by the developer. It allows dynamic

installation and removal of drivers without rebooting the system. When an I/O request

is sent by the user, the I/O system does very little processing of the request and gives

control of the request to the device driver. It works like a switch in transferring the

user request to the driver. Drivers can then process the request appropriately using

different protocols, device-specific routines, and different file systems, without any

interference from the I/O system. In addition, VxWorks provides several high-level

subroutine libraries for writing drivers that implement standard protocols for

both character-oriented devices (like terminals and printers) and block-oriented

devices (file system storage devices like hard drives or floppy drives).
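The switch-like behaviour of the I/O system can be pictured as a table of driver entry points indexed by a driver number. The sketch below is an assumed, simplified model in plain C. The names Driver, drv_install(), io_write(), MemDev and mem_write() are hypothetical and are not the actual VxWorks driver interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Entry points supplied by a driver (hypothetical layout). */
typedef struct {
    int (*write)(void *dev, const char *buf, size_t n);
    int (*read)(void *dev, char *buf, size_t n);
} Driver;

#define MAX_DRIVERS 4
static Driver driver_table[MAX_DRIVERS];
static int    driver_count = 0;

/* Install a driver at run time. Returns its driver number, or -1 if the
 * table is full. */
int drv_install(Driver d)
{
    if (driver_count == MAX_DRIVERS)
        return -1;
    driver_table[driver_count] = d;
    return driver_count++;
}

/* The I/O system does almost no work itself: it looks up the driver and
 * forwards the request unchanged. */
int io_write(int drv_num, void *dev, const char *buf, size_t n)
{
    if (drv_num < 0 || drv_num >= driver_count || !driver_table[drv_num].write)
        return -1;
    return driver_table[drv_num].write(dev, buf, n);
}

/* Example driver: a memory-backed device that stores what is written. */
typedef struct {
    char   data[64];
    size_t len;
} MemDev;

int mem_write(void *dev, const char *buf, size_t n)
{
    MemDev *m = dev;
    assert(dev != NULL);
    if (n > sizeof m->data - m->len)
        n = sizeof m->data - m->len;   /* truncate to the space left */
    memcpy(m->data + m->len, buf, n);
    m->len += n;
    return (int)n;
}
```

Because io_write() only forwards the call, each driver is free to interpret the request in its own way, which is the point made in the paragraph above.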

Shared Memory: VxWorks provides memory sharing. But memory sharing can

pose a threat of data corruption when two or more tasks try to access the shared

memory simultaneously.

This requires task synchronization and mutual exclusion, which is done using

semaphores in VxWorks. Semaphores coordinate the several tasks that are sharing the

same memory and avoid data corruption. There are four kinds of semaphores in

VxWorks:

1. Binary Semaphores: For synchronization and mutual exclusion. These are the

fastest and most commonly used semaphores.

2. Counting Semaphores: Similar to binary semaphores. In addition, they keep

track of the number of times the semaphore is given.

3. Mutual-Exclusion Semaphores: Binary semaphores specialized for mutual exclusion.

4. POSIX semaphores: POSIX defines both named and unnamed semaphores. The

POSIX semaphore library provides routines for creating, opening, and destroying both

named and unnamed semaphores.
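The difference between a binary and a counting semaphore lies in how "give" is counted. The following is a single-threaded model of the give/take rules only, written as a minimal sketch. The names SemModel, sem_binary(), sem_counting(), sem_give() and sem_take() are hypothetical, and a real kernel semaphore would block the calling task in the take operation instead of returning false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    unsigned count;   /* number of takes that can currently succeed */
    unsigned max;     /* 1 for a binary semaphore */
} SemModel;

/* A binary semaphore is either available (1) or unavailable (0). */
SemModel sem_binary(bool available)
{
    return (SemModel){ available ? 1u : 0u, 1u };
}

/* A counting semaphore starts at an initial count and has no small limit. */
SemModel sem_counting(unsigned initial)
{
    return (SemModel){ initial, ~0u };
}

/* Give: a binary semaphore saturates at 1, while a counting semaphore
 * remembers every give. */
void sem_give(SemModel *s)
{
    assert(s != NULL);
    if (s->count < s->max)
        s->count++;
}

/* Take: succeeds if the semaphore is available. Otherwise it reports that
 * the caller would have to wait. */
bool sem_take(SemModel *s)
{
    assert(s != NULL);
    if (s->count == 0)
        return false;
    s->count--;
    return true;
}
```

Giving a binary semaphore twice still allows only one take, whereas a counting semaphore given twice allows two, which suits it to guarding several instances of a resource.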

The VxMP option with VxWorks facilitates sharing of message queues and

semaphores between processors. With the VxMP option, the above-mentioned

interprocessor communication functions are made available to the entire system. The

developer may design applications using shared memory for simple sharing of data,

message queues or pipes for intertask messaging, sockets and remote procedure calls for

network-transparent communication, and signals for exception handling.

Local File System: VxWorks provides fast file systems tailored to real-time

applications. It provides the following compatible file systems:

– File system compatible with the MS-DOS file system “dosFs”: Allows efficient

organization of files and permits an arbitrary number of files to be created. It provides

the ability to specify contiguous file allocation for any file for enhanced performance.

It provides compatibility with other storage media, like diskettes and hard drives,

created with “dosFs”. Name lengths are not restricted to eight characters as they are in

MS-DOS.

– File system compatible with the RT-11 file system “rt11Fs”: This file system is used

for real-time applications because all files are contiguous, which results in very high

performance. VxWorks® provides “rt11Fs” compatibility including byte-addressable

random access to all files. Each open file has a block buffer for optimized reading and

writing.

– File system compatible with the “raw disk” file system “rawFs”: In “rawFs” the

entire disk is considered to be one large file. It permits reading and writing portions of

the disk, which are specified by a byte offset, and performs simple buffering. When

only simple, low-level disk I/O is required, it has the advantages of size and speed.

– File system compatible with SCSI tape devices “tapeFs”: With “tapeFs”,

VxWorks does not provide a standard file or directory structure on tape. Instead the

tape is considered as one large file. A higher-level layer organizes the data on

the tape.

– File system compatible with CD-ROM devices “cdRomFs”: Any CD-ROM

formatted in accordance with ISO 9660 file system standards can be read in

VxWorks. Data on a CD-ROM device can be accessed using the standard POSIX I/O

calls.

C++ Development Support: VxWorks provides support for C++ libraries including

the iostream class library and the standard template library. The optional component

“Wind Foundation Classes” adds the VxWorks Wrapper Class library and the Tools.h++

library from Rogue Wave.



Virtual Memory: VxWorks provides both bundled and unbundled (VxVMI option)

virtual memory support for boards with a Memory Management Unit (MMU),

including the ability to make portions of memory non-cacheable or read-only, as well

as a set of routines for virtual-memory management. Bundled virtual memory is

useful for multiprocessor environments where memory is shared across processors or

where DMA transfers take place.

Target-resident Tools: In Tornado, the development tools reside on the host

system. However, it is possible to configure a target-resident shell, module

loader and unloader, and symbol table into the VxWorks system.

Utility Libraries: VxWorks provides an extensive set of over 1100 utility routines to

help application developers solve common problems. Some of the most

commonly used utility routines are given below:

Interrupt handling support: VxWorks supplies routines for handling hardware

interrupts and software traps. Interrupt handling is nested and prioritized. An interrupt

service routine runs in a special context outside any task, so no task context switch

is needed, which makes it faster. In addition, routines

are provided for connecting C routines to hardware interrupt vectors, and to

manipulate the processor interrupt level.

Watchdog timers: A watchdog timer lets a task schedule a C function to be executed

after a specified delay. The task that starts the timer continues to run. When the

delay period elapses, the function is called as part of the system clock interrupt

service routine, unless the watchdog timer is canceled first.

Message logging: The message logging facility allows applications to log a formatted

error or status message to a system-wide logging device, such as the system console,

disk, or accessible memory.

Memory allocation: Memory management is one of the most critical aspects of

real-time systems. VxWorks provides a memory management facility that allocates,

frees, and reallocates blocks of memory from a memory pool.



String formatting and scanning: VxWorks provides a complete set of ANSI C library

string formatting and scanning subroutines that implement printf()/scanf() format-

driven encoding and decoding and associated routines.

Linear and ring buffer, Linked List manipulations: All tasks in VxWorks exist in a

single linear address space, so linear buffers, ring buffers, and linked lists can be

referenced directly by code that is running in different contexts. VxWorks provides a set

of ring buffer routines that manage first-in-first-out (FIFO) circular buffers. A single

reader and a single writer can access such a buffer simultaneously without interlock.

VxWorks also provides a set of routines for creating and manipulating doubly linked lists.
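A ring buffer can be shared without a lock when exactly one context writes and exactly one context reads, because each side then modifies only its own index. The sketch below illustrates the idea in plain C. The names Ring, ring_put() and ring_get() and the size RING_SIZE are hypothetical and this is not the VxWorks ring buffer library. It assumes a single processor; on a multiprocessor, memory barriers would also be needed.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8   /* hypothetical capacity; one slot is always kept empty */

typedef struct {
    char            buf[RING_SIZE];
    volatile size_t head;   /* next slot to write; changed only by the writer */
    volatile size_t tail;   /* next slot to read; changed only by the reader */
} Ring;

/* Writer side: returns 1 if the byte was stored, 0 if the ring is full. */
int ring_put(Ring *r, char c)
{
    size_t next = (r->head + 1) % RING_SIZE;
    if (next == r->tail)
        return 0;
    r->buf[r->head] = c;
    r->head = next;         /* publish only after the data is in place */
    return 1;
}

/* Reader side: returns 1 and stores the oldest byte in *out, or returns 0
 * if the ring is empty. */
int ring_get(Ring *r, char *out)
{
    assert(out != NULL);
    if (r->tail == r->head)
        return 0;
    *out = r->buf[r->tail];
    r->tail = (r->tail + 1) % RING_SIZE;
    return 1;
}
```

Keeping one slot empty lets the two sides distinguish a full ring from an empty one without sharing a separate counter.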

ANSI C libraries: VxWorks provides all C libraries specified by ANSI X3.159-1989.

It includes the following libraries: assert, ctype, errno, float, limits, locale,

math, setjmp, signal, stdarg, stdio, stddef, stdlib, string, and time. The header files

float.h, limits.h, errno.h, and stddef.h provide definitions and declarations.

Performance Evaluation Tools: VxWorks performance evaluation tools include an

execution timer for timing a routine or group of routines. This tool is very helpful

because the system clock is too coarse to time fast routines. There is also a spy utility

that shows the CPU utilization percentage by task.

Board Support Packages: Board Support Packages (BSPs) are available with

VxWorks for a variety of boards and provide routines for hardware initialization,

interrupt setup, timers, memory mapping, and so on. Two target-specific libraries,

sysLib and sysALib, are included with each port of VxWorks. These libraries provide

an identical software interface to the hardware functions of all boards.

VxWorks Simulator (VxSim™): VxSim™ is an optional product that comes with

VxWorks. It is a complete prototyping and simulation tool. It enables application

development to begin before hardware becomes available, allowing a large portion of

software testing to occur early in the development cycle. It adds networking facilities,

allowing the simulator to obtain an Internet address and communicate with the

network using the VxWorks networking tools. VxSim is essentially a port of

VxWorks. It provides a stable environment for prototyping VxWorks

applications.



Logic Analyzer (WindView): WindView is an optional product that comes with

VxWorks. It provides advanced debugging tools for the simulator environment.

WindView provides software logic analyzer support for all WRS BSPs.

Network Facilities: VxWorks provides the following networking facilities:

– It allows access to other VxWorks systems.

– It allows access to other TCP/IP-networked systems.

– It provides a MUX interface supporting advanced features such as multicasting, polled-

mode Ethernet, and zero-copy transmission.

– It provides a BSD Sockets-compliant programming interface.

– It provides remote procedure calls (RPC), SNMP (optional), remote file access (including

NFS client and server facilities and a non-NFS facility utilizing RSH, FTP, or TFTP),

BOOTP, proxy ARP, DHCP, DNS, OSPF (optional), and RIP.

All VxWorks network facilities comply with standard Internet protocols, both

loosely coupled over serial lines or standard Ethernet connections and tightly coupled

over a backplane bus using shared memory.

Intertask Communication: VxWorks provides the following mechanisms for

intertask communication:

Shared memory, for simple sharing of data.

Semaphores, for mutual exclusion and synchronization.

Message queues and pipes, for intertask message passing within a CPU.

Sockets and remote procedure calls, for network-transparent intertask

communication.

Signals, for exception handling.

Details on Message queues and Pipes:

In VxWorks, intertask communication within a processor is performed by

“message queues”. Using message queues, tasks can exchange information securely

and efficiently. Message queues allow a variable number of messages, each of

variable length, to be queued. Any task or ISR can send messages to a message queue.

Any task can receive messages from a message queue. Multiple tasks can send to and

receive from the same message queue. The messages are queued in FIFO order with



two priorities, high and low. High priority messages are attached to the head of the

queue and low priority messages are added to the tail of the queue.
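The two-priority queueing rule can be illustrated with a small circular queue in plain C. This is a minimal sketch under stated assumptions and not the msgQLib implementation. The names MsgQueue, msgq_send() and msgq_receive() and the limits MAX_MSGS and MAX_MSG_LEN are hypothetical. Instead of making a task wait, the sketch simply returns -1 when the queue is full or empty.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MSGS    4    /* hypothetical queue depth */
#define MAX_MSG_LEN 16   /* hypothetical maximum message length */

typedef struct {
    char   data[MAX_MSG_LEN];
    size_t len;
} Msg;

typedef struct {
    Msg    slot[MAX_MSGS];
    size_t head;         /* index of the message that is received next */
    size_t count;
} MsgQueue;

/* Send: low-priority messages join the tail, high-priority messages are
 * placed at the head. Returns 0 on success, -1 if the queue is full or the
 * message is too long. */
int msgq_send(MsgQueue *q, const void *buf, size_t len, int high_priority)
{
    size_t idx;
    assert(q != NULL && buf != NULL);
    if (q->count == MAX_MSGS || len > MAX_MSG_LEN)
        return -1;
    if (high_priority) {
        q->head = (q->head + MAX_MSGS - 1) % MAX_MSGS;  /* step head back */
        idx = q->head;
    } else {
        idx = (q->head + q->count) % MAX_MSGS;
    }
    memcpy(q->slot[idx].data, buf, len);
    q->slot[idx].len = len;
    q->count++;
    return 0;
}

/* Receive the message at the head. Returns its length, or -1 if the queue
 * is empty or the caller's buffer is too small. */
int msgq_receive(MsgQueue *q, void *buf, size_t max_len)
{
    Msg *m;
    assert(q != NULL && buf != NULL);
    if (q->count == 0)
        return -1;
    m = &q->slot[q->head];
    if (m->len > max_len)
        return -1;
    memcpy(buf, m->data, m->len);
    q->head = (q->head + 1) % MAX_MSGS;
    q->count--;
    return (int)m->len;
}
```

A high-priority message sent after two low-priority ones is therefore received first, while the low-priority messages keep their FIFO order.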

Synchronization between sending and receiving tasks is done by the message

queue to avoid data corruption or loss of data. A task attempting to receive a message

from an empty queue can wait for a specified length of time, wait indefinitely, or return

immediately and try again later. A task attempting to send a message through a queue

that is already full has the same options. This way data is never lost or corrupted.

VxWorks provides two libraries, msgQLib and mqPxLib, which

contain subroutines for controlling Wind message queues and POSIX message queues,

respectively.

Pipes are provided in VxWorks as an alternative to message queues. A pipe is a

FIFO buffer very similar to a message queue, because it is built on a message queue,

but pipes are much simpler to use than message queues. The driver “pipeDrv”

manages pipes. A pipe is created using the “pipeDevCreate()” routine in the

“pipeDrv” library. This routine creates the pipe along with the underlying message

queue. The syntax of such a call is given below:

status = pipeDevCreate ("/pipe/name", max_msgs, max_length);

Once a pipe is created, a task can read from or write to it by using the read() and

write() routines. The created pipe is a normal, named I/O device. Tasks can use the

standard I/O routines to open, read, and write pipes, and invoke ioctl routines. Similar

to message queues, if a task tries to write to a pipe that is full, the task is pended until

space becomes available in the pipe. And similarly if a task tries to read from a pipe

that is empty, it is pended until a message arrives in the pipe. Pipes work just like

message queues with additional features, which makes them usable in place of

message queues in most situations. As with message queues, ISRs can write to a pipe, but

cannot read from a pipe. Pipes are addressable with file descriptors, so they

provide one important feature that message queues do not: they are compatible

with the select() routine as well as the basic I/O routines. The select() routine allows a

task to wait for data to be available on any of a set of I/O devices. It also works

with other asynchronous I/O devices including network sockets and serial devices.

Thus, by using select(), a task can wait for data on a combination of several pipes,

sockets, and serial devices.
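Waiting on several descriptors with select() can be sketched as follows. The sketch is written against the POSIX select() interface so that it can be tried on a host system, where the descriptors would come from pipe() rather than from pipeDevCreate() and open() as they would on a VxWorks target. The helper name wait_for_any() and its millisecond timeout parameter are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait up to timeout_ms for data on any of the nfds descriptors in fds.
 * Returns the first readable descriptor, or -1 on timeout or error. */
int wait_for_any(const int *fds, int nfds, long timeout_ms)
{
    fd_set readable;
    struct timeval tv;
    int i, maxfd = -1;

    assert(fds != NULL && nfds > 0);

    /* Build the set of descriptors to watch and find the largest one. */
    FD_ZERO(&readable);
    for (i = 0; i < nfds; i++) {
        FD_SET(fds[i], &readable);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    tv.tv_sec  = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* select() returns the number of ready descriptors, 0 on timeout. */
    if (select(maxfd + 1, &readable, NULL, NULL, &tv) <= 0)
        return -1;

    for (i = 0; i < nfds; i++)
        if (FD_ISSET(fds[i], &readable))
            return fds[i];
    return -1;
}
```

A task would typically call such a helper in a loop and then read() from the descriptor it returns, so one task can service several pipes, sockets, and serial devices.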

Target platforms supported by VxWorks: PowerPC, MC680x0, MC683xx, Intel

i960, ColdFire, MCORE, 80X86, Pentium, ARM and StrongARM, MIPS,

SH, NEC V8xx, R3000, RAD6000, M32 R/D, ST 20, TriCore, SPARC, Fujitsu

SPARClite, and TRON Gmicro.

Host platforms supported by VxWorks: Sun3, Sun4, HP9000, IBM RS-6000, DEC,

SGI, and MIPS.

Applications of VxWorks:

Aerospace Industry: Flight Simulators, Satellite Tracking Systems etc.

Transportation Industry: ABS, Realtime Suspension, Traffic Control Systems etc.



Computer Industry: Networking Switches, Routers, X terminals, I/O control, RAID

Data Storage, Audio/Video Systems, PostScript Laser Printers etc.

Telecommunications Industry: PBXs, Modems, Cellular Systems etc.

Medical Industry: MRI & PET Scanners, Radiation Therapy Equipment etc.

Others: Navigation & Sonar Systems, Robotics, Digital Imaging Equipment etc.

4.8 Summary

An embedded system runs on an embedded platform. The basic elements of a

simple embedded operating system are the scheduler and scheduling points, context

switch routine, definition of a task, and a mechanism for intertask communication. A

real-time operating system (RTOS) is an embedded operating system that guarantees

a certain capability within a specified time constraint. A real time kernel is software

that manages the time of a microprocessor to ensure that all time critical events are

processed as efficiently as possible. VxWorks is one of the most popular commercial

real-time operating systems available in the market.

4.9 Self Test

1. ….. Kernels require that each task does something to explicitly give up the

control of CPU.

2. List of all the tasks ready to run is called ……….

3. Scheduling is the process of picking up the TCB at the head of ready list.

(True/ False)

4. Kernels provide a mechanism to keep track of time. (True/False)

5. …………… system runs in an infinite loop that constantly tests for an event.

a. Polled loop

b. Phase driven

c. Hybrid

d. Loop

6. All time critical processes need to be……..



7. Priority inversion occurs when a high priority task is blocked, waiting for a

mutex that is held by a low priority task. (True/False)

8. Which of the following is a task synchronization mechanism?

a. Mutexes

b. Message queues

c. Monitors

d. All of these

Answers

1. Non-preemptive

2. ready list

3. True

4. True

5. a

6. Foreground

7. True

8. d

4.11 Questions

1. Briefly explain the important elements of Embedded OS

2. What is a RTOS? How does it differ from a general Embedded OS?

3. Explain Foreground/Background Systems.

4. How are the events managed in real time kernels?

5. What are the types of kernel? Briefly explain

6. Explain the performance characteristics of an RTOS.



