Newsletter: Object Recognition and Tracking

February 2010 | Concurrent Vision ApS | +4530328964

Real Time Object Recognition and Tracking Concurrent Vision ApS

Contents

Intelligent active video surveillance

Biomedical Image Analysis

Visual Feedback control of robots

Diverse Object recognition applications

Automation systems, especially in the world of robotics, are

becoming faster, creating an increasing need to track objects at

higher speeds than ever before.

Systems which rely on computer vision analysis to make artificial

intelligence decisions and provide control range from high-speed

production lines and robot arms to autonomous guided vehicles,

missiles and planes. Such systems use computer vision algorithms

to extract information from images in a video sequence to identify

and track objects in a scene. Usually these algorithms require high

computational resources from a general purpose processor or a

DSP, causing high computational latencies. High latencies prevent

true real-time recognition and tracking of objects moving at high

velocities.

Concurrent Vision ApS develops real-time, high-speed, vision-based systems that identify and track objects in a continuous video stream. These systems use digital ASIC and FPGA technologies to implement high-speed parallel computations, providing true real-time recognition and tracking of objects moving at speeds above 200 km/h. Typical applications include active video surveillance, vision-based motion control of robotic arms, providing cognitive characteristics to robots, and tracking high-speed moving targets. Other applications include video stabilization, augmented reality, image stitching, real-time demosaicing for high-definition video cameras, 3D imaging, intelligent toys and physically interactive computer games.

Concurrent Vision also provides solutions for accelerating high-speed content-based image retrieval systems that search for digital images in large databases. One example is the retrieval and matching of medical images for computer-aided diagnosis.
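To make the latency argument concrete, the minimal Python sketch below computes how far an object moves while one frame is still being processed. The function name and the latency values mentioned afterwards are illustrative assumptions, not figures measured on any Concurrent Vision system.

```python
def distance_during_latency(speed_kmh: float, latency_ms: float) -> float:
    """Return the distance in metres an object covers during the given latency."""
    speed_m_per_s = speed_kmh * 1000.0 / 3600.0  # convert km/h to m/s
    return speed_m_per_s * latency_ms / 1000.0   # convert ms to s, then multiply
```

At 200 km/h, an assumed processing latency of 40 ms corresponds to roughly 2.2 m of travel, while an assumed latency of 1 ms corresponds to under 6 cm. This is why low latency matters more as object speed increases.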

Issue 1, February 2010


Intelligent Active Video Surveillance

STATE-OF-THE-ART COMPONENT TECHNOLOGIES IN VIDEO ANALYSIS FOR SURVEILLANCE

Automated video surveillance in commercial, law

enforcement, and military applications is concerned with

real-time observation of people and vehicles in crowded

environments, a type of observation that aims to describe

actions and interactions and, where possible, predict behavior.

Active surveillance as a real-time medium creates effective

deterrence systems protecting people and businesses from

crime and criminal activity. In continuous automated

monitoring of surveillance video, security alerts are issued

in response to a burglary in progress or to suspicious

individuals, movements or objects in a scene. Automated video

surveillance technology has also been proposed in

applications to measure traffic flow, detect accidents on

highways and log routine maintenance tasks at nuclear

facilities. Military applications include patrolling national

borders, measuring the flow of refugees in troubled areas,

monitoring peace treaties, and providing secure perimeters

around bases and embassies. Such video surveillance

presents a number of technical issues including moving

object detection and tracking, object classification, human

motion analysis, and activity understanding.

Concurrent Vision’s solutions solve, or help solve, the technical issues of automated video surveillance by providing high-speed techniques for the following:

Detection and tracking, which involves real-time extraction of moving objects from video and continuous tracking over time to form persistent object trajectories.

Human motion analysis, which is concerned with detecting periodic motion signifying a human gait and acquiring descriptions of human body pose over time.

Activity analysis, which deals with parsing temporal sequences of object observations to produce high-level descriptions of agent actions and multi-agent interactions.
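As a rough illustration of the detection and tracking step, the following Python sketch extracts moving pixels by frame differencing and chains their centroids into a trajectory. It is a deliberately simple software stand-in under stated assumptions (a single moving object, a static camera, frames given as nested lists of gray levels, and a hypothetical threshold of 30); it is not a description of Concurrent Vision's hardware implementation.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # grayscale image as rows of pixel intensities


def moving_pixels(prev: Frame, curr: Frame, threshold: int = 30) -> List[Tuple[int, int]]:
    """Return (row, col) of pixels whose intensity changed by more than threshold."""
    return [
        (r, c)
        for r, (row_prev, row_curr) in enumerate(zip(prev, curr))
        for c, (p, q) in enumerate(zip(row_prev, row_curr))
        if abs(q - p) > threshold
    ]


def centroid(pixels: List[Tuple[int, int]]) -> Optional[Tuple[float, float]]:
    """Mean (row, col) of the given pixels, or None if there are none."""
    if not pixels:
        return None
    n = len(pixels)
    return (sum(r for r, _ in pixels) / n, sum(c for _, c in pixels) / n)


def track(frames: List[Frame], threshold: int = 30) -> List[Tuple[float, float]]:
    """Build a trajectory with one centroid per consecutive frame pair.

    Frame differencing marks both the old and the new position of the
    object, so each centroid lies between the two positions.
    """
    trajectory = []
    for prev, curr in zip(frames, frames[1:]):
        point = centroid(moving_pixels(prev, curr, threshold))
        if point is not None:
            trajectory.append(point)
    return trajectory
```

A practical system would add background modelling, noise filtering and data association for multiple objects. The sketch only shows how per-frame detections become a persistent trajectory.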


Concurrent Vision, in key areas of video-based detection and tracking, video-based person identification, and large-scale surveillance systems.

Biomedical Image Analysis

Computer Aided Diagnosis Reduces Human Errors

Some studies show that 20 to 40 percent of the statements made in radiological reports by radiologists or radiology residents are erroneous. Errors can be classified as observational or interpretational. Observational errors can be linked to incomplete or faulty search patterns. Observation can be enhanced, for instance, by taking advantage of the computer's ability to distinguish shades of gray beyond the range of human vision and to use sophisticated search patterns. Computers can store and analyze all 1000 shades of gray in the photon beam exiting the patient during radiologic scans, shades that represent differences in bone and tissue density, whereas the human eye can distinguish only 32 or fewer shades of gray. Errors of interpretation can be linked to the practitioner’s failure to connect abnormal radiologic signs to relevant clinical data. Using object recognition, image retrieval and matching algorithms, computers can access and process huge amounts of stored clinical data and help produce accurate interpretations.

Matching Medical Images for Computer Aided Diagnosis

Medical databases contain huge amounts of information relevant to illnesses and their cures. These databases contain radiologic medical images that show small details of organs in the body. Benefiting from these images is difficult, however, since the data sets to be analyzed by radiologists are growing substantially. Automatic image retrieval and matching systems, based on scale- and rotation-invariant object recognition techniques, can be used to collect or classify the statistical information obtained from the databases and to perform computer-aided diagnosis of diseases and abnormalities. Computer-aided diagnosis improves the accuracy of statements made in radiological reports and reduces both observational and interpretational human errors in these statements. Automatic retrieval and matching of medical images takes advantage of the merging of medical imaging with multimedia technology in networked multimedia systems for image-assisted medical care.

Object recognition techniques depend greatly on detecting and extracting features in 2-D scalar images. Feature points are used to establish correspondence between pairs of images, which is important for landmark-based image registration and for building statistical models of shape and appearance. Extracting and matching features in images can, for instance, be used in content-based image retrieval from a database of fracture images for the purpose of planning surgical interventions after fractures. Image retrieval and matching can supply cases similar to a given example, to help treatment planning and to find the most appropriate technique for a surgical intervention. The figure below shows example images from a fracture database used for retrieval.
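The retrieval idea described above can be sketched in a few lines of Python. The example assumes that each image has already been reduced to a list of numeric feature descriptors, and it uses a nearest-neighbour ratio test, a widely used matching heuristic that is swapped in here for illustration and is not necessarily the matching method used by Concurrent Vision. The function names, the ratio of 0.8 and the dictionary layout of the database are all hypothetical.

```python
import math
from typing import Dict, List, Sequence

Descriptor = Sequence[float]


def euclidean(a: Descriptor, b: Descriptor) -> float:
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def count_matches(query: List[Descriptor], candidate: List[Descriptor],
                  ratio: float = 0.8) -> int:
    """Count query descriptors whose nearest neighbour passes the ratio test.

    A match is accepted only if the closest candidate descriptor is clearly
    closer than the second closest, which rejects ambiguous matches.
    """
    matches = 0
    for q in query:
        distances = sorted(euclidean(q, c) for c in candidate)
        if len(distances) >= 2 and distances[0] < ratio * distances[1]:
            matches += 1
    return matches


def rank_database(query: List[Descriptor],
                  database: Dict[str, List[Descriptor]]) -> List[str]:
    """Return image names ordered from most to fewest matched features."""
    return sorted(database,
                  key=lambda name: count_matches(query, database[name]),
                  reverse=True)
```

Given the descriptors of a new fracture image as the query, `rank_database` returns the stored cases ordered by similarity, so the most similar cases can be shown first. The exhaustive search used here scales poorly with database size, which is the kind of workload that benefits from parallel hardware acceleration.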

Concepts used in the computer vision technique for extracting and matching features

in 2D scalar images can be extended to scalar images of arbitrary dimensionality.

Retrieval and matching of 3D human Magnetic Resonance Imaging (MRI) brain scans

and 4D computed tomography (CT) cardiac scans are examples.


Continuous real-time monitoring of vasospasm using TCD

A cerebral aneurysm is a localized dilation or ballooning of a cerebral artery caused by weakening of the blood vessel wall. As the aneurysm grows, the chance of rupture increases. Rupture of a cerebral aneurysm leads to subarachnoid hemorrhage (SAH), a serious condition with a mortality rate of 30-60%. The primary treatments for this condition are open surgical aneurysm clipping and endovascular coiling. Regardless of the treatment, patients suffering from SAH may undergo vasospasm, a condition in which blood vessels spasm, leading to decreased oxygen delivery. It is most likely to occur within 3-7 days after treatment. As a result, continuous monitoring of the blood vessels within the first 3-14 days after SAH is desired to assess the presence of vasospasm. At present, there are various accepted clinical methods to diagnose vasospasm. One non-invasive technique is Transcranial Doppler Ultrasound (TCD), which is cost-effective, easy to use and potentially available 24-7. TCD transmits ultrasound to measure the blood flow velocity in the blood vessels, which acts as an indicator of vasospasm. However, TCD requires the presence of a skilled ultrasonographer and suffers from operator dependence. Computerized monitoring improves current TCD technology and minimizes the need for a dedicated ultrasonographer.
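For readers unfamiliar with how Doppler ultrasound yields a velocity, the short Python sketch below applies the standard Doppler equation, v = c · Δf / (2 · f0 · cos θ). The speed of sound of 1540 m/s is the value commonly assumed for soft tissue; the function name and the example numbers mentioned afterwards are illustrative assumptions and are not taken from any particular TCD device.

```python
import math

SPEED_OF_SOUND_TISSUE_M_S = 1540.0  # commonly assumed average for soft tissue


def doppler_velocity(freq_shift_hz: float, transmit_freq_hz: float,
                     angle_deg: float = 0.0,
                     c: float = SPEED_OF_SOUND_TISSUE_M_S) -> float:
    """Blood flow velocity in m/s from v = c * df / (2 * f0 * cos(theta)).

    angle_deg is the angle between the ultrasound beam and the flow direction.
    """
    cos_theta = math.cos(math.radians(angle_deg))
    if abs(cos_theta) < 1e-6:
        raise ValueError("beam is perpendicular to flow; velocity cannot be estimated")
    return c * freq_shift_hz / (2.0 * transmit_freq_hz * cos_theta)
```

With an assumed 2 MHz transmit frequency and a measured shift of 2 kHz along the beam axis, the estimate is 0.77 m/s. A monitoring system would evaluate such estimates continuously and flag sustained increases for clinical review.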

Object Recognition in Multimodal Biomedical Imaging

Extracting and matching features for object recognition can be applied to all types of medical images acquired by any existing image acquisition modality. It can, for example, be used to automatically detect and diagnose knee meniscus tears from MR images. It can also be used to perform real-time analysis of Transcranial Doppler Ultrasound (TCD) image streams for such purposes as the study of cerebrovascular ischemia (stroke), the monitoring of blood flow velocity during intensive care, general anesthesia and carotid endarterectomy (CEA), the detection of vasospasms after subarachnoid hemorrhage (SAH) and the assessment of arteriovenous malformations (AVM). For instance, many morphological and dynamic properties of the common carotid artery (CCA), e.g. lumen diameter, distension and wall thickness, can be measured non-invasively with ultrasound (US) techniques. This, however, requires as a preliminary step the manual recognition of the artery of interest within the ultrasound image. In real-time US imaging, such a manual initialization procedure interferes with the sonographer's difficult task of selecting and maintaining a proper image scan plane. Even for off-line US segmentation, the requirement for human supervision and interaction precludes full automation, which would eliminate user interference and speed up processing for both real-time and off-line applications.

Automatic object recognition and tracking can also be extremely useful in conjunction with Optical Coherence Tomography (OCT). OCT is an imaging technique that allows non-invasive, high-resolution, cross-sectional imaging of both transparent and non-transparent structures. The greatest advantage of OCT is its resolution. Standard-resolution OCT can achieve an axial resolution of 10-15 µm. High-resolution OCT improves the resolution to the sub-cellular level of 1-2 µm. The figure below shows a true subcellular image obtained using OCT with a resolution of 4 µm.
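The resolution figures quoted above can be related to the light source through the standard expression for a Gaussian-spectrum source, Δz = (2 ln 2 / π) · λ0² / Δλ, where λ0 is the center wavelength and Δλ the spectral bandwidth. The Python sketch below evaluates it. The function name and the two source configurations mentioned afterwards are assumptions chosen for illustration, not specifications of any particular OCT system.

```python
import math


def oct_axial_resolution_um(center_wavelength_nm: float, bandwidth_nm: float) -> float:
    """Axial resolution in micrometres (in air) for a Gaussian-spectrum source.

    Implements dz = (2 * ln 2 / pi) * lambda0**2 / delta_lambda.
    """
    dz_nm = (2.0 * math.log(2.0) / math.pi) * center_wavelength_nm ** 2 / bandwidth_nm
    return dz_nm / 1000.0  # convert nanometres to micrometres
```

An assumed source at 1300 nm with 50 nm bandwidth gives about 15 µm, in line with the standard-resolution range, while an assumed broadband source at 800 nm with 150 nm bandwidth gives about 1.9 µm, in line with the high-resolution range.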

OCT has demonstrated feasibility for high-resolution imaging of the vascular system

and other vulnerable tissue. This includes the central nervous system and the cartilage

of joints. OCT can be used in a variety of applications, such as:

Diagnosing and monitoring of retinal diseases

Imaging atherosclerotic plaque

Tumor detection in gastrointestinal, urinary, and respiratory tracts

Detection of skin cancer

Early detection of osteoarthritic changes in cartilage


Real-time in vivo Brain Tumor Microvasculature Monitoring Using Combined Laser Scanning Confocal Fluorescence Microscopy and Optical Coherence Tomography in Preclinical Window-chamber Models

Glioblastoma multiforme (GBM) is a common primary brain tumor with aggressive, lethal, and malignant characteristics. Its highly proliferative and invasive nature leads to T-cell immunosuppression and drug inefficiency and hinders surgical resection. There is a need to investigate GBM in vivo, at the macro and micro levels, in preclinical animal models that are also potentially translatable to the clinic. Combined intravital microscopy using confocal fluorescence (CF) and optical coherence tomography (OCT) can be used for this purpose. Potential applications include surgical guidance and monitoring of tumor response to photodynamic therapy (PDT). This imaging technique enables real-time microvasculature imaging of brain tumors and normal brain tissue in order to track the tumor growth pattern in vivo and to monitor and quantify the tissue responses to PDT treatment. A computerized system based on tumor boundary recognition would provide real-time monitoring and quantification, potentially available 24-7.

OCT is typically used for assessing arterial wall pathology in vivo. It may provide more

detailed structural information than other techniques. With high resolution OCT, and

automatic object recognition, atherosclerotic plaques can be diagnosed in real-time

with high accuracy, including measurement of the thickness of thin fibrous caps of less

than 65 µm. This represents a step towards in vivo assessment of the risk of rupture.

Insight into the physiology of a plaque is complementary to the structural information

offered by the OCT grayscale image. While the OCT image presents morphological

information in highly resolved detail, it relies on interpretation of the images by trained

readers for the identification of vessel wall components and tissue type. Computerized

image retrieval and matching as well as object recognition can help the interpretation

and identification process. It can be used to characterize different atherosclerotic

plaque components by their distinctive signal patterns, as shown in the next figure. The

figure shows histopathologic (hematoxylin and eosin staining; magnification ×40) and

OCT images of a predominantly lipid-rich plaque in quadrants I—III.

The OCT image (right) shows the lipid-rich plaque (lip) with a low signal appearance and poorly delineated borders compared with the signal-rich appearance of the fibrous plaque material (fib). (Courtesy of Meissner OA, Rieber J, Babaryka G, et al: Intravascular optical coherence tomography: comparison with histopathology in atherosclerotic peripheral artery specimens. J Vasc Interv Radiol 17:343–349, 2006. © SCVIR.)

OCT is also proving valuable in differentiating between cancerous and normal tissues, as it is sensitive to the disruption of normal tissue architecture. The picture to the left shows an image of a sarcoma, or muscle tumor, obtained using OCT. In the picture, the tissue on the left looks healthy and normal; on the right, the structure appears cancerous and irregular. Images like this one, obtained using OCT in real time, can help detect tumors early during image-guided procedures. Owing to its inherent compatibility with compact fiber-based probes and the ability to construct portable systems, OCT appears promising for the early detection of several types of cancer in clinics. OCT aided by computerized recognition and classification of tissue structure has capabilities for in vivo detection of bladder, colon, oral and skin cancers. In detecting breast cancer, for instance, compact optical fiber probes permit access within the ductal structure of the breast, or to a suspicious lesion via the tip of a biopsy needle, making it possible to perform localized optical imaging of tissues at the needle tip. This, along with real-time feedback, has the potential to enhance the guidance accuracy of the biopsy and reduce the “miss rate” compared to large-core needle biopsy obtained under ultrasound guidance.

Higher resolution and acquisition rates of OCT images improve real-time imaging

capability. Given the data acquisition rates possible with the state-of-the-art OCT

systems, rigorous human interpretation of every image is not possible in real-time.

Thus high speed computer algorithms are essential for real-time feedback during

biopsy or surgical guidance procedures to enable synchronization of motor rotation

with the high-speed OCT frame acquisition in mechanically actuated probes, and to

enable real-time tissue classification and suspicious object recognition.

Visual Feedback control of robots

Both industrial robot arms and mobile robots require sensing capability to adapt to new tasks without explicit intervention or reprogramming. Visual sensing overcomes many of the difficulties of uncertain models and unknown environments, which limit the domain of application of robots used without external sensory feedback. The image-based structure is an approach to visual servo control that uses image features (e.g., image areas and centroids) as feedback control signals, thus eliminating a complex interpretation step (i.e., interpretation of image features to derive world-space coordinates).

In this approach, a 2-D image from a video sequence is processed in order to

extract image features. The extracted features are matched to precompiled object

model features in order to identify, pick or track desired objects or provide an

absolute measure of the robot state (localization). In particular, the information

gathered by a vision system about the environment makes it possible to detect

natural landmarks, navigate among unknown obstacles, and achieve a reactive

robot behavior. Vision-based sensing however has some drawbacks, such as the

need to recognize and extract a huge number of characteristic features from the

image, an increased computational burden, and a critical dependence on lighting

conditions of the environment.
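The image-based approach can be illustrated with a minimal Python sketch in which the centroid of a segmented object is used directly as the feedback signal. This is a simplified proportional controller under stated assumptions: the object is already segmented into a binary mask, the command is expressed as a desired image-plane displacement in pixels per control step, and the mapping to joint or camera velocities through the image Jacobian is omitted. The function names and the gain of 0.5 are hypothetical.

```python
from typing import List, Tuple


def blob_features(mask: List[List[int]]) -> Tuple[float, float, int]:
    """Return (centroid_x, centroid_y, area) of the non-zero pixels in a binary mask."""
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                xs.append(x)
                ys.append(y)
    if not xs:
        raise ValueError("no object pixels in mask")
    area = len(xs)
    return (sum(xs) / area, sum(ys) / area, area)


def servo_command(mask: List[List[int]], target_xy: Tuple[float, float],
                  gain: float = 0.5) -> Tuple[float, float]:
    """Proportional image-space command that drives the centroid toward target_xy.

    The error is measured in pixels, so no world-space coordinates are needed.
    """
    cx, cy, _ = blob_features(mask)
    return (-gain * (cx - target_xy[0]), -gain * (cy - target_xy[1]))
```

Because the control loop is closed on image features, the per-frame cost is dominated by feature extraction, which is why the speed of that step determines how quickly the robot can react.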

Concurrent Vision ApS delivers solutions in digital logic that compensate for

these drawbacks using a proven algorithm for feature extraction which is invariant

to scale, illumination, occlusion and viewing angle. Moreover, the solution from

Concurrent Vision ApS implements a novel method of feature matching that

accelerates the process to real-time speeds.

Increased speed and enhanced safety of vision-based robot control

An example of a vision-based control task is for a robot arm to acquire an unoriented object from a pallet without prior knowledge of the object's position. A CCD camera attached to the robot arm provides visual sensing capability. Images acquired by the camera are processed by a computer vision system in order to identify the object and infer the relationship between the spatial position of the object and the camera position. This relative position information is used to guide the robot to acquire the object. The same problem arises in the navigation of a mobile robot with respect to objects in an unstructured environment using visual feedback. Using computer vision to infer the position and orientation of objects, or to interpret general three-dimensional relationships in a scene, is a complex task requiring extensive computing resources, which may make robots slow to react to unexpected events or constrain the speed of the robotic system. With the high-speed parallel processing approach, these speed limits can be removed and system safety can be enhanced. This is especially desirable in robotic systems designed to remove defective objects from high-speed automated production lines.


Diverse Object Recognition Applications

Visio-haptic wearable systems for the blind

Object recognition and tracking algorithms for visio-haptic information analysis, i.e., the conversion of visual data into haptic (tangible) features, can be utilized in wearable assistive devices for blind individuals. Touch is an important modality for individuals who are blind, but it is limited to the extent of one's reach. Estimating how an object feels from its visual image makes it possible to overcome this limitation.

Common haptic devices and systems allow blind people to recognize three-dimensional (3D) objects that exist in a virtual environment. Such systems allow blind people to touch, grasp and manipulate objects that exist in the haptic-enabled virtual environment. In this regard, object recognition reduces the overall time needed to understand the shape of objects and provides better immersion in the virtual environment.

Tracking Objects in Augmented Reality

Augmented reality (AR) is a term for a live direct or indirect view of a physical

real-world environment whose elements are merged with virtual computer-

generated imagery - creating a mixed reality. The augmentation is conventionally

interactive in real-time and in semantic

context with environmental elements,

such as sports scores on TV during a

match. With the help of advanced AR

technology the information about the

surrounding real world of the user

becomes interactive and digitally usable.

Artificial information about the

environment and the objects in it can be

stored and retrieved as an information

layer on top of the real world view.

The ability to track visible objects in real-time provides an invaluable tool for the

implementation of Augmented Reality. Once an object has been detected, its

location in future frames can be used to position virtual content, and thus annotate

the environment. Object recognition and tracking solutions provided by Concurrent

Vision ApS can effectively be used in real-time AR systems.
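As a small illustration of how a tracked location can position virtual content, the Python sketch below predicts where a bounding box will be in the next frame and computes the point at which a label would be drawn. It assumes boxes given as (x, y, width, height) with the origin at the top-left of the image, a constant-velocity motion model, and a hypothetical margin of 5 pixels; real AR systems typically use more elaborate filters and full pose estimation.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height), origin at top-left


def predict_box(prev: Box, curr: Box) -> Box:
    """Constant-velocity guess of the box position in the next frame."""
    dx, dy = curr[0] - prev[0], curr[1] - prev[1]
    return (curr[0] + dx, curr[1] + dy, curr[2], curr[3])


def label_anchor(box: Box, margin: float = 5.0) -> Tuple[float, float]:
    """Point centred horizontally on the box and `margin` pixels above its top edge."""
    x, y, w, _ = box
    return (x + w / 2.0, y - margin)
```

Drawing the annotation at `label_anchor(predict_box(prev, curr))` keeps it attached to the object even when rendering lags the tracker by one frame.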


Steen Koldsø [email protected] Mobile: + 4550168167

Moatasem Chehaiber [email protected] Mobile: +4530328964

Object Recognition Aid to the Blind

Obstacle Avoidance:

Object recognition aids provide advance warning of obstacles and allow the blind to find a safe, clear path. A popular system is the SonicGuide, also known as the Binaural Sensory Aid. This device uses ultrasound to scan the space in front of the user and creates a stereo audio signal that varies in pitch to indicate the distance of obstacles. The system fits conveniently in the frame of a pair of glasses. Although the rich information provided by the SonicGuide can be extremely useful, learning how to decode this signal requires significant effort. There is also a concern that the audio signal could mask important environmental sounds. While audition is already tapped for echolocation, the sense of touch is largely unused while traveling. It is thus possible to stimulate the skin without interfering with the normal activities and environmental cues used by the blind. Moreover, it may be easier to represent spatial information on the skin than through audition. The best-known aids of this kind can be classified as vision substitution systems, since they provide sufficiently rich information to be used. These systems use a CCD camera to scan the space and can be integrated in an earpiece on a hearing aid device as well as in a haptic device. Such systems must be small, lightweight and ultra-low-power, which can be achieved by implementing the algorithms in hardware using ASIC technology.
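The idea of encoding obstacle distance as pitch can be sketched as a simple mapping. The Python example below uses a linear map in which farther obstacles produce a higher pitch. The direction of the mapping, the 5 m range and the 200-4000 Hz pitch limits are all assumptions made for illustration and do not reproduce the SonicGuide's actual transfer function.

```python
def distance_to_pitch_hz(distance_m: float, max_range_m: float = 5.0,
                         min_pitch_hz: float = 200.0,
                         max_pitch_hz: float = 4000.0) -> float:
    """Linearly map obstacle distance to an audio pitch (farther gives higher pitch).

    Distances outside [0, max_range_m] are clamped to the nearest limit.
    """
    clamped = max(0.0, min(distance_m, max_range_m))
    return min_pitch_hz + (max_pitch_hz - min_pitch_hz) * clamped / max_range_m
```

The same mapping could drive the intensity or position of a haptic actuator instead of an audio tone, which is the substitution the text argues for.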



Company Details

Moatasem Chehaiber: Senior Electronics Engineer and CTO

Steen Koldsø: Senior Management Consultant and CEO

MD. Mohammad Chehaiber: Consultant in Endocrinology

MD. Tahseen Chouheiber: Consultant in Orthopedics

MD. Mohammad Elhashimy: Consultant in General Medicine

Join the Biomedical Engineering group on LinkedIn, a forum connecting biomedical engineers and biomedical imaging experts, where people share biomedical engineering ideas based on biomedical image analysis. http://www.linkedin.com/groups?gid=2788350&trk=myg_ugrp_ovr


Hvidovrevej 44 2610 Rødovre +4530328964 [email protected] Moatasem Chehaiber [email protected] Steen Koldsø [email protected]
