Computer Vision: Intro
What are its goals?
What are the applications?
What are some ways of using images?
(Later: methods and programming)
Finding ships in an aerial photograph
A corresponding map
Dock area
Registered map and image
Finding a kidney in computer-aided tomographic scan
Prototype kidney model and model fitting
Resulting kidney and spinal cord instances
Goal of computer vision
Make useful decisions about real physical objects and scenes based on sensed images.
Alternative (Aloimonos and Rosenfeld): the goal is the construction of scene descriptions from images. (Read reference S1)
How do you find the door to leave?
How do you determine if a person is friendly or hostile? ... an elder? ... a possible mate?
Critical Issues
Sensing: how do sensors obtain images of the world?
Information: how do we obtain color, texture, shape, motion, etc.?
Representations: what representations should/does a computer [or brain] use?
Algorithms: what algorithms process image information and construct scene descriptions?
Images: 2D projections of 3D
The 3D world has color, texture, surfaces, volumes, light sources, objects, motion, betweenness, adjacency, connections, etc.
A 2D image is a projection of a scene from a specific viewpoint; many 3D features are captured, some are not.
Brightness or color = g(x, y) or f(row, column) at a given instant of time
Images indicate familiar people, moving objects or animals, health of people or machines
Image receives reflections
Light reaches surfaces in 3D
Surfaces reflect
Sensor element receives light energy
Intensity counts
Angles count
Material counts
CCD Camera has discrete elements
Lens collects light rays
CCD elements replace chemicals of film
Number of elements is less than with film (so far)
Resolution is “pixels per unit of length”
Resolution decreases by one half in cases at left
Human faces can be recognized at 64 x 64 pixels per face
Features detected depend on the resolution
Can tell hearts from diamonds
Can tell face value
Generally need 2 pixels across a line or small region (such as an eye)
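The slides do not say how the reduced-resolution pictures were produced. A minimal sketch, assuming each 2 x 2 block of gray values is averaged into one pixel (the function name `halve_resolution` and the list-of-rows image layout are illustrative choices, not from the course):

```python
def halve_resolution(image):
    """Halve the resolution of a gray image given as a list of rows.

    Assumption: each output pixel is the integer average of one
    2 x 2 block of input pixels; an odd last row or column is dropped.
    """
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) // 4
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# A 4 x 4 image becomes 2 x 2; applying the function again gives 1 x 1.
small = halve_resolution([[0, 4, 8, 8],
                          [4, 8, 8, 8],
                          [1, 1, 2, 2],
                          [1, 1, 2, 2]])
```

Each halving merges four pixels into one, so a line that was 2 pixels wide becomes 1 pixel wide, which is one way to see why features disappear as resolution drops.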
Camera + Programs = Display
Camera inputs to frame buffer
Program can interpret data
Program can add graphics
Program can add imagery
Computer Vision-Rosenfeld
Vision
The most powerful sense for many living organisms
Fields related to visual perception: physiology, psychology, computational and robot vision, and engineering
A large part of the human brain is devoted to visual perception
Computational algorithms
Do you see it as it is?
Totally straight line
How many black dots?
The Goal of Image Understanding
We use vision to interact with our environments and survive
Perform visual tasks: engage in many kinds of behaviors that are guided by visual inputs
David Marr: construction of a detailed representation of the physical world
Transform 2D data into a description of the 3D spatiotemporal world
Scene Recovery
What properties of a scene can be recovered by means of vision?
Inverse optics problem
Optics: maps the world into the image
Vision: attempts to invert the optical map
Computer vision and other fields
http://en.wikipedia.org/wiki/Computer_vision
A Range of representations
Generalized images: iconic (image-like); low level processing
Segmented images: edge segmentation
Geometric representation: 3D shape; prior knowledge
Relational models: semantic nets; high level processing
Look at some CV applications
Graphics or image retrieval systems; Geographical: GIS;
Medical image analysis; manufacturing
Image Database Search
Company wants a new logo
Make several designs
Search logo database for infringement
Aerial images & GIS
Aerial image of Wenatchee River watershed
Can correspond to map; can inventory snow coverage
Medical imaging is critical
Visible Human Project at NLM
Atlas for comparison
Testbed for methods
Medical Imaging
CT image of a patient’s abdomen
3D reconstruction of Blood Vessel Tree
Cardiac tagged MRIs
Manufacturing case
100% inspection needed
Quality demanded by major buyer
Assembly line updated for visual inspection well before today’s powerful computers
Simple Hole Counting Algorithm
Customer needs 100% inspection
About 100 holes
Big problem if any hole is missing
Implementation in the 70’s
Algorithm is also good for counting objects
Imaging added to line
Camera placed above conveyor line
Back lighting added
One dimension of the image comes from the motion of the object past the camera
Critical “corner patterns” (in a 2 x 2 pixel neighborhood)
“external corner” has three 1s and one 0
“internal corner” has three 0s and one 1
Holes computed from only these patterns!
Hole (Object) Counting Algorithm
#holes = (#e - #i)/4, where #e and #i are the numbers of external and internal corner patterns found in the image
Variations on Algorithm
Easy if entire image is in memory
Only need to have 2 rows in memory at any time
* used in the 1970’s
* can allow special hardware
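The corner-pattern count and the two-rows-in-memory variation can be sketched together. This is an illustrative sketch, not the original 1970s implementation: it assumes a binary image in which material is 1 and holes are 0, that no hole touches the image border, and that holes are simple regions without diagonal-only contacts. The name `count_holes` is hypothetical.

```python
def count_holes(rows):
    """Count holes in a binary image supplied one row at a time.

    Every 2 x 2 neighborhood is classified by how many 1s it holds:
    three 1s is an external corner, one 1 is an internal corner.
    Only the previous and the current row are kept in memory.
    """
    external = internal = 0
    previous = None
    for current in rows:
        if previous is not None:
            for c in range(len(current) - 1):
                ones = (previous[c] + previous[c + 1]
                        + current[c] + current[c + 1])
                if ones == 3:
                    external += 1
                elif ones == 1:
                    internal += 1
        previous = current
    return (external - internal) // 4


# A small plate with an L-shaped hole and a 2 x 2 square hole.
plate = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
print(count_holes(plate))  # prints 2
```

Because the function accepts any iterable of rows, it can also be fed by a generator that yields one scan line at a time, which mirrors a camera building the image as the part moves along the conveyor.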
Some other methods
Finding contrast in an image; using neighborhoods of pixels;
detecting motion across 2 images
Differentiate to find object edges
For each pixel, compute its contrast
Can use max difference of its 8 neighbors
Detects intensity change across boundary of adjacent regions
4 and 8 neighbors of a pixel
4 neighbors are at multiples of 90 degrees
.  N  .
W  *  E
.  S  .
8 neighbors are at every multiple of 45 degrees
NW N  NE
W  *  E
SW S  SE
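The contrast measure described above can be sketched as follows. This is a minimal illustration, assuming a gray image stored as a list of rows; border pixels simply use whichever of their 8 neighbors exist, which is an assumption since the slides do not say how borders are handled.

```python
# Offsets of the 8 neighbors (every multiple of 45 degrees).
NEIGHBORS_8 = [(-1, -1), (-1, 0), (-1, 1),
               (0, -1),           (0, 1),
               (1, -1),  (1, 0),  (1, 1)]


def contrast(image):
    """Return an image of the max absolute difference to the 8 neighbors."""
    rows, cols = len(image), len(image[0])
    result = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in NEIGHBORS_8:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    diff = abs(image[r][c] - image[nr][nc])
                    result[r][c] = max(result[r][c], diff)
    return result


# Dark region on the left, bright region on the right.
step = [[10, 10, 200, 200]] * 3
edges = contrast(step)
```

In the example, contrast is 0 inside each uniform region and 190 in the two columns on either side of the boundary, so thresholding the result marks the edge.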
Detect Motion via Subtraction
Constant background
Moving object
Produces pixel differences at boundary
Reveals moving object and its shape
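A minimal sketch of motion detection by subtraction, assuming two gray frames of equal size stored as lists of rows. The function name `motion_mask` and the threshold value are illustrative assumptions; the slides do not give a threshold.

```python
def motion_mask(frame_a, frame_b, threshold=20):
    """Mark pixels whose intensity changed by more than the threshold."""
    return [
        [1 if abs(a - b) > threshold else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


# A bright object (200) moves one pixel to the right over a constant
# background (50).
first = [[50, 200, 200, 50, 50]]
second = [[50, 50, 200, 200, 50]]
mask = motion_mask(first, second)
```

Here the mask is 1 only at the trailing and leading edges of the object; the interior pixel that stays bright in both frames shows no difference, which is why subtraction of a uniformly colored object mainly reveals its boundary.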
Some image format issues
Spatial resolution; intensity resolution; image file format
Many different image file forms
Portable gray map (PGM) older form
GIF was early commercial version
JPEG (JPG) is modern version
Many others exist
Do they handle color?
Do they provide for compression?
Need to have size & parameters & pixels
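PGM image with ASCII info.
P2 means ASCII gray
Comments
W=16; H=8
192 is max intensity
Can be made with editor
P1: bilevel (black and white) image in ASCII; P3: RGB color in ASCII; P4, P5, and P6: the raw (binary-encoded) counterparts of P1, P2, and P3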
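The ASCII PGM layout (magic number, optional comments, width and height, maximum intensity, then pixel values) can be sketched with a small writer and reader. The function names are hypothetical, and the reader is a simplified sketch that handles only P2 files and treats everything after a `#` on a line as a comment.

```python
def write_pgm_ascii(image, max_value, comment=None):
    """Return the text of a P2 (ASCII gray) PGM file for a list of rows."""
    lines = ["P2"]
    if comment:
        lines.append("# " + comment)
    lines.append(f"{len(image[0])} {len(image)}")  # width, then height
    lines.append(str(max_value))
    for row in image:
        lines.append(" ".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


def read_pgm_ascii(text):
    """Parse P2 text into (image, max_value); comments are skipped."""
    tokens = []
    for line in text.splitlines():
        tokens.extend(line.split("#", 1)[0].split())
    if not tokens or tokens[0] != "P2":
        raise ValueError("not an ASCII PGM (P2) file")
    width, height, max_value = (int(t) for t in tokens[1:4])
    pixels = [int(t) for t in tokens[4:4 + width * height]]
    image = [pixels[r * width:(r + 1) * width] for r in range(height)]
    return image, max_value


text = write_pgm_ascii([[0, 64, 128, 192],
                        [192, 128, 64, 0]], 192, comment="tiny ramp")
image, max_value = read_pgm_ascii(text)
```

Because the file is plain text, the string produced by `write_pgm_ascii` is also what one would type by hand in an editor.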
JPG current popular form
Public, not private, standard
Allows for image compression; ratios of 10:1 or 30:1 are often easily possible
8x8 intensity regions are fit with a basis of cosines
Error in cosine fit coded as well
Parameters then compressed with Huffman coding
VERY TECHNICAL!
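The "basis of cosines" step is the discrete cosine transform (DCT) applied to each 8x8 block. The sketch below shows only that transform, written directly from the 2D DCT-II definition for clarity rather than speed; the later JPEG stages (quantization and Huffman coding) are not shown, and the function name `dct_8x8` is an illustrative choice.

```python
import math


def dct_8x8(block):
    """Return the 8 x 8 DCT-II coefficients of an 8 x 8 block of intensities."""
    n = 8

    def scale(k):
        # Normalization factor that makes the cosine basis orthonormal.
        return math.sqrt(0.5) if k == 0 else 1.0

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[u][v] = 0.25 * scale(u) * scale(v) * total
    return coeffs


# A perfectly flat block puts all of its energy in the single
# constant (DC) coefficient at position [0][0].
flat = [[100] * 8 for _ in range(8)]
flat_coeffs = dct_8x8(flat)
```

Smooth image regions behave much like the flat block: most coefficients are near zero, which is what makes the later coding stages compress well.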
First day course business
Syllabus on web (read for next time)
Course web pages (www.research.rutgers.edu/~chansu/CS580Web/)
Textbook by Shapiro and Stockman
Read Chapters 1 and 2
Read Chapter 1 of the Ballard Computer Vision book (available online: http://homepages.inf.ed.ac.uk/rbf/BOOKS/BANDB/bandb.htm)