AD-A115 948  VIRGINIA POLYTECHNIC INST AND STATE UNIV BLACKSBURG
QUALITY METRICS OF DIGITALLY DERIVED IMAGERY AND THEIR RELATION--ETC(U)
FEB 82  H L SNYDER, D I SHEDIVY, M E MADDOX  F49620-80-C-0057
UNCLASSIFIED  VPI-HFL-81-3  AFOSR-TR-82-0475
Quality Metrics of Digitally Derived Imagery and Their
Relation to Interpreter Performance:
III. Subjective Scaling of Hard-Copy Digital Imagery
FINAL TECHNICAL REPORT
Harry L. Snyder, Ph.D.
David I. Shedivy, M.S.
Michael E. Maddox, Ph.D.
Department of Industrial Engineering and Operations Research
Virginia Polytechnic Institute and State University
Blacksburg, Virginia 24061
Approved for public release; distribution unlimited.
Contract F49620-80-C-0057
Life Sciences Directorate
U.S. Air Force Office of Scientific Research
Bolling Air Force Base, D.C. 20332
REPORT DOCUMENTATION PAGE
Report Number: AFOSR-TR-82-0475
Title: SUBJECTIVE SCALING OF HARD-COPY DIGITAL IMAGERY
Authors: Harry L. Snyder, Ph.D.; David I. Shedivy, M.S.;
Michael E. Maddox, Ph.D.
Contract Number: F49620-80-C-0057
Performing Organization: Dept. of Industrial Engineering and
Operations Research, Virginia Polytechnic Institute and State
University, Blacksburg, Virginia 24061
Controlling Office: Air Force Office of Scientific Research,
Bolling Air Force Base, D.C. 20332
Distribution Statement: Approved for public release;
distribution unlimited
Key Words: Digital imagery, Image interpretation, Human
factors, Image processing, Image quality
Abstract:
Hard-copy digital imagery was studied with respect to
subjective image quality. Trained Air Force photointerpreters
judged the interpretability of 250 military scenes. The
scenes varied in noise, blur, and scene content.
The results showed that noise, blur, and scene content
produce differential perceptions of interpretability. Many
interactions were significant.
Other analyses performed on the data concerned the number of
categories that should be used to scale interpretability (at
least 62), the correlation between information extraction
performance and scale value for digital imagery, and
multidimensional scaling, which can be used with limited
utility in studying image quality.
In general, from a review of the literature, digital imagery
does not appear to be greatly different from standard analog
imagery in subjective quality or interpretability.
PREFACE
This research was sponsored by Contract No.
F49620-80-C-0057, from the Air Force Office of Scientific
Research, under the monitorship of Dr. Alfred R. Fregly.
Dr. Harry L. Snyder is the principal investigator.
The imagery used in this research was supplied by the
Environmental Research Institute of Michigan. Dr. James J.
Burke and his staff at the University of Arizona (UA), in
cooperation with Mr. Ray Schmidt of the Image Processing
Institute, University of Southern California, and Mr. H. D.
Fisher of UA, degraded and digitized the original negative
transparencies. Mr. Gilbert Kuperman, AFAMRL, provided
technical advice throughout the effort. The staff and
photointerpreters of the 548th Reconnaissance Technical
Group graciously and professionally supported the data
collection effort.
CONTENTS
PREFACE
I. INTRODUCTION
    OVERVIEW OF THE RESEARCH PLAN
    RESEARCH OBJECTIVES
    SPECIFIC RESEARCH TASKS
    BACKGROUND
        Noise Effects
        Blur Effects
        Subjective Scaling and Information Extraction
        Multidimensional Scaling
        Rating Scales
II. SELECTION OF THE SCALING PROCEDURE
    THE ALTERNATIVES
    THE MOST FEASIBLE ALTERNATIVE
III. PURPOSE OF THE EXPERIMENT
IV. METHOD
    PHOTOINTERPRETERS
    APPARATUS
    PROCEDURE
V. RESULTS
    ANALYSIS OF VARIANCE (ANOVA) ON RATINGS
        Blur
        Noise
        Order of Battle
        Scene
        Blur x Noise
        OB x Blur
        OB x Noise
        Scene x Blur
        Scene x Noise
        Scene x Blur x Noise
    OPTIMAL NUMBER OF RESPONSE CATEGORIES
    MDS ANALYSIS
    CORRELATION WITH PERFORMANCE
VI. DISCUSSION
    NOISE AND BLUR EFFECTS
    NUMBER OF RESPONSE CATEGORIES
    MDS ANALYSIS
    RELATIONSHIP BETWEEN INFORMATION EXTRACTION AND SCALING
    DIGITAL IMAGERY SIMILARITIES
VII. CONCLUSIONS
REFERENCES
Appendix
A. THE NATO SCALE -- AN IMAGE INTERPRETABILITY RATING SCALE
B. SCENE MCT RESULTS
C. BLUR X NOISE MCT RESULTS
D. SCENE MCT RESULTS -- BLUR LEVEL 40
E. SCENE MCT RESULTS -- BLUR LEVEL 52
F. SCENE MCT RESULTS -- BLUR LEVEL 84
G. SCENE MCT RESULTS -- BLUR LEVEL 162
H. SCENE MCT RESULTS -- BLUR LEVEL 322
I. PROJECTIONS OF MDS ANALYSIS
LIST OF FIGURES
Figure
1. Schematic diagram of proposed research
2. The effect of Blur on NATO scale value
3. The effect of Signal-to-Noise ratio on NATO scale value
4. The effect of Order of Battle on NATO scale value
5. The effect of Scene on NATO scale value
6. The effect of Blur x Noise on NATO scale value
7. The effect of Blur x Order of Battle on NATO scale value
8. The effect of Noise x Order of Battle on NATO scale value
9. The effect of Blur x Scene on NATO scale value
10. The effect of Noise x Scene on NATO scale values
11. The effect of Blur x Noise on NATO scale value, Scene 1
12. The effect of Blur x Noise on NATO scale value, Scene 2
13. The effect of Blur x Noise on NATO scale value, Scene 3
14. The effect of Blur x Noise on NATO scale value, Scene 4
15. The effect of Blur x Noise on NATO scale value, Scene 5
16. The effect of Blur x Noise on NATO scale value, Scene 6
17. The effect of Blur x Noise on NATO scale value, Scene 7
18. The effect of Blur x Noise on NATO scale value, Scene 8
19. The effect of Blur x Noise on NATO scale value, Scene 9
20. The effect of Blur x Noise on NATO scale value, Scene 10
21. The relationship between stress and the number of dimensions
LIST OF TABLES
Table
1. Summary of Analysis of Variance on NATO Scale Scores
I. INTRODUCTION
Recent technological developments have resulted in a wide
variety of imaging systems and subsystems. The flexibility
and technologies available to the designer include various
means for collection, coding, transmitting, decoding, analog
and digital processing, and analog and digital display. The
applications of such systems and subsystems are myriad,
ranging from static and dynamic military photointerpretive
functions, through commercial and closed-circuit television
and facsimile systems, to diagnostic radiological
instrumentation and earth resources applications. The
scientific world is quite familiar with some of the
techniques which can be used to "improve" the nature of any
such image, and the non-scientific world has equally seen
examples of such processing effectiveness, such as the
Zapruder and Hughes films of the Kennedy assassination. In
many cases, it is clear that such processing and display
techniques can extract information in the original image
which is otherwise well below the threshold capacity of the
human visual system, whereas in other cases it is clear that
processing techniques can often serve either to hide
existing, and important, image detail or to "create" image
detail which is perhaps not present in the original image or
in the "real world". Heretofore, most of these areas of
image system and subsystem development have plainly suffered
from their inattention to human observer requirements. This
is particularly true of the extensive effort in digital
image processing, especially that part devoted to the
improvement ("enhancement", "restoration") of images for
purposes of human information extraction. In nearly all of
the work performed in laboratories around the country that
are pursuing this type of research, the necessary evaluative
efforts to determine the utility of processing and display
techniques have not been conducted. Rather, reports and
publications of this work typically take the form of "before
and after" pairs of images, where the reader is left to
estimate the utility of such images by visual inspection of
the published (nth-generation) photographs or by the
subjective opinions offered in the text by the author.
If the intent of such image processing techniques is
to improve the information extraction capabilities of the
human observer, it is clearly appropriate and mandatory that
evaluation include objective measurement of human
information extraction from such images in addition to
subjective estimates of the overall quality or utility of
the image. Unfortunately, the human factors experiments
required to produce quantitative and objective assessment of
image quality have rarely been conducted in image processing
laboratories or in conjunction with image processing
programs.
In view of the many millions of dollars being devoted to
image collection, processing, and display systems for the
military and civilian use of digitized images, it is quite
clear that an assessment program is urgently needed to
devise procedures, techniques, and metrics of digital image
quality. Such a program requires the establishment of a
standardized set of procedures for obtaining human observer
information extraction performance; relating this
performance, in a quantitative manner, to the various
collection, processing, and display techniques and
algorithms; and devising a quantitative relationship for the
multi-dimensional scaling of the various collection,
processing, and display techniques in "performance space".
Only through such an integrated program of research can
the system and subsystem designer have meaningful data for
cost-benefit analyses of future system development, be such
systems intended either for military or for non-military
applications. The image collection, processing, and display
technology is now to the point whereby such evaluative
research is sorely needed. Fortunately, microphotometric,
microdensitometric, and human performance measurement
techniques have been evolved during the past several years
to relate human information extraction performance to the
various physical characteristics of both electro-optical and
photographic image displays. The present research program
is designed to extend these recently developed techniques
into the arena of digital images, emphasizing derivation of
metrics of image quality appropriate to digitized images,
and providing cost-benefit data which will permit the
designer and developer to specify optimum system components
for particular image acquisition and display requirements.
OVERVIEW OF THE RESEARCH PLAN
The research plan is laid out schematically in Figure 1.
Each small, solid-lined box, with the exception of the
uppermost, indicates a separate task to be conducted during
the course of the four-year effort. The two large, broken-
lined boxes delineate the specific display formats that will
be studied and compared during this initial program: black
and white hard-copy transparencies and electronic displays.
The small, broken-lined box at the bottom illustrates
important extensions of this research to be pursued in the
future, namely interactive digital displays in both black
and white and full color. The present report describes in
detail the hard-copy subjective scaling experiment.
Figure 1: Schematic diagram of proposed research
RESEARCH OBJECTIVES
The overall research objectives of this program are as
follows:
1. Develop standardized procedures and techniques to
evaluate hard-copy (film) and soft-copy (CRT) digital
image quality.
2. Compare candidate physical metrics of image quality.
3. Compare hard-copy with soft-copy displays for image
interpretation.
4. Evaluate candidate processing, enhancement, and
restoration algorithms for improvement of image
interpretation on soft-copy displays.
SPECIFIC RESEARCH TASKS
In keeping with the general goals described above, the
specific research tasks are as follows:
1. Develop an imagery database and image interpretation
scenarios from high quality aerial photography
relevant to the image interpretation task.
2. Select and purchase display and interface hardware to
present the image database on soft-copy (CRT)
displays.
3. Develop image manipulation software for soft-copy and
hard-copy experiments.
4. Develop and standardize observer data collection
procedures for hard-copy and soft-copy experiments.
5. Develop and standardize procedures for obtaining
physical image metrics from hard-copy and soft-copy
displays.
6. Digitize and degrade database imagery and record
images on hard-copy and magnetic tapes for soft-copy
display.
7. Obtain physical image metric data for hard-copy and
soft-copy displays.
8. Conduct subjective quality scaling and information
extraction studies on hard-copy images.
9. Conduct subjective scaling and information extraction
studies on soft-copy displays.
10. Evaluate the utility of image quality metrics for
both hard-copy and soft-copy imagery.
11. Conduct subjective scaling and information extraction
studies on processed soft-copy imagery.
12. Analyze the utility of image quality metrics for
processed soft-copy imagery.
13. Compare image quality metrics for hard-copy and
soft-copy (processed and nonprocessed) images. Relate
these results to concepts and models of human visual
performance and to imaging system design variables.
This present report relates to Task 8 above. It
describes the results of that part of the program dealing with
subjective scaling of the hard-copy imagery. It also
addresses the question of how subjective image quality is
affected by measurable physical properties of digitally
derived imagery. Specifically, trained photointerpreters
performed a subjective scaling task using images which were
degraded by two known physical characteristics common to
digitized aerial imagery, blur and noise. A parallel
experiment assessing information extraction performance with
the same images is reported by Snyder, Turpin, and Maddox
(1981).
In addition to obtaining these important baseline scaling
data, the experiment also served to evaluate the scaling
methodology to be used in the subsequent soft-copy phases of
the research program. Objectives of this methodology are
described later.
BACKGROUND
Often, photointerpreters (PIs) are not able to inspect in
detail all images brought to them. Hence, the photographs
that are used are those that possess high content and
technical quality. Even in this age of high speed computer
technology, the calculable physical metrics are not
completely reliable in determining which photographs will
provide the most information. Thus, PIs typically make a
cursory inspection of imagery to determine which frames they
feel are most interpretable. PIs often use a standard scale
(e.g., NATO scale) as a reference for selection/rejection
decisions, although the literature contains other scales and
experiments that purport to measure subjective image
quality, and some relate noise and blur content to
subjective image quality for nondigital imagery. Some of
this more pertinent literature is summarized below.
For the most part, this present research is unique in its
use of digital images. Few laboratories have the facilities
and personnel to play an active role in the generation of
digital images and evaluate the effects of degrading them.
Perhaps the most advanced research in this area is conducted
by the government laboratories, although few studies of
government PIs have been reported in the literature.
Noise Effects
For example, several studies have investigated the
effects of random noise upon subjective assessments of image
quality (Allnatt and Prosser, 1946; Below, Huertas-Sendra,
Fritze, and Samrau, 1963; Geddes, 1963; Newell and Geddes,
1963; Prosser and Allnatt, 1965; Weaver, 1959). All of
these studies involved raster-scan television images. The
primary aim of these studies was to develop standards for
commercial television broadcasting. Without exception,
descriptive statistics were the only results reported.
Variations, other than noise levels, in the experimental
conditions across these studies include viewing ratio, the
type of observers, the number of raster lines (405-625),
average luminance, ambient illuminance, color versus black
and white presentation, noise frequency and waveform, and
the subjective scale used for rating the images. The type
of scales used by these investigators is called a grading
scale (Prosser, Allnatt, and Lewis, 1964). The scales
consist of points (typically 1-7) with descriptive
adjectives corresponding to each point. Some of the scales
measure the amount of image impairment while others measure
the subjective quality of the image. In these studies, each
observer assigned a number to an image, a technique known as
direct magnitude estimation (Stevens, 1975).
The authors' conclusions from these studies are
consistent: noise degrades image quality. It is
infeasible to eliminate all noise from a television image.
However, in developing television standards, mark points
have been reported. A mark point refers to the noise level
that intersects a particular subjective score. These points
can be determined most easily by graphic methods. For a
scale with five points (bad, poor, fair, good, excellent,
one-five, respectively), a noise level is associated with
each scale category. Therefore, a standard might be based
on the noise level (or signal-to-noise ratio) yielding a
rating of four or good. In general, the middle scale
category (average quality, good to poor, marginal, obviously
impaired but not objectionable) has been associated with a
range of signal-to-noise ratios (in dB) of 23 to 35 or an
average of about 30. A signal-to-noise ratio of 30 dB is
equivalent to a signal-to-noise ratio of 32:1.
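The conversion between decibels and ratios quoted above can be checked with a short sketch. It assumes the amplitude (voltage) convention, dB = 20 log10(ratio), which is the convention consistent with the figures given in this report (30 dB for about 32:1, and 20 dB for 10:1 in the echo discussion). The function names are illustrative and do not come from the original work.

```python
import math


def db_to_amplitude_ratio(db):
    """Convert decibels to an amplitude ratio, assuming dB = 20 * log10(ratio)."""
    return 10.0 ** (db / 20.0)


def amplitude_ratio_to_db(ratio):
    """Convert an amplitude ratio to decibels, assuming dB = 20 * log10(ratio)."""
    return 20.0 * math.log10(ratio)


if __name__ == "__main__":
    # Range and average of the middle scale category reported in the text.
    for db in (23, 30, 35):
        print(f"{db} dB is about {db_to_amplitude_ratio(db):.1f}:1")
    # A 10:1 amplitude ratio expressed in dB.
    print(f"10:1 is {amplitude_ratio_to_db(10):.0f} dB")
```

Under this convention 30 dB corresponds to roughly 31.6:1, which rounds to the 32:1 stated above. A power-ratio convention (10 log10) would give different numbers and does not match the text.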
While these studies of the effects of noise provide some
data for comparison with the present research, the problems
inherent in doing so should be mentioned. Television
displayed images are quite different from digitized
photographs. It would be foolhardy to expect great
similarity in the results. Further, in all of these
studies, signal-to-noise ratios were calculated at the input
to the video system. The actual viewed signal-to-noise
ratios should be measured at the display by photometric
means.
Rosell and Willson (1973) have done much to forward the
study of noise and its role in target detection. Their
experiments have involved many aspects of vis ual
performance. Much of the preliminary psychophysical work by
these authors involved the detection of displayed stripes
(called bar patterns) on photographs mixed with noise and
displayed on a television monitor. After determining
threshold signal-to-noise levels in bar pattern detection
tasks, experiments were conducted in recognizing and
identifying military targets. These threshold data were
then compared with the bar pattern studies and it was found
that the results of the bar studies could predict
recognition/identification performance, and vice versa. By
applying a scale factor of eight for bar pattern detection
data and matching bar area to target area, the authors
reported a high degree of similarity in the data. No
statistics were reported.
Finally, Humes and Bauerschmidt (1968) took a different
approach to the study of noise degraded imagery. Using the
method of limits and five standard signal-to-noise ratios
from 2 to 29.1 dB, their judges reported whether or not the
test images were equal to or more or less "noisy" than the
standards. The 29 images used in their study had previously
been used in a target recognition study in which the
standard signal-to-noise ratios were effective in showing a
performance difference. The psychophysical study found an
uncertainty range of 1 dB at the [illegible] standard and
[illegible] dB at the 29.1 dB standard. In particular, this
relationship between range of uncertainty and the standard
signal-to-noise ratio was linear above the 7 dB standard.
Blur Effects
Blur has not been studied as extensively as noise, with
respect to subjective scaling. In fact, one must draw upon
related research in order to gain any insight into the
effects of blur. Film degraded by blur is generally
discarded immediately, unless new pictures cannot be
obtained.
Blur in raster-scan images is similar to echo or delay,
which is manifested in "ghost" images about the desired
signal. Several studies have addressed this type of
impairment (Allnatt and Prosser, 1965; Cavanaugh and
Lessman, 1971; Lessman, 1972; Weaver, 1968). These studies
are very similar to those of noise-impaired television
images. Whereas noise is characterized by frequency and
amplitude, echo can be defined in terms of the delay (spread
in time/space about the signal) and amplitude. Echo
impairment is typically expressed as a signal-to-echo ratio,
the amplitude of the signal to the amplitude of the
displaced signal at a particular point before or after the
desired signal. Allnatt and Prosser (1965) and others have
shown quite convincingly that a delay of two microseconds is
most objectionable subjectively.
The results of studies in this area are in
agreement. The middle scale value is associated with a
range of signal-to-echo ratios (in dB) from 10 to 25 with an
average of about 20 dB. A signal-to-echo ratio of 10:1 is
equivalent to 20 dB. These numbers provide another means of
comparing an analog mode of presentation with the present
data.
Subjective Scaling and Information Extraction
Several studies of photo-interpretation by PIs have been
published, primarily in the technical report literature. A
few of these studies have investigated the relationship
between scale values and information extraction performance
(Brainard, Sadacca, Lopez, and Ornstein, 1966; Klingberg,
Elworth, and Filleau, 1970; Sadacca and Schwartz, 1963).
Information extraction performance data were available for
a subset of the images used in the present research.
Performance data allowed for an evaluation of the
relationship between performance and scale values in this
study.
Sadacca and Schwartz (1963) used a ranking technique to
scale 72 images. Several performance measures were also
collected, including the number of correct target
identifications, the number of wrong target identifications,
and an overall accuracy score. The ranks were correlated
with performance for each of three scenes and the three
performance measures. The correlations between the ranks
and the number of correct identifications ranged from
[illegible] to .59, between ranks and incorrect
identifications from -.30 to .00, and between ranks and
overall accuracy from .14 to [illegible], for the three
scenes. It should be noted that different PIs were used in
the performance and scaling studies.
Brainard et al. (1966) had PIs make relative judgments
in scaling image quality. Each of 36 images ([illegible]
scenes) was rated by comparing each image with a catalog of
16 degraded images and assigning each test image a catalog
image number. Two performance measures were collected:
number of correct target identifications and number of
correct identifications of target area. The same PIs
participated in both phases of the experiment. The
correlations between catalog numbers and target
identification performance ranged from .59 to .70, and for
area identification from .56 to .79.
Klingberg et al. (1970) used a standard ranking
procedure to scale 32 images. A target identification
measure was again collected. Subjective rank correlated
very highly, .92, with the performance measure.
It is obviously the case that PIs were able to predict
the interpretability of images through a variety of scaling
procedures in the studies reviewed here. The extent of the
predictability was not consistent.
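As an illustration of the kind of analysis summarized above, the sketch below correlates subjective quality ranks with a performance score using a Spearman rank correlation. This is an assumption made for illustration: the text does not state which correlation coefficient the cited studies computed, and the data values are hypothetical, not taken from any of the reports.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example: six images ranked by judged quality (1 = best) and the
# number of targets correctly identified on each image.
quality_rank = [1, 2, 3, 4, 5, 6]
correct_ids = [9, 7, 8, 4, 5, 2]
rho = spearman(quality_rank, correct_ids)
print(f"Spearman rho = {rho:.3f}")
```

Because rank 1 denotes the best image in this hypothetical coding, a strongly negative coefficient means that better-ranked images yielded more correct identifications. The sign convention depends entirely on how ranks are coded, which is worth checking when comparing coefficients across studies.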
Multidimensional Scaling
Two technical reports have been published in which
multidimensional scaling (MDS) was applied to subjective
image quality (Marmolin and Nyberg, 1978; Sadacca and
Schwartz, 1963). Sadacca and Schwartz (1963) used 12 images
varying in scene content, ground scale, sharpness, and
contrast. Twenty-one PIs judged the similarity, based on
interpretability, of all pairs of images (66). Another group
of PIs performed in an information extraction study with
this imagery database. The authors chose six dimensions for
their MDS spatial configuration but could only explain four
of the six. The four dimensions were interpreted by
averaging the projections over the levels of the imagery
parameters and relating the direction of these means to the
logical subjective effect of the parameters. The means
indicated the following: dimension one was related to ground
scale, dimension two was related to sharpness, dimension
three to scene content, and dimension four to contrast. No
other analysis was attempted using the projections.
Marmolin and Nyberg (1978) took a similar approach using
24 degraded images and four untrained judges. Physical
quality metrics were calculated for each image. Four
dimensions were chosen to represent the data. While six
dimensions produced minimum stress, the authors chose the
four-dimension model because these dimensions could be
interpreted. The four dimensions were interpreted from
correlations obtained between the image parameters
(including physical metrics) and the projections on each
dimension. The four dimensions were interpreted as
sharpness, noise, contrast, and the bandpass by contrast by
noise interaction, respectively. Since no information
extraction performance measures were collected, no analyses
relating the projections to performance were conducted.
These MDS analyses can be compared with the present
analysis. To the extent that the data collected in this
research may differ from other MDS analyses of subjective
image quality, a basis for comparing digital versus analog
photographs presents itself. While a well-mapped subjective
quality space may serve as a device for screening the
interpretability of photographs in the future, this will be
useful only if a spatial configuration can be located that
is highly related to performance (i.e., projections must
correlate with performance). This consideration has not
been addressed in past research.
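The trade-off between stress and the number of dimensions mentioned in the MDS studies above can be illustrated with a minimal sketch. It evaluates Kruskal's stress-1 for a given configuration, using the raw dissimilarities in place of fitted disparities (a metric simplification, and an assumption here; the cited studies do not specify their stress formula in this text). It does not fit an MDS solution. The four-point data set is hypothetical.

```python
import math
from itertools import combinations


def stress1(coords, dissim):
    """Kruskal's stress-1 of a configuration against a dissimilarity matrix.

    coords: list of coordinate tuples, one per stimulus.
    dissim: square matrix (list of lists) of observed dissimilarities.
    Raw dissimilarities stand in for disparities (metric simplification).
    """
    numerator = 0.0
    denominator = 0.0
    for i, j in combinations(range(len(coords)), 2):
        distance = math.dist(coords[i], coords[j])
        numerator += (distance - dissim[i][j]) ** 2
        denominator += distance ** 2
    return math.sqrt(numerator / denominator)


# Hypothetical stimuli whose true dissimilarities are two-dimensional:
# the four corners of a unit square.
points_2d = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
dissim = [[math.dist(p, q) for q in points_2d] for p in points_2d]

# A one-dimensional configuration obtained by discarding the second axis.
# This is a crude truncation, not an optimal one-dimensional MDS fit.
points_1d = [p[:1] for p in points_2d]

stress_2d = stress1(points_2d, dissim)
stress_1d = stress1(points_1d, dissim)
print(f"stress with 2 dimensions: {stress_2d:.3f}")
print(f"stress with 1 dimension:  {stress_1d:.3f}")
```

Stress falls as dimensions are added, which is why the number of dimensions is usually chosen by balancing stress against interpretability, as in the four-dimension choice by Marmolin and Nyberg described above.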
Rating Scales
Almost every rating scale used by experimenters in
scaling studies has a different number of categories. The
choice as to the number of categories to include in a scale
is generally arbitrary, depending on the resolution desired
in the obtained response. Why the existing NATO scale and
perhaps other similar scales used by PIs to rate the
interpretability of photographs has 10 categories is
unknown. There are no data to indicate this number to be
optimal in rating ground resolved distance (GRD) or image
quality in general.
Without justification, Guilford (1954) suggested that
about 20 categories are optimal in general uses of rating
scales. The optimal number of categories is dependent on
the number of stimulus categories (and stimulus variation)
to be scaled (Erikson and Hake, 1955). The basic issue is
the number of absolute judgements that can be made in a
particular stimulus dimension. In this research, unlike
basic research in auditory or visual perception, the
stimulus property to be scaled (e.g., frequency, luminance,
chrominance) is not easily or objectively measurable. In
perception research it has been shown that about ten
categories are sufficient for a variety of stimulus
dimensions (Miller, 1956). However, Muller, Sidorsky,
Slivinske, Alluisi, and Fitts (1955) have shown that many
more categories (24) can be used efficiently when the judges
are allowed to practice the task. Therefore, it would seem
reasonable to conclude that a scale with more than 10
categories would better provide the response resolution
needed by a well-practiced PI. For these reasons, and
because the subjective scaling technique to be used in
subsequent studies in this program should use the same scale
(for comparability of results), careful attention was given
to the scale selection.
II. SELECTION OF THE SCALING PROCEDURE
A large part of this research involved the determination
of the most efficient scaling procedure in light of
constraints. In terms of the cost involved in generating the
imagery and collecting the data, considerable time and
effort were spent in evaluating alternative scaling
approaches. It was known during this period of evaluation
that 15 PIs would be available for about four hours each.
It was also known that an imagery database consisting of 250
images would be available. Because of the professional
attitude held by most PIs, the experimental task would have
to be parsimonious and, in general, similar to the typical
practices of the PI. Consequently, an unclassified NATO
scale was found that would serve as a reference for scaling
image quality (Appendix A). The NATO scale was optimal in
that it possessed a high degree of similarity to the scales
used operationally by PIs. It is also presumed to relate
directly to GRD. The only remaining problem, and perhaps
the most difficult to resolve, was to decide how to use the
scale to collect the most meaningful data without creating
an operationally meaningless task for the PIs.
THE ALTERNATIVES
Two broad classes of subjective scaling methods exist:
direct and indirect. Each class encompasses numerous, but
varied, approaches. The major difference between the two
classes is the specificity of the obtained response
information. When the response given by the judge (or PI)
provides a direct quantization (numerical representation) of
the subjective effect of the stimulus, then the procedure
can be said to be direct. Direct methods of scaling can
result in interval or ratio scale values; judges simply
report the scale values. Types of direct methods include
ratio estimation, magnitude estimation, fractionation, and
cross-modality matching. On the other hand, the indirect
methods of scaling result in ordinal scale values that can
be transformed to an interval or ratio scale. The indirect
methods are typified by the pair comparison approach, where
all of the experimental stimuli are paired and compared on
the basis of some attribute. Other indirect methods include
the methods of triads and rank ordering. In most cases,
both direct and indirect methods will produce satisfactory
scale values; however, experimental constraints often
dictate the most feasible approach to take.
THE MOST FEASIBLE ALTERNATIVE
The direct methods of scaling are very easy to apply. A
seemingly acceptable task entailed each PI making 250 ratio
estimates in four hours. As PIs, in their daily work, use
an internalized concept of GRD to draw inferences from
imagery, the NATO scale seemed to be an ideal instrument to
use in the study of the subjective interpretability of
images. The NATO scale, in conjunction with ratio
estimation, satisfied the needs of this research (e.g., a
task similar to that of the PIs' daily practices, a scaling
method that would provide scale values for subsequent MDS
analysis, and a method that could be used in light of the
constraints on time and the number of available PIs).
The indirect methods, particularly pair comparisons, are
perhaps more reliable than the direct methods because of the
procedure involved (relative judgements), but require a
large amount of time in data collection. Though there are
special pair comparison techniques for reducing the number
of pairs to be judged, with limitations on the number of
judges available, pair comparisons could not be used in this
study without reducing the range of experimental parameters
currently existing in the imagery database.
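The time cost of a complete pair-comparison design can be made concrete with a short calculation. The sketch below is illustrative only: the image count, number of PIs, and hours per PI come from the text, while the assumption that every pair is judged exactly once (by a single PI) is ours and is the most favorable case.

```python
from math import comb


def pair_count(n_stimuli: int) -> int:
    """Number of unordered pairs in a complete pair-comparison design."""
    return comb(n_stimuli, 2)


n_images = 250        # size of the imagery database (from the text)
n_judges = 15         # PIs expected to be available (from the text)
hours_per_judge = 4   # time available per PI (from the text)

pairs = pair_count(n_images)

# Assumption: each pair is judged only once, by one PI, with no replication.
total_seconds = n_judges * hours_per_judge * 3600
seconds_per_pair = total_seconds / pairs

print(f"{pairs} pairs, about {seconds_per_pair:.1f} s available per pair")
```

Even with no replication across judges, only a few seconds would be available per comparison, which is consistent with the conclusion that pair comparisons were not feasible for the full database.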
Consequently, ratio estimation using the NATO scale as
the reference appeared to be the most reliable and time
efficient means of collecting subjective data.
III. PURPOSE OF THE EXPERIMENT
The purpose of this experiment was to investigate the
psychophysically scaled effects of blur and noise on digital
image quality. To maintain an experimental environment in
which PIs are most accustomed to working, the NATO scale was
chosen as a reference for rating image quality. Because of
the inclusion of the NATO scale, issues related to the use
of scales were also studied.
The specific objectives/hypotheses of this research are
as follows: (1) digital images degraded by blur and/or
noise appear less interpretable as the degree of degradation
increases; (2) the optimal number of categories in the NATO
scale is greater than 10; (3) an MDS analysis can be used to
map the subjective, spatial dimensionality of the imagery
database and predict information extraction performance; (4)
the correlation between information extraction performance
and ratings of apparent interpretability is high for trained
PIs, and (5) the scaling technique used will produce data
sufficiently consistent and meaningful such that the same
scaling technique can be used in the subsequent soft-copy
studies in this program.
IV. METHOD
PHOTO INTERPRETERS
The PIs used in this experiment were 14 NATO-scale
trained military PIs who were stationed at Hickam Air Force
Base, Hawaii. One of fifteen PIs scheduled to participate
in the study declined because of the importance and urgency
of her regular work. The average age of the PIs was about
25 years and the average experience level about four years.
The study was conducted at Hickam Air Force Base during
normal working hours. The PIs were not paid extra for their
participation.
APPARATUS
A Perkin-Elmer microdensitometer was used to digitize 10
standard photographs consisting of three orders of battle:
air, electronic, and sea (i.e., four airfield scenes, two
scenes of typical research and development installations,
and four quay and shipyard scenes). The images were
digitized in a 4096 x 4096 picture element format. The
complete set of digitized images was planned to represent
all combinations of five levels of noise, 10, 20, 40, 80,
and 160 digital units (signal-to-noise ratios of 200, 100,
50, 25, and 12.5), five levels of blur, 20, 40, 80, 160, and
320 micrometers (μm), and 10 scenes. Blur was produced in
software by multiplying the frequency spectrum of each
digitized image by an appropriate Gaussian filter function.
A Fast Fourier Transform yielded the frequency spectra. The
image matrices were then reconstructed by the inverse
Fourier transform. As blur was added to each image, high
frequency detail was removed. Noise was added to the images
by multiplying each picture element by a value randomly
selected from an appropriate Gaussian random noise function.
The scenes and amounts of degradation were scrutinized and,
in part, chosen by a senior PI who provided the scores for
the information extraction study. For a more thorough
description of the imagery database and its development, see
Burke and Strickland (1982). For reasons given in that
report, the final hardcopy SNR levels were 75, 60, 42, 24,
and 12, while the final blur levels were 40, 52, 84, 162,
and 322 micrometers.
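The frequency-domain blurring described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the software used to build the database: the function name, the parameterization of the Gaussian by a spatial standard deviation in pixels (`sigma_px`), and the test image are all hypothetical, and the report does not give the mapping between blur in micrometers and filter width.

```python
import numpy as np


def gaussian_blur_fft(image: np.ndarray, sigma_px: float) -> np.ndarray:
    """Blur a 2-D image by multiplying its spectrum by a Gaussian filter.

    sigma_px is the standard deviation (in pixels) of the equivalent
    spatial Gaussian; it is an assumed parameterization.
    """
    rows, cols = image.shape
    fy = np.fft.fftfreq(rows)[:, None]   # vertical frequency, cycles/pixel
    fx = np.fft.fftfreq(cols)[None, :]   # horizontal frequency, cycles/pixel

    # The Fourier transform of a unit-area spatial Gaussian is itself a
    # Gaussian: exp(-2 * pi^2 * sigma^2 * f^2). Its value at f = 0 is 1,
    # so the mean image level is preserved.
    transfer = np.exp(-2.0 * (np.pi * sigma_px) ** 2 * (fx ** 2 + fy ** 2))

    spectrum = np.fft.fft2(image)                  # forward FFT
    blurred = np.fft.ifft2(spectrum * transfer)    # filter, then inverse FFT
    return np.real(blurred)


# Hypothetical test image: white noise, which is rich in high frequencies.
rng = np.random.default_rng(0)
test_image = rng.normal(loc=100.0, scale=10.0, size=(64, 64))
blurred_image = gaussian_blur_fft(test_image, sigma_px=2.0)
```

Because the filter attenuates high spatial frequencies and leaves the zero-frequency term untouched, the blurred image keeps its mean level while losing fine detail, which matches the statement that high frequency detail was removed as blur was added.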
The 250 images were shown to the PIs in the form of
positive transparencies, 7.6 x 7.6 cm. A light table
(Richards Model 33H100) with binocular zoom stereo optics
and hand held tube magnifiers was available to all PIs
during data collection.
As mentioned, the existing NATO scale that was used to
scale the transparencies is based on interpretability and
GRD. Each increasing whole number, 0 to 9, represents a 50%
reduction of GRD, beginning with "useless for
interpretation", scale score 0, greater than 9 m, scale
score 1, and ending with less than 10 cm for a scale score
of 9. (See Appendix A for the scale.) In this experiment,
the PIs were asked to report not only a whole number scale
value but also a decimal. In other words, because each
whole number represents a range of interpretability and
GRD, the PIs had to interpolate to the nearest 0.1 over the
range to report the decimal portion of their scale value.
Thus, the existing 10-point NATO scale was transformed to a
100-point scale.
PROCEDURE
Each PI was allowed approximately four hours to scale all
250 transparencies. Rest periods were allowed as needed.
The 250 transparencies were administered to each PI in a
different random order. The randomization scheme was
obtained using the Statistical Analysis System (SAS) Plan
Procedure (Barr, Goodnight, Sall, and Helwig, 1976). The
scale values were reported verbally by the PI and recorded
manually by the experimenter. The instructions were
administered to each PI as follows:
This experiment will involve your rating numerous transparencies on the basis of interpretability. It is assumed that you have had some experience with rating scales; if you have not, you should inform the experimenter at this time.
The rating scale that will be used here is a 0-9 imagery interpretability rating scale. This scale is probably similar to those you are familiar with. Please look over the scale (attached) at this time. As you can see, larger scale values represent a greater interpretability. The range of some interpretation capabilities that fall into that range are given under each Rating Category, 0-9. Your task will be to rate each transparency using this scale.
However, in an attempt to obtain more information from your ratings, we would like you to report your ratings to the nearest tenth. That is, your ratings should take the form 3.9, 7.2, or 5.0, rather than simply 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9. It may be a good strategy to determine the range in which a particular transparency falls (say between 6 and 7) and then try to refine your rating. Please remember to be explicit in reporting both portions of your rating, a whole number and a decimal. We hope to show that PIs can rate transparencies with more resolution than conventional scales afford. If you have any questions regarding the scale or how we would like you to use it, please ask the experimenter at this time.
In this phase of the experiment, you will see 250 transparencies. The 250 transparencies represent 10 scenes that have differing amounts of noise and blur. Your task is to rate each transparency using the scale we have just discussed. The scale will be available for use during the experiment, and you may use any equipment you feel would be helpful. However, you will only have about 4 hours to rate the entire set of 250 transparencies; therefore, you should spend about 1 minute on each. The experimenter will be in the room with you while you rate the transparencies. He will hand you each transparency, and after you arrive at a rating, please hand back the transparency and verbally report your rating.
Before we begin, do you have any questions regarding this phase of the experiment?
The instructions given to and the procedure carried out by
the experimenter were consistent with the above
instructions.
Prior to participation in this experiment, each PI was
asked to read and sign an informed consent form to insure
that the rights of the participant were known and upheld.
V. RESULTS
ANALYSIS OF VARIANCE (ANOVA) ON RATINGS
The overall ANOVA of the scaling data includes the three
parameters of the imagery database, noise, blur, and scenes,
in addition to order of battle and PI. As indicated earlier,
the imagery database consists of scenes that were nested in
various orders of battle. The summary table for the overall
ANOVA indicates that almost every source of variation,
including interactions, was significant (Table 1).
Another factor was separately analyzed but does not
appear in the summary table. Two of the fourteen PIs who
participated in the study were actually Army PIs while the
other twelve were in the Air Force. Therefore, the main
effect of Service and the Service x Scene interaction were
analyzed. Both effects were non-significant (F(1,12) = .26,
p = .6184 and F(7,84) = 1.93, p = .075, respectively) and
were deleted from the overall analysis.
TABLE 1
Summary of Analysis of Variance on NATO Scale Scores

SOURCE                      df        MS         F         p
Blur (B)                     4    832.01    105.63    <.0001
Noise (N)                    4    104.20     82.59    <.0001
Order of Battle (OB)         2     57.36     16.01    <.0001
Photointerpreter (PI)       13    259.24
Scene/OB (S/OB)              7     28.80     17.49    <.0001
B x N                       16      5.96     14.06    <.0001
B x S/OB                    28      2.30      5.14    <.0001
B x PI                      52      7.88
B x OB                       8      5.40      8.03    <.0001
N x S/OB                    28       .57      1.40     .0882
PI x N                      52      1.26
N x OB                       8      1.41      3.09     .0036
PI x S/OB                   91      1.65
PI x OB                     26      3.39
B x N x S/OB               112       .42      1.26     .0378
B x N x PI                 208       .42
B x N x OB                  32       .41      1.09     .3349
B x PI x S/OB              364       .45
B x PI x OB                104       .67
PI x N x S/OB              364       .41
PI x N x OB                104       .46
B x PI x N x S/OB         1456       .34
B x PI x N x OB            416       .37
Total                   (3499)
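As a reading aid for Table 1, the sketch below recomputes two of the F ratios and the total degrees of freedom. It rests on an assumption that the table does not state explicitly: that each fixed effect was tested against its interaction with PI (a common choice when judges are treated as a random factor). The tabled values for Blur and Noise are consistent with that assumption to within rounding; the dictionary keys and function name are illustrative.

```python
# Mean squares copied from Table 1.
mean_squares = {
    "B": 832.01,       # Blur
    "N": 104.20,       # Noise
    "B x PI": 7.88,    # assumed error term for Blur
    "PI x N": 1.26,    # assumed error term for Noise
}


def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F statistic as the ratio of an effect mean square to its error term."""
    return ms_effect / ms_error


f_blur = f_ratio(mean_squares["B"], mean_squares["B x PI"])
f_noise = f_ratio(mean_squares["N"], mean_squares["PI x N"])

# Total degrees of freedom: 14 PIs x 5 blur x 5 noise x 10 scenes, minus 1.
n_ratings = 14 * 5 * 5 * 10
total_df = n_ratings - 1

print(f"F(Blur) ~ {f_blur:.2f}, F(Noise) ~ {f_noise:.2f}, total df = {total_df}")
```

Small discrepancies from the tabled F values are expected because the mean squares in the table are themselves rounded to two decimals.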
Blur
Increasing blur results in decreasing scale values, as
plotted in Figure 2. A Newman-Keuls multiple comparisons
test (MCT) showed that all comparisons were significant (p <
.05) except between blurs of 40 and 52 μm and between 52 and
84 μm. The trend was monotonically decreasing with
increasing blur degradation. The linearity of this effect
is indicated by a linear correlation of r = .97, p < .0001.
Figure 2: The effect of Blur on NATO scale value
The main effect of Noise was also highly significant, as
shown in Figure 3. Increases in signal-to-noise ratio
generally increased scale values. The MCT showed that all
comparisons were significant except between signal-to-noise
ratios 75 and 60. This trend was also monotonically
decreasing and highly linear with degradation, r = .946, p <
.001.
Order of Battle
The main effect of Order of Battle (OB) is illustrated in
Figure 4. The MCT showed that both air and sea orders of
battle were rated more interpretable than electronic scenes
(p < .05), although air and sea OBs did not differ from one
another (p > .05).
Figure 3: The effect of Signal-to-Noise ratio on NATO scale
value
Figure 4: The effect of Order of Battle on NATO scale value
Differences among Scenes
Differences in scaling among Scenes were statistically
significant, as seen in Figure 5. The differences among
Scenes detected by the MCT are shown in the appendices.
Figure 5: The effect of Scene on NATO scale value
Blur x Noise
The Blur x Noise interaction is shown in Figure 6. In
general, the signal-to-noise effect is reduced as blur
increases. With reduced blur, the decrease in NATO scale
value with decreasing signal-to-noise ratio becomes more
pronounced. Simple F ratios were calculated for the Noise
effect at each Blur level. All of these Noise simple
effects were highly significant, p < .0001, with the
exception of the 322 μm blur, p = .003. Subsequent MCTs
showed that fewer and fewer Noise levels differed from one
another as blur increased.
Figure 6: The effect of Blur x Noise on NATO scale value
The interaction between OB and Blur, shown in Figure 7, was
statistically significant. Simple F ratios were calculated
for OB at each Blur level. The OB simple effect was
significant at all levels of Blur except at a blur of 40 μm.
A blur of 322 μm produced differences among all OBs, while
at the intermediate levels of Blur there was no difference
between air and sea OBs. Air and electronic OBs did not
differ at 84 μm.
Figure 7: The effect of Blur x Order of Battle on NATO
scale value
The interaction between OB and Noise is shown in Figure
8. The difference in scale values between air and sea OBs
becomes greater with decreasing signal-to-noise ratios.
Simple effect F-ratios calculated at each Noise level were
all significant. Subsequent MCTs showed that the only
nonsignificant differences in OBs were between air and sea,
except at a signal-to-noise ratio of 12.5, where all OBs
differed.
Figure 8: The effect of Noise x Order of Battle on NATO
scale value
Scene x Blur
The interaction between Scene and Blur was statistically
significant, Figure 9, but is difficult to interpret
logically. Simple effect F-ratios were calculated for
Scene at each Blur level. All Scene simple effects were
highly significant, p < .0001. MCTs showed various
differences among Scenes at each Blur level. While the
ordered Scene means at each Blur level were similar to the
order shown by the overall Scene main effect, each Blur
level produced unique differences among Scenes. Appendices
D-H depict the Scene differences for Blur levels 40 to 322
μm, respectively. MCTs were conducted but provide little in
the way of clarification of the interaction; apparently,
some Scenes were affected more by Blur than were other
Scenes.
Figure 9: The effect of Blur x Scene on NATO scale value
Scene x Noise
The Scene x Noise interaction did not reach statistical
significance.
Figure 10: The effect of Noise x Scene on NATO scale values
The Scene x Blur x Noise interaction was statistically
significant, and is plotted in Figures 11 to 20. Each Figure
represents the Blur x Noise interaction for a different
Scene: Figure 11, Scene 1, Figure 12, Scene 2, Figure 13,
Scene 3, Figure 14, Scene 4, Figure 15, Scene 5, Figure 16,
Scene 6, Figure 17, Scene 7, Figure 18, Scene 8, Figure 19,
Scene 9, and Figure 20, Scene 10. Simple F ratios were
calculated for each Blur x Noise interaction at each Scene.
Only the Blur x Noise interaction for Scene 4 failed to
reach significance (p > .05). While further simple-effect
tests were conducted on each Blur x Noise interaction for
each Scene, they tended to repeat findings previously
reported here and provided no further understanding of the
three-way interaction.
Figure 11: The effect of Blur x Noise on NATO scale value, Scene 1
Figure 12: The effect of Blur x Noise on NATO scale value, Scene 2
Figure 13: The effect of Blur x Noise on NATO scale value, Scene 3
Figure 14: The effect of Blur x Noise on NATO scale value, Scene 4
Figure 15: The effect of Blur x Noise on NATO scale value, Scene 5
Figure 16: The effect of Blur x Noise on NATO scale value, Scene 6
Figure 17: The effect of Blur x Noise on NATO scale value, Scene 7
Figure 18: The effect of Blur x Noise on NATO scale value, Scene 8
Figure 19: The effect of Blur x Noise on NATO scale value, Scene 9
Figure 20: The effect of Blur x Noise on NATO scale value, Scene 10
OPTIMAL NUMBER OF RESPONSE CATEGORIES
When it was decided that the NATO scale would be used in
this research, the only concern was that the original scale
might have too few response categories (10) to test
thoroughly the subjective resolution of the PIs. Therefore,
as noted earlier, the normal use of the scale was modified
to accommodate 100 response categories. Though 100
categories were believed to be more than necessary, it was
felt that the PIs could use the whole number/decimal scale
with less difficulty than any other interpolation scheme.
The Shannon-Wiener measure of information (see Equation 1)

$$H = -\sum_{i=1}^{n} p_i \log_2 p_i \qquad (1)$$

was used to calculate the average informational value of the
100 scale alternatives (Attneave, 1959). Eighty-eight of
the 100 possible categories were used by the PIs. The
Shannon-Wiener formula indicated that 5.95 bits (average) of
information were required to arrive at each scaling
decision. Therefore, about 62 (2^5.95) categories would
theoretically suffice in scaling this imagery database. The
100-point scale, although larger than necessary and
different from the standard 10-point scale, appeared to be
easily understood and highly functional in scaling
interpretability.
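The step from 5.95 bits to about 62 categories can be reproduced with a short calculation. The sketch below is illustrative: the function names are hypothetical, and the uniform sample of eight responses is a made-up example, not data from the study. It computes the Shannon-Wiener measure from a list of responses and converts an information value in bits to the equivalent number of equally likely categories.

```python
import math
from collections import Counter


def shannon_bits(responses) -> float:
    """Average information (bits) of the observed response distribution."""
    counts = Counter(responses)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def equivalent_categories(bits: float) -> float:
    """Number of equally likely categories carrying the given information."""
    return 2.0 ** bits


# Hypothetical example: eight categories used equally often carry 3 bits.
example_bits = shannon_bits([0, 1, 2, 3, 4, 5, 6, 7])

# Value reported in the text: 5.95 bits per scaling decision.
categories_needed = equivalent_categories(5.95)
print(f"{example_bits:.2f} bits; 5.95 bits ~ {categories_needed:.1f} categories")
```

Because the 88 categories actually used were not used equally often, the information per decision falls below log2(88), which is why the theoretical requirement is smaller than the number of categories observed.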
MDS ANALYSIS
Unfortunately, the Bell Laboratories KYST-2A program
purchased for the MDS analysis was not capable of analyzing
all of the data at one time. In fact, a maximum of 60
images had to be selected from the imagery database (250
images) to comply with the limitations of the routine. The
selection was made as follows. Because Blur levels of 40
and 52 μm and S/N levels of 75 and 60 did not differ from
one another, these lowest levels of degradation were
discarded. While 10 Scenes in the imagery database were
particularly responsible for its size, three of the more
consistently scaled Scenes, one from each OB, were chosen.
Scale value data for those Scenes (5, 8, and 10) are plotted
in Figures 15, 18, and 20, respectively. Thus, 48 images
were submitted to the MDS analysis, 3 Scenes x 4 Blur levels
x 4 Noise levels. A dissimilarity matrix was calculated for
input to MDS by taking the absolute difference between
each pair of the 48 images; the actual input to the MDS
analysis was a triangular matrix without diagonal elements,
or 48(47)/2 = 1128 data points.
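The construction of the MDS input can be sketched as follows. This is an illustration under assumptions: the text does not say whether the absolute differences were taken between mean scale values or per PI, so the sketch assumes one mean scale value per image; the 48 values are randomly generated placeholders, not the study data, and the function name is hypothetical.

```python
import numpy as np


def lower_triangle_dissimilarities(scale_values) -> np.ndarray:
    """Absolute differences between every pair of scale values.

    Returns only the strictly lower triangle (no diagonal) as a flat array.
    """
    v = np.asarray(scale_values, dtype=float)
    full = np.abs(v[:, None] - v[None, :])        # symmetric n x n matrix
    rows, cols = np.tril_indices(len(v), k=-1)    # indices below the diagonal
    return full[rows, cols]


# Placeholder data: 48 hypothetical mean scale values (3 x 4 x 4 images).
rng = np.random.default_rng(0)
hypothetical_means = rng.uniform(2.5, 6.9, size=48)

dissimilarities = lower_triangle_dissimilarities(hypothetical_means)
print(dissimilarities.size)  # number of data points passed to MDS
```

For 48 images the lower triangle holds 48(47)/2 entries, matching the count given in the text.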
Several runs of the Bell Laboratories routine were made
before the best model was selected. The best model is shown
in Equation 2a. The general form of the MDS model is given
in Equation 2b.
$$d_{ij} = \left[ \sum_{k=1}^{5} (x_{ik} - x_{jk})^{2} \right]^{1/2} \qquad (2a)$$

$$d_{ij} = \left[ \sum_{k} \left| x_{ik} - x_{jk} \right|^{r} \right]^{1/r} \qquad (2b)$$

[...] the number of dimensions in the model. [...] Figure 21
indicates that the model with five dimensions yielded the
best fit. [...] The analysis gave a projection for each of
the 48 images on each of the five dimensions. The
projections are listed in Appendix I. [...] Some of the
dimensions were apparently related to the effects found
significant in the overall ANOVA. [...]
Figure 21 (x-axis: number of dimensions)
The MDS projections, in all five dimensions, and mean NATO
scale values were used to predict the previously obtained
information extraction performance in a multiple regression
equation. Of the 48 images analyzed by the MDS program,
only 24 were applicable because Blur levels of 52 and 162 μm
were not exploited by the PIs. Of course, all 24 mean scale
values, averaged across judges, were available. Multiple
regression is tantamount to performing multiple
correlation, except that a weighting factor is
provided for each predictor [...]
[...] the correlation, r = .70, p = .003. (Statistically, this
correlation was not a significant improvement over .47 or
.52, z = 1.2[...], p = .110 and z = [...], p = .170,
respectively.)
Finally, overall mean performance scores (averaged over
Scenes and PIs) were correlated with corresponding mean
scale scores for each of the 15 Blur and Noise combinations.
The correlation is improved and significant, r = .898.
(This correlation was a statistically significant improvement
over .47, .52, and .70, z = 3.22, p = .0007; [...]; and
z = 2.11, p = .017, respectively.)
VI. DISCUSSION
The NATO scale and its usage here presented no problems
for the PIs. In general, both the imagery database and the
scaling task were well received by all of the PIs. While
PIs, by profession, are very scrupulous of photographic
quality and, for security reasons, sensitive about giving
insight into their work procedure, this study was not
hindered by the use of these highly trained judges, except
for the one case of attrition which was due to workload.
NOISE AND BLUR EFFECTS
The overall statistical analysis confirmed the first
hypothesis. Both noise and blur degradations reduced the
scaled interpretability of the images. In general, the
greater the physical image degradation, the greater was the
reduction in interpretability. However, for both noise and
blur, there were no differences in interpretability between
the two lowest levels of degradation. One or both of two
conditions may have been responsible for this result.
First, images assumed to be free of noise had somewhat
higher base levels of noise even before any degradation was
effected. The second possible explanation regards the PIs'
insensitivity to noise and blur. If the true relationship
between noise or blur and interpretability is sigmoid-
shaped, then judges are psychophysically insensitive to
small changes in noise or blur level when the extent of
degradation is either very slight or very great. Stevens
(1975, p. 136) has suggested that when the full range of a
stimulus parameter is used experimentally, the
psychophysical function tends to be sigmoid-shaped.
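The consequence of a sigmoid-shaped psychophysical function can be illustrated numerically. The sketch below uses a logistic curve purely as a stand-in for a sigmoid; the functional form, the parameter values, and the function names are assumptions made for illustration and are not fitted to the data of this study.

```python
import math


def logistic(x: float, midpoint: float = 0.0, slope: float = 1.0) -> float:
    """A generic sigmoid response as a function of stimulus level x."""
    return 1.0 / (1.0 + math.exp(-slope * (x - midpoint)))


def local_sensitivity(x: float, step: float = 0.1) -> float:
    """Change in response per unit change in stimulus, near level x."""
    return (logistic(x + step) - logistic(x - step)) / (2.0 * step)


# Sensitivity near the middle of the range versus near either extreme.
mid_range = local_sensitivity(0.0)
low_extreme = local_sensitivity(-4.0)
high_extreme = local_sensitivity(4.0)
print(mid_range, low_extreme, high_extreme)
```

Under this assumed form, the same small change in degradation produces a much smaller change in response at either end of the range than in the middle, which is one way the two lowest degradation levels could yield indistinguishable ratings.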
The combined effect of blur and noise indicated that
these degradations are interactive. The MCT for the Blur x
Noise interaction indicated that noise has little or no
effect on a highly blurred image. But, blur did have an
effect on a very "noisy" image. It appears, then, that PIs
react differently to noise degradation than they do to blur.
Subjective comments by PIs substantiate this notion; it is
common for PIs to state that they can "look through" noise
but cannot disregard blur.
NUMBER OF RESPONSE CATEGORIES
The determination of the optimal number of response
categories took an absolute judgement or information
processing approach. Given that the information processing
analysis is so easy to perform, future research might
attempt to determine the conditions under which the
technique and these results are valid. It should be kept in
mind that the optimal number of response categories is
affected by numerous factors, such as the psychophysical
range and spacing of the stimuli (Alluisi, 1957). This
imagery database was previewed by senior PIs and deemed
representative of the range of quality of imagery actually
interpreted by PIs in practice. This analysis showed that
about 62 categories would be required to allow adequately
for the subjective quality differences in the present
imagery database. Other imagery databases might well
require more than or less than 62 categories. However,
because this 100-point scale, while larger than needed, was
easy to use by the PIs, it is strongly recommended as a
replacement for the current 10-point NATO scale.
MDS ANALYSIS
MDS analysis fits the data to a multidimensional spatial
configuration and calculates a projection on each dimension
for each cell in the experimental design. While MDS
analysis is simply a mathematical curve fitting procedure,
similar to factor analysis, difficulty arises in the
interpretation of the dimensions of the preferred
configuration. It is only through the repeated replication
of an experiment that an investigator can, with some degree
of confidence, begin to understand the meaning of the MDS
dimensions. In light of the numerous significant effects
reported in the overall ANOVA, it is beyond the present data
to interpret the five dimensions of the reported
configuration. It does appear that at least three of the
dimensions map well onto known database variables of noise,
scene (or OB), and blur. Therefore, the MDS analysis, while
not conclusive in its defined dimensions, appears to be
consistent with known facts about the imagery database.
Minimum stress was achieved. However, because the MDS
routine is limited to six dimensions in fitting the data,
there is no assurance that the best model did not represent
a local stress minimum rather than a global minimum. While
the projections are listed in Appendix I, the usefulness of
this subjective response configuration remains uncertain.
The regression equation utilizing the projections to
predict performance failed to add any meaningfulness to the
MDS analysis. The ultimate goal of either scale values or
physical metrics is the prediction of performance. Future
MDS analyses in this area of research should concentrate on
validating MDS projections in terms of performance. Perhaps
scale scores that correlate better with performance than
these did will result in projections that better predict
performance. It is possible that the predictive value of
the MDS projections would have been enhanced by a complete
set of performance data to accommodate the 48 images
selected for MDS.
In regard to past research and MDS analysis, it is
interesting that this MDS analysis showed that a similar
number of dimensions were required in the spatial
configuration. Also, the interpretation of the dimensions
was quite similar to past research. Blur is doubtlessly
related to sharpness. Considering the data of Marmolin and
Nyberg (1978), these data support the contention that the
Euclidean model, r = 2.0, is more appropriate for MDS
analyses of subjective image quality.
RELATIONSHIP BETWEEN INFORMATION EXTRACTION AND SCALING
in light of the correlations reported by other authors,
the correlation (for individual PIs and Scenes) obtained
here between information extraction performance and NATO
scale values was disappointing though statistically
sign ficant. The most cogent explanation for the low
correlation, and the failure of the regression analysis
discussed above, is the difference between the experimental
designs in the scaling and information extraction studies.
The scaling study was a completely factorial design,
whereas, in the exploitation study, Scenes were confounded
with Noise for each PI, and Blur was treated as a between-
subjects variable. Within-subject factors have been shown
to be more sensitive in demonstrating main effects (Grice
and Hunter, 1964). Blur, the more severe degradation, was
treated as a between-subject variable because of
experimental constraints and because only one of the
degradations could be treated within subjects in that
experiment. Unfortunately, the overall ANOVA for the
information extraction experiment failed to find a
significant main effect of Blur. Noise, the less impactful
parameter in this study as shown by the Blur x Noise
interaction (see Figure 6), was significant in the
information extraction study. This difference in
experimental designs doubtlessly had an effect on the
association between scale values and information extraction
performance.
On the other hand, averaging across Scenes and PIs to
obtain overall sample means for the 5 Noise x 3 Blur
combinations yielded a very satisfactory correlation of
r = .898 between NATO scale value and information extraction
performance. Thus, while individual scene/PI combinations
cannot be predicted very accurately, overall group
performance can. And, after all, the main objective in most
image interpretation research is to predict the performance
of the typical, not the individual, PI.
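
The effect of averaging on the size of a correlation can be illustrated with a small simulation. The Python sketch below is hypothetical throughout: the number of observations per cell, the noise level, and the linear "underlying quality" are assumptions chosen for illustration, not values from this study. It shows only that a correlation computed on cell means tends to exceed one computed on individual observations when scene and PI variability is large.

```python
import math
import random

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)    # fixed seed so the sketch is repeatable
N_CELLS = 15      # 5 Noise x 3 Blur combinations
N_OBS = 40        # hypothetical scene/PI observations per cell
NOISE_SD = 6.0    # hypothetical scene/PI variability

scale_all, perf_all = [], []
scale_means, perf_means = [], []
for cell in range(N_CELLS):
    quality = float(cell)  # hypothetical underlying quality of the cell
    scale = [quality + random.gauss(0.0, NOISE_SD) for _ in range(N_OBS)]
    perf = [quality + random.gauss(0.0, NOISE_SD) for _ in range(N_OBS)]
    scale_all.extend(scale)
    perf_all.extend(perf)
    scale_means.append(sum(scale) / N_OBS)
    perf_means.append(sum(perf) / N_OBS)

r_individual = pearson(scale_all, perf_all)
r_means = pearson(scale_means, perf_means)
print(f"individual observations: r = {r_individual:.3f}")
print(f"cell means:              r = {r_means:.3f}")
```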
DIGITAL IMAGERY SIMILARITIES
A final point should be made concerning a comparison
between analog and digital modes of imagery presentation.
From the noise studies reviewed, recall that the middle
subjective scale value corresponded to a signal-to-noise
ratio of about 30 dB (or 32:1). From an inspection of
Figure 3, it can be seen that a scale value of five
(midpoint of NATO scale) corresponds to a signal-to-noise
ratio of 42:1, or about 32 dB.
The average signal-to-echo ratio at the middle scale
value in the studies reviewed was 20 dB (or 10:1). In
Figure 2, the NATO middle scale value of five corresponds to
115 µm of blur or a signal-to-echo ratio of about 17:1
(signal = 2000) or 25 dB. For both noise and blur, the
obtained ratios, "signal-to-degradation", at the middle
scale value were about the same as those reported in the
analog imagery literature.
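
The ratio and decibel figures quoted above can be checked with a short calculation. The Python sketch below assumes the amplitude convention, dB = 20 log10(ratio), which is inferred from the report's pairing of 10:1 with 20 dB; the function name is illustrative.

```python
import math

def ratio_to_db(ratio):
    """Convert an amplitude ratio to decibels using 20 * log10(ratio)."""
    return 20.0 * math.log10(ratio)

# Ratios quoted in the text for the middle scale value.
cases = [
    ("analog noise studies", 32.0),
    ("digital noise (Figure 3)", 42.0),
    ("analog echo studies", 10.0),
    ("digital blur (Figure 2)", 17.0),
]
for label, ratio in cases:
    print(f"{label}: {ratio:.0f}:1 = {ratio_to_db(ratio):.1f} dB")
```

Under this convention the four ratios round to 30, 32, 20, and 25 dB respectively, in agreement with the values given in the text.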
On the basis of this comparison, digital images would
appear to be very similar to analog images in terms of
subjective quality or interpretability. However, due to
myriad differences in the experimental procedures, past and
present (e.g., electronic versus photographic presentation,
dynamic versus static noise), these conclusions can only be
viewed as cursory pending further research.
VII. CONCLUSIONS
The main effects of Blur and Noise and the Blur x Noise
interaction showed that digital images, like analog images,
are poorer in subjective quality as the degree of Blur or
Noise increases. The Blur x Noise interaction indicated
that Blur was the more serious degradation at the levels
investigated.
Scene content currently unfamiliar to PIs does not appear
less interpretable than scenes commonly exploited by the Air
Force. In general, PIs seem to evaluate the information
available in an image in an objective manner, regardless of
practice or bias.
The optimal number of response categories for an image
interpretability scale is greater than 10. Because well-
practiced PIs are so attuned to resolution differences in
photographic imagery, as many as 62 categories may be needed
to accurately judge interpretability. The scale of 100
points used in this study provided useful information.
MDS can be used to represent the subjective
dimensionality of a large imagery database. The resulting
spatial configuration can then be used to determine the
physical parameters of the imagery that underlie the
perception of interpretability. However, the attempt to
predict information extraction performance from MDS
projections of subjective data failed. If MDS analysis is
to be used as a predictive response surface, the projections
must be related, in a meaningful way, to performance.
Finally, mean scale scores correlated quite well with
information extraction performance in spite of differences
in experimental design. The correlation proved higher when
variance due to scene content and PIs was eliminated by
averaging the data.
REFERENCES
Allnatt, J. W. and Prosser, R. D. Subjective quality of television pictures impaired by long-delayed echoes. Proceedings of the IEE, 1965, 112, 487-49[illegible].
Alluisi, E. A. Conditions affecting the amount of information in absolute judgments. Psychological Review, 1957, 64, 97-103.
Attneave, F. Applications of information theory to psychology: A summary of basic concepts, methods, and results. New York: Holt, Rinehart, and Winston, [year illegible].
:IA r i , A. .1 i n 1, h t , J. 11 7 P I n.'AS sI 1 Ij I BRa I, rw t
In[c. 1976.
Re low , F ., Hu2 rtaa-Seni F ., F1 i t Z", a E, .o )(-'C) jec iP (e di 'm:r- a) n j of' f ec t of [ , e I Si
nc n r es . ERtJ Rev iew, Pir)63, 78, 41)
Brainard, R. W., Sadacca, R., Lopez, L. J., and Ornstein, G. N. Development and evaluation of a catalog technique for measuring image quality. Washington, D.C.: U.S. Army Personnel Research Office, Technical Research Report 1150, AD 645-644, August, 19[illegible].
Burke, J. J. and Snyder, H. L. Quality metrics of digitally derived imagery and their relation to interpreter performance: I. [illegible] judged quality of hard copy imagery. Paper presented at the Society of Photographic Scientists and Engineers Meeting, Tucson, Arizona, January, 1981.
Burke, J. J. and Strickland, R. N. Quality metrics of digitally derived imagery and their relation to interpreter performance: I. Preparation of a large-scale database. Technical Report HFL 81-1, Virginia Polytechnic Institute and State University, Blacksburg, VA, 1982.
Cavanaugh, J. R. and Lessman, A. M. Subjective effects of differential gain and differential phase distortions in NTSC color television pictures. Journal of the SMPTE, 1971, 80, 614-619.
Erikson, C. W. and Hake, H. W. Multidimensional differences and accuracy of discrimination. Journal of Experimental Psychology, 1955, 50, 153-160.
Geddes, W. K. E. The relative impairment produced by random noise in 405-line and 625-line television pictures. EBU Review, 1963, 78, 46.
Grice, G. R. and Hunter, J. J. Stimulus intensity effects depend on the type of experimental design. Psychological Review, 1964, 71, 247-256.
Guilford, J. P. Psychometric methods. New York: McGraw-Hill, 1954.
Humes, J. M. and Bauerschmidt, D. K. Low light level TV viewfinder simulation program: Phase B. Wright-Patterson Air Force Base, Ohio: Aerospace Medical Research Laboratory, Technical Report AFA-TR-[illegible], November, 1968.
Klingberg, C. L., Elworth, C. L., and Filleau, C. [illegible]. Image quality and detection performance of military interpreters. The Boeing Company, Final Report of USAFOSR Contract No. F44620-9-C-0128, April, 19[illegible].
Lessman, A. M. The subjective effects of echoes in 525-line monochrome and NTSC color television and the resulting echo time weighting. Journal of the SMPTE, 1972, 81, 907-916.
Marmolin, H. and Nyberg, S. Multidimensional scaling of subjective image quality. Forsvarets forskningsanstalt, Huvudenhet 3, FOA Report no. C 30039-H9, Stockholm, November, 1978.
Miller, G. A. The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 1956, 63, 81-97.
Muller, P. F., Sidorsky, R. C., Slivinske, A. J., Alluisi, E. A., and Fitts, P. M. The symbolic coding of information on cathode ray tubes and similar displays. USAF, WADC Technical Report No. 51-371, 195[illegible].
Newell, G. F. and Geddes, W. K. E. Visibility of small luminance perturbations in television displays. Proceedings of the IEE, 1963, [volume and pages illegible].
Prosser, R. D. and Allnatt, J. W. Subjective quality of television pictures impaired by random noise. Proceedings of the IEE, 1965, 112, [pages illegible].
Prosser, R. D., Allnatt, J. W., and Lewis, N. W. Quality grading of impaired television pictures. Proceedings of the IEE, 1964, 111, 491-502.
Sadacca, R. and Schwartz, A. I. Psychophysical aspects of image quality--exploratory study. Washington, D.C.: U.S. Army Personnel Research Office, Technical Research Note 136, September, 1963.
Snyder, H. L., Turpin, J. A., and Maddox, M. E. Quality metrics of digitally derived imagery and their relation to interpreter performance: II. Hard-Copy Digital Imagery Interpretability. Technical Report 81-3, Virginia Polytechnic Institute and State University, Blacksburg, VA, 1981.
Stevens, S. S. Psychophysics. New York: Wiley and Sons, 1975.
Torgerson, W. Theory and methods of scaling. New York: Wiley and Sons, 1958.
Weaver, L. E. Subjective impairment of television pictures. Electronic and Radio Engineering, 1959, 36, 170.
Weaver, L. E. The quality rating of color television pictures. Journal of the SMPTE, 1968, 77, 610-612.
Appendix A
THE NATO SCALE -- [subtitle illegible]
Ritinci Cate-j-orv
j,;e I e F :-, f r. 'c i )r i n t
*e t en at dr r~ y.~
D e t tr o Iri es r n -r
D e t :'r - r y a~ -r iT1 IDe t t :, r -; ) . an 1 r- I I
; I~ I r a -7t d I r pP c r i r' - e, r t
*~~~ r- - r j~ - I
j a dim i n i . t r 'I J 1- 1: . 7 K- (I
; rl e ~ui l l 2, I
jt:eCt i re (I fa 7ti: --I C i t- f 17Ll r-ia irc r - t * y t y,)e, ,t i ; o: A
Recognize ports and harbors (including [illegible] drydocks).
Rating Category 3
Detect communications equipment (radio/radar).
Detect supply dumps (POL/ordnance).
Detect and count accurately all straight-wing aircraft, all
  swept-wing aircraft, and all delta-wing aircraft.
Detect command and control headquarters.
Detect surface-to-surface and surface-to-air missile sites
  (including vehicles and other pieces of equipment).
Detect land minefields.
Recognize bridges.
Recognize surface ships (distinguish between a cruiser and
  a destroyer by relative size and hull shape).
Recognize coast and landing beaches.
Recognize railroad yards and shops.
Identify airfield facilities.
Identify urban areas.
Identify terrain.
Rating Category 4
Recognize troop units.
Recognize aircraft (such as FAGOT/MIDGET when singly
  deployed).
Recognize missile sites (SSM/SAM).
Distinguish between missile types by the presence and
  relative position of wings and control fins.
Recognize nuclear weapons components.
Recognize land minefields.
Identify ports and harbors.
Identify railroad yards and shops.
Identify trucks at ground force installations as cargo,
  flatbed, or van.
Identify a KRESTA by the helicopter platform flush with the
  fantail, a KRESTA II by the raised helicopter platform
  (one deck level above fantail and flush with the main
  deck).
Rating Category 5
Detect the presence of call letters or alphabetical country
  designator on the wings of large commercial or cargo
  aircraft (where alphanumerics are three feet high or
  greater).
Recognize command and control headquarters.
Identify a singly deployed tank at a ground forces
  installation as light or medium/heavy.
Perform Technical Analysis (PTA) on airfield facilities.
PTA on urban areas and terrain.
Rating Category 6
Recognize radio/radar equipment.
Recognize supply dumps (POL/ordnance).
Recognize rockets and artillery.
Identify bridges.
Identify troop units.
Identify coast and landing beaches.
Identify a FAGOT or MIDGET by canopy configuration when
  singly deployed.
Identify the ground force equipment: T-54/55 tank, BTR-50
  armored personnel carrier, 57 mm AA gun.
Identify by type, RBU installations (e.g., 2500 series),
  torpedo tubes (e.g., 21 inch/53.34 cm), and
  surface-to-air missile launchers on a KANIN DDG,
  KRIVAC DDGSP, or KRESTA II.
Identify a ROMEO-class submarine by the presence of the
  cowling for the snorkel induction and the snorkel
  exhaust.
Identify a WHISKEY-class submarine by the absence of the
  cowling and exhaust.
Rating Category 7
Identify radar equipment.
Identify major electronics by type on a KILDEN DDGS or
  KASHIN DLG.
Identify command and control headquarters.
Identify nuclear weapons components.
Identify land minefields.
Identify the general configuration of an SSBN/SSGN submarine
  sail, to include relative placement of bridge
  periscope(s) and main electronics/navigation equipment.
PTA on ports, harbors, and roads.
PTA on railroad yards and shops.
Rating Category 8
Identify supply dumps (POL/ordnance).
Identify rockets and artillery.
Identify aircraft.
Identify missile sites (SSM/SAM).
Identify surface ships.
Identify vehicles.
Identify surfaced submarines (including components such as
  ECHO II SSGN sail missile launcher elevator guide and
  major electronics/navigation equipment by type).
Identify, on a KRESTA II, the configuration of the major
  components of larger electronics equipment and smaller
  electronics by type.
Identify limbs (arms, legs) on an individual.
PTA on bridges.
PTA on troop units.
PTA on coast and landing beaches.
Rating Category 9
Identify in detail the configuration of a D-30 howitzer
  muzzle brake.
Identify in detail on a KILDEN DDGS the configuration of
  torpedo tubes and AA gun mountings (including gun
  details).
Identify in detail the configuration of an ECHO II SSGN
  sail including detailed configuration of electronics
  communications equipment and navigation equipment.
PTA on radio/radar equipment.
PTA on supply dumps (POL/ordnance).
PTA on missile sites.
PTA on nuclear weapons components.
Appendix B
SCENE MCT RESULTS
A connecting line indicates no differences among means, as
determined by the Newman-Keuls multiple comparisons test, at
the p < .05 level of confidence.
ORDERED SCENES
8 1 6 9 5 2 3 10 4 7
Appendix C
BLUR X NOISE MCT RESULTS
A connecting line indicates no difference among means, as
determined by the Newman-Keuls multiple comparison test, at
the p < .05 level of confidence.
40 µm BLUR
ORDERED SIGNAL-TO-NOISE RATIOS
12 24 42 60 75
52 µm BLUR
ORDERED SIGNAL-TO-NOISE RATIOS
12 24 42 60 75
84 µm BLUR
ORDERED SIGNAL-TO-NOISE RATIOS
12 24 42 60 75
162 µm BLUR
ORDERED SIGNAL-TO-NOISE RATIOS
12 24 42 60 75
322 µm BLUR
ORDERED SIGNAL-TO-NOISE RATIOS
12 24 42 60 75
Appendix D
SCENE MCT RESULTS -- BLUR LEVEL 40
A connecting line indicates no difference among means, as
determined by the Newman-Keuls multiple comparisons test, at
the p < .05 level of confidence.
ORDERED SCENES
6 1 8 5 2 4 9 10 3 7
Appendix E
SCENE MCT RESULTS -- BLUR LEVEL 52
A connecting line indicates no difference among means, as
determined by the Newman-Keuls multiple comparisons test, at
the p < .05 level of confidence.
ORDERED SCENES
8 6 1 5 9 2 3 4 10 7
Appendix F
SCENE MCT RESULTS -- BLUR LEVEL 84
A connecting line indicates no difference among means, as
determined by the Newman-Keuls multiple comparisons test, at
the p < .05 level of confidence.
ORDERED SCENES
8 1 5 6 9 10 2 3 4 7
Appendix G
SCENE MCT RESULTS -- BLUR LEVEL 162
A connecting line indicates no difference among means, as
determined by the Newman-Keuls multiple comparisons test, at
the p < .05 level of confidence.
ORDERED SCENES
8 9 1 6 2 5 3 4 10 7
Appendix H
SCENE MCT RESULTS -- BLUR LEVEL 322
A connecting line indicates no difference among means, as
indicated by the Newman-Keuls multiple comparisons test, at
the p < .05 level of confidence.
ORDERED SCENES
8 9 3 6 5 2 1 10 4 7
Appendix I
PROJECTIONS OF MDS ANALYSIS
Tabled values are the projections on [illegible]
dimensions calculated for each image [illegible]
parameters, Scene, Noise, and Blur.
SCENE  NOISE  BLUR  IMAGE  PROJECTIONS
5      33     52    1      [illegible]
5      33     84    2      -0.795 [illegible]
5      33     162   3      -0.443 [illegible]
5      33     322   4      0.769 [illegible]
5      48     52    5      -0.825 [illegible]
5      48     84    6      -0.852 [illegible]
5      48     162   7      0.052 [illegible]
5      48     322   8      0.907 [illegible]
5      83     52    9      -0.848 [illegible]
5      83     84    10     -0.722 [illegible]
5      83     162   11     0.210 [illegible]
5      83     322   12     0.918 [illegible]
5      167    52    13     -0.567 -0.591 [illegible]
5      167    84    14     0.196 -0.803 [illegible]
5      167    162   15     0.593 -0.489 [illegible]
5      167    322   16     0.859 0.495 [illegible]
8      33     52    17     -0.722 0.715 [illegible]
8      33     84    18     -0.838 0.090 [illegible]
8      33     162   19     0.841 -0.057 [illegible]
8      33     322   20     0.704 [illegible]
8      48     52    21     -0.839 0.092 [illegible]
8      48     84    22     -0.577 -0.571 [illegible]
8      48     162   23     0.900 [illegible]
8      48     322   24     0.724 [illegible]
8      83     52    25     -0.764 [illegible]
8      83     84    26     -0.239 [illegible]
8      83     162   27     0.915 [illegible]
8      83     322   28     0.740 [illegible]
8      167    52    29     0.280 [illegible]
8      167    84    30     0.499 [illegible]
8      167    162   31     [illegible]
8      167    322   32     0.534 [illegible]
10     33     52    33     [illegible]
10     33     84    34     -0.768 [illegible]
10     33     162   35     -0.644 [illegible]
10     33     322   36     0.902 0.130 [illegible]
10     48     52    37     -0.510 0.745 [illegible]
10     48     84    38     -0.772 0.667 [illegible]
10     48     162   39     -0.151 [illegible]
10     48     322   40     0.880 0.056 [illegible]
10     83     52    41     -0.544 0.756 [illegible]
10     83     84    42     -0.827 0.027 [illegible]
10     83     162   43     0.021 -0.849 [illegible]
10     83     322   44     0.920 [illegible]
10     167    52    45     -0.824 [illegible]
10     167    84    46     -0.510 [illegible]
10     167    162   47     0.237 -0.783 [illegible]
10     167    322   48     0.889 0.121 [illegible]