NIR Clinical Dimensions


  • 7/29/2019 NIR Clinical Dimensions


CLIN. CHEM. 38/9, 1623-1631 (1992)

CLINICAL CHEMISTRY, Vol. 38, No. 9, 1992 1623

Near-Infrared Spectrophotometry: A New Dimension in Clinical Chemistry

Jeffrey W. Hall2 and Alan Pollard3

The near-infrared (NIR) spectral region (700-2500 nm) is a fertile source of chemical information in the form of overtone and combination bands of the fundamental infrared absorptions and low-energy electronic transitions. This region was initially perceived as being too complex for interpretation and consequently was poorly utilized. Advances in chemometric techniques that can extract massive amounts of chemical information from the highly overlapped, complex spectra have led to extensive use of NIR spectrophotometry (NIRS) in the food, agriculture, pharmaceutical, chemical, and polymer industries. The application of NIRS in clinical laboratory measurements is still in its infancy. NIRS is a simple, quick, nondestructive technique capable of providing clinically relevant analyses of biological samples with precision and accuracy comparable with the method used to derive the NIRS models. Analyses can be performed with little or no sample preparation and no reagents. The success of NIRS in any particular case is determined by the complexity of the sample matrix, relative NIR absorptivities of the constituents, and the wavelengths and regression technique chosen. We describe the general approach to data acquisition, calibration, and analysis, using serum proteins, triglycerides, and glucose as examples.

Additional Keyphrases: chemometric techniques · noninvasive technology

For clinical laboratory measurements, an analytical technique must meet certain criteria (1). It should provide valuable clinical information and be accurate and precise over the range of values encountered. Inexpensive disposables (or no disposables) and rapid analysis times are desirable. For routine operation, the method should be reliable, rugged, and easily automated.

Near-infrared spectrophotometry (NIRS) has the potential to satisfy many of these criteria. It needs no reagents, is rapid and nondestructive, and is suitable for matrices that are strongly light scattering, complex in composition, and inhomogeneous.4 NIRS can provide accuracy and precision comparable with those of primary reference methods and is simple and reliable in routine implementation.

1 Department of Clinical Biochemistry, Mount Sinai Hospital, 600 University Avenue, Toronto, Ontario, Canada M5G 1X5.
2 yma Inc., 127 Farquharson Building, York University Campus, 4700 Keele Street, Toronto, Ontario, Canada M3J 1P3.
3 Department of Clinical Biochemistry, Banting Institute, University of Toronto, Toronto, Ontario, Canada M5G 1L5.
4 Nonstandard abbreviations: NIRS, near-infrared spectrophotometry; NIR, near-infrared; IR, infrared; PLS, partial least-squares; SEC, standard error of calibration; RMSCV, root-mean-square error of cross-validation; and SEP, standard error of prediction.
Received May 8, 1992; accepted June 10, 1992.

The near-infrared (NIR) region of the electromagnetic spectrum extends from the end of the visible spectrum, at a wavelength of ~700 nm, to the beginning of the fundamental infrared (IR) absorption bands at 2500 nm. Absorptions occurring in the NIR are most often associated with the overtone and combination bands of the fundamental molecular vibrations of -OH, -NH, and -CH functional groups that are seen in the mid-IR region. As a result, most biochemical species will exhibit unique absorptions in the NIR. In addition, a few weak electronic transitions of organometallic molecules, such as hemoglobin, myoglobin, and cytochrome, also appear in the NIR. These highly overlapping, weakly absorbing bands were initially perceived to be too complex for interpretation and too weak for practical application. However, improvements in instrumentation and advances in multivariate chemometric data analysis techniques, which can extract vast amounts of chemical information from NIR spectra, allow meaningful results to be obtained from the complex spectrum.

Traditionally, NIRS has been used to estimate the nutrient content of agricultural commodities. More recently, NIRS has become widely applied in the food processing, chemical, pulp and paper, pharmaceutical, polymer, and petrochemical industries.

Direct spectroscopic measurements of unmodified body fluids with the more traditional ultraviolet, visible, and IR regions of the spectrum are impractical because of limited penetration depths, interfering absorptions, and excessive scattering with inhomogeneous samples. In contrast, the weak absorption of NIR radiation by most biochemical species makes NIRS useful because body fluids and soft tissues are relatively transparent at these wavelengths. Medical researchers began investigating the NIR spectral properties of skin in the 1950s (2). Other clinical applications described include transcutaneous measurements for estimating body composition (3), blood and tissue oxygenation (4, 5), brain oxygen sufficiency (6), and breast cancer screening (7). In addition, NIRS in vitro analyses for total protein, albumin, glucose, cholesterol, triglycerides, and other lipids in serum (8-10) and in model systems (11, 12) have been reported. NIRS has also been used to estimate fecal fat content (13). Despite these publications there are very few, if any, applications in common use in clinical laboratories.

Here we describe the principles involved in developing robust NIRS algorithms, using serum proteins, triglycerides, and glucose as examples.


Materials and Methods

Data Acquisition and Analysis

Serum and plasma specimens were selected from routine clinical laboratory samples that had been analyzed on the Kodak (Rochester, NY) Ektachem 700. Samples with analyte concentrations that were distributed over the typical clinical range were chosen and allowed to reach ambient temperature before spectroscopic analysis.

Calibration and validation sample sets were randomly selected. For serum protein analysis the calibration data set contained 235 samples and the validation set contained 85 samples. For serum triglycerides analysis the calibration data set contained 339 samples and the validation set contained 150 samples.

A serum sample (0.5 mL) was placed into a quartz cuvette with a pathlength of 0.5 mm. The diffuse transmittance spectrum from 400 to 2500 nm was collected by using an NIRSystems Inc. (Silver Spring, MD) Model 6500 research spectrophotometer. The spectral bandpass was 10 nm with spectral collection at 2-nm intervals. Thirty-two co-added scans were obtained for each sample and 32 co-added scans of air were used for the reference. Acquisition of a single scan takes 0.9 s. Samples were scanned without temperature control. The instrument uses a silicon detector in the 400-1100 nm region and a lead sulfide detector in the 1100-2500 nm region.

Urea, human serum albumin, and γ-globulins were obtained from Sigma Chemical Co. (St. Louis, MO). Solutions of each in phosphate-buffered saline were dried onto filter paper and scanned in the reflectance mode. The spectral signature of the dried extracts is more informative than that of the original materials because of the influence of matrix effects (i.e., pH or ionic strength) on the NIR spectrum.

Data collection, spectral mathematical treatments, multiple and linear least-squares regression analysis, and partial least-squares (PLS) regression analysis were performed with the Near-Infrared Spectral Analysis Software (NSAS) furnished with the instrument. Regression analyses were performed on the second-derivative of the absorbance data in all cases. The derivative was determined by summing adjacent 20-nm spectral segments and calculating the difference between neighboring spectral sums.
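
The segment-based derivative described above can be sketched in a few lines. This is a minimal illustration, not the NSAS implementation, which the text does not specify. It assumes evenly spaced data (2-nm steps), no gap between segments, and a second derivative obtained by applying the difference of neighboring segment sums twice. The function name, its parameters, and the scaling are assumptions introduced here.

```python
import numpy as np

def segment_second_derivative(absorbance, step_nm=2.0, segment_nm=20.0):
    """Approximate d2A/dlambda2 from sums over three adjacent segments.

    Hypothetical helper: names and scaling are illustrative assumptions.
    """
    a = np.asarray(absorbance, dtype=float)
    n = int(round(segment_nm / step_nm))      # points per segment (10 for 20 nm at 2 nm)
    # Sum of every window of n consecutive points, via a cumulative sum.
    csum = np.concatenate(([0.0], np.cumsum(a)))
    seg = csum[n:] - csum[:-n]                # seg[i] == a[i:i + n].sum()
    # Difference of neighbouring segment sums, applied twice:
    # (right - centre) - (centre - left) = left - 2 * centre + right.
    d2 = seg[:-2 * n] - 2.0 * seg[n:-n] + seg[2 * n:]
    # Scale so that a pure quadratic returns its true second derivative.
    # d2[i] is centred on the middle segment, which starts at sample i + n.
    return d2 / (n * (n * step_nm) ** 2)
```

Because the result is built from differences of sums, a constant offset and a linear baseline slope cancel, which is the baseline compensation the derivative is used for.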

Results and Discussion

NIR Spectra

Water is present in most biological specimens. Because water absorbs very strongly in the NIR, it can create significant background problems in the NIRS analysis of biological fluids. To reduce the high water absorbance, we used a very short optical pathlength of 0.5 mm.

The absorption spectra of representative serum specimens in the 400-2500 nm region are illustrated in Figure 1A. The dominant absorption bands situated at 1450 and 1940 nm are the first overtone and combination bands of water. Absorptions in the visible region due to bilirubin and hemoglobin are also seen. However, it is difficult to discern differences in the NIR spectral regions related to other analyte variations. The baseline fluctuations are due to changes in the scattering properties of the individual samples and to a lesser extent to subtle differences in sample-cell positioning, both of which are potential sources of error if left uncorrected.

Fig. 1. Visible (400-700 nm) and NIR (700-2500 nm) absorbance spectra of serum specimens (A) and corresponding second-derivative absorbance spectra (B). Spectra are shown for clear, hemolyzed, lipemic, and jaundiced serum; both panels are plotted against wavelength (nm).

Calculation of the second-derivative of absorbance with respect to wavelength enhances spectral resolution and compensates for baseline shifts with no apparent degradation in the analytical result (14-16). Derivative techniques have also been used to advantage in other instrumental methods of analysis, including polarography, differential scanning calorimetry, ultraviolet-visible spectroscopy, and IR and fluorescence spectroscopy (15-18). The corresponding second-derivative absorbance spectra of the representative sera are plotted in Figure 1B. Absorption maxima are now minima and each negative band is flanked by a positive satellite lobe on either side. Bandwidth is sharply reduced, allowing resolution of overlapping peaks, and baseline differences are largely eliminated.

The overwhelming absorptions due to water still make structural features related to other constituents less than obvious. For clarity, the visible, first overtone, and combination band spectral regions have been expanded in Figure 2A-C. Subtle spectral variations associated with biochemical differences now become apparent in the magnified spectral regions. For example, changes in lipid concentrations are reflected in the -CH combination band region (Figure 2C) at 2300 nm and the -CH first overtone region (Figure 2B) at 1700 nm. Protein variations are easily discernible in the -NH combination band region at 2060 nm and 2180 nm. The reduction in baseline differences between the lipemic and clear serum sample is more obvious in the visible region (Figure 2A).

Calibration

NIR spectra of unmodified biological samples are affected by matrix variations, interfering substances, and matrix effects such as hydrogen bonding. Therefore the traditional approach to calibration of analyzing primary standards in simple solutions is not applicable. Instead, calibration must be performed by using a calibration set of actual samples that have been analyzed for the constituents of interest by independent methods. It is necessary to ensure that the calibration set of samples adequately reflects the variation in the sample population to be encountered with respect to the range of values of the analytes, matrix variations, and interfering substances. Because NIRS is a secondary calibration technique, all errors in the reference method will be included in the NIRS determination. Therefore, it is important to select a reference method that is as accurate and precise as possible.

The development of an NIRS algorithm can be viewed as a two-step procedure. In the calibration or first stage, the regression model that best relates the NIR spectral data to the reference analyte data is determined. This resulting calibration algorithm is then validated in a second stage by using an independent validation set of samples to ensure that an overfitted solution has not been obtained and to establish robustness.

Univariate regression analysis. Beer's Law is the basic assumption behind quantitative NIRS. Equation 1

$$A = \varepsilon b c \tag{1}$$

states that in the absence of scattering variations and overlapping absorptions, the absorbance A of light changes linearly with analyte concentration c and pathlength b; ε is the molecular absorptivity unique to a particular absorbing molecule. Likewise in the derivative domain, because the derivative of the absorbance with respect to wavelength is a linear operation and derivative techniques can minimize the nonlinear effects of scattering, agreement with Beer's Law may still be assumed.
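
The linearity argument can be written out explicitly. As a short sketch, assume that b and c do not depend on the wavelength λ and that scattering adds only a baseline of the form α + βλ. This baseline model and the symbols α and β are assumptions introduced here for illustration; they do not appear in the original.

$$A(\lambda) = \varepsilon(\lambda)\, b\, c + \alpha + \beta\lambda \quad\Longrightarrow\quad \frac{d^{2}A}{d\lambda^{2}} = \frac{d^{2}\varepsilon}{d\lambda^{2}}\, b\, c$$

Under these assumptions the second derivative stays proportional to c, while the constant offset and the linear baseline slope drop out.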

Fig. 2. Expanded visible second-derivative absorbance spectra of sera (A) and expanded NIR second-derivative absorbance spectra of sera: overtone region (B) and combination band region (C). Spectra are shown for clear, hemolyzed, lipemic, and jaundiced serum; all panels are plotted against wavelength (nm).

Rearranging equation 1 for concentration yields equation 2

$$c = K\,A(\lambda) \tag{2}$$

which is the simplest form of Beer's Law. This states that, in the absence of error, the concentration of a molecular species is related to the absorbance A(λ) by the sample at wavelength λ. K is a proportionality constant inversely related to both the molecular absorptivity of the analyte and the optical pathlength.

K is determined by measuring A(λ) for a series of samples of known analyte composition. In practice, the complex, highly overlapping absorptions occurring in the NIR require more sophisticated Beer's Law expressions, often involving multiple terms to characterize the relationship between NIR absorption and analyte concentration. Analytical error in both the NIRS technique and the reference method further complicates the analysis. As a result, the derivation of a Beer's Law algorithm is not usually performed by solving simultaneous equations but involves statistical regression techniques.

By use of univariate least-squares regression analysis, the reference analytical data for each sample are regressed against the NIR second-derivative absorbance spectrum, providing a single-wavelength regression equation at each spectral data point of the general form

$$c = K(0) + K(1)\,A(\lambda_1) \tag{3}$$

where c is the analyte concentration, K(0) is the y-intercept, K(1) is the slope, and A(λ1) is the absorbance at wavelength λ1.

Fig. 3. Correlation plot and correlation spectrum for a linear least-squares NIRS determination of human serum albumin, plotted against wavelength (nm).

Each derived algorithm is assessed by considering the correlation plot and correlation spectrum (Figure 3). A correlation plot is a plot of the correlation coefficient derived from the data (using equation 3) at each wavelength, plotted against wavelength. A broad correlation band is less likely to be affected by spectral variations associated with physical and chemical changes and will provide a more reliable region in which to perform the assay (19). The correlation spectrum is obtained by multiplying the correlation coefficient at each wavelength by its corresponding average absorbance value of the calibration data set. The correlation spectrum provides a reconstructed spectrum of the analyte in the matrix under consideration. In the correlation spectrum and correlation plot, high correlations and absorption bands should correspond to spectral regions where the analyte is known to absorb. Peaks in the correlation spectrum and large coefficients in the correlation plot that appear not to correspond with an analyte absorption band are normally rejected in the initial analysis.

Fig. 4. NIR second-derivative reflectance spectra of human serum albumin, globulin, and urea in the 2034-2370 nm spectral region.

The correlation spectrum and correlation plot for the NIRS determination of human serum albumin are presented in Figure 3. Several promising regions for determining albumin in serum can be seen. Comparison of these plots with the second-derivative reflectance spectrum of a dried sample of human serum albumin (Figure 4) directs us to the 2180 nm band for the albumin determination. With 2178 nm as the analytical wavelength, a correlation coefficient (r) of -0.96 and a corresponding standard error of calibration (SEC) of 1.8 g/L are obtained. The correlation coefficient reflects how much of the variation in the reference data is explained by the linear least-squares model (the explained proportion is r squared), and its magnitude should be as close to 1 as possible. The SEC is the standard deviation of the residuals about the regression line and should be similar in value to the reference standard error of measurement. One standard error (S) for the Kodak Ektachem vs the fundamental reference method for albumin is 1.3 g/L (20).

A scatter plot of the NIRS calculated serum albumin concentration vs the reference concentration for the calibration data set is illustrated in Figure 5A. The significant scatter about the regression line, also indicated by the high SEC, suggests that additional factors in the NIR spectrum are influencing the spectrophotometric estimation of albumin in human serum. In fact, the samples possessing the most extreme residuals tended to have extreme concentrations of either globulins, urea, or both. A closer look at Figure 4, which compares the spectrum of albumin with those of globulins and of urea, suggests that it is necessary and possible to correct for the spectroscopic interference of urea and globulins on the NIRS albumin assay.

Fig. 5. NIRS calculated vs Kodak Ektachem values for human serum albumin for a single-wavelength (2178 nm) linear least-squares calibration model (A: r = 0.96, SEC = 1.8 g/L) and for a two-wavelength (2178 + 2206 nm) multiple linear least-squares calibration model (B: R = 0.98, SEC = 1.2 g/L). The x-axis is Kodak Ektachem serum albumin (g/L).
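
The single-wavelength screening described in this section can be sketched as follows. The code is an illustration under stated assumptions, not the NSAS procedure. The function names are hypothetical, the spectra are taken to be a samples-by-wavelengths array of second-derivative absorbances, and the SEC uses n - 2 degrees of freedom, which the text does not specify.

```python
import numpy as np

def correlation_plot(spectra, reference):
    """Correlation coefficient r between reference values and absorbance, per wavelength."""
    x = np.asarray(spectra, dtype=float)       # shape (samples, wavelengths)
    y = np.asarray(reference, dtype=float)     # shape (samples,)
    xc = x - x.mean(axis=0)
    yc = y - y.mean()
    return (xc.T @ yc) / np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum())

def correlation_spectrum(spectra, reference):
    """r at each wavelength multiplied by the mean absorbance of the calibration set."""
    mean_absorbance = np.asarray(spectra, dtype=float).mean(axis=0)
    return correlation_plot(spectra, reference) * mean_absorbance

def fit_single_wavelength(spectra, reference, index):
    """Fit c = K(0) + K(1) * A(lambda_1) at one column; return K(0), K(1), SEC."""
    a = np.asarray(spectra, dtype=float)[:, index]
    y = np.asarray(reference, dtype=float)
    k1, k0 = np.polyfit(a, y, 1)               # polyfit returns slope first, then intercept
    residuals = y - (k0 + k1 * a)
    # Assumed definition: residual standard deviation with two fitted parameters.
    sec = np.sqrt((residuals ** 2).sum() / (len(y) - 2))
    return k0, k1, sec
```

In second-derivative data an absorption maximum becomes a minimum, so a strong analyte band shows up as r close to -1 rather than +1, as with the -0.96 reported for albumin at 2178 nm.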

Multivariate regression analysis. Multiple-term linear least-squares regression models involving more than one wavelength may be used to correct for interfering absorptions or scattering differences. The change in absorbance at one or more wavelengths is correlated with the reference values by using either a linear summation, a ratio of terms, or a combination of both. The primary analyte wavelength λ1 is held constant and appropriate correction wavelengths are selected and evaluated spectroscopically and chemometrically.

Interfering substances absorbing at the same wavelength as the analyte, whether identified or not, can be corrected for by incorporating secondary terms where the analyte and interferent absorb differently. A linear summation equation is derived of the general form

$$c = K(0) + K(1)\,A(\lambda_1) + K(2)\,A(\lambda_2) + \cdots + K(n)\,A(\lambda_n) \tag{4}$$

where K(0) is the y-intercept and K(1)-K(n) are the slope terms in n-dimensional space. The major contribution to the absorption at the primary wavelength is due to the analyte. Additional terms at secondary wavelengths λ2, λ3, ... λn are included where the absorption A(λ2), A(λ3), ... A(λn) is due mostly to interferents.

The use of linear summation is illustrated by the determination of human serum albumin. When a linear correction term is used at 2206 nm, where the absorption of NIR radiation is due primarily to globulins (see Figure 4), r increases from -0.96 to -0.98 and the SEC improves from 1.8 to 1.2 g/L. At 2206 nm, an isosbestic point exists for urea and albumin, whereas globulins exhibit a subtle negative absorption. In this example it appears that a single wavelength term can be used to correct for two spectral interferences, although not completely, because the dominant spectral interference is due to the globulins. A scatter plot of the NIRS calculated serum albumin concentrations vs the reference values (Figure 5B) shows a substantial reduction in the scatter of the data.

To minimize multiplicative and additive scatter effects that have not been compensated for by using second-derivative techniques, we may use the ratio of the absorbances at two wavelengths. For example, variations in pathlength due to differences in light scattering or in sample density can be compensated for by using a reference wavelength, λ2, at which the spectral data vary with pathlength reproducibly. The equation involving the ratio of absorbance values will have the general form

$$c = K(0) + K(1)\,\frac{A(\lambda_1)}{A(\lambda_2)} \tag{5}$$

where K(0) and K(1) are the y-intercept and slope of the regression line, respectively. A(λ1) is the absorbance datum at the primary or analyte wavelength λ1, and A(λ2) is the absorbance value at the reference wavelength λ2. The numerator responds to changes in analyte concentration and the denominator mimics the entire matrix and inherent spectral variations.

An example of a denominator being used as a normalization type of correction is illustrated by the NIRS calibration for total proteins in human serum. The primary wavelength for total protein is 2064 nm (i.e., a region where both albumin and globulins exhibit absorptions; Figure 4), which is situated on the shoulder of a strong, broad absorption band of water. A correlation coefficient of -0.98 and an SEC of 1.9 g/L were obtained by using the single-term algorithm. Changes in the water content as well as scattering differences due to lipids affect the total protein calibration. To compensate for these effects, the first overtone -OH water band at 1440 nm was used as the denominator term. This normalization technique provided an R (multiple r) of 0.99 and an SEC of 1.7 g/L. For total protein, 1 S is 1.9 g/L for the Kodak Ektachem vs the primary reference method (21).
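
Both multi-term forms, the linear summation and the two-wavelength ratio, reduce to ordinary least squares on a small design matrix. The sketch below is a hedged illustration: the function names are hypothetical, and the choice of `numpy.linalg.lstsq` is an assumption, since the text does not say how NSAS solves the regression.

```python
import numpy as np

def fit_linear_summation(absorbances, reference):
    """Fit c = K(0) + K(1)*A(l1) + ... + K(n)*A(ln) by least squares."""
    a = np.asarray(absorbances, dtype=float)          # shape (samples, n_wavelengths)
    y = np.asarray(reference, dtype=float)
    design = np.column_stack([np.ones(len(y)), a])    # leading column of ones gives K(0)
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs                                     # [K(0), K(1), ..., K(n)]

def fit_ratio(a_primary, a_reference, reference):
    """Fit c = K(0) + K(1) * A(l1) / A(l2) using the same least-squares routine."""
    ratio = np.asarray(a_primary, dtype=float) / np.asarray(a_reference, dtype=float)
    k0, k1 = fit_linear_summation(ratio[:, None], reference)
    return k0, k1
```

In the ratio form, any factor that scales both wavelengths equally, such as an effective pathlength change, cancels before the regression is fitted, which is the normalization effect described for the 1440 nm water band.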

As the chemical complexity of the matrix increases, it becomes very difficult to isolate unique, nonoverlapping absorption bands for each constituent. Additionally, in biological matrices the analyte is at low concentrations in the presence of a strongly absorbing water background and may be influenced by temperature and by chemical surroundings, for example, by hydrogen bonding. In these circumstances, a Beer's Law approach does not provide accurate quantitation.

To extract more fully the meaningful information from the spectrum, a multivariate PLS regression approach is often used. PLS uses all the spectral information that is relevant to the quantitation of the analyte. This provides fundamental advantages over traditional approaches (22). Because PLS uses each wavelength, the technique encompasses all the signal-to-noise ratio of the spectrophotometer and can also minimize aberrant effects of unknown sources of spectral variation. This approach is able to extract chemical information more efficiently from a complex spectrum and is better at measuring analytes of low concentration because it is less affected by noise than multiple linear least-squares regression models. Thus, for minor constituents, PLS should provide superior results.

In PLS, the calibration model will have the general form

$$A = TP + E \tag{6}$$

where matrix A contains the NIR spectral data, matrix T contains the scores or weights of each loading for each sample, matrix P contains the vectors or loadings that describe relevant spectral information, and matrix E contains the spectral residuals not explained by the model (22, 23).
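
Equation 6 can be illustrated with a small PLS1 routine in the NIPALS style (one analyte at a time). This is a sketch under stated assumptions, not the algorithm of reference 23 or of NSAS. Spectra and reference values are mean-centered before decomposition, so the identity that holds is (A minus its column means) = TP + E. All function and key names are hypothetical.

```python
import numpy as np

def pls1_fit(spectra, reference, n_factors):
    """Extract PLS1 factors one at a time; return scores T, loadings P, residuals E."""
    x = np.asarray(spectra, dtype=float)
    y = np.asarray(reference, dtype=float)
    x_mean, y_mean = x.mean(axis=0), y.mean()
    e, f = x - x_mean, y - y_mean            # spectral and concentration residuals
    weights, scores, loadings, inner = [], [], [], []
    for _ in range(n_factors):
        w = e.T @ f                          # direction most covariant with the analyte
        w /= np.linalg.norm(w)
        t = e @ w                            # score of each sample on this factor
        p = e.T @ t / (t @ t)                # spectral loading
        q = f @ t / (t @ t)                  # concentration loading
        e = e - np.outer(t, p)               # deflate the spectra
        f = f - q * t                        # deflate the reference values
        weights.append(w); scores.append(t); loadings.append(p); inner.append(q)
    w_mat = np.array(weights).T              # (wavelengths, factors)
    p_mat = np.array(loadings)               # (factors, wavelengths)
    coef = w_mat @ np.linalg.solve(p_mat @ w_mat, np.array(inner))
    return {"x_mean": x_mean, "y_mean": y_mean, "scores": np.array(scores).T,
            "loadings": p_mat, "residuals": e, "coef": coef}

def pls1_predict(model, spectra):
    """Predict analyte concentration for new spectra."""
    x = np.asarray(spectra, dtype=float)
    return model["y_mean"] + (x - model["x_mean"]) @ model["coef"]
```

For an ideal, noise-free two-component system, two factors leave essentially no spectral residual, which matches the later remark that such a system needs no more than two sets of scores and loadings.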

The procedures used to determine the loadings and scores have been described in detail (23). In essence, the technique iteratively estimates a linear combination of spectral features that optimally estimate the analyte concentration. In combination, the first loading and first score derived from the PLS regression will account for the majority of the chemically relevant spectral variation contained within the data set. The residual errors between the reference and predicted values are determined, as are the spectral residuals from the curve-fitting process. These residuals are now used to determine the second loading and second score. This procedure is repeated until sufficient loadings and scores have been derived to explain the chemical data. The number of scores and loadings calculated represent the number of extractable phenomena (factors) in the calibration set (e.g., an ideal two-component system would require no more than two sets of scores and loadings to describe the system).

A critical issue in PLS calibration is how best to select the appropriate number of factors to use in the calibration equation. Too few factors and the model will not adequately describe the system; too many factors and the model will overfit the system and will be inappropriate for samples outside the calibration data set. A commonly used method to determine the optimum number of factors is called cross-validation. In cross-validation, the calibration data are split many times into different calibration and validation data sets. PLS models with increasing numbers of factors are used to predict the analyte concentration for each of the validation samples. The root-mean-square error of cross-validation (RMSCV) and the SEC are determined after each factor is added and plotted vs the number of PLS factors. RMSCV is determined by calculating the square root of the average squared difference between the actual analyte concentrations and the PLS predicted results for the validation data set. The number of factors where the SEC is similar to, but not less than, the reference method error and the RMSCV reaches a minimum value (and begins to plateau) is taken as the number of PLS factors to be used in the regression model. As an example, a plot of the RMSCV and SEC vs the number of PLS factors for human serum triglycerides is shown in Figure 6. For serum triglycerides, 1 S for the Kodak Ektachem vs the primary reference method is 0.16 mmol/L (24). By use of these factor-selection criteria in the example shown, eight or nine factors are appropriate.

Fig. 6. Plot of root-mean-square error of cross-validation (RMSCV) and SEC against number of PLS factors for human serum triglycerides NIRS calibration. Horizontal line at 0.16 mmol/L indicates one standard error (S) for the Kodak Ektachem vs the primary reference method (24).

The factors are also displayed and examined for correlation with spectral features related to the analyte (indicating real chemical data) and spurious information related to noise (arising from an overfitted data set). Figure 7 shows the first PLS factor for serum triglycerides and the second-derivative absorbance spectrum of a serum specimen containing 21 mmol/L triglycerides, plotted from 2034 to 2370 nm. This factor clearly contains spectral features attributable to triglycerides. Generally, the first few factors are interpretable. Later factors model the matrix effects and are more difficult to interpret.

Fig. 7. First PLS factor for NIRS triglyceride calibration compared with NIR spectrum of a serum sample with a triglyceride value of 21 mmol/L in the 2034-2370 nm spectral region.

The determination of serum triglycerides exemplifies the superiority of PLS over multiple linear least-squares approaches. By use of a linear summation equation with a primary wavelength for triglycerides at 2330 nm and a secondary correction term for cholesterol at 1674 nm, a multiple r of 0.92 and an SEC of 0.36 mmol/L (i.e., approximately twice the reference method error) are obtained. For the PLS calibration, an eight-factor model containing the 1635-1800 nm and 2035-2375 nm spectral segments produces an r of 0.99 and an SEC of 0.18 mmol/L. Including the ninth PLS factor decreases the SEC by only 0.01 mmol/L. The PLS technique appears to model other sources of chemical and physical variations, such as scattering in hyperlipidemic specimens, other spectral interferences, and matrix effects that are not adequately modeled in the multiwavelength algorithm (even by including additional terms). Using the entire NIR spectrum for PLS instead of the two selected regions produced poorer results, illustrating that PLS models can be less sensitive than multiple linear least-squares models if superfluous spectral information is included in the calibration model.
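
The factor-selection procedure based on RMSCV and SEC can be sketched as follows. The code assumes a k-fold split (the text says only that the data are split many times), a compact mean-centered PLS1 fit, and an SEC computed with n - k - 1 degrees of freedom. These choices and all names are assumptions made for illustration.

```python
import numpy as np

def pls1_coefficients(x, y, n_factors):
    """Compact mean-centred PLS1 fit; returns (x_mean, y_mean, regression vector)."""
    x_mean, y_mean = x.mean(axis=0), y.mean()
    e, f = x - x_mean, y - y_mean
    w_list, p_list, q_list = [], [], []
    for _ in range(n_factors):
        w = e.T @ f
        w /= np.linalg.norm(w)
        t = e @ w
        p = e.T @ t / (t @ t)
        q = f @ t / (t @ t)
        e, f = e - np.outer(t, p), f - q * t
        w_list.append(w); p_list.append(p); q_list.append(q)
    w_mat, p_mat = np.array(w_list).T, np.array(p_list)
    return x_mean, y_mean, w_mat @ np.linalg.solve(p_mat @ w_mat, np.array(q_list))

def predict(model, x):
    x_mean, y_mean, coef = model
    return y_mean + (x - x_mean) @ coef

def rmscv_and_sec(x, y, max_factors, n_splits=5, seed=0):
    """RMSCV and SEC for models with 1..max_factors PLS factors."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    order = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(order, n_splits)
    rmscv, sec = [], []
    for k in range(1, max_factors + 1):
        squared_error = 0.0
        for held_out in folds:
            train = np.setdiff1d(np.arange(len(y)), held_out)
            model = pls1_coefficients(x[train], y[train], k)
            squared_error += ((predict(model, x[held_out]) - y[held_out]) ** 2).sum()
        rmscv.append(np.sqrt(squared_error / len(y)))
        residuals = y - predict(pls1_coefficients(x, y, k), x)
        sec.append(np.sqrt((residuals ** 2).sum() / (len(y) - k - 1)))
    return np.array(rmscv), np.array(sec)
```

Plotting both returned arrays against the factor count, together with a horizontal line at the reference method error, gives the kind of display described for Figure 6.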

    PLS also provides outlier detection capabilities. A thorough description of the approaches used has been published (23). Two commonly used outlier detection methods are X-residual and leverage. In X-residual outlier detection, the residual spectrum of an unknown sample is obtained and compared with the calibration residuals (i.e., matrix E in equation 6). The presence of an unmodeled spectral feature could produce an X-residual outlier. Leverage outlier detection is based on the position of the scores of an uncharacterized spectrum compared with those of the calibration data set. The position of the scores can be determined graphically by using a score plot, which is obtained by plotting the scores of the calibration data set against one another (i.e., score 1 vs score 2, etc.). The scores of the unknown spectrum are also plotted against one another and overlaid on the score plot. The location of the scores for the unknown is now compared with the location of the scores for the calibration data set. A high leverage sample will have scores that are not located near the typical sample scores. Leverage outliers can occur when the concentration of a previously modeled constituent falls outside its reference range used for calibration development.

    Fig. 8. NIRS calculated vs Kodak Ektachem glucose values for human serum and NIRS predicted glucose values for fluoride/oxalate plasma samples. (•) Plasma samples are outliers. Calibration model: 12 PLS factors based on 1325-1800 nm and 2035-2375 nm spectral segments; n = 423, r = 0.99, SEC = 0.8 mmol/L
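The two outlier checks described above can also be computed numerically rather than read off a score plot. The sketch below is illustrative only and rests on stated assumptions: the calibration score vectors are mean-centered and mutually orthogonal, so the leverage reduces to 1/n plus a sum of squared, scaled scores; the unknown spectrum has already been centered with the calibration mean; and the function names, the toy scores, and the toy loadings are all hypothetical.

```python
def leverage(t_new, T_cal):
    """Leverage of a sample from its scores.

    T_cal is a list of calibration score rows (one row per sample).
    Assumes the score columns are mean-centered and mutually orthogonal.
    """
    n = len(T_cal)
    h = 1.0 / n
    for a, t in enumerate(t_new):
        column_sum_sq = sum(row[a] ** 2 for row in T_cal)
        h += t * t / column_sum_sq
    return h


def x_residual_variance(x, t_new, P):
    """Mean squared spectral residual of x after reconstruction from its
    scores t_new and the loading vectors P (one loading per factor)."""
    recon = [sum(t_new[a] * P[a][k] for a in range(len(t_new)))
             for k in range(len(x))]
    resid = [xk - rk for xk, rk in zip(x, recon)]
    return sum(r * r for r in resid) / len(resid)


# Hypothetical two-factor calibration scores (centered, orthogonal).
T_cal = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
# Hypothetical loadings over three wavelengths.
P = [[1, 0, 0], [0, 1, 0]]

typical = leverage([1, 1], T_cal)   # a calibration-like sample
extreme = leverage([3, 0], T_cal)   # scores far from the calibration cloud
resid_var = x_residual_variance([3, 0, 0.5], [3, 0], P)
print(typical, extreme, resid_var)
```

A sample would then be flagged when its leverage or residual variance is large relative to the calibration set; the cutoff (for example, a small multiple of the mean calibration value) is a judgment call that the text does not specify.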

    Outliers have been detected in serum and plasma glucose determinations. Presented in Figure 8 is a scatter plot for a 12-factor PLS serum glucose model containing the 1325-1800 nm and 2035-2375 nm spectral segments. The calibration data set has 423 samples, and by using the above criteria for optimization, additional factors could have been included. This model produces an r of 0.99 and an SEC of 0.8 mmol/L. Overlaid on this plot is the PLS predicted glucose result for several plasma specimens collected with a fluoride-oxalate anticoagulant. Including the plasma specimens worsens the calibration. Clearly, plasma specimens are not measured correctly by using the serum glucose calibration equation. These data are obvious outliers. The source of the uniqueness is attributable to the presence of an unmodeled spectral variance (i.e., X-residual outlier) arising from the fluoride/oxalate anticoagulant and perhaps due also to a significant increase in hemolysis. To date, similar outliers have not been observed for other analytes.

    Validation

    Traditional least-squares regression analysis minimizes the sum of squares of the residuals in only the dependent variables. Multivariate regression techniques such as PLS are designed to minimize the sum of squares for all residuals between the independent and dependent variables (23). Thus, it is necessary to ensure that solutions to equations 4, 5, and 6 have not been overfitted. An overfitted solution relates unique but not representative absorbance features in the calibration set to the reference data. The analysis of an independent validation set of samples (in addition to the cross-validation procedures described earlier for PLS) is used to avoid this possibility. These samples should not be those of the calibration data set but should resemble the calibration sample set in the range of analyte concentrations and sample variability. The derived calibration algorithm is used to determine the analyte concentration for each of the prediction samples (based on the NIR spectrum), the result of which is compared with the reference analyte concentration. The correlation coefficient, the standard error of prediction (SEP), and the slope and intercept of the regression line are used to judge the adequacy of the calibration equation. An accurate method will have an r and a slope that approach 1 and a y-intercept close to zero. The SEP is the standard deviation of the differences between the NIRS and reference values of the prediction data set. A satisfactory NIRS method should yield an SEP that approaches the SEC. For identical distributions, the SEP will always be greater than the SEC but will provide a better estimation of the true accuracy of the NIRS technique.

    Fig. 9. NIRS predicted vs Kodak Ektachem triglyceride values in human serum validation. Horizontal axis: Kodak Ektachem serum triglycerides (mmol/L). Calibration model: 8 PLS factors using 1635-1800 nm and 2035-2375 nm spectral segments; n = 339, r = 0.98, SEC = 0.18 mmol/L
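The validation statistics named above (r, slope, intercept, and SEP) are straightforward to compute once predicted and reference values are paired. The following is a minimal sketch under stated assumptions: the regression is of NIRS predicted values (y) on reference values (x), the SEP is taken as the sample standard deviation (n - 1 denominator) of the differences, as the text defines it, and the function name and the four data points are hypothetical.

```python
import math


def validation_stats(reference, predicted):
    """Slope, intercept, r, and SEP for predicted (y) vs reference (x)."""
    n = len(reference)
    mean_x = sum(reference) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in reference)
    syy = sum((y - mean_y) ** 2 for y in predicted)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(reference, predicted))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    # SEP: standard deviation of the differences (predicted - reference).
    diffs = [y - x for x, y in zip(reference, predicted)]
    mean_d = sum(diffs) / n
    sep = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    return {"slope": slope, "intercept": intercept, "r": r, "sep": sep}


# Hypothetical validation pairs (mmol/L), for illustration only.
reference = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.1, 3.9]
stats = validation_stats(reference, predicted)
print(stats)
```

Because this SEP is a standard deviation of differences, it does not include a constant bias between the two methods; the intercept and the mean difference should be inspected alongside it.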

    By use of the eight-factor PLS model previously described for triglycerides, the serum triglyceride concentrations were predicted for a distinct set of serum specimens. A scatter plot of the NIRS predicted vs reference concentrations is presented in Figure 9. The validation regression equation was y = 0.96x + 0.04, with r = 0.98 and an SEP = 0.19 mmol/L. Validation results obtained for serum total proteins were y = 0.93x + 4.1, r = 0.98, SEP = 1.8 g/L, and for serum albumin, y = 0.94x + 2.0, r = 0.98, SEP = 1.3 g/L.

    Summary and Conclusions

    NIRS calibrations cannot be developed by using chemical standards but are developed by using real primary samples that have been analyzed by other methods for the analytes of interest. It is necessary to procure a calibration set of samples that accurately reflects the range of values to be encountered, interfering substances, matrix variations, and matrix effects.

    In biological specimens, traditional least-squares regression techniques based on Beer's Law are sometimes adequate. More sophisticated multivariate analysis techniques, such as PLS, are often necessary to fully extract meaningful information from the complex spectrum. Algorithms are developed and evaluated spectroscopically and chemometrically.

    In traditional techniques, algorithms are evaluated by using the correlation spectrum and correlation plot. The primary analytical wavelength should correspond to a known analyte absorption band. Correcting wavelength terms are added to compensate for interfering absorptions and scattering effects. In PLS, all analytically relevant spectral information is used for analysis. The number of factors to be used is determined by cross-validation and by comparing the resulting standard error of calibration with the reference method error. The factors should also be interpretable in terms of spectral variations associated with the analyte of interest.

    The calibration algorithm is validated by using an independent set of samples to ensure that the calibration data set is not overfitted and to establish robustness. In addition to this, other possible interferences from drugs or effects of disease need to be checked for, as with any clinical laboratory method.

    This exploratory study has shown that, for selected analytes, NIRS correlates well enough to the Kodak Ektachem to be of clinical utility. This is demonstrated by the fact that the standard errors of calibration and prediction of the NIRS models vs the Kodak Ektachem are of the same order of magnitude as those of the Kodak Ektachem against more fundamental reference methods. It is now important to obtain sera analyzed by reference methods more fundamental than the Kodak Ektachem to further evaluate the accuracy of the NIRS algorithms.

    The potential for NIRS in the clinical laboratory remains to be defined. We are investigating its potential for analyzing whole blood, plasma, serum, urine, feces, cerebrospinal fluid, breast milk, bone, and tissue for selected analytes. Our preliminary results suggest that NIRS is a nondestructive quantitative spectroscopic technique worth pursuing. Incorporation of a small-volume flow-through cell and an autosampler to minimize sample volume and manipulations could lead to the development of a clinical analyzer capable of performing multiple analyses on serum samples at up to 60 samples per hour with no reagents and minimal technical effort, a clear advantage for clinical laboratories.

    We thank Paul J. Brimmer and Stephen L. Monfre for their helpful comments and suggestions in the preparation of this manuscript and Stuti Jaggi for skilled technical assistance.

    References

    1. Skogerboe KJ. Contributions of analytical chemistry to the clinical laboratory. Anal Chem 1988;60:1271a-8a.
    2. Jacques JA, Huss J, McKeehan W, Dimitroff JM, Kuppenheim HF. Spectral reflectance of human skin in the region 0.7-2.6 µm. J Appl Physiol 1955;8:297-9.
    3. Conway JM, Norris KH, Bodwell CE. A new approach for the estimation of body composition: infrared interactance. Am J Clin Nutr 1984;40:1123-30.
    4. Cope M, Delpy DT. System for long-term measurement of cerebral blood and tissue oxygenation on newborn infants by near infra-red transillumination. Med Biol Eng Comput 1988;26:289-94.
    5. Takatani S, Cheung PW, Ernst BA. A noninvasive tissue reflectance oximeter. An instrument for measurement of tissue hemoglobin oxygen saturation in vivo. Ann Biomed Eng 1980;8:1-15.
    6. Jobsis FF. Noninvasive infrared monitoring of cerebral and myocardial oxygenation sufficiency and circulatory parameters. Science 1977;198:1264-7.
    7. Warner MD. Near-infrared spectrophotometry in clinical analysis. Anal Chem 1986;58:874a-6a.
    8. van Toorenenbergen AW, Blijenberg BG, Leijnse B. Measurement of total serum protein by near-infrared reflectance spectroscopy. J Clin Chem Clin Biochem 1988;26:209-11.
    9. Peuchant E, Salles C, Jensen R. Determination of serum cholesterol by near-infrared reflectance spectrometry. Anal Chem 1987;59:1816-9.
    10. Jensen R, Lugan I, Peuchant E. Biological analysis without reagent. Myth or reality? Application to determination of serum total lipids. Bull Soc Pharm Bordeaux 1986;125:43-52.
    11. Drennen JK, Gebhart BD, Kraemer EG, Lodder RA. Near-infrared spectrometric determination of hydrogen ion, glucose, and human serum albumin in a simulated biological matrix. Spectroscopy 1991;6:28-36.
    12. Arnold MA, Small GW. Determination of physiological levels of glucose in an aqueous matrix with digitally filtered Fourier transform near-infrared spectra. Anal Chem 1990;62:1457-64.
    13. Benini L, Caliari S, Guidi GC, et al. Near infrared spectrometry for fecal fat measurement: comparison with conventional gravimetric and titrimetric methods. Gut 1989;30:1344-7.
    14. O'Haver TC, Begley T. Signal-to-noise ratio in higher order derivative spectrometry. Anal Chem 1981;53:1876-8.
    15. Fell AF. Biomedical applications of derivative spectroscopy. Trends Anal Chem 1983;2:63-6.
    16. Levillain P, Pompeydie D. Derivative spectrophotometry: principles, advantages and limitations, applications. Analysis 1986;14:1-20.
    17. Merrick MF, Pardue HL. Evaluation of absorption and first- and second-derivative spectra for simultaneous quantification of bilirubin and hemoglobin. Clin Chem 1986;32:598-602.
    18. Herzyk E, Owen JS, Chapman D. The secondary structure of apolipoproteins in human HDL3 particles after chemical modification of their tyrosine, lysine, cysteine or arginine residues. A Fourier transform infrared spectroscopy study. Biochim Biophys Acta 1988;962:131-42.
    19. Montalvo JG, Faught SE, Buco SM, Saxton AM. Relation of the acute pulmonary response to cotton dust and dust compositional analysis by near-infrared reflectance spectroscopy. Part I: elutriated dust. Appl Spectrosc 1987;41:645-54.
    20. Albumin test methodology, publication no. MP2-17. Rochester, NY: Eastman Kodak Company, 1991.
    21. Total protein test methodology, publication no. MP2-18. Rochester, NY: Eastman Kodak Company, 1986.
    22. Beebe K, Kowalski BR. An introduction to multivariate calibration and analysis. Anal Chem 1987;59:1007a-17a.
    23. Martens H, Naes T. Multivariate calibration. New York: John Wiley and Sons, 1989.
    24. Triglyceride test methodology, publication no. MP2-19. Rochester, NY: Eastman Kodak Co., 1986.