SPIN: A Method of Skeleton-Based Polarity Identification for Neurons


ORIGINAL ARTICLE


Yi-Hsuan Lee & Yen-Nan Lin & Chao-Chun Chuang & Chung-Chuan Lo

© Springer Science+Business Media New York 2014

Abstract Directional signal transmission is essential for neural circuit function and thus for connectomic analysis. The directions of signal flow can be obtained by experimentally identifying neuronal polarity (axons or dendrites). However, the experimental techniques are not applicable to existing neuronal databases in which polarity information is not available. To address the issue, we proposed SPIN: a method of Skeleton-based Polarity Identification for Neurons. SPIN was designed to work with large-scale neuronal databases in which tracing-line data are available. In SPIN, a classifier is first trained by neurons with known polarity in two steps: 1) identifying morphological features that most correlate with the polarity and 2) constructing a linear classifier by determining a discriminant axis (a specific combination of the features) and decision boundaries. Each polarity-undefined neuron is then divided into several morphological substructures (domains) and the corresponding polarities are determined using the classifier. Finally, the result is evaluated and warnings for potential errors are returned. We tested this method on fruit fly (Drosophila melanogaster) and blowfly (Calliphora vicina and Calliphora erythrocephala) unipolar neurons using data obtained from the Flycircuit and Neuromorpho databases, respectively. On average, the polarity of 84–92 % of the terminal points in each neuron could be correctly identified. An ideal performance with an accuracy between 93 and 98 % can be achieved if SPIN is fed relatively “clean” data without artificial branches. Our results demonstrate that SPIN, as a computer-based semi-automatic method, provides quick and accurate polarity identification and is particularly suitable for analyzing large-scale data. We implemented SPIN in Matlab and released the code under the GPLv3 license.

Keywords Neuronal polarity · Dendrite · Axon · Drosophila · Automated neural reconstruction · Connectome

Introduction

Nervous systems process sensory information, transmit signals and coordinate actions. Signal transmission in neural networks is well organized, and correct directions of signal flow are crucial for the complex functions of nervous systems (Feinberg et al. 2008; Gordon and Scott 2009; Lin et al. 2013b; Pastrana 2013; Takemura et al. 2013). Therefore, directions of signal flow, whose accuracy relies on correct identification of neuronal polarity, have to be taken into consideration in the functional analysis of the connectome.

In recent years, several semi-automatic tracing programs, including Neurolucida (Glaser and Glaser 1990) (MBF Bioscience, Williston, VT, USA), Amira (Visualization Science Group), and Vaa3D (Peng et al. 2010), as well as automatic neuron reconstruction programs, have been developed (Bas and Erdogmus 2011; Chothani et al. 2011; Türetken et al. 2011; Wang et al. 2011; Zhao et al. 2011). These tools have greatly advanced our ability to construct connectomes for various nervous systems (Donohue and Ascoli 2011; Parekh and Ascoli 2013). However, a high-throughput method to automatically identify the polarity of reconstructed neurons is not yet available. As a consequence, information about polarity is limited or absent in some of the neuronal databases (Ikeno et al. 2008; Chiang et al. 2011) and data reconstruction projects (Brown et al. 2011; Gillette et al. 2011; Lee et al. 2012). To address the issue, we developed a computer-based algorithm that can automatically identify the polarity (axon or dendrite) of each neuronal substructure by analyzing its morphological features. The algorithm is particularly useful for analyzing data in large-scale neuron databases such as the Flycircuit database (Chiang et al. 2011), one of the largest Drosophila neuronal databases. The database contains ~16,000 neurons with diverse morphology, and the entire fly brain (~100,000 neurons) is expected to be mapped out within the next few years. The proposed algorithm is crucial for constructing the whole-brain connectome and studying its computational properties, which will bring the application of the large-scale neuron databases to another level.

Electronic supplementary material The online version of this article (doi:10.1007/s12021-014-9225-6) contains supplementary material, which is available to authorized users.

Y.-H. Lee · Y.-N. Lin · C.-C. Lo (*)
Institute of Systems Neuroscience, National Tsing Hua University, Hsinchu 30013, Taiwan
e-mail: [email protected]

C.-C. Lo
Brain Research Center, National Tsing Hua University, Hsinchu 30013, Taiwan

C.-C. Chuang
National Center for High-Performance Computing, Hsinchu 30076, Taiwan

Neuroinform
DOI 10.1007/s12021-014-9225-6

Identification of neuronal polarity can be accurately done by means of biomolecular labeling (Campagne et al. 1990; Craig and Banker 1994; Matus et al. 1981; Robinson et al. 2002; Wang et al. 2004). However, the process is time-consuming and inefficient for large-scale neural networks. Importantly, the technique has to be deployed at the stage of data acquisition and therefore does not help with the existing data in neuron databases. Nevertheless, some morphological features in the existing data may be used for polarity identification (Craig and Banker 1994; Squire et al. 2008). Taking unipolar neurons in insects as an example, typical axonal neurites tend to be constant in radius, are characterized by boutons (presynaptic specializations) and extend further away from the soma than dendritic structures do. Dendrites, on the other hand, are significantly tapered at the ends and form extensive branches close to the soma, but do not have clearly defined postsynaptic specializations (Hanesch et al. 1989; Rolls 2011).

Therefore, it is possible for the polarity of neurons to be identified based on morphological features. Intuitively, the identification can be easily done by measuring the diameter of neurites in a substructure of a neuron. However, our evaluations showed that, although diameter is a good feature for polarity identification, a combination of other morphological features can push the accuracy of polarity identification to a much higher level.

In the present study, we developed the method SPIN (skeleton-based polarity identification for neurons), which is designed to efficiently classify substructures of neuronal skeletons into axons or dendrites for data generated from an automated neuron image reconstruction pipeline such as that in the Flycircuit project (Chiang et al. 2011; Lee et al. 2012). SPIN consists of the following procedures: machine learning algorithms are first applied to neurons with known polarity (the training data) to extract morphological features that most correlate with the polarity. Next, a linear classifier is constructed by determining the combination of the features that best predicts the polarity. Finally, to test neurons with unidentified polarity, SPIN clusters each neuron into several morphological substructures and identifies the corresponding polarities using the classifier. We further developed a post-identification evaluation system which returns warnings for potential errors of polarity identification.

Materials and Methods

Neuronal Skeleton Acquisition and Data Selection

In the present study we analyzed 107 neurons in the Drosophila brain based on the skeleton (tracing-line) data that are available from the Flycircuit database (http://www.flycircuit.tw/) (Chiang et al. 2011). The skeletons were reconstructed from the raw images using the algorithm proposed in Lee et al. (2012). We chose neurons that innervate the protocerebral bridge (PB) and the medulla (MED) for analysis.

PB Innervating Neurons

The morphological classification and neural polarities had been well described for PB innervating neurons (Lin et al. 2013a) and therefore these neurons were ideal for testing SPIN. The neuronal polarities of PB neurons were identified experimentally based on the presynaptic marker (Syt::HA) and the postsynaptic marker (Dscam17.1::GFP) (Lin et al. 2013a). In order to test the capability of SPIN, diverse morphological classes must be represented. To this end, we randomly chose one neuron from each type of PB innervating neurons that were classified based on their innervation patterns (Lin et al. 2013a).

MED Innervating Neurons

We chose MED neurons because their morphologies were distinct from those of PB neurons and could therefore independently test the capability of SPIN. Moreover, the MED is an important structure in the Drosophila visual system that has been studied extensively. In particular, the potential polarity of MED neurons can be easily identified. In the present study, the “ground truth” polarities of MED innervating neurons were determined by directly examining neuronal fluorescent images using the morphological features (e.g., boutons and neurite thickness) described in Hanesch et al. (1989) and Fischbach and Dittrich (1989).

Some of the tracing-line data in the database were poorly reconstructed and exhibited marked discrepancies from their raw images. After removing these neurons, we selected 30 PB neurons and 37 MED neurons as training data. Another 40 neurons (20 from the PB and 20 from the MED datasets) were selected to serve as test data (see Online Resource Section B for a list of neuron names analyzed in the present paper).

Neuron Reconstruction and Neurite Diameter

To investigate whether the diameter of neurites provides useful information for polarity identification, we reconstructed the same set of PB and MED neurons from their raw images and extracted the radius of neurites using Vaa3D-Neuron 2.0 (Peng et al. 2010) with the module APP2 (Xiao and Peng 2013). To improve the reconstruction, we supplied Vaa3D with marker files that contain reference points for soma, terminal and branch positions. Two MED neurons were excluded from the analysis due to the large discrepancy between the Vaa3D-reconstructed morphology and the original skeletons.

Blowfly Neurons

In order to test the performance of SPIN across different datasets and also to further test the usefulness of neurite diameter in polarity identification, we downloaded all 56 reconstructed blowfly neurons from neuromorpho.org (Ascoli et al. 2007). Drosophila neurons were not chosen from the neuromorpho.org database because they were either peripheral neurons or partial neuron structures, which were not suitable for SPIN.

SPIN Workflow and Algorithms

Overview

The software implementation of SPIN was written in Matlab (download at http://life.nthu.edu.tw/~lablcc/SPIN), and the TREES toolbox (Cuntz et al. 2010) was incorporated to facilitate neuron data processing. To avoid confusion, we follow the terminology of neuronal features (e.g., branch point, terminal point) used in the TREES toolbox (see Online Resource Section A). A machine learning toolbox (Jang 2012) was also included to provide some of the functions required for polarity identification. Currently, SPIN only supports the SWC file format (Cannon et al. 1998) for both the tracing-line data input and the resulting modified output. The entire workflow can be divided into three stages: 1) manual data preprocessing, 2) classifier training, and 3) polarity identification (Fig. 1). To construct a suitable classifier, SPIN must be trained by neurons with known polarity. Furthermore, the original data from Flycircuit must be manually preprocessed into a format that can be used to train the classifier in the second stage. The first two stages are therefore only needed if a trained classifier is not available to SPIN. The actual polarity identification for neurons with unidentified polarity is performed in the third stage.
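Because SWC is the only input and output format, it may help to see how a skeleton can be read from it. The following is a minimal Python sketch, not part of the released Matlab code; it assumes the common seven-column SWC layout (id, structure type, x, y, z, radius, parent id, with a parent id of -1 at the root), and the names `SwcNode`, `parse_swc`, `terminal_points`, and `branch_points` are hypothetical.

```python
from typing import Dict, List, NamedTuple


class SwcNode(NamedTuple):
    node_id: int
    structure_type: int  # commonly 1 = soma, 2 = axon, 3 = dendrite
    x: float
    y: float
    z: float
    radius: float
    parent_id: int       # -1 marks the root in the common convention


def parse_swc(text: str) -> Dict[int, SwcNode]:
    """Parse SWC text into a dict keyed by node id, skipping '#' comments."""
    nodes: Dict[int, SwcNode] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split()
        node = SwcNode(int(f[0]), int(f[1]), float(f[2]), float(f[3]),
                       float(f[4]), float(f[5]), int(f[6]))
        nodes[node.node_id] = node
    return nodes


def terminal_points(nodes: Dict[int, SwcNode]) -> List[int]:
    """Nodes that are nobody's parent."""
    parents = {n.parent_id for n in nodes.values()}
    return sorted(i for i in nodes if i not in parents)


def branch_points(nodes: Dict[int, SwcNode]) -> List[int]:
    """Nodes with two or more children."""
    child_count: Dict[int, int] = {}
    for n in nodes.values():
        child_count[n.parent_id] = child_count.get(n.parent_id, 0) + 1
    return sorted(i for i, c in child_count.items() if c >= 2 and i in nodes)


# Toy skeleton (made-up coordinates): a soma, one stem node, two terminals.
EXAMPLE_SWC = """# toy neuron
1 1 0 0 0 1.0 -1
2 0 0 1 0 0.5 1
3 0 -1 2 0 0.5 2
4 0 1 2 0 0.5 2
"""
example_nodes = parse_swc(EXAMPLE_SWC)
```

In this toy skeleton, node 2 is the only branch point and nodes 3 and 4 are the terminal points.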

Stage 1: Manual Data Preprocessing

The data preprocessing stage is divided into two steps and SPIN provides tools for processing each of the steps (see Online Resource Section E). First, artificial branches not present in the raw image, often produced during reconstruction due to noise, are removed manually to improve training data quality. Second, the skeleton of each training neuron is manually divided into several (typically 2–3) substructures and the polarity is manually labeled for each of them. Upon completion, the tool returns SWC files for each training neuron and for each substructure with labeled polarity. Such manual data preprocessing can be skipped if corresponding SWC files are supplied.

Stage 2: Classifier Training

The classifier training stage consists of three steps: feature extraction, feature selection, and discriminant axis determination. This stage can also be skipped if a pre-trained classifier is supplied.

Fig. 1 The three-stage SPIN workflow. First, the skeleton of each training neuron is manually clustered into morphological substructures and each substructure is then labeled with its polarity. Second, the classifier is generated based on polarity-relevant morphological features extracted from the training data. Third, the classifier is used to identify the polarity of test neurons or neurons with unidentified polarity

• Step 1: Feature extraction

SPIN measures morphological features (see Table 1 and Online Resource Section C for detailed definitions) for each substructure using the TREES toolbox (Cuntz et al. 2010). To avoid polarity-unrelated effects caused by differences in neuronal size, all features extracted from substructures are normalized by the value extracted from the entire neuron. For example, feature #1 is the summation of all segment lengths within a specific substructure. This value is divided by the summation of segment lengths for the entire neuron. Note that features #12–14 are excluded from our analysis because they are spatial features that are evidently unrelated to neuronal polarity.
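The normalization of feature #1 can be illustrated with a short Python sketch. This is not the TREES toolbox implementation: the function names are hypothetical, and it assumes that a segment belongs to a substructure only when both of its end nodes do.

```python
import math


def total_segment_length(coords, parent, nodes=None):
    """Sum the lengths of segments whose two end nodes both lie in `nodes`.

    coords: node id -> (x, y, z); parent: node id -> parent id (None at root).
    If `nodes` is None, the whole neuron is measured.
    """
    keep = set(coords) if nodes is None else set(nodes)
    total = 0.0
    for child, par in parent.items():
        if par is not None and child in keep and par in keep:
            total += math.dist(coords[child], coords[par])
    return total


def normalized_total_length(coords, parent, substructure_nodes):
    """Feature #1 of a substructure divided by the same feature of the neuron."""
    whole = total_segment_length(coords, parent)
    part = total_segment_length(coords, parent, substructure_nodes)
    return part / whole


# Toy neuron (made-up coordinates) with segment lengths 3, 4 and 2.
coords = {0: (0.0, 0.0, 0.0), 1: (0.0, 3.0, 0.0),
          2: (4.0, 3.0, 0.0), 3: (0.0, 5.0, 0.0)}
parent = {0: None, 1: 0, 2: 1, 3: 1}
ratio = normalized_total_length(coords, parent, {1, 2})
```

Here the substructure made of nodes 1 and 2 contains one segment of length 4 out of a total length of 9, so the normalized value is 4/9.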

• Step 2: Feature selection

To reduce computational overhead and improve classifier accuracy, features that most correlate with neuronal polarity must first be identified. To this end, SPIN combines the k-nearest neighbor classifier (KNNC, k=11) with the leave-one-out (LOO) method (Whitney 1971) to evaluate the correlations between every possible subset of features and neuronal polarity. Specifically, for a given set of n features, SPIN plots each substructure of the training neurons as a point in the corresponding n-dimensional feature space. The axonal and dendritic substructures form two clusters that partially or completely overlap in the feature space. Next, SPIN selects one substructure and tests whether its polarity can be correctly classified using KNNC based on the polarity of the k nearest substructures. After repeating this procedure for every point in the feature space, SPIN obtains a percent correctness that represents the correlation between the subset of features and neuronal polarity. The most highly correlated subset of features is selected for further processing in the next step.
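The leave-one-out evaluation of feature subsets can be sketched as follows. This is an illustrative Python version, not the Matlab toolbox code used by SPIN; the function names and toy data are hypothetical, Euclidean distance is an assumption, and the toy example uses k=3 rather than k=11 because it has only eight points.

```python
import math
from collections import Counter
from itertools import combinations


def loo_knn_accuracy(points, labels, k):
    """Fraction of points whose label matches the majority of their k nearest
    neighbours when the point itself is left out."""
    correct = 0
    for i, p in enumerate(points):
        others = [(math.dist(p, q), labels[j])
                  for j, q in enumerate(points) if j != i]
        others.sort(key=lambda t: t[0])
        votes = Counter(lab for _, lab in others[:k])
        # With two classes and odd k there are no tied votes.
        if votes.most_common(1)[0][0] == labels[i]:
            correct += 1
    return correct / len(points)


def best_feature_subset(rows, labels, k):
    """Exhaustively score every non-empty subset of feature columns."""
    n_features = len(rows[0])
    best_acc, best_subset = -1.0, ()
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            pts = [tuple(r[c] for c in subset) for r in rows]
            acc = loo_knn_accuracy(pts, labels, k)
            if acc > best_acc:  # the first subset reaching the top score wins
                best_acc, best_subset = acc, subset
    return best_acc, best_subset


# Toy data: column 0 separates the classes, column 1 is uninformative.
rows = [(0.9, 0.1), (0.8, 0.9), (1.0, 0.5), (0.85, 0.3),
        (0.1, 0.2), (0.2, 0.8), (0.0, 0.5), (0.15, 0.4)]
labels = ["axon"] * 4 + ["dendrite"] * 4
result = best_feature_subset(rows, labels, k=3)
```

The exhaustive search grows as 2^n in the number of features, which is manageable for the 17 features retained here but would not scale to much larger feature sets.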

• Step 3: Discriminant axis determination

Once the most polarity-relevant subset of features has been decided, SPIN applies linear discriminant analysis (LDA) to determine a discriminant axis. The goal is to find the optimal weight combination for the features that best separates the training data according to neuronal polarity. The LDA algorithm used generates n possible weight combinations (which we call “discriminant axes”) for n features (Duchene and Leclercq 1988). By using KNNC and LOO again, SPIN finds the axis that, when the training data are projected onto it, leads to the highest correlation with neuronal polarity. The discriminant axis then serves as the classifier that is passed on to the third stage for polarity identification.

Since axons and dendrites tend to be segregated into distinct clusters when projected along the discriminant axis, it is possible to determine a universal boundary to serve as the decision criterion during classifier training. However, to increase the accuracy, such a universal criterion is not used. Instead, SPIN determines an individual boundary for every neuron tested by calculating the midpoint between the two most distal substructures of each neuron on the discriminant axis. We define the “discriminant score” as the projection of a substructure on the discriminant axis. Suppose that substructures with higher discriminant scores correspond to axons and that a test neuron is divided into three substructures with the discriminant scores −2.1, 2.0 and 3.7. For this neuron, SPIN determines the decision boundary as (−2.1+3.7)/2=0.8 and the polarities of these three substructures are identified as dendrite, axon and axon, respectively.
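The per-neuron decision boundary in the example above can be reproduced with a few lines of Python. This is a sketch with a hypothetical function name; how a score exactly equal to the boundary is handled is not stated in the text, so treating it as the low side is an assumption.

```python
def classify_by_midpoint(scores, axon_is_high=True):
    """Label substructures of one neuron from their discriminant scores.

    The boundary is the midpoint between the lowest and highest score.
    Assumption: a score exactly on the boundary counts as the low side.
    """
    boundary = (min(scores) + max(scores)) / 2.0
    labels = []
    for s in scores:
        is_high = s > boundary
        labels.append("axon" if is_high == axon_is_high else "dendrite")
    return boundary, labels


# The worked example from the text: scores -2.1, 2.0 and 3.7.
boundary, labels = classify_by_midpoint([-2.1, 2.0, 3.7])
```

Because the boundary is recomputed for every neuron, at least one substructure always falls on each side of it, so every neuron with two or more distinct scores is assigned both an axon and a dendrite.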

Stage 3: Polarity Identification

In the third stage SPIN applies the trained classifier to neurons with unidentified polarity. The polarity identification stage consists of three steps: morphological clustering, polarity identification, and evaluation and warning. The skeleton of a neuron must first be divided into several substructures, each consisting of a cluster of neurites, before it is fed into the classifier, because the classifier works on the basis of substructures. In the second step (polarity identification), each substructure is processed by the classifier and the polarity is determined. Finally, the identification results are evaluated and a warning is issued for neurons that do not pass certain criteria due to potential misidentification. The detailed procedures are described below:

Table 1 Morphological features used in this study. Features #12–14 are excluded from our analysis. Features #1–17 are built-in functions of the TREES toolbox

Morphological features

1. Summation of segment lengths

2. Maximum path length

3. Number of branch points

4. Mean ratio of Euclidean length to path length

5. Maximum branch order

6. Mean branch angle

7. Mean branch length

8. Mean path length

9. Mean branch order

10. Ratio of width (x direction) to height (y direction) of the substructure

11. Ratio of width (x direction) to depth (z direction) of the substructure

12. Center of mass of the substructure in the x direction

13. Center of mass of the substructure in the y direction

14. Center of mass of the substructure in the z direction

15. Volume of the convex hull

16. Mean asymmetry at branch points

17. Mean volume of Voronoi pieces

18. Balancing factor

19. Path length to soma

20. Branch order in a complete neuron


• Step 1: Morphological clustering

Most neurons in our samples have 2–3 visually distinct substructures, each consisting of a cluster of neurites that corresponds to an axonal or a dendritic arbor. SPIN automatically divides the neuron arborization into such distinct substructures. This is the most crucial step of SPIN because the quality of clustering affects the morphological features of the resulting substructures and thus greatly influences the accuracy of polarity identification. The automatic procedure is controlled by six tunable parameters (see Online Resource Section D for definitions and values used in the present study) and consists of two sub-steps: 1) artificial branch removal and 2) dividing point scan.

1) Artificial branch removal

To improve the accuracy of morphological clustering, artificial branches need to be removed beforehand. Based on our observations, a branch is likely to be an artifact if it connects to the neuron “trunk” without further branching. Therefore, we first need to identify the trunk for each neuron, which is not as easy a task as it may sound. A trunk is an extended branch (usually the thickest in the raw image) originating from the soma. In a unipolar neuron, a trunk may split a few times before bursting into axonal or dendritic arbors (Fig. 2e). To isolate the trunk of a neuron, we first eliminate terminal branches, i.e., those that are composed of one branch point and one terminal point. In the default setting, SPIN first repeats this process three times (specified by the parameter nCleanTimes) (Fig. 2a–d). Typically, at this stage most but not all small branches can be removed from the trunk. In order to further remove branches without accidentally removing part of the trunk, SPIN performs the branch removal procedure a fourth time with an additional criterion: SPIN only removes terminal branches shorter than a threshold given by the criterion:

\[
\frac{\text{terminal branch length}}{\text{length from terminal to soma}} \leq \mathrm{ThRemoveLen},
\]

where ThRemoveLen is a tuning parameter whose optimal value varies among brain regions (Online Resource Section D). For terminal branches longer than the threshold, SPIN only removes the portions of the branches that meet the criterion (Fig. 2e). Finally, SPIN compares the original skeleton with the identified trunk and removes all branches that are connected to the trunk without further branching. The “cleaned” neurons (Fig. 2f) are then used in the next sub-step.
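The repeated removal of terminal branches can be sketched in Python as follows. This is a simplified illustration rather than the SPIN implementation: the tree is a plain child-to-parent dictionary, the function names are hypothetical, the partial removal of long branches and the final comparison with the original skeleton are omitted, and the threshold value 0.25 in the usage lines is made up.

```python
def children_map(parent):
    """Build node -> list of children from node -> parent (None at the root)."""
    kids = {n: [] for n in parent}
    for n, p in parent.items():
        if p is not None:
            kids[p].append(n)
    return kids


def remove_terminal_branches(parent, root):
    """Remove every terminal branch once: the nodes from a terminal point up
    to, but not including, the nearest branch point."""
    kids = children_map(parent)
    doomed = set()
    for t in parent:
        if kids[t] or t == root:
            continue  # not a terminal point
        path, node = [], t
        while node != root and len(kids[node]) < 2:
            path.append(node)
            node = parent[node]
        if node == root and len(kids[root]) < 2:
            continue  # unbranched remainder: treat it as the trunk and keep it
        doomed.update(path)
    return {n: p for n, p in parent.items() if n not in doomed}


def prune(parent, root, n_clean_times=3):
    """Repeat the removal n_clean_times times (cf. the nCleanTimes parameter)."""
    for _ in range(n_clean_times):
        parent = remove_terminal_branches(parent, root)
    return parent


def passes_length_criterion(branch_length, terminal_to_soma_length, th_remove_len):
    """Fourth-pass criterion: relative branch length at or below the threshold."""
    return branch_length / terminal_to_soma_length <= th_remove_len


# Toy tree: 0 is the soma, 2 and 5 are branch points, 3, 6 and 7 are terminals.
toy_parent = {0: None, 1: 0, 2: 1, 3: 2, 4: 2, 5: 4, 6: 5, 7: 5}
after_one_pass = remove_terminal_branches(toy_parent, 0)
after_pruning = prune(toy_parent, 0, n_clean_times=3)
short_enough = passes_length_criterion(2.0, 10.0, 0.25)
too_long = passes_length_criterion(8.0, 10.0, 0.25)
```

In the toy tree the first pass strips the three terminal twigs and leaves the unbranched path 0-1-2-4-5, which further passes keep unchanged.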

Fig. 2 Artificial branch removal. The procedure is required in order to increase the accuracy of morphological clustering. a The original neuron skeleton. Typically, a branch is likely to be an artifact if it connects to the neuron trunk without further branching. Therefore, SPIN first needs to identify the trunk. b–d To identify the trunk, terminal branches are removed and this procedure is first repeated three times. At this stage, there are still some branches left. e During the fourth branch removal, SPIN implements a criterion to avoid removing the trunk. f For the branches removed in (b–e), SPIN deletes those connected to the trunk without further branching (black) and restores the rest (green). A “cleaned” neuron (red + green) is now ready for the next sub-step

2) Dividing point scan

Next, we perform morphological clustering by scanning the neuronal skeleton for “dividing points” that represent the root of each substructure (Fig. 3). To this end, we consider two prominent morphological features that are typically present at the roots of substructures: long preceding undivided branches and splits from trunks.

i. Long preceding undivided branches

Based on our observation, roots of substructures are typically preceded by branches running for a relatively long distance without encountering branch points. The distance criterion is defined by the parameter ThLongBranch, which is normalized by the distance between the soma and the farthest terminal point of the neuron. ThLongBranch is a tuning parameter whose optimal value varies among brain regions (Online Resource Section D). SPIN scans the neural skeleton and labels each branch point preceded by an undivided branch with length greater than or equal to ThLongBranch as a “potential dividing point” (Fig. 3c).

Fig. 3 The procedure of morphological clustering. a A schematic neuron. b After artificial branch removal in the previous sub-step, two branches are removed (gray dashed line). The “cleaned” neuron is used for performing morphological clustering. SPIN first identifies potential dividing points using two procedures illustrated in (c) and (d). c i. Each node is labeled by the number of descendant terminal points. ii–vi. Starting from the soma, SPIN looks for undivided branches that are longer than ThLongBranch (yellow bars), and labels their ending branch point as a potential dividing point (gray circles). d i. SPIN sorts the number of descendant terminal points of each branch point in descending order and looks for large drops (red numbers) in the series. ii. The branch point before the drop is labeled as a critical point, and the following two branch points are labeled as potential dividing points. e i. Now SPIN discovers four potential dividing points but only registers those with a large number of descendant terminal points as “real” dividing points. ii. Using the three dividing points, SPIN extracts three substructures from the neuronal skeleton

ii. Splits from trunks

Based on our observation, roots of substructures typically branch out from a common trunk, and the numbers of terminal points descending from each root are of similar magnitude. For each branch point in a neuron, SPIN counts the number of terminal points (nTP) that are descendants of it (Fig. 3d). SPIN then sorts the nTP values in descending order and looks for large drops in the series, which indicate a split from the neuron trunk (Fig. 3d). We call the node prior to a drop a “critical point,” and label the following two branch points as “potential dividing points.” Two criteria are used to define critical points. The first is given by

\[
\frac{\mathrm{nTP}(i) - \mathrm{nTP}(i+1)}{\mathrm{nTP}(1)} \geq \mathrm{ThCritP},
\]

where nTP(i) is the number of terminal points that are descendants of the ith branch point when the branch points are sorted by their nTP number in descending order, and ThCritP is a threshold set to 0.35 by default. The value of ThCritP can be fine-tuned to improve performance, and 0.35 was found to work well for both PB and MED neurons.

To make sure the critical point lies on a trunk, a second criterion is given by

\[
\frac{\mathrm{nTP}\ \text{of critical point}}{\mathrm{nTP}(1)} \geq \mathrm{ThnumTP},
\]

where ThnumTP is a threshold percentage (set to 85 % for both PB and MED neurons).

Upon completion of the skeletal scans, SPIN usually labels several potential dividing points for each neuron. SPIN then counts the number of terminal points descending exclusively from each potential dividing point (nTP,exclusive) and only registers it as a “real” dividing point if the following criterion is satisfied:

\[
\frac{\mathrm{nTP,exclusive}}{\mathrm{nTP}(1)} \geq \mathrm{ThDP},
\]

where the threshold ThDP is a tuning parameter whose optimal value varies among brain regions (Online Resource Section D, Fig. 3e). These dividing points are then labeled on the original neurons (those prior to artificial branch removal) as the roots of substructures and the labeled neurons are ready for polarity identification.
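The two critical-point criteria of the split-from-trunk scan can be sketched in Python as follows. The function names are hypothetical and indices are zero-based, so nTP(1) in the text corresponds to `ntp_sorted[0]`. The sketch stops at the potential dividing points and does not implement the final nTP,exclusive test.

```python
def descendant_terminal_counts(parent):
    """For each node, count the terminal points that descend from it.

    parent: node id -> parent id (None at the root).
    """
    has_child = {p for p in parent.values() if p is not None}
    counts = {n: 0 for n in parent}
    for t in parent:
        if t in has_child:
            continue  # not a terminal point
        node = parent[t]
        while node is not None:
            counts[node] += 1
            node = parent[node]
    return counts


def find_critical_indices(ntp_sorted, th_crit_p=0.35, th_num_tp=0.85):
    """Indices i (into the descending nTP series) that satisfy both criteria:
    a large drop to the next entry, and an nTP close to the maximum."""
    top = ntp_sorted[0]
    hits = []
    for i in range(len(ntp_sorted) - 1):
        drop = (ntp_sorted[i] - ntp_sorted[i + 1]) / top
        on_trunk = ntp_sorted[i] / top >= th_num_tp
        if drop >= th_crit_p and on_trunk:
            hits.append(i)
    return hits


def potential_dividing_indices(critical_index, series_length):
    """The two entries that follow a critical point in the sorted series."""
    return [j for j in (critical_index + 1, critical_index + 2)
            if j < series_length]


# Made-up nTP values of eight branch points, already sorted in descending order.
ntp_series = [40, 38, 37, 20, 17, 9, 8, 3]
critical = find_critical_indices(ntp_series)
candidates = potential_dividing_indices(critical[0], len(ntp_series))

# Counting nTP on a toy tree: 2 and 5 are branch points; 3, 6, 7 are terminals.
toy_counts = descendant_terminal_counts(
    {0: None, 1: 0, 2: 1, 3: 2, 4: 2, 5: 4, 6: 5, 7: 5})
```

In the made-up series the only qualifying drop is from 37 to 20 (a relative drop of 0.425, with 37/40 above the 85 % threshold), so the third entry is the critical point and the fourth and fifth entries become potential dividing points.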

• Step 2: Polarity identification

With roots of substructures labeled, SPIN uses the classifier trained in the second stage to identify the polarity of each substructure. Following a procedure similar to that described for training neurons in Stage 2, SPIN extracts (now pre-selected) features from the substructures, projects each substructure onto the (now pre-trained) discriminant axis, calculates their discriminant scores, determines decision boundaries neuron-by-neuron and identifies the polarities accordingly. SPIN exports one SWC file for every neuron with labeled polarity in the “structure type” field.

• Step 3: Evaluation and warning

Since SPIN performs morphological clustering and polarity identification automatically, neural substructures are potentially misclassified. To address this, SPIN evaluates the result based on two criteria and returns warnings if the result does not pass either of the criteria. There are two types of warnings (one for each criterion):

1) Type I warning indicates a conflict between the identified polarity and population characteristics.

For neurons divided into two substructures, the polarities can be easily decided from discriminant scores. However, for neurons possessing three or more substructures, the substructures with intermediate scores (neither maximum nor minimum) are prone to be misclassified. To address this issue, we checked whether each substructure “looked” more like axons or dendrites within the sample population. Specifically, SPIN calculates the mean and the standard deviation of the discriminant scores for the distributions of axon and dendrite substructures from all sample neurons. Next, SPIN calculates the z-score of the substructure in question with respect to the axon and dendrite distributions. If the substructure was originally classified as an axon (dendrite) but the absolute value of its z-score with respect to the dendrite (axon) distribution is smaller than that with respect to the axon (dendrite) distribution, SPIN returns a type I warning.

2) Type II warning indicates an inconsistency between the identified polarity and the substructure’s physical location on the neuronal skeleton.

Based on our observation, for neurons with a major split from the trunk (equivalently, with at least one critical point identified during morphological clustering), substructures located on the same descendant branch from the split typically have the same polarity. If SPIN detects that different polarities were assigned to substructures located on the same branch following a major split, a type II warning is issued (for an example see Fig. 10d).

SPIN indicates warnings as negative values in the “structure type” field of the resulting SWC files.
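The two warning criteria can be sketched as follows. This is an illustrative sketch under stated assumptions, not the SPIN implementation: the function names, the toy score populations and the branch labels are hypothetical, and the use of the sample standard deviation is our choice because the text does not specify which estimator is used.

```python
from statistics import mean, stdev


def z_score(value, population):
    """z-score of a value relative to a population of discriminant scores.

    ASSUMPTION: the sample standard deviation is used.
    """
    return (value - mean(population)) / stdev(population)


def type_i_warning(score, assigned, axon_scores, dendrite_scores):
    """True if the substructure looks more like the opposite population."""
    z_axon = abs(z_score(score, axon_scores))
    z_dendrite = abs(z_score(score, dendrite_scores))
    if assigned == "axon":
        return z_dendrite < z_axon
    return z_axon < z_dendrite


def type_ii_warning(branch_of, labels):
    """True if substructures on the same branch after a major split disagree.

    branch_of maps substructure id -> id of the descendant branch it sits on
    (a hypothetical representation of the clustering result).
    """
    polarities_per_branch = {}
    for sid, branch in branch_of.items():
        polarities_per_branch.setdefault(branch, set()).add(labels[sid])
    return any(len(p) > 1 for p in polarities_per_branch.values())


# Toy populations of discriminant scores (invented values).
axon_scores = [-0.6, -0.5, -0.4, -0.5]
dendrite_scores = [0.3, 0.4, 0.5, 0.4]

# A substructure labeled as an axon whose score sits inside the dendrite population.
warn_i = type_i_warning(0.3, "axon", axon_scores, dendrite_scores)

# Two substructures on branch "A" were given different polarities.
warn_ii = type_ii_warning(
    {"s1": "A", "s2": "A", "s3": "B"},
    {"s1": "axon", "s2": "dendrite", "s3": "dendrite"},
)
print(warn_i, warn_ii)
```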

Results

SPIN achieved satisfactory accuracy in polarity identification. Here the accuracy is evaluated at the terminal level: for each test neuron, the identification results are compared with the known “true polarities” terminal-by-terminal. Specifically, if information on the true polarity is available, SPIN returns the percentage of terminal points that were correctly identified. For test neurons in PB and MED, the mean accuracy was 90 % or higher. This level of accuracy is significant because it would lead to a highly accurate reconstruction of signal flow in a neural circuit (see Discussion). Below, the performance evaluation results of both the morphological clustering and the classifier, two components that critically influence the accuracy of polarity identification, are presented. The detailed results of polarity identification are then presented under various conditions.
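The terminal-level accuracy described above can be written as a short function. This is a minimal sketch; the function name and the dictionary representation (terminal point id mapped to a polarity label) are assumptions made for illustration.

```python
def terminal_level_accuracy(true_polarity, predicted_polarity):
    """Percentage of terminal points whose predicted polarity matches the truth.

    Both arguments map a terminal point id to "axon" or "dendrite"
    (a hypothetical representation chosen for this sketch).
    """
    if not true_polarity:
        raise ValueError("neuron has no terminal points")
    correct = sum(
        1 for terminal, polarity in true_polarity.items()
        if predicted_polarity.get(terminal) == polarity
    )
    return 100.0 * correct / len(true_polarity)


# Toy neuron with five terminal points, one of which is misidentified.
truth = {1: "axon", 2: "axon", 3: "dendrite", 4: "dendrite", 5: "dendrite"}
predicted = {1: "axon", 2: "dendrite", 3: "dendrite", 4: "dendrite", 5: "dendrite"}
accuracy = terminal_level_accuracy(truth, predicted)
print(accuracy)
```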

Morphological Clustering

SPIN in general performed very well in morphological clustering, and most neurons were correctly clustered into 2–3 substructures (Fig. 4a). One advantage of the clustering algorithm was its independence from the spatial organization of the neuronal arborization. By mainly considering the relative length of branches as well as how a branch splits, SPIN correctly identified most substructures regardless of whether they formed spherical, linear or laminar shapes, and whether the substructures were loose or dense (Fig. 4a).

Despite its good overall performance, SPIN still produced faulty clustering for a small portion of the sample population. We observed two main sources of errors: 1) artificial branches and 2) atypical morphology.

Artificial Branches

SPIN removed artificial branches prior to the dividing point scan during morphological clustering. Because the artificial branches seriously reduced the performance of the dividing point scan, our removal algorithm was aggressive in order to remove as many artificial branches as possible. Consequently, real branches were sometimes accidentally removed as well. The accidental removals were acceptable because they did not significantly affect the result of the dividing point scan. However, after the dividing points were identified, we placed all removed branches back because we did not want to lose any real branches in the later polarity identification. In some cases, the recovered branches significantly altered the size and shape of an identified substructure (Fig. 4b). Altered substructures sometimes led to incorrect polarity identification.

Fig. 4 Sample results of morphological clustering. a Five sample neurons with near-perfect morphological clustering. SPIN could correctly identify most substructures with diverse shapes and terminal point density. Red filled circles indicate somas. The other colors represent different substructures identified by SPIN, except for black, which indicates trunks or branches that are not part of the substructures. b In this case, SPIN selected three dividing points, which resulted in three substructures (green) (left). Black branches represent artificial branches removed during morphological clustering. However, after the removed branches were recovered later during the process of polarity identification, they caused undesirable morphological changes (see blue and yellow substructures) (right). c Because the parameters for morphological clustering were manually tuned for typical neurons in the sample population, neurons possessing atypical morphology may be separated into too many substructures

Atypical Morphology

Some neurons exhibited morphologies markedly different from other neurons in the sample population. Because the parameters controlling morphological clustering were manually tuned for the typical morphology in a sample population, neurons with atypical morphology were sometimes incorrectly divided by SPIN, resulting in either too many substructures or no identified substructure (Fig. 4c).

Classifier

Feature Selection

During feature selection in the classifier training stage, SPIN evaluated all possible combinations and selected the subset of features (listed in Table 2) that led to the highest correlation with neuronal polarity (99 % for PB neurons and 93 % for MED neurons). Only the feature “path length to soma” was shared between the 6 features selected for PB neurons and the 4 features selected for MED neurons (Table 2). Indeed, the path length to soma is a common measure for quick identification of neuronal polarities by eye. We checked whether including other features in SPIN really improved the correlation with neuronal polarity. We found that if SPIN only used the feature “path length to soma,” the correlation dropped to 94 % for PB neurons and 71 % for MED neurons. This result indicated that, although the path length to soma was a prominent feature for identifying neuronal polarity, combining it with other features significantly improved the classifier.
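An exhaustive search over feature subsets can be sketched as below. The sketch assumes that each subset is scored by the leave-one-out (LOO) accuracy of a k-nearest-neighbor classifier, in line with the KNNC and LOO procedure mentioned in the Discussion; the value k=3, the preference for smaller subsets on ties, the function names and the toy data are all assumptions made for illustration.

```python
from itertools import combinations
import math


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))[:k]
    votes = [train_y[i] for i in nearest]
    return max(sorted(set(votes)), key=votes.count)


def loo_accuracy(X, y, columns, k=3):
    """Leave-one-out accuracy using only the given feature columns."""
    hits = 0
    for i in range(len(X)):
        train_X = [[row[c] for c in columns] for j, row in enumerate(X) if j != i]
        train_y = [label for j, label in enumerate(y) if j != i]
        query = [X[i][c] for c in columns]
        hits += knn_predict(train_X, train_y, query, k) == y[i]
    return hits / len(X)


def best_feature_subset(X, y, k=3):
    """Try every non-empty feature subset and keep the best-scoring one.

    ASSUMPTION: ties are resolved in favor of the subset found first,
    i.e. the smaller one.
    """
    n_features = len(X[0])
    best_acc, best_cols = -1.0, ()
    for size in range(1, n_features + 1):
        for columns in combinations(range(n_features), size):
            acc = loo_accuracy(X, y, columns, k)
            if acc > best_acc:
                best_acc, best_cols = acc, columns
    return best_acc, best_cols


# Toy substructures: column 0 separates the classes, column 1 is noise.
X = [[0.10, 0.5], [0.20, 0.1], [0.15, 0.9], [0.25, 0.3],
     [0.80, 0.4], [0.90, 0.8], [0.85, 0.2], [0.95, 0.6]]
y = ["dendrite"] * 4 + ["axon"] * 4

best_acc, best_cols = best_feature_subset(X, y)
print(best_acc, best_cols)
```

The number of subsets grows as 2^n − 1 with the number of candidate features n, which is one reason an exhaustive search of this kind becomes slow for large feature pools.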

Feature Extraction and Decision Boundary

SPIN performed LDA on the selected features to extract several potential axes and then found the discriminant axis that correlated best with the polarity (see Materials and Methods). To examine the effectiveness of the chosen discriminant axis (represented by the weightings listed in Table 2), we projected all substructures of training neurons onto it. We observed a clear segregation between the distributions of axonal substructures and dendritic substructures (Fig. 5). Based on this observation, the most intuitive way to construct a classifier was to find a single decision boundary that separates the two distributions. However, such a universal decision boundary was deemed not ideal for two reasons: 1) The two distributions overlapped slightly. No matter where a universal decision boundary was placed, some substructures would always be misclassified. 2) A classifier should assign at least one axonal substructure and one dendritic substructure to a neuron. However, a universal decision boundary could not guarantee that.

Therefore, we decided to use neuron-specific decision boundaries for the classifier. For each neuron, SPIN determined the boundary individually by locating the midpoint between the substructures with the maximum and minimum discriminant scores (Fig. 6). We tested both approaches (universal and neuron-specific boundaries) on the training neurons and found an improved accuracy for the latter approach in the case of MED training neurons (Table 3).
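The difference between the two kinds of boundary can be seen in a small sketch. The scores, the universal boundary value and the function names are hypothetical; the convention that lower scores are axonal is an assumption inferred from the negative weight of “path length to soma” in Table 2, not a rule stated in the text.

```python
def classify_universal(scores, boundary):
    """One fixed boundary for all neurons.

    ASSUMPTION: scores below the boundary are labeled axonal.
    """
    return ["axon" if s < boundary else "dendrite" for s in scores]


def classify_neuron_specific(scores):
    """Boundary at the midpoint between the maximum and minimum score."""
    midpoint = (max(scores) + min(scores)) / 2.0
    return ["axon" if s < midpoint else "dendrite" for s in scores]


# Hypothetical neuron whose two substructures both fall above a
# universal boundary placed at 0.0.
scores = [0.10, 0.45]
universal = classify_universal(scores, 0.0)
specific = classify_neuron_specific(scores)
print(universal, specific)
```

With the universal boundary both substructures receive the same label, whereas the midpoint rule always yields at least one axonal and one dendritic substructure whenever the scores of a neuron are not all identical.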

Polarity Identification

We evaluated the performance of SPIN by calculating a “terminal-level accuracy,” the percentage of terminal points in a neuron with correctly identified polarity. We applied SPIN to all neurons and found a good raw performance, with terminal-level accuracies of 84 and 87 % for PB training and test neurons, respectively, and 89 and 92 % for MED training and test neurons, respectively (Table 4). However, if we excluded the neurons detected by the warning system and calculated the terminal-level accuracy for the rest, the accuracy increased up to 94 %, which represented the ideal performance of SPIN. This result suggested that the warning system detected a significant portion of misclassified neurons (Table 4). As we mentioned in Materials & Methods, artificial branches seriously affected morphological clustering and thus reduced the accuracy of polarity identification. To further test the capability of SPIN, we supplied it with “ideal” data by manually removing artificial branches for each neuron. Under this ideal condition, SPIN delivered an impressive performance, with an accuracy of up to 98 %.

Table 2 The features selected for PB and MED and the corresponding weights obtained using LDA

| PB weight | PB selected feature | MED weight | MED selected feature |
|---|---|---|---|
| 0.3594 | 3. Number of branch points | 0.2183 | 1. Summation of segments length |
| −0.0661 | 4. Mean ratio of Euclidean length to path length | 0.5721 | 13. Mean asymmetry at branch points |
| −0.7090 | 8. Mean path length | −0.5137 | 17. Mean volume of Voronoi pieces |
| 0.0031 | 11. Ratio of width to depth | −0.6010 | 19. Path length to soma |
| −0.5944 | 19. Path length to soma | | |
| 0.1021 | 20. Branch order in the complete neuron | | |

It was informative to examine the performance of SPIN neuron-by-neuron. We juxtaposed the discriminant scores of the substructures for each neuron in PB (Fig. 7) and MED (Fig. 8). We also indicated the type of warning (if issued) and the terminal-level accuracies on the same plots. Two observations were evident: 1) For most of the neurons, supplying SPIN with ideal data (manual cleaning of artificial branches) significantly increased terminal-level accuracy. 2) Artificial branch removal failed to improve performance for only a small portion of the neurons with poor terminal-level accuracy (for example, neuron ID=9, 16, 26, 27 for PB training neurons and 6, 8, 14 for test neurons). However, most of these neurons were successfully detected by the warning system. Out of 107 PB or MED neurons, only two (ID=1 in PB training neurons and 10 in MED training neurons) were seriously misidentified (with accuracy <50 %) and were neither detected by the warning system nor affected by artificial branch removal. The result suggested that only ~2 % of the neurons cannot be correctly handled by SPIN when supplied with ideal data and with the warning system in place. Finally, we examined several sample neurons to evaluate the capability of SPIN in identifying the polarity of neurons with diverse morphology (Fig. 9) and also to locate factors that reduced SPIN’s accuracy (Fig. 10). Although their morphologies were diverse, the correctly identified neurons had some characteristics in common. First, these neurons all had good image reconstruction quality, i.e., the reconstructed tracing-line data faithfully represented their morphological features. Second, the distinction in terms of morphological features between axonal and dendritic substructures was clear. Last but not least, these neurons shared similar polarity-related morphological features with those used to train the classifiers.

Fig. 5 Discriminant score distributions of the axonal and dendritic substructures for PB (top) and MED (bottom) neurons. The red and blue vertical lines indicate the means of the corresponding (color matched) distributions. The standard deviations are represented by the widths of the shaded areas. Although the distributions of the axonal and dendritic substructures were significantly different from each other, they still slightly overlapped. As a consequence, a single decision boundary (the black vertical lines) is not sufficient to completely separate the two distributions. Crosses and circles indicate the positions of individual axonal and dendritic substructures, respectively

Fig. 6 Neuron-by-neuron discriminant scores. A higher discriminating accuracy can be achieved if we set a specific decision boundary for each neuron (black asterisks) rather than setting a universal boundary (black horizontal lines). Red arrows indicate the substructures that would be incorrectly classified if the universal boundary were used. Green arrows indicate the substructures that were correctly identified with the universal boundary but not with neuron-specific boundaries. Overall, using neuron-specific boundaries produced fewer errors than using universal boundaries. The green and blue horizontal lines represent the population average for the axonal and dendritic substructures, respectively, while the sizes of the standard deviations are indicated by the shaded areas

For neurons with terminal-level accuracy lower than 80 %, we observed three main types of errors (Fig. 10). The first came from incorrect morphological clustering (Fig. 10a–c). Sometimes this poor clustering arose from a choice of clustering parameters incompatible with the atypical morphology, often resulting in too many substructures (Fig. 10b). The poor clustering could also arise from faulty branch treatments, such as a failure to remove artificial branches (Fig. 10a) or an accidental removal of branches that were actually part of the neuronal arborization (Fig. 10c).

The second type of error resulted from the ambiguity of substructural features. When a neuron was divided into more than two substructures, those substructures without clear morphological characteristics were prone to be misclassified (Fig. 10d). However, this type of error was often successfully detected by the warning system.

The third type of error, although it occurred only once in our sample of 107 neurons, reflected the limitation of SPIN. The polarity-related morphological features of this neuron were very different from those of the training neurons and caused a complete failure when using the trained classifier (Fig. 10e). For this neuron (MED training neuron #10), substructure #2 was larger than substructure #1 in both “summation of segments length” (indicating dendrites) and “path length to soma” (indicating axons). However, because the weight of “path length to soma” was much larger in magnitude than that of “summation of segments length”, the former feature dominated. Therefore, substructure #2 was identified as axons and substructure #1 as dendrites. However, a careful inspection of the raw image revealed the opposite.

In summary, when the data quality was near-ideal, the average terminal-level accuracy could reach 90 % or higher. Although there were some misidentifications, most errors were successfully detected by the warning system.

Exchange of Discriminant Axes

So far, we had supplied SPIN with a brain-region-specific classifier to ensure optimal performance; polarity identification for MED (or PB) neurons was performed using a classifier trained by MED (or PB) neurons. However, for a large-scale identification project involving dozens or more brain regions/neuropils, such an approach would be very time consuming because it would require manual preprocessing of training neurons for each brain region/neuropil. Therefore, we examined whether the classifier trained for a specific brain region can also be applied to other regions while maintaining satisfactory identification accuracy. Here we conducted a simple test by exchanging the classifiers used in the polarity identification stage so that SPIN performed identifications for MED (or PB) neurons using the PB-trained (or MED-trained) classifier. We found that such an exchange only resulted in small decreases (less than 10 %) in the terminal-level accuracies for both neuropils (Table 5). In particular, the classifier trained by PB neurons still produced terminal-level accuracies of 91–96 % when applied to MED neurons. This test, though far from being thorough, suggested that the combination of morphological features selected for PB neurons would more suitably serve as general features for identifying neuronal polarities than those selected for MED neurons.

Neurite Diameter

It has been reported that typical axonal neurites tend to be constant in diameter, while dendritic neurites are significantly tapered at the ends (Hanesch et al. 1989; Rolls 2011). Therefore, we asked an important question: is information about the neurite diameter alone sufficient for polarity identification, so that no other morphological features are needed?

Table 3 The substructure-level accuracies based on different decision criteria for PB and MED training neurons. The substructure-level accuracy indicates the percentage of the substructures with their polarities being correctly classified using the given decision criteria

| Decision criteria | PB | MED |
|---|---|---|
| Universal boundary | 96 % | 93 % |
| Neuron-specific boundary | 96 % | 96 % |

Table 4 Terminal-level accuracy under different test conditions

| Data type | Condition | PB, original | PB, cleaned | MED, original | MED, cleaned |
|---|---|---|---|---|---|
| Training | All neurons (raw performance) | 84 % | 90 % | 89 % | 91 % |
| Training | Unwarned neurons (ideal performance) | 87 % | 93 % | 91 % | 94 % |
| Test | All neurons (raw performance) | 87 % | 92 % | 92 % | 94 % |
| Test | Unwarned neurons (ideal performance) | 90 % | 96 % | 94 % | 98 % |

In order to address the question, we first tested the same set of PB and MED neurons. We reconstructed these neurons using Vaa3D-Neuron 2.0 (Peng et al. 2010) with the module APP2 (Xiao and Peng 2013) and extracted the neurite radius of every node from the reconstructed neurons. For each dendritic and axonal substructure in a neuron, we individually tested three radius-related features: 1) mean radius at the terminal points, 2) mean radius of all branches and 3) mean radius difference between each terminal point and the root of a substructure. Based on the general observations mentioned above, we expected that the axonal substructures should be larger in features 1 and 2 but smaller in feature 3. In fact, feature 1 worked in 82 % of MED neurons but only in 50 % of PB neurons. Feature 3 worked only in 69 and 42 % of the MED and PB neurons, respectively. For feature 2, we observed mixed results: 78 % of the MED neurons showed a larger mean radius in dendritic than in axonal substructures (contrary to our expectation), while the same trend was observed in only 56 % of the PB neurons (Fig. 11a–f).
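The three radius-related features can be computed along the following lines. This is a rough sketch under several assumptions: the function name and the node-id-to-radius dictionary are hypothetical, feature 2 is approximated by the mean over all nodes of the substructure, and feature 3 is taken as root radius minus terminal radius, so that a strongly tapered substructure yields a large value. The text does not specify these details.

```python
from statistics import mean


def radius_features(radius, terminals, root):
    """Return (feature 1, feature 2, feature 3) for one substructure.

    radius    : dict mapping node id -> neurite radius
    terminals : ids of the terminal points of the substructure
    root      : id of the root node of the substructure
    """
    f1 = mean(radius[t] for t in terminals)                # mean terminal radius
    f2 = mean(radius.values())                             # mean radius, all nodes (ASSUMED)
    f3 = mean(radius[root] - radius[t] for t in terminals) # taper, root minus terminal (ASSUMED)
    return f1, f2, f3


# Two invented substructures: one nearly constant in diameter, one tapered.
constant_like = {1: 1.0, 2: 1.0, 3: 0.9, 4: 0.9}
tapered_like = {1: 1.0, 2: 0.6, 3: 0.2, 4: 0.2}

features_constant = radius_features(constant_like, terminals=[3, 4], root=1)
features_tapered = radius_features(tapered_like, terminals=[3, 4], root=1)
print(features_constant, features_tapered)
```

Under the expectation stated above, the substructure with nearly constant diameter would be the axonal candidate: it is larger in features 1 and 2 and smaller in feature 3.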

The result was quite surprising because radius information was not as useful as one might expect. However, this result is understandable if we consider the nature of fluorescent imaging and how it affects the thickness of reconstructed neurites. We come back to this issue in the Discussion. Here we further asked: if accurately reconstructed volumetric data of neurons are available, is the information about the neurite diameter better than other morphological features in polarity identification?

Fig. 7 Discriminant scores and terminal-level accuracy of PB training (top) and test (bottom) neurons. Blue circles and green crosses indicate the substructures classified by SPIN as dendrites or axons, respectively. The accuracy is shown under two conditions: raw performance based on the original tracing-line data (solid red lines) and improved performance based on clean data with artificial branches removed (dashed red lines). Some of the poorly identified neurons can be detected by the warning system as indicated by red crosses and diamonds. Individual decision boundaries are represented by black asterisks

In order to answer the question, we analyzed the reconstructed blowfly data downloaded from neuromorpho.org (Ascoli et al. 2007). The reconstruction can be considered as “nearly perfect” because the image processing and reconstruction were done with heavy manual fine-tuning and correction (Borst and Haag 1996; Cuntz et al. 2008; Strausfeld and Hausen 1977). Furthermore, some of the neurons were prepared by silver intensified cobalt sulphide impregnation (Borst and Haag 1996; Strausfeld and Hausen 1977), which created homogeneous staining and sharp images. Our analysis showed that most of the neurons had radius-related features that correlated with neuronal polarity as we expected: axonal substructures were larger in features 1 and 2 but smaller in feature 3. These correlations held for 86–88 % of the blowfly sample neurons (Fig. 11g–i).

The above tests were done without using SPIN. Next, we tested whether SPIN performs better on the blowfly neuron samples when the information about the neurite radius is included during the training stage. The parameters of the morphological clustering were adjusted accordingly (see Online Resource Section D). We added the three radius-related features to SPIN. After retraining the classifier for the blowfly neurons, the best feature selected by SPIN was the mean volume of Voronoi pieces, and none of the three radius-related features were selected. Using this customized classifier (Blowfly classifier), the terminal-level accuracy was 98 % (Fig. 12a). For comparison, the terminal-level accuracy obtained by forcing SPIN to use the three radius-related features exclusively was only 90 % (Fig. 12b). Considering that these blowfly neuron samples were also from the visual system and might share the same features as the Drosophila MED neurons we tested, we applied the MED classifier to the blowfly neurons, and SPIN achieved a similar accuracy of 98 % (Fig. 12c). The result demonstrated that, at least for the blowfly neurons we tested, the radius-related features performed well, but not the best. A better accuracy could be achieved if we supplied SPIN with all available features and allowed SPIN to select features using its machine-learning mechanism. We noted that a large number of type I warnings were given by SPIN. They were due to the small deviation in the discriminant scores of the dendritic substructures in the training neurons. As a result, the z-scores of the identified dendritic substructures tended to be large and the type I warnings were often triggered.

Fig. 8 Discriminant scores and terminal-level accuracy as in Fig. 7 but for MED training (top) and test (bottom) neurons

Discussion

We proposed a semi-automatic procedure for neuronal polarity identification and demonstrated the possibility of identifying polarity solely based on the reconstructed tracing-line data. Using a classifier trained by neurons with known polarity, the raw accuracy at the terminal level was around 84–92 %. We also developed an effective warning system which detected possible misidentifications and increased the accuracy to 87–94 %. Furthermore, if the data quality was improved, the accuracy could be boosted to 93–98 %.

SPIN adopts a modular design and many of its components can be replaced by different algorithms to improve performance without affecting other parts of the system. Here we discuss how each component can be improved:

1. Classifier training/Feature extraction: SPIN uses the TREES toolbox and only extracts morphological features that are available in Flycircuit. In fact, any feature can be added into the feature pool at this stage (Billeci et al. 2013; Scorcioni et al. 2008). For example, if a dataset carries information about the thickness of the neurites, radius-related features can be included here.

2. Classifier training/Feature selection: SPIN searches every possible feature combination and uses KNNC and LOO to evaluate the correlations between each specific feature subset and the polarities of the training neurons. This is the most time-consuming step of SPIN and can take several hours on a personal computer. A refined search algorithm may greatly reduce computation time.

3. Classifier training/Discriminant axis & decision boundary determination: SPIN uses linear discriminant analysis to extract the discriminant features and selects only one optimal discriminant axis. A refined algorithm that uses more than one axis or applies a nonlinear decision boundary may further improve the performance of the classifier.

4. Polarity identification/Morphological clustering: this is probably the most crucial component that needs improvement. The clustering algorithm is currently based on six tunable parameters (Online Resource Section D) that are determined empirically. Some of the parameters needed to be changed across brain regions/neuropils in order to obtain optimal results. This prevents SPIN from becoming a fully automatic system. There are at least three ways to improve the current algorithm. 1) Adaptive parameter setting. Instead of manually setting a set of parameters for the sample neuron population (which can be diverse in morphology), an algorithm that quickly evaluates the morphology of each neuron and automatically sets suitable clustering parameters specific for the neuron can be implemented. 2) Clustering result evaluation. In SPIN, the result of morphological clustering is directly passed on to the classifier. An evaluation system can be implemented so that a neuron can be re-clustered using different parameters if the clustering result does not pass certain criteria, e.g., certain numbers of substructures or certain shapes of the substructures. 3) Inclusion of neuropil information. Morphological clustering can be assisted by considering information about neuropils. For a neuron that projects to several neuropils, branches tend to have the same polarity if they innervate the same neuropils. Therefore, neuropil boundaries can be used as references for morphological clustering.

Fig. 9 Sample results of correct polarity identification. a PB neurons b MED neurons. The color of the backbone indicates the true polarity, while the color of the dots represents the polarity identified by SPIN

Fig. 10 Sample results of incorrect polarity identification (a–c) The influence of incorrect morphological clustering. a The inclusion of one artificial branch (in the red box) changed the characteristics of the substructure in yellow (left). As a result, the altered substructure was incorrectly identified as an axon (middle). After manually removing the artificial branch (right), the polarity of the middle substructure was correctly identified. b This neuron was divided into too many substructures because of its atypical morphology. c The actual dendrites (in the red box) were treated as noise by SPIN and were eliminated during artificial branch removal. The remaining two substructures possessed similar features but were forced to be assigned as an axon and a dendrite. d Two of the substructures did not possess clear features and were misclassified. This type of error can usually be detected by the warning system. e The features of these substructures were atypical (different from those of most training neurons) and almost all of the branches were misclassified

Table 5 Terminal-level accuracies under different test conditions (same as in Table 4) with classifier exchanged. PB (or MED) neurons were classified by the MED-trained (or PB-trained) classifier

| Data type | Condition | PB, original | PB, cleaned | MED, original | MED, cleaned |
|---|---|---|---|---|---|
| Training | All neurons (raw performance) | 78 % | 85 % | 85 % | 91 % |
| Training | Unwarned neurons (ideal performance) | 77 % | 85 % | 87 % | 92 % |
| Test | All neurons (raw performance) | 84 % | 90 % | 92 % | 94 % |
| Test | Unwarned neurons (ideal performance) | 91 % | 88 % | 95 % | 98 % |

Finally, the quality of image data reconstruction is also crucial. In some cases, the reconstructed neurons lost a significant portion of their arbors due to insufficient resolution under optical microscopy. This type of data defect causes misidentification that cannot be compensated for by any improvement in SPIN algorithms. SPIN is also affected by artificial branches produced from rough image boundaries and by inaccurate branch connections within dense clusters. Although SPIN allows the users to remove artificial branches, caution should be taken with such operations because the neurons may be “shaped” into what the users expect rather than what is actually present in the raw images. Therefore, improving image quality and data reconstruction accuracy are the most effective ways to increase SPIN’s performance. In addition to improving data quality, pre-classifying neurons into several morphological types in advance (Cuntz et al. 2008; Luczak 2010; Wang and Liao 2011; Wichterle et al. 2013) and feeding SPIN with training and test neurons of the same types can also efficiently improve the accuracy of polarity identification. Specifically, analysis of clonal development may provide useful information in this regard because neurons belonging to the same lineage exhibit similar neurite trajectories (Ito et al. 2013; Yu et al. 2013). In addition, information about clonal development can also be used to evaluate the identification results because neurons within a lineage tend to possess similar patterns of polarity.

Regarding the diameter of neurites, why is such information not as useful as we expected? We argue that the accuracy of the reconstructed neurite diameter critically depends on the image quality and staining techniques. It has been reported that, compared with a Golgi-stained image, a neuron may appear different in the terminal regions of fine branches if the neuron is stained by intracellularly filled dye (Heinze and Homberg 2008; Müller et al. 1997). For fluorescent images that use genetically encoded fluorescent proteins, the image quality is influenced by the inhomogeneous distribution of fluorescent proteins. Furthermore, some fine structures such as densely distributed thin neurites might not be accurately reconstructed due to the “glowing halo effect” of light-emitting objects, which inflates the image. As a consequence, closely spaced fine branches may turn into a single thick branch after reconstruction. In contrast, some of the skeleton-based morphological features used in SPIN are less affected by the distorted fine structures in the reconstructed neurons.

Fig. 11 Examination of three radius-related features in PB (a–c), MED (d–f) and blowfly (g–i) neurons. We tested whether any of the three features can be used to distinguish axonal substructures from dendritic ones. a, d, g Mean terminal point radius. Error rates (number of errors/total number of neurons) are 25/50, 10/55, and 8/56 for PB, MED and blowfly neurons, respectively. b, e, h Mean substructure radius. Error rate: 22/50, 12/55, 7/56. c, f, i Mean radius difference between terminal point and root. Error rate: 29/50, 17/55, 8/56. Error cases are indicated by the vertical red lines. The values of each feature are normalized individually by the maximum value of the neuron

Although it has been doubted whether distinct polarity is present in insect neurons, recent studies have shown that many Drosophila neurons possess clearly identifiable polarities (Rolls 2011; Rolls et al. 2007). Several recent studies have also reported well-labeled presynaptic and postsynaptic domains (Ito et al. 2013; Lin et al. 2013a) in the projection neurons (neurons that innervate multiple neuropils), which account for ~74 % of neurons in the Drosophila brain (Chiang et al. 2011). Hence the idea that most of the Drosophila neurons are polarized is well supported. As for the local neurons that account for the remaining ~26 % of neurons, some of them are characterized by co-localized presynaptic and postsynaptic terminals and therefore do not possess clearly separated axonal and dendritic domains (Chou et al. 2010). Some others have been identified as endowed with segregated axonal and dendritic domains (Lin et al. 2013a). Therefore, we believe that SPIN should work with most neurons, in particular the projection neurons, in Drosophila or in other insects.

The purpose of identifying polarity is to determine the directions of information flow, which are the most important pieces of information in constructing a connectome for a nervous system. Although a terminal-level accuracy of ~90 % in SPIN may not sound impressive, it can lead to high precision in predicting information flow. For example, if we identify that two neurons in our database potentially form a synapse, the direction of signal flow can be predicted by checking the polarity of the two neurons at the contact point (the potential synapse). The prediction will only be incorrect (a reversed direction) if the polarities identified by SPIN are wrong on the presynaptic and postsynaptic sides simultaneously. Assuming a 90 % terminal-level accuracy (an error rate of 10 %) on each side and independent errors on the two sides, the probability of predicting a reversed signal flow (incorrect polarity identification on both sides) is only 0.1 × 0.1 = 1 %. There is an 18 % probability (2 × 0.9 × 0.1) of conflicting polarity (both sides predicted to be axons, or both predicted to be dendrites); such cases can be detected automatically by computer programs and sent back to SPIN for re-evaluation.
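The arithmetic behind the 1 % and 18 % figures can be reproduced with a short sketch. It is written in Python for illustration only (SPIN itself is distributed as Matlab code) and assumes that polarity errors on the two sides of a contact are independent and share the same terminal-level accuracy. The function name `flow_prediction_probabilities` and the outcome labels are hypothetical and are not part of the SPIN package.

```python
def flow_prediction_probabilities(accuracy: float) -> dict:
    """Outcome probabilities for one putative synapse between two neurons.

    Assumption (not verified here): the polarity call on each side of the
    contact is correct with probability `accuracy`, independently of the
    call on the other side.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie between 0 and 1")
    error = 1.0 - accuracy
    return {
        # Both sides correct: the predicted direction of flow is right.
        "correct": accuracy * accuracy,
        # Exactly one side wrong: both sides carry the same label
        # (axon-axon or dendrite-dendrite), a detectable conflict.
        "conflict": 2.0 * accuracy * error,
        # Both sides wrong: the predicted direction is silently reversed.
        "reversed": error * error,
    }


probs = flow_prediction_probabilities(0.90)
for outcome, p in probs.items():
    print(f"{outcome}: {p:.2%}")
```

With an accuracy of 0.90 this prints 81.00 % for a correct prediction, 18.00 % for a detectable conflict, and 1.00 % for a reversed flow. If errors on the two sides were correlated, for example because both terminals lie in the same poorly reconstructed region, these numbers would no longer apply.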

SPIN can be further generalized so that it can be used to identify not only the axonal and dendritic domains, but also other common structural features. The generalization can be done mainly by modifying the input and output components of SPIN. For the input component, the class indices will not be axon and dendrite but classes specified by the user; for the output component, rather than reporting a substructure’s identity (axon or dendrite), a number that indicates the similarity between the samples and the user-specified classes should be reported. Therefore, if a user has a set of neurons and knows that substructures of these neurons have some functionality in common, the generalized SPIN can test whether there is any morphological feature or feature combination that can be used to predict the functions, and use this information to identify similar substructures in another set of test neurons. This generalized SPIN function could also allow users to investigate whether substructures characterized by common morphological features may be endowed with common computational features. For example, can polar and apolar local neurons be distinguished by their morphological features?

In conclusion, the proposed SPIN system is significant in several aspects: 1) It provides not only a quick way to identify neuronal polarities by considering multiple morphological features, but also a tool to investigate the causal links between morphological features and polarities. 2) The modularized design of the SPIN workflow makes it possible to reconfigure or replace each component with better algorithms. Therefore, SPIN can easily be generalized and adapted for various databases and sample conditions. 3) After C. elegans (Varshney et al. 2011; White et al. 1986), Drosophila is likely to be the next species with the full connectome mapped out within the next few years (Alivisatos et al. 2012; Chiang et al. 2011). SPIN represents an important step toward this goal by providing a pipeline to quickly and semi-automatically identify polarity for neurons in a large-scale database such as Flycircuit (Chiang et al. 2011).

Information Sharing Statement

SPIN is an open-source software package available for download in the form of Matlab code at http://life.nthu.edu.tw/~lablcc/SPIN. SPIN contains codes from the TREES toolbox (available at http://www.treestoolbox.org/) and the machine learning toolbox (available at http://neural.cs.nthu.edu.tw/jang/matlab/toolbox/machineLearning/). Select3d (available at http://www.codeforge.com/article/126126), uibutton (available at http://www.mathworks.com/matlabcentral/fileexchange/10743-uibutton-gui-pushbuttons-with-better-labels/content/uibutton.m) and mtit (available at http://www.mathworks.com/matlabcentral/fileexchange/3218-mtit-a-pedestrian-major-title-creator/content/mtit.m) are also included for developing the GUI or facilitating plot display. The SPIN software package also contains data of sample neurons with skeletal data available from the Flycircuit database (http://www.flycircuit.tw/). SPIN is released under the GNU General Public License (GPLv3). The license mechanism for third party tools is not restricted by the SPIN license.

Fig. 12 Discriminant scores and terminal-level accuracy as in Fig. 7 but for the blowfly neurons with three different classifiers. a The blowfly classifier. SPIN was retrained based on morphological features extracted from 8 randomly selected blowfly neurons. All but one neuron (#47) achieved very high (>90 %) terminal-level accuracy. b The radius classifier. We only supplied SPIN with three radius-related features for training. The same eight neurons as in (a) were selected for training. The resulting classifier did not perform as well as the other two classifiers. The terminal-level accuracy was below 60 % for 4 out of 48 test neurons. c The MED classifier. We skipped the training stage and directly used the classifier trained by MED neurons of Drosophila to identify the polarity of the 48 blowfly test neurons. The MED classifier performed as well as the blowfly classifier. Individual decision boundaries are represented by black asterisks

Acknowledgments This work is supported by the National Science Council grants #NSC 101-2311-B-007-008-MY3 and Free Excellent Projects, and by the Aim for the Top University Project of the Ministry of Education, Taiwan. We thank the National Center for High-performance Computing for providing the Flycircuit data, and Drs. Ann-Shyn Chiang and Hsiu-Ming Chang for helpful discussion. We also thank Dr. Chih-Yung Lin for providing PB data.

Conflict of Interest The authors declare that they have no conflict of interest.

References

Alivisatos, A. P., Chun, M., Church, G. M., Greenspan, R. J., Roukes, M. L., & Yuste, R. (2012). The brain activity map project and the challenge of functional connectomics. Neuron, 74(6), 970–974. doi:10.1016/j.neuron.2012.06.006.

Ascoli, G. A., Donohue, D. E., & Halavi, M. (2007). NeuroMorpho.Org: a central resource for neuronal morphologies. The Journal of Neuroscience, 27(35), 9247–9251. doi:10.1523/JNEUROSCI.2055-07.2007.

Bas, E., & Erdogmus, D. (2011). Principal curves as skeletons of tubular objects: locally characterizing the structures of axons. Neuroinformatics, 9(2–3), 181–191. doi:10.1007/s12021-011-9105-2.

Billeci, L., Magliaro, C., & Ahluwalia, A. (2013). NEuronMOrphological analysis tool: open-source software for quantitative morphometrics. Frontiers in Neuroinformatics, 7, 2. doi:10.3389/fninf.2013.00002.

Borst, A., & Haag, J. (1996). The intrinsic electrophysiological characteristics of fly lobula plate tangential cells: I. Passive membrane properties. Journal of Computational Neuroscience, 3(4), 313–336. doi:10.1007/BF00161091.

Brown, K. M., Barrionuevo, G., Canty, A. J., Paola, V., Hirsch, J. A., Jefferis, G. S. X. E., et al. (2011). The DIADEM data sets: representative light microscopy images of neuronal morphology to advance automation of digital reconstructions. Neuroinformatics, 9(2–3), 143–157. doi:10.1007/s12021-010-9095-5.

Campagne, M. V. L., Oestreicher, A. B., Henegouwen, P. M. P. V. B. E., & Gispen, W. H. (1990). Ultrastructural double localization of B-50/GAP43 and synaptophysin (p38) in the neonatal and adult rat hippocampus. Journal of Neurocytology, 19(6), 948–961. doi:10.1007/BF01186822.

Cannon, R., Turner, D., Pyapali, G., & Wheal, H. (1998). An on-line archive of reconstructed hippocampal neurons. Journal of Neuroscience Methods, 84(1–2), 49–54. doi:10.1016/S0165-0270(98)00091-0.

Chiang, A.-S., Lin, C.-Y., Chang, H.-M., Hsieh, C.-H., Yeh, C.-W., & Hwang, J.-K. (2011). Three-dimensional reconstruction of brain-wide wiring networks in Drosophila at single-cell resolution. Current Biology, 21(1), 1–11. doi:10.1016/j.cub.2010.11.056.

Chothani, P., Mehta, V., & Stepanyants, A. (2011). Automated tracing of neurites from light microscopy stacks of images. Neuroinformatics, 9(2–3), 263–278. doi:10.1007/s12021-011-9121-2.

Chou, Y.-H., Spletter, M. L., Yaksi, E., Leong, J. C. S., Wilson, R. I., & Luo, L. (2010). Diversity and wiring variability of olfactory local interneurons in the Drosophila antennal lobe. Nature Neuroscience, 13(4), 439–449. doi:10.1038/nn.2489.

Craig, A. M., & Banker, G. (1994). Neuronal polarity. Annual Review of Neuroscience, 17(1), 267–310. doi:10.1146/annurev.ne.17.030194.001411.

Cuntz, H., Forstner, F., Haag, J., & Borst, A. (2008). The morphological identity of insect dendrites. PLoS Computational Biology, 4(12), e1000251. doi:10.1371/journal.pcbi.1000251.

Cuntz, H., Forstner, F., Borst, A., & Häusser, M. (2010). One rule to grow them all: a general theory of neuronal branching and its practical application. PLoS Computational Biology, 6(8), e1000877. doi:10.1371/journal.pcbi.1000877.

Donohue, D. E., & Ascoli, G. A. (2011). Automated reconstruction of neuronal morphology: an overview. Brain Research Reviews, 67(1–2), 94–102. doi:10.1016/j.brainresrev.2010.11.003.

Duchene, J., & Leclercq, S. (1988). An optimal transformation for discriminant and principal component analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 10(6), 978–983. doi:10.1109/34.9121.

Feinberg, E. H., VanHoven, M. K., Bendesky, A., Wang, G., Fetter, R. D., Shen, K., et al. (2008). GFP reconstitution across synaptic partners (GRASP) defines cell contacts and synapses in living nervous systems. Neuron, 57(3), 353–363. doi:10.1016/j.neuron.2007.11.030.

Fischbach, P. K.-F., & Dittrich, A. P. M. (1989). The optic lobe of Drosophila melanogaster. I. A Golgi analysis of wild-type structure. Cell and Tissue Research, 258(3), 441–475. doi:10.1007/BF00218858.

Gillette, T. A., Brown, K. M., & Ascoli, G. A. (2011). The DIADEM metric: comparing multiple reconstructions of the same neuron. Neuroinformatics, 9(2–3), 233–245. doi:10.1007/s12021-011-9117-y.

Glaser, J. R., & Glaser, E. M. (1990). Neuron imaging with Neurolucida – a PC-based system for image combining microscopy. Computerized Medical Imaging and Graphics: the Official Journal of the Computerized Medical Imaging Society, 14(5), 307–317.

Gordon, M. D., & Scott, K. (2009). Motor control in a Drosophila taste circuit. Neuron, 61(3), 373–384. doi:10.1016/j.neuron.2008.12.033.

Hanesch, U., Fischbach, K.-F., & Heisenberg, M. (1989). Neuronal architecture of the central complex in Drosophila melanogaster. Cell and Tissue Research, 257(2), 343–366. doi:10.1007/BF00261838.

Heinze, S., & Homberg, U. (2008). Neuroarchitecture of the central complex of the desert locust: intrinsic and columnar neurons. The Journal of Comparative Neurology, 511(4), 454–478. doi:10.1002/cne.21842.

Ikeno, H., Kanzaki, R., Aonuma, H., Takahata, M., Mizunami, M., Yasuyama, K., et al. (2008). Development of invertebrate brain platform: Management of research resources for invertebrate neuroscience and neuroethology. In M. Ishikawa, K. Doya, H. Miyamoto, & T. Yamakawa (Eds.), Neural information processing (pp. 905–914). Springer: Berlin. Retrieved from http://link.springer.com/chapter/10.1007/978-3-540-69162-4_94.


Ito, M., Masuda, N., Shinomiya, K., Endo, K., & Ito, K. (2013). Systematic analysis of neural projections reveals clonal composition of the Drosophila brain. Current Biology, 23(8), 644–655. doi:10.1016/j.cub.2013.03.015.

Jang, J.-S. R. (2012). Machine Learning Toolbox. http://mirlab.org/jang/matlab/toolbox/machineLearning. Accessed 12 June 2012.

Lee, P.-C., Chuang, C.-C., Chiang, A.-S., & Ching, Y.-T. (2012). High-throughput computer method for 3D neuronal structure reconstruction from the image stack of the Drosophila brain and its applications. PLoS Computational Biology, 8(9), e1002658. doi:10.1371/journal.pcbi.1002658.

Lin, C.-Y., Chuang, C.-C., Hua, T.-E., Chen, C.-C., Dickson, B. J., Greenspan, R. J., et al. (2013a). A comprehensive wiring diagram of the protocerebral bridge for visual information processing in the Drosophila brain. Cell Reports, 3(5), 1739–1753. doi:10.1016/j.celrep.2013.04.022.

Lin, H.-H., Chu, L.-A., Fu, T.-F., Dickson, B. J., & Chiang, A.-S. (2013b). Parallel neural pathways mediate CO2 avoidance responses in Drosophila. Science, 340(6138), 1338–1341. doi:10.1126/science.1236693.

Luczak, A. (2010). Measuring neuronal branching patterns using model-based approach. Frontiers in Computational Neuroscience, 4, 10. doi:10.3389/fncom.2010.00135.

Matus, A., Bernhardt, R., & Hugh-Jones, T. (1981). High molecular weight microtubule-associated proteins are preferentially associated with dendritic microtubules in brain. Proceedings of the National Academy of Sciences of the United States of America, 78(5), 3010–3014.

Müller, M., Homberg, U., & Kühn, A. (1997). Neuroarchitecture of the lower division of the central body in the brain of the locust (Schistocerca gregaria). Cell and Tissue Research, 288(1), 159–176.

Parekh, R., & Ascoli, G. A. (2013). Neuronal morphology goes digital: a research hub for cellular and system neuroscience. Neuron, 77(6), 1017–1038. doi:10.1016/j.neuron.2013.03.008.

Pastrana, E. (2013). Focus on mapping the brain. Nature Methods, 10(6), 481. doi:10.1038/nmeth.2509.

Peng, H., Ruan, Z., Long, F., Simpson, J. H., & Myers, E. W. (2010). V3D enables real-time 3D visualization and quantitative analysis of large-scale biological image data sets. Nature Biotechnology, 28(4), 348–353. doi:10.1038/nbt.1612.

Robinson, I. M., Ranjan, R., & Schwarz, T. L. (2002). Synaptotagmins I and IV promote transmitter release independently of Ca2+ binding in the C2A domain. Nature, 418(6895), 336–340. doi:10.1038/nature00915.

Rolls, M. M. (2011). Neuronal polarity in Drosophila: sorting out axons and dendrites. Developmental Neurobiology, 71(6), 419–429. doi:10.1002/dneu.20836.

Rolls, M. M., Satoh, D., Clyne, P. J., Henner, A. L., Uemura, T., & Doe, C. Q. (2007). Polarity and intracellular compartmentalization of Drosophila neurons. Neural Development, 2(1), 7. doi:10.1186/1749-8104-2-7.

Scorcioni, R., Polavaram, S., & Ascoli, G. A. (2008). L-Measure: a web-accessible tool for the analysis, comparison and search of digital reconstructions of neuronal morphologies. Nature Protocols, 3(5), 866–876. doi:10.1038/nprot.2008.51.

Squire, L. R., Berg, D., Bloom, F., Lac, S. du, & Ghosh, A. (2008). Subcellular organization of the nervous system: organelles and their functions. In Fundamental Neuroscience (3rd ed., pp. 59–86). Amsterdam; Boston: Elsevier/Academic Press.

Strausfeld, N. J., & Hausen, K. (1977). The resolution of neuronal assemblies after cobalt injection into neuropil. Proceedings of the Royal Society of London. Series B: Biological Sciences, 199(1136), 463–476. doi:10.1098/rspb.1977.0154.

Takemura, S., Bharioke, A., Lu, Z., Nern, A., Vitaladevuni, S., Rivlin, P. K., et al. (2013). A visual motion detection circuit suggested by Drosophila connectomics. Nature, 500(7461), 175–181. doi:10.1038/nature12450.

Türetken, E., González, G., Blum, C., & Fua, P. (2011). Automated reconstruction of dendritic and axonal trees by global optimization with geometric priors. Neuroinformatics, 9(2–3), 279–302. doi:10.1007/s12021-011-9122-1.

Varshney, L. R., Chen, B. L., Paniagua, E., Hall, D. H., & Chklovskii, D. B. (2011). Structural properties of the Caenorhabditis elegans neuronal network. PLoS Computational Biology, 7(2), e1001066. doi:10.1371/journal.pcbi.1001066.

Wang, T., & Liao, D. (2011). Neuronal morphology classification based on SVM. In Computer Science and Service System (CSSS), 2011 International Conference on (pp. 3344–3347). doi:10.1109/CSSS.2011.5972187.

Wang, Y., Narayanaswamy, A., Tsai, C.-L., & Roysam, B. (2011b). A broadly applicable 3-D neuron tracing method based on open-curve snake. Neuroinformatics, 9(2–3), 193–217. doi:10.1007/s12021-011-9110-5.

Wang, J., Ma, X., Yang, J. S., Zheng, X., Zugates, C. T., Lee, C.-H. J., et al. (2004). Transmembrane/juxtamembrane domain-dependent dscam distribution and function during mushroom body neuronal morphogenesis. Neuron, 43(5), 663–672. doi:10.1016/j.neuron.2004.06.033.

White, J. G., Southgate, E., Thomson, J. N., & Brenner, S. (1986). The structure of the nervous system of the nematode Caenorhabditis elegans. Philosophical Transactions of the Royal Society of London B. Biological Sciences, 314(1165), 1–340. doi:10.1098/rstb.1986.0056.

Whitney, A. W. (1971). A direct method of nonparametric measurement selection. IEEE Transactions on Computers, C-20(9), 1100–1103. doi:10.1109/T-C.1971.223410.

Wichterle, H., Gifford, D., & Mazzoni, E. (2013). Mapping neuronal diversity one cell at a time. Science, 341(6147), 726–727. doi:10.1126/science.1235884.

Xiao, H., & Peng, H. (2013). APP2: automatic tracing of 3D neuron morphology based on hierarchical pruning of a gray-weighted image distance-tree. Bioinformatics (Oxford, England), 29(11), 1448–1454. doi:10.1093/bioinformatics/btt170.

Yu, H.-H., Awasaki, T., Schroeder, M. D., Long, F., Yang, J. S., He, Y., et al. (2013). Clonal development and organization of the adult Drosophila central brain. Current Biology, 23(8), 633–643. doi:10.1016/j.cub.2013.02.057.

Zhao, T., Xie, J., Amat, F., Clack, N., Ahammad, P., Peng, H., et al. (2011). Automated reconstruction of neuronal morphology based on local geometrical and global structural models. Neuroinformatics, 9(2–3), 247–261. doi:10.1007/s12021-011-9120-3.
