Computer Vision News · April 2017


Research Paper: YOLO9000: Better, Faster, Stronger Real-Time Object Detection

Project Management: Computer Vision Software Projects

Guests: Arthur Chan - Waikit Lau, AIDL - Artificial Intelligence and Deep Learning Facebook Group

The magazine of the algorithm community

A publication by

Project: Navigation Systems for Orthopedic Surgery

Spotlight News

Tool: SIFT3D - SIFT in 3D

April 2017

Women in Computer Vision:

Tal Arbel

Application: Riverain Technologies - Lung disease detection

DAVIS Challenge: Video Object Segmentation

We Tried for You: Cyclops - Video-conferencing with Computer Vision and AR

Events: Vision Monday Leadership Summit


Computer Vision News

Read This Month

03 - Editorial by Ralph Anzarouth
04 - Application of the Month: Riverain Technologies
08 - Tool of the Month: SIFT3D - SIFT in 3D
13 - Spotlight News: From elsewhere on the Web
14 - Women in Computer Vision: Tal Arbel - McGill University
19 - Guests: Waikit Lau, Arthur Chan, Samson Timoner
25 - We Tried for You: Cyclops - videoconferencing with CV/AR
26 - Challenge: DAVIS - Video Object Segmentation
28 - Project of the Month (RSIP Vision): Navigation Systems for Orthopedic Surgery
30 - Project Management Tip: Computer Vision Software Projects
34 - Research Paper: YOLO9000 Real-Time Object Detection
41 - Computer Vision Events: Vision Monday Leadership Summit, CVVC, calendar of April-June events


Computer Vision News
Editor: Ralph Anzarouth
Publisher: RSIP Vision

Contact us

Give us feedback

Free subscription

Read previous magazines

Copyright: RSIP Vision. All rights reserved. Unauthorized reproduction is strictly forbidden.

Dear reader,

With this April issue of Computer Vision News we start the second year of publication: last month, we weighed the outcomes of this journey, and they are tremendously positive. Now it is time to think about plans for the future. We treasure your feedback: we will build on it to keep giving you fresh, proprietary and original readings, fulfilling the need for a community magazine in a way that (according to what we hear) no other communication tool does. One major example is the on-site coverage of the major conferences, with which we partner to publish the CVPR Daily, the ECCV Daily and the MICCAI Daily. Another example is the free subscription to our magazine, which is offered to all, including every member of your team and of your class.

Last month, we also promised you our first special edition: the challenge we took on ourselves was successful, with the publication of Boston Vision, a magazine that uncovers some of the vibrant vitality animating the image processing and computer vision community in Boston. When you finish reading Computer Vision News, follow the link and read Boston Vision too!

This Computer Vision News of April features (as always) plenty of unique content and great guests for you: you will be fascinated by the work of Riverain Technologies on lung disease; you will be captivated by the interview with Waikit Lau and Arthur Chan, founders of Artificial Intelligence & Deep Learning, the most successful Facebook group in our community; you will be enchanted by Tal Arbel, an impressive woman scientist doing great work with her students in both computer vision and medical image analysis. Don't miss the Tool, the Research, the Project, the Spotlight News, the Events and, last but not least, the Challenge, which this month has been specially written for you by Jordi Pont-Tuset.

We are confident that you will love reading this April magazine. Please continue sharing it with colleagues and friends.

Enjoy the reading!

Ralph Anzarouth
Marketing Manager, RSIP Vision
Editor, Computer Vision News



Riverain Technologies provides software tools to aid clinicians in efficient and effective early detection of lung disease. With about five million deadly cases per year and survival rate being related to the stage in which the disease is diagnosed, early detection is a key priority. In addition to tools to facilitate early detection, Riverain tools allow faster reading, increasingly critical in radiology. We talked about it with CEO Steve Worrell and Chief Science Officer Jason Knapp. Riverain's core competencies are machine learning and image analysis. They are very much dedicated to medical imaging applications, more specifically thoracic imaging in X-ray and CT. They have four FDA-cleared X-ray products and one FDA-cleared CT product they offer to the marketplace.

The CT product they launched this past September uses deep learning as well as many other novel technologies. Their thoracic imaging solutions help radiologists read more efficiently (faster) and more accurately in the detection of lung disease; in particular, the CT product aims at aiding the radiologist in detecting lung cancer while allowing them to read faster.

When asked about what makes their technology stand out, Worrell explains that their solutions have unique attributes regarding both deployment


ClearRead CT’s output for a non-solid nodule: this and the next page’s image show a slice from the original CT series at left, while the corresponding slice from the vessel-suppressed volume is at right. ClearRead CT locates, segments, and makes measurements for each region. It provides the radiologists with the vessel-suppressed series and the automatic segmentation/measurements for concurrent review - that is, they can see content while looking at the original data upon their initial review. The company claims that, to the best of their knowledge, this is the first automatic detection system approved to do so. This is why radiologists can read faster, while finding more suspicious regions.

“We use modeling extensively for constructing training data, which I think is quite unique”


and development: they use modeling (data synthesis) quite extensively as part of the development process. In his own words: “In machine learning applications, you face the challenge of collecting and labeling data in order to train algorithms to recognize things of interest. One of the novel things that we do as part of the development process is, rather than rely exclusively on acquired data, we use modeling extensively for constructing training data, which is quite unique, I think, relative to other companies in the medical imaging space; that’s how we go about the development process from a training perspective, to ensure we have representative data across the spectrum of situations that can arise, whether in the context of the disease itself or the anatomy of the patient.”

Their work involves other novelties deserving to be known, for instance image normalization technology (or domain adaptation, in the machine learning vernacular). One thing that they try to do across all of their products is to build a solution that scales across not just one device, but all devices in a clinical setting. Hospitals may have multiple CT devices in one facility or across a health network, coming from different manufacturers. Furthermore, different sites might have different acquisition protocols. Through the use of image normalization technology, their products work seamlessly across acquisition devices and imaging protocols. This has a lot of practical value, as it provides scalability and ease of installation. They can deploy their solutions across an entire site or (in some countries, like the United States) across a large network of hospitals and health centers through the change of a license file.

The CT product facilitates nodule detection by the reader. A significant impediment to nodule detection in the lungs, not just for machines but also for radiologists, is represented by the vascular structures in the lungs. One of the unique aspects of their product is that it suppresses the vascular structures, with the objective of making lesions more conspicuous and easier to detect by the radiologist. They estimate, for each voxel in the 3D CT scan, what fraction is occupied by a vascular structure versus a nodular structure. It’s a continuous real-value prediction, forgoing the use of explicit segmentation techniques.

Worrell comments: “Through this process, we can achieve the objective of suppressing vascular structures and have it be much more robust than we would if we explicitly segmented the vascular structures, which can be a fairly noisy process.”

When you ask Knapp about the algorithmic techniques used to achieve this task, he jokes: “Well, surprisingly, we’ve found deep learning quite useful. Seriously, though, as Steve suggested, we use a lot of modeling and data synthesis. We generally form a type of forward process, and use deep learning


ClearRead CT’s outputs for a solid nodule

“What you really want is the unusual cases, which can unfortunately be hard to come by”



to learn the inverse process. This design principle is applied again and again throughout our applications. It is remarkably powerful.”

He then adds: “What some people new to the medical domain are finding out is that models built in a lab setting might not work well in a clinical setting, due to the wide range of conditions that exist clinically. This is not a result of poor technique, but a consequence of using models driven by statistical correlations - and acquisition processes, as one example, can have a substantial effect on the nature of these correlations. Dealing with acquisition may not sound very exciting, but it is critical if you want to build clinically viable systems. Successful development really comes down to these types of fundamental issues.”

Knapp explains that one problem is simply statistical size versus sample size: when people collect data sets, they really focus on sample size, not statistical size. “What you really want is the unusual cases, which can unfortunately be hard to come by. This is one of the hardest aspects about medical data: it has a really long tail!”

To that Worrell adds: “That’s what radiologists tend to be really good at: extrapolating to those scenarios that are statistically infrequent, but nonetheless very important from a patient health standpoint.”

And Knapp: “I think this is all tied together. Radiologists draw upon a tremendous amount of abstract knowledge when interpreting medical images, which allows them to reason in low-SNR situations, handle ambiguity effortlessly, utilize causality, and learn from very few examples. And it is not at

all clear how much of this knowledge can be learned just from annotated volumes. For tasks that are largely just pattern recognition, like finding tumors or segmenting organs, I think we [the medical image analysis community] are in good shape, but there is obviously a lot of work to be done. It’s a very exciting time to be working in the field!”

Worrell concludes: “What I personally think is interesting with the products that we’re developing is how the user interacts with the products and how the user perceives the product. The dynamic between the machine and the human is quite interesting. One of the things that Jason and I have learned is that it only takes a couple of bad results to totally turn a user against a machine solution: robustness and predictability of the machine are critically important.”

“It’s a very exciting time to be working in the field!”

An image of those directly involved in the algorithm work (from left to right): Praveen Kakumanu, Steve Worrell, Jason Knapp.


SIFT3D is a 3D extension of SIFT (scale-invariant feature transform). It extracts a robust description of the content of 3D images by leveraging volumetric data to detect keypoints. It can also be used for 3D image registration with the RANSAC algorithm, by matching SIFT3D features and fitting geometric transformations.

SIFT3D is open-source, implemented in C, and includes Matlab (Mex) wrappers.

Identification and extraction of local keypoints is a broad, well-established and popular area of computer vision, and SIFT is one of the best known and most popular methods. Techniques for feature extraction (and SIFT in particular) were developed with 2D images in mind and cannot be trivially extended to 3D images. SIFT3D is a robust tool for extracting a local keypoint description from 3D images; it is particularly useful and relevant in the medical domain, for imaging techniques such as CT and MRI.

In the original 2D SIFT algorithm, described by Lowe in his paper "Distinctive Image Features from Scale-Invariant Keypoints", the process is broadly separated into three main stages:

1. Scale-space detection and keypoint localization: identifies points of interest (which the algorithm terms keypoints) that are independent of scale and location within the image; a stable parabolic model is then fitted to each keypoint.

2. Orientation assignment: assigns one or more orientations to each keypoint. This orientation assignment makes the points invariant with respect to rotation and viewpoint.

3. Keypoint descriptor: here, a gradient is computed for each selected keypoint at each scale. The gradients are transformed into a histogram representation, which enables their quantification and comparison relative to other keypoints.

Next, a detailed explanation of each of the three stages is given, together with the means used to extend and adapt the algorithm so that SIFT3D can handle three-dimensional images.

1. Scale-space detection and keypoint localization: The first stage searches for potential points of interest over all image locations and scales. The keypoints are derived in an efficient manner, identical to that of 2D SIFT: first, the different Gaussian scale-space (GSS) levels are calculated by a Gaussian function with changing variance. Then, the Difference of Gaussians (DoG) is calculated by subtracting consecutive GSS levels, arriving at the local gradient per scale, which we denote as d for that point. The potential keypoints are denoted by d(x, σ), where σ represents the scale-space level and x represents the location. Each point whose d(x, σ) value exceeds the predetermined threshold



α is selected as a keypoint. Finally, as in 2D SIFT, we fit a parabolic model to each keypoint; the model is constructed by interpolating neighboring keypoints.
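The scale-space step above can be sketched in a few lines of Python. This is a hypothetical illustration we wrote, not SIFT3D's actual C implementation: it builds a Gaussian scale space over a 3D volume, subtracts consecutive levels to obtain the DoG, and keeps voxels whose |d(x, σ)| exceeds a threshold (the function name, sigma values and threshold are our own choices).

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_keypoint_candidates(volume, sigmas=(1.0, 1.6, 2.56), alpha=0.1):
    """Build a Gaussian scale space (GSS) over a 3D volume, take the
    difference of consecutive levels (DoG), and keep every voxel whose
    |DoG| value exceeds the threshold alpha, as candidate keypoints."""
    gss = [gaussian_filter(volume.astype(float), s) for s in sigmas]
    candidates = []
    for level, (lo, hi) in enumerate(zip(gss[:-1], gss[1:])):
        d = hi - lo  # DoG at this scale level
        for x in zip(*np.nonzero(np.abs(d) > alpha)):
            candidates.append((x, level, d[x]))
    return candidates
```

A real implementation would additionally check that each candidate is a local extremum among its scale-space neighbors before the parabolic fit.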

2. Orientation assignment: The second stage assigns an orientation to each keypoint derived in stage 1. The orientation for each keypoint is assigned based on image gradient directions. From here on, all operations are performed on keypoints that have undergone transformation based on their location and scale. This process produces keypoints invariant to image rotation.

In two-dimensional SIFT, the orientation angle θ at any keypoint is determined by the gradients around that keypoint, and the rotation matrix R(θ) is derived from θ. However, the orientation is more difficult to arrive at in 3D.

There are several approaches to determining 3D orientation. SIFT3D implements one relatively straightforward approach, calculating the correlation between gradient components in the image using the following formula, referred to as the structure tensor:

K = Σ_{x∈W} ω_x ∇I_x ∇I_x^T

where ∇I_x is the derivative of the image I at location x, and ω_x is the Gaussian weight component for window width W. This structure tensor is real and symmetric, and thus it has an orthogonal eigen-decomposition:

K = QΛQ^T

Matrix Q still doesn't give us an unequivocal orientation, for it is ambiguous as to the direction of change along each axis. To avoid this ambiguity, SIFT3D adds the condition that the derivatives along each axis be positive. For this, each column r_i of the rotation matrix R is calculated as follows:

r_i = sgn(q_i^T d) q_i,   with   d = Σ_{x∈W} ω_x ∇I_x

where q_i is the i-th column of Q.
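To make the eigen-decomposition and sign-disambiguation step concrete, here is a small numpy sketch (a hypothetical helper of our own, not part of SIFT3D): it builds K from the window's gradients, decomposes it, and flips each eigenvector q_i so that r_i = sgn(q_i^T d) q_i agrees with the weighted mean gradient d.

```python
import numpy as np

def orientation_from_structure_tensor(grads, weights):
    """grads: (N, 3) image gradients inside the window W; weights: (N,)
    Gaussian weights. Returns a 3x3 matrix R whose columns are the
    eigenvectors of the structure tensor K, each sign-disambiguated by
    the weighted mean gradient d."""
    # K = sum_x w_x * grad_x * grad_x^T  (real, symmetric)
    K = (weights[:, None, None] * grads[:, :, None] * grads[:, None, :]).sum(0)
    eigvals, Q = np.linalg.eigh(K)        # K = Q diag(eigvals) Q^T
    d = (weights[:, None] * grads).sum(0)  # weighted mean gradient
    signs = np.sign(Q.T @ d)               # sgn(q_i^T d) per column
    signs[signs == 0] = 1.0                # avoid zeroing a column
    return Q * signs                       # r_i = sgn(q_i^T d) q_i
```

Because K is symmetric, Q is orthogonal, and flipping column signs keeps the resulting R orthogonal while removing the direction ambiguity.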

At this point, after the gradients have been calculated for each keypoint at the selected scale, they are transformed into a gradient histogram representation to allow for their quantification and comparison relative to other keypoints.



The gradient histograms represent the distribution of gradient directions in the window around the keypoint. Gradient histograms are considered a stable, reliable representation.

In 2D SIFT, a 10-bin histogram was constructed by dividing the angle space into 10 equal bins, each covering a 36° range. To adapt the histogram to 3D, spherical coordinates were used: the spherical angle space was divided into triangular tiles of identical shape and area, identically arranged in space¹. The old and new tiling divisions of the spherical angle space can be seen in the illustration below:

Old New

As in 2D SIFT, here too, interpolation is conducted between the histogram bins to mitigate the effects of quantization. Specifically, at the selected orientation, interpolation is performed between the vertices of the triangular tile, as presented in the following illustration:

3. Keypoint descriptor: Descriptor selection, too, is similar to regular SIFT. For the 3D case, the descriptor is derived by taking a 3D region surrounding the keypoint and subdividing it into 4x4x4 sub-regions, for each of which a 12-vertex histogram is constructed as described, giving us a feature vector with 4³·12 = 768 dimensions.

As in 2D SIFT, a Gaussian window with parameter σ is used to give features nearer the keypoint a higher weight than those further away. Next, tri-linear interpolation is performed between the eight connecting sub-region centers. Thus, the value added to the bin at vertex v_i is:

f(v_i) = w_win · w_s · λ_i · |∇I|

where w_win is the window width, w_s is the weight assigned to the interpolated sub-region, and λ_i is the barycentric coordinate of ∇I with respect to v_i.
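The bin-update formula is simple enough to sketch directly. The helper below is our own illustrative stand-in (SIFT3D computes the barycentric coordinates itself; here they are passed in): it returns the contribution f(v_i) = w_win · w_s · λ_i · |∇I| for each of the three vertices of a gradient's triangular tile.

```python
import numpy as np

def accumulate_bin_values(grad, lambdas, w_win, w_s):
    """grad: a single gradient vector; lambdas: the 3 barycentric
    coordinates of grad's direction w.r.t. the vertices of its triangular
    tile. Returns f(v_i) = w_win * w_s * lambda_i * |grad| for each vertex,
    i.e. the values added to the corresponding histogram bins."""
    mag = np.linalg.norm(grad)             # |grad I|
    return w_win * w_s * np.asarray(lambdas) * mag
```

Since the barycentric coordinates sum to one, the total mass added per gradient is w_win · w_s · |∇I|, split smoothly across the tile's three vertex bins.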

¹ This is in contradistinction with previous approaches, which divided the space into equal angles along a single plane - forming tiles of different shape and area - creating an unnatural and inappropriate extrapolation of the 2-dimensional histogram.




Finally, the descriptors are L2-normalized, truncated by a constant threshold δ, and normalized again.
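This normalize-truncate-renormalize step can be written in three lines. The snippet below is a generic sketch of the scheme (the function name and default δ are ours), used by SIFT-style descriptors to limit the influence of a few very large gradients:

```python
import numpy as np

def normalize_truncate(desc, delta=0.2):
    """L2-normalize the descriptor, clip each component at delta,
    then renormalize to unit length."""
    v = desc / np.linalg.norm(desc)
    v = np.minimum(v, delta)
    return v / np.linalg.norm(v)
```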

Let's dive in and look at the capabilities of the tool itself. First, you need to get SIFT3D from the website; we highly recommend you download the binary version.

SIFT3D consists of two main functions:
1. kpSift3D - extracts keypoints and descriptors from a single image.
2. regSift3D - extracts matches and a geometric transformation from two images.

Now let's look at two code snippets for these two functions:

In this first, short code snippet, we read the image, call the detectSift3D function to get the keypoints, and the extractSift3D function to get their descriptors.

In our second example we'll show you the use of the features of SIFT3D to register two images. We have full MRI scans and cropped versions of those scans, and we are seeking to precisely locate the cropped MRIs' original position within the full MRI. Obviously, this can be done manually, but it would be a time-consuming and tedious task. We will show you how to easily perform this task using a simple script with SIFT3D. In a nutshell, keypoints are extracted from the two images to be matched, and are then matched using the RANSAC algorithm.
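The matching-plus-RANSAC logic can be sketched in Python. This is a toy stand-in we wrote for illustration, not SIFT3D's regSift3D: it uses a nearest-neighbor ratio test (in the spirit of nnThresh) and a RANSAC loop with an inlier threshold and iteration count, but it fits only a 3D translation rather than a full geometric transformation.

```python
import numpy as np

def match_and_ransac(desc_src, desc_ref, pts_src, pts_ref,
                     nn_thresh=0.8, err_thresh=5.0, num_iter=500, seed=0):
    """Match descriptors with a nearest-neighbor ratio test, then
    RANSAC-fit a 3D translation between the matched keypoint locations.
    Returns (best_translation, inlier_count)."""
    # Ratio-test matching: accept if nearest distance < nn_thresh * second.
    matches = []
    for i, d in enumerate(desc_src):
        dists = np.linalg.norm(desc_ref - d, axis=1)
        j, k = np.argsort(dists)[:2]
        if dists[j] < nn_thresh * dists[k]:
            matches.append((i, j))
    src_idx = [m[0] for m in matches]
    ref_idx = [m[1] for m in matches]
    # RANSAC: sample one match, hypothesize a translation, count inliers.
    rng = np.random.default_rng(seed)
    best_t, best_inliers = None, 0
    for _ in range(num_iter):
        i, j = matches[rng.integers(len(matches))]
        t = pts_ref[j] - pts_src[i]
        errs = np.linalg.norm(pts_src[src_idx] + t - pts_ref[ref_idx], axis=1)
        inliers = int((errs < err_thresh).sum())
        if inliers > best_inliers:
            best_t, best_inliers = t, inliers
    return best_t, best_inliers
```

regSift3D fits a richer transformation from its inlier set, but the pipeline shape (ratio-test matching, hypothesize, count inliers, keep the best model) is the same.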


Let's look at the script's code: the first two lines read the MRI scans, where src is the cropped image and ref is the full image.

Next, we call the registerSift3D function, which takes the following parameters: the ref and src images, each with its units, as received from the imRead3D function.

The function can have the following additional parameters:

nnThresh - the matching threshold, in the interval (0, 1); default: 0.8.
errThresh - the RANSAC inlier threshold, in the interval (0, inf); default: 5.0.
numIter - the number of RANSAC iterations; default: 500.
resample - if true, resamples src and ref to have the same resolution; default: false.

In the example below, we see the cropped image above and the full image below. In red, we see one matched keypoint. You can see that the identified keypoint is located at corresponding locations in both images (the same location from the point of view of salient features). Also note that the keypoint is located at an edge, making its features (the histogram representing the keypoint) stable and robust.

The bundled image processing and linear algebra library can also perform file I/O for a variety of medical imaging formats, including DICOM and NIFTI.


As a side bonus, SIFT3D is shipped with imutil, a library we found particularly useful for efficient saving and loading of NIFTI files into Matlab.


Computer Vision News lists some of the great stories that we have just found somewhere else. We share them with you, adding a short comment. Enjoy!

Identify overweight people from social media face photos
Let's start with one of those ideas that you never know if it's good or bad. Researchers from MIT and a Qatar institute built an algorithm to identify overweight people based on their social media face photos; apparently, to alert those whose face presents patterns of obesity against risks of cardio-vascular diseases and diabetes. Read…

Salesforce Joins Computer Vision With Image Recognition Tool
Following last year's acquisition of MetaMind and as part of the launch of its new AI technology Einstein, Salesforce is also introducing Einstein Vision, a set of APIs that bring custom image recognition to CRM, allowing users to easily embed the power of image recognition in any app. Read…

Emotion Recognition in the Wild via CNN & Mapped Binary Patterns
Flipping hamburgers is not the app that comes to mind in computer vision and deep learning. But this Pasadena, California restaurant decided to give a robot named Flippy the job of "collaborative kitchen assistant", in charge of grilling and flipping burger patties! Read… and don't miss the video…

Want Perfect Hair? Just Send Madison Reed Your Selfie
Madison Reed is an online shop for hair coloring products. Founder Amy Errett and her team designed a 12-question quiz-based algorithm to determine the perfect dye for each customer's hair. After women began sending their selfies, Errett turned to computer vision and developed a scalable app providing a quick and convenient response. Read…

Apple Granted Patent for Advanced Facial Detection
Apple is always granted many patents, but Enhanced Face Detection Using Depth Information obviously caught our attention more easily than some magnetic buckle band for the Apple Watch. This fine article gives some insight about Apple's face recognition system for desktops. Read…

Stopping GAN Violence: Generative Unadversarial NetworksLet’s conclude with something funny: Read this paper…


Computer vision applications, both evolving and emerging
Nice report by the Embedded Vision Alliance's Editor-in-Chief, based on lectures and talks by some of their Summit's speakers, indicating the trajectory of technologies which are becoming ubiquitous and "invisible". Read…


Tal, where did you grow up?
I was born and grew up in Montreal, Canada.

How did you discover that you had a passion for technology?
My father is an electrical engineer and he had his own business. Mine is a common path when you have a parent in engineering. He introduced me to programming at a very young age. It was an old TRS-80, and I really liked to

program. He also let me solder. I was always interested in math and science, but through him I realized that I was very interested in engineering.

And that’s why you started universityin that field?

Yes, I studied Pure and Applied Science at CEGEP, the pre-university college system we have here after high school. Then I studied electrical and computer engineering at McGill University for my


Women in Computer Vision

Tal Arbel - McGill University


We continue our series of interviews with women in computer vision. This new section, which we started with the CVPR Daily at CVPR 2016, hopes to help mitigate the severe gender imbalance in the computer vision community by getting to know better some remarkable female scientists and their career paths. This month, we interview Tal Arbel, who is currently Associate Professor in the Department of Electrical and Computer Engineering, Centre for Intelligent Machines at McGill University in Montreal, Canada.

“We need more women in leadership roles”
Photo: Eva Blue


Bachelors. I didn't intend to go on to graduate school, but I got involved in some projects in robotics and computer vision and ended up staying for my Master's and PhD at McGill... so I did all of my degrees at McGill.

Were you ever tempted to choose another field?
No, I was very passionate about computer vision and interested in probabilistic methods and robotics, so I just continued along the project that I was excited about.

You are very much involved in everything. You chair conferences, teach many students, manage a lab and take on so many tasks. What makes you put so much on your shoulders?
I have many different interests. My main field of work is computer vision, but I became exposed to medical image analysis when I did my postdoctoral fellowship at the Montreal Neurological Institute. I realized that there are amazing opportunities to apply what I learned in the domain of computer vision to a wide variety of interesting problems in medicine, where real improvements to healthcare can be reached. So I started to work in medical image analysis in the context of neurology and neurosurgery, developing new methods and providing tools to help clinicians, contribute to drug development and so on.

Is this what brought you to MICCAI?
Yes, I began to publish more in the MICCAI field. I continue to publish in both fields and my students are working in both areas: computer vision and medical image analysis. I have recently become much more involved in conference organization and outreach in areas promoting women in

science and engineering, because I feel that it's important and it's an initiative that I'm passionate about contributing to.

We can bear witness to that from the MICCAI Women Networking Lunch in Athens. What do you take with you from that meeting?
That was our first open forum. Gozde Unal, Ipek Oguz, Parvin Mousavi and I put together this lunch at MICCAI to try to bring together women in the field from all different levels of their careers, from students to professors. We also had some male colleagues present, and you. We tried to open it up to have a discussion about various topics: obstacles that women might be experiencing in terms of promotion and networking and so on. We didn't have very much time at that particular lunch. We put together a "Sub-committee on Women at MICCAI" whose mandate is to get more female scientists interested in fields of relevance to the MICCAI community, to try to get some policies in place that will encourage more women to enter the field, and to try to achieve more equitable career opportunities.

I remember two opposing opinions about the actions needed to improve the position of women in our community: one recommending positive



discrimination and the other viewing any assistance or benefit as a disservice to the cause. How do you reconcile these two views?
It was our first meeting, and that is a very common initial reaction. In an ideal world, we wouldn't need these meetings promoting women in the field, and everybody would have the same opportunities, but the reality is that we really don't. I think that asking "why do we need this?" is a typical first thought. However, I think that there have been several successful examples of how these networking events can be very helpful, particularly to students and to new faculty members. I was involved in the first Women in Computer Vision meeting as part of CVPR. There, in addition to workshops where women presented, there was a successful dinner where female faculty members were teamed up in tables over dinner with PhD students in a mentor/mentee relationship. Students had an opportunity to ask about career paths, express any obstacles they might

be feeling and have that kind of relationship established. I had positive feedback from female students on this event.

I think that at MICCAI, it's time to expose some of the issues that are common to many fields in science in which women are underrepresented, expose implicit biases that exist whether we recognize them or not, try and create an opportunity for them to talk to other women and to network, and to hear speakers on the topic. Our particular sub-committee on women is working to organize a few more networking events.

If you could magically solve one of the discrimination issues, which one would you pick?

I think that in the fields of medical image analysis and computer vision, we need more women in leadership roles. We are making strides to increase the number of women on the MICCAI board, to increase the number of women organizers at conferences and so on... but I do feel that more leadership role models should be encouraged. There are many excellent researchers in the field, so it shouldn't be too hard to create gender balance at the leadership level. This will encourage women, show that they are welcome and that they are able to attain and achieve success on par with their male colleagues. Overall, I'm quite surprised at the low percentage of women in the field of medical imaging, because medicine generally tends to attract a higher percentage of women. I think we need to do more outreach to promote


Snorkeling in Bora Bora, French Polynesia


the field as a possibility for female high school students thinking about their careers. When I describe what I do to young women, they always say the same thing, which is: “Oh! If I knew that was a career possibility, I would have gone towards science and engineering!” Personally, I'm chairing a committee on promoting undergraduate careers in Engineering at McGill where we are working on these issues, but there is still more work to be done.

Can you share a few words about your current work?

Sure. I head the Probabilistic Vision Group and the Medical Imaging Lab in the Center for Intelligent Machines. My work focuses on developing probabilistic graphical models and machine learning techniques in computer vision and in medical image analysis. I have been focused on the areas of neurology and neurosurgery. An example of recent work is where my students and I have developed machine learning techniques to automatically segment lesions and tumors from brain images of patients. I have been working with a local company (NeuroRx Research) that develops software analysis tools for drug development for Multiple Sclerosis clinical trials. The methods we have developed have been integrated into their system, significantly improving accuracy, cost, and speed. In addition, my team is part of a recently awarded multi-million dollar grant focused on progressive Multiple Sclerosis, led by McGill and including other universities worldwide, such as University College London and Harvard. My part of the project will be to develop automatic machine learning tools to find biomarkers in the images to predict progression in patients with

Multiple Sclerosis.

I work on a number of other projects, including image registration in time-sensitive domains, such as neurosurgery, where I collaborate with the Montreal Neurological Institute.

In terms of conferences, I am an organizer for MICCAI 2017, where I am Satellite Events Chair. This year, we'll have 30 workshops, 14 challenges, and 4 tutorials. We have made several changes to the conference. One of the things that we have worked on for challenges is to try to streamline the process. Challenges are very important for the field of medical imaging, where people don't necessarily have access to data. Challenges permit us to compare algorithms. You put data, ground truth, and metrics out there, allowing people to compare their algorithms on equal footing. It's extremely important in the field of medical imaging and we want to make sure that our challenges are fair and permit more scrutiny.


Uluru, Australia


I see a pattern here: you are so willing to help young women succeed in the field; you are so willing to contribute to the good health of people by helping medical research and drug research; you are so willing to help students progress. Is giving to others such an important drive for you?

[laughs] Yes, that is important to me. Those are the most satisfying parts of my job without a doubt: working with my students and helping with their careers. I find the area of medical imaging incredibly satisfying, and the fact that I am actually working on things that will be used on real patients and real procedures is very important to me. There are many really interesting problems. As long as there are interesting problems, I'll still be motivated to work on them.

What is the main lesson that you learned from your students?

I choose my students very carefully. I have excellent students and I love working with them in groups. My students come to me with papers, with techniques and with ideas. I consider them much more my colleagues than my students. The kind of students that I choose tend to love working in teams and to be very enthusiastic.

What is the greatest satisfaction you had from your students?

One of my students worked on Multiple Sclerosis. She worked on Gadolinium-enhanced lesion segmentation. Her thesis was applicable to a wide variety of problems, including heart pathology segmentation. She won best paper in one of the MICCAI challenges on that topic. Her PhD thesis ended up winning the top PhD thesis award in computer vision in Canada at CRV, the Canadian Conference on Computer and Robot

Vision. I love to see students come in, be very unsure, and just watch them develop confidence as researchers, develop passion, and see how different they are 4-5 years later… how they own their project, how they present it with such confidence, and apply it to so many different things. That gives me great satisfaction.

What would you like to achieve in your teaching or research career that you have not yet achieved?

In terms of career goals, I plan to continue to develop mathematical models and to apply them to many open problems in medicine. I'd like to develop machine learning and probabilistic algorithms to automatically learn what, in the patient images, is contributing to the progression of disease and which patients would respond better to a particular treatment, resulting in tools for personalized medicine. I really believe that applying computer vision and machine learning to medicine has enormous possibilities in terms of improving patient outcomes.


Tal (right) with Zahra Karim-Aghaloo, author of the award-winning PhD thesis about detection of Gad-enhancing multiple sclerosis lesions


Why did you start this Facebook group?

Waikit Lau: [laughs] Why? I have no idea why! I'll give you some context on how this started. Arthur and I worked together at my last startup. Arthur was my resident machine learning expert at my last company, which was in semantic indexing of videos, speech recognition and NLP. So a year ago I had this thought about how in AI and deep learning there's a lot of activity. There are a lot of things that are changing. I was talking to Arthur about this, and we

were thinking that there wasn't anywhere we could go to ask questions and get responses fast, or get the latest developments. We said “Hey, let's start a group on Facebook!” You know, chances are it will fail, and we won't get more than 100 people, but that's fine. We did it, and Arthur probably has done a lot of the heavy lifting in curating the group. People always ask us this question: how come your group is thriving so much compared to other groups? I say, there is only one reason:

Waikit Lau, Arthur Chan, Samson Timoner

Artificial Intelligence & Deep Learning (or AIDL, as we call it) is no doubt the surging Facebook group in our community. They will certainly deny it, but the main reason for their adding now about 1,000 new community members every week (!) is the perfect moderation put in place by curators/founders Waikit Lau and Arthur Chan. We have prevailed over their well-known modesty and were able to obtain an exclusive interview for Computer Vision News. Samson Timoner, who (among many other things) is also owner of another brilliant Facebook group (Computer Vision), joined the discussion moments later. What they tell us is a delight to read…


A fascinating conversation on Cyclops! Clockwise: Waikit Lau (top left), Arthur Chan (top right) and Samson Timoner (bottom left) interviewed by Ralph Anzarouth.


Arthur Chan! Arthur spends a lot of time curating, responding, and being thoughtful about it. Otherwise, we would just become like any other group where people just post whatever they read elsewhere and no one comments. It's not that interesting. It's not really a community… so that's how we got started.

Arthur Chan: At this point, whenever we talk about the AIDL group, he loves to exaggerate my role in it, but Waikit is every bit as important as I am for the group, because I remember there were multiple decisions such as how do we govern the group. All these fascinating things, such as the YouTube “Office Hour” channel and the AIDL newsletter, are Waikit's ideas, but he's a humble man. That's why we call ourselves “humble administrators”.

I know you are both very humble, so I will ask you to put your modesty aside for my next question. The group adds about 1,000 new users every week. How many are we now?

Arthur Chan: Close to 14,000.

Why do you think it is so successful?

Arthur Chan: I think there are two major reasons. The first one is really about the deep learning craze that is still going on. People we have talked with in the last couple of years believe that the deep learning trend will still go on for a few more years. The most important part is that the deep learning movement actually covers multiple fields rather than just one or two. Even in the traditional deep learning fields such as speech recognition, computer vision and translation, we are still talking about how they are improving at a very rapid pace. I think that's number one, because it is where deep learning is really going. Of course, there's always

this existing base of people who are fascinated by AI. We have both of these two keywords [in our name]. The second reason is the very special strategy we have in curating the group, as Waikit mentioned. You can think of it this way: usually, there are two extremes of groups on Facebook. The first one is groups which don't have any rules. Everyone knows that there will be tons of grabs and reposts; some people grab content and claim that it's theirs. The other extreme is when the administrators are supposed to read all the posts before they are published, but then there's no discussion in those groups at all. Our way reflects how Waikit and I see this kind of discussion. We let people in first, and then we exercise certain common-sense judgment to decide whether things should be in the group. After the group had developed for close to a year, we developed a guideline which is very clear on what should be in and what shouldn't. A lot of members understand that there are certain posts you want and certain ones you don't want. I think that is the healthiest way, and


Arthur Chan


that is why we are still growing in a relatively fast way.

Waikit Lau: Yes, I agree with everything Arthur said. The one thing we cannot discount with all of this non-deterministic endeavor is luck. I think luck plays a big part, in that maybe we got to a critical mass fast enough, at the right time, with the right set of folks, so more begets more.

We often say “Wow!” when reading an impressive paper or learning about new techniques. Which was the latest paper or finding that made you say “Wow”?

Waikit Lau: There have actually been a bunch of them recently. I think in the past week, Kaiming He from Facebook AI Research has a really interesting new version of instance detection and segmentation. The reason why I think it is so interesting is that you may think that instance detection, segmentation and characterization do not have a broad application, but that's what makes it really interesting. That's what makes machine vision really interesting: every little niche application can work well. What Kaiming has done is that he

has taken what is essentially a niche segmentation, detection and classification problem and made it work really well. You can imagine that the next time someone wants to do X classification or gesture classification, you can apply a similar thought process.

Arthur Chan: What I want to bring up is the timing of this [Kaiming's] research. The way I look at it is a bit more technical: every time people manage to train a deep learning-based method end-to-end, the performance improves. That makes me feel that the Mask R-CNN paper is an important moment in instance segmentation research. This segmentation example is very new, but the same happened with detection a couple of years ago. In the past, detection was done in multiple stages and wasn't done end-to-end. As you know, Faster R-CNN is now one of the best; it is trained end-to-end, and you can adapt it to different classes. You can see that trend in many fields. If you move away from detection and segmentation to, say, speech recognition, people have been trying to train speech


recognizers end-to-end as well. Again, I think if it succeeds, it will be a major breakthrough. In any case, this kind of investigation of neural network architectures is fascinating for me. It's super AI nerdy and deep learning nerdy.

Today most of the deep learning researchers are employed by big companies like Google, Facebook, Amazon... Don't you think this might harm the diversity of future research in favor of certain industries?

Waikit Lau: I think that's actually a huge problem for AI, or more generally for machine learning development. What is potentially the next industrial revolution, powered by AI, is now mostly controlled by a handful of large companies or organizations with research pointing at a very specific agenda. Google has done some very interesting stuff, just having a team work on cancer diagnostics and healthcare diagnostics, with larger budgets than the pharma and diagnostics companies; and startups cannot compete with all the Facebooks and the Googles of the world. It's a challenge. Hopefully this imbalance will correct itself over time, maybe as more and more venture capitalists and funding companies come in; they may still not pay as much as Google, but the thrill of building a startup and the thrill of solving a big idea would attract AI talent away from the Facebooks and the Googles into startups that are doing things that are maybe more worthwhile for humanity.

What kind of companies?

Waikit Lau: I'm thinking of healthcare. It's worthwhile, and those may not have an immediate payoff now, but maybe 5 or 10 years down the road, all the way from discovery to diagnostics to

whatever. There's so much impact, but that's not what Google and Facebook are doing. They are all about helping people to click more ads and things like that, which is great, but I think startups that apply machine learning in the healthcare field will need to compete in a way that is non-monetary: appeal to AI talent and say “What we are doing is going to change the world. It's not just about the money.”

Do you think young engineers can still be idealistic in their missions?

Waikit Lau: Yes. I think that as long as the company pays well enough, it's an incentive. After a certain level of pay, they may realize that making another $50,000 a year is not going to move them; they'd rather work on something worthwhile.

Which tutorial or book would you recommend to a new person joining the field?

Arthur Chan: That is the most frequently asked question in the forum. One of the things I wrote is an article called “Learning Deep Learning - my top 5 list”: it is a kind of starter list that you can follow on various things. Usually, it starts from a basic machine learning class such as Andrew Ng's. You can take other classes, such as a statistics class. After one or two of them, you will feel comfortable

with machine learning and can start deep learning. You can take Andrej Karpathy's class (CS231n). If you also feel comfortable with that, you can also take Geoff


Waikit Lau


Hinton's neural network class. That is a more difficult class in terms of concepts. If you are smart, you can get the hang of it rather easily.

You have spoken about what others do. We would like to know what you guys do.

Arthur Chan: I have been an engineer in speech recognition for many years. I currently work for a small company called Voci, a speech recognition company. Waikit, Samson, and I are in the same area, so we get together to jam on ideas all the time. The rest of the time, I administer the AIDL Facebook group.

What about Cyclops, the video-conferencing app we are using to talk right now?

Waikit Lau: Samson and I, with Arthur as an adviser, are working on Cyclops, which is simple-to-use video conferencing. With traditional video conferencing, you still have to download a package to use it. We said, as a start, let's make it super simple. With one click, and nothing to download: it just works in your browser.

This is what I did a few minutes ago.

Waikit Lau: Exactly. Was it simple?

Extremely simple.

Waikit Lau: Exactly! It ought to be that way. That's just a start. What if you could actually make video conferencing more productive? We think about what teams do in the same office. They go into the same room. They whiteboard in front of each other. They talk about ideas. It's very collaborative. People can't do that on video conferencing today using Google Hangouts. The whiteboard is very washed out. You can't really see it.

We said “Why can't we solve this?” We said, on top of this video conferencing, let's build something that allows people to build applications and bots on top of it, just like Slack. We want to open an API platform to allow people to build apps on top of it, so that people can be more productive. We started off by saying “Let's build the apps ourselves first”. One of the apps we built, for example, is a whiteboarding app. Here, you can't really see too well, it's washed out. Now I turn on the whiteboarding app, and you can see it


more clearly. This is an enhancement, without which it is very blurry. The cool thing is that you can annotate.

Was it done by augmented reality?

Waikit Lau: Right - it has an augmented reality component in the sense that, in real time, we are taking all of the pixels on the whiteboard and we are basically enhancing the writing and the drawing. Then we can take this shot and send it to everyone. We can toggle back and forth between whiteboard mode and people. It also has an audio transcription mode. It is still in beta, and it doesn't work very well yet.
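Cyclops has not published its algorithm, but the kind of enhancement Waikit describes can be approximated with a classic background-flattening trick: estimate the board's slowly-varying illumination with a heavy blur, then divide the image by that estimate so uneven lighting cancels out and only the dark pen strokes remain. A minimal NumPy-only sketch (the function names and parameters are our own, purely illustrative, and certainly not Cyclops' actual code):

```python
import numpy as np

def box_blur(img, k):
    """Mean filter with an odd kernel size k, computed via integral images."""
    pad = k // 2
    p = np.pad(img.astype(np.float64), pad, mode="edge")
    c = np.cumsum(np.cumsum(p, axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))  # zero row/col so window sums index cleanly
    win = c[k:, k:] - c[:-k, k:] - c[k:, :-k] + c[:-k, :-k]
    return win / (k * k)

def enhance_whiteboard(gray, k=15):
    """Flatten illumination: divide each pixel by the estimated background.

    Board color and lighting vary slowly, so a large blur estimates them;
    pen strokes are thin and dark, so they survive the division.
    """
    background = np.maximum(box_blur(gray, k), 1.0)
    out = gray.astype(np.float64) / background * 255.0
    return np.clip(out, 0, 255).astype(np.uint8)
```

On a synthetic board with a lighting gradient, this pushes the background toward uniform white while the ink stays dark, which matches the washed-out-to-readable effect described in the interview.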

What were the challenges in building that?

Waikit Lau: Samson Timoner will answer that. He is a purist in computer vision and works with Arthur and myself at our company Cyclops.

Samson Timoner: First, as I'm sure you know, building a robust product that works across so many people's computers in so many countries is not an easy task. Not everything is standardized, and the things that are standardized still need to work. Your browser version is different than mine. Your computer is different from mine. On the whiteboarding part, we all have different cameras with different resolutions and different signal-to-noise ratios. You may have more or less processing power than I do. One of the real challenges is having an algorithm that runs smoothly on everybody's computer when you don't know how much power is going to be in their computer.

What is the tool that helped you the most to solve this challenge?

Samson Timoner: That's a really great question. A lot of the standard web tools were great at profiling. It's what we used. A lot of those are simply built into Chrome. What it involves is

getting feedback and literally walking into a store, trying it on a bunch of computers, taking a look at process managers, and seeing how we are doing. That's not an easy problem by any means.

What's your advice for a young engineer entering the field?

Arthur Chan: My advice is: don't enter this field, because it takes a huge amount of time to work on stuff, wait for experiments... learn how to trade stock! [everybody laughs] Seriously, lots of people who want to work on data science feel like it doesn't take too much time, mostly because it feels like you just work on math and think. The truth is you also have deadlines, and some situations are hopeless. Sometimes you don't know how to improve the system any more. Make sure that you prepare, and that you learn everything. Don't stop learning, otherwise your skills will stagnate very easily.

Samson Timoner: I'll add, on the computer vision side, that there's a big difference between making an algorithm work and shipping a product. I would advise anybody coming to this field to be part of a team and see the process... how an idea becomes an algorithm, how an algorithm becomes part of a module, then how the module actually goes to shipping and remains in a mode where people feel confident that it's working.

Waikit Lau: Go find a team that is A+. Frankly, it matters less about the company, because you learn from people. Traditionally, besides tier-1 companies like Google or Facebook, there are interesting small companies doing interesting stuff. You also have to do your due diligence and do some research. Plus, good chemistry is important.


Cyclops.io is the first video conferencing platform that uses computer vision. We used it to interview our guests (see pages 19-24) and we were able to have a very clean group discussion with four people, without any delays or excessively metallic voices.

Cyclops has productivity apps on top, to replicate the in-the-same-room team dynamic. It works in your Chrome browser without any download or login. You can get up and running in seconds. Waikit Lau sent us a link to click, we clicked on it and we were instantaneously in a video conference with him. We can assume that minutes later Arthur and Samson did the same to join the video conference.

It is great for short stand-ups as well as for hour-long meetings; you can also leave it as an always-on video/audio channel with your team. Twitter, Houghton Mifflin, Sapient and other companies are currently using the platform.

Apps and features (on top of the video conferencing platform):

• Remote Whiteboarding: if you tend to point your camera at a whiteboard during video conferences for team collaboration, Cyclops has a whiteboard mode that uses computer vision to dramatically enhance your writing so your remote teammates can read it. They overlaid annotation tools on it, so that everyone can mark up the whiteboard. You can post a screenshot to Slack or email it.

• Audio Transcription: Cyclops can transcribe audio in real time, so no more taking notes in conference calls. It works best if you are using a headphone. You can post the transcript to Slack or email it.

• Use it as an Always-On channel with your team: unlike other solutions, Cyclops doesn't time out, and it reconnects automatically on Wi-Fi dis/reconnect. If you're on a different browser tab, it auto-pauses video but leaves audio running (no embarrassing moments or big-brother feel).

The developers promise that more apps are coming...


Cyclops - Video conferencing with no download


Every month, Computer Vision News reviews a challenge related to our field. If you do not take part in challenges, but are interested to know the new methods proposed by the scientific community to solve them, this section is for you. This month we have chosen to review the “Densely Annotated Video Object Segmentation” (DAVIS) challenge, organized around CVPR 2016 (held in Las Vegas) and 2017 (to be held this summer in Honolulu). The website of the challenge, with all its related resources, is here. Read below what Jordi Pont-Tuset, a post-doctoral researcher in Prof. Luc Van Gool's ETHZ vision group and one of the organizers of DAVIS, tells us about this challenge (read also about Jordi's work here and here).

Background

Video object segmentation is about obtaining the binary masks of an object in all frames of a video, given its segmentation in the first frame. Applications of this can be object inpainting (removing one object and filling in the hole), interactive content visualization, etc.

In CVPR 2016, we presented the “Densely Annotated Video Object Segmentation” (DAVIS) dataset and benchmark, which consists of 50 full high-definition sequences annotated densely (every frame) with unprecedented pixel-level quality. It goes beyond previous datasets in video object segmentation in that it has significantly better annotation

accuracy, it is much larger (thousands of annotated frames), its images are full high definition and it covers many real-world challenges (occlusions, fast motion, etc.).
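Benchmarks like DAVIS score a method by comparing its predicted binary masks against the ground-truth masks frame by frame; the region measure is the Jaccard index (intersection over union). A minimal sketch of that metric (our own illustrative code, not the official DAVIS evaluation toolkit):

```python
import numpy as np

def jaccard(pred, gt):
    """Jaccard index (intersection over union) between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: count as a perfect match
    return np.logical_and(pred, gt).sum() / union

def sequence_score(pred_masks, gt_masks):
    """Mean Jaccard over every annotated frame of one sequence."""
    return float(np.mean([jaccard(p, g) for p, g in zip(pred_masks, gt_masks)]))
```

Averaging per-sequence scores over the dataset gives the single-figure numbers quoted for methods evaluated on DAVIS.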

Motivation

The scale and quality of DAVIS allowed the rapid deployment of deep learning techniques that boosted the performance of the state of the art significantly. Two recently-accepted CVPR 2017 papers (OSVOS and MaskTrack), for instance, reach performances around 80%. We quickly realized, therefore, that we needed to step up the level, and so we created the DAVIS Challenge on Video Object Segmentation 2017.


Densely Annotated Video Object Segmentation (DAVIS)
by Jordi Pont-Tuset

10,000 annotated frames, 500 annotated objects!

We have collected and annotated a new set of sequences with the following highlights:
• 150 sequences (100 new) adding up to more than 10,000 annotated frames.
• Multiple objects annotated per sequence, totalling around 500 annotated objects.
• The majority of the sequences are high-quality 4K Ultra HD footage.
• Same pixel-level annotation quality as the original DAVIS.

Jordi Pont-Tuset


To promote research and competition in the field, we will host a challenge whose results will be presented at the CVPR 2017 workshop. It will work as follows:
• April 1, 2017 - open end: Test-Dev 2017 phase, 30 new sequences, available April 1st. Ground truth not publicly available, unlimited number of submissions.
• June 19 to June 30, 2017: Test-Challenge 2017 phase, 30 new sequences. Ground truth not publicly available, limited to 5 submissions in total.
• Also on April 1, we will make 90 sequences and their ground truth public: the 50 original DAVIS 2016 sequences (re-annotated with multiple objects) plus 40 new sequences.

The top entries will receive an NVIDIA Titan X GPU as a prize and will be able to present their work at the CVPR 2017 workshop on July 26. We encourage all readers to consider participating, and good luck to everyone!



Every month, Computer Vision News reviews a successful project. Our main purpose is to show how diverse image processing applications can be and how the different techniques contribute to solving technical challenges and physical difficulties. This month we review RSIP Vision's sophisticated Navigation System for Orthopedic Surgery, based on advanced image processing algorithms designed to support surgeons in their task. RSIP Vision's engineers can assist you in countless application fields. Contact our consultants now!

Background

Not so long ago, surgery was simply the operation of a surgeon making cuts in the region of interest, feeling with his/her fingers the bones being operated on and fixing the mechanical supports which helped complete the task. There was no navigation system; the only alternative was sensorial touch and feel.

Navigation is a new concept and process: we use pre-op imaging of the patient, usually CT images, to position mechanical fixtures inside the patient's body without having to search for the specific location. The CT image is used as a map to navigate the body of the patient.

What enabled RSIP Vision to achieve this major breakthrough is the integration of previously known techniques with our expertise in image analysis and our background in mathematics to perform calibration and registration.

Required Steps

Let's see what steps are involved in this task; to do that, we shall first look at what the final result should be and at what is needed in advance (before the operation takes place) in order to achieve it. During the operation itself, we want the system to enable navigation across the CT of the patient.

We therefore need to know the exact real-time location of every tool (screwdrivers, surgeon's instruments and the like) in the real world, i.e. inside the body of the patient. This task is the tracking of the tools. We also need a registration of the tool with the CT image. In addition, we want to locate the patient's body in the surgery room. Achieving all these goals requires several steps, like the following.

Segmentation: a CT scan is performed before the surgery. Our segmentation algorithm detects the vertebrae in the CT.

Calibration and distortion correction: the C-Arm, a portable X-ray camera, is used to locate the patient in the surgery room.


Navigation System for Orthopedic Surgery

Example of CT segmentation in orthopedic surgery



The C-Arm scan distortion is corrected and the camera is calibrated.

Registration: during the operation, we need to register the patient's 2D C-Arm scan with the 3D CT scan. Registration is done object-wise, in the sense that actual bones are registered. This is why bones need to be segmented in advance. Additional registration needs to be done between the C-Arm and the tracking system, to know where the latter is currently located with respect to the camera. The registration between the tracking system and the 2D scan, followed by the registration between the 2D scan and the 3D CT, yields a transformation from the tracking system to the CT scan (the C-Arm is used as an intermediate aid).
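The chaining just described is, mathematically, a composition of rigid transforms: once tracker-to-C-Arm and C-Arm-to-CT are known, tracker-to-CT is their product. A small sketch of that idea (the matrices and names below are hypothetical placeholders, not RSIP Vision's actual calibration values):

```python
import numpy as np

def rigid_transform(angle_deg, translation):
    """4x4 homogeneous transform: rotation about the z-axis plus a translation.

    In a real navigation system these would be estimated by calibration and
    2D/3D registration, not hard-coded.
    """
    a = np.radians(angle_deg)
    T = np.eye(4)
    T[:2, :2] = [[np.cos(a), -np.sin(a)],
                 [np.sin(a),  np.cos(a)]]
    T[:3, 3] = translation
    return T

# Hypothetical outputs of the two registrations described above:
T_carm_from_tracker = rigid_transform(30.0, [10.0, 0.0, 5.0])   # tracking system -> C-Arm
T_ct_from_carm = rigid_transform(-10.0, [0.0, 25.0, 0.0])       # C-Arm -> CT scan

# Composing them gives the transform actually needed during surgery:
T_ct_from_tracker = T_ct_from_carm @ T_carm_from_tracker

# Any tracked tool-tip position (homogeneous coordinates) can now be
# mapped straight into CT coordinates and drawn over the scan:
tip_tracker = np.array([1.0, 2.0, 3.0, 1.0])
tip_ct = T_ct_from_tracker @ tip_tracker
```

This is why the C-Arm can serve purely as an intermediate aid: once the composed transform is computed, tool positions go directly from the tracker to CT coordinates.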

At this stage, all is ready for surgery: every movement of the tool will be tracked by the tracking system and its position converted into coordinates of the CT, so that we superimpose the tool image on the CT itself, helping the surgeon navigate with the greatest accuracy.

Each of the aforementioned steps has its own challenges. For example, segmentation requires analyzing and locating each bone in all its details, because knowing its precise position is key to achieving high accuracy. Registration too is a challenging task, because it requires registration between bodies, which is far from simple: every bone has its own shape (think of the vertebrae) and we have to make sure that this shape is perfectly matched. Tool tracking, too, needs to overcome temporary occlusions by the surgeon and other obstacles. RSIP Vision has all the experience needed to successfully solve all these challenges. Find out how we do it in five articles about our in-op navigation system. You can read them at this link, with many other projects by RSIP Vision in orthopedic surgery.

Surgery tool as seen on the CT scan


RSIP Vision’s CEO Ron Soferman has launched a series of lectures to provide a robust yet simple overview of how to ensure that computer vision projects respect goals, budget and deadlines. This month we learn about the steps in computer vision software projects.

This month we compare a general framework of project management to specific aspects of computer vision project management. We shall introduce the subject from a standard project management scheme and then show our recommended outline for computer vision related projects. A typical software-related project can be described as composed of the following 3 phases:

• Specifications - project conception and initiation, project definition and planning

• Development - project launch or execution, project performance and control

• Testing - project close

Specification
During the first phase, project/product specifications must be received from the marketing department (or, if applicable, from sales) [for additional information, refer to SRS - Software Requirements Specification, IEEE 830]. Specifications should be converted into

Steps in Computer Vision Software Projects


Project Management Tip



R&D definitions, indicating challenges and risks. At this point, R&D prepares its own document, pointing out the alternative options for the development and the risks involved in each, including the time which will potentially be needed to handle those risks: the higher the risk, the longer its handling might prove to be [SDD - Software Design Description, IEEE 1016; SCM - Software Configuration Management, IEEE 828; STD - Software Test Documentation, IEEE 829].

Once this is done, marketing and/or sales are presented with a solution, to obtain their approval of the proposed option, meaning that its objective is not far off from the desired result and that it matches the requirements [SQA - Software Quality Assurance, IEEE 730].

When computer vision products are involved, it is important to obtain as many sample images as possible. These samples should span the whole range of the problem at hand, and this should be certified by the field engineers: the presented cases, stored in a dedicated database, constitute the immediate and long-term datasets which need to be supported. Missing cases may degrade the product’s long-term performance.

The task of having all possible images for a computer vision project seems impossible. And indeed, we actually start with a limited dataset.

It is therefore important that the computer vision project manager understands the different complexities and the different directions that the examples database has to represent, as well as the quantities that need to be included in the database for each phenomenon.

Now, in the age of machine learning (as was the case in previous days, in the age of optimization and heuristics), algorithm parameters are numerous: a small set of images might limit the performance of the algorithm due to the narrow scope it represents. This is simply because deep learning replaces complex logical rules (coded in software) with automatic learning of the features in the image set. The larger and broader the image set, the better the image classification obtained.

Development
Once the solution is agreed upon, it is time to set up the R&D plan. This plan includes phases and milestones, which clearly define the schedule and the expected deliverables throughout that schedule. At every milestone, specified objectives must be met following the predefined schedule. In case those weren’t met, the development plan must be modified accordingly, assessing how this will affect the final product in terms of capabilities, functionalities and delivery date. The project manager follows the development on a weekly basis, and all the corresponding members should participate in the follow-up meetings and in the decisions, since these affect the R&D and (more importantly) the product itself [SPM - Software Project Management, IEEE 1058].

This development phase and the follow-up performed by the project manager are more complicated in computer vision project management, owing to the very

At every milestone, specified objectives must be met following the predefined schedule




wide solution options: for example, it is possible to add algorithm work to solve marginal cases. Project managers encounter a trade-off: they are forced to take into account ideas coming from the algorithm developers and at the same time respect the product specifications designed in advance. Each engineer might ask for more time to complete his task, bringing new and ground-breaking ideas, so that the project manager is called to display the soundest professionalism in understanding the different options. At the same time, he/she adopts a global view to comprehend the meaning of each and see the complete picture of the whole development and of the resulting product. One example may be starting with a limited version of the product, by selecting only limited functionalities in some areas or opting for solving only a selected range of cases (albeit in a more rigorous and

precise manner), in view of enlarging the scope of the product in a further release. This method makes it possible to provide a high-confidence version, leaving the next level of challenges for the next version. In complex computer vision projects, the strategy of starting with a simplified version built to demonstrate the method is recommended.

Among the many parameters which need to be taken into account, special attention should be given to those derived from the level of difficulty of the input images, the level of certainty required from the algorithm and the level of desired results: making sure that the results are correlated to the desired project outcome.

Make sure that the results are correlated to the desired project outcome


The project manager might find it quite challenging to deeply understand the computer vision algorithm developers on one side, and at the same time translate that into real-world terms on the other side (managers, sales and marketing) - in other words, how a nuance in the algorithm, or a set of parameters used to control it, will potentially impact the final product. Without this understanding, the final product may be too complex to operate and potentially viable only for expert operators.
Note: in multidiscipline products, a final integration is performed at this stage. Multi-discipline project management will be discussed in one of the next articles.

Testing
Once the final milestone is reached, the first complete version of the product is available and can be tested at the Alpha site. This may be performed by running unit testing over a given examples database. Alpha testing is usually performed in-house, using field environment conditions: samples and operators mimic the field. Once it is completed successfully, Beta site testing is started. Beta testing is performed at selected customer sites. The main customer benefit is that they can have an impact on the final product, nearly as if it were partly custom-tailored to

their needs. In exchange, they offer their resources (time, expertise) to fine-tune the product and later test it in a real-world environment [V&V - Software Verification and Validation, IEEE 1012].
The Testing phase has special characteristics which may be specific to computer vision based products (and the like): the real world imposes much broader cases that the product needs to handle. You might have many bugs, which is natural when you expose a new computer vision system to a natural environment. Algorithm gaps may start to appear in the form of deficiencies resulting from the larger dataset including additional variations. The computer vision project manager should analyse the problematic cases carefully: evaluate their complexities and possible solutions. This information is used to set the priority of handling or to adopt alternative solution methods. This process, consisting of both Alpha and Beta sites, might expose some of the system’s limitations and bugs, which may even result in failure to meet the planned time-to-market. It is a very intensive period, during which the project manager ranks and prioritizes the different phenomena: doing this wisely enables convergence to the desired final product. Failure to do so might cost precious time, preventing the meeting of the expected deadline. All participants should be involved in this step, to set the best solutions respecting the product specifications.
In the following articles, this section will discuss multi-discipline project management, common computer vision project methods and similar key topics.
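The unit testing over an examples database mentioned above can be sketched as a simple regression harness: each sample pairs an input with its expected ground truth, and a release must keep the pass rate above an agreed threshold. Everything below (the algorithm stub, the samples, the tolerance) is hypothetical, for illustration only:

```python
def run_algorithm(sample):
    """Placeholder for the computer vision algorithm under test."""
    return sample["input"] * 2  # hypothetical processing

def regression_pass_rate(database, tolerance=0.01):
    """Fraction of samples whose result matches ground truth within tolerance."""
    passed = 0
    for sample in database:
        result = run_algorithm(sample)
        if abs(result - sample["expected"]) <= tolerance:
            passed += 1
    return passed / len(database)

database = [
    {"input": 1.0, "expected": 2.0},
    {"input": 2.0, "expected": 4.0},
    {"input": 3.0, "expected": 6.5},  # a known-failing edge case
]
rate = regression_pass_rate(database)
```

Tracking this pass rate per release is one way of ranking and prioritizing the phenomena exposed by Alpha and Beta testing.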


The real world imposes much broader cases that the product needs to handle



Research

YOLO9000: Better, Faster, Stronger Real-Time Object Detection

Every month, Computer Vision News reviews a research paper from our field. This month we have chosen to review YOLO9000: Better, Faster, Stronger, a paper proposing a state-of-the-art, real-time object detection system able to detect over 9000 object categories. We are indebted to the authors (Joseph Redmon and Ali Farhadi) for allowing us to use their images to illustrate this review. The paper is here.

Prior detection methods are generally classifier-based systems: they apply the model at multiple locations and scales within a single image, calculate a score for each individual region and elect the high-scoring regions of that image as detections.

Our review will focus on two new versions produced by the YOLO team (YOLO stands for You Only Look Once), improved from what they presented at the CVPR conference in 2016 (which we will refer to as YOLOv1): YOLOv2 and YOLO9000. YOLO9000 implements the authors’ proposal to train a network simultaneously on object detection and classification, on the COCO detection dataset and the ImageNet classification dataset, which have over 9000 categories between them.

Here are examples of a few of the impressive results, from video(!):

Introduction:
The original version of YOLO from CVPR 2016 (YOLOv1 from here on), described in the illustration below, consisted of 24 convolutional layers, followed by 2 fully connected layers.

Trained using a special new technique of training simultaneously on both the COCO and ImageNet datasets, which together include over 9000 categories



In YOLOv1 the image was divided into S×S cells of equal size; the cell that the center of an object occupies is responsible for detecting that object. Each cell is assigned B (overlapping) bounding boxes; for each one, the cell has a score indicating the measure of confidence for that box, i.e. to what degree the algorithm is certain it includes a real object, and the accuracy it assigns to the category identification.

Each bounding box has 5 parameters: x, y, h, w and conf. x and y indicate its center, h and w indicate its height and width, and conf indicates the IOU between the (predicted) bounding box and any ground truth box in the image.
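As a reminder, the IOU (intersection over union) used by conf can be computed as follows. This is a generic sketch of the standard metric, not the authors’ code; boxes are assumed to be given as (x_center, y_center, w, h):

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, w, h),
    where (x, y) is the box center."""
    # Convert to corner coordinates.
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    # Intersection rectangle (zero area if the boxes do not overlap).
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0
```

IOU is 1 for identical boxes, 0 for disjoint ones, and in between for partial overlap.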

Moreover, each of the S×S cells estimates the probability of what object class is found in that cell; that is the function Pr(Class_i | Object). Each cell of an image votes for only one class, regardless of the number of boxes it is assigned. At test time the system multiplies all the probabilities to arrive at a class-specific confidence score for each cell, according to the following formula:

This score indicates both the probability of the specific class the object in the cell belongs to and the degree to which the algorithm is confident in the accuracy of this classification.
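The per-cell computation amounts to scaling the conditional class probabilities by the box confidence, Pr(Object) · IOU. A minimal sketch with illustrative numbers (the values are not from the paper):

```python
def class_confidences(class_probs, box_confidence):
    """Class-specific confidence scores for one cell.

    class_probs: Pr(Class_i | Object) for each class in the cell.
    box_confidence: Pr(Object) * IOU predicted for one of the cell's boxes.
    """
    return [p * box_confidence for p in class_probs]

# Illustrative values: a cell fairly sure it sees class 1.
scores = class_confidences([0.1, 0.7, 0.2], box_confidence=0.9)
best = max(range(len(scores)), key=lambda i: scores[i])
```

The highest-scoring class per cell is then what the detector reports, after thresholding and non-maximum suppression.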

This is summarized in the figure below: YOLOv1 identifies objects by means of regression. It divides the image into S×S cells, and for each estimates B bounding boxes, a measure of confidence for each box, and a probability C for each class of objects. All of this data is saved in a tensor with the dimensions S×S×(B·5+C).

Pr(Class_i | Object) * Pr(Object) * IOU^truth_pred = Pr(Class_i) * IOU^truth_pred



Now, let’s describe YOLOv2 and YOLO9000:

Method:
YOLOv2 is an improvement of YOLOv1; YOLO9000 has the same network architecture as YOLOv2, but was trained using a special new technique of training simultaneously on both the COCO and ImageNet datasets, which together include over 9000 categories.

YOLOv2
The base network, called Darknet-19, was constructed based on YOLOv1 and similar research in the field. Like VGGNet, it uses filters of size 3×3, and like NIN it uses 1×1 filters as well as global average pooling to make predictions. Its specific architecture is as follows:

YOLOv2 includes several key improvements over YOLOv1, which we will now discuss in detail:

The following table summarizes the percentage performance improvement per change.


• Batch normalization was added to the network and led to a significant improvement (faster convergence) at the training stage, and made other regularization methods (e.g. dropout) redundant. Normalization was added to all convolutional layers in YOLOv2, leading to a 2% performance improvement.

• High-resolution classifier: current state-of-the-art object detection networks rely on networks pre-trained on ImageNet. These networks usually run on input images smaller than 256×256. YOLOv2 is pre-trained on ImageNet data, at a resolution of 448×448, for 10 epochs. Then, as a second stage, it is trained to classify data into categories (already at the higher resolution). This process results in a 4% performance improvement.

• Anchor boxes were added to predict bounding boxes, and both fully connected layers of the network were removed.

• Convolutional with anchor boxes: YOLOv1 predicted the coordinates of the bounding box directly using the fully connected layers. YOLOv2, similarly to Faster R-CNN, uses anchor boxes and offsets directly on the convolutional layer, which the authors admit slightly lowered mAP, from 69.5 to 69.2; however, recall rose from 81% to 88%.

• Dimension clusters: in the training process, the initial size of the anchor boxes is set manually - if the manual selection is poor, mAP remains very low even though the training process compensates somewhat. Therefore, YOLOv2 runs the k-means algorithm on the training-set boxes, with an IOU-based distance function between box and centroid, to select better initial anchor box sizes,

where IOU is the intersection over union between a box and a centroid. The authors report that k=5 gave a good tradeoff between recall and complexity of the model.
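The dimension-cluster step can be sketched as k-means with d(box, centroid) = 1 − IOU, clustering widths and heights only. This is a simplified sketch, not the authors’ implementation; in particular, the paper does not specify initialization, so for determinism we start from the first k boxes:

```python
def iou_wh(a, b):
    """IOU of two boxes of sizes (w, h), anchored at a common corner,
    as used when clustering box dimensions only."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)

def kmeans_anchors(boxes, k, iterations=10):
    # Hypothetical initialization: the first k boxes (paper leaves this open).
    centroids = list(boxes[:k])
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Assign each box by d(box, centroid) = 1 - IOU(box, centroid).
            i = min(range(k), key=lambda j: 1 - iou_wh(box, centroids[j]))
            clusters[i].append(box)
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = (sum(w for w, _ in members) / len(members),
                                sum(h for _, h in members) / len(members))
    return centroids

# Two clearly separated size groups converge to two anchors.
anchors = kmeans_anchors([(1, 1), (1.1, 0.9), (5, 5), (5.2, 4.8)], k=2)
```

Using 1 − IOU rather than Euclidean distance means that large boxes do not dominate the clustering simply because their coordinate errors are numerically larger.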

• Direct location prediction: with unconstrained offsets, every point of the image has the potential to host an anchor box regardless of location, even at the very edge of the image, and with perfectly random initialization the model takes too long to stabilize. To prevent this, YOLOv2 predicts location coordinates relative to the grid of S×S cells. The network predicts 5 bounding boxes for each cell and 5 coordinates for each bounding box. The last two constraints introduced - the size of the box and its location - make the network’s predictions more stable and easier to train, and led to nearly 5% performance improvement.
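In the paper, the network outputs raw values t_x, t_y, t_w, t_h (plus an objectness output, omitted here), and a logistic function keeps the predicted center inside its grid cell. A sketch of the decoding, with hypothetical anchor priors p_w, p_h:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Decode raw network outputs into a box, YOLOv2-style: the sigmoid
    bounds the center to the cell whose top-left offset is (cx, cy),
    and the anchor prior (pw, ph) is scaled exponentially."""
    bx = cx + sigmoid(tx)   # center x, in grid units
    by = cy + sigmoid(ty)   # center y, in grid units
    bw = pw * math.exp(tw)  # width
    bh = ph * math.exp(th)  # height
    return bx, by, bw, bh

bx, by, bw, bh = decode_box(0.0, 0.0, 0.0, 0.0, cx=3, cy=4, pw=2.0, ph=1.5)
```

However extreme the raw outputs, the sigmoid guarantees the center stays within its own cell, which is the stabilizing constraint described above.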

• Fine-grained features: passthrough layers are added that bring features from an earlier layer at 26×26 resolution. The passthrough layer combines the higher-resolution features with the low-resolution features by stacking adjacent features into different channels, similar to the identity mappings in ResNet.
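The stacking can be sketched as a space-to-depth rearrangement: each 2×2 spatial neighborhood becomes 4 channel groups, so a 26×26×C map turns into 13×13×4C. A generic sketch of the idea, not the Darknet code:

```python
import numpy as np

def passthrough(features, stride=2):
    """Rearrange an (H, W, C) feature map into (H/s, W/s, C*s*s) by
    stacking each s x s spatial neighborhood into the channels."""
    h, w, c = features.shape
    assert h % stride == 0 and w % stride == 0
    out = features.reshape(h // stride, stride, w // stride, stride, c)
    out = out.transpose(0, 2, 1, 3, 4)  # (H/s, W/s, s, s, C)
    return out.reshape(h // stride, w // stride, c * stride * stride)

x = np.arange(26 * 26 * 64).reshape(26, 26, 64).astype(np.float32)
y = passthrough(x)
```

No information is lost: the output tensor holds exactly the same values, now at the coarser spatial resolution of the later layers it is concatenated with.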

• Multi-scale training: YOLOv1 used input images of size 448×448; however, since YOLOv2 uses only convolutional and pooling layers, it can resize on the fly. To achieve actual multi-scale training, the network randomly selects a new image size every 10 batches. The size pool for selection consists of increments


d(box, centroid) = 1 - IOU(box, centroid)



of 32 between 320×320 and 608×608, thus enabling the network to train on images of different sizes. At high resolution YOLOv2 is a state-of-the-art detector with 78.6 mAP on VOC 2007, while still operating above real-time speeds.
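The size schedule itself is simple to sketch: every 10 batches, draw a multiple of 32 in [320, 608]. A toy illustration of the selection rule only (the surrounding training loop is omitted):

```python
import random

def pick_input_size(rng):
    """Pick a training resolution: a multiple of 32 between 320 and 608."""
    return 32 * rng.randint(10, 19)

rng = random.Random(0)  # seeded for reproducibility of this sketch
sizes = [pick_input_size(rng) for _ in range(100)]  # e.g. one per 10 batches
```

Because every size is a multiple of 32 (the network’s total downsampling factor), the output grid stays an integer size at every chosen resolution.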

As you can see, most of the network redesign decisions listed in the above table led to significant performance improvements (mAP).

Results:
The results achieved by YOLOv2 on the VOC 2007 and VOC 2012 datasets are reproduced in the figure below. You can observe the tradeoff between image size, run-time and accuracy that YOLOv2 offers, and in each run it achieves better speed or accuracy than all other algorithms (SSD, Fast R-CNN and Faster R-CNN).


YOLO9000
YOLO9000 implements the authors’ proposal to train a network simultaneously both for detecting objects and for classification. First, the COCO detection dataset and the ImageNet classification dataset needed to be combined: using the WordNet concept graph (which is diverse enough to cover most datasets’ needs), a hierarchical tree of visual concepts was constructed, named WordTree. Below is an illustration of using WordTree to combine labels from COCO and ImageNet.

The authors used this combined dataset - WordTree - to train YOLO9000. The YOLOv2 network architecture just described was used. In training, the network backpropagates loss as normal for image detection. For classification, however, only the loss at or above the corresponding level of the label is backpropagated. For example, if the label is “dog”, the network doesn’t assign any error to predictions further down the tree, “German Shepherd” versus “Golden Retriever”, because it does not have that information.
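In WordTree, each node predicts a probability conditioned on its parent, and the absolute probability of a concept is the product of the conditionals along its path to the root. A small sketch over a hypothetical four-node fragment (the class names and probability values are illustrative only):

```python
# Hypothetical WordTree fragment: each node stores Pr(node | parent).
tree = {
    "physical object": {"prob": 1.0, "parent": None},
    "animal":          {"prob": 0.9, "parent": "physical object"},
    "dog":             {"prob": 0.8, "parent": "animal"},
    "german shepherd": {"prob": 0.5, "parent": "dog"},
}

def absolute_prob(node, tree):
    """Multiply conditional probabilities along the path to the root."""
    p = 1.0
    while node is not None:
        p *= tree[node]["prob"]
        node = tree[node]["parent"]
    return p

p_dog = absolute_prob("dog", tree)
```

This is also why a “dog”-labelled sample penalizes nothing below “dog”: the conditionals beneath the label simply do not enter its loss.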

YOLO9000 was evaluated on the ImageNet detection task. The task shares only 44 object categories with COCO, meaning YOLO9000 has only seen classification data, not detection data, for the majority of test images. YOLO9000 got 19.7 mAP overall, and an impressive 16.0 mAP on the 156 object classes not shared with COCO, for which it has never seen any labelled detection data. Note that it is simultaneously detecting 9000 other object categories, running in real time.




Conclusion and Demo:
The authors introduce YOLOv2 and YOLO9000, two real-time detection and classification systems. YOLOv2 is state-of-the-art and faster than other detection systems across a wide range of datasets. YOLO9000 is a real-time convolutional network for detecting more than 9000 object categories by jointly optimizing detection and classification. WordTree - the combined dataset of COCO and ImageNet - was used to train YOLO9000. In addition, YOLO9000 is a strong step towards closing the dataset size gap between detection and classification.

Click for a demo to integrate.

Let’s give it a quick try, just so you have a taste of how to use it: this demo can also be found at the official YOLO9000 website. It demonstrates how to use a pre-trained model for detection: install Darknet, download the pre-trained weight file, then run the detector and inspect the detections it prints. Click for more demos and full installation instructions.


The Vision Monday Leadership Summit took place in New York on March 29, with the title: Supercharging Knowledge and Decision Making. That was only a few days before the publication of Computer Vision News. However, thanks to our special correspondent Lior G., we can show you a little bit of what went on there.

“There's a huge learning curve to applying AI to healthcare”Pearse A. Keane, Moorfields Eye Hospital (UK)

Vision Monday Leadership Summit


Artificial Intelligence Events

Session 4: A-Eye - How Artificial Intelligence is Transforming Eyecare

From a slide by Chandra Narayanaswami, chief scientist and senior manager, IBM: “Analytic decision overload for Data Scientists” (content belongs to IBM Corp.)


Upcoming Events

Last year, Intel invited us to attend CVVC2016, their Computer Vision Validation Conference, in Haifa.

It was a great event and the technical program was exceptional. Among many others, we had the chance to follow the lectures of Prasad Modali, Danny Feldman and our CEO Ron Soferman, who gave an acclaimed speech on “Computer Vision Validation - Key Learnings from Medical Devices and Other Industries”.

Intel is now announcing CVVC2017, on September 11-12 at their premises in Haifa, Israel. The lectures will be in English and we recommend you give it a thought.

The conference program will be announced on July 2. Click on the image below to register (English info is available at the bottom of the page). Read also about the call for papers, tutorials and workshops. The deadline for submission is May 15.

“How to test products that are based on deep learning and computer vision technology?”

CVVC - Intel


Evostar 2017 - Bio-Inspired Computation (Amsterdam, Netherlands, Apr. 19-21)
Machine Learning Prague 2017 (Prague, Czech Republic, April 21-23)
RE•WORK Deep Learning Summit in Singapore (Singapore, April 27-28)
BIOMEDevice and Embedded Systems Conference (Boston MA, USA, May 3-4)
CRV - Conference on Computer and Robot Vision (Edmonton AB, Canada, May 16-19)
RE•WORK Deep Learning / Deep Learning in Healthcare (San Francisco CA, USA, May 25-26)
WSCG Computer Graphics, Visualization and Computer Vision (Pilsen, Czech Republic, May 29-Jun 2)
FG: IEEE Intl. Conf. on Automatic Face and Gesture Recognition (Washington DC, USA, May 30-Jun 3)
AI Expo Europe Conference and Exhibition (with IoT Tech Expo) (Berlin, Germany, Jun 1-2)
IEEE Intelligent Vehicles Symposium 2017 (Redondo Beach CA, USA, Jun 11-14)
SCIA 2017 Scandinavian Conference on Image Analysis (Tromsø, Norway, Jun 12-14)
CARS 2017 Computer Assisted Radiology and Surgery (Barcelona, Spain, Jun 20-24)

Did we miss an event? Tell us: [email protected]

Dear reader,

Do you enjoy reading Computer Vision News? Would you like to receive it for free in your mailbox every month?

You will fill in the Subscription Form in less than 1 minute. Join many other computer vision professionals and receive all issues of Computer Vision News as soon as we publish them. You can also read Computer Vision News on our website, where our archive holds new and old issues as well.

We hate SPAM and promise to keep your email address safe, always.

FREE SUBSCRIPTION

Subscription Form(click here, it’s free)

Website and Registration



Dear reader,

How do you like Computer Vision News? Did you enjoy reading it? Give us feedback here:

It will take you only 2 minutes to fill in, and it will help us give the computer vision community the great magazine it deserves!

Give us feedback, please (click here)

FEEDBACK




Improve your vision with

The only magazine covering all the fields of the computer vision and image processing industry

A publication by

Subscribe (click here, it’s free)

Mapillary · RE•WORK · Gauss Surgical