Jianjun Zhao (趙建軍) and Lei Ma (馬雷)

(Department of Advanced Information Technology)


http://stap.ait.kyushu-u.ac.jp/~zhao/course/2020/Machine%20Learning%20Systems%20Engineering-2020.html

Jianjun Zhao

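PANGU Lab: Inspire Intelligence of Future

Advanced Topics in Machine Learning Systems Engineering (Lecture 1, June 30, 2020)

Machine Learning Systems Engineering (機械学習工学) is a new field that systematically and quantitatively studies the development, operation, and maintenance of machine learning systems and their applications. The goal of this course is to master its cutting-edge content.

This course discusses various methods and tools for developing and operating machine learning systems.

▪ Specifically, students learn methods for requirements analysis and goal design for machine learning systems; frameworks, programming languages, and development environments for building such systems efficiently; architectures used in their design; and methods for testing, verifying, debugging, and monitoring them.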

Students should have basic knowledge of machine learning, security and privacy, and software engineering.

Every Tuesday, periods 1-2 (from June 30, 8:40-12:00)

Online remote lecture

▪ Microsoft Teams (join via the links below)

https://teams.microsoft.com/dl/launcher/launcher.html?url=%2F_%23%2Fl%2Fmeetup-join%2F19%3Ameeting_ZGQ0YWQ5YTAtOTRmZC00MTUzLWFmOTctM2RlMTU2MzUyYjdk%40thread.v2%2F0%3Fcontext%3D%257b%2522Tid%2522%253a%2522d7715f89-936a-4af7-bb75-a57ac99646fa%2522%252c%2522Oid%2522%253a%2522cf9358a1-746e-46c4-ba59-15da73a7d810%2522%257d%26anon%3Dtrue&type=meetup-join&deeplinkId=7e397fbd-6793-45a3-a8ec-5fb2f50ba6a3&directDl=true&msLaunch=true&enableMobilePage=true&suppressPrompt=true


https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZGQ0YWQ5YTAtOTRmZC00MTUzLWFmOTctM2RlMTU2MzUyYjdk%40thread.v2/0?context=%7b%22Tid%22%3a%22d7715f89-936a-4af7-bb75-a57ac99646fa%22%2c%22Oid%22%3a%22cf9358a1-746e-46c4-ba59-15da73a7d810%22%7d

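Lecture 1: Introduction to the Course (2020.06.30)

Lecture 2: Introduction to AI Software Quality (quality of deep learning software) (2020.07.07) (Lei Ma)

Lecture 3: AI Software Testing (testing methods and techniques for deep learning software) (2020.07.14) (Lei Ma)

Lecture 4: AI Software Analysis (analysis methods and techniques for deep learning software) (2020.07.21) (Lei Ma)

Lecture 5: Paper reading seminar on testing of deep learning systems (2020.07.28)

Lecture 6: Paper reading seminar on debugging and analysis of deep learning systems (2020.08.04)

Lecture 7: Paper reading seminar on verification methods and techniques for deep learning systems (2020.08.11)

Lectures 8-14: Research project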

Lecturer

▪ Jianjun Zhao (趙建軍)

▪ Room 751, West Building 2 (ウエスト2号館)

▪ 092-802-3625 (Office)

▪ [email protected]

▪ Lei Ma (馬雷) (Guest Lecturer)

▪ Room 752, West Building 2

▪ [email protected]

PANGU Lab: Inspire Intelligence of Future

Intelligent Software Engineering Laboratory (知能ソフトウェア工学研究室)

https://pangukaitian.github.io/pangu/jp/

Group members (20+)

▪ 3 faculty members, 4 PhD students, 13 MS/research students, and 4 undergraduates.

Ongoing work

▪ Research on the potential symbioses between software engineering (SE) and artificial intelligence (AI)

The overall long-term goal

▪ To obtain better software and AI systems, making them more

- robust, reliable

- secure, privacy-preserving (e.g., EU GDPR, California CCPA)

- interpretable, understandable, fair

- easier to specify, build, maintain, or improve

Jianjun Zhao Lei Ma

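1982-1987: B.S., Department of Computer Science and Technology, Tsinghua University

1994-1997: Ph.D., Department of Computer Science, School of Information Science, Kyushu University, Japan

1997-2000: Lecturer, Department of Computer Science, Fukuoka Institute of Technology

2000-2005: Associate Professor, Department of Computer Science, Fukuoka Institute of Technology

2005-2015: Professor, School of Software, Shanghai Jiao Tong University

2015-2016: Professor, Department of Computer Science and Engineering, Shanghai Jiao Tong University

2016-present: Professor, School of Information Science, Kyushu University, Japan

2002-2003: Visiting Scientist, Laboratory for Computer Science, Massachusetts Institute of Technology (MIT), USA

2012-2017: Visiting Professor, National Institute of Informatics (NII), Japan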

The final grade will be based on

▪ Attendance, paper reading seminar, and research project report (evaluated comprehensively)

Deep Learning (Adaptive Computation and Machine Learning series)

Ian Goodfellow, Yoshua Bengio, Aaron Courville

The MIT Press, 2016

ISBN-10: 0262035618

The most comprehensive textbook on deep learning available today

Conference proceedings

▪ AAAI: AAAI Conference on Artificial Intelligence

▪ NeurIPS: Annual Conference on Neural Information Processing Systems

▪ ACL: Annual Meeting of the Association for Computational Linguistics

▪ CVPR: IEEE Conference on Computer Vision and Pattern Recognition

▪ ICCV: International Conference on Computer Vision

▪ ICML: International Conference on Machine Learning

▪ IJCAI: International Joint Conference on Artificial Intelligence

Related research groups

▪ Google, Facebook, Microsoft, Amazon, Stanford, MIT, CMU, Berkeley

▪ ETH, University of Oxford, INRIA, EPFL, Tsinghua, Peking, Toronto

▪ Kyushu University, The University of Tokyo, RIKEN, National Institute of Informatics, etc.

http://csrankings.org/

https://ainow.ai/2020/01/13/182173/#2019AI20-3

What is deep learning (DL)?

Secure deep learning engineering

Challenges and opportunities


Learning is any process by which a system improves performance from experience

-- Herbert Simon

The complexity in traditional computer programming is in the code (programs that people write). In machine learning, algorithms (programs) are in principle simple and the complexity (structure) is in the data. Is there a way that we can automatically learn that structure? That is what is at the heart of machine learning.

-- Andrew Ng


Algorithms that can improve performance using training data

Typically, a large number of parameter values are learned from data

Applicable to situations where it is challenging to define rules manually
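As a minimal sketch of "parameter values learned from data", the snippet below fits a line by gradient descent. The toy data, the function name `fit_line`, and the hyperparameter values are illustrative assumptions, not part of the lecture.

```python
# Hypothetical toy data: points on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Learn slope w and intercept b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = fit_line(xs, ys)
```

No rule such as "multiply by two and add one" is written by hand; the two parameters are recovered from the examples alone.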

Deep learning is part of a broader family of machine learning methods based on artificial neural networks. It is a class of machine learning algorithms that use multiple layers to progressively extract higher level features from raw input.

Deep learning architectures such as deep neural networks and recurrent neural networks have been applied to fields including computer vision, speech recognition, natural language processing, and machine translation, where they have produced results comparable to and in some cases superior to human experts.

Artificial Neural Networks (ANNs) were inspired by information processing and distributed communication nodes in biological systems.

4:1 (2016)

Human Champion

100:0 (2017)

LETTER doi:10.1038/nature14236

Human-level control through deep reinforcement learning

Volodymyr Mnih*, Koray Kavukcuoglu*, David Silver*, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg & Demis Hassabis

(Google DeepMind, 5 New Street Square, London EC4A 3TW, UK. *These authors contributed equally to this work.)

The theory of reinforcement learning provides a normative account [1], deeply rooted in psychological [2] and neuroscientific [3] perspectives on animal behaviour, of how agents may optimize their control of an environment. To use reinforcement learning successfully in situations approaching real-world complexity, however, agents are confronted with a difficult task: they must derive efficient representations of the environment from high-dimensional sensory inputs, and use these to generalize past experience to new situations. Remarkably, humans and other animals seem to solve this problem through a harmonious combination of reinforcement learning and hierarchical sensory processing systems [4,5], the former evidenced by a wealth of neural data revealing notable parallels between the phasic signals emitted by dopaminergic neurons and temporal difference reinforcement learning algorithms [3]. While reinforcement learning agents have achieved some successes in a variety of domains [6-8], their applicability has previously been limited to domains in which useful features can be handcrafted, or to domains with fully observed, low-dimensional state spaces. Here we use recent advances in training deep neural networks [9-11] to develop a novel artificial agent, termed a deep Q-network, that can learn successful policies directly from high-dimensional sensory inputs using end-to-end reinforcement learning. We tested this agent on the challenging domain of classic Atari 2600 games [12]. We demonstrate that the deep Q-network agent, receiving only the pixels and the game score as inputs, was able to surpass the performance of all previous algorithms and achieve a level comparable to that of a professional human games tester across a set of 49 games, using the same algorithm, network architecture and hyperparameters. This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.

We set out to create a single algorithm that would be able to develop a wide range of competencies on a varied range of challenging tasks, a central goal of general artificial intelligence [13] that has eluded previous efforts [8,14,15]. To achieve this, we developed a novel agent, a deep Q-network (DQN), which is able to combine reinforcement learning with a class of artificial neural network [16] known as deep neural networks. Notably, recent advances in deep neural networks [9-11], in which several layers of nodes are used to build up progressively more abstract representations of the data, have made it possible for artificial neural networks to learn concepts such as object categories directly from raw sensory data. We use one particularly successful architecture, the deep convolutional network [17], which uses hierarchical layers of tiled convolutional filters to mimic the effects of receptive fields (inspired by Hubel and Wiesel's seminal work on feedforward processing in early visual cortex [18]), thereby exploiting the local spatial correlations present in images, and building in robustness to natural transformations such as changes of viewpoint or scale.

We consider tasks in which the agent interacts with an environment through a sequence of observations, actions and rewards. The goal of the agent is to select actions in a fashion that maximizes cumulative future reward. More formally, we use a deep convolutional neural network to approximate the optimal action-value function

$$Q^{*}(s,a) = \max_{\pi} \mathbb{E}\left[ r_t + \gamma r_{t+1} + \gamma^{2} r_{t+2} + \dots \mid s_t = s,\ a_t = a,\ \pi \right],$$

which is the maximum sum of rewards $r_t$ discounted by $\gamma$ at each time-step $t$, achievable by a behaviour policy $\pi = P(a \mid s)$, after making an observation ($s$) and taking an action ($a$) (see Methods) [19].

Reinforcement learning is known to be unstable or even to diverge when a nonlinear function approximator such as a neural network is used to represent the action-value (also known as Q) function [20]. This instability has several causes: the correlations present in the sequence of observations, the fact that small updates to Q may significantly change the policy and therefore change the data distribution, and the correlations between the action-values ($Q$) and the target values $r + \gamma \max_{a'} Q(s', a')$. We address these instabilities with a novel variant of Q-learning, which uses two key ideas. First, we used a biologically inspired mechanism termed experience replay [21-23] that randomizes over the data, thereby removing correlations in the observation sequence and smoothing over changes in the data distribution (see below for details). Second, we used an iterative update that adjusts the action-values ($Q$) towards target values that are only periodically updated, thereby reducing correlations with the target.

While other stable methods exist for training neural networks in the reinforcement learning setting, such as neural fitted Q-iteration [24], these methods involve the repeated training of networks de novo on hundreds of iterations. Consequently, these methods, unlike our algorithm, are too inefficient to be used successfully with large neural networks. We parameterize an approximate value function $Q(s,a;\theta_i)$ using the deep convolutional neural network shown in Fig. 1, in which $\theta_i$ are the parameters (that is, weights) of the Q-network at iteration $i$. To perform experience replay we store the agent's experiences $e_t = (s_t, a_t, r_t, s_{t+1})$ at each time-step $t$ in a data set $D_t = \{e_1, \dots, e_t\}$. During learning, we apply Q-learning updates, on samples (or minibatches) of experience $(s,a,r,s') \sim U(D)$, drawn uniformly at random from the pool of stored samples. The Q-learning update at iteration $i$ uses the following loss function:

$$L_i(\theta_i) = \mathbb{E}_{(s,a,r,s') \sim U(D)}\left[ \left( r + \gamma \max_{a'} Q(s',a';\theta_i^{-}) - Q(s,a;\theta_i) \right)^{2} \right]$$

in which $\gamma$ is the discount factor determining the agent's horizon, $\theta_i$ are the parameters of the Q-network at iteration $i$ and $\theta_i^{-}$ are the network parameters used to compute the target at iteration $i$. The target network parameters $\theta_i^{-}$ are only updated with the Q-network parameters ($\theta_i$) every $C$ steps and are held fixed between individual updates (see Methods).

To evaluate our DQN agent, we took advantage of the Atari 2600 platform, which offers a diverse array of tasks (n = 49) designed to be

26 February 2015 | Vol 518 | Nature | 529. ©2015 Macmillan Publishers Limited. All rights reserved.

2015
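The loss function quoted from the DQN paper can be illustrated with a small tabular sketch. This is not the paper's implementation: the Q-values are stored in dictionaries instead of a convolutional network, and the states, actions, rewards, and the value of `GAMMA` are hypothetical.

```python
import random

GAMMA = 0.9  # discount factor gamma (hypothetical value)


def td_target(reward, next_state, target_q, actions):
    """r + gamma * max_a' Q(s', a'; theta^-), using the frozen target parameters."""
    return reward + GAMMA * max(target_q[(next_state, a)] for a in actions)


def minibatch_loss(batch, q, target_q, actions):
    """Mean squared TD error over a minibatch of (s, a, r, s') experiences."""
    errors = [
        (td_target(r, s_next, target_q, actions) - q[(s, a)]) ** 2
        for (s, a, r, s_next) in batch
    ]
    return sum(errors) / len(errors)


actions = [0, 1]
q = {(s, a): 0.0 for s in range(3) for a in actions}  # online parameters theta
target_q = dict(q)                                    # frozen copy theta^-
target_q[(1, 1)] = 2.0

# Replay memory D of stored experiences, sampled uniformly at random (U(D)).
replay = [(0, 0, 1.0, 1), (1, 1, 0.0, 2), (0, 1, 0.5, 1)]
batch = random.Random(0).sample(replay, 2)
loss = minibatch_loss(batch, q, target_q, actions)
```

In a full agent, `target_q` would be overwritten with a copy of `q` only every C steps, which is what keeps the targets fixed between individual updates.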

2017

In China's most innovative city, Shenzhen, two US-educated Chinese scientists have found a way to turn part of God's Eye into reality, if the authorities allow them to insert a tiny chip into surveillance cameras. With the chip, a surveillance camera can greatly speed up human facial recognition and spot a criminal suspect in a crowd in just a few seconds. It has proved effective in at least one district in Shenzhen and, according to publicly disclosed information, has helped police crack hundreds of cases and find a number of lost children.

https://www.technologyreview.com/s/611815/who-needs-democracy-when-you-have-data/

Andrej Karpathy (Tesla)

*Borrowed from the talk by Prof. Foutse Khomh (MLSE International Symposium 2019)

DL has the potential to create annual value across sectors totalling $3.5 to $5.8 trillion

Software 2.0, IR 4.0

CB Insights: https://www.cbinsights.com/

The Nobel Prize of computer science

Alan Mathison Turing (June 23, 1912 - June 7, 1954)

For conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing.

Yoshua Bengio is a Professor at the University of Montreal, and the Scientific Director of both Mila (Quebec’s Artificial Intelligence Institute) and IVADO (the Institute for Data Valorization). He is Co-director (with Yann LeCun) of CIFAR’s Learning in Machines and Brains program. Bengio received a Bachelor’s degree in electrical engineering, a Master’s degree in computer science and a Doctoral degree in computer science from McGill University.

Geoffrey Hinton is VP and Engineering Fellow of Google, Chief Scientific Adviser of The Vector Institute and a University Professor Emeritus at the University of Toronto. Hinton received a Bachelor’s degree in experimental psychology from Cambridge University and a Doctoral degree in artificial intelligence from the University of Edinburgh. He was the founding Director of the Neural Computation and Adaptive Perception (later Learning in Machines and Brains) program at CIFAR.

Yann LeCun is Silver Professor of the Courant Institute of Mathematical Sciences at New York University, and VP and Chief AI Scientist at Facebook. He received a Diplôme d'Ingénieur from the Ecole Superieure d'Ingénieur en Electrotechnique et Electronique (ESIEE), and a PhD in computer science from Université Pierre et Marie Curie.

https://awards.acm.org/about/2018-turing

Akiyoshi Kitaoka's illusion pages (北岡明佳の錯視のページ): http://www.ritsumei.ac.jp/~akitaoka/

[Szegedy Zaremba Sutskever Bruna Erhan Goodfellow Fergus 2013]

[Biggio Corona Maiorca Nelson Srndic Laskov Giacinto Roli 2013]

Classified as panda + small adversarial noise = classified as gibbon

Ian Goodfellow, Jon Shlens, Christian Szegedy, Explaining and Harnessing Adversarial Examples, ICLR, 2015
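The "small adversarial noise" idea can be sketched with the fast gradient sign method from the Goodfellow et al. paper, applied here to a tiny logistic-regression model instead of an image classifier. The weights, the input, and the value of `epsilon` are hypothetical and chosen only so that the flip is easy to see.

```python
import math


def predict(weights, bias, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def sign(value):
    return (value > 0) - (value < 0)


def fgsm(weights, bias, x, label, epsilon):
    """x_adv = x + epsilon * sign(dLoss/dx) for the cross-entropy loss."""
    p = predict(weights, bias, x)
    # For logistic regression, dLoss/dx_i = (p - label) * w_i.
    gradient = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, gradient)]


weights, bias = [2.0, -3.0, 1.0], 0.0  # hypothetical trained model
x, label = [0.2, -0.1, 0.1], 1         # clean input, true class 1
x_adv = fgsm(weights, bias, x, label, epsilon=0.2)
```

Each input component moves by at most `epsilon`, yet the predicted probability of the true class drops from above 0.5 to below 0.5.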

[Sharif Bhagavatula Bauer Reiter 2016]: Glasses that fool face recognition

Safety-critical and security-critical applications: face recognition, malware detection, self-driving cars, medical diagnosis

* Su, Jiawei, Danilo Vasconcellos Vargas and Kouichi Sakurai. "One pixel attack for fooling deep neural networks". arXiv: https://arxiv.org/abs/1710.08864 (BBC News: http://www.bbc.com/news/technology-41845878)

Tesla Autopilot failed to recognize a white truck against a bright sky, leading to a fatal crash

https://www.siliconvalley.com/2016/07/26/feds-driver-in-fatal-tesla-autopilot-crash-was-speeding/

An Uber self-driving test car was involved in a fatal collision on March 19th, 2018

Deep learning reliability and security are crucial

Self-driving cars, medical diagnosis, malware detection

Challenge: How to guarantee the safety and security of DL systems?

We propose Secure Deep Learning Engineering (SDLE) as an engineering discipline for supporting safe and secure deep learning system development.

▪ Lei Ma, Felix Juefei-Xu, Minhui Xue, Qiang Hu, Sen Chen, Bo Li, Yang Liu, Jianjun Zhao, Jianxiong Yin, and Simon See. Secure Deep Learning Engineering: A Software Quality Assurance Perspective. In arXiv Preprint, 2018.


We define SDLE as:

➢ an engineering discipline of safe and secure DL system development, through a systematic application of knowledge, methodology, practice on deep learning, software engineering and security, to requirement analysis, design, implementation, testing, deployment, and maintenance of DL systems.

Requirement analysis investigates the needs, and determines and creates detailed functional documents for the DL products.

The decision logic of DL-based software is learned from the training data and generalized to the testing data.

The requirement is usually measured in terms of an expected prediction performance, which is often a statistics-based requirement, as opposed to the rule-based one in traditional SE.
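One way to read a statistics-based requirement is "the true accuracy should be at least some target, with reasonable confidence". The sketch below checks this with the lower end of a Wilson score interval; the choice of interval, the function names, and the numbers are assumptions made for illustration, not something the lecture prescribes.

```python
import math


def accuracy_lower_bound(correct, total, z=1.96):
    """Lower end of the Wilson score interval for an observed accuracy."""
    p = correct / total
    denominator = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / denominator


def meets_requirement(correct, total, required_accuracy):
    """True if the interval's lower end reaches the required accuracy."""
    return accuracy_lower_bound(correct, total) >= required_accuracy
```

With a required accuracy of 0.94, an observed 960 correct out of 1000 passes, while 96 correct out of 100 does not: the observed accuracy is the same, but the smaller test set gives weaker evidence.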

After the requirements of the DL software become available, a DL developer (potentially with domain experts for supervision and labeling) tries to collect representative data that incorporate the knowledge on the specific target task.

For traditional software, a human developer needs to understand the specific task, figure out a set of algorithmic operations to solve the task, and program such operations in the form of source code for execution.

One of the most important sources of DL software is training data, from which the DL software automatically distills the computational solutions of a specific task.

◼ When the training data become available, a DL developer designs the DNN architecture, taking into account the requirements, data complexity, and the problem domain.

◼ For example, when addressing a general-purpose image processing task, convolutional layer components are often included in the DNN model design, while recurrent layers are often used for natural language processing tasks.

◼ To concretely implement the desired DNN architecture, a DL developer often leverages an existing DL framework to encode the designed DNN into a training program.

◼ The developer also needs to specify the runtime training behaviors through the APIs provided by the DL framework (e.g., training epochs, learning rate, GPU/CPU configurations).
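A training program and its runtime settings might look like the following sketch. It uses plain Python with a one-feature logistic model as a stand-in for a DL framework, so `TrainingConfig`, its field names, and the data are hypothetical and do not correspond to any real framework API.

```python
import math
from dataclasses import dataclass


@dataclass
class TrainingConfig:
    epochs: int = 200           # number of passes over the training data
    learning_rate: float = 0.5  # step size of each update
    device: str = "cpu"         # placeholder for a GPU/CPU setting; unused here


def train(data, config):
    """Fit a one-feature logistic model with stochastic gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(config.epochs):
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            w -= config.learning_rate * (p - y) * x
            b -= config.learning_rate * (p - y)
    return w, b


data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]  # hypothetical labeled data
w, b = train(data, TrainingConfig())
```

Keeping the runtime behavior in one configuration object makes it straightforward to rerun training with a different number of epochs or learning rate without touching the training loop.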

◼ After the DL programming ingredients (i.e., training data and training program) are ready, the runtime training procedure starts and systematically evolves the decision logic towards effectively resolving the target task.

◼ The training procedure and training program adjustment might go back and forth for several rounds until a satisfactory performance is achieved.

◼ Although the training program itself is often written as traditional software (e.g., in Python, Java), the obtained DL software is often encoded in a DNN model, consisting of the DNN architecture and weight matrices.

◼ The training process plays a central role in DL software learning, distilling knowledge and solutions from the sources. Realizing learning theory as DL software has involved considerable software and system engineering effort over the years.
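To make "a DNN model, consisting of the DNN architecture and weight matrices" concrete, the sketch below stores both parts in one structure and round-trips it through JSON. The layout of the dictionary and all values are assumptions for illustration; real frameworks use their own model formats.

```python
import json

model = {
    "architecture": {"layers": [2, 3, 1], "activation": "relu"},
    "weights": [
        [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]],  # layer 1: 3 neurons x 2 inputs
        [[0.7, -0.1, 0.2]],                      # layer 2: 1 neuron x 3 inputs
    ],
}


def weight_shapes(model):
    return [(len(matrix), len(matrix[0])) for matrix in model["weights"]]


def is_consistent(model):
    """Check that each weight matrix matches the declared layer sizes."""
    sizes = model["architecture"]["layers"]
    expected = [(sizes[i + 1], sizes[i]) for i in range(len(sizes) - 1)]
    return weight_shapes(model) == expected


serialized = json.dumps(model)     # what gets shipped instead of source code
restored = json.loads(serialized)
```

The shipped artifact contains no branching code at all: the behavior is carried entirely by the layer sizes and the numbers in the weight matrices.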

◼ When the DNN model completes training with its decision logic determined, it goes through a systematic evaluation of its generality and quality through testing (or verification).

◼ Note that the testing activity in the AI community mainly considers whether the obtained DL model generalizes to the prepared test dataset, to obtain high test accuracy.

◼ The testing (or verification) activity in SDLE considers a more general evaluation scope, such as generality, robustness, defect detection, and other nonfunctional requirements (e.g., efficiency).

◼ Early weakness detection in the DL software provides valuable feedback to a DL developer for solution enhancement.
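The difference between test accuracy and a broader evaluation scope can be sketched with two toy classifiers. The robustness measure below (prediction unchanged when the input shifts by `delta` in either direction) is a simple assumption chosen for illustration, and the classifiers and samples are hypothetical.

```python
def accuracy(classifier, samples):
    """Fraction of samples whose prediction equals the label."""
    return sum(classifier(x) == y for x, y in samples) / len(samples)


def robustness(classifier, samples, delta):
    """Fraction of inputs whose prediction is unchanged at x - delta and x + delta."""
    stable = sum(
        classifier(x - delta) == classifier(x) == classifier(x + delta)
        for x, _ in samples
    )
    return stable / len(samples)


def threshold_classifier(x):
    return 1 if x >= 0.5 else 0


def constant_classifier(x):
    return 0  # always answers the same


samples = [(0.1, 0), (0.3, 0), (0.45, 0), (0.55, 1), (0.7, 1), (0.9, 1)]
```

On these samples the threshold classifier is fully accurate but changes its answer near the decision boundary, while the constant classifier never changes its answer yet is right only half the time. Neither number alone describes quality.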

◼ DL software that has passed the testing phase has reached a certain quality standard and is ready to be deployed to a target platform.

◼ However, due to platform diversity, DL framework support, and the computation limitations of a target device, the DL software often needs to go through a platform calibration procedure (e.g., compression, quantization, DL framework migration) before deployment on a target platform.

◼ For example, once DL software has been trained on the TensorFlow framework, it needs to be transformed to its TensorFlow Lite (resp. Core ML) counterpart for the Android (resp. iOS) platform.

◼ It still needs to go through on-device testing after deployment; we omit the testing phase after deployment for simplicity.
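Quantization, one of the calibration steps named above, can be sketched as mapping floating-point weights to 8-bit integers with a single scale factor. This is a generic symmetric scheme written for illustration, not the procedure of any particular framework, and the weights are hypothetical.

```python
def quantize(weights, num_bits=8):
    """Map floats to signed integers with a single symmetric scale factor."""
    q_max = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / q_max if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(q_weights, scale):
    return [q * scale for q in q_weights]


weights = [0.82, -1.27, 0.003, 0.5, -0.75]  # hypothetical trained weights
q_weights, scale = quantize(weights)
restored = dequantize(q_weights, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The rounding error per weight is bounded by half the scale factor, which is why a quantized model can behave slightly differently from the original and should be tested again on the target device.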

◼ After a DL product is deployed, it might undergo modification for bug correction, performance and feature enhancements, or other attributes.

◼ For traditional software, the major effort in the evolution and maintenance phases relies on manual revision of the design, source code, documentation, or other software artifacts.

◼ DL software focuses more on comprehensive data collection and continuous learning of the DL model (e.g., re-fitting, retro-fitting, fine-tuning, and re-engineering).

◼ We collect papers from conferences listed on Computer Science Rankings within the scope of AI & machine learning, software engineering, and security.

◼ We developed a Python-based crawler to extract paper information for each listed conference since the year 2000.

◼ We use keywords (e.g., deep learning, AI, security, testing, verification, quality, robustness) to filter the collected papers.

◼ This finally results in 223 papers.

◼ We manually confirmed and labeled each paper to form a final categorized list of literature.
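The keyword-filtering step of the survey can be sketched as follows. This is not the authors' crawler: the record format, the function names, and the whole-word rule for short acronyms are assumptions, and the third paper title is invented for the example.

```python
KEYWORDS = ["deep learning", "AI", "security", "testing",
            "verification", "quality", "robustness"]


def matches(title, keywords=KEYWORDS):
    """Case-insensitive match; short acronyms such as "AI" match whole words only."""
    lowered = title.lower()
    words = set(lowered.replace(":", " ").replace(",", " ").split())
    for keyword in keywords:
        k = keyword.lower()
        if len(k) <= 2:
            if k in words:
                return True
        elif k in lowered:
            return True
    return False


def filter_papers(papers, first_year=2000):
    """Keep papers from first_year onward whose title matches a keyword."""
    return [p for p in papers if p["year"] >= first_year and matches(p["title"])]


papers = [
    {"title": "DeepGauge: Multi-Granularity Testing Criteria for Deep Learning Systems", "year": 2018},
    {"title": "Explaining and Harnessing Adversarial Examples", "year": 2015},
    {"title": "A Hypothetical Paper on Maintaining Build Scripts", "year": 2019},  # invented title
]
selected = filter_papers(papers)
```

Only the first title is selected here. The adversarial-examples paper is relevant but contains none of the keywords, which illustrates why a manual confirmation and labeling pass is still needed after filtering.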

General purpose quality assurance

▪ Robustness

▪ Reliability

▪ Safety

Interpretability & understandability

▪ How and why

▪ Trustworthiness

Fairness

▪ Sex bias & discrimination

▪ Racism

Security

▪ Privacy (data & model)

▪ Against poisoning


Why so important?

For some applications:

▪ Unexpected complaints

▪ Lose customers' trust

▪ Fall behind competitors

▪ Lose market share

For X-critical systems:

▪ Dead or alive

▪ A must before shipment

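[Figure: traditional software development. A developer writes source code (coding), which compilation transforms into executable code that maps input to output. The example traditional program is drawn as a control flow graph: x=0, a branch on (x==8), and the statements x+=1 and x+=2.]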

A neural network is a function f(X) → Y

▪ Trainable parameters (Wi) on each edge and a nonlinear activation function at each neuron

▪ The DNN learns the weights during training

Inference: simply propagate X through the layers (fast)

Training: given a training set (X, Y), adjust W to minimize the prediction error (slow)
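Inference as "propagate X through the layers" can be sketched in a few lines. The network below is hypothetical: two inputs, one hidden layer of two ReLU neurons, one linear output, with hand-picked weights instead of trained ones.

```python
def relu(z):
    return max(0.0, z)


def forward(x, layers):
    """Propagate x through (weights, biases) layers, with ReLU on hidden layers."""
    activation = x
    for index, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(weights, biases)]
        is_last = index == len(layers) - 1
        activation = z if is_last else [relu(v) for v in z]
    return activation


layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),  # hidden layer: 2 neurons
    ([[2.0, 1.0]], [0.5]),                     # output layer: 1 neuron
]
y = forward([3.0, 1.0], layers)  # hidden activations [2.0, 1.0], output [5.5]
```

Inference is a fixed sequence of multiply-add operations, which is why it is fast; training is slow because it repeats such passes many times while adjusting every weight.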

[Figure: deep learning software development. A developer collects training data and codes a training program; training produces the deep learning model (a neural network) that maps input to output.]

The decision logic of traditional software:

◼ In the form of code

The decision logic of a DL system:

◼ The structure of the DNN

◼ The connection weights

Traditional Software Development: labor intensive

VS

Data intensive: the few experts decide >70%

Classification

Generation & Synthesis

Prediction

[Figure: a DNN drawn as neurons with connection strengths]

Behavior of the DL system?

How to test & debug for quality assurance?

High accuracy vs. high DL quality

High accuracy vs. high robustness

Robustness:

Robustness vs. a simple classifier that always answers the same

https://www.linkedin.com/pulse/ml-overfitting-sync-reality-atul-aphale

ASE’18, ISSRE’18, SANER’19, ISSTA’19, ESEC/FSE’19, CCS’18

Testing Quality & Confidence

▪ Structural perspective

Test Data Quality

▪ Decision-logic perspective (similar to semantics)

Efficient DL Defect Detection

▪ Development phase

▪ Deployment phase

Multi-Granularity Testing Criteria for Deep Learning Systems

• Enable quality evaluation of DL systems from multiple portrayals

• Provide systematic guidance for generating tests that reveal defects

• Facilitate interpretation and understanding

ACM SIGSOFT Distinguished Paper Award

Neural Network 3D Simulation: https://www.youtube.com/watch?v=3JQ3hYko51Y

* Lei Ma, Felix Juefei-Xu, Fuyuan Zhang, Jiyuan Sun, Chunyang Chen, Ting Su, Minhui Xue, Bo Li, Li Li, Yang Liu, Jianjun Zhao, Yadong Wang. DeepGauge: Multi-Granularity Testing Criteria for Deep Learning Systems. In Proc. 33rd IEEE/ACM Conference on Automated Software Engineering (ASE 2018), Montpellier, France, September 3-7, 2018.

➢ Simple to understand & use

➢ Efficient to compute

➢ General to diverse DNNs

➢ Scale to large DNNs

➢ Adaptable by cases

(from TensorFlow Neural Network Playground)

Chris Olah, Alexander Mordvintsev, Ludwig Schubert, "Feature Visualization", Distill, 2017.

GoogLeNet trained on ImageNet

TensorFuzz is a coverage-guided fuzzing method for DNNs.
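The general idea of coverage-guided fuzzing can be sketched as a loop that keeps a mutated input only when it activates a neuron that no earlier input activated. This is not TensorFuzz's actual coverage metric or mutation strategy: the one-input "layer", the activation rule, and all parameter values are hypothetical.

```python
import random

NUM_NEURONS = 5


def hidden_activations(x):
    """Hypothetical layer: neuron k fires once the input exceeds 0.4 * (k + 1)."""
    return [max(0.0, x - 0.4 * (k + 1)) for k in range(NUM_NEURONS)]


def covered_neurons(x):
    return {k for k, a in enumerate(hidden_activations(x)) if a > 0.0}


def fuzz(seeds, iterations=2000, step=0.5, rng=None):
    """Keep a mutated input in the corpus only if it activates a new neuron."""
    rng = rng or random.Random(0)
    corpus = list(seeds)
    coverage = set()
    for x in corpus:
        coverage |= covered_neurons(x)
    for _ in range(iterations):
        candidate = rng.choice(corpus) + rng.uniform(-step, step)
        new = covered_neurons(candidate) - coverage
        if new:
            corpus.append(candidate)
            coverage |= new
    return corpus, coverage


corpus, coverage = fuzz(seeds=[0.0])
```

Because new corpus entries become starting points for later mutations, the search can move step by step toward inputs that a single mutation of the seed could not reach.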

Fast-moving, but still at an early stage

How to design even more useful testing criteria

How to leverage these criteria to understand the runtime behavior of DNNs

Efficient testing, verification, and analysis tools for DL systems

Quality, reliability, interpretability, safety, security, privacy

How to support debugging of opaque DL/ML systems (MODE@FSE’18)

How to fix DL issues once they are detected

Better engineering: a full-stack toolchain for the DL development lifecycle

Efficiently solving specific problems by combining domain knowledge

…

PANGU Labs: Inspire Intelligence of Future

Kyushu Univ. (Intelligent Software Engineering Laboratory, 知能ソフトウェア工学研究室)

Intelligence of Things For everyone, everywhere

Prof. Lei Ma

http://www.malei.xyz/

[email protected]

https://pangukaitian.github.io/pangu/

Prof. Jianjun Zhao

http://stap.ait.kyushu-u.ac.jp/~zhao/

[email protected]
