
NLP/BERT in 10 mins

Timothy Liu, 24 March 2020
timothyl@nvidia.com
http://bit.ly/hpcai_nlp_intro

Outline

1. Introduction to Natural Language Processing (NLP)

2. Transformer Networks

3. Transfer Learning with BERT


Natural Language Processing (NLP)

Language Understanding for Computers

Natural Language Processing is a field of computer science that aims to:

1. Allow computers to understand and process large amounts of text

2. Enable closer, more natural interaction between humans and computers

NLP also generates useful ideas for related fields that involve sequences (e.g. genomics)


Natural Language Processing (NLP)

Example Task: Text Classification

[Diagram: incoming emails → NLP MODEL → sorted into SPAM / PROMOTION / SOCIAL]
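As a rough sketch of this task, the snippet below sorts a few example emails into the three categories using the Hugging Face transformers library; the zero-shot pipeline, its default model, and the example emails are illustrative assumptions, not part of the original slides.

from transformers import pipeline

# Zero-shot classification: no task-specific training needed; the
# pipeline downloads a default pre-trained model on first use.
classifier = pipeline("zero-shot-classification")

emails = [
    "Congratulations! You have won a free cruise, click here to claim.",
    "Flash sale this weekend: 40% off all laptops.",
    "Alice tagged you in a photo from the conference dinner.",
]
labels = ["spam", "promotion", "social"]

for email in emails:
    result = classifier(email, candidate_labels=labels)
    # result["labels"] is sorted by score, so the first entry is the prediction
    print(f"{result['labels'][0]:9s} <- {email}")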


Natural Language Processing (NLP)

Using Deep Learning to Build Powerful Models

[Diagram: Input Data → ENCODER → INTERNAL REPRESENTATION → dense layers → class probabilities, e.g. 0.01 / 0.97 / 0.01 / 0.01]
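A minimal PyTorch sketch of this encoder-plus-dense-layers pattern follows; the layer sizes, the single Transformer encoder layer standing in for the encoder, and the mean-pooling step are illustrative choices, not the exact architecture from the slide.

import torch
import torch.nn as nn

class TextClassifier(nn.Module):
    def __init__(self, vocab_size=30000, hidden=256, num_classes=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        # Stand-in encoder: one Transformer encoder layer (any sequence
        # encoder could sit here)
        self.encoder = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=4, batch_first=True)
        # Dense layers map the internal representation to class scores
        self.dense = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes))

    def forward(self, token_ids):              # (batch, seq_len)
        x = self.embed(token_ids)              # token embeddings
        x = self.encoder(x)                    # internal representation
        x = x.mean(dim=1)                      # pool over the sequence
        return self.dense(x).softmax(dim=-1)   # e.g. [0.01, 0.97, 0.01, 0.01]

probs = TextClassifier()(torch.randint(0, 30000, (2, 16)))
print(probs.shape)  # torch.Size([2, 4])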


Transformer Networks

Powerful, non-recurrent DL model for sequence modelling

Attention Is All You Need (Vaswani et al., 2017)


Transformer Networks

Powerful, non-recurrent DL model for sequence modelling

Calculate an “attention score” from pairwise comparisons between tokens, then take a weighted combination of the token representations

[Diagram: Input Sequence → self-attention → Attention Output]
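The snippet below is a minimal sketch of that computation in the scaled dot-product form used by the paper; the tensor sizes are arbitrary examples.

import torch

def attention(q, k, v):
    """q, k, v: (seq_len, d) matrices of token representations."""
    d = q.size(-1)
    # Pairwise comparison of every token with every other token
    scores = q @ k.transpose(-2, -1) / d ** 0.5
    # Normalise scores into attention weights per query token
    weights = scores.softmax(dim=-1)
    # Weighted combination of the token representations
    return weights @ v

x = torch.randn(5, 64)       # a 5-token input sequence, 64-dim per token
out = attention(x, x, x)     # self-attention: q = k = v = input sequence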


BERT

Bidirectional Encoder Representations from Transformers

BERT is a Transformer model that:

• Builds upon ideas from the Attention Is All You Need (Vaswani et al., 2017) paper

• Is designed to learn powerful encoded representations of text

• Has demonstrated state-of-the-art results on many NLP problems in many languages


BERT

Bidirectional Encoder Representations from Transformers

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al., 2018)

Pre-training on a large corpus → fine-tuning on a task-specific corpus
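A minimal fine-tuning sketch with the Hugging Face transformers library follows; the toy three-email dataset and the training settings are placeholders standing in for a real task-specific corpus.

import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Pre-trained BERT weights plus a freshly initialised classification head
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)

# Placeholder corpus; a real run uses a labelled task-specific dataset
texts = ["win a free prize now", "50% off shoes today", "see you at dinner"]
labels = [0, 1, 2]  # 0 = spam, 1 = promotion, 2 = social
enc = tokenizer(texts, truncation=True, padding=True)

class TinyDataset(torch.utils.data.Dataset):
    def __len__(self):
        return len(labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in enc.items()}
        item["labels"] = torch.tensor(labels[i])
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-finetuned", num_train_epochs=1),
    train_dataset=TinyDataset())
trainer.train()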


BERT

Bidirectional Encoder Representations from Transformers

BERT-BASE: 12 layers, 768 hidden units, 12 attention heads, 110M parameters

BERT-LARGE: 24 layers, 1024 hidden units, 16 attention heads, 340M parameters
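These figures can be checked directly; the sketch below instantiates both configurations via the Hugging Face transformers library (using the standard published checkpoint names) and counts parameters. Note the encoder alone, without the pre-training head, comes out slightly under the quoted 110M/340M.

from transformers import BertConfig, BertModel

for name in ["bert-base-uncased", "bert-large-uncased"]:
    config = BertConfig.from_pretrained(name)
    model = BertModel(config)  # randomly initialised; enough for counting
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {config.num_hidden_layers} layers, "
          f"{config.hidden_size} hidden, "
          f"{config.num_attention_heads} heads, "
          f"{n_params / 1e6:.0f}M parameters")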


Natural Language Processing (NLP)

Example Task: Text Classification

[Diagram: incoming emails → BERT MODEL → sorted into SPAM / PROMOTION / SOCIAL]
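Putting the pieces together, a fine-tuned checkpoint can replace the generic model from the earlier diagram; in this sketch, "bert-finetuned" is the hypothetical output directory of the fine-tuning example above, not a published model.

from transformers import pipeline

# "bert-finetuned" is the hypothetical output of the fine-tuning sketch above
classify = pipeline("text-classification", model="bert-finetuned")
print(classify("Limited-time offer: claim your voucher today!"))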

Timothy Liu, 24 March 2020

timothyl@nvidia.com
http://bit.ly/hpcai_nlp_intro

NLP/BERT in 10 mins