UGSD AAAI19 poster 3 - CSIEcjwang/data/AAAI2019_P.pdf · 2019. 2. 8. · UGSD: User Generated...



UGSD: User Generated Sentiment Dictionaries from Online Customer Reviews

Chun-Hsiang Wang,† Kang-Chun Fan,‡ Chuan-Ju Wang,‡ Ming-Feng Tsai†

† National Chengchi University, Taiwan
‡ Academia Sinica, Taiwan

CFDA-CLIP Labs

Source Code: UGSD

The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019) CFDA & CLIP Labs

Motivation

✓ Constructing domain-specific sentiment dictionaries is important to sentiment analysis.

✓ Customer review platforms contain rich information about the ways that people convey their sentiment in certain domains.

✓ The choice of general sentiment dictionary or annotated seed words greatly affects the results of dictionary construction.

Contributions

✓ Data-driven dictionary: Requires no additional annotation of seed words or external dictionaries.

✓ Domain-specific dictionaries: Applicable to a variety of user-generated content from different domains.

✓ Application scalability: Produces representations of the learned sentiment words during dictionary construction.

Experiments

Our framework: UGSD

✓ Leverage POS information

✓ Concatenate adverbs and adjectives

✓ Replace entities with ratings

Review transformation
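
The three review transformation steps listed above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `transform_review`, the `RATING_<n>` placeholder format, the underscore used to join an adverb with the adjective that follows it, and the use of Penn Treebank tags (`RB*` for adverbs, `JJ*` for adjectives) are all assumptions made for illustration. The input is assumed to be already POS-tagged by some external tagger.

```python
def transform_review(tagged_tokens, entity_terms, rating):
    """Rewrite one POS-tagged review into a token sequence for embedding training.

    tagged_tokens: list of (word, POS tag) pairs; Penn Treebank tags are assumed.
    entity_terms:  lowercase words naming the reviewed entity (hypothetical input).
    rating:        the star rating attached to the review.
    """
    rating_token = f"RATING_{rating}"  # hypothetical placeholder format
    output = []
    i = 0
    while i < len(tagged_tokens):
        word, tag = tagged_tokens[i]
        word = word.lower()
        next_is_adjective = (
            i + 1 < len(tagged_tokens)
            and tagged_tokens[i + 1][1].startswith("JJ")
        )
        if word in entity_terms:
            # Replace the entity mention with the review's rating symbol.
            output.append(rating_token)
            i += 1
        elif tag.startswith("RB") and next_is_adjective:
            # Concatenate an adverb with the adjective that follows it.
            adjective = tagged_tokens[i + 1][0].lower()
            output.append(f"{word}_{adjective}")
            i += 2
        else:
            output.append(word)
            i += 1
    return output


review = [("The", "DT"), ("pizza", "NN"), ("was", "VBD"),
          ("really", "RB"), ("tasty", "JJ")]
transformed = transform_review(review, {"pizza"}, 5)
print(transformed)
```

After this step the rating symbol and the candidate sentiment phrase share a context window, which is what lets a later embedding step place them near each other.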

✓ Maximum-cosine-similarity scheme:

✓ Z-score scheme:

Dictionary construction

$$a_{ij} = \cos(\vec{v}_{s_i}, \vec{v}_{r_j})$$
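
As a rough illustration of dictionary construction, the sketch below builds the similarity matrix from the cosine between each sentiment word vector and each rating vector, then applies two selection rules. The formulas for the two schemes did not survive in this transcript, so both rules are assumptions: the maximum-cosine rule assigns each word to the rating it is most similar to, and the z-score rule standardizes one rating's column across all words and keeps the words above a threshold. All function names, the toy vectors, and the threshold value are hypothetical.

```python
import math
from statistics import mean, pstdev


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(word_vecs, rating_vecs):
    """Row per sentiment word, column per rating: a_ij = cos(v_si, v_rj)."""
    return {w: [cosine(v, r) for r in rating_vecs] for w, v in word_vecs.items()}


def max_cosine_assignment(sim):
    """Assumed rule: assign each word to the rating with the largest cosine."""
    return {w: max(range(len(row)), key=row.__getitem__) for w, row in sim.items()}


def zscore_select(sim, rating_index, threshold=1.0):
    """Assumed rule: keep words whose standardized similarity to one rating
    exceeds the threshold."""
    column = [row[rating_index] for row in sim.values()]
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return []
    return [w for w, row in sim.items()
            if (row[rating_index] - mu) / sigma > threshold]


# Toy 2-d embeddings; rating index 0 is the lowest rating, 1 the highest.
word_vecs = {"great": [1.0, 0.1], "awful": [0.1, 1.0], "okay": [0.7, 0.7]}
rating_vecs = [[0.0, 1.0], [1.0, 0.0]]

sim = similarity_matrix(word_vecs, rating_vecs)
assignment = max_cosine_assignment(sim)
positive_words = zscore_select(sim, rating_index=1, threshold=0.5)
print(assignment, positive_words)
```

Under these assumptions, the maximum-cosine rule labels every word, while the z-score rule is selective and leaves out words such as "okay" that are not distinctly close to any one rating.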

Representation learning

✓ Joint probability between words

✓ Minimize the distance between the empirical and learned distribution

✓ Replace the distance function with KL divergence

$$p(i, j) = \frac{1}{1 + e^{-\vec{v}_i^{\top} \cdot \vec{v}_j}}$$
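
The joint probability above is the logistic sigmoid applied to the dot product of the two word vectors. A small sketch follows, with the function name `joint_probability` chosen here for illustration. The two-branch form is a standard way to avoid overflow in `math.exp` for large negative scores and is not taken from the poster.

```python
import math


def joint_probability(v_i, v_j):
    """p(i, j) = 1 / (1 + exp(-v_i . v_j)), computed in a numerically stable way."""
    score = sum(a * b for a, b in zip(v_i, v_j))
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    # For negative scores, use the equivalent form exp(s) / (1 + exp(s)).
    e = math.exp(score)
    return e / (1.0 + e)


p_same = joint_probability([1.0, 2.0], [1.0, 2.0])    # aligned vectors
p_orth = joint_probability([1.0, 0.0], [0.0, 1.0])    # orthogonal vectors
p_opp = joint_probability([1.0, 2.0], [-1.0, -2.0])   # opposed vectors
print(p_same, p_orth, p_opp)
```

Aligned vectors give a probability close to 1, orthogonal vectors give exactly 0.5, and opposed vectors give a probability close to 0.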

$$O = \mathrm{dist}(\hat{p}(\cdot, \cdot), p(\cdot, \cdot))$$
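
The effect of choosing KL divergence as the distance can be made explicit with a short derivation. It is a sketch under assumptions not stated on the poster: $f_{ij}$ is the co-occurrence weight of the pair $(i, j)$, $A$ is the set of observed pairs, $F = \sum_{(i,j) \in A} f_{ij}$, and the empirical distribution is $\hat{p}(i, j) = f_{ij} / F$.

```latex
\begin{align*}
\mathrm{KL}(\hat{p} \,\|\, p)
  &= \sum_{(i,j) \in A} \hat{p}(i,j) \log \frac{\hat{p}(i,j)}{p(i,j)} \\
  &= \underbrace{\sum_{(i,j) \in A} \hat{p}(i,j) \log \hat{p}(i,j)}_{\text{constant in the vectors}}
     \;-\; \frac{1}{F} \sum_{(i,j) \in A} f_{ij} \log p(i,j)
\end{align*}
```

The first term does not depend on the word vectors and $1/F$ is a positive constant, so minimizing the divergence amounts to minimizing a weighted sum of $-\log p(i, j)$ terms over the observed pairs.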

$$O = -\sum_{(i,j) \in A} f_{ij} \log p(i, j)$$
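
To make the objective above concrete, the sketch below evaluates it on a toy set of weighted pairs and runs plain gradient descent on the vectors. It is an illustration under assumptions, not the authors' training procedure: the pair weights, the starting vectors, the learning rate, and the function names are all hypothetical, and the sketch optimizes only the positive terms shown in the formula. Practical embedding trainers typically add a term such as negative sampling to keep vector norms from growing without bound, which is omitted here.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def objective(pairs, vecs):
    """O = -sum over observed pairs of f_ij * log p(i, j)."""
    return -sum(f * math.log(sigmoid(dot(vecs[i], vecs[j])))
                for (i, j), f in pairs.items())


def sgd_epoch(pairs, vecs, lr=0.1):
    """One pass of gradient descent over all observed pairs."""
    for (i, j), f in pairs.items():
        v_i, v_j = vecs[i], vecs[j]
        # d(-f log sigmoid(x))/dx = -f * (1 - sigmoid(x)), with x = v_i . v_j
        step = lr * f * (1.0 - sigmoid(dot(v_i, v_j)))
        vecs[i] = [a + step * b for a, b in zip(v_i, v_j)]
        vecs[j] = [b + step * a for a, b in zip(v_i, v_j)]


# Toy co-occurrence weights f_ij between transformed tokens (hypothetical).
pairs = {("very_good", "RATING_5"): 3.0, ("terrible", "RATING_1"): 2.0}
vecs = {
    "very_good": [0.1, -0.2], "RATING_5": [0.3, 0.1],
    "terrible": [-0.2, 0.1], "RATING_1": [0.1, 0.3],
}

loss_before = objective(pairs, vecs)
for _ in range(50):
    sgd_epoch(pairs, vecs)
loss_after = objective(pairs, vecs)
print(loss_before, loss_after)
```

Each update moves the two vectors of a pair toward each other in proportion to the pair's weight, so frequently co-occurring tokens, such as a sentiment phrase and a rating symbol, end up with a high cosine similarity.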

Amazon dictionaries