Competitive Analysis for Points of Interest

Shuangli Li1,2, Jingbo Zhou2,3*, Tong Xu1, Hao Liu2,3, Xinjiang Lu2,3, Hui Xiong4*
1School of Computer Science, University of Science and Technology of China, 2Business Intelligence Lab, Baidu Research, 3National Engineering Laboratory of Deep Learning Technology and Application, China, 4Rutgers University
[email protected], {zhoujingbo,liuhao30,luxinjiang}@baidu.com, [email protected], [email protected]
ABSTRACT
The competitive relationship of Points of Interest (POIs) refers to the degree of competition between two POIs for business opportunities from third parties in an urban area. Existing studies of competitive analysis usually focus on mining competitive relationships of entities, such as companies or products, from textual data. However, few studies focus on competitive analysis for POIs. Indeed, the growing availability of user behavior data about POIs, such as POI reviews and human mobility data, enables a new paradigm for understanding the competitive relationships among POIs. To this end, in this paper, we study how to predict the POI competitive relationship. Along this line, the very first challenge is how to integrate heterogeneous user behavior data with the spatial features of POIs. As a solution, we first build a heterogeneous POI information network (HPIN) from POI reviews and map search data. Then, we develop a graph neural network-based deep learning framework, named DeepR, for POI competitive relationship prediction based on HPIN. Specifically, DeepR contains two components: a spatial adaptive graph neural network (SA-GNN) and a POI pairwise knowledge extraction (PKE) model. SA-GNN is a novel GNN architecture that incorporates POI spatial information and location distribution through a specially designed spatial oriented aggregation layer and a spatial-dependency attentive propagation mechanism. In addition, PKE is devised to distill the POI pairwise knowledge in HPIN that is useful for relationship prediction into condensed vectors with relational graph convolution and cross attention. Finally, extensive experiments on two real-world datasets demonstrate the effectiveness of our method.
KEYWORDS
Point of Interest; Graph Neural Networks; Competitive Analysis; Heterogeneous Information Network

ACM Reference Format:
Shuangli Li, Jingbo Zhou, Tong Xu, Hao Liu, Xinjiang Lu, Hui Xiong. 2020. Competitive Analysis for Points of Interest. In Proceedings of the 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '20), August 23-27, 2020, Virtual Event, CA, USA. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3394486.3403179

*Jingbo Zhou and Hui Xiong are corresponding authors. This work was done when Shuangli Li was an intern at Baidu Research.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
KDD '20, August 23-27, 2020, Virtual Event, CA, USA
(c) 2020 Association for Computing Machinery.
ACM ISBN 978-1-4503-7998-4/20/08 ... $15.00
https://doi.org/10.1145/3394486.3403179
1 INTRODUCTION
To sustain the prosperity of a business, it is critical to provide an effective understanding of its competitive environment. For example, identifying the competitors of a company can help set a reasonable price level and an appropriate management strategy. Competitive relationship prediction has been a hot research topic for a long time [4, 7]. Most existing works aim at predicting the competition between social events, companies or products [17, 25, 28, 29]. With the wide availability of Point-of-Interest (POI) data, we propose to investigate a new and unique perspective for understanding the competitive environment by measuring the competition between POIs.
POI competitive relationship prediction aims to identify the degree of competition between two POIs to secure business from a third party in an urban area. POIs, like bars, retail stores, restaurants, and hotels, usually have to strive for limited users to survive. Therefore, being aware of the competitive relationships among POIs is fundamental for a POI's shopkeeper to keep the business alive and thriving. Our study also has many potential applications in location-based services, such as POI recommendation, local marketing and location-based advertising. For example, the inferred competitive relationships can be used as features for advertising prediction models.
Previous studies of competitive relationship prediction cannot be applied to POIs. Most existing works focus on identifying comparative entities from sentences in text data like reviews, social networks and web pages [12, 17, 25]. However, such comparative evidence for POIs is often absent from text data. For example, it is quite rare to find a comparative sentence that contains specific names of POIs (e.g., a KFC store) in POI reviews. Meanwhile, there is a large amount of user behavior data about POIs which can help enable a new paradigm for understanding the relationships among POIs. If two POIs with similar functionalities have common users, they tend to be competitive. Besides, the user-generated reviews of a POI also contain many text descriptions of the POI. How to integrate such heterogeneous but useful information with the spatial features of POIs for competitive relationship prediction remains a unique research challenge.
In this paper, we propose a graph neural network-based (GNN-based) Deep learning framework for competitive Relationship prediction, named DeepR for short. The DeepR framework runs over a carefully designed heterogeneous POI information network (HPIN) which integrates POI spatial features, user behavior data and POI review data. To the best of our knowledge, we are the first to study the POI competitive relationship prediction problem.
The construction of HPIN enables us to investigate the POI competitive relationship prediction problem with heterogeneous data. In HPIN, we consider three context factors related to the POI
Research Track Paper KDD '20, August 23โ27, 2020, Virtual Event, USA
competitive relationship, which are the spatial context, user behavior context, and aspect context. The spatial context represents the surrounding environment and relative positions of POIs in spatial space. The user behavior context captures relations among POIs hidden in the map search query data, which records users' actions on POIs on an online map service platform, like Google Maps and Baidu Maps. Here we build a POI co-query graph where each node is a POI and an edge weight records how many users search the two POIs within a short time interval. The aspect context reflects the major services of a POI extracted from reviews. All these factors are integrated into a heterogeneous POI information network (HPIN), which is illustrated in Figure 2. A detailed explanation of HPIN will be introduced later.
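The co-query edge construction sketched above (same-user searches within a short interval, thresholded by total weight) can be illustrated as follows. This is a minimal sketch, not the authors' implementation; the log schema `(user_id, poi_id, timestamp)` and the default values of `delta_t` and `theta_p` are assumptions.

```python
from collections import defaultdict

def build_co_query_edges(query_log, delta_t=1800, theta_p=3):
    """Build weighted POI co-query edges from a map-search log.

    query_log: list of (user_id, poi_id, timestamp) tuples (hypothetical schema).
    Two distinct POIs searched by the same user within delta_t seconds gain
    edge weight 1; edges whose total weight falls below theta_p are dropped.
    """
    by_user = defaultdict(list)
    for user, poi, ts in query_log:
        by_user[user].append((ts, poi))

    weights = defaultdict(int)
    for events in by_user.values():
        events.sort()  # order each user's searches by time
        for i, (ti, pi) in enumerate(events):
            for tj, pj in events[i + 1:]:
                if tj - ti > delta_t:
                    break  # later events are even further apart
                if pi != pj:
                    weights[tuple(sorted((pi, pj)))] += 1

    return {e: w for e, w in weights.items() if w >= theta_p}
```
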
The DeepR framework adopts the Graph Neural Network (GNN) as its basic model, since GNNs have achieved state-of-the-art performance on many relational learning problems. Particularly, due to the unique features of POIs, DeepR contains a novel spatial adaptive graph neural network (named SA-GNN) and a POI pairwise knowledge extraction learning (named PKE) model.
SA-GNN is a novel GNN architecture tailored for POI competitive relationship prediction, which can handle a POI's spatial information and location distribution. Compared with other common graphs, an important feature of the POI graph (in HPIN) is that relative spatial positions matter for the competitive relationship because: 1) nearby POIs are more likely to be competitive; 2) the spatial distribution of POIs also affects the result. For example, given a POI $p_i$, its competitive environment is more intense if all the competitors are evenly distributed around $p_i$, compared with the case that all the competitors are on one side of $p_i$. However, traditional message-passing based GNNs have two drawbacks: 1) they lose the spatial information of POIs and their neighbors; 2) they lack the ability to capture distant-range spatial location dependencies. These limitations degrade the performance of GNNs on our problem. The novelty of SA-GNN lies in specially designed components upon GNN to handle these limitations. To address the first limitation, we propose a spatial oriented aggregation layer for SA-GNN; for the second limitation, we devise a spatial-dependency attentive propagation mechanism.
Another important part of DeepR is the PKE model, which can distill the POI pairwise knowledge in HPIN for relationship prediction into condensed representation vectors. After extracting aspects from reviews, we build a relation-aware aspect graph convolutional network (RAConv) in HPIN to learn aspect embeddings and brand embeddings of POIs. Then a cross attention mechanism is designed to generate the aspect-enhanced representation of POI pairs. The intuition behind the cross attention is to build a tailored representation of POI pairs with attention to the comparative aspects between a pair of POIs.
Finally, we unify all the components into DeepR for POI competitive relationship prediction. To summarize, the main contributions of our work are as follows:
• We are the first to study the POI competitive relationship prediction problem with heterogeneous information, including human behavior data and POI review data. POI competitive relationship prediction has great potential for many applications.
• We propose a novel DeepR framework over a carefully designed heterogeneous POI information network (HPIN). Several techniques, such as SA-GNN with its spatial oriented aggregation layer and spatial-dependency attentive propagation, and the PKE model for knowledge extraction, are invented to handle the unique properties of POIs for competitive relationship prediction.
• Extensive experiments are conducted on two real-world datasets, demonstrating the effectiveness of our proposed DeepR model.
2 RELATED WORK
The research topic of this paper is closely related to competitive relationship mining, link prediction and graph neural networks. In this section, we briefly discuss each of them.
Competitive Relationship Mining. Most of the existing studies aim at detecting the competitive relationship between companies or products from text data. Early research [4, 7] applies predefined linguistic patterns (i.e., A vs. B) for mining competitors. A graphical model in [25] is designed to extract product comparative relations from reviews. The authors in [12] propose a network-based approach built on a company citation (co-occurrence) graph in online news articles. CMiner [17], which compares specific features in user reviews, is useful for explaining the reasons for competition. TFGM [26] learns from patent and Twitter data to classify the relationships between entities with a topic model. None of the above methods can handle the unique features of POIs, which have spatial context and user behavior context factors. In a word, there are few works on mining competitive relationships for Points of Interest.
Link Prediction. There are many methods for solving the link prediction problem in different fields, including statistical methods [8, 21] and graph embedding methods [2, 11, 15]. Some works also study link prediction on heterogeneous information networks through auto-encoder models [20] and meta path-based methods [3]. Notice that the above methods mostly treat relationship prediction as completing missing links, with a strong assumption that all the target links to predict already exist in the graph. However, our problem is not a link completion problem, since what we predict is the POI competitive relationship, which does not exist in HPIN. Therefore, the above methods lose their effectiveness. Furthermore, they cannot consider additional information such as spatial information and user behavior information.
Graph Neural Networks. Recently, many efforts have been devoted to studying Graph Neural Networks (GNNs) [6, 14, 18, 19, 22, 31]. HAN [22] uses hierarchical attention to aggregate feature information through predefined meta-paths in heterogeneous graphs. There are also some works designed for link prediction [16, 30]. SEAL [30] predicts general relationships on a graph based on a graph neural network and the Weisfeiler-Lehman neural machine. The above GNN-based models can only capture the topological structure features of the graph, but the necessary spatial information and aspect information of POIs cannot be handled by the existing models. In recent years, GNN and graph embedding methods have also been applied to solve spatio-temporal problems in specific applications [9, 10, 24, 32]. There are also some works on so-called spatial-temporal graph neural networks. However, these spatial graph neural networks mainly refer to the message passing process over the original "node" space (instead of graph spectral
space), which convolves the central node's representation with its neighbors' representations to derive the updated representation for the central node [23].
3 OVERVIEW
In this section, we first introduce the preliminaries; then we present a framework overview to show the workflow of competitive relationship prediction for POIs.

3.1 Preliminaries
Here we first provide some basic concepts used in our paper, followed by a formal definition of our problem.
Definition 3.1. Heterogeneous POI Information Network. The Heterogeneous POI Information Network (HPIN) is defined as $G = (\mathcal{P} \cup \mathcal{B} \cup \mathcal{A} \cup \mathcal{M}, \mathcal{E}_{pp} \cup \mathcal{E}_{pb} \cup \mathcal{E}_{bb} \cup \mathcal{E}_{ba} \cup \mathcal{E}_{aa})$, where $\mathcal{P}$, $\mathcal{B}$, $\mathcal{A}$, and $\mathcal{M}$ are the sets of nodes in HPIN. Specifically, $\mathcal{P} = \{p_1, ..., p_{n_p}\}$ denotes the set of POIs, $\mathcal{B} = \{b_1, ..., b_{n_b}\}$ denotes the set of brands, $\mathcal{A} = \{a_1, ..., a_{n_a}\}$ denotes the set of aspects extracted from reviews, and $\mathcal{M} = \{\mathcal{M}_1, ..., \mathcal{M}_{n_m}\}$ denotes the set of heat maps. $n_p$, $n_b$, $n_a$, $n_m$ are the numbers of POIs, brands, aspects and spatial heat maps in HPIN. Each POI node $p_i$ is associated with a heat map to represent the surrounding distribution feature of the POI. $\mathcal{E}_{pp}$, $\mathcal{E}_{pb}$, $\mathcal{E}_{bb}$, $\mathcal{E}_{ba}$ and $\mathcal{E}_{aa}$ are five different sets of relational edges (POI-POI, POI-brand, brand-brand, brand-aspect, and aspect-aspect) among the three types of nodes. How to construct HPIN (including $\mathcal{M}$) is introduced in Section 4.
Definition 3.2. Meta-path for Brand. We define meta-path $\Phi$ as a path of the form $b_i \xrightarrow{e_{pb}^{-1}} p_i \xrightarrow{e_{pp}} p_j \xrightarrow{e_{pb}} b_j$, which describes a POI-based brand correlation between brand $b_i$ and brand $b_j$ via a path between POI $p_i$ and POI $p_j$.
Taking Figure 2 as an example, there are four POIs in the HPIN. Each POI belongs to a brand, denoted as an edge in $\mathcal{E}_{pb}$, meaning that the POI is one branch of its brand (e.g., $p_2$ is one branch of KFC). In addition, the attributes of a POI include its category (e.g., the category of $p_2$ is fast-food restaurant) and coordinates. The relation between POIs is based on user map search queries. For example, $p_1$ is linked with $p_2$ in HPIN because they are often queried together by users on the online map. The relation between brands is defined based on meta-path $\Phi$. In Figure 2(c), the edge $b_2 b_3$ is constructed by the path $b_2 \to p_2 \to p_3 \to b_3$. Each brand is also associated with several aspects. $b_3$ (McDonald's) is closely related to $a_2$ (French fries) and $a_3$ (hamburger). The two aspects $a_2$ and $a_3$ often appear together in user reviews, so they are connected with each other.
Definition 3.3. Competitive Relationship Prediction. The objective of our problem is to associate each POI pair $(p_i, p_j)$ with a label $y \in \{0, 1\}$, where $y = 1$ indicates $p_i$ and $p_j$ have a competitive relationship. Formally, given a set of POIs $\mathcal{P}$ and an HPIN $G$, the goal is to learn a predictive function $f : (\mathcal{P} \times \mathcal{P} \mid G) \to Y$ to predict the competitive relationship between POIs.
3.2 Framework Overview
Figure 1 shows an overview of the POI competitive relationship analysis process. First, we construct an HPIN from map query data
Figure 1: An overview of the POI competitive relationship analysis process. (Map query data, user review data and POI attributes feed the HPIN (heterogeneous POI information network); the POI graph and aspect graph extracted from it feed the DeepR learning framework, which outputs the competitive relationship.)
and user review data with POI attributes. Then we input the aspect graph and POI graph extracted from the HPIN into the proposed DeepR model, and the outputs indicate whether the given POI pairs have competitive relationships.
Figure 3 presents the whole framework of DeepR, which consists of two main components: Spatial Adaptive Graph Neural Network (SA-GNN) learning and Pairwise POI Knowledge Extraction (PKE). First, SA-GNN aggregates the neighbors on the POI graph while integrating spatial information via the spatial oriented aggregation layer, and then attentively propagates the distant-range dependencies with spatial locations. Next, SA-GNN is applied to two sub-graphs of the POI graph (the diffusion graph and the affinity graph) to capture the hidden patterns indicated by category information on the POI graph. Second, PKE adopts the relation-aware aspect graph convolution (RAConv) to encode brands and aspects on the aspect graph. Then PKE employs a cross attention mechanism to distill the pairwise POI knowledge from aspects, which is then combined with the POI pair representation. Finally, the pairwise prediction part outputs the result of the competition. In the following two sections, we first introduce the construction of HPIN; then we present our DeepR framework.
4 HPIN CONSTRUCTION
In this section, we describe how to construct the HPIN, with an example of HPIN shown in Figure 2. We first introduce the method of constructing spatial heat maps linked with POIs. Then we discuss the process of building the different relations in HPIN. The POI-brand relation is given by the attributes of POIs, as illustrated in Table 2 in the appendix, showing which brand every POI belongs to. We mainly introduce the POI relation (POI-POI), and the relations of aspects and brands (brand-aspect, aspect-aspect and brand-brand).
4.1 Spatial Heat Map Construction
As illustrated in Figure 2(a), $\mathcal{M} = \{\mathcal{M}_1, ..., \mathcal{M}_{n_m}\}$ is a set of grid spatial heat maps that are linked with POIs to represent the surrounding environment features. Each element $\mathcal{M}_i$ of $\mathcal{M}$ corresponds to a spatial heat matrix $A_i \in \mathbb{R}^{C \times L \times L}$, where $C$ is the number of category-based channels and $L$ is the side length of the matrix. We split a city into grids of the same size (e.g., 500m x 500m). Each grid $g_i$ has $C$ channels corresponding to the number of POI categories, which can be denoted as $\{v_i^1, ..., v_i^C\}$. The value of $v_i^c$ is calculated with a max-pooling method:
$$v_i^c = \max_{p_t \in g_i} \big\{ hot(p_t) \mid tag(p_t) = c, 1 \le c \le C \big\}, \quad (1)$$
where $tag(p_t) = c$ requires $p_t$ to have category $c$, and $hot(p_t)$ returns the hot value of $p_t$, which is the total number of times $p_t$ is searched by users, as recorded in the map search query data over a time interval. We treat the central grid of each POI together with its neighbors as $L \times L$ grids to constitute the spatial heat map.
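Equation (1) amounts to a per-grid, per-category max over POI hot values. A minimal sketch (the `pois` dict schema with `grid`, `category` and `hot` keys is hypothetical, not from the paper):

```python
import numpy as np

def grid_heat_channels(pois, num_categories, grid_size):
    """Compute per-grid category channels v_i^c via max-pooling (Eq. 1).

    pois: list of dicts with 'grid' (gx, gy), 'category' (0-based index)
    and 'hot' (times searched in the interval) -- a hypothetical schema.
    Returns an array H of shape (C, L, L) for one L x L city region.
    """
    H = np.zeros((num_categories, grid_size, grid_size))
    for p in pois:
        gx, gy = p["grid"]
        c = p["category"]
        # max-pooling: each grid channel keeps the hottest POI of that category
        H[c, gx, gy] = max(H[c, gx, gy], p["hot"])
    return H
```
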
4.2 POI Relation Construction
The POI co-query edges in $\mathcal{E}_{pp}$ are a type of behavior-driven relationship. Here we use map search query data, which records users' actions on POIs on an online map service platform, to construct the behavior-driven relationship. It is also possible to use other similar user behavior data. Intuitively, the more frequently a pair of POIs $(p_i, p_j)$ are queried by the same users within a short time interval $\Delta t$, the more likely they are related to each other. Each POI-POI edge $(p_i, p_j, w_{ij}^p) \in \mathcal{E}_{pp}$ means that $p_i$ and $p_j$ are queried together $w_{ij}^p$ times by all users over a period of time. We use $\mathbf{A}_p$ to denote the adjacency matrix of the POI co-query graph, and $(\mathbf{A}_p)_{ij} = (\mathbf{A}_p)_{ji} = w_{ij}^p$. To reduce noise and the size of the co-query graph, we set a threshold $\theta_p$ and filter out edges with $w_{ij}^p < \theta_p$. We also name the graph formed by the POIs and the edges $\mathcal{E}_{pp}$ the POI co-query graph.
The POI co-query graph can further be divided into two sub-graphs: a diffusion graph and an affinity graph. If a pair of POIs with the same category are searched together many times, they tend to be competitive, since users may make a choice between them; if POIs with different categories are searched together many times, they tend to be complementary (instead of competitive), since users may plan to visit both of them. Due to this difference in behavior semantics, we build a diffusion graph on POIs of the same category, and an affinity graph on POIs of different categories. As introduced later, SA-GNN is applied to the diffusion graph and the affinity graph separately.
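The diffusion/affinity split described above is a simple partition of the co-query edges by category equality. A sketch, assuming the edge dict and a category lookup are already available:

```python
def split_co_query_graph(edges, category):
    """Split co-query edges into a diffusion graph (same-category pairs,
    competition-like) and an affinity graph (cross-category pairs,
    complement-like), following Section 4.2.

    edges: {(poi_i, poi_j): weight}; category: {poi_id: category}.
    """
    diffusion, affinity = {}, {}
    for (pi, pj), w in edges.items():
        target = diffusion if category[pi] == category[pj] else affinity
        target[(pi, pj)] = w
    return diffusion, affinity
```
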
4.3 Aspect and Brand Relation Construction
The aspect set $\mathcal{A}$ is extracted from review data. We first gather all user reviews of a brand to form a document. The reasons for gathering reviews by brand instead of by POI are: 1) the reviews of a single POI are usually rare; 2) most users give reviews about the brand (like "KFC") instead of a particular store. If a POI does not have a brand, we use its name as a brand. We treat brands as documents and aspects as the key words of documents. We then calculate the words' term frequency-inverse document frequency (TF-IDF) and select the top $k$ words as aspects in the aspect set, denoted as $a_j \in \mathcal{A}$. In this way, the brand-aspect relation can also be built: $(b_i, a_j, w_{ij}^t) \in \mathcal{E}_{ba}$ means that $a_j$ is one of the top $k$ aspects of $b_i$, and the edge weight $w_{ij}^t$ is the TF-IDF value. We use $(\mathbf{A}_t)_{ij} = w_{ij}^t$ to denote the adjacency matrix of $\mathcal{E}_{ba}$. We also employ point-wise mutual information (PMI) to establish linkages between aspects [27]. Thus, we have $PMI(i, j) = w_{ij}^a$ to measure the relevance of aspects $(a_i, a_j)$, denoted as $(a_i, a_j, w_{ij}^a) \in \mathcal{E}_{aa}$, and $(\mathbf{A}_a)_{ij} = w_{ij}^a$. We filter out aspect pairs whose PMI is lower than a threshold $\theta_{PMI}$.
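The brand-aspect edge construction (top-$k$ TF-IDF terms per brand document) can be sketched as below. The tokenized `brand_docs` input and the particular unsmoothed TF-IDF variant are assumptions, since the paper does not specify them:

```python
import math
from collections import Counter

def top_k_aspects(brand_docs, k=3):
    """Pick each brand's top-k aspects by TF-IDF over brand-level review
    documents (Section 4.3). brand_docs maps brand -> list of tokens.
    Returns {brand: [(aspect, tfidf), ...]}, giving the E_ba edge weights.
    """
    n = len(brand_docs)
    df = Counter()  # document frequency: in how many brand docs a word occurs
    for tokens in brand_docs.values():
        df.update(set(tokens))

    out = {}
    for brand, tokens in brand_docs.items():
        tf = Counter(tokens)
        scored = {w: (tf[w] / len(tokens)) * math.log(n / df[w]) for w in tf}
        out[brand] = sorted(scored.items(), key=lambda x: -x[1])[:k]
    return out
```
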
In addition, there are edges between brands. We count the number of paths $\Phi$ from $b_i$ to $b_j$ and normalize the result:
$$r(b_i, b_j) = r(b_j, b_i) = \frac{\left|\{p_{i \leadsto j} : p_{i \leadsto j} \models \Phi\}\right|}{\sqrt{|\mathcal{N}(b_i)_p|} \cdot \sqrt{|\mathcal{N}(b_j)_p|}}, \quad (2)$$
Figure 2: An illustrative example of HPIN. (a) Spatial heat map with several category-based channels (i.e., hotel, restaurant, shopping center, house). (b) POI set. (c) Brand (i.e., KFC, McDonald's, Starbucks) set. (d) Aspect (i.e., coffee, French fries, hamburger) set.
where $\models$ means that path $p_{i \leadsto j}$ follows the rule defined by meta-path $\Phi$, and $|\mathcal{N}(b_i)_p|$ denotes the number of POIs belonging to brand $b_i$. Formally, the edges between brands are $(b_i, b_j, w_{ij}^b = r(b_i, b_j)) \in \mathcal{E}_{bb}$, with adjacency matrix $(\mathbf{A}_b)_{ij} = w_{ij}^b$.
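Equation (2) can be computed directly from the POI-POI edge list and a POI-to-brand lookup: each POI-POI edge between two brands is one meta-path instance, and the count is normalized by the geometric mean of the brands' POI counts. A sketch (the input schemas are assumptions):

```python
import math
from collections import defaultdict

def brand_edge_weights(poi_edges, brand_of):
    """Brand-brand edge weights r(b_i, b_j) (Eq. 2): count POI-POI edges
    whose endpoints belong to the two brands, normalized by the geometric
    mean of each brand's POI count.

    poi_edges: iterable of (poi_i, poi_j) pairs; brand_of: {poi: brand}.
    """
    n_pois = defaultdict(int)
    for brand in brand_of.values():
        n_pois[brand] += 1  # |N(b)_p|: POIs per brand

    counts = defaultdict(int)
    for pi, pj in poi_edges:
        bi, bj = brand_of[pi], brand_of[pj]
        if bi != bj:
            counts[tuple(sorted((bi, bj)))] += 1  # one meta-path instance

    return {e: c / math.sqrt(n_pois[e[0]] * n_pois[e[1]])
            for e, c in counts.items()}
```
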
5 DEEPR FRAMEWORK
In this section, we present our DeepR framework, whose architecture is illustrated in Figure 3. We first introduce two important components of DeepR: the spatial adaptive graph neural network (SA-GNN) learning component and the pairwise POI knowledge extraction (PKE) model. Then we apply a pairwise output layer to predict the competitive probability for a given pair of POIs.
5.1 Spatial Adaptive Graph Neural Network
Though graph neural networks have been widely studied in many applications and show great advantages on general graph learning problems, these message-passing neural networks (MPNNs), such as GCN [6] and HAN [22], still have fundamental weaknesses in several respects [14], especially for our POI graph learning problem. First, the aggregation of MPNNs treats all neighbors of a POI equally in direction, losing the spatial information of POIs and their neighbors. Second, spatial distance dependencies are neglected, causing a lack of ability to capture distant-range spatial location dependencies.
Figure 3: Illustration of the proposed DeepR framework. We input the POI graph and aspect graph extracted from the heterogeneous POI information network (HPIN). Figures (a), (b), (c) show the different components, and Figure (d) presents the inner structure of the SA-GNN component.
Figure 4: (a, b) Two types of spatial distribution example. (c) Explanation of the spatial oriented aggregation process of SA-GNN.
For example, as shown in Figure 4, McDonald's has three neighbors on the POI graph. In Figure 4(a), the three neighbor POIs are relatively evenly distributed around McDonald's. In this case, the three POIs probably have the same effect on McDonald's from the point of view of spatial distribution. In Figure 4(b), KFC is located on one side of McDonald's, while Burger King and Pizza Hut are on the other side. What's more, compared to Burger King and Pizza Hut, KFC is much closer to McDonald's in distance. Thus McDonald's should pay more attention to KFC than to Burger King and Pizza Hut, because people who live on the side of KFC are more likely to make a choice between KFC and McDonald's. But MPNNs cannot tell the difference between Figure 4(a) and Figure 4(b).
To overcome the above two limitations of traditional message-passing based GNNs, as shown in Figure 3(d), we design SA-GNN to process both the user behavior context and the spatial context of POIs in HPIN, considering both the spatial location relations on the co-query graph and the spatial distribution of POIs. SA-GNN is composed of a spatial oriented aggregation layer and a spatial-dependency attentive propagation layer.
5.1.1 Spatial Oriented Aggregation Layer. To overcome the first limitation and aggregate POI neighbors on the co-query graph spatially, we adopt a graph convolutional layer that learns POI representations by considering spatial location relations, as illustrated in Figure 4(c). We first evenly divide the neighbors of each POI node $p_i$ into several sectors (e.g., six sectors) $s_1, ..., s_S$ according to their location coordinates. Each POI in sector $s_k$ belongs to the same spatial neighbor set of $p_i$, denoted as $\mathcal{N}_k(p_i)$. POIs in the same sector $s_k$ are associated with the spatial relationship $r_{s_k}$. Notice that $p_i$ itself does not belong to any sector $s_1 \sim s_S$, so we define the central node $p_i$ to be in sector $s_0$ with neighbor set $\mathcal{N}_0(p_i) = \{p_i\}$. Then we use the same graph convolutional rule based on the symmetric normalized Laplacian as GCN [6] to aggregate POIs of relationship $r_{s_k}$ locally within sector $s_k$:
$$z_i^k = \sum_{p_j \in \mathcal{N}_k(p_i)} \big(\deg(p_i)\deg(p_j)\big)^{-\frac{1}{2}} p_j, \quad (3)$$
where $\deg(p_i)$ is the degree of node $p_i$ in the co-query graph. Then we aggregate the representations of the different sectors for each POI:
$$h_i = \sigma\Big(W_s \cdot \big\Vert_{k=0}^{S} z_i^k\Big), \quad (4)$$
where $\sigma$ is the non-linear activation function, $W_s$ is a transform matrix, and we use the concatenation operator $\Vert$ as the global oriented aggregation function. In this spatial oriented aggregation way, SA-GNN can distinguish POI neighbors from different sectors and integrate spatial information.
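The sector assignment above reduces to bucketizing the angle of each neighbor around the central POI. A geometric sketch (planar coordinates and six equal sectors are assumed; the paper does not give the exact binning):

```python
import math

def sector_of(center, neighbor, num_sectors=6):
    """Assign a neighbor POI to one of `num_sectors` equal angular sectors
    around the central POI (Section 5.1.1). Coordinates are (x, y) pairs;
    a real system would first project lon/lat to a planar frame.
    """
    dx = neighbor[0] - center[0]
    dy = neighbor[1] - center[1]
    angle = math.atan2(dy, dx) % (2 * math.pi)       # normalize to [0, 2*pi)
    return int(angle / (2 * math.pi / num_sectors))  # sector index 0..S-1

def group_neighbors(center, neighbors, num_sectors=6):
    """Return {sector_index: [neighbor, ...]} -- the sets N_k(p_i) of Eq. 3."""
    groups = {}
    for nb in neighbors:
        groups.setdefault(sector_of(center, nb, num_sectors), []).append(nb)
    return groups
```
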
5.1.2 Spatial-dependency Attentive Propagation Layer. To overcome the second limitation and capture distant-range dependencies, we propose a spatial attentive propagation layer that handles the spatial information with two techniques. First, we design a heat map
convolution layer to model the surrounding environment of POIs and capture the spatial distribution feature. For a target POI $p_i$, we input its spatial heat map matrix in HPIN, denoted as $A_i$, into a CNN operation and learn the distant-range dependency feature $m_i$:
$$m_i = f_{CNN}(A_i; \theta_h), \quad (5)$$
Then we concatenate $m_i$ and $h_i$ to learn the POI representation feature:
$$c_i = \mathrm{CONV}(p_i) = \sigma(h_i \oplus m_i), \quad (6)$$
Second, we invent a location-aware attentive propagation layer to process the relative spatial positions between POIs. Although the spatial oriented aggregation layer can capture POI spatial information of different sectors, the spatial distance factor between POIs is overlooked. As illustrated in Figure 4, the nearer a neighbor POI is to another POI, the more likely they are competitive, which means this POI deserves higher attention in the model. Besides, the relative spatial location in two-dimensional space is also influential. To overcome this problem, we propose a location-aware attention mechanism to propagate the POI representation features:
$$u_i = \sum_{j \in \mathcal{N}_i} Att_{ij}(p_i, p_j, l_{ij}) \, W_a \cdot \mathrm{CONV}(p_j), \quad (7)$$
where $Att_{ij}(p_i, p_j, l_{ij})$ is the location-aware attentive weight of each neighbor $p_j$ for $p_i$ with their relative spatial location feature $l_{ij}$, and $W_a$ is a transform matrix.

As shown in Figure 5, we define two one-hot vectors $o_x(p_i, p_j)$ and $o_y(p_i, p_j)$, which represent the relative position in the longitude and latitude dimensions. For each dimension, we take $p_i$ as the origin of coordinates and divide the distance between $p_i$ and $p_j$ into buckets (in our experiments, we set the bucket size to 100 meters and the maximum distance in each dimension to 10 km). Then we concatenate the two embeddings $e_x(i, j) = W_x o_x(p_i, p_j)$ and $e_y(i, j) = W_y o_y(p_i, p_j)$, and apply a dense layer transformation:
$$l_{ij} = W_l \cdot \big(e_x(i, j) \oplus e_y(i, j)\big), \quad (8)$$
Next, we implement the location-aware attention $Att_{ij}$, which is formulated as:
$$Att_{ij}(p_i, p_j, l_{ij}) = \sigma\big(\mathbf{a} \cdot (W_t c_i \oplus W_t c_j \oplus l_{ij})\big), \quad (9)$$
Finally, as suggested by GAT [18], multi-head attention is used to enhance learning ability during propagation. We concatenate the embeddings learned by $K$ independent location-aware attention mechanisms to obtain the final aggregation embedding:
$$u_i = \big\Vert_{k=1}^{K} \sigma\Big(\sum_{j \in \mathcal{N}_i} Att_{ij}^k(p_i, p_j, l_{ij}) \, W_a^k \cdot \mathrm{CONV}(p_j)\Big).$$

5.1.3 Learning on Diffusion and Affinity Graphs. We further apply SA-GNN to the two sub-graphs derived from the co-query graph. As stated before, the co-query graph contains two sub-graphs: the diffusion graph and the affinity graph. The diffusion graph is based on POIs with the same category (which tend to be competitive since users may make a choice among them), and the affinity graph is based on POIs with different categories (which are complementary instead of competitive since users may plan to visit all of them). Due
Figure 5: Relative spatial location feature of $p_1$ and $p_2$.
to the different behavior semantics behind this view, we apply the SA-GNN operation to these two different graphs separately, obtaining $u_i^d$ and $u_i^a$, and concatenate the two types of representations to output the final representation feature of POI $p_i$:
$$x_i = u_i^d \oplus u_i^a, \quad (10)$$
5.2 Pairwise POI Knowledge Extraction
DeepR also utilizes the aspect information and brand information of POIs in HPIN to help competitive relationship prediction. Formally, given a pair of POIs input to DeepR (note that one POI in HPIN, e.g., "KFC in West Court", corresponds to one brand, e.g., "KFC"), we can fetch a pair of brand embeddings, denoted as $b_i$ and $b_j$. Such embedding features can be used to enhance the POIs for relationship prediction. To this end, as shown in Figure 3(b), we present the other component of DeepR: Pairwise POI Knowledge Extraction (PKE), which consists of the relation-aware aspect convolution (RAConv) and a cross attention layer. In this section, we discuss how to generate the embeddings of brands and aspects, and then distill the pairwise POI knowledge about aspects for each POI pair from HPIN.
5.2.1 Relation-aware Aspect Convolution. Inspired by Text GCN [27] and R-GCN [16], we propose the relation-aware aspect convolution (RAConv) to learn brand and aspect embeddings that distinguish the three relations between brands and aspects (the brand-brand, aspect-aspect and brand-aspect relations). For each brand and aspect, a relation-aware graph convolution operation aggregates all related neighbor brands and aspects:
AGG(a_i^{(l)}) = Σ_{j∈N_i^s} (Ã_s)_{ij} W_s a_j^{(l−1)} + Σ_{j∈N_i^t} (Ã_t)_{ij} W_t b_j^{(l−1)},
a_i^{(l)} = σ( W a_i^{(l−1)} + AGG(a_i^{(l)}) ),  (11)

AGG(b_i^{(l)}) = Σ_{j∈N_i^b} (Ã_b)_{ij} W_b b_j^{(l−1)} + Σ_{j∈N_i^t} (Ã_t)_{ij} W_t a_j^{(l−1)},
b_i^{(l)} = σ( W b_i^{(l−1)} + AGG(b_i^{(l)}) ),  (12)

where the function AGG denotes that each aspect a_i or brand b_i aggregates the influences from its related brands and aspects; a_i represents an aspect embedding and b_i represents a brand embedding; Ã_s = D_s^{−1/2} Â_s D_s^{−1/2} with Â_s = A_s + I_s (Ã_b and Ã_t are defined in the same way); N_i^s, N_i^b and N_i^t denote the neighbor index sets of node i under the three relations; σ is a non-linear activation; W, W_s, W_b and W_t are the weight matrices of the different relations.
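A minimal dense sketch of one RAConv step, under our own naming (A_aspect, A_brand and A_cross for the aspect-aspect, brand-brand and brand-aspect adjacency matrices, with ReLU standing in for the unspecified activation σ); the paper does not say how the cross-relation matrix is normalized, so the raw bipartite adjacency is used there:

```python
import numpy as np

def normalize_adj(A):
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def raconv_layer(A_aspect, A_brand, A_cross, H_a, H_b, Ws, Wb, Wt, W):
    """One relation-aware convolution step in the spirit of Eqs. (11)-(12).

    Aspects aggregate from related aspects (A_aspect) and from brands via
    the brand-aspect relation (A_cross); brands do the symmetric thing.
    H_a: (n_a, d) aspect embeddings, H_b: (n_b, d) brand embeddings,
    A_cross: (n_a, n_b) bipartite brand-aspect adjacency.
    """
    relu = lambda x: np.maximum(x, 0.0)
    As, Ab = normalize_adj(A_aspect), normalize_adj(A_brand)
    agg_a = As @ H_a @ Ws.T + A_cross @ H_b @ Wt.T
    agg_b = Ab @ H_b @ Wb.T + A_cross.T @ H_a @ Wt.T
    return relu(H_a @ W.T + agg_a), relu(H_b @ W.T + agg_b)
```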
Research Track Paper KDD '20, August 23โ27, 2020, Virtual Event, USA
5.2.2 Cross Attention. Given a pair of brand embeddings generated by the above RAConv, denoted as b_i and b_j, there are two lists of aspect embeddings, denoted as {a_1^i, ..., a_m^i} and {a_1^j, ..., a_m^j}. We feed these embeddings into the cross attention layer to mutually generate attentive weights. We first calculate the similarity between brand b_i and each aspect a_k^j of brand b_j, denoted as s(b_i, a_k^j), aiming to pick out the important aspects and de-emphasize the noisy aspects of the other brand b_j (s(b_j, a_k^i) is computed in the same way):

s(b_i, a_k^j) = (b_i · a_k^j) / (‖b_i‖ · ‖a_k^j‖),  k ∈ [1, m]  (13)

Then the softmax function is used to obtain each other's normalized attentive weights over aspects, and the aspect-enhanced feature vector v_i for POI p_i is computed as the weighted sum of brand b_i's aspects a_k^i:

v_i = Σ_{k=1}^{m} β_k a_k^i,  (14)

β_k = exp(s(b_j, a_k^i)) / Σ_{t=1}^{m} exp(s(b_j, a_t^i)),  (15)
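Equations (13)-(15) amount to a softmax over cosine similarities, as in this sketch (function names are ours):

```python
import numpy as np

def cosine(u, v):
    """Eq. (13): cosine similarity between two embeddings."""
    return float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def cross_attend(b_other, own_aspects):
    """Aspect-enhanced feature of one POI, following Eqs. (14)-(15):
    this POI's aspects are weighted by their cosine similarity to the
    *other* POI's brand embedding, softmax-normalized."""
    scores = np.array([cosine(b_other, a) for a in own_aspects])
    beta = np.exp(scores - scores.max())
    beta /= beta.sum()
    return sum(w * a for w, a in zip(beta, own_aspects))
```

Aspects aligned with the other brand receive large weights, while unrelated (noisy) aspects are suppressed.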
Definition 5.1. Pairwise Average Dense Operation. We define a pairwise average dense operation pd to overcome the asymmetry of directly concatenating two vectors v_1 and v_2:

pd(v_1, v_2) = average( W_p(v_1 ⊕ v_2), W_p(v_2 ⊕ v_1) ),  (16)

Given a pair of POIs, we input v_i and v_j into pd to output the final pairwise aspect representation c_{i,j}:

c_{i,j} = pd(v_i, v_j).  (17)
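A sketch of the operation; the point of averaging both concatenation orders is that the output no longer depends on the order of the two inputs:

```python
import numpy as np

def pairwise_avg_dense(v1, v2, Wp):
    """Eq. (16): average the dense projections of both concatenation
    orders, making the output invariant to swapping v1 and v2."""
    return 0.5 * (Wp @ np.concatenate([v1, v2]) +
                  Wp @ np.concatenate([v2, v1]))
```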
5.3 Prediction and Optimization
For a pair of POIs p_i, p_j, DeepR encodes p_i and p_j separately with the SA-GNN layer and outputs their representation features, denoted as r_i and r_j. After pairwise POI knowledge extraction, c_{i,j} represents the pairwise aspect context feature of p_i and p_j.
5.3.1 Pairwise Interaction. The pairwise interaction layer combines the POI representation features of p_i and p_j and outputs the POI pair feature for competitive relationship prediction:

z_{i,j} = W_t · ( pd(r_i^d, r_j^d) ⊕ pd(r_i^a, r_j^a) ),  (18)

where W_t is the weight matrix, and r_i^d, r_i^a are the two POI representations from the diffusion graph and the affinity graph.
5.3.2 Prediction. Finally, the probability of a pair of POIs competing with each other is predicted by a fully connected layer over the concatenation of the POI pair feature z_{i,j} and the aspect context feature c_{i,j}. Then we have:

ŷ_{i,j} = sigmoid( W_o · (z_{i,j} ⊕ c_{i,j}) ),  (19)
5.3.3 Optimization. We adopt the cross-entropy loss between the ground truth and the prediction to train DeepR over all labeled POI pairs:

L = − Σ_{(p_i, p_j)∈D} ( y_{i,j} log ŷ_{i,j} + (1 − y_{i,j}) log(1 − ŷ_{i,j}) ),  (20)

where D is the ground-truth set, containing both positive and negative labels of the competitive relationship; y_{i,j} is the label of the pair (p_i, p_j) and ŷ_{i,j} is the prediction result of DeepR.
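The prediction head and loss of Eqs. (19)-(20) can be sketched as follows (the weight name Wo is our placeholder for the fully connected layer's parameters):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_pair(z_ij, c_ij, Wo):
    """Eq. (19): competition probability from the concatenation of the
    POI-pair feature z_ij and the aspect context feature c_ij."""
    return sigmoid(float(Wo @ np.concatenate([z_ij, c_ij])))

def bce_loss(labeled):
    """Eq. (20): cross-entropy over labelled pairs [(y, y_hat), ...];
    the epsilon guards the log against exact 0/1 predictions."""
    eps = 1e-12
    return -sum(y * np.log(p + eps) + (1.0 - y) * np.log(1.0 - p + eps)
                for y, p in labeled)
```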
6 EXPERIMENTSIn this section, we first introduce the experiment settings, and thendemonstrate the effectiveness of DeepR on two city-level datasets.
6.1 Experiment Settings
6.1.1 Dataset. Experiments are conducted on two real-world POI datasets in Beijing and Chengdu. To construct the HPIN, we use the map search query data and POI data of the corresponding cities from 1st August 2018 to 31st August 2018. The dataset is a portion of the whole Beijing and Chengdu data, randomly sampled from the data of Baidu Maps. We extract 3,231 aspects from reviews, with 113,997 links of the aspect-aspect relation, 33,098 links of the brand-brand relation and 207,917 links of the brand-aspect relation. The construction of the ground truth is introduced in Appendix A.1.
6.1.2 Baselines and Evaluation Metrics. We use five kinds of baselines: simple rule-based methods (DIST and EW), feature-based methods (MLP and XGBoost [1]), graph embedding methods (Deepwalk [15] and Node2vec [2]), several state-of-the-art GNN models (GCN [6], GAT [18], SEAL [30] and Geom-GCN [14]), and a state-of-the-art GNN over heterogeneous information networks (HAN [22]). Details of the baselines are given in Appendix A.2. We use Accuracy (Acc), Area Under Curve (AUC), Precision (Prec), Recall (Rec) and F1-score (F1) as the evaluation metrics.
6.2 Performance Evaluation
6.2.1 Overall Comparison. Table 1 shows the performance of our proposed DeepR compared to all the baselines on the Beijing and Chengdu datasets. As we can see, DeepR significantly outperforms all the baselines on almost all metrics.
Specifically, we can observe that the prediction results of the rule-based methods (DIST, EW) are very poor; in particular, their accuracy is only around 60%, demonstrating that we cannot simply use distance or co-query weight to make predictions. Furthermore, the feature-based models (MLP, XGBoost) are also not ideal compared to DeepR, since they do not incorporate the structure information of the co-query graph. The graph embedding models (Deepwalk, Node2vec) can learn the structural features of the POI co-query graph, and as a result they achieve better performance than the feature-based models. But their accuracy is still not high, because these graph embedding models fail to capture effective neighborhood information as graph neural networks do.
For graph neural networks, GCN and GAT perform relatively better than the graph embedding baselines, which indicates the powerful ability of GNNs to capture graph information together with POI features. The performance of Geom-GCN is not significantly better than GCN and GAT; a potential reason is that the neighborhood graph structure and latent dependencies in the graph are not essential for competitive relationship prediction on our datasets. We notice that the result of SEAL is not as good as the other GNN models, because the POI graph is much sparser than general graphs and SEAL has only a limited ability to learn the subgraph patterns for link prediction. Since HAN can learn the heterogeneous information compared to
Table 1: Experimental results on the Beijing and Chengdu datasets.

Method   |              Beijing               |              Chengdu
         | Acc    AUC    F1     Prec   Rec    | Acc    AUC    F1     Prec   Rec
EW       | 0.5765 0.6225 /      /      /      | 0.5667 0.6133 /      /      /
DIST     | 0.6442 0.7131 /      /      /      | 0.6257 0.6963 /      /      /
MLP      | 0.7221 0.8102 0.7389 0.6968 0.7863 | 0.6883 0.7476 0.7117 0.6621 0.7694
XGBoost  | 0.7814 0.8641 0.7915 0.7566 0.8298 | 0.7300 0.8090 0.7353 0.7211 0.7500
Deepwalk | 0.7732 0.8511 0.7811 0.7549 0.8511 | 0.7397 0.8158 0.7485 0.7241 0.7745
Node2vec | 0.7784 0.8527 0.7866 0.7586 0.8167 | 0.7411 0.8151 0.7518 0.7291 0.7759
GCN      | 0.8061 0.8790 0.8139 0.7826 0.8477 | 0.7534 0.8394 0.7569 0.7463 0.7677
GAT      | 0.8069 0.8828 0.8077 0.8046 0.8108 | 0.7581 0.8281 0.7542 0.7669 0.7418
Geom-GCN | 0.8091 0.8835 0.8045 0.8071 0.8020 | 0.7527 0.8309 0.7447 0.7697 0.7213
SEAL     | 0.8023 0.8813 0.8094 0.7814 0.8396 | 0.7489 0.8418 0.7505 0.7455 0.7557
HAN      | 0.8145 0.8893 0.8175 0.8046 0.8308 | 0.7633 0.8424 0.7656 0.7556 0.7758
DeepR    | 0.8516 0.9129 0.8509 0.8546 0.8472 | 0.7876 0.8566 0.7884 0.7857 0.7911
[Figure 6: Evaluation of DeepR against its variants on the Beijing and Chengdu datasets. The four panels plot ACC, AUC and F1 scores: (a) spatial contribution on Beijing (DeepR, DeepR-C, DeepR-L, DeepR-S, DeepR-P); (b) aspect contribution on Beijing (DeepR, DeepR-CA, DeepR-A); (c) spatial contribution on Chengdu; (d) aspect contribution on Chengdu.]
the above GNN models, it achieves the best performance among all the baselines. However, HAN is not capable of capturing the spatial features and the aspect features of POIs well. DeepR can effectively learn multi-context information from the HPIN. Therefore, as shown in Table 1, the performance of DeepR improves greatly: its ACC is higher than that of Geom-GCN by 5.25% and that of HAN by 4.52% on the Beijing dataset.
6.2.2 How the Spatial Factor Helps. To study how the spatial factor helps with the prediction, we design different variants of DeepR:

• DeepR-C: It uses the co-query graph information without the heat map based CNN to learn the spatial distribution.
• DeepR-L: The location-aware attention mechanism is removed.
• DeepR-S: It drops the spatial oriented aggregation in SA-GNN.
• DeepR-P: There is no spatial-dependency attentive propagation in SA-GNN.
As we can see in Figures 6(a) and 6(c), DeepR outperforms all of its variants DeepR-C, DeepR-L, DeepR-S and DeepR-P on both datasets. This demonstrates the effectiveness of our proposed model in handling spatial information. Specifically, the proposed model outperforms DeepR-C, which shows the benefit of considering distant-range spatial dependencies through the heat map. The performance of DeepR-L is obviously poor on all three metrics, showing that spatial information is significant for attentive propagation. Furthermore, if we remove the spatial oriented aggregation (DeepR-S) or the spatial-dependency attentive propagation entirely (DeepR-P), the results get much worse, which indicates the necessity of SA-GNN for overcoming the limitations of general message-passing GNNs in the POI graph learning problem.
6.2.3 How the Aspect Context Helps. We further study how the aspect context helps to predict the competitive relationship effectively. We conduct experiments on two variants:

• DeepR-CA: The cross attention layer of PKE is removed.
• DeepR-A: The PKE component of DeepR is removed.
Figures 6(b) and 6(d) show that removing the aspect context learning component decreases accuracy by 2.6%, which justifies the importance of aspect information. Moreover, DeepR-CA performs worse than the proposed model, because without cross attention noisy aspects can disturb the prediction results.
6.2.4 Hyper-parameter Sensitivity. We also conduct two experiments to study how the parameters m (the number of aspects) and s_g (the grid size) influence DeepR's performance. As illustrated in Figure 7, when the number of aspects m increases from 0 to 20, there are noticeable improvements on all metrics. But when m ≥ 10, the results do not change much, because the top aspects (i.e., the top 10) sorted by TF-IDF are more important, while the low-ranking aspects carry relatively less semantic signal for prediction. Figure 7 also shows that the scores increase slowly at the beginning and become relatively stable when the grid size is larger than 400m × 400m. As the grid size increases, on the one hand POIs in a grid can capture more distant Euclidean spatial information, while on the other hand this information at a coarser granularity is not always helpful. As a result, the performance reaches an equilibrium when s_g ≥ 400.
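The TF-IDF ranking that motivates the top-m cut-off can be sketched as follows; the `brand_tokens` input format (each brand's review corpus as one token list, treated as one document for IDF) is our assumption, since the paper does not specify its aspect extraction pipeline:

```python
import math
from collections import Counter

def top_k_aspects(brand_tokens, k):
    """Rank each brand's aspect terms by TF-IDF and keep the top k.

    brand_tokens: dict mapping a brand name to a list of aspect tokens.
    """
    n_docs = len(brand_tokens)
    df = Counter()                      # document frequency per term
    for tokens in brand_tokens.values():
        df.update(set(tokens))
    top = {}
    for brand, tokens in brand_tokens.items():
        tf = Counter(tokens)            # term frequency within the brand
        scored = sorted(tf, key=lambda t: -tf[t] * math.log(n_docs / df[t]))
        top[brand] = scored[:k]
    return top
```

Terms shared by every brand get an IDF of zero, so brand-specific aspects float to the top of the ranking.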
[Figure 7: Sensitivity analysis on the two datasets. The four panels plot acc, auc and f1-score: (a) effect of top-k aspects on Beijing; (b) effect of grid size on Beijing; (c) effect of top-k aspects on Chengdu; (d) effect of grid size on Chengdu.]
7 CONCLUSION
In this paper, we study the competitive analysis problem for POIs, which is valuable on the commercial front. We construct a heterogeneous POI information network and propose the DeepR framework to discover the competitors of POIs. Moreover, a spatial adaptive graph neural network is designed in DeepR, aiming to overcome the limitations of message-passing GNNs. Meanwhile, the proposed model utilizes pairwise POI knowledge extracted from reviews to improve performance. Extensive experimental results on two real-world datasets show that DeepR significantly outperforms all baselines for POI competitive relationship prediction.
ACKNOWLEDGMENTS
This research was partially supported by grants from the National Key Research and Development Program of China (Grant No. 2018YFB1402600), and the National Natural Science Foundation of China (Grant No. 91746301, 71531001, 61725205, 61703386, U1605251).
REFERENCES[1] Tianqi Chen and Carlos Guestrin. 2016. Xgboost: A scalable tree boosting system.
In Proceedings of the 22nd acm sigkdd international conference on knowledgediscovery and data mining. ACM, 785โ794.
[2] Aditya Grover and Jure Leskovec. 2016. node2vec: Scalable feature learning fornetworks. In Proceedings of the 22nd ACM SIGKDD international conference onKnowledge discovery and data mining. ACM, 855โ864.
[3] Mahdi Jalili, Yasin Orouskhani, Milad Asgari, Nazanin Alipourfard, and Matjaž Perc. 2017. Link prediction in multiplex online social networks. Royal Society Open Science 4, 2 (2017), 160863.
[4] Nitin Jindal and Bing Liu. 2006. Identifying comparative sentences in text docu-ments. In Proceedings of the 29th annual international ACM SIGIR conference onResearch and development in information retrieval. ACM, 244โ251.
[5] Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic opti-mization. arXiv preprint arXiv:1412.6980 (2014).
[6] Thomas N Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016).
[7] Rui Li, Shenghua Bao, Jin Wang, Yong Yu, and Yunbo Cao. 2006. Cominer: Aneffective algorithm for mining competitors from the web. In Sixth InternationalConference on Data Mining (ICDMโ06). IEEE, 948โ952.
[8] David Liben-Nowell and Jon Kleinberg. 2007. The link-prediction problem forsocial networks. Journal of the American society for information science andtechnology 58, 7 (2007), 1019โ1031.
[9] Hao Liu, Ting Li, Renjun Hu, Yanjie Fu, Jingjing Gu, and Hui Xiong. 2019. JointRepresentation Learning for Multi-Modal Transportation Recommendation. InProceedings of the Thirty-Third AAAI Conference on Artificial Intelligence. 1036โ1043.
[10] Hao Liu, Yongxin Tong, Panpan Zhang, Xinjiang Lu, Jianguo Duan, and Hui Xiong. 2019. Hydra: A Personalized and Context-Aware Multi-Modal Transportation Recommendation System. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2314-2324.
[11] Weiping Liu and Linyuan Lü. 2010. Link prediction based on local random walk. EPL (Europhysics Letters) 89, 5 (2010), 58007.
[12] Zhongming Ma, Gautam Pant, and Olivia RL Sheng. 2011. Mining competitorrelationships from online news: A network-based approach. Electronic CommerceResearch and Applications 10, 4 (2011), 418โ427.
[13] Xing Niu, Xinruo Sun, Haofen Wang, Shu Rong, Guilin Qi, and Yong Yu. 2011.Zhishi. me-weaving chinese linking open data. In International Semantic WebConference. Springer, 205โ220.
[14] Hongbin Pei, Bingzhe Wei, Kevin Chen-Chuan Chang, Yu Lei, and Bo Yang.2020. Geom-GCN: Geometric Graph Convolutional Networks. In InternationalConference on Learning Representations (ICLR).
[15] Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. 2014. Deepwalk: Online learningof social representations. In Proceedings of the 20th ACM SIGKDD internationalconference on Knowledge discovery and data mining. ACM, 701โ710.
[16] Michael Schlichtkrull, Thomas N Kipf, Peter Bloem, Rianne Van Den Berg, IvanTitov, and Max Welling. 2018. Modeling relational data with graph convolutionalnetworks. In European Semantic Web Conference. Springer, 593โ607.
[17] George Valkanas, Theodoros Lappas, and Dimitrios Gunopulos. 2017. MiningCompetitors from Large Unstructured Datasets. IEEE Transactions on Knowledgeand Data Engineering 29, 9 (2017), 1971โ1984.
[18] Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. 2017. Graph attention networks. arXiv preprint arXiv:1710.10903 (2017).
[19] Hao Wang, Tong Xu, Qi Liu, Defu Lian, Enhong Chen, Dongfang Du, Han Wu,and Wen Su. 2019. MCNE: An End-to-End Framework for Learning MultipleConditional Network Representations of Social Network. In Proceedings of the25th ACM SIGKDD International Conference on Knowledge Discovery & DataMining. 1064โ1072.
[20] Hongwei Wang, Fuzheng Zhang, Min Hou, Xing Xie, Minyi Guo, and Qi Liu. 2018.Shine: Signed heterogeneous information network embedding for sentiment linkprediction. In Proceedings of the Eleventh ACM International Conference on WebSearch and Data Mining. ACM, 592โ600.
[21] Peng Wang, BaoWen Xu, YuRong Wu, and XiaoYu Zhou. 2015. Link predictionin social networks: the state-of-the-art. Science China Information Sciences 58, 1(2015), 1โ38.
[22] Xiao Wang, Houye Ji, Chuan Shi, Bai Wang, Yanfang Ye, Peng Cui, and Philip SYu. 2019. Heterogeneous Graph Attention Network. In The World Wide WebConference. ACM, 2022โ2032.
[23] Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, andPhilip S Yu. 2019. A comprehensive survey on graph neural networks. arXivpreprint arXiv:1901.00596 (2019).
[24] Jin Xu, Jingbo Zhou, Yongpo Jia, Jian Li, and Xiong Hui. 2020. An AdaptiveMaster-Slave Regularized Model for Unexpected Revenue Prediction Enhancedwith Alternative Data. In IEEE 36th International Conference on Data Engineering.IEEE, 601โ612.
[25] Kaiquan Xu, Stephen Shaoyi Liao, Jiexun Li, and Yuxia Song. 2011. Mining com-parative opinions from customer reviews for competitive intelligence. Decisionsupport systems 50, 4 (2011), 743โ754.
[26] Yang Yang, Jie Tang, Jacklyne Keomany, Yanting Zhao, Juanzi Li, Ying Ding,Tian Li, and Liangwei Wang. 2012. Mining competitive relationships by learningacross heterogeneous networks. In Proceedings of the 21st ACM internationalconference on Information and knowledge management. ACM, 1432โ1441.
[27] Liang Yao, Chengsheng Mao, and Yuan Luo. 2019. Graph convolutional net-works for text classification. In Proceedings of the AAAI Conference on ArtificialIntelligence, Vol. 33. 7370โ7377.
[28] Zikai Yin, Tong Xu, Hengshu Zhu, Chen Zhu, Enhong Chen, and Hui Xiong. 2019.Matching of social events and users: a two-way selection perspective. WorldWide Web (2019), 1โ19.
[29] Le Zhang, Tong Xu, Hengshu Zhu, Chuan Qin, Qingxin Meng, Hui Xiong, and En-hong Chen. 2020. Large-Scale Talent Flow Embedding for Company CompetitiveAnalysis. In Proceedings of The Web Conference 2020. 2354โ2364.
[30] Muhan Zhang and Yixin Chen. 2018. Link prediction based on graph neuralnetworks. In Advances in Neural Information Processing Systems. 5165โ5175.
[31] Weijia Zhang, Hao Liu, Yanchi Liu, Jingbo Zhou, and Hui Xiong. 2020. Semi-Supervised Hierarchical Recurrent Graph Neural Network for City-Wide ParkingAvailability Prediction.
[32] Jingbo Zhou, Shan Gou, Renjun Hu, Dongxiang Zhang, Jin Xu, Airong Jiang, YingLi, and Hui Xiong. 2019. A Collaborative Learning Framework to Tag Refinementfor Points of Interest. In Proceedings of the 25th ACM SIGKDD InternationalConference on Knowledge Discovery & Data Mining. 1752โ1761.
Research Track Paper KDD '20, August 23โ27, 2020, Virtual Event, USA
1273
A APPENDIX
A.1 Additional Dataset Description
Here we introduce how we construct the ground truth. Table 2 describes the main attributes of POIs used in our work. First, for each brand, we collected its related brands (e.g., KFC and McDonald's) from the public knowledge base zhishi.me [13], which provides a "relatedPage" relation. For all POI pairs between two related brands, we use the following conditions to pick out POI pairs with a competitive relationship: 1) the distance between the two POIs is within 10 km; 2) they have the same category; 3) the overlap of the checked-in users of the two POIs in July 2018 is larger than 5%.
In total, the Beijing dataset contains 18,731 pairs and the Chengdu dataset contains 7,514 pairs. We manually checked 200 pairs randomly selected from the ground truth and found the accuracy to be larger than 95%. We generate the same number of negative POI pairs with the same category for each city by random sampling, while also limiting their distances to within 10 km, and we ensure that the category distribution of negative POIs is consistent with that of the positive samples.
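The three filtering conditions can be sketched as below; the POI dict layout, the haversine distance and the overlap denominator (the smaller user set) are our assumptions, since the paper does not spell them out:

```python
import math

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = p2 - p1, math.radians(lng2 - lng1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def is_competitive_pair(p1, p2, users1, users2, max_km=10.0, min_overlap=0.05):
    """Apply the three ground-truth conditions to one candidate POI pair.

    p1/p2 are dicts with 'category', 'lat', 'lng'; users1/users2 are the
    sets of checked-in users of the two POIs.
    """
    if p1["category"] != p2["category"]:          # condition 2
        return False
    if haversine_km(p1["lat"], p1["lng"], p2["lat"], p2["lng"]) > max_km:
        return False                              # condition 1
    overlap = len(users1 & users2) / max(1, min(len(users1), len(users2)))
    return overlap > min_overlap                  # condition 3
```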
A.2 Baseline DescriptionWe compare our DeepR model with the following methods to pre-dict the competitive relationship of POIs:
• DIST (distance rule) and EW (edge weight rule) are two simple methods. We set the best threshold to determine whether POI pairs are competitive or not: if the distance (w.r.t. DIST) or edge weight (w.r.t. EW) of a POI pair with the same category is smaller than a pre-defined threshold, the pair is considered to be competitive.
โข MLP and XGBoost[1]. We concatenate POI features (i.e.,category and coordinate) of a pair of POIs and correlationfeatures (i.e., distance and co-query weight) as input to pre-dict competitive relationship by MLP and XGBoost.
โข Deepwalk[15] and Node2vec[2]. We use graph embeddingmethods to learn POIโs embedding on co-query graph, andthen apply an MLP classifier to predict.
โข GCN[6]. It is a kind of graph neural network, which ag-gregates node features on graph. Here we use POI featuresand the average vector of aspect embeddings learned fromword2vec as the input feature of each node, and GCN isconducted on co-query graph.
โข GAT[18]. This model is also a graph neural network consid-ering the attention mechanism to learn appropriate weightsof neighbor nodes. Here the input is the same as GCN.
โข Geom-GCN[14]. Geom-GCN is based on a geometric aggre-gation scheme for graph neural networks. Here we build themodel on the co-query POI graph, and the input feature isthe same as GCN.
• SEAL [30]. It is a state-of-the-art graph neural network for link prediction. Here we feed it with the subgraph extracted around each link, and the input feature has three components: structural POI labels, POI embeddings and POI attributes.
• HAN [22]. HAN is a state-of-the-art heterogeneous graph neural network, which employs node-level attention and semantic-level attention. Here we construct three meta-paths (PP, PBP,
Table 2: The main attributes of POI we used in our work.
Attribute | Description
Name      | The name of the POI.
Category  | Standard two-level category.
Brand     | Brand information of the POI.
Point x   | The x-coordinate of the POI on the map.
Point y   | The y-coordinate of the POI on the map.
PBABP) for HAN to learn POI representations to predict thecompetitive relationship.
A.3 Experiment Setup
A.3.1 Data Splitting. In the experiments, we randomly pick 10% of the data for testing and 10% as the validation set, while ensuring that their corresponding brand pairs do not appear in the remaining 80% training set, in order to avoid information leakage.
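The brand-aware split can be sketched as follows; the tuple layout of a labeled pair and the grouping by unordered brand pair are our assumptions about how the leakage constraint is enforced:

```python
import random

def brand_aware_split(pairs, test_frac=0.1, val_frac=0.1, seed=0):
    """Split labeled POI pairs so that no brand pair appears in both
    the training set and the test/validation sets.

    pairs: list of (poi_i, poi_j, brand_i, brand_j, label) tuples.
    """
    groups = {}
    for p in pairs:
        key = tuple(sorted((p[2], p[3])))       # unordered brand pair
        groups.setdefault(key, []).append(p)
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    n_test = max(1, int(len(keys) * test_frac))
    n_val = max(1, int(len(keys) * val_frac))
    test = [p for k in keys[:n_test] for p in groups[k]]
    val = [p for k in keys[n_test:n_test + n_val] for p in groups[k]]
    train = [p for k in keys[n_test + n_val:] for p in groups[k]]
    return train, val, test
```

Splitting whole brand-pair groups (rather than individual POI pairs) is what prevents a brand pair seen in training from leaking into evaluation.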
A.3.2 Parameter Settings. For DeepR, we set the embedding size of each node in HPIN to 100, and the size of every hidden layer to the same as the embedding size. We set Δt = 30 min, θ_q = 50 and θ_PMI = 0.2. We select the top 30 aspects of each brand, and the window size is 5. We divide the neighbors of each POI into four sectors for the spatial oriented aggregation of SA-GNN. For spatial-dependency attentive propagation, we set the bucket to 100 meters, and the maximum distance in each dimension to 10 km. We set the grid size in the heat map to 500m × 500m, set L = 11, and select 12 categories as channels, which means C = 12. In addition, we set the learning rate to 0.01, the filter sizes of the CNN to 2×2 and 3×3, the number of attention heads to 8, the dropout rate to 0.5, and the L2 loss weight to 1e-5. We use Adam [5] as the optimizer and ReLU as the activation function σ(·).
For the baseline models, we tune the parameters of each model to ensure its best performance. Specifically, for the rule-based methods, the distance threshold for DIST is 4.2 km, and the co-query edge weight threshold for EW is 227, meaning that the two POIs are queried 227 times by common users. For the feature-based methods, we concatenate the POI features of a pair of POIs and the correlation features as input to train MLP and XGBoost. We adopt a three-layer structure for MLP. The number of decision trees in XGBoost is set to 200 and the max depth of trees is set to 5. For a fair comparison, the embedding dimension d of all other baselines is set to 100 (the same as DeepR). For the graph embedding methods (Deepwalk and Node2vec), we set the walk length to 10 and the number of random walks to 80. The window size is set to 5, and the parameters q and p for Node2vec are both set to 1. For GCN, the input includes POI features and the average vector of aspect embeddings learned from word2vec as the input feature of each node, and GCN is conducted on the co-query graph. For SEAL, we use the output of Node2vec as the embedding part of the node features. For GAT and Geom-GCN, the input is the same as for GCN. The number of heads and the number of hidden units for GAT are set to 8 and 16, respectively. We use Struc2vec as the graph embedding method to learn the latent space feature for Geom-GCN. For HAN, we employ three meta-paths, i.e., PP (POI-POI), PBABP (POI-brand-aspect-brand-POI) and PBP (POI-brand-POI), to train the model.