CrowDM system
Mining Complex Data Generated by Collaborative Platforms
Dmitry I. Ignatov, Alexandra Yu. Kaminskaya, Anastasia A. Bezzubtseva, Ekaterina L. Chernyak, Konstantin N. Blinkin, Daniil R. Nedumov, Olga N. Chugunova, Andrey V. Konstantinov, Nikita S. Romashkin, Fedor V. Strok, Daria A. Goncharova, Rostislav E. Yavorsky
BIR 2012 HSE, Nizhniy Novgorod
The story of collaboration
The project and educational group
«Algorithms of Data Mining for Internet forums on innovative projects» (NRU HSE)
Crowdsourcing
• From Wikipedia:
– Crowdsourcing is a process that involves outsourcing tasks to a distributed group of people. This process can occur both online and offline (Jeff Howe, 2006)
– Crowdsourcing is related to, but not the same as, human-based computation, which refers to the ways in which humans and computers can work together to solve problems (Quinn & Bederson, 2010)
Collaborative platform
• Carrying out brainstorming (public examination, crowdsourcing)
• Platform core is a socio-semantic network (users, content)
• Users solve a common problem, propose their ideas, and evaluate and discuss each other's ideas
• Rating users and ideas yields the best ideas and their authors (the best users)
The goal
To develop a dedicated tool for a deeper understanding of user behavior on collaborative platforms, for developing adequate rating criteria, and for analyzing dynamics and statistics
The data analysis scheme
Formal context: data
• The project «Sberbank-21»: http://sberbank21.ru/
• Objects are platform users
• Attributes are ideas within the topic Sberbank and Private Client
• Object x Attribute datasets:
– The user is the author of the idea
– The user commented on the idea or on any of its comments
– The user has evaluated the idea or its comments
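The three object-attribute datasets above can be sketched as binary relations built from an event log. The slides do not describe the platform's export format, so the event tuples, action names, user names and idea identifiers below are hypothetical.

```python
from collections import defaultdict

# Hypothetical event log: (user, action, idea). Not real Sberbank-21 data.
events = [
    ("User1", "author", "I1"),
    ("User2", "comment", "I1"),
    ("User2", "evaluate", "I2"),
    ("User3", "comment", "I2"),
    ("User1", "evaluate", "I2"),
]

def build_contexts(events):
    """Return one binary context per action: {action: {user: set_of_ideas}}."""
    contexts = defaultdict(lambda: defaultdict(set))
    for user, action, idea in events:
        contexts[action][user].add(idea)
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {action: dict(rows) for action, rows in contexts.items()}

contexts = build_contexts(events)
```

Each value of `contexts` is one formal context: its keys are the objects (users) and each set holds the attributes (ideas) related to that user.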
Results: concept lattice
Concept Explorer: conexp.sourceforge.net/
Results: concept lattice
Formal concept: ({User45, User22}, {“Microcredits in [1000, 5000] rub.”})
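A pair (extent, intent) is a formal concept when the extent is exactly the set of users sharing every idea in the intent, and the intent is exactly the set of ideas shared by every user in the extent. A minimal sketch of that check follows. The toy context only mimics the example above, and the second idea name is invented for illustration.

```python
# Toy context: user -> set of ideas. Names are illustrative only.
context = {
    "User45": {"Microcredits", "Mobile bank"},
    "User22": {"Microcredits"},
    "User7": {"Mobile bank"},
}

def common_attributes(objects, context):
    """Attributes shared by all given objects (derivation operator A')."""
    result = set().union(*context.values())
    for obj in objects:
        result &= context[obj]
    return result

def common_objects(attributes, context):
    """Objects that have all given attributes (derivation operator B')."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}

def is_formal_concept(extent, intent, context):
    """True when A' == B and B' == A."""
    return (common_attributes(extent, context) == intent
            and common_objects(intent, context) == extent)

ok = is_formal_concept({"User45", "User22"}, {"Microcredits"}, context)
not_closed = is_formal_concept({"User45"}, {"Microcredits"}, context)
```

Here `ok` is true, while `not_closed` is false because User45 alone also shares the second idea, so the intent is not complete.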
Results: “iceberg” lattice
For the user-comment context of the Sberbank-21 project
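An iceberg lattice keeps only the concepts whose extent covers at least a given share of all objects. The sketch below enumerates concepts naively by closing every attribute subset, which is exponential and only suitable for toy data. The context and the threshold are assumptions, not values from the slides.

```python
from itertools import combinations

# Toy context: user -> set of ideas (illustrative).
context = {
    "u1": {"a", "b"},
    "u2": {"a", "b"},
    "u3": {"a"},
    "u4": {"c"},
}

def extent_of(attrs, context):
    """Objects that have every attribute in attrs."""
    return frozenset(o for o, has in context.items() if attrs <= has)

def intent_of(objs, context, all_attrs):
    """Attributes shared by every object in objs."""
    result = set(all_attrs)
    for o in objs:
        result &= context[o]
    return frozenset(result)

def iceberg_concepts(context, min_support):
    """Concepts (extent, intent) with |extent| / |objects| >= min_support."""
    all_attrs = set().union(*context.values())
    n = len(context)
    found = set()
    for k in range(len(all_attrs) + 1):
        for attrs in combinations(sorted(all_attrs), k):
            extent = extent_of(set(attrs), context)
            if len(extent) / n >= min_support:
                found.add((extent, intent_of(extent, context, all_attrs)))
    return found

frequent = iceberg_concepts(context, min_support=0.5)
```

Lowering `min_support` to 0 returns the full concept lattice, and raising it prunes the lower, less populated part.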
Results: biclustering
BicAT (Biclustering Analysis Toolbox): http://www.tik.ee.ethz.ch/sop/bicat/
Results: biclustering
Bicluster: ({User1 – User11}, {I1, I2, I3})
Results: biclustering
Extent: Hrabrova_Tatyana_Sergeevna, Rasul_Gappoev, Alena, Aleksey_Protsenko, Valentin_Mashkin, Aleksandr_Popov, Maksim_Dubinin, Mihail_Demchenko, Dinara_Gorlenko, Viktoriya, Tatyana_Dmitrova
Intent: What_shall_appear_at_physical_office_of_SB-21?, A_unique_service_of_2021_for_small_businesses?, Sberbank_and_Private_Clients
Stability: 0.7109375
Support: 0.101852
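The slide does not define the two measures. The sketch below assumes the usual definitions from formal concept analysis: support is the share of all objects that fall in the extent, and stability is the share of subsets of the extent that have the same intent as the whole extent. Under that assumption, the reported support would be consistent with 11 users out of 108, although the slide does not state the total. The toy context is illustrative.

```python
from itertools import combinations

# Toy context: user -> set of ideas (illustrative).
context = {
    "u1": {"a", "b"},
    "u2": {"a", "b"},
    "u3": {"a"},
    "u4": {"c"},
}

def intent_of(objs, context):
    """Attributes shared by every object in objs."""
    result = set().union(*context.values())
    for o in objs:
        result &= context[o]
    return result

def support(extent, context):
    """Share of all objects that belong to the extent."""
    return len(extent) / len(context)

def stability(extent, context):
    """Share of subsets of the extent whose intent equals the concept intent."""
    intent = intent_of(extent, context)
    members = sorted(extent)
    hits = 0
    for k in range(len(members) + 1):
        for subset in combinations(members, k):
            if intent_of(subset, context) == intent:
                hits += 1
    return hits / 2 ** len(members)

extent = {"u1", "u2", "u3"}
s = stability(extent, context)
sup = support(extent, context)
```

The brute-force loop visits all 2^|extent| subsets, which is feasible for an extent of 11 users (2048 subsets) but not for large extents.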
Results: statistical methods
[Figure: number of users plotted against the number of evaluations, x, on log-scaled axes. Does the distribution of evaluations follow a power law?]
Power Law Tests
No.  Sample           n    xmin  xmax  α     p-value
1    Idea generation  64   11    55    3.5   0.73
2.1  Commenting (1)   109  5     681   1.5   0
2.2  Commenting (2)   65   10    199   1.84  0.116
3.1  Evaluation (1)   38   614   5020  3.48  0.78
3.2  Evaluation (2)   70   84    614   1.81  0
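The slides do not say how the exponent α was estimated. One common choice, assumed here, is the continuous maximum-likelihood estimator α = 1 + n / Σ ln(x_i / xmin), computed over the n observations with x_i ≥ xmin. The sample values below are made up for illustration and are not the Sberbank-21 counts.

```python
import math

def fit_alpha(values, xmin):
    """Continuous MLE of the power-law exponent for the tail x >= xmin.

    Returns (alpha, n), where n is the number of tail observations.
    """
    tail = [x for x in values if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one observation strictly above xmin")
    return 1.0 + len(tail) / log_sum, len(tail)

# Made-up activity counts; 0.5 lies below xmin and is excluded from the fit.
sample = [0.5, 1.0, math.e, math.e]
alpha, n = fit_alpha(sample, xmin=1.0)
```

A p-value such as the one in the table would additionally need a goodness-of-fit test, for example a bootstrap on the Kolmogorov-Smirnov distance. This sketch does not implement one.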
Conclusion
• The developed methodology is useful for analyzing data from collaborative systems and resource-sharing systems
• Future work
– Using textual information
– Applying multimodal clustering methods
– Developing a recommender system
Thank you! Questions?