• Overview of theory and models used
• Methodology
• Results
Box’s Loop (the Blei method)
[Diagram: Domain Knowledge and Assumptions; Observed Data; Inference; Criticize and Evaluate]
Blei, D. M. (2014). Build, compute, critique, repeat: Data analysis
with latent variable models.
• Unsupervised technique that explains sets of observed
data as being generated from unobserved groups
Latent Dirichlet Allocation*
* Closely related to probabilistic latent semantic analysis (matrix
factorization); LDA adds Dirichlet priors to that model
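Hierarchical Bayesian Model
• Choose $\beta_k \sim \mathrm{Dir}(\eta)$, where $k \in \{1, \ldots, K\}$
• Choose a faction $z_{i,j} \sim \mathrm{Multinomial}(\theta_i)$
• Choose a vote $x_{i,j} \sim \mathrm{Multinomial}(\beta_{z_{i,j}})$
[Plate diagram: plates labeled Bills and Factions; observed node x]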
Blei, David M., Andrew Y. Ng, and Michael I. Jordan (2003). Latent
Dirichlet allocation.
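The model on this slide implies a joint distribution over the faction profiles, the MPs' mixtures, the faction assignments and the votes. The factorization below is a sketch under two assumptions that do not appear on the slide: the standard LDA prior $\theta_i \sim \mathrm{Dir}(\alpha)$ from Blei et al. (2003), and the illustrative symbols $M$ (number of MPs) and $N_i$ (number of votes cast by MP $i$).

```latex
p(\beta, \theta, z, x \mid \alpha, \eta)
  = \prod_{k=1}^{K} p(\beta_k \mid \eta)
    \prod_{i=1}^{M} \left[ p(\theta_i \mid \alpha)
    \prod_{j=1}^{N_i} p(z_{i,j} \mid \theta_i)\,
                      p(x_{i,j} \mid \beta_{z_{i,j}}) \right]
```

Only the votes $x_{i,j}$ are observed. Inference recovers the posterior over $\beta$, $\theta$ and $z$ given those votes.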
Generative Story
• A bill is proposed for a vote
• Assume that the (observed) votes on each bill are generated from a
finite number of (unobserved) factions
• For each MP:
  • Decides which (mixture of) factions to align with:
    $z_{i,j} \sim \mathrm{Multinomial}(\theta_i)$
  • Chooses a vote:
    $x_{i,j} \sim \mathrm{Multinomial}(\beta_{z_{i,j}})$
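The generative story above can be simulated directly. The following is a minimal sketch, not the authors' implementation. It assumes a Dirichlet prior on each MP's faction mixture (standard in LDA but not shown on the slide) and a hypothetical "bill:outcome" encoding of vote tokens. The function names, parameters and default values are illustrative only.

```python
import random


def sample_dirichlet(concentration, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in concentration]
    total = sum(draws)
    return [d / total for d in draws]


def generate_votes(n_mps, n_votes, n_factions, vocabulary,
                   alpha=1.0, eta=1.0, seed=0):
    """Sample synthetic votes from the LDA-style generative story (toy sketch)."""
    rng = random.Random(seed)
    # beta_k ~ Dir(eta): each faction's distribution over vote tokens.
    beta = [sample_dirichlet([eta] * len(vocabulary), rng)
            for _ in range(n_factions)]
    theta, z, x = [], [], []
    for _ in range(n_mps):
        # theta_i ~ Dir(alpha): assumed prior on the MP's faction mixture.
        theta_i = sample_dirichlet([alpha] * n_factions, rng)
        z_i, x_i = [], []
        for _ in range(n_votes):
            # z_ij ~ Multinomial(theta_i): pick a faction for this vote.
            k = rng.choices(range(n_factions), weights=theta_i)[0]
            # x_ij ~ Multinomial(beta_{z_ij}): pick the vote token.
            token = rng.choices(vocabulary, weights=beta[k])[0]
            z_i.append(k)
            x_i.append(token)
        theta.append(theta_i)
        z.append(z_i)
        x.append(x_i)
    return beta, theta, z, x


# Hypothetical vote tokens; the "bill:outcome" encoding is an assumption.
vocabulary = ["1558:For", "1558:Against", "1562:For", "1562:Against"]
beta, theta, z, x = generate_votes(n_mps=3, n_votes=5, n_factions=2,
                                   vocabulary=vocabulary)
print(x[0])  # the five sampled vote tokens for the first synthetic MP
```

Fitting the model reverses this process: given only the observed tokens in `x`, inference estimates `beta`, `theta` and `z`.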
What is Machine Learning?
Against 5 1
Abstain 6 0
Against 5 0
Abstain 6 0
• As such we evaluate the appropriateness of the model
(and assumptions) with two alternative measures:
faction diversity and party composition.
Evaluation
Miroshnichenko Ivan Vladimirovich, Markevich Yaroslav Vladimirovich,
Nemirovsky Andriy Valentinovich, Humil Mikhail Mikhailovich,
Drayuk Sergey Evseevich
Faction 6
Faction 11
Kolganova Olena Valerievna, Unguryan Pavel Yakimovich,
Humil Mikhail Mikhailovich, Krivenko Vadim Valerievich,
Sidorchuk Vadim Vasilyevich
Faction 8
Discovered Factions
[Figure legend: People Front, non-factional, Opposition Bloc,
Economic Development Union, Will of the People]
Diversity
Even at the limit, entropy doesn't reach the theoretical max of 2.19
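One way to compute a faction diversity score of this kind is the Shannon entropy of the party labels within a faction. The sketch below is illustrative only. The slides do not state the logarithm base or the number of party groups, so the natural logarithm and the toy labels are assumptions. For reference, ln 9 ≈ 2.197, so the quoted maximum would be consistent with nine groups measured in nats.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in nats) of the empirical distribution of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def max_entropy(n_categories):
    """Entropy of a uniform distribution over n_categories (in nats)."""
    return math.log(n_categories)


# Hypothetical party labels for the members of one discovered faction.
faction_members = ["A", "A", "A", "B", "B", "C"]
diversity = shannon_entropy(faction_members)
print(round(diversity, 3), "of a possible", round(max_entropy(3), 3))
```

A faction drawn from a single party scores 0. A faction drawing equally from every party reaches the maximum.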
Factions over time
1558-1 [1] -> 1580 [4]
Bill  Category  Description
1558  Econ      Consumer Protection financial services from the effects of devaluation of the hryvnia
1562  Econ      on measures to stabilize the balance of payments of Ukraine
1563  Econ      amending the Customs Code of Ukraine (to stabilize the balance of payments)
1564  Econ      on measures to promote the capitalization and restructuring of banks
1565  Sectoral  about peculiarities of ownership in an apartment house
1566  Sectoral  energy efficiency of buildings
1567  Econ      On the list of state property that cannot be privatized
1568  Legal     amending some laws of Ukraine concerning the manner of execution of judgments
1569  Legal     temporarily occupied territory of Ukraine
1573  Social    reform of compulsory state social insurance and payroll legalization
1574  Econ      amending the Customs Code of Ukraine (regarding certain issues state compensation for damage)
1575  Econ      On Amendments to the Tax Code of Ukraine (regarding certain issues state compensation for damage)
1576  Econ      amending the Budget Code of Ukraine (concerning protected expenditures)
1577  Sectoral  amending and ceasing to be invalid some legislative acts of Ukraine
1578  Econ      Tax Code of Ukraine and laws of Ukraine (on tax reform)
1580  Econ      simplification of the business environment (deregulation)
Model Selection
and equivalent for bills.
McInnes, L., & Healy, J. (2018). UMAP: Uniform Manifold
Approximation and Projection for Dimension Reduction. arXiv
preprint arXiv:1802.03426.
Party composition
• Global cluster position
Shortcomings
• LDA assumes independence of observations
• As such, it does not encode temporal relations or patterns
• Ad hoc tuning of the optimal number of parameters
• Purely inferential: without extensions to the method, LDA cannot
predict or forecast dynamics (e.g., when a new faction will emerge
or dissolve given past trends)
Concurrent Work and Extensions
Sequential models
• Hidden Markov models and recurrent neural networks
Temporal and dynamic models
• Dynamic topic modeling and switching/change point models
Semi-supervised and active learning
• Label propagation with known faction leaders
Graph neural networks
• Representation learning on MP association network
Causal modeling
• Causal effect inference with deep latent variable models
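As an illustration of the label propagation idea listed above, the sketch below spreads the labels of known faction leaders over an MP association graph. It is a simple synchronous majority-vote variant, not a specific published algorithm and not the authors' method. The graph, the MP names and the faction labels are all hypothetical.

```python
from collections import Counter


def propagate_labels(neighbors, seeds, max_iters=20):
    """Spread seed labels over a graph by repeated neighbor majority vote."""
    labels = dict(seeds)
    for _ in range(max_iters):
        updated = dict(labels)
        for node in sorted(neighbors):
            if node in seeds:
                continue  # known leaders keep their label
            votes = Counter(labels[n] for n in neighbors[node] if n in labels)
            if votes:
                # Most common neighbor label; ties broken alphabetically.
                updated[node] = min(votes, key=lambda lab: (-votes[lab], lab))
        if updated == labels:
            break  # converged
        labels = updated
    return labels


# Hypothetical association graph: two groups joined by one mp2-mp3 tie.
neighbors = {
    "leader_a": ["mp1", "mp2"],
    "mp1": ["leader_a", "mp2"],
    "mp2": ["leader_a", "mp1", "mp3"],
    "leader_b": ["mp3", "mp4"],
    "mp3": ["leader_b", "mp4", "mp2"],
    "mp4": ["leader_b", "mp3"],
}
seeds = {"leader_a": "faction_1", "leader_b": "faction_2"}
result = propagate_labels(neighbors, seeds)
print(result)
```

Synchronous updates can oscillate on some graphs, which is why the sketch caps the number of iterations instead of looping until convergence.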
Questions?
Appendix and References
• Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent
Dirichlet allocation. Journal of Machine Learning Research, 3(Jan),
993-1022.
• Blei, D. M. (2014). Build, compute, critique, repeat: Data
analysis with latent variable models. Annual Review of Statistics
and Its Application, 1, 203-232.
• Teh, Y. W., Jordan, M. I., Beal, M. J., & Blei, D. M. (2005).
Sharing clusters among related groups: Hierarchical Dirichlet
processes. In Advances in neural information processing systems
(pp. 1385-1392).
• Johnson, M. J., & Willsky, A. S. (2013). Bayesian
nonparametric hidden semi-Markov models. Journal of Machine
Learning Research, 14(Feb), 673-701.
• Blei, D. M., & Lafferty, J. D. (2006, June). Dynamic topic
models. In Proceedings of the 23rd International Conference on
Machine Learning (pp. 113-120). ACM.
• Fox, E., Sudderth, E. B., Jordan, M. I., & Willsky, A. S.
(2009). Nonparametric Bayesian learning of switching linear
dynamical systems. In Advances in Neural Information Processing
Systems (pp. 457-464).
• Yang, Z., Hu, Z., Salakhutdinov, R., & Berg-Kirkpatrick, T.
(2017). Improved variational autoencoders for text modeling using
dilated convolutions. arXiv preprint arXiv:1702.08139.
• Gregory, S. (2010). Finding overlapping communities in networks
by label propagation. New Journal of Physics, 12(10), 103018.
• Tran, D., Hoffman, M. D., Saurous, R. A., Brevdo, E., Murphy, K.,
& Blei, D. M. (2017). Deep probabilistic programming. arXiv
preprint arXiv:1701.03757.
• Krishnan, R. G., Shalit, U., & Sontag, D. (2017, February).
Structured Inference Networks for Nonlinear State Space Models. In
AAAI (pp. 2101-2109).
• Hamilton, W. L., Ying, R., & Leskovec, J. (2017).
Representation Learning on Graphs: Methods and Applications. arXiv
preprint arXiv:1709.05584.
• Louizos, C., Shalit, U., Mooij, J. M., Sontag, D., Zemel, R.,
& Welling, M. (2017). Causal effect inference with deep
latent-variable models. In Advances in Neural Information
Processing Systems (pp. 6449-6459).
• Chen, G. H., Nikolov, S., & Shah, D. (2013). A latent source
model for nonparametric time series classification. In Advances in
Neural Information Processing Systems (pp. 1088-1096).