6 scikit-learn - Data Tuesday 26 Feb 2013
• Library of Machine Learning models
• Simple fit / predict / transform API
• Python / NumPy / SciPy / Cython & wrappers for libsvm / liblinear
• Model Assessment, Selection & Ensembles
• Some support for multi-core
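To make the fit / predict / transform convention concrete, here is a minimal sketch, not taken from the slides. It assumes a recent scikit-learn release and uses synthetic data from make_classification; the sample counts, the number of components and the variable names are all illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.svm import SVC

# Synthetic stand-in data (assumption: the slides use no such dataset).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, n_classes=3, random_state=0
)

# Simple hold-out split: first 225 samples for training, last 75 for testing.
X_train, X_test = X[:225], X[225:]
y_train, y_test = y[:225], y[225:]

# Transformer: fit() learns the projection, transform() applies it.
pca = PCA(n_components=5).fit(X_train)
X_train_pca = pca.transform(X_train)
X_test_pca = pca.transform(X_test)

# Predictor: fit() learns from labelled data, predict() labels new data.
clf = SVC(kernel="rbf").fit(X_train_pca, y_train)
y_pred = clf.predict(X_test_pca)

print(X_test_pca.shape, y_pred.shape)
```

Both objects are configured in their constructor and trained with fit; only the method used afterwards (transform or predict) differs, which is what lets models be swapped or chained with little code.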
Sunday 24 February 2013
Possible Applications
• Text Classification / Sequence Tagging NLP
• Computer Vision / Robotics
• Learning To Rank - IR and advertising
• Statistical Analysis of the Brain: fMRI / MEG
• Astronomy, Biology, Social Sciences...
Total dataset size:
n_samples: 1288, n_features: 1850, n_classes: 7
Extracting the top 150 eigenfaces from 966 faces
done in 0.466s
Projecting the input data on the eigenfaces orthonormal basis
done in 0.056s
Fitting the SVM classifier to the training set
done in 18.549s
Predicting people's names on the test set
done in 0.062s
                   precision    recall  f1-score   support

     Ariel Sharon       0.90      0.75      0.82        12
     Colin Powell       0.78      0.94      0.85        62
  Donald Rumsfeld       0.86      0.72      0.78        25
    George W Bush       0.89      0.96      0.92       141
Gerhard Schroeder       0.92      0.74      0.82        31
      Hugo Chavez       0.90      0.53      0.67        17
       Tony Blair       0.81      0.74      0.77        34

      avg / total       0.86      0.86      0.86       322
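The figures in the report can be cross-checked with a few lines of plain Python. The sketch below recomputes each f1-score as the harmonic mean of precision and recall, and it assumes that the "avg / total" row is a support-weighted average of the per-class values; that assumption is not stated on the slide. The helper names are illustrative. The averaged f1 column is not recomputed, because two-decimal inputs are too coarse to reproduce it reliably.

```python
# (name, precision, recall, f1, support) as printed in the report.
report = [
    ("Ariel Sharon", 0.90, 0.75, 0.82, 12),
    ("Colin Powell", 0.78, 0.94, 0.85, 62),
    ("Donald Rumsfeld", 0.86, 0.72, 0.78, 25),
    ("George W Bush", 0.89, 0.96, 0.92, 141),
    ("Gerhard Schroeder", 0.92, 0.74, 0.82, 31),
    ("Hugo Chavez", 0.90, 0.53, 0.67, 17),
    ("Tony Blair", 0.81, 0.74, 0.77, 34),
]


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def weighted_average(values, weights):
    """Average of values weighted by the per-class support."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


recomputed_f1 = {name: round(f1_from(p, r), 2) for name, p, r, _, _ in report}

supports = [row[4] for row in report]
avg_precision = weighted_average([row[1] for row in report], supports)
avg_recall = weighted_average([row[2] for row in report], supports)

# The test set is what remains after the 966 training faces are set aside.
total_support = sum(supports)
n_test_from_log = 1288 - 966

for name, _, _, f1, _ in report:
    print(f"{name:>17}  reported f1={f1:.2f}  recomputed f1={recomputed_f1[name]:.2f}")
print(f"weighted precision={avg_precision:.2f}, weighted recall={avg_recall:.2f}")
print(f"test samples: {total_support} (log implies {n_test_from_log})")
```

The low recall for Hugo Chavez (0.53 on 17 test faces) next to the high recall for George W Bush (0.96 on 141) shows why the per-class rows are worth reading alongside the averaged row.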
Contributors
• GitHub-centric contribution workflow
• each pull request needs 2 x [+1] reviews
• code + tests + doc + example
• 92% test coverage / Continuous Integration
• 4 major releases per year + 4 bugfix releases
• 66 contributors for release 0.13
Users
• We support users on & ML
• 200+ questions tagged with [scikit-learn]
• Many competitors + benchmarks
• 500+ answers on ongoing user survey
• 60% academics / 40% from industry
• Some data-driven startups use sklearn
Thank you!
• http://scikit-learn.org - Main Project + doc
• @ogrisel on twitter
• http://ogrisel.com - ML Consultancy (soon)