R Software development - How to write and maintain 30K+ LOC in R and survive?
![Page 1: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/1.jpg)
Copyright (c) WLOG Solutions
R software development: How to write and maintain 30K+ LOC in R and survive?
Wit Jakuczun, WLOG Solutions
2017-06-20
![Page 2: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/2.jpg)
The world of analytics has changed.
![Page 3: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/3.jpg)
![Page 4: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/4.jpg)
4000x4 elastic-net models (CV-5) for a 45K x 10K dataset in 1.5 minutes!
![Page 5: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/5.jpg)
Join 21st centuRy today!
![Page 6: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/6.jpg)
What is R?
![Page 7: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/7.jpg)
Dynamic, interpreted, general-purpose programming language
![Page 8: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/8.jpg)
Stable open-source product developed by the R Foundation since about 1995.
![Page 9: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/9.jpg)
Created for data analysis.
![Page 10: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/10.jpg)
flights %>%
  group_by(year, month, day) %>%
  select(arr_delay, dep_delay) %>%
  summarise(
    arr = mean(arr_delay, na.rm = TRUE),
    dep = mean(dep_delay, na.rm = TRUE)
  ) %>%
  filter(arr > 30 | dep > 30)

z <- scaled_input %>%
  layer_convolution2D(c(5,5), 32, pad = TRUE) %>%
  layer_max_pooling(c(3,3), c(2,2)) %>%
  layer_convolution2D(c(3,3), 48) %>%
  layer_max_pooling(c(3,3), c(2,2)) %>%
  layer_convolution2D(c(3,3), 64) %>%
  layer_dense(96) %>%
  layer_dropout(0.5) %>%
  layer_dense(num_output_classes, activation = activation_softmax())
![Page 11: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/11.jpg)
R is a community.
![Page 12: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/12.jpg)
CRAN: 10K+ packages
GitHub: more and more popular
![Page 13: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/13.jpg)
http://githut.info
![Page 14: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/14.jpg)
R is really popular
![Page 15: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/15.jpg)
Tiobe Index, 2017
Estimated 2M+ users all over the world.
![Page 16: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/16.jpg)
Sounds like Python?
![Page 17: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/17.jpg)
![Page 18: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/18.jpg)
R: package reticulate
Python: package rpy2
![Page 19: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/19.jpg)
R Software Development
What is large scale?
![Page 20: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/20.jpg)
R software development vs
R scripting
![Page 21: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/21.jpg)
Large scale ~ 10K+ LOC
Small scale ~ 1K LOC
![Page 22: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/22.jpg)
[Diagram labels: package sources (CRAN (MRAN), GitHub, other); R environment; installed packages; local CRAN; source code repo]
![Page 23: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/23.jpg)
![Page 24: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/24.jpg)
R Software Development
Best practices by WLOG
![Page 25: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/25.jpg)
Always run the final test from the command line.
![Page 26: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/26.jpg)
Rscript my_script.R
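A script that is meant to be run this way usually reads its inputs from the command line rather than from an interactive session. The following is only a minimal sketch: the helper `parse_n_days`, the `main` function, and the default of 365 days are hypothetical names and values chosen for illustration, not part of the talk.

```r
# my_script.R (hypothetical): run as `Rscript my_script.R 30`

# Parse the first command-line argument as a positive integer.
parse_n_days <- function(args, default = 365L) {
  if (length(args) == 0) {
    return(default)
  }
  n <- suppressWarnings(as.integer(args[[1]]))
  if (is.na(n) || n <= 0L) {
    stop("N_days must be a positive integer, got: ", args[[1]])
  }
  n
}

main <- function(args = commandArgs(trailingOnly = TRUE)) {
  n_days <- parse_n_days(args)
  cat("Running with N_days =", n_days, "\n")
  invisible(n_days)
}

# Run only when executed at top level outside an interactive session
# (e.g. via Rscript), not when the file is source()'d from other code.
if (!interactive() && sys.nframe() == 0L) {
  main()
}
```

Keeping the argument parsing in a plain function makes it possible to test it without launching a new R process.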
![Page 27: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/27.jpg)
Put all logic into packages.
![Page 28: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/28.jpg)
Package help system
Package dependency system
External data in packages
Vignettes
Tests
![Page 29: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/29.jpg)
Use any source code version control system.
Yes, even if you are working alone. :)
![Page 30: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/30.jpg)
print is not for logging.
Forbidden
![Page 31: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/31.jpg)
logging::loginfo("Phase 1 passed")
logging::logdebug("Iter %d done", i)
logging::logwarning("Are you sure?")
logging::logerror("I failed :(")
![Page 32: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/32.jpg)
Select external packages carefully. And control their versions!
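One simple way to notice version drift is to compare installed versions against a pinned list at start-up. This is a rough sketch under stated assumptions: the `pinned` vector and the helper `find_version_mismatches` are hypothetical, and base packages are used only so the example runs on any R installation. The talk does not say how WLOG checks versions.

```r
# Hypothetical pins: package name -> exact version the project was tested with.
# The "utils" pin is deliberately wrong so the sketch reports a mismatch.
pinned <- c(
  stats = as.character(utils::packageVersion("stats")),
  utils = "0.0.1"
)

# Return the names of packages whose installed version differs from the pin.
# Note: utils::packageVersion() raises an error if a package is not installed.
find_version_mismatches <- function(pins) {
  installed <- vapply(
    names(pins),
    function(pkg) as.character(utils::packageVersion(pkg)),
    character(1)
  )
  names(pins)[installed != pins]
}

mismatches <- find_version_mismatches(pinned)
if (length(mismatches) > 0) {
  message("Version mismatch for: ", paste(mismatches, collapse = ", "))
}
```

A check like this only detects drift; reproducing an environment still needs a controlled package source such as a local repository.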
![Page 33: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/33.jpg)
data.table
![Page 34: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/34.jpg)
Use configuration files.
![Page 35: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/35.jpg)
PARAMETERS
SnapshotDate: 2015-11-01
PackagesPath: packages
LocalRepoPath: repository
ScriptPath: executionScripts
Project: XXX
ZipVersion:
Artifacts:

CONFIG
LogLevel: INFO
work_path: ../work
data_path: ../data
export_path: ../export
N_days: 365
solver_max_iterations: 10
solver_opt_horizon: 8
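Settings written as `key: value` lines, like the CONFIG example, can be read with base R's `read.dcf()`. The sketch below is an illustration only: the temporary file, the subset of keys, and the helper `load_config` are assumptions, and the talk does not state which reader is actually used.

```r
# Write a small "key: value" file to a temporary location for the demo.
config_file <- tempfile(fileext = ".txt")
writeLines(c(
  "LogLevel: INFO",
  "work_path: ../work",
  "N_days: 365",
  "solver_max_iterations: 10"
), config_file)

# read.dcf() parses "key: value" records into a character matrix,
# one row per record; a single-record file yields exactly one row.
load_config <- function(path) {
  raw <- read.dcf(path)
  stopifnot(nrow(raw) == 1)
  as.list(raw[1, ])
}

config <- load_config(config_file)

# Every value is read as a string, so convert numeric settings explicitly.
n_days <- as.integer(config$N_days)
```

Keeping such values out of the code means the same scripts can run against different environments by swapping the file.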
![Page 36: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/36.jpg)
Use standard project structure.
![Page 37: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/37.jpg)
Master scripts
Project local packages
Tests
External packages
Logs
Work
Import
Export
Configuration
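A layout like this is easy to create consistently with a small helper. The following is a sketch only: the folder names and the function `create_project_skeleton` are hypothetical stand-ins for the items listed above, not the structure WLOG actually ships.

```r
# Hypothetical folder names mirroring the items on the slide.
project_dirs <- c("R", "packages", "tests", "libs", "logs",
                  "work", "import", "export", "config")

# Create the skeleton under `root` and return the created paths invisibly.
create_project_skeleton <- function(root, dirs = project_dirs) {
  paths <- file.path(root, dirs)
  for (p in paths) {
    dir.create(p, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(paths)
}

# Demo in a temporary directory so nothing is written to the real project.
root <- file.path(tempdir(), "my_project")
create_project_skeleton(root)
print(sort(list.files(root)))
```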
![Page 38: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/38.jpg)
Automate building, deploying, testing, etc.
![Page 39: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/39.jpg)
Example Jenkins pipeline
![Page 40: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/40.jpg)
Go to hell :)
![Page 41: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/41.jpg)
“If you are using R and you think you’re in hell, this is a map for you.”
Patrick Burns, “The R Inferno”, 2011
![Page 42: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/42.jpg)
Summary
![Page 43: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/43.jpg)
A well-deployed R-based analytical platform must have the following features:
Seamless integration with existing systems and IT infrastructure
Dev/Test/Prod processes according to current software development standards
Fast development-to-production cycle
Continuous integration & deployment
Repositories: models, builds, code, dependencies, configuration
Controllable distributed job scheduling
Resource usage monitoring
Secure access control, protected password repositories
![Page 44: R Software development - How to write and maintain 30K+ LOC in R and survive?](https://reader031.fdocuments.us/reader031/viewer/2022022415/5a65052b7f8b9abb218b496f/html5/thumbnails/44.jpg)
Wit Jakuczun, PhD
WLOG R Suite™
Field-tested R ecosystem for Enterprise