SIMFIT SIMDEM
1 The Simfit package
SimFIT is a comprehensive free OpenSource package for simulation, model fitting, graphics
and data analysis in Windows, Linux-Wine, and Mac-Crossover, available from the website
at https://simfit.org.uk. Macros are provided to interface with Excel, and results
tables can be extracted from the log files in tab-separated or html format for incorporation
into word processors, or in LaTeX format for technical documents.
SimFIT can be used in statistical analysis (power as a function of sample size), epidemi-
ology (survival analysis), biology (growth curves), pharmacology (dose response curves),
pharmacy (pharmacokinetics), physiology (membrane transport), biochemistry (enzyme ki-
netics), biophysics (ligand binding), chemistry (chemical kinetics), and physics (dynamical
systems).
The forty individual programs are run from a driver which provides access to the reference
manual, the help program, tutorials with each program, and a mechanism for using test
files to demonstrate every procedure. A setup program installs SimFIT into a SimFIT
folder, after which a desktop shortcut can be made to the driver: w_simfit.exe in 32-bit
operating systems, or x64_simfit.exe in 64-bit operating systems. Programs are selected, then
test files are provided to demonstrate analysis of correctly formatted files before using your
own data. You can scan the results (which are saved to log files), or you can browse current
data, edit and re-run interactively, copy results to the clipboard, or create a results archive.
SimFIT is very easy to use for simple procedures like statistical testing, or elementary curve
fitting, and there are tutorials to teach students how to use it to explore data, but it is the
advanced features, like the extensive constrained nonlinear regression facilities, that will be
greatly appreciated by more experienced analysts. All users, however, will benefit from the
professional quality graph plotting procedures, which can create industry standard eps files
and collages. The eps files can be used to create pdf, svg, png, jpg, tif, or bmp retrospectively,
and there is a utility to edit eps files, or create inlays, or collages from them. Educators will
appreciate the fact that, when students come to try a procedure for the first time, there is
always a correctly formatted example data set to demonstrate that procedure.
1.1 Data format
SimFIT data files are just ASCII text tables of values, and there are SimFIT editors to format
data. These editors have extensive features to facilitate data preparation, e.g. checking
for nondecreasing order or nonnegative values; cutting/pasting rows, columns, blocks;
plotting to identify errors; highlighting extreme signal/noise ratios; assigning weights for
fitting; performing row/column arithmetic, including trig and hyperbolic functions, variance
stabilizing transformations, etc. Data copied to the clipboard from other applications can
be analyzed, and there are special macros supplied with the package to interface SimFIT
with MS Excel. SimFIT will also accept data in spreadsheet export formats such as csv,
xml, html, mht, txt, prn, or mhtml.
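As a rough illustration of the kind of checks mentioned above (this is not SimFIT code and does not reproduce the SimFIT file format), the following Python sketch parses a plain table of values and tests columns for nondecreasing order and nonnegative values. The function names and the sample table are hypothetical.

```python
def parse_table(text):
    """Parse whitespace- or comma-separated numeric rows into a list of lists."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rows.append([float(v) for v in line.replace(",", " ").split()])
    return rows


def column(rows, j):
    """Return column j of a rectangular table."""
    return [row[j] for row in rows]


def is_nondecreasing(values):
    """True if each value is at least as large as the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


def is_nonnegative(values):
    """True if no value is below zero."""
    return all(v >= 0 for v in values)


# Hypothetical table: x, y and a weighting column.
sample = """
0.0  1.20  0.1
0.5  0.95  0.1
1.0  0.71  0.1
2.0  0.40  0.1
"""
table = parse_table(sample)
x_ok = is_nondecreasing(column(table, 0))  # x values should be sorted
w_ok = is_nonnegative(column(table, 2))    # weights should not be negative
print(x_ok, w_ok)
```

Checks of this sort are cheap to run before fitting, since an unsorted x column or a negative weight usually signals a data entry error rather than a genuine observation.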
1.2 Curve fitting
There are programs for straightforward fitting, such as multi-linear, or polynomial regres-
sion, but also simple programs for nonlinear regression that automatically choose starting
estimates, then select models on the basis of goodness of fit. However, there are more
advanced programs for fitting general linear models, survival models, time series, differen-
tial equations, or sets of models, where users can control parameter limits and set starting
estimates. Models can be used from a library or can be supplied by users, and there are
many options for calibration and bioassay, as follows.
❑ Sums of exponentials (and estimation of AUC)
❑ Sums of Michaelis-Menten functions (and estimation of half saturation points and
final asymptotes)
❑ Sums of High/Low affinity binding sites
❑ Cooperative order n saturation functions (and cooperativity analysis)
❑ Positive n:n rational functions
❑ Growth curves (derivative plots, comparison of models and estimation of min/max
growth rates, half-times and final sizes)
❑ Survival curves using several models
❑ Polynomials (all degrees up to 6 then statistical tests for the best model and prediction
of x with confidence limits given y)
❑ Cubic splines (user-placed knots, automatic knots with variable tension, or
cross-validation data smoothing)
❑ Systems of differential equations (Adams, Gear, phase portrait, orbits) using defined
starting estimates and limits or random cycles to search for a global minimum
❑ Calibration curves (polynomials, splines or user selected models)
❑ Area under curve (AUC by choice from several methods)
❑ Initial rates
❑ Lag times
❑ Horizontal and inclined asymptotes
❑ Numerical deconvolution of sums of exponentials, Michaelis-Mentens, trigonometric
functions and Gaussian densities
❑ Fitting user supplied models
❑ Analyzing flow cytometry profiles
❑ After fitting functions of 1, 2 or 3 variables, parameters and objective functions can
be stored for F, Akaike, Schwarz and Mallows Cp tests, and the wssq/ndof contours
and 3D surface can be viewed as functions of any two chosen parameters. With
all functions of 1 variable, calibration, evaluation, extrapolation, area calculations,
derivative estimations and interactive error bar plots can be done.
❑ Multi-function mode: simultaneous fitting of several functions of the same indepen-
dent variables, linked by common model parameters
❑ Generalized Linear Models (GLM) can be fitted interactively with either normal,
binomial, Poisson or Gamma errors. Appropriate links can be either identity, power,
square root, reciprocal, log, logistic, probit or complementary log-log with canonical
links as defaults and a facility to supply fixed offsets. A simplified interface is
included for logistic, binary logistic or polynomial logistic regression, bioassay, log-linear
contingency analysis or survival analysis.
❑ Stratified data sets can be analyzed by Cox regression and conditional logistic regres-
sion
❑ Fitting autoregressive integrated moving average (ARIMA) models to time series, with
forecasting
❑ Facility to store parameter estimates and covariance matrices in order to compute
Mahalanobis distances between fits of the same model to different data sets and test
for significant differences in parameter estimates.
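To make one item in the list above concrete, here is a minimal Python sketch of estimating the area under a sum of exponentials. It is an illustration under stated assumptions, not the method SimFIT uses: the amplitudes and rate constants are invented, the numerical estimate uses the plain trapezoidal rule, and the exact value comes from the closed form, in which each term A exp(-kt) contributes A(1 - exp(-kT))/k over the range 0 to T.

```python
import math


def sum_exponentials(t, amplitudes, rates):
    """Evaluate f(t) = sum of A_i * exp(-k_i * t)."""
    return sum(a * math.exp(-k * t) for a, k in zip(amplitudes, rates))


def auc_trapezoid(ts, ys):
    """Area under sampled points (ts, ys) by the trapezoidal rule."""
    return sum(0.5 * (y0 + y1) * (t1 - t0)
               for t0, t1, y0, y1 in zip(ts, ts[1:], ys, ys[1:]))


def auc_exact(t_end, amplitudes, rates):
    """Closed-form area from 0 to t_end for a sum of exponentials."""
    return sum(a * (1.0 - math.exp(-k * t_end)) / k
               for a, k in zip(amplitudes, rates))


# Hypothetical two-exponential model sampled on a uniform grid.
amplitudes = [2.0, 1.0]
rates = [1.5, 0.2]
ts = [0.05 * i for i in range(401)]
ys = [sum_exponentials(t, amplitudes, rates) for t in ts]

numeric = auc_trapezoid(ts, ys)
exact = auc_exact(ts[-1], amplitudes, rates)
print(numeric, exact)
```

The trapezoidal estimate slightly exceeds the exact area here because the curve is convex, which is one reason a package may offer a choice of AUC methods.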
1.3 Statistics
All the usual descriptive statistics (bar charts, histograms, best-fit distributions on sample
cdfs, dendrograms, box and whisker or cluster plots), multivariate statistics (distance
matrices and dendrograms, principal components and scree plots), time series (ACF, PACF,
ARIMA) and frequently used tests (mostly with exact p values, not the normal
approximations) are provided, such as:
• chi-square (O/E vectors, m by n contingency tables and wssq/ndof)
• McNemar test on n by n frequency tables
• Cochran Q test
• Fisher exact (2 by 2 contingency table) with all p values
• Fisher exact Poisson distribution test
• t (both equal and unequal variances, paired and unpaired)
• variance ratio
• F for model validation
• Bartlett and Levene tests for homogeneity of variance
• 1,2,3-way ANOVA (with automatic variance stabilizing transformations and
nonparametric equivalents)
• Tukey post-ANOVA Q test
• Factorial ANOVA with marginal plots
• Repeated measures ANOVA with Helmert matrix of orthonormal contrasts, Mauchly
sphericity test and Greenhouse-Geisser/Huynh-Feldt epsilon corrections
• MANOVA with Wilks lambda, Roy’s largest root, Lawley-Hotelling trace, and Pillai
trace for equality of mean vectors, Box’s test for equality of covariance matrices, and
profile analysis for repeated measurements.
• Canonical variates and correlations for group comparisons
• Mahalanobis distance estimation for allocation to groups using estimative or predictive
Bayesian methods
• Cochran-Mantel-Haenszel 2x2xk contingency table Meta Analysis test
• Binomial test
• Sign test
• various Hotelling T-squared tests
• Goodness of fit and non parametric tests:
– runs (all, conditional, up and down)
– signs
– Wilcoxon-Mann-Whitney U
– Wilcoxon paired-samples signed-ranks
– Kolmogorov-Smirnov 1 and 2 sample
– Kruskal-Wallis
– Friedman
– Median, Mood and David tests and Kendall coefficient of concordance
– Mallows Cp, Akaike AIC, Schwarz SC, Durbin-Watson
– tables and plots of residuals, weighted residuals, deviance residuals, Anscombe
residuals, leverages and studentized residuals as appropriate
– half normal and normal residuals plots
• Multilinear regression by
– L1 norm
– L2 norm (weighted least squares)
– L∞ norm
– Robust regression (M-estimates)
– Also logistic, binary logistic, log-linear, orthogonal, reduced major axis, with
interactive selection and transformation of variables in all cases.
• Partial Least squares (PLS)
• Survival analysis (Kaplan-Meier, ML-Weibull, Mantel-Haenszel) or, using general-
ized linear models with covariates (Exponential, Weibull, Extreme value, Cox)
• Correlation analysis (Pearson product moment, Kendall tau, Spearman rank) on all
possible pairs of columns in a matrix, and canonical correlations when data columns
fall naturally into two groups. Partial correlation coefficients can be calculated for
data sets with more than two variables.
• Shapiro-Wilk normality test with the large sample correction
• Normal scores plots
• Binomial distribution, analysis of proportions, exact parameter confidence limits,
likelihood ratio, odds, odds ratios and graphical tests such as log-odds-ratios plots
with exact confidence limits for systematic variation in binomial p values
• Trinomial distribution (and confidence contour plots)
• Parameter estimates, confidence limits and goodness of fit for:
– uniform
– normal
– binomial
– Poisson
– exponential
– gamma
– beta
– lognormal
– Weibull distributions
• All possible pairwise comparisons between columns of data by KS-2, MWU and
unpaired t tests using the Bonferroni principle.
• Distance matrices for use in cluster analysis with extensive choice of pre-conditioning
transformations and alternative link functions, e.g. Canberra dissimilarity and Bray-
Curtis similarity.
• Nearest neighbors from a distance matrix
• Classical-metric and non-metric scaling of distance matrices
• Principal components with eigenvalues, scree diagrams, loadings and scores from
multivariate data sets
• Procrustes analysis to estimate the similarity between two matrices
• Varimax or Quartimax rotation of a loading matrix
• Canonical variates with eigenvalues, scree diagrams, loadings and scores from mul-
tivariate data sets. Group means can be plotted with confidence regions to assign
comparison data to existing groups.
• K-means cluster analysis with plots
• Number needed to treat (NNT) and false discovery rate (FDR)
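As an example of what an exact p value means for one of the tests listed above, the following Python sketch computes the Fisher exact test for a 2 by 2 contingency table directly from the hypergeometric distribution. It is a minimal illustration, not SimFIT's implementation: the function name is hypothetical, and the two-sided value uses one common convention (summing all tables no more probable than the observed one), which may differ from the convention used elsewhere.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Exact p values (lower tail, upper tail, two-sided) for [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)  # smallest feasible top-left count
    hi = min(row1, col1)      # largest feasible top-left count
    p_obs = prob(a)
    lower = sum(prob(x) for x in range(lo, a + 1))
    upper = sum(prob(x) for x in range(a, hi + 1))
    # Two-sided: all tables at most as probable as the observed one
    # (small tolerance guards against floating point ties).
    two_sided = sum(prob(x) for x in range(lo, hi + 1)
                    if prob(x) <= p_obs * (1.0 + 1e-9))
    return lower, upper, min(1.0, two_sided)


lower, upper, two_sided = fisher_exact_2x2(3, 1, 1, 3)
print(lower, upper, two_sided)
```

For the table [[3, 1], [1, 3]] the feasible tables have probabilities 1/70, 16/70, 36/70, 16/70 and 1/70, so the upper tail is 17/70 and the two-sided value is 34/70. No normal or chi-square approximation is involved, which is the point of an exact test for small counts.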
1.4 Calculations
Options are provided for the sort of calculations that are most often required in data
analysis, for instance singular value decomposition to find the rank of a matrix, or eigenvalue
estimation.
❍ Zeros of polynomials
❍ Zeros of a user-defined function
❍ Zeros of n nonlinear functions in n variables
❍ Integrals of n user-defined functions in m variables
❍ Convolution integrals
❍ Bound-constrained quasi-Newton optimization
❍ Eigenvalues
❍ Determinants
❍ Inverses
❍ Singular value decomposition with right and left singular vectors
❍ LU factorisation as A = PLU with matrix 1 and infinity norms and corresponding
condition numbers
❍ QR factorisation as in A = QR
❍ Cholesky factorisation as in Q = RR^T
❍ Matrix multiplication C = AB, A^T B, AB^T or A^T B^T
❍ Evaluation of quadratic forms x^T Ax or x^T A^(-1) x
❍ Solve full-rank matrix equations Ax = b
❍ Solve over-determined linear systems Ax = b in the L1, L2 or L∞ norms
❍ Solve the symmetric eigenvalue problem (A − λB)x = 0
❍ Areas, derivatives and arc lengths of user supplied functions
❍ Analysis of cooperative ligand binding (zeros of binding polynomial, Hessian, min-
max Hill slope, transformed binding constants, cooperativity indices, plotting species
fractions)
❍ Power and sample size calculations for statistical tests used in clinical trials, including
plotting power as a function of sample size (1 or 2 binomial proportions; 1, 2 or k
normal ANOVA samples; 1 or 2 correlation coefficients; 1 or 2 variances; chi-square
test)
❍ Probabilities and cdf plots for the non-central t, non-central chi-square, non-central
beta or non-central F distributions
❍ Estimation of exact parameter confidence limits for the binomial, normal, Poisson,
etc. distributions and plotting confidence contours for the trinomial distribution.
❍ Robust calculation of location parameter with confidence limits for one sample (me-
dian, trimmed and winsorized means, Hodges-Lehmann estimate, etc.).
❍ Time series smoothing by moving averages, running medians, Hanning or the 4253H-
twice smoother
❍ Time series, sample autocorrelation functions and partial autocorrelation functions
and plots for chosen numbers of lags and associated test statistics
❍ Auto- and cross-correlation matrices for two time series
❍ Shannon, Brillouin, Pielou, and Simpson diversity indices.
❍ Kernel density estimation
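The diversity indices named in the list above can be sketched in a few lines of Python. This is an illustration only: definitions vary between textbooks (natural logarithms are used here, and Simpson's index is taken as one minus the sum of squared proportions), so the values need not match those reported by SimFIT, and the function name and counts are hypothetical.

```python
from math import lgamma, log


def diversity_indices(counts):
    """Shannon, Pielou, Brillouin and Simpson indices from category counts."""
    counts = [c for c in counts if c > 0]  # empty categories carry no weight
    n = sum(counts)
    s = len(counts)
    props = [c / n for c in counts]
    shannon = -sum(p * log(p) for p in props)
    # Pielou evenness: Shannon relative to its maximum, log(s).
    pielou = shannon / log(s) if s > 1 else 0.0
    # Brillouin: (ln n! - sum of ln n_i!) / n, using lgamma(k + 1) = ln k!.
    brillouin = (lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)) / n
    # Simpson taken here as 1 - sum(p_i^2); other conventions exist.
    simpson = 1.0 - sum(p * p for p in props)
    return {"shannon": shannon, "pielou": pielou,
            "brillouin": brillouin, "simpson": simpson}


even = diversity_indices([10, 10, 10, 10])
uneven = diversity_indices([37, 1, 1, 1])
print(even)
print(uneven)
```

With four equally abundant categories the Shannon index equals log(4) and the evenness is 1, while a sample dominated by one category scores lower on every index.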
1.5 Simulation
Models can be evaluated for plotting, and exact data can be simulated and random error
added to mimic experimental situations, as now listed.
● Generating exact data from a library of models or from user-defined models
● User defined models can be multiple equations in several variables or sets of non-
linear differential equations, and can have conditional logical branching (equivalent
to if...elseif...else) for models that swap at critical values of subsidiary functions or
independent variables
● Logical operators like IF, IFNOT, AND, OR, NOT, XOR, etc. can be used to control
model execution
● Special functions that can be used within models include
• Airy functions Ai(x), Bi(x) and derivatives
• Hyperbolic and inverse hyperbolic functions cosh(x), sinh(x), tanh(x), arc-
cosh(x), arcsinh(x), arctanh(x)
• Bessel functions J0, J1, Y0, Y1, I0, I1, K0, K1
• Normal integral phi(x), error function erf(x) and complements phic(x) and
erfc(x), Dawson’s integral
• Exponential, sine and cosine integrals, E1(x), Ei(x), Si(x), Ci(x)
• Fresnel, Spence, Debye, Fermi-Dirac, Abramowitz, Clausen integrals
• Kelvin bei(x), ber(x), kei(x) and ker(x) functions
• Elliptic integrals RC, RF, RD, RJ and Jacobi functions sn(u,m), cn(u,m) and
dn(u,m)
• Binomial coefficients, gamma function, incomplete gamma function, log(gamma(x)),
digamma (psi(x)), and trigamma functions
• Struve and confluent hypergeometric functions
• Legendre polynomials of degree n and order m, spherical harmonics
• Cumulative distribution functions for the normal, t, F, chi-square, beta and
gamma distributions
• Inverse functions for the normal, t, F, chi-square, beta and gamma distributions
• Impulse functions: Heaviside, Kronecker delta, Dirac delta, triangular spike,
Gaussian
• Periodic wave impulse functions: square, rectified triangle, Morse dot, sawtooth,
rectified sine, rectified half sine, unit impulse
● One-line commands can be used for vector arithmetic (initialisation, norms, dot
products), for evaluating polynomials or Chebyshev expansions, or for calling for
mathematical constants (like Euler’s gamma)
● Submodels can be defined, and these can be called from a main model for function
evaluation, root finding, or adaptive quadrature or they can be called dynamically
with arbitrary arguments
● A one line command is all that is necessary to estimate a convolution integral for any
two sub-models over any range
● Adding random error to exact data sets to simulate experimental error
● Systems of differential equations
After simulating, selected orbits can be stored and, with autonomous systems, vector
field phase portraits can be plotted to identify singularities.
● Generating random numbers and 1,2,3-D random walks from the:
• uniform
• normal
• chi-square
• F
• logistic
• Weibull
• Cauchy
• Poisson
• binomial distributions.
● Generating n by m normally distributed random matrices.
● Generating random permutations of lists and Latin squares.
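The idea of generating exact data and then adding random error, as listed above, can be sketched as follows. This is a minimal Python illustration under stated assumptions rather than SimFIT's procedure: the Michaelis-Menten model, the parameter values, the constant relative error and the function names are all chosen for the example.

```python
import random


def michaelis_menten(x, vmax, km):
    """Exact model value v = Vmax * x / (Km + x)."""
    return vmax * x / (km + x)


def simulate(xs, model, params, rel_error, replicates, seed):
    """Return (x, noisy y, exact y) triples with Gaussian error added.

    The standard deviation is rel_error times the exact value, i.e. a
    constant relative error (an assumption made for this sketch).
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    data = []
    for x in xs:
        exact = model(x, *params)
        for _ in range(replicates):
            noisy = rng.gauss(exact, rel_error * abs(exact))
            data.append((x, noisy, exact))
    return data


xs = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
data = simulate(xs, michaelis_menten, (1.0, 1.0),
                rel_error=0.05, replicates=3, seed=12345)
for x, noisy, exact in data[:3]:
    print(x, round(noisy, 4), round(exact, 4))
```

Fitting the model back to such data, with the true parameters known, is a simple way to see how experimental error and design affect parameter estimates.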
1.6 Documentation
■ Help documents are available in several formats as follows:
• ASCII text files can be read by any text editor (e.g. Notepad), or using a SimFIT
viewer, which is safer as it does not allow editing.
• HTML documents can be viewed by a built in HTML browser.
• PostScript documents (.ps) can be read using GSview or transformed into
Portable Document Format.
• Portable Document Format (.pdf) files can be read and printed using Adobe
Acrobat reader.
• Binary files that can only be read using a built in interpreter.
■ A short HTML program can be viewed from the main SimFIT program manager,
w_simfit.exe.
■ Each program has a dedicated self contained tutorial.
■ Many of the specialized controls have individual tutorials.
■ There is an extensive set of readme files (w_readme.0 gives details) which describe
advanced features and technical details.
■ This document (simfit_summary.pdf) summarizes the package and contains collages.
■ A document (MS_office.pdf) describes the interface to MS Office and, in par-
ticular, explains how to use the macros to extract data from MS Excel spreadsheets.
■ A document (PS_fonts.pdf) lists the PostScript font encodings, for those who
want to create special effects.
■ A detailed reference manual (w_manual.pdf) is provided in .ps and .pdf formats.
The pdf version incorporates hyperlinks between the contents, index and page refer-
ences and provides book marks.
■ There is a full set of test files containing appropriate data to test every SimFIT
procedure.
1.7 Tutorials and worked examples
The SimFIT website provides a set of some 220 short documents explaining how to use the
individual programs. These are also collected together into an 800 page document called
w_examples.pdf which can also be read from the website and is available from the
SimFIT main menu.
1.8 Graphical techniques
The primary hardcopy format is encapsulated PostScript (.eps), but .eps files can be used
retrospectively to generate .pdf, .svg, .png, .jpg, .tif, or .bmp. The following procedures are available.
◆ Grouping of data into histograms (with error bars if appropriate) and cdfs
◆ Calculation of means and error bars with arbitrary confidence limits from replicates
can be done interactively or from data files
◆ Error bars can be non-symmetrical or sloping if required and multiple non-orthogonal
error bars can be plotted
◆ Error bars can be added to 2D and 3D bar charts and 3D cylinder plots
◆ Extrapolation of best-fit linear and nonlinear curves to arbitrary end points
◆ Automatic transformation of error bars into various axes (Hill, Lineweaver-Burk,
Scatchard, log-odds, etc.)
◆ Immunoassay type dilution plots using logs to base 2, 3, 4, 5, 6, 7, 8, 9 as well as e
and 10, and with labels as logs, powers of the base or fractions
◆ Multiple axes plots
◆ Pie charts with arbitrary displacements, fill-styles, colours
◆ Bar charts with arbitrary positions, sizes, fill-styles, colours and error-bars
◆ Presentation box and whisker plots, and pie or bar charts with 3D perspective effects
◆ Orbits and vector field diagrams for systems of differential equations
◆ Dendrograms and 3D-cluster plots for use in cluster analysis
◆ Scree diagrams and score or loading scatter plots for principal components analysis
(score plots can have Hotelling T2 elliptical confidence regions)
◆ 3D-surfaces and 2D-projections of contours
◆ Curves in space and projections onto planes
◆ A vast array of plotting characters and maths symbols
◆ Standard PostScript fonts, Symbol, ZapfDingbats and Isolatin1 encoding
◆ Professional quality PostScript files that can easily be edited to change titles, legends,
symbols, line-types and thicknesses, etc.
◆ Interactive PostScript facility for arbitrarily stretching and clipping overcrowded plots
such as dendrograms without changing aspect ratios of fonts or plotting symbols but
by just changing white-space between graphical objects
◆ A PostScript editor is supplied for scaling, rotating, shearing, translating, editing,
making collages and inlays from .eps files
◆ Transformation of .eps files into scalable vector graphics format (svg), portable
document format (pdf), or bitmap graphics formats (e.g. bmp, pcx, tif, jpg, png)
◆ Plotting user defined parametric equations such as r(θ), x(t), y(t) in 2-space and
x(t), y(t), z(t) in 3-space
◆ Facility to import PostScript specials automatically into the PostScript file creation
stream in order to redefine fonts, colours, plotting symbols, add logos, etc.
◆ Graphical deconvolution of summation models after fitting
◆ 2D and 3D Biplots for multivariate data sets
◆ Metafiles can be archived then replayed to continue editing
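The list above mentions non-symmetrical and sloping error bars, and their transformation into other axes. The following Python sketch shows why both arise when a point (x, y) with error limits on y is mapped into Lineweaver-Burk or Scatchard coordinates. It illustrates the arithmetic only, assuming error in y alone and positive values throughout; the function names are hypothetical and nothing here is taken from SimFIT's plotting code.

```python
def lineweaver_burk(x, y, y_lo, y_hi):
    """Map a point and its y limits into (1/x, 1/y) space.

    Taking reciprocals reverses the order of the limits, so the lower end
    of the transformed bar comes from y_hi and the upper end from y_lo.
    """
    point = (1.0 / x, 1.0 / y)
    low_end = (1.0 / x, 1.0 / y_hi)
    high_end = (1.0 / x, 1.0 / y_lo)
    return point, low_end, high_end


def scatchard(x, y, y_lo, y_hi):
    """Map a point and its y limits into (y, y/x) space.

    Both coordinates depend on y, so the bar is a sloping segment
    with slope 1/x rather than a vertical one.
    """
    point = (y, y / x)
    end_a = (y_lo, y_lo / x)
    end_b = (y_hi, y_hi / x)
    return point, end_a, end_b


# Hypothetical observation: x = 2, y = 0.5 with limits 0.4 and 0.625.
lb_point, lb_low, lb_high = lineweaver_burk(2.0, 0.5, 0.4, 0.625)
sc_point, sc_a, sc_b = scatchard(2.0, 0.5, 0.4, 0.625)
print(lb_point, lb_low, lb_high)
print(sc_point, sc_a, sc_b)
```

In the double reciprocal plot the arms of the bar have different lengths (here about 0.4 below the point and 0.5 above), and in the Scatchard plot the bar runs diagonally, which is why symmetric vertical bars cannot simply be redrawn on transformed axes.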
2 The Simdem package
SimDEM is a set of three dynamic link libraries, available with 32-bit (cdecl or stdcall
compatible) or 64-bit binaries, that can be called from any executables that support one or
other of these conventions. It provides programmers with access to the SimFIT libraries
in order to write Windows programs without having to know anything about the Windows
API.
SimDEM was originally designed for Fortran programmers, but the DLLs can be linked
to Windows executables compiled in any programming language. The package has 70
individual programs, each designed to demonstrate a particular GUI or graphical feature,
and the source code is provided so that the calling sequences can be understood. The driver
(simdem.exe or x64_simdem.exe) allows users to execute compiled examples then
view the corresponding source code to understand the calling sequences. It can also run
the SimFIT PostScript editor. Executables linked to SimDEM have access to all the SimFIT
input-output, menu, tutorial, and graphical procedures.
SimDEM is ideally suited for use with the NAG Fortran Builder and Silverfrost Plato IDEs
and it is bundled with these compilers.
32-bit programs compiled using the SimDEM library must be distributed with the following
run-time system
w_clearwin.dll, w_menus.dll, w_graphics.dll, salflibc.dll
while 64-bit versions require this run-time system
x64_clearwin.dll, x64_menus.dll, x64_graphics.dll, salflibc64.dll, clearwin64.dll.
3 Gallery
Here are examples of plotting styles, followed by collages constructed from SimFIT .eps
files. Zoom in to the graphs to appreciate that the resolution is only limited by the resolution
of the display or plotting device.
[Figure: SIMFIT 3D plot for z = f(x,y); axes X, Y, Z]
[Figure: Contours for Rosenbrock Optimization Trajectory; axes X, Y from -1.500 to 1.500; key to contours 1 to 10]
[Figure: Bray-Curtis Similarity Dendrogram; axis Percentage Similarity, 0% to 100%]
[Figure: Trinomial Parameter 95% Confidence Contours; axes px, py]
[Figure: Plotting Equations using PSfrag and LaTeX; curve of \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left\{-\frac{1}{2}\left(\frac{t-\mu}{\sigma}\right)^{2}\right\} dt against x]
[Figure: Oxidation of p-Dimethylaminomethylbenzylamine; axes Time (min), Concentration (mM); with inset reaction scheme]
[Figure: Multivariate K-means Clustering, Distribution Sites for UK Airports]
[Figure: Diffusion From a Plane Source; axes X, Y, Z]
[Figure: Rhodoneae of Abbe Grandi, r = sin(4θ); axes x, y]
[Figure: Deconvolution of 3 Gaussians; axes x, y]
[Figure: Fitting a Convolution Integral f*g; axes Time t against f(t), g(t) and f*g]
[Figure: Scatchard Plot; axes y, y/x; 1 Site Model and 2 Site Model; T = 21°C, [Ca++] = 1.3×10^-7 M]
[Figure: Cochran-Mantel-Haenszel Meta Analysis; axes log10[Odds Ratio], Group Number; Pooled Estimate]
[Figure: Data Smoothing by Cubic Splines; axes X-values, Y-values]
[Figure: Goodness of Fit to a Normal Distribution; axes Sample Values (µm), Sample Distribution Function; Sample and N(µ,σ^2) with µ = 0, σ = 1]
[Figure: Kaplan-Meier Product-Limit Survivor Estimates; axes Days from start of trial, S(t); Stage 3 and Stage 4; + indicates censored times]
[Figure: Simfit Cylinder Plot with Error Bars; Values for Case 1 to Case 5 over Month 1 to Month 7]
[Figure: Using QNFIT to fit Beta Function pdfs and cdfs; axes Random Number Values, Histogram and pdf fit, Step Curve and cdf fit]
[Figure: Phase Portrait for the Lotka-Volterra Equations; axes y(1), y(2)]
[Figure: Orbits for a System of Differential Equations; axes y(1), y(2); t Increasing]
[Figure: Pie Chart Fill Styles; Style 1 to Style 10, Pie key 1 to Pie key 10]
[Figure: Perspective Effects In Bar Charts; January to May; Ranges, Quartiles, Medians]
[Figure: Multivariate Biplot; axes x, y]
[Figure: Three Dimensional Multivariate Biplot; axes X, Y, Z]
[Figure: SIMFIT Three Dimensional Scatter Diagram; axes X, Y, Z; Type A and Type B]
[Figure: Partial Autocorrelation Function; axes Lags, PACF values; limits at 2÷√n and -2÷√n]
[Figure: ARIMA Predictions with 95% Confidence Limits; axes Time, Observations; Forecasts and Data]
[Figure: ANOVA Power and Sample Size; axes Sample Size (n), Power (%); k = no. of groups = 2, 4, 8, 16, 32]
[Figure: Using CSAFIT to analyse Flow Cytometry Histograms; axes Channel Number, Frequency; Control and Treatment]
[Figure: 95% Confidence Ellipses; axes x, y]
[Figure: Venn Diagram for the Addition Rule, P{A∪B} = P{A} + P{B} - P{A∩B}]
[Figure: Highlighting K-means cluster centroids; axes Variable 1, Variable 2; points A to T]
[Figure: Data and Best Fit Curves; axis t; 1 Exponential and 2 Exponentials]
[Figure: log10 [f(t)] against t]
[Figure: Extrapolating Double Reciprocal Inhibition Plots; axes 1/S, 1/v; I = 0, 1, 2, 3, 4]
[Figure: Best Fit Epidemic Differential Equations; axes t against y(1), y(2), y(3); Susceptible, Infected, Resistant]
[Figure: SIMFIT 3D plot for z = f(x,y); axes X, Y, Z]
[Figure: Twister Curve with Projections onto Planes; axes x(t), y(t), z(t)]
[Figure: Bivariate Normal Distribution, µx = µy = 0, σx = σy = 1, ρ = 0; axes X, Y, f(x,y)]
4 Collages
[Collage panel: Binding Curve for the 2 2 isoform at 21°C; axes Concentration of Free Ligand (µM), Ligand Bound per Mole of Protein; 1 Site Model and 2 Site Model]
[Collage panel: Scatchard Plot for the 2 2 isoform; axes y, y/x (µM^-1); T = 21°C, [Ca++] = 1.3×10^-7 M]
[Collage panel: Data Smoothing by Cubic Splines; axes X-values, Y-values]
[Collage panel: Goodness of Fit to a Normal Distribution; axes Sample Values (µm), Sample Distribution Function]
[Collage panel: Inhibition Kinetics: v = f([S],[I]); axis [S]/mM; [I] = 0, 0.5mM, 1.0mM, 2.0mM]
[Collage panel: Absorbance, Enzyme Activity and pH; axes Fraction Number, Absorbance at 280nm, Enzyme Activity (units) and pH]
[Collage panel: Plotting a Surface and Contours for z = f(x,y); axes X, Y, Z]
[Collage panel: Three Dimensional Bar Chart; January to June, 1992 to 1996]
[Collage panel: Survival Analysis; axes Time, Estimated Survivor Function; Kaplan-Meier Estimate and MLE Weibull Curve]
[Collage panel: Best Fit Line and 95% Limits; axes x, y]
[Collage panel: SIMFIT 3D plot for z = x^2 - y^2; axes X, Y, Z]
[Collage panel: \phi(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left\{-\frac{1}{2}\left(\frac{t-\mu}{\sigma}\right)^{2}\right\} dt plotted against x]
[Collage panel: Binomial Probability Plot for N = 50, p = 0.6; axes x, Pr(X = x)]
[Collage panel: Using CSAFIT for Flow Cytometry Data Smoothing; axes Channel Number, Number of Cells]
[Collage panel: x(t), y(t), z(t) curve and projection onto y = -1; axes X, Y, Z]
[Collage panel: Using GCFIT to fit Growth Curves; axes Time (weeks), Percentage of Average Final Size; MALE and FEMALE]
[Collage panel: Log Odds Plot; axes log10[p/(1 - p)], t/Weeks]
[Collage panel: ANOVA (k = no. groups, n = no. per group); axes Sample Size (n), Power (%); k = 2, 4, 8, 16, 32]
[Collage panel: A kinetic study of the oxidation of p-Dimethylaminomethylbenzylamine; axes t (min) against x(t), y(t), z(t); with inset reaction scheme and the model
\frac{d}{dt}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} -k_{+1} & k_{-1} & 0 \\ k_{+1} & (-k_{-1} - k_{+2}) & k_{-2} \\ 0 & k_{+2} & -k_{-2} \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}; \quad \begin{pmatrix} x_0 \\ y_0 \\ z_0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}]
[Collage panel: Orbits for a System of Differential Equations; axes y(1), y(2)]
[Collage panel: Trinomial Parameter 95% Confidence Regions; axes px, py]
[Collage panel: Using SIMPLOT to plot a Contour Diagram; axes X, Y; key to contours 1 to 10]
[Collage panel: Using QNFIT to fit Beta Function pdfs and cdfs; axes Random Number Values, Histogram and pdf fit, Step Curve and cdf fit]
[Collage panel: Phase Portrait for Lotka-Volterra Equations; axes y(1), y(2)]
[Collage panel: Deconvolution of 3 Gaussians; axes x, y]
[Collage panel: Illustrating Detached Segments in a Pie Chart; January to December]
[Collage panel: Box and Whisker Plot; axes Month, Range, Quartiles and Medians; January to May]
[Collage panel: Bar Chart Features; Overlapping Group, Normal Group, Stack, Hanging Group, Box/Whisker]
[Collage panel: 1-Dimensional Random Walk; axes Number of Steps, Position]
[Collage panel: 3-Dimensional Random Walk; axes X, Y, Z]
[Collage panel: K-Means Clusters]
[Collage panel: dendrogram with scale from 0% to 100%]
[Collage panel: Contours for Rosenbrock Optimization Trajectory; axes X, Y from -1.500 to 1.500; key to contours 1 to 10]
[Collage panel: Diffusion From a Plane Source; axes X, Y, Z]
[Collage panel: Simfit Cylinder Plot with Error Bars; Values for Case 1 to Case 5 over Month 1 to Month 7]
[Collage panel: Slanting and Multiple Error Bars; axes x, y]