1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with...

20
SIMF I T SIMD E M 1 The Simfit package SimF I T is a comprehensive free OpenSource package for simulation, model fitting, graphics and data analysis in Windows, Linux-Wine, and Mac-Crossover, available from the website at https://simfit.org.uk. Macros are provided to interface with Excel and results tables can be extracted from the log files in tab-separated, or html format for incorporation into word processors, or in L A T E X format for technical documents. SimF I T can be used in statistical analysis (power as a function of sample size), epidemi- ology (survival analysis), biology (growth curves), pharmacology (dose response curves), pharmacy (pharmacokinetics), physiology (membrane transport), biochemistry (enzyme ki- netics), biophysics (ligand binding), chemistry (chemical kinetics), and physics (dynamical systems). The forty individual programs are run from a driver which provides access to the reference manual, the help program, tutorials with each program, and a mechanism for using test files to demonstrate every procedure. SimF I T is installed by a setup program into a SimF I T folder, then desktop short cuts to the driver w_simfit.exe in 32-bit operating systems, or x64_simfit.exe in 64-bit operating systems, can be made. Programs are selected, then test files are provided to demonstrate analysis of correctly formatted files before using your own data. You can scan the results (which are saved to log files), or you can browse current data, edit and re-run interactively, copy results to the clipboard, or create a results archive. SimF I T is very easy to use for simple procedures like statistical testing, or elementary curve fitting, and there are tutorials to teach students how to use it to explore data, but it is the advanced features, like the extensive constrained nonlinear regression facilities, that will be greatly appreciated by more experienced analysts. All users, however, will benefit from the professional quality graph plotting procedures, which can create industry standard eps files and collages. The eps files can be used to create pdf, svg, png, jpg, tif, or bmp retrospectively, and there is a utility to edit eps files, or create inlays, or collages from them. Educators will appreciate the fact that, when students come to try a procedure for the first time, there is always a correctly formatted example data set to demonstrate that procedure. 1.1 Data format SimF I T data files are just ASCII text tables of values, and there are SimF I T editors to format data. These editors have extensive features to facilitate data preparation, e.g. checking for nondecreasing order or nonnegative values; cutting/pasting rows, columns, blocks; plotting to identify errors; highlighting extreme signal/noise ratios; assigning weights for fitting; performing row/column arithmetic, including trig and hyperbolic functions, variance stabilizing transformations, etc. Data copied to the clipboard from other applications can be analyzed, and there are special macros supplied with the package to interface SimF I T with MS Excel. SimF I T will also accept data in spreadsheet export formats such as csv, xml, html, mht, txt, prn, or mhtml. 1

Transcript of 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with...

Page 1: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

SIMFIT SIMDEM

1 The Simfit package

SimFIT is a comprehensive free OpenSource package for simulation, model fitting, graphics

and data analysis in Windows, Linux-Wine, and Mac-Crossover, available from the website

at https://simfit.org.uk. Macros are provided to interface with Excel and results

tables can be extracted from the log files in tab-separated, or html format for incorporation

into word processors, or in LATEX format for technical documents.

SimFIT can be used in statistical analysis (power as a function of sample size), epidemi-

ology (survival analysis), biology (growth curves), pharmacology (dose response curves),

pharmacy (pharmacokinetics), physiology (membrane transport), biochemistry (enzyme ki-

netics), biophysics (ligand binding), chemistry (chemical kinetics), and physics (dynamical

systems).

The forty individual programs are run from a driver which provides access to the reference

manual, the help program, tutorials with each program, and a mechanism for using test

files to demonstrate every procedure. SimFIT is installed by a setup program into a SimFIT

folder, then desktop short cuts to the driver w_simfit.exe in 32-bit operating systems, or

x64_simfit.exe in 64-bit operating systems, can be made. Programs are selected, then

test files are provided to demonstrate analysis of correctly formatted files before using your

own data. You can scan the results (which are saved to log files), or you can browse current

data, edit and re-run interactively, copy results to the clipboard, or create a results archive.

SimFIT is very easy to use for simple procedures like statistical testing, or elementary curve

fitting, and there are tutorials to teach students how to use it to explore data, but it is the

advanced features, like the extensive constrained nonlinear regression facilities, that will be

greatly appreciated by more experienced analysts. All users, however, will benefit from the

professional quality graph plotting procedures, which can create industry standard eps files

and collages. The eps files can be used to create pdf, svg, png, jpg, tif, or bmp retrospectively,

and there is a utility to edit eps files, or create inlays, or collages from them. Educators will

appreciate the fact that, when students come to try a procedure for the first time, there is

always a correctly formatted example data set to demonstrate that procedure.

1.1 Data format

SimFIT data files are just ASCII text tables of values, and there are SimFIT editors to format

data. These editors have extensive features to facilitate data preparation, e.g. checking

for nondecreasing order or nonnegative values; cutting/pasting rows, columns, blocks;

plotting to identify errors; highlighting extreme signal/noise ratios; assigning weights for

fitting; performing row/column arithmetic, including trig and hyperbolic functions, variance

stabilizing transformations, etc. Data copied to the clipboard from other applications can

be analyzed, and there are special macros supplied with the package to interface SimFIT

with MS Excel. SimFIT will also accept data in spreadsheet export formats such as csv,

xml, html, mht, txt, prn, or mhtml.

1

Page 2: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

1.2 Curve fitting

There are programs for straightforward fitting, such as multi-linear, or polynomial regres-

sion, but also simple programs for nonlinear regression that automatically choose starting

estimates, then select models on the basis of goodness of fit. However, there are more

advanced programs for fitting general linear models, survival models, time series, differen-

tial equations, or sets of models, where users can control parameter limits and set starting

estimates. Models can be used from a library or can be supplied by users, and there are

many options for calibration and bioassay, as follows.

❑ Sums of exponentials (and estimation of AUC)

❑ Sums of Michaelis-Menten functions (and estimation of half saturation points and

final asymptotes)

❑ Sums of High/Low affinity binding sites

❑ Cooperative order n saturation functions (and cooperativity analysis)

❑ Positive n:n rational functions

❑ Growth curves (derivative plots, comparison of models and estimation of min/max

growth rates, half-times and final sizes)

❑ Survival curves using several models

❑ Polynomials (all degrees up to 6 then statistical tests for the best model and prediction

of x with confidence limits given y)

❑ Cubic splines (user-placed knots, automatic knots with variable tension, or crosss-

validation data smoothing)

❑ Systems of differential equations (Adams, Gear, phase portrait, orbits) using defined

starting estimates and limits or random cycles to search for a global minimum

❑ Calibration curves (polynomials, splines or user selected models)

❑ Area under curve (AUC by choice from several methods)

❑ Initial rates

❑ Lag times

❑ Horizontal and inclined asymptotes

❑ Numerical deconvolution of sums of exponentials, Michaelis-Mentens, trigonometric

functions and Gaussian densities

❑ Fitting user supplied models

❑ Analyzing flow cytometry profiles

2

Page 3: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

❑ After fitting functions of 1, 2 or 3 variables, parameters and objective functions can

be stored for F, Akaike, Schwarz and Mallows Cp tests, and the wssq/ndof contours

and 3D surface can be viewed as functions of any two chosen parameters. With

all functions of 1 variable, calibration, evaluation, extrapolation, area calculations,

derivative estimations and interactive error bar plots can be done.

❑ Multi-function mode: simultaneous fitting of several functions of the same indepen-

dent variables, linked by common model parameters

❑ Generalized Linear Models (GLM) can be fitted interactively with either normal,

binomial, Poisson or Gamma errors. Appropriate links can be either identity, power,

square root, reciprocal, log, logistic, probit or complementary log-log with canonical

links as defaults and facility to supply fixed offsets. A simplified interface is in-

cluded for logistic, binary logistic or polynomial logistic regression, bioassy, log-lin

contingency analysis or survival analysis.

❑ Stratified data sets can be analyzed by Cox regression and conditional logistic regres-

sion

❑ Autoregressive integrated moving average models (ARIMA) to time series with fore-

casting

❑ Facility to store parameter estimates and covariance matrices in order to compute

Mahalanobis distances between fits of the same model to different data sets and test

for significant differences in parameter estimates.

1.3 Statistics

All the usual descriptive statistics (bar charts, histograms, best-fit distributions on sample

cdfs, dendrograms, box and whisker or cluster plots), multivariate statistics (distance ma-

trices and dendrograms, principal components and scree plots), time series (ACF, PACF,

ARIMA) and frequently used tests (mostly with exact p values not the normal approxima-

tions), such as:

• chi-square (O/E vectors, m by n contingency tables and wssq/ndof)

• McNemar test on n by n frequency tables

• Cochran Q test

• Fisher exact (2 by 2 contingency table) with all p values

• Fisher exact Poisson distribution test

• t (both equal and unequal variances, paired and unpaired)

• variance ratio

• F for model validation

3

Page 4: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

• Bartlett and Levene tests for homogeneity of variance

• 1,2,3-way Anova (with automatic variance stabilizing transformations and nonpara-

metric equivalents)

• Tukey post-ANOVA Q test

• Factorial ANOVA with marginal plots

• Repeated measures ANOVA with Helmert matrix of orthonormal contrasts, Mauchly

sphericity test and Greenhouse-Geisser/Huyn-Feldt epsilon corrections

• MANOVA with Wilks lambda, Roy’s largest root, Lawley-Hotelling trace, and Pillai

trace for equality of mean vectors, Box’s test for equality of covariance matrices, and

profile analysis for repeated measurements.

• Canonical variates and correlations for group comparisons

• Mahalanobis distance estimation for allocation to groups using estimative or predictive

Bayesian methods

• Cochran-Mantel-Haenszel 2x2xk contingency table Meta Analysis test

• Binomial test

• Sign test

• various Hotelling T-squared tests

• Goodness of fit and non parametric tests:

– runs (all, conditional, up and down)

– signs

– Wilcoxon-Mann-Whitney U

– Wilcoxon paired-samples signed-ranks

– Kolmogorov-Smirnov 1 and 2 sample

– Kruskal-Wallis

– Friedman

– Median, Mood and David tests and Kendall coefficient of concordance

– Mallows Cp, Akaike AIC, Schwarz SC, Durbin-Watson

– tables and plots of residuals, weighted residuals, deviance residuals, Anscombe

residuals, leverages and studentized residuals as appropriate

– half normal and normal residuals plots

• Multilinear regression by

4

Page 5: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

– L1 norm

– L2 norm (weighted least squares)

– L∞ norm

– Robust regression (M-estimates)

– Also logistic, binary logistic, log-linear, orthogonal, reduced major axis, with

interactive selection and transformation of variables in all cases.

• Partial Least squares (PLS)

• Survival analysis (Kaplan-Meier, ML-Weibull, Mantel-Haenszel) or, using general-

ized linear models with covariates (Exponential, Weibull, Extreme value, Cox)

• Correlation analysis (Pearson product moment, Kendal Tau, Spearman Rank) on all

possible pairs of columns in a matrix, and canonical correlations when data columns

fall naturally into two groups. Partial correlation coefficients can be calculated for

data sets with more than two variables.

• Shapiro-Wilks normality test with the large sample correction

• Normal scores plots

• Binomial distribution, analysis of proportions, exact parameter confidence limits,

likelihood ratio, odds, odds ratios and graphical tests such as log-odds-ratios plots

with exact confidence limits for systematic variation in binomial p values

• Trinomial distribution (and confidence contour plots)

• Parameter estimates, confidence limits and goodness of fit for:

– uniform

– normal

– binomial

– Poisson

– exponential

– gamma

– beta

– lognormal

– Weibull distributions

• All possible pairwise comparisons between columns of data by KS-2, MWU and

unpaired t tests using the Bonferroni principle.

• Distance matrices for use in cluster analysis with extensive choice of pre-conditioning

transformations and alternative link functions, e.g. Canberra dissimilarity and Bray-

Curtis similarity.

5

Page 6: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

• Nearest neighbors from a distance matrix

• Classical-metric and non-metric scaling of distance matrices

• Principal components with eigenvalues, scree diagrams, loadings and scores from

multivariate data sets

• Procrustes analysis to estimate the similarity between two matrices

• Varimax or Quartimax rotation of a loading matrix

• Canonical variates with eigenvalues, scree diagrams, loadings and scores from mul-

tivariate data sets. Group means can be plotted with confidence regions to assign

comparison data to existing groups.

• K-means cluster analysis with plots

• Number needed to treat (NNT) and false discovery rate (FDR)

1.4 Calculations

Options are provided for the sort of calculations that are most often required inn data

analysis, for instance singular value decomposition to find the rank of a matrix, or eigenvalue

estimation.

❍ Zeros of polynomials

❍ Zeros of a user-defined function

❍ Zeros of n nonlinear functions in n variables

❍ Integrals of n user-defined functions in m variables

❍ Convolution integrals

❍ Bound-constrained quasi-Newton optimization

❍ Eigenvalues

❍ Determinants

❍ Inverses

❍ Singular value decomposition with right and left singular vectors

❍ LU factorisation as A = PLU with matrix 1 and infinity norms and corresponding

condition numbers

❍ QR factorisation as in A = QR

❍ Cholesky factorisation as in Q = RRT

6

Page 7: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

❍ Matrix multiplication C = AB, AT B, ABT or AT BT

❍ Evaluation of quadratic forms xT Ax or xT A−1x

❍ Solve full-rank matrix equations Ax = b

❍ Solve over-determined linear systems Ax = b in the L1, L2 or L∞ norms

❍ Solve the symmetric eigenvalue problem (A− λB)x = 0

❍ Areas, derivatives and arc lengths of user supplied functions

❍ Analysis of cooperative ligand binding (zeros of binding polynomial, Hessian, min-

max Hill slope, transformed binding constants, cooperativity indices, plotting species

fractions)

❍ Power and sample size calculations for statistical tests used in clinical trials, including

plotting power as a function of sample size (1 or 2 binomial proportions; 1, 2 or k

normal ANOVA samples; 1 or 2 correlation coefficients; 1 or 2 variances; chi-square

test)

❍ Probabilities and cdf plots for the non-central t, non-central chi-square, non-central

beta or non-central F distributions

❍ Estimation of exact parameter confidence limits for the binomial, normal, Poisson,

etc. distributions and plotting confidence contours for the trinomial distribution.

❍ Robust calculation of location parameter with confidence limits for one sample (me-

dian, trimmed and winsorized means, Hodges-Lehmann estimate, etc.).

❍ Time series smoothing by moving averages, running medians, Hanning or the 4253H-

twice smoother

❍ Time series, sample autocorrelation functions and partial autocorrelation functions

and plots for chosen numbers of lags and associated test statistics

❍ Auto- and cross-correlation matrices for two time series

❍ Shannon, Brilloin, Pielou, and Simpson diversity indices.

❍ Kernel density estimation

1.5 Simulation

Models can be evaluated for plotting, and exact data can be simulated and random error

added to mimic experimental situations, as now listed.

● Generating exact data from a library of models or from user-defined models

7

Page 8: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

● User defined models can be multiple equations in several variables or sets of non-

linear differential equations, and can have conditional logical branching (equivalent

to if...elseif...else) for models that swap at critical values of subsidiary functions or

independent variables

● Logical operators like IF, IFNOT, AND, OR, NOT, XOR, etc. can be used to control

model execution

● Special functions that can be used within models include

• Airy functions Ai(x), Bi(x) and derivatives

• Hyperbolic and inverse hyperbolic functions cosh(x), sinh(x), tanh(x), arc-

cosh(x), arcsinh(x), arctanh(x)

• Bessel functions J0, J1, Y0, Y1, I0, I1, K0, K1

• Normal integral phi(x), error function erf(x) and complements phic(x) and

erfc(x), Dawson’s integral

• Exponential, sine and cosine integrals, E1(x), Ei(x), Si(x), Ci(x)

• Fresnel, Spence, Debye, Fermi-Dirac, Abramovitz, Clausen integrals

• Kelvin bei(x), ber(x), kei(x) and ker(x) functions

• Elliptic integrals RC, RF, RD, RJ and Jacobi functions sn(u,m), cn(u,m) and

dn(u,m)

• Binomial coefficients, gamma function, incomplete gamma function, log(gamma(x)),

digamma (psi(x)), and trigamma functions

• Struve confluent hypergeometric functions

• Legendre polynomials of degree n and order m, spherical harmonics

• Cumulative distribution functions for the normal, t, F, chi-square, beta and

gamma distributions

• Inverse functions for the normal, t, F, chi-square, beta and gamma distributions

• Impulse functions: Heaviside, Kronecker delta, Dirac delta, triangular spike,

Gaussian

• Periodic wave impulse functions: square, rectified triangle, Morse dot, sawtooth,

rectified sine, rectified half sine, unit impulse

● One-line commands can be used for vector arithmetic (initialisation, norms, dot

products), for evaluating polynomials or Chebyshev expansions, or for calling for

mathematical constants (like Euler’s gamma)

● Submodels can be defined, and these can be called from a main model for function

evaluation, root finding, or adaptive quadrature or they can be called dynamically

with arbitrary arguments

● A one line command is all that is necessary to estimate a convolution integral for any

two sub-models over any range

8

Page 9: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

● Adding random error to exact data sets to simulate experimental error

● Systems of differential equations

After simulating, selected orbits can be stored and, with autonomous systems, vector

field phase portraits can be plotted to identify singularities.

● Generating random numbers and 1,2,3-D random walks from the:

• uniform

• normal

• chi-square

• F

• logistic

• Weibull

• Cauchy

• Poisson

• binomial distributions.

● Generating n by m normally distributed random matrices.

● Generating random permutations of lists and Latin squares.

1.6 Documentation

■ Help documents are available in several formats as follows:

• ASCII text files can be read by any text editor (e.g. Notepad), or using a SimFIT

viewer, which is safer as it does not allow editing.

• HTML documents can be viewed by a built in HTML browser.

• PostScript documents (.ps) can be read using GSview or transformed into

Portable Document Format.

• Portable Document Format (.pdf) files can be read and printed using Adobe

Acrobat reader.

• Binary files that can only be read using a built in interpreter.

■ A short HTML program can be viewed from the main SimFIT program manager,

w_simfit.exe.

■ Each program has a dedicated self contained tutorial.

■ Many of the specialized controls have individual tutorials.

■ There is an extensive set of readme files (w_readme.0 gives details) which describe

advanced features and technical details.

9

Page 10: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

■ This document (simfit_summary.pdf) summarizes the package and contains collages.

■ A document (MS_office.pdf) describes the interface to MS Office and, in par-

ticular, explains how to use the macros to extract data from MS Excel spreadsheets.

■ A document (PS_fonts.pdf) lists the PostScript font encodings, for those who

want to create special effects.

■ A detailed reference manual (w_manual.pdf) is provided in .ps and .pdf formats.

The pdf version incorporates hyperlinks between the contents, index and page refer-

ences and provides book marks.

■ There is a full set of test files containing appropriate data to test every SimFIT

procedure.

1.7 Tutorials and worked examples

The SimFIT website provides a set of some 220 short documents explaining how to use the

individual programs. These are also collected together into an 800 page document called

w_examples.pdf which can also be read from the website and is available from the

SimFIT main menu.

1.8 Graphical techniques

The primary hardcopy is encapsulated PostScript (.eps) but these can be used retrospectively

to generate .pdf, .svg, .png, .jpg, .tif, or .bmp. The following procedures are available.

◆ Grouping of data into histograms (with error bars if appropriate) and cdfs

◆ Calculation of means and error bars with arbitrary confidence limits from replicates

can be done interactively or from data files

◆ Error bars can be non-symmetrical or sloping if required and multiple non-orthogonal

error bars can be plotted

◆ Error bars can be added to 2D and 3D bar charts and 3D cylinder plots

◆ Extrapolation of best-fit linear and nonlinear curves to arbitrary end points

◆ Automatic transformation of error bars into various axes (Hill, Lineweaver-Burk,

Scatchard, log-odds, etc.)

◆ Immunoassay type dilution plots using logs to base 2, 3, 4, 5, 6, 7, 8, 9 as well as e

and 10, and with labels as logs, powers of the base or fractions

◆ Multiple axes plots

◆ Pie charts with arbitrary displacements, fill-styles, colours

10

Page 11: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

◆ Bar charts with arbitrary positions, sizes, fill-styles, colours and error-bars

◆ Presentation box and whisker plots, and pie or bar charts with 3D perspective effects

◆ Orbits and vector field diagrams for systems of differential equations

◆ Dendrograms and 3D-cluster plots for use in cluster analysis

◆ Scree diagrams and score or loading scatter plots for principal components analysis

(score plots can have Hotelling T2 elliptical confidence regions)

◆ 3D-surfaces and 2D-projections of contours

◆ Curves in space and projections onto planes

◆ vast array of plotting characters and maths symbols

◆ Standard PostScript fonts, Symbol, ZapfDingbats and Isolatin1 encoding

◆ Professional quality PostScript files that can easily be edited to change titles, legends,

symbols, line-types and thicknesses, etc.

◆ Interactive PostScript facility for arbitrarily stretching and clipping overcrowded plots

such as dendrograms without changing aspect ratios of fonts or plotting symbols but

by just changing white-space between graphical objects

◆ A PostScript editor is supplied for scaling, rotating, shearing, translating, editing,

making collages and inlays from .eps files

◆ Transformation of .eps files into scalable vector graphics format (svg) or compressed

bitmap graphics formats (e.g. bmp, pcx, tif, jpg, png, pdf)

◆ Plotting user defined parameteric equations such as r (θ), x(t), y(t) in 2-space and

x(t),y(t),z(t) in 3-space

◆ Facility to import PostScript specials automatically into the PostScript file creation

stream in order to redefine fonts, colours, plotting symbols, add logos, etc.

◆ Graphical deconvolution of summation models after fitting

◆ 2D and 3D Biplots for multivariate data sets

◆ Metafiles can be archived then replayed to continue editing

11

Page 12: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

2 The Simdem package

SimDEM is a set of three dynamic link libraries, available with 32-bit (cdecl or stdcall

compatible) or 64-bit binaries, that can be called from any executables that support one or

other of these conventions. It provides programmers with access to the SimFIT libraries

on order to wrote Windows programs without having to know anything about the Windows

API.

SimDEM was originally designed for Fortran programmers, but the DLLs can be linked

to Windows executables compiled in any programming language. The package has 70

individual programs, each designed to demonstrate a particular GUI or graphical feature,

and the source code is provided so that the calling sequences can be understood. The driver

(simdem.exe or x64_simdem.exe) allows users to execute compiled examples then

view the corresponding source code to understand the calling sequences. It can also run

the SimFIT PostScript editor. Executables linked to SimDEM have access to all the SimFIT

input-output, menu, tutorial, and graphical procedures.

SimDEM is ideally suited for use with the NAG Fortran Builder and Silverfrost Plato IDEs

and it is bundled with these compilers.

32-bit programs compiled using the SimDEM library must be distributed with the following

run–time system

w_clearwin.dllw_menus.dllw_graphics.dllsalflibc.dll

while 64-bit versions require this run–time system

x64_clearwin.dllx64_menus.dllx64_graphics.dllsalflibc64.dllclearwin64.dll.

12

Page 13: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

3 Gallery

Here are examples of plotting styles, followed by collages constructed from SimFIT .eps

files. Zoom in to the graphs to appreciate that the resolution is only limited by the resolution

of the display or plotting device.

SIMFIT 3D plot for z = f(x,y)

XY

Z

1.000

0.000

1.000

0.0000

10

Contours for Rosenbrock Optimization Trajectory

X

Y-1.500

1.500

1.500-1.500

Key Contour 1 1.425 2 2.838 3 5.663 4 11.313 5 22.613 6 45.212 7 90.412 8 1.808×102

9 3.616×102

10 7.232×102

1

2

2

3

3

4

4

5

56

6

7

7

8

8

9 9

10 10

100%

80%

60%

40%

20%

0%

PC1PC2

PC5PC8

PC6HC8

PC3PC4

PC7HC7

HC424A

33B76B

30B100A

34

53A

76

30A

61B60A

27A27B

52

37B

68

28A

97A26A

60B29

36A36B

31B31A

35B32A

32B35A

72A72B

99A99B

37A47

100B33A

53B73

24B26B

28B97B

91A91B

25A25B

61AHC5

HC6

Per

cent

age

Sim

ilarit

y

Bray-Curtis Similarity Dendrogram

0.15

0.25

0.35

0.45

0.55

0.65

0.75

0.85

0.05 0.15 0.25 0.35 0.45 0.55 0.65 0.75

Trinomial Parameter 95% Confidence Contours

px

p y

7,11,12

70,110,20

210,330,60

9,9,2

90,90,20

270,270,60

0.50

1.00

-4 -2 0 2

x

Plotting Equationsusing PSfrag and

1

σ√

∫ x

−∞exp

−1

2

t − µ

σ

2

dt

LATEX

0

1

2

0 1 2 3 4 5

Oxidation of p-Dimethylaminomethylbenzylamine

Time (min)

Con

cent

ratio

n (m

M)

✧✧ ❜❜

❜❜ ✧✧❜❜

✧✧

CH2NH2

CH2N(Me)2

✲[O] ✧✧ ❜❜

❜❜ ✧✧❜❜

✧✧

CH=0

CH2N(Me)2

+NH3✲

[O] ✧✧ ❜❜

❜❜ ✧✧❜❜

✧✧

C02H

CH2N(Me)2

13

Page 14: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

MultivariateK-meansClustering Distribution Sites

for UK Airports

Diffusion From a Plane Source

3

-3

1.25

0.00-2

-10

12

0.250.50

0.751.00

0.0

1.6

0.4

0.8

1.2

Z

XY

-1

1

-1 1

Rhodoneae of Abbe Grandi, r = sin(4 )

x

y

0.000

0.100

0.200

0.300

0.400

0.500

-3.0 1.5 6.0 10.5 15.0

Deconvolution of 3 Gaussians

x

y

0.00

0.50

1.00

0 1 2 3 4 5

Fitting a Convolution Integral f*g

Time t

f(t)

, g(t

) an

d f*

g

f(t) = exp(- t)

g(t) = 2 t exp(- t)

f*g

0.0

0.2

0.4

0.6

0.8

1.0

0.0 1.0 2.0

Scatchard Plot

y

y/x 1 Site Model

2 Site Model

T = 21°C[Ca++] = 1.3×10-7M

0

2

4

6

8

0 1

Cochran-Mantel-Haenszel Meta Analysis

log10[Odds Ratio]

Gro

up N

umbe

r

Pooled Estimate 0.00

0.25

0.50

0.75

1.00

0 1 2 3 4 5 6

Data Smoothing by Cubic Splines

X-values

Y-v

alu

es

14

Page 15: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

0.00

0.25

0.50

0.75

1.00

-2.50 -1.25 0.00 1.25 2.50

GOODNESS OF FIT TO A NORMAL DISTRIBUTION

Sample Values (µm)

Sam

ple

Dis

trib

utio

n F

unct

ion

Sample

N(µ,σ2)µ = 0σ = 1

0.00

0.25

0.50

0.75

1.00

0 50 100 150 200 250 300 350

Kaplan-Meier Product-Limit Survivor Estimates

Days from start of trial

S(t

)

Stage 3

Stage 4

+ indicates censored times

Values

Month 7Month 6

Month 5Month 4

Month 3Month 2

Month 1

Case 1Case 2

Case 3Case 4

Case 5

0

11

Simfit Cylinder Plot with Error Bars

0.0

10.0

20.0

0.00 0.25 0.50 0.75 1.000.00

0.20

0.40

0.60

0.80

1.00

Using QNFIT to fit Beta Function pdfs and cdfs

Random Number Values

His

togr

am a

nd p

df f

it

Step Curve and cdf fit

0

1

2

0 1 2

Phase Portrait for the Lotka-Voterra Equations

y(2)

y(1)

-0.50

0.00

0.50

1.00

1.50

2.00

2.50

-1.25 0.00 1.25

Orbits for a System of Differential Equations

y(2)

y(1)

t Increasing

SIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFIT

Style 1

Style 2

Style 3

Style 4

Style 5

Style 6

Style 7

Style 8

Style 9

Style 10

Pie Chart Fill Styles

Pie key 1

Pie key 2

Pie key 3

Pie key 4

Pie key 5

Pie key 6

Pie key 7

Pie key 8

Pie key 9

Pie key 10

SIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFITSIMFIT

-2.50

0.00

2.50

5.00

7.50

January

February

March

April

May

Perspective Effects In Bar Charts

Ran

ges,

Qua

rtile

s, M

edia

ns

15

Page 16: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

-80

-60

-40

-20

0

20

40

-60 -40 -20 0 20 40 60

x

y

ToiletKitchen

Bath

Electricity

Water

Radio

TV set

Refrigerator

ChristianArmenian

Jewish

MoslemAm.Colony Sh.Jarah

Shaafat Bet-Hanina

A-Tur Isawiye

Silwan Abu-Tor

Sur-Bahar Bet-Safafa

Mulivariate Biplot Three Dimensional Multivariate Biplot

-.4

.6

0

-1-.4

.35

Toilet KitchenBath

Electricity

Water

Radio

TV set

Refrigerator

Christian

Armenian

Jewish

Moslem

Am.Colony Sh.Jarah

Shaafat Bet-Hanina

A-Tur Isawyie

Silwan Abu-Tor

Sur-Bahar Bet-SafafaXY

Z

SIMFIT Three Dimensional Scatter Diagram

0

10

10

02

46

8 24

68

2

8

3

4

5

6

7

A1A2

A3A4

A5

B1

B2

B3B4

B5

XY

Z

Type A

Type B

-0.30

0.00

0.30

0.60

0 2 4 6 8 10

Partial Autocorrelation Function

Lags

PA

CF

val

ues

2÷√n

-2÷√n

0

20

40

60

0 20 40 60 80 100 120

ARIMA Predictions with 95% Confidence Limits

Time

Obs

erva

tions

Forecasts

Data

0

20

40

60

80

100

0 20 40 60 80

ANOVA Power and Sample Size

Sample Size (n)

Pow

er (

%)

k =

2k

= 4

k =

8k

= 16

k = 32

k = no. of groups2 = 1 (variance) = 1 (difference)

0

200

400

600

0 50 100 150 200 250

Using CSAFIT to analyse Flow Cytometry Histograms

Channel Number

Fre

quen

cy

ControlTreatment

-16

-8

0

8

16

-16 -8 0 8 16

95% Confidence Ellipses

x

y

16

Page 17: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

A

A∩B

A∩B

B∩A

B

Venn Diagram for the Addition Rule

P{A∪ B} = P{A} + P{B} - P{A∩B} 10

20

30

40

40 60 80 100

Highlighting K-means cluster centroids

Variable 1

Var

iabl

e 2

A

B

C

D

E

F

G

HI

JK

L

M

N

O

P

Q

RS

T

(47.8, 35.8, 16.3, 2.4, 6.7)

(64.0, 25.2, 10.7, 2.8, 6.7)

(81.2, 12.7, 7.2, 2.1, 6.6)

0

1

2

0 1 2

t

Dat

a an

d B

est F

it C

urve

s

1 Exponential

2 Exponentials

-0.75

-0.25

0.25

0 1 2t

log 1

0 [f(

t)]

log10 [f(t)] against t

-5

15

25

35

45

-5 5 10 15 20

Extrapolating Double Reciprocal Inhibition Plots

1/S

1/v

I = 0

I = 1

I = 2

I = 3

I = 4

0

200

400

600

800

1000

1200

0 2 4 6 8 10

Best Fit Epidemic Differential Equations

t

y(1)

, y(2

), y

(3)

SusceptibleInfectedResistant

SIMFIT 3D plot for z = f(x,y)

XY

Z

1

0

1

0

0

1

Twister Curve with Projections onto Planes

-20

20

-20

20

-10

0

10

-10

0

10

0

400

100

200

300

z(t)

x(t)y(t)

-3-2

-10

12

3

-3-2

-10

12

3

0.0

0.15915

XY

Bivariate Normal Distributionµx = µy = 0, σx = σy = 1, ρ = 0

f(x,y)

17

Page 18: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

4 Collages

0.00

0.50

1.00

1.50

2.00

0 10 20 30 40 50

Binding Curve for the 2 2 isoform at 21 C

Concentration of Free Ligand(µM)

Lig

and

Bou

nd p

er M

ole

of P

rote

in

1 Site Model

2 Site Model

0.00

0.25

0.50

0.75

1.00

0.00 0.50 1.00 1.50 2.00

Scatchard Plot for the 2 2 isoform

y

y/x

(µM

-1)

1 Site Model

2 Site Model

T = 21°C[Ca++] = 1.3×10-7M

0.00

0.25

0.50

0.75

1.00

0 1 2 3 4 5 6

Data Smoothing by Cubic Splines

X-values

Y-v

alu

es

0.00

0.25

0.50

0.75

1.00

-2.50 -1.25 0.00 1.25 2.50

GOODNESS OF FIT TO A NORMAL DISTRIBUTION

Sample Values (µm)

Sam

ple

Dis

trib

utio

n F

unct

ion

Sample

N(µ,σ2)µ = 0σ = 1

Inhibition Kinetics: v = f([S],[I])

[S]/mM

v([S

],[I

])/

M.m

in-1

0

20

40

60

80

100

0 20 40 60 80

[I] = 0

[I] = 0.5mM

[I] = 1.0mM

[I] = 2.0mM

Absorbance, Enzyme Activity and pH

Fraction Number

Ab

sorb

ance

at

280n

m

En

zyme A

ctivity (un

its) and

pH

0.00

0.25

0.50

0.75

1.00

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

0.00

2.00

4.00

6.00

8.00

Absorbance Enzyme Units pH of Eluting Buffer

Plotting a Surface and Contours for z = f(x,y)

XY

Z

1.000

0.000

1.000

0.000

0

1

Three Dimensional Bar Chart

January

June

1996

1992May

AprilMarch

February

19931994

1995

0%

100%

50%

0.00

0.50

1.00

0 5 10 15 20 25

Survival Analysis

Time

Est

imat

ed S

urvi

vor

Func

tion

Kaplan-Meier Estimate

MLE Weibull Curve

Best Fit Line and 95% Limits

x

y

0

5

10

15

0 2 4 6 8 10

SIMFIT 3D plot for z = x2 - y2

XY

Z

1

-1

1

-1-1

1

0.25

0.50

0.75

1.00

-3 -2 -1 0 1 2

x

φ(x) =1

σp

Z x

�∞exp

(

12

t�µσ

�2)

dt

0.000

0.020

0.040

0.060

0.080

0.100

0.120

0 10 20 30 40 50

Binomial Probability Plot for N = 50, p = 0.6

x

Pr(X

= x

)

0

100

200

300

400

500

0 50 100 150 200 250

Using CSAFIT for Flow Cytometry Data Smoothing

Channel Number

Num

ber

of C

ells

x(t), y(t), z(t) curve and projection onto y = - 1

XY

Z

1.000

-1.000

1.000

-1.0000.000

1.000

0

25

50

75

100

125

0 2 4 6 8 10

Time (weeks)

Perc

enta

ge o

f A

vera

ge F

inal

Siz

e

MALE

FEMALE

Using GCFIT to fit Growth Curves

1

2

3

4

5

-1.50 -1.00 -0.50 0.00 0.50 1.00 1.50

Log Odds Plot

log10[p/(1 - p)]

t/W

eeks

0

20

40

60

80

100

0 20 40 60 80

ANOVA (k = no. groups, n = no. per group)

Sample Size (n)

Pow

er (

%)

k =

2k

= 4

k =

8k

= 16

k = 32

2 = 1 (variance) = 1 (difference)

18

Page 19: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

0.00

0.20

0.40

0.60

0.80

1.00

1.20

0.00 1.00 2.00 3.00 4.00 5.00

t (min)

x(t)

, y(t

), z

(t)

x(t)

y(t)

z(t)

A kinetic study of the oxidation of p-Dimethylaminomethylbenzylamine

"

"

b

b

b

b

"

"b

b

"

"

CH2NH2

CH2N(Me)2

-

[O] "

"

b

b

b

b

"

"b

b

"

"

CH=0

CH2N(Me)2

+ NH3-

[O] "

"

b

b

b

b

"

"b

b

"

"

C02H

CH2N(Me)2

ddt

0

@

xyz

1

A

=

0

@

�k+1 k

�1 0k+1 (�k

�1�k+2) k

�2

0 k+2 �k

�2

1

A

0

@

xyz

1

A

;

0

@

x0

y0

z0

1

A

=

0

@

100

1

A

-0.50

0.00

0.50

1.00

1.50

2.00

2.50

-1.25 0.00 1.25

Orbits for a System of Differential Equations

y(2)y(

1)

0.15

0.25

0.35

0.45

0.55

0.65

0.75

0.85

0.05 0.15 0.25 0.35 0.45 0.55 0.65 0.75

Trinomial Parameter 95% Confidence Regions

px

p y

7,11,2

70,110,20

210,330,60

9,9,290,90,20

270,270,60

Using SIMPLOT to plot a Contour Diagram

X

Y

1.000

0.000

1.0000.000

Key Contour 1 9.025×10-2

2 0.181 3 0.271 4 0.361 5 0.451 6 0.542 7 0.632 8 0.722 9 0.812 10 0.903

1

1

2

2

2

2

3

3

3

4

4

4

5

5

5

6

7

89

10

0.0

10.0

20.0

0.00 0.25 0.50 0.75 1.000.00

0.20

0.40

0.60

0.80

1.00

Using QNFIT to fit Beta Function pdfs and cdfs

Random Number Values

His

togr

am a

nd p

df f

it

Step Curve and cdf fit

-1

0

1

2

-1 0 1 2

Phase Portrait for Lotka-Volterra Equations

y(2)

y(1)

0.000

0.100

0.200

0.300

0.400

0.500

-3.0 1.5 6.0 10.5 15.0

Deconvolution of 3 Gaussians

x

y

Illustrating Detached Segments in a Pie Chart

January

February

March

April

May

June

July

August

SeptemberOctober

November

December

Box and Whisker Plot

Month

Ran

ge,

Qu

arti

les

and

Med

ian

s

-2.00

0.25

2.50

4.75

7.00

Janu

ary

Feb

ruar

y

Mar

ch

Apr

il

May

Bar Chart Features

Overlapping Group

Normal Group

Stack

Hanging Group

Box/Whisker

55%

0%

-35%-3

0

3

6

9

0 10 20 30 40 50

1-Dimensional Random Walk

Number of Steps

Posi

tion

3-Dimensional Random Walk

XY

Z

1

-11

11

-1-11

1

19

Page 20: 1 The Simfit package - SimFiT · Generalized Linear Models (GLM) can be fitted interactively with either normal, binomial, Poisson or Gamma errors. Appropriate links can be either

K-Means Clusters

100%

80%

60%

40%

20% 0%

PC1PC2PC5PC8PC6HC8PC3PC4PC7HC7HC424A33B76B30B

100A34

53A76

30A61B60A27A27B

5237B

6828A97A26A60B

2936A36B31B31A35B32A32B35A72A72B99A99B37A

47100B33A53B

7324B26B28B97B91A91B25A25B61AHC5HC6

Contours for Rosenbrock Optimization Trajectory

X

Y

-1.500

1.500

1.500-1.500

Key Contour 1 1.425 2 2.838 3 5.663 4 11.313 5 22.613 6 45.212 7 90.412 8 1.808×102

9 3.616×102

10 7.232×102

1

2

2

3

3

4

4

5

56

6

7

7

8

8

9 9

10 10

Diffusion From a Plane Source

3

-3

1.25

0.00-2

-10

12

0.250.50

0.751.00

0.0

1.6

0.4

0.8

1.2

Z

XY

Values

Month 7Month 6

Month 5Month 4

Month 3Month 2

Month 1

Case 1Case 2

Case 3Case 4

Case 5

0

11

Simfit Cylinder Plot with Error Bars

0.0

2.0

4.0

6.0

0.0 2.0 4.0 6.0

Slanting and Multiple Error Bars

x

y

20