Books related to R

NOTE: modifications to this page have been suspended while the R webmasters consider how, or whether, to maintain the page in the future.

This page gives a partially annotated list of books that are related to S or R and may be useful to the R user community. See also the list of other publications related to R. An alternative searchable listing of both sets together is available here.

[1] S. van Buuren. Flexible Imputation of Missing Data. Chapman & Hall/CRC Interdisciplinary Statistics. CRC Press LLC, 2018. ISBN 9781138588318. [ bib | ]
[2] Dan E. Kelley. Oceanographic Analysis with R. Springer-Verlag, New York, October 2018. ISBN 978-1-4939-8842-6. [ bib | ]
This book presents the R software environment as a key tool for oceanographic computations and provides a rationale for using R over the more widely-used tools of the field such as MATLAB. Kelley provides a general introduction to R before introducing the ‘oce’ package. This package greatly simplifies oceanographic analysis by handling the details of discipline-specific file formats, calculations, and plots. Designed for real-world application and developed with open-source protocols, oce supports a broad range of practical work. Generic functions take care of general operations such as subsetting and plotting data, while specialized functions address more specific tasks such as tidal decomposition, hydrographic analysis, and ADCP coordinate transformation. In addition, the package makes it easy to document work, because its functions automatically update processing logs stored within its data objects. Kelley teaches key R functions using classic examples from the history of oceanography, specifically the work of Alfred Redfield, Gordon Riley, J. Tuzo Wilson, and Walter Munk. Acknowledging the pervasive popularity of MATLAB, the book provides advice to users who would like to switch to R. Including a suite of real-life applications and over 100 exercises and solutions, the treatment is ideal for oceanographers, technicians, and students who want to add R to their list of tools for oceanographic analysis.
[3] Jean-Francois Mas. Análisis espacial con R: Usa R como un Sistema de Información Geográfica. European Scientific Institute, 2018. ISBN 978-608-4642-66-4. [ bib | ]
[4] Thomas Rahlf. Data Visualisation with R. Springer International Publishing, New York, 2017. ISBN 978-3-319-49750-1. [ bib | Publisher Info | ]
This book introduces readers to the fundamentals of creating presentation graphics using R, based on 100 detailed and complete scripts. It shows how bar and column charts, population pyramids, Lorenz curves, box plots, scatter plots, time series, radial polygons, Gantt charts, heat maps, bump charts, mosaic and balloon charts, and a series of different thematic map types can be created using R’s Base Graphics System. Every example uses real data and includes step-by-step explanations of the figures and their programming.
[5] Steven Murray. Apprendre R en un Jour. SJ Murray, 2017. Ebook. [ bib | ]
'Apprendre R en un Jour' gives the reader key skills through an example-driven approach and is ideal for academics, scientists, mathematicians, and engineers. The book assumes no prior programming knowledge and progressively covers all the essential steps needed to gain confidence and become proficient in R within a day. Topics covered include how to import, manipulate, format, iterate over (loop), query, perform basic statistics on, and plot data, by means of a step-by-step explanation of the technique and demonstrations that the reader is encouraged to reproduce on their own computer, using datasets already built into R. Each chapter also ends with exercises (with solutions at the end of the book) for practising key skills and enabling the reader to build on the foundations acquired in this introductory book.
[6] Lawrence Leemis. Learning Base R. Lightning Source, 2016. ISBN 978-0-9829174-8-0. [ bib | ]
Learning Base R provides an introduction to the R language for those with and without prior programming experience. It introduces the key topics to begin analyzing data and programming in R. The focus is on the R language rather than a particular application. The book can be used for self-study or an introductory class on R. Nearly 200 exercises make this book appropriate for a classroom setting. The chapter titles are Introducing R; R as a Calculator; Simple Objects; Vectors; Matrices; Arrays; Built-In Functions; User-Written Functions; Utilities; Complex Numbers; Character Strings; Logical Elements; Relational Operators; Coercion; Lists; Data Frames; Built-In Data Sets; Input/Output; Probability; High-Level Graphics; Custom Graphics; Conditional Execution; Iteration; Recursion; Simulation; Statistics; Linear Algebra; Packages.
[7] Vikram Dayal. An Introduction to R for Quantitative Economics: Graphing, Simulating and Computing. Springer, 2015. ISBN 978-81-322-2340-5. [ bib | ]
This book introduces R with the aim of building the graphing, simulation, and computing skills that let one see theoretical and statistical models in economics in a unified way. The great advantage of R is that it is free, extremely flexible and extensible. The book addresses the specific needs of economists, and helps them move up the R learning curve. It covers mathematical topics, such as graphing the Cobb-Douglas function and using R to study the Solow growth model, as well as statistical topics, from drawing statistical graphs to doing linear and logistic regression. It uses data that can be downloaded from the internet and that are also available in different R packages. With some treatment of basic econometrics, the book discusses quantitative economics broadly and simply, looking at models in the light of data. Students of economics or economists keen to learn how to use R will find this book very useful.
[8] C. Sun. Empirical Research in Economics: Growing up with R. Pine Square, Starkville, Mississippi, USA, 1st edition, 2015. ISBN 978-0-9965854-0-8. [ bib | ]
Empirical Research in Economics: Growing up with R presents a systematic approach to conducting empirical research in economics with R, a flexible and free software environment. At present, there is a lack of integration among course work, research methodology, and software usage in statistical analysis of economic data. The objective of this book is to help young professionals conduct an empirical study in economics over a reasonable period, generally expected to be about four months.
[9] Matthias Kohl. Einführung in die statistische Datenanalyse mit R. London, 2015. ISBN 978-87-403-1156-3. In German. [ bib | Publisher Info ]
The book offers an introduction to statistical data analysis using the free statistical software R, probably the most powerful statistical software currently available. The analyses are carried out and discussed using real data. After a brief description of the statistical software R, important measures and diagrams of descriptive statistics are presented. Recommendations for creating diagrams follow, with special attention given to choosing suitable colours. The second half of the book covers the basics of inferential statistics. First, a number of probability distributions are introduced and their applications illustrated with examples. This is followed by a description of how the parameters of the distributions, which are unknown in practice, can be estimated from available data. The final chapter introduces statistical hypothesis tests and discusses the tests most important in practice.
[10] Matthias Kohl. Introduction to statistical data analysis with R. London, 2015. ISBN 978-87-403-1123-5. [ bib | Publisher Info ]
The book offers an introduction to statistical data analysis applying the free statistical software R, probably the most powerful statistical software today. The analyses are performed and discussed using real data. After a brief description of the statistical software R, important parameters and diagrams of descriptive statistics are introduced. Subsequently, recommendations for generating diagrams are provided, where special attention is given to the selection of appropriate colors. The second half of the book addresses the basics of inferential statistics. First, a number of probability distributions are introduced and their applicability is illustrated by examples. Next, the book describes how the parameters of these distributions, which are unknown in practice, may be estimated from given data. The final chapter introduces statistical tests and reviews the most important tests for practical applications.
[11] Marta Blangiardo and Michela Cameletti. Spatial and Spatio-temporal Bayesian Models with R-INLA. Wiley, Chichester, West Sussex, United Kingdom, 1st edition, 2015. ISBN 978-1-118-32655-8. [ bib | ]
[12] Gergely Daróczi. Mastering Data Analysis with R. Packt Publishing, September 2015. ISBN 9781783982028. [ bib | ]
An intermediate and practical book on various fields of data analysis with R: from loading data from text files, databases or APIs; munging; transformations; modeling with traditional statistical methods and machine learning to visualization of tabular, network, time-series and spatial data with hands-on examples.
[13] Victor A. Bloomfield. Using R for Numerical Analysis in Science and Engineering. Chapman & Hall/CRC, 2014. ISBN 978-1439884485. [ bib | ]
Instead of presenting the standard theoretical treatments that underlie the various numerical methods used by scientists and engineers, Using R for Numerical Analysis in Science and Engineering shows how to use R and its add-on packages to obtain numerical solutions to the complex mathematical problems commonly faced by scientists and engineers. This practical guide to the capabilities of R demonstrates Monte Carlo, stochastic, deterministic, and other numerical methods through an abundance of worked examples and code, covering the solution of systems of linear algebraic equations and nonlinear equations as well as ordinary differential equations and partial differential equations. It not only shows how to use R's powerful graphic tools to construct the types of plots most useful in scientific and engineering work, but also:

* Explains how to statistically analyze and fit data to linear and nonlinear models

* Explores numerical differentiation, integration, and optimization

* Describes how to find eigenvalues and eigenfunctions

* Discusses interpolation and curve fitting

* Considers the analysis of time series

Using R for Numerical Analysis in Science and Engineering provides a solid introduction to the most useful numerical methods for scientific and engineering data analysis using R.

[14] Torsten Hothorn and Brian S. Everitt. A Handbook of Statistical Analyses Using R. Chapman & Hall/CRC Press, Boca Raton, Florida, USA, 3rd edition, 2014. ISBN 978-1-4822-0458-2. [ bib | ]
[15] Thomas Rahlf. Datendesign mit R. 100 Visualisierungsbeispiele. Open Source Press, München, 2014. ISBN 978-3-95539-094-5. In German. [ bib | Publisher Info | ]
Data visualisation has attracted much more attention in recent years. Alongside traditional application areas in science and marketing, new fields such as big data analysis and data journalism have emerged. The open source software R, which is increasingly establishing itself as a standard in statistical software, is a powerful tool that leaves practically nothing to be desired in terms of visualisation capabilities. This book introduces the fundamentals of designing presentation graphics with R and uses 100 complete script examples to show how to create bar and column charts, population pyramids, Lorenz curves, scatter plots, time series plots, radial polygons, Gantt charts, profile charts, heat maps, bump charts, mosaic and balloon plots, and a range of different thematic map types with R's Base Graphics System. Every example uses real data, and the figure and its programming are explained step by step. The printed edition includes a personal access code that gives you free access to the online edition of this book.
[16] Sarah Stowell. Using R for Statistics. Apress, 2014. ISBN 978-1484201404. [ bib | ]
R is a popular and growing open source statistical analysis and graphics environment as well as a programming language and platform. If you need to use a variety of statistics, then Using R for Statistics will get you the answers to most of the problems you are likely to encounter.

Using R for Statistics is a problem-solution primer for using R to set up your data, pose your problems and get answers using a wide array of statistical tests. The book walks you through R basics and how to use R to accomplish a wide variety of statistical operations. You'll be able to navigate the R system, enter and import data, manipulate datasets, calculate summary statistics, create statistical plots and customize their appearance, perform hypothesis tests such as t-tests and analyses of variance, and build regression models. Examples are built around actual datasets to simulate real-world solutions, and programming basics are explained to assist those who do not have a development background.

After reading and using this guide, you'll be comfortable using and applying R to your specific statistical analyses or hypothesis tests. No prior knowledge of R or of programming is assumed, though you should have some experience with statistics.

What you'll learn:

* How to apply statistical concepts using R and some R programming

* How to work with data files, prepare and manipulate data, and combine and restructure datasets

* How to summarize continuous and categorical variables

* What a probability distribution is

* How to create and customize plots

* How to do hypothesis testing

* How to build and use regression and linear models

Who this book is for: No prior knowledge of R or of programming is assumed, making this book ideal if you are more accustomed to using point-and-click style statistical packages. You should have some prior experience with statistics, however.

[17] Ruey S. Tsay. Multivariate Time Series Analysis With R and Financial Applications. John Wiley, New Jersey, 2014. ISBN 978-1-118-61790-8. [ bib | Publisher Info | ]
This book is based on my experience in teaching and research on multivariate time series analysis over the past 30 years. It summarizes the basic concepts and ideas of analyzing multivariate dependent data, provides econometric and statistical models useful for describing the dynamic dependence between variables, discusses the identifiability problem when the models become too flexible, introduces ways to search for simplifying structure hidden in high-dimensional time series, addresses the applicability and limitations of multivariate time series methods, and, equally important, develops the R MTS package for readers to apply the methods and models discussed in the book. The vector autoregressive models and multivariate volatility models are discussed and demonstrated.
[18] J.C. Nash. Nonlinear Parameter Optimization Using R Tools. Wiley, 2014. ISBN 9781118883969. [ bib ]
A systematic and comprehensive treatment of optimization software using R. In recent decades, optimization techniques have been streamlined by computational and artificial intelligence methods to analyze more variables, especially under nonlinear, multivariable conditions, more quickly than ever before. Optimization is an important tool for decision science and for the analysis of physical systems used in engineering. Nonlinear Parameter Optimization Using R Tools explores the principal tools available in R for function minimization, optimization, and nonlinear parameter determination and features numerous examples throughout.
[19] Michael J. Crawley. Statistics: An Introduction using R. Wiley, 2nd edition, 2014. ISBN 978-1-118-94109-6. [ bib | Publisher Info | ]
The book is primarily aimed at undergraduate students in medicine, engineering, economics and biology, but will also appeal to postgraduates who have not previously covered this area, or wish to switch to using R.
[20] Lise Bellanger and Richard Tomassone. Exploration de données et méthodes statistiques avec le logiciel R. Références sciences. Ellipses, 1st edition, 2014. ISBN 978-2-7298-8486-4. [ bib | Publisher Info | ]
Statistics is making its way into practically every field of application, with none excluded; it makes it possible to explore and analyse ever larger bodies of data: the era of big data and data mining is upon us! This omnipresence is very often accompanied by a lack of critical scrutiny of both the origin of the data and the way they are processed. The ease of use of statistical software makes it possible to produce graphics and numerical results almost instantly. There is therefore a great risk of blindly accepting the conclusions that result from its use, whether as an ordinary citizen or as a politician. The authors emphasise concepts without neglecting rigour, and they describe the tools for deciphering data. The book covers a broad spectrum of methods, from data pre-processing to forecasting methods, by way of methods for visualising and summarising data. Numerous examples drawn from varied fields of application are worked through using the free software R, with the commands explained. The book is intended for students in science master's programmes or engineering schools, as well as for professionals who want to use statistics thoughtfully, from the life sciences to archaeology, from sociology to financial analysis.
[21] Yvonnick Noel. Psychologie statistique avec R. Pratique R. Springer, Paris, 2013. ISBN 978-2-8178-0424-8. [ bib ]
This book provides a detailed presentation of all the basics of statistical inference for psychologists, in both a Fisherian and a Bayesian approach. Although many authors have recently advocated the use of Bayesian statistics in psychology (Wagenmakers et al., 2010, 2011; Kruschke, 2010; Rouder et al., 2009), statistical manuals for psychologists barely mention them. This manual provides a full Bayesian toolbox for commonly encountered problems in psychology and the social sciences, for comparing proportions, variances and means, and discusses the advantages. All the foundations of the frequentist approach are also provided, from data description to probability and density, through combinatorics and set algebra. Special emphasis is placed on the analysis of categorical data and contingency tables. Binomial and multinomial models with beta and Dirichlet priors are presented, and their use for making contrasts (between rows or between cells) in contingency tables is detailed on real data. An automatic search for the best model for all problem types is implemented in the AtelieR package, available on CRAN. ANOVA is also presented in a Bayesian flavor (using BIC), and illustrated on real data with the help of the AtelieR and R2STATS packages (a GUI for GLM and GLMM in R). In addition to classical and Bayesian inference on means, direct and Bayesian inference on effect size and standardized effects are presented, in agreement with recent APA recommendations.
[22] Yihui Xie. Dynamic Documents with R and knitr. Chapman & Hall/CRC, 2013. ISBN 978-1482203530. [ bib | Publisher Info | ]
Suitable for both beginners and advanced users, this book shows you how to write reports in simple languages such as Markdown. The reports range from homework, projects, exams, books, blogs, and web pages to any documents related to statistical graphics, computing, and data analysis. While familiarity with LaTeX and HTML is helpful, the book requires no prior experience with advanced programs or languages. For beginners, the text provides enough features to get started on basic applications. For power users, the last several chapters enable an understanding of the extensibility of the knitr package.
[23] Steven Murray. Learn R in a Day. SJ Murray, 2013. Ebook. [ bib | ]
'Learn R in a Day' provides the reader with key programming skills through an examples-oriented approach and is ideally suited for academics, scientists, mathematicians and engineers. The book assumes no prior knowledge of computer programming and progressively covers all the essential steps needed to become confident and proficient in using R within a day. Topics include how to input, manipulate, format, iterate (loop), query, perform basic statistics on, and plot data, via a step-by-step technique and demonstrations using in-built datasets which the reader is encouraged to replicate on their computer. Each chapter also includes exercises (with solutions) to practice key skills and empower the reader to build on the essentials gained during this introductory course.
[24] Ruey S. Tsay. An Introduction to Analysis of Financial Data with R. John Wiley, New Jersey, 2013. ISBN 978-0-470-89081-3. [ bib | Publisher Info | ]
This book provides a concise introduction to econometric and statistical analysis of financial data. It focuses on scalar financial time series with applications. High-frequency data and volatility models are discussed. The book also uses case studies to illustrate the application of modeling financial data.
[25] Matthias Kohl. Analyse von Genexpressionsdaten --- mit R und Bioconductor. Ventus Publishing ApS, London, 2013. ISBN 978-87-403-0349-0. In German. [ bib | Publisher Info ]
The book offers an introduction to using R and Bioconductor for the analysis of microarray data. The array technologies of Affymetrix and Illumina are covered in detail, and other array technologies are also addressed. All the necessary steps are discussed, from reading in the data and quality control, through preprocessing of the data, to statistical analysis and enrichment analysis. Each step is demonstrated in practice using simple examples, and the R code used in the book is available separately for download.
[26] Robert J Knell. Introductory R: A Beginner's Guide to Data Visualisation and Analysis using R. (See web site), March 2013. ISBN 978-0-9575971-0-5. [ bib | ]
R is now the most widely used statistical software in academic science and it is rapidly expanding into other fields such as finance. R is almost limitlessly flexible and powerful, hence its appeal, but can be very difficult for the novice user. There are no easy pull-down menus, error messages are often cryptic and simple tasks like importing your data or exporting a graph can be difficult and frustrating. Introductory R is written for the novice user who knows a bit about statistics but who hasn't yet got to grips with the ways of R. This book: walks you through the basics of R's command line interface; gives a set of simple rules to follow to make sure you import your data properly; introduces the script editor and gives advice on workflow; contains a detailed introduction to drawing graphs in R and gives advice on how to deal with some of the most common errors that you might encounter. The techniques of statistical analysis in R are illustrated by a series of chapters where experimental and survey data are analysed. There is a strong emphasis on using real data from real scientific research, with all the problems and uncertainty that implies, rather than well-behaved made-up data that give ideal and easy to analyse results.
[27] Joseph Hilbe. Methods of Statistical Model Estimation. Chapman & Hall/CRC Press, Boca Raton, FL, 2013. ISBN 978-1-4398-5802-8. [ bib | Discount Info | ]
Methods of Statistical Model Estimation examines the most important and popular methods used to estimate parameters for statistical models and provide informative model summary statistics. Designed for R users, the book is also ideal for anyone wanting to better understand the algorithms used for statistical model fitting. The text presents algorithms for the estimation of a variety of regression procedures using maximum likelihood estimation, iteratively reweighted least squares regression, the EM algorithm, and MCMC sampling. Fully developed, working R code is constructed for each method. The book starts with OLS regression and generalized linear models, building to two-parameter maximum likelihood models for both pooled and panel models. It then covers a random effects model estimated using the EM algorithm and concludes with a Bayesian Poisson model using Metropolis-Hastings sampling. The book's coverage is innovative in several ways. First, the authors use executable computer code to present and connect the theoretical content. Therefore, code is written for clarity of exposition rather than stability or speed of execution. Second, the book focuses on the performance of statistical estimation and downplays algebraic niceties. In both senses, this book is written for people who wish to fit statistical models and understand them.
[28] Gergely Daróczi, Michael Puhle, Edina Berlinger, Péter Csóka, Daniel Havran, Márton Michaletzky, Zsolt Tulassay, Kata Váradi, and Agnes Vidovics-Dancs. Introduction to R for Quantitative Finance. Packt Publishing, November 2013. ISBN 9781783280933. [ bib | ]
The book focuses on how to solve real-world quantitative finance problems using the statistical computing language R. “Introduction to R for Quantitative Finance” covers diverse topics ranging from time series analysis to financial networks. Each chapter briefly presents the theory behind specific concepts and deals with solving a diverse range of problems using R with the help of practical examples.
[29] Christopher Gandrud. Reproducible Research with R and RStudio. Chapman & Hall/CRC The R series. Chapman & Hall/CRC Press, Boca Raton, FL, 2013. ISBN 978-1-4665-7284-3. [ bib | ]
Bringing together computational research tools in one accessible source, Reproducible Research with R and RStudio guides you in creating dynamic and highly reproducible research. Suitable for researchers in any quantitative empirical discipline, it presents practical tools for data collection, data analysis, and the presentation of results. The book takes you through a reproducible research workflow, showing you how to use:

* R for dynamic data gathering and automated results presentation

* knitr for combining statistical analysis and results into one document

* LaTeX for creating PDF articles and slide shows, and Markdown and HTML for presenting results on the web

* Cloud storage and versioning services that can store data, code, and presentation files; save previous versions of the files; and make the information widely available

* Unix-like shell programs for compiling large projects and converting documents from one markup language to another

* RStudio to tightly integrate reproducible research tools in one place

[30] Dirk Eddelbuettel. Seamless R and C++ Integration with Rcpp. Use R! Springer, New York, 2013. ISBN 978-1-4614-6867-7. [ bib | Discount Info | Publisher Info ]
Seamless R and C++ Integration with Rcpp provides the first comprehensive introduction to Rcpp, which has become the most widely-used language extension for R, and is deployed by over one-hundred different CRAN and BioConductor packages. Rcpp permits users to pass scalars, vectors, matrices, lists or entire R objects back and forth between R and C++ with ease. This brings the depth of the R analysis framework together with the power, speed, and efficiency of C++.
[31] Din Chen. Applied Meta-Analysis with R. Chapman & Hall/CRC Biostatistics series. Chapman & Hall/CRC Press, Boca Raton, FL, 2013. ISBN 978-1-4665-0599-5. [ bib | Discount Info | ]
In biostatistical research and courses, practitioners and students often lack a thorough understanding of how to apply statistical methods to synthesize biomedical and clinical trial data. Filling this knowledge gap, Applied Meta-Analysis with R shows how to apply statistical meta-analysis methods to real data using R. Drawing on their extensive research and teaching experiences, the authors provide detailed, step-by-step explanations of the implementation of meta-analysis methods using R. Each chapter gives examples of real studies compiled from the literature. After presenting the data and necessary background for understanding the applications, various methods for analyzing meta-data are introduced. The authors then develop analysis code using the appropriate R packages and functions. This systematic approach helps readers thoroughly understand the analysis methods and R implementation, enabling them to use R and the methods to analyze their own meta-data. Suitable as a graduate-level text for a meta-data analysis course, the book is also a valuable reference for practitioners and biostatisticians (even those with little or no experience in using R) in public health, medical research, governmental agencies, and the pharmaceutical industry.
[32] Stano Pekar and Marek Brabec. Moderni analyza biologickych dat. 2. Linearni modely s korelacemi v prostredi R [Modern Analysis of Biological Data. 2. Linear Models with Correlations in R]. Masaryk University Press, Brno, 2012. ISBN 978-80-21058-12-5. In Czech. [ bib | Publisher Info ]
This publication follows on from the first volume of Moderni analyza biologickych dat and presents selected models and methods for the statistical analysis of correlated data, namely linear methods that are well suited to analysing data with temporal, spatial, and phylogenetic dependencies. The book is a practical guide to data analysis in one of the most comprehensive statistical tools in the world, the freely available software R. It consists of 19 worked and annotated examples, chosen to demonstrate correct model construction and to point out problems and errors that can arise during data analysis. The text is written in simple language accessible to readers without special mathematical training. The book is intended primarily for students and researchers in the biological, agricultural, veterinary, medical, and pharmaceutical fields who need to analyse correctly the results of observations or experiments with a more complicated structure arising from dependencies among repeated measurements of the same subject.
[33] K. Soetaert, J. Cash, and F. Mazzia. Solving Differential Equations in R. Use R! Springer, 2012. ISBN 978-3-642-28070-2. [ bib | Publisher Info ]
Mathematics plays an important role in many scientific and engineering disciplines. This book deals with the numerical solution of differential equations, a very important branch of mathematics. Our aim is to give a practical and theoretical account of how to solve a large variety of differential equations, comprising ordinary differential equations, initial value problems and boundary value problems, differential algebraic equations, partial differential equations and delay differential equations. The solution of differential equations using R is the main focus of this book. It is therefore intended for practitioners, students and scientists who want to know how to use R for solving differential equations. However, it has been our goal that non-mathematicians should at least understand the basics of the methods, while obtaining entrance into the relevant literature that provides more mathematical background. Therefore, each chapter that deals with R examples is preceded by a chapter where the theory behind the numerical methods being used is introduced. In the sections that deal with the use of R for solving differential equations, we have taken examples from a variety of disciplines, including biology, chemistry, physics, and pharmacokinetics. Many examples are well-known test examples, used frequently in the field of numerical analysis.
[34] Sarah Stowell. Instant R: An Introduction to R for Statistical Analysis. Jotunheim Publishing, 2012. ISBN 978-0-957-46490-2. [ bib | ]
This book gives an introduction to using R, with a focus on performing popular statistical methods. It is suitable for anyone who is familiar with basic statistics and wants to begin using R to analyse data and create statistical plots. No prior knowledge of R or of programming is assumed, making this book ideal if you are more accustomed to using point-and-click style statistical packages.
[35] Bernhard Pfaff. Financial Risk Modelling and Portfolio Optimisation with R. Wiley, Chichester, UK, 2012. ISBN 978-0-470-97870-2. [ bib | Publisher Info | ]
Introduces the latest techniques advocated for measuring financial market risk and portfolio optimisation, and provides a plethora of R code examples that enable the reader to replicate the results featured throughout the book. Graduate and postgraduate students in finance, economics, risk management as well as practitioners in finance and portfolio optimisation will find this book beneficial. It also serves well as an accompanying text in computer-lab classes and is therefore suitable for self-study.
[36] David Lunn. The BUGS Book: A Practical Introduction to Bayesian Analysis. Chapman & Hall/CRC Texts in Statistical Science series. Chapman & Hall/CRC Press, Boca Raton, FL, 2012. ISBN 978-1-5848-8849-9. [ bib | Discount Info | ]
Bayesian statistical methods have become widely used for data analysis and modelling in recent years, and the BUGS software has become the most popular software for Bayesian analysis worldwide. Authored by the team that originally developed this software, The BUGS Book provides a practical introduction to this program and its use. The text presents complete coverage of all the functionalities of BUGS, including prediction, missing data, model criticism, and prior sensitivity. It also features a large number of worked examples and a wide range of applications from various disciplines. The book introduces regression models, techniques for criticism and comparison, and a wide range of modelling issues before going into the vital area of hierarchical models, one of the most common applications of Bayesian methods. It deals with essentials of modelling without getting bogged down in complexity. The book emphasises model criticism, model comparison, sensitivity analysis to alternative priors, and thoughtful choice of prior distributions: all those aspects of the “art” of modelling that are easily overlooked in more theoretical expositions. More pragmatic than ideological, the authors systematically work through the large range of “tricks” that reveal the real power of the BUGS software, for example, dealing with missing data, censoring, grouped data, prediction, ranking, parameter constraints, and so on. Many of the examples are biostatistical, but they do not require domain knowledge and are generalisable to a wide range of other application areas. Full code and data for examples, exercises, and some solutions can be found on the book's website.
[37] Michael Lawrence. Programming Graphical User Interfaces in R. Chapman & Hall/CRC the R series. Chapman & Hall/CRC Press, Boca Raton, FL, 2012. ISBN 978-1-4398-5682-6. [ bib | Discount Info | ]
Programming Graphical User Interfaces in R introduces each of the major R packages for GUI programming: RGtk2, qtbase, Tcl/Tk, and gWidgets. With examples woven through the text as well as stand-alone demonstrations of simple yet reasonably complete applications, the book features topics especially relevant to statisticians who aim to provide a practical interface to functionality implemented in R. The accompanying package, ProgGUIinR, includes the complete code for all examples as well as functions for browsing the examples from the respective chapters. Accessible to seasoned, novice, and occasional R users, this book shows that for many purposes, adding a graphical interface to one's work is not terribly sophisticated or time consuming.
[38] Göran Broström. Event History Analysis with R. Chapman & Hall/CRC the R series. Chapman & Hall/CRC Press, Boca Raton, FL, 2012. ISBN 978-1-4398-3164-9. [ bib | Discount Info | ]
With an emphasis on social science applications, Event History Analysis with R presents an introduction to survival and event history analysis using real-life examples. Keeping mathematical details to a minimum, the book covers key topics, including both discrete and continuous time data, parametric proportional hazards, and accelerated failure times. A much-needed primer, Event History Analysis with R is a didactically excellent resource for students and practitioners of applied event history and survival analysis.
[39] Dimitris Rizopoulos. Joint Models for Longitudinal and Time-to-Event Data, with Applications in R. Chapman & Hall/CRC, Boca Raton, 2012. ISBN 978-1-4398-7286-4. [ bib | Publisher Info | ]
The last 20 years have seen an increasing interest in the class of joint models for longitudinal and time-to-event data. These models constitute an attractive paradigm for the analysis of follow-up data that is mainly applicable in two settings: first, when the focus is on a survival outcome and we wish to account for the effect of an endogenous time-dependent covariate measured with error, and second, when the focus is on the longitudinal outcome and we wish to correct for nonrandom dropout. Aimed at applied researchers and graduate students, this text provides a comprehensive overview of the framework of random effects joint models. Emphasis is placed on applications so that readers will obtain a clear view of the types of research questions that are best answered using a joint modeling approach, the basic features of these models, and how they can be extended in practice. Special attention is given to checking the assumptions using residual plots, and to dynamic predictions for the survival and longitudinal outcomes.
[40] Brian Dennis. The R Student Companion. Chapman & Hall/CRC Press, Boca Raton, FL, 2012. ISBN 978-1-4398-7540-7. [ bib | Discount Info | ]
R is the amazing, free, open-access software package for scientific graphs and calculations used by scientists worldwide. The R Student Companion is a student-oriented manual describing how to use R in high school and college science and mathematics courses. Written for beginners in scientific computation, the book assumes the reader has just some high school algebra and has no computer programming background. The author presents applications drawn from all sciences and social sciences and includes the most often used features of R in an appendix. In addition, each chapter provides a set of computational challenges: exercises in R calculations that are designed to be performed alone or in groups. Several of the chapters explore algebra concepts that are highly useful in scientific applications, such as quadratic equations, systems of linear equations, trigonometric functions, and exponential functions. Each chapter provides an instructional review of the algebra concept, followed by a hands-on guide to performing calculations and graphing in R. R is intuitive, even fun. Fantastic, publication-quality graphs of data, equations, or both can be produced with little effort. By integrating mathematical computation and scientific illustration early in a student's development, R use can enhance one's understanding of even the most difficult scientific concepts. While R has gained a strong reputation as a package for statistical analysis, The R Student Companion approaches R more completely as a comprehensive tool for scientific computing and graphing.
[41] Pierre-Andre Cornillon. R for Statistics. Chapman & Hall/CRC Press, Boca Raton, FL, 2012. ISBN 978-1-4398-8145-3. [ bib | Discount Info | ]
Although there are currently a wide variety of software packages suitable for the modern statistician, R has the triple advantage of being comprehensive, widespread, and free. Published in 2008, the second edition of Statistiques avec R enjoyed great success as an R guidebook in the French-speaking world. Translated and updated, R for Statistics includes a number of expanded and additional worked examples. Organized into two sections, the book focuses first on the R software, then on the implementation of traditional statistical methods with R. After a short presentation of the method, the book explicitly details the R command lines and gives commented results. Accessible to novices and experts alike, R for Statistics is a clear and enjoyable resource for any scientist.
[42] A. B. Shipunov, E. M. Baldin, P. A. Volkova, A. I. Korobejnikov, S. A. Nazarova, S. V. Petrov, and V. G. Sufijanov. Nagljadnaja statistika. Ispoljzuem R! / Visual statistics. Use R! DMK Press, Moscow, 2012. ISBN 978-5-94074-828-1. [ bib ]
This is the first “big” book about R in Russian. It is intended to help people who are beginning to learn statistical methods. All explanations are based on R. The book may also serve as an introductory reference to R.
[43] Yves Aragon. Séries temporelles avec R. Méthodes et cas. Springer, Collection Pratique R, 1st edition, 2011. ISBN 978-2-8178-0207-7. [ bib ]
This book takes an original angle on the concept of a time series, whose theoretical complexity and use are often sources of difficulty. Theory distinguishes, for example, between stationary and non-stationary series, yet it is not unusual to be able to model a series with two incompatible models. Moreover, a little familiarity with series shows that a variety of plots can be used to understand their structure fairly quickly, before any modelling. So, instead of studying modelling methods and then illustrating them, the author chooses here to focus on a limited number of series in order to find out what can be said about each. Before turning to these case studies, he gives some reminders and begins by presenting the time series plots offered by R. He then returns to fundamental notions of mathematical statistics and reviews the classical concepts and models for series. He presents time series structures in R and how to import them. He revisits exponential smoothing in the light of the most recent work. One chapter is devoted to simulation. Six series are then studied in detail, comparing several approaches.
[44] Pierre André Cornillon and Eric Matzner-Lober. Régression avec R. Springer, Collection Pratique R, 1st edition, 2011. ISBN 978-2-8178-0183-4. [ bib ]
This book presents in detail one of the most common statistical methods: regression. It combines theory and applications, with particular emphasis on the analysis of real data with the R software. The first chapters are devoted to simple and multiple linear regression and explain the foundations of the method, in terms of both the choices made and the assumptions and their usefulness. They then develop the tools for checking the basic assumptions underlying regression and present analysis of variance and covariance models. An analysis of model selection in multiple regression follows. The final chapters present some extensions of regression, such as constrained regression (ridge, lasso, and lars) and regression on components (PCR and PLS), and finally introduce nonparametric regression (spline and kernel). The presentation reflects the authors' genuine concern for pedagogy, drawing on their experience teaching very varied audiences. The results are placed in the perspective of their practical usefulness through the analysis of concrete examples. The commands for working through the examples in R appear in the body of the text. Each chapter is supplemented by a set of exercises with solutions. The mathematical level required makes this book accessible to engineering students, Master's-level students, and researchers working in various fields of applied science.
[45] Luiz Alexandre Peternelli and Marcio Pupin Mello. Conhecendo o R: uma visão estatística. Série Didática. Editora UFV, Viçosa, MG, Brazil, 1 edition, March 2011. ISBN 978-85-7269-400-1. [ bib | ]
This material is of great value to students and researchers who use statistical tools in research work or in simple data analysis. It is a starting point for those who wish to begin using R and its statistical tools, and also for those who want to keep at hand an easy, objective, and comprehensive reference for using this software.
[46] Paul Teetor. R Cookbook. O'Reilly, first edition, 2011. ISBN 978-0-596-80915-7. [ bib ]
Perform data analysis with R quickly and efficiently with the task-oriented recipes in this cookbook. Although the R language and environment include everything you need to perform statistical work right out of the box, its structure can often be difficult to master. R Cookbook will help both beginners and experienced statistical programmers unlock and use the power of R.
[47] Paul Teetor. 25 Recipes for Getting Started with R. O'Reilly, 2011. ISBN 978-1-4493-0322-8. [ bib | ]
This short, concise book provides beginners with a selection of how-to recipes to solve simple problems with R. Each solution gives you just what you need to know to get started with R for basic statistics, graphics, and regression. These solutions were selected from O'Reilly's R Cookbook, which contains more than 200 recipes for R.
[48] Paul Murrell. R Graphics, Second Edition. Chapman & Hall/CRC the R series. Chapman & Hall/CRC Press, Boca Raton, FL, 2011. ISBN 978-1-4398-3176-2. [ bib | Discount Info | ]
Extensively updated to reflect the evolution of statistics and computing, the second edition of the bestselling R Graphics comes complete with new packages and new examples. Paul Murrell, widely known as the leading expert on R graphics, has developed an in-depth resource that helps both neophyte and seasoned users master the intricacies of R graphics. Organized into five parts, R Graphics covers both “traditional” and newer, R-specific graphics systems. The book reviews the graphics facilities of the R language and describes R's powerful grid graphics system. It then covers the graphics engine, which represents a common set of fundamental graphics facilities, and provides a series of brief overviews of the major areas of application for R graphics and the major extensions of R graphics.
[49] Laura Chihara and Tim Hesterberg. Mathematical Statistics with Resampling and R. Wiley, 1st edition, 2011. ISBN 978-1-1180-2985-5. [ bib | Publisher Info | ]
Resampling helps students understand the meaning of sampling distributions, sampling variability, P-values, hypothesis tests, and confidence intervals. This book shows how to apply modern resampling techniques to mathematical statistics. Extensively class-tested to ensure an accessible presentation, Mathematical Statistics with Resampling and R utilizes the powerful and flexible computer language R to underscore the significance and benefits of modern resampling techniques. The book begins by introducing permutation tests and bootstrap methods, motivating classical inference methods. Striking a balance between theory, computing, and applications, the authors explore additional topics such as exploratory data analysis, calculation of sampling distributions, the Central Limit Theorem, Monte Carlo sampling, maximum likelihood estimation and properties of estimators, confidence intervals and hypothesis tests, regression, and Bayesian methods. Case studies on diverse subjects such as flight delays, birth weights of babies, and telephone company repair times illustrate the relevance of the material. Mathematical Statistics with Resampling and R is an excellent book for courses on mathematical statistics at the upper-undergraduate and graduate levels. It also serves as a valuable reference for applied statisticians working in the areas of business, economics, biostatistics, and public health who utilize resampling methods in their everyday work.
[50] John Fox and Sanford Weisberg. An R Companion to Applied Regression. Sage Publications, Thousand Oaks, CA, USA, second edition, 2011. ISBN 978-1-4129-7514-8. [ bib | ]
A companion book to a text or course on applied regression (such as “Applied Regression Analysis and Generalized Linear Models, Second Edition” by John Fox or “Applied Linear Regression, Third edition” by Sanford Weisberg). It introduces R, and concentrates on how to use linear and generalized-linear models in R while assuming familiarity with the statistical methodology.
[51] Hrishi Mittal. R Graphs Cookbook. Packt Publishing, 2011. ISBN 1849513066. [ bib ]
R Graphs Cookbook takes a practical approach to teaching how to create effective and useful graphs using R. This practical guide begins by teaching you how to make basic graphs in R and progresses through subsequent dedicated chapters about each graph type in depth. It will demystify a lot of difficult and confusing R functions and parameters and enable you to construct and modify data graphics to suit your analysis, presentation, and publication needs.
[52] Graham Williams. Data Mining with Rattle and R: The art of excavating data for knowledge discovery. Use R! Springer, 2011. ISBN 978-1-4419-9889-7. [ bib | Discount Info | Publisher Info | ]
Data mining is the art and science of intelligent data analysis. By building knowledge from information, data mining adds considerable value to the ever increasing stores of electronic data that abound today. In performing data mining many decisions need to be made regarding the choice of methodology, the choice of data, the choice of tools, and the choice of algorithms. Throughout this book the reader is introduced to the basic concepts and some of the more popular algorithms of data mining. With a focus on the hands-on end-to-end process for data mining, Williams guides the reader through various capabilities of the easy to use, free, and open source Rattle Data Mining Software built on the sophisticated R Statistical Software. The focus on doing data mining rather than just reading about data mining is refreshing. The book covers data understanding, data preparation, data refinement, model building, model evaluation, and practical deployment. The reader will learn to rapidly deliver a data mining project using software easily installed for free from the Internet. Coupling Rattle with R delivers a very sophisticated data mining environment with all the power, and more, of the many commercial offerings.
[53] Manfred Gilli, Dietmar Maringer, and Enrico Schumann. Numerical Methods and Optimization in Finance. Academic Press, 2011. ISBN 978-0-12-375662-6. [ bib | Publisher Info | ]
The book explains tools for computational finance. It covers fundamental numerical analysis and computational techniques, for example for option pricing, but two topics are given special attention: simulation and optimization. Many chapters are organized as case studies, dealing with problems like portfolio insurance or risk estimation; in particular, several chapters explain optimization heuristics and how to use them for portfolio selection or the calibration of option pricing models. Such practical examples allow readers to learn the required steps for solving specific problems, and to apply these steps to other problems, too. At the same time, the chosen applications are relevant enough to make the book a useful reference on how to handle given problems. Matlab and R sample code is provided in the text and can be downloaded from the book's website; an R package ‘NMOF’ is also available.
[54] Bruno Falissard. Analysis of Questionnaire Data with R. Chapman & Hall/CRC Press, Boca Raton, FL, 2011. ISBN 978-1-4398-1766-7. [ bib | Discount Info | ]
While theoretical statistics relies primarily on mathematics and hypothetical situations, statistical practice is a translation of a question formulated by a researcher into a series of variables linked by a statistical tool. As with written material, there are almost always differences between the meaning of the original text and translated text. Additionally, many versions can be suggested, each with their advantages and disadvantages. Analysis of Questionnaire Data with R translates certain classic research questions into statistical formulations. As indicated in the title, the syntax of these statistical formulations is based on the well-known R language, chosen for its popularity, simplicity, and power of its structure. Although syntax is vital, understanding the semantics is the real challenge of any good translation. In this book, the semantics of theoretical-to-practical translation emerges progressively from examples and experience, and occasionally from mathematical considerations. Sometimes the interpretation of a result is not clear, and there is no statistical tool really suited to the question at hand. Sometimes data sets contain errors, inconsistencies between answers, or missing data. More often, available statistical tools are not formally appropriate for the given situation, making it difficult to assess to what extent this slight inadequacy affects the interpretation of results. Analysis of Questionnaire Data with R tackles these and other common challenges in the practice of statistics.
[55] Randall L. Eubank. Statistical Computing in C++ and R. Chapman & Hall/CRC the R series. Chapman & Hall/CRC Press, Boca Raton, FL, 2011. ISBN 978-1-4200-6650-0. [ bib | Discount Info | ]
With the advancement of statistical methodology inextricably linked to the use of computers, new methodological ideas must be translated into usable code and then numerically evaluated relative to competing procedures. In response to this, Statistical Computing in C++ and R concentrates on the writing of code rather than the development and study of numerical algorithms per se. The book discusses code development in C++ and R and the use of these symbiotic languages in unison. It emphasizes that each offers distinct features that, when used in tandem, can take code writing beyond what can be obtained from either language alone. The text begins with some basics of object-oriented languages, followed by a “boot-camp” on the use of C++ and R. The authors then discuss code development for the solution of specific computational problems that are relevant to statistics including optimization, numerical linear algebra, and random number generation. Later chapters introduce abstract data structures (ADTs) and parallel computing concepts. The appendices cover R and UNIX Shell programming. The translation of a mathematical problem into its computational analog (or analogs) is a skill that must be learned, like any other, by actively solving relevant problems. The text reveals the basic principles of algorithmic thinking essential to the modern statistician as well as the fundamental skill of communicating with a computer through the use of the computer languages C++ and R. The book lays the foundation for original code development in a research environment.
[56] Claus Thorn Ekstrom. The R Primer. Chapman & Hall/CRC Press, Boca Raton, FL, 2011. ISBN 978-1-4398-6206-3. [ bib | Discount Info | ]
Newcomers to R are often intimidated by the command-line interface, the vast number of functions and packages, or the processes of importing data and performing a simple statistical analysis. The R Primer provides a collection of concise examples and solutions to R problems frequently encountered by new users of this statistical software. Rather than explore the many options available for every command as well as the ever-increasing number of packages, the book focuses on the basics of data preparation and analysis and gives examples that can be used as a starting point. Each of the numerous examples illustrates a specific situation, topic, or problem, including data importing, data management, classical statistical analyses, and high-quality graphics production. Each example is self-contained and includes R code that can be run exactly as shown, enabling results from the book to be replicated. While base R is used throughout, other functions or packages are listed if they cover or extend the functionality. After working through the examples found in this text, new users of R will be able to better handle data analysis and graphics applications in R. Additional topics and R code are available from the book's supporting website.
[57] James Michael Curran. Introduction to Data Analysis with R for Forensic Scientists. CRC Press, Boca Raton, FL, 2011. ISBN 9781420088267. [ bib | Publisher Info ]
Keywords: Criminal investigation, Data processing, Forensic sciences, Forensic statistics, R (Computer program language), Statistical methods
[58] Christian P. Robert and George Casella. Méthodes de Monte-Carlo avec R. Pratique R. Springer, 1st edition, 2011. ISBN 978-2-8178-0180-3. French translation of Introducing Monte Carlo Methods with R. [ bib ]
Computer simulation techniques are essential to the statistician. To use them to solve statistical problems, the statistician must first develop intuition and the ability to build simulation models independently. This book therefore adopts the programmer's point of view to present these fundamental tools of stochastic simulation. It shows how to implement them in R and gives the keys to a better understanding of the methods presented, with a view to comparing them, without dwelling too long on their theoretical justification. The authors present the basic algorithms for generating random data, Monte Carlo techniques for integration and optimization, convergence diagnostics, Markov chains, adaptive algorithms, and the Metropolis-Hastings and Gibbs algorithms. Every chapter includes exercises. The R programs are available in a dedicated package. The book is aimed at anyone interested in statistical simulation and requires no prior knowledge of the R language or any expertise in Bayesian statistics, although many of the exercises fall within that field. It will be useful to students and professionals working in statistics, telecommunications, econometrics, finance, and many other fields.
[59] Chris Hay Jahans. An R Companion to Linear Statistical Models. Chapman & Hall/CRC Press, Boca Raton, FL, 2011. ISBN 978-1-4398-7365-6. [ bib | Discount Info | ]
Focusing on user-developed programming, An R Companion to Linear Statistical Models serves two audiences: those who are familiar with the theory and applications of linear statistical models and wish to learn or enhance their skills in R; and those who are enrolled in an R-based course on regression and analysis of variance. For those who have never used R, the book begins with a self-contained introduction to R that lays the foundation for later chapters. This book includes extensive and carefully explained examples of how to write programs using the R programming language. These examples cover methods used for linear regression and designed experiments with up to two fixed-effects factors, including blocking variables and covariates. It also demonstrates applications of several pre-packaged functions for complex computational procedures.
[60] Damon M. Berridge. Multivariate Generalized Linear Mixed Models Using R. Chapman & Hall/CRC Press, Boca Raton, FL, 2011. ISBN 978-1-4398-1326-3. [ bib | Discount Info | ]
Multivariate Generalized Linear Mixed Models Using R presents robust and methodologically sound models for analyzing large and complex data sets, enabling readers to answer increasingly complex research questions. The book applies the principles of modeling to longitudinal data from panel and related studies via the Sabre software package in R. The authors first discuss members of the family of generalized linear models, gradually adding complexity to the modeling framework by incorporating random effects. After reviewing the generalized linear model notation, they illustrate a range of random effects models, including three-level, multivariate, endpoint, event history, and state dependence models. They estimate the multivariate generalized linear mixed models (MGLMMs) using either standard or adaptive Gaussian quadrature. The authors also compare two-level fixed and random effects linear models. The appendices contain additional information on quadrature, model estimation, and endogenous variables, along with SabreR commands and examples. In medical and social science research, MGLMMs help disentangle state dependence from incidental parameters. Focusing on these sophisticated data analysis techniques, this book explains the statistical theory and modeling involved in longitudinal studies. Many examples throughout the text illustrate the analysis of real-world data sets. Exercises, solutions, and other material are available on a supporting website.
[61] Shravan Vasishth and Michael Broe. The Foundations of Statistics: A Simulation-based Approach. Springer, 2010. ISBN 978-3-642-16312-8. [ bib | Discount Info | Publisher Info ]
Statistics and hypothesis testing are routinely used in areas (such as linguistics) that are traditionally not mathematically intensive. In such fields, when faced with experimental data, many students and researchers tend to rely on commercial packages to carry out statistical data analysis, often without understanding the logic of the statistical tests they rely on. As a consequence, results are often misinterpreted, and users have difficulty in flexibly applying techniques relevant to their own research; they use whatever they happen to have learned. A simple solution is to teach the fundamental ideas of statistical hypothesis testing without using too much mathematics. This book provides a non-mathematical, simulation-based introduction to basic statistical concepts and encourages readers to try out the simulations themselves using the source code and data provided (the freely available programming language R is used throughout). Since the code presented in the text almost always requires the use of previously introduced programming constructs, diligent students also acquire basic programming abilities in R. The book is intended for advanced undergraduate and graduate students in any discipline, although the focus is on linguistics, psychology, and cognitive science. It is designed for self-instruction, but it can also be used as a textbook for a first course on statistics. Earlier versions of the book have been used in undergraduate and graduate courses in Europe and the US.
[62] Robert A. Muenchen and Joseph M. Hilbe. R for Stata Users. Statistics and Computing. Springer, 2010. ISBN 978-1-4419-1317-3. [ bib | Discount Info | Publisher Info ]
This book shows you how to extend the power of Stata through the use of R. It introduces R using Stata terminology with which you are already familiar. It steps through more than 30 programs written in both languages, comparing and contrasting the two packages' different approaches. When finished, you will be able to use R in conjunction with Stata, or separately, to import data, manage and transform it, create publication quality graphics, and perform basic statistical analyses.
[63] Rob Kabacoff. R in Action. Manning, 2010. [ bib | ]
R in Action is the first book to present both the R system and the use cases that make it such a compelling package for business developers. The book begins by introducing the R language, including the development environment. As you work through various examples illustrating R's features, you'll also get a crash course in practical statistics, including basic and advanced models for normal and non-normal data, longitudinal and survival data, and a wide variety of multivariate methods. Both data mining methodologies and approaches to messy and incomplete data are included.
[64] Pierre-André Cornillon, Arnaud Guyader, François Husson, Nicolas Jégou, Julie Josse, Maela Kloareg, Eric Matzner-Lober, and Laurent Rouviere. Statistiques avec R. Didact Statistiques. Presses Universitaires de Rennes, 2nd edition, 2010. ISBN 978-2-7535-1087-6. [ bib | ]
After only ten years of existence, the R software has become an essential tool for statistics and data visualization in both academia and business. This exceptional growth is explained by its three main qualities: it is free, very comprehensive, and constantly expanding. The book is organized in two main parts: the first focuses on how the R software works, while the second applies about twenty statistical methods through worked fact sheets. Each sheet is based on a concrete example, and together they cover a broad range of classical data analysis techniques. The book is aimed at beginners as well as regular users of R. It will allow them to quickly produce graphics and carry out simple or elaborate statistical analyses. For this second edition, the text has been revised and expanded. Some sheets have been extended, and others use new examples. Finally, new sheets have been added, along with some new exercises.
[65] Pierre Lafaye de Micheaux, Rémy Drouilhet, and Benoît Liquet. Le Logiciel R. Maîtriser le langage, effectuer des analyses statistiques. Springer, Collection Statistiques et Probabilités appliquées, 1st edition, 2010. ISBN 9782817801148. [ bib | ]
This book is devoted to a tool that has become essential for data analysis, graphics, and statistical computing: the R software. After introducing the main concepts needed to use this computing environment with confidence (data organization, import and export, access to documentation, graphical displays, programming, maintenance, etc.), the authors detail the operations needed to apply a very large number of statistical methods and concepts with R: simulation of random variables, confidence intervals, hypothesis tests, p-values, the bootstrap, linear regression, ANOVA (including repeated measures), and more. Written with great care for pedagogy and clarity, and supplemented by numerous exercises and practical assignments, this book is an ideal companion for all R users on Windows, Macintosh, or Linux, whether beginners or advanced: students, teachers, or researchers in statistics, mathematics, medicine, computer science, biology, psychology, nursing, etc. It will allow them to master in depth how the software works. The book will also be useful to more experienced users, who will find here a presentation of the most commonly used R functions.
[66] Joseph Adler. R in a Nutshell [deutsche Ausgabe]. O'Reilly Verlag, Köln, 1st edition, 2010. ISBN 978-3-89721-649-5. In German. With function and data set reference; companion package nutshellDE with example data and code (on the publisher's page for the book). [ bib | Publisher Info ]
The book is a comprehensive handbook and reference work on R. It describes how to install and extend the software and gives a broad overview of the programming language. Using countless examples from medicine, business, sports, and bioinformatics, it covers how data are read in, transformed, and displayed graphically. Numerous methods and procedures of statistical data analysis with R are demonstrated on real data sets. The function reference was written entirely anew for the German edition.
[67] John M. Quick. The Statistical Analysis with R Beginners Guide. Packt Publishing, 2010. ISBN 1849512086. [ bib ]
The Statistical Analysis with R Beginners Guide will take you on a journey as the strategist for an ancient Chinese kingdom. Along the way, you will learn how to use R to arrive at practical solutions and how to effectively communicate your results. Ultimately, the fate of the kingdom depends on your ability to make informed, data-driven decisions with R.
[68] François Husson, Sébastien Lê, and Jérôme Pagès. Exploratory Multivariate Analysis by Example Using R. Computer Sciences and Data Analysis. Chapman & Hall/CRC, 2010. ISBN 978-1-4398-3580-7. [ bib | Discount Info | ]
Full of real-world case studies and practical advice, Exploratory Multivariate Analysis by Example Using R focuses on four fundamental methods of multivariate exploratory data analysis that are most suitable for applications. It covers principal component analysis (PCA) when variables are quantitative, correspondence analysis (CA) and multiple correspondence analysis (MCA) when variables are categorical, and hierarchical cluster analysis. The authors take a geometric point of view that provides a unified vision for exploring multivariate data tables. Within this framework, they present the principles, indicators, and ways of representing and visualizing objects that are common to the exploratory methods. The authors show how to use categorical variables in a PCA context in which variables are quantitative, how to handle more than two categorical variables in a CA context in which there are originally two variables, and how to add quantitative variables in an MCA context in which variables are categorical. They also illustrate the methods and the ways they can be exploited using examples from various fields. Throughout the text, each result correlates with an R command accessible in the FactoMineR package developed by the authors. All of the data sets and code are available online. By using the theory, examples, and software presented in this book, readers will be fully equipped to tackle real-life multivariate data.
[69] David Ruppert. Statistics and Data Analysis for Financial Engineering. Use R! Springer, 2010. ISBN 978-1-4419-7786-1. [ bib | Discount Info | Publisher Info ]
Financial engineers have access to enormous quantities of data but need powerful methods for extracting quantitative information, particularly about volatility and risks. Key features of this textbook are: illustration of concepts with financial markets and economic data, R Labs with real-data exercises, and integration of graphical and analytic methods for modeling and diagnosing modeling errors. Despite some overlap with the author's undergraduate textbook Statistics and Finance: An Introduction, this book differs from that earlier volume in several important aspects: it is graduate-level; computations and graphics are done in R; and many advanced topics are covered, for example, multivariate distributions, copulas, Bayesian computations, VaR and expected shortfall, and cointegration. The prerequisites are basic statistics and probability, matrices and linear algebra, and calculus. Some exposure to finance is helpful.
[70] Christian Robert and George Casella. Introducing Monte Carlo Methods with R. Use R! Springer, 2010. ISBN 978-1-4419-1575-7. [ bib | Discount Info | Publisher Info ]
Computational techniques based on simulation have now become an essential part of the statistician's toolbox. It is thus crucial to provide statisticians with a practical understanding of those methods, and there is no better way to develop intuition and skills for simulation than to use simulation to solve statistical problems. Introducing Monte Carlo Methods with R covers the main tools used in statistical simulation from a programmer's point of view, explaining the R implementation of each simulation technique and providing the output for better understanding and comparison. While this book constitutes a comprehensive treatment of simulation methods, the theoretical justification of those methods has been considerably reduced, compared with Robert and Casella (2004). Similarly, the more exploratory and less stable solutions are not covered here. This book does not require a preliminary exposure to the R programming language or to Monte Carlo methods, nor an advanced mathematical background. While many examples are set within a Bayesian framework, advanced expertise in Bayesian statistics is not required. The book covers basic random generation algorithms, Monte Carlo techniques for integration and optimization, convergence diagnoses, Markov chain Monte Carlo methods, including Metropolis-Hastings and Gibbs algorithms, and adaptive algorithms. All chapters include exercises and all R programs are available as an R package called mcsm. The book appeals to anyone with a practical interest in simulation methods but no previous exposure. It is meant to be useful for students and practitioners in areas such as statistics, signal processing, communications engineering, control theory, econometrics, finance and more. The programming parts are introduced progressively to be accessible to any reader.
[71] Din Chen. Clinical Trial Data Analysis Using R. Chapman & Hall/CRC Biostatistics series. Chapman & Hall/CRC Press, Boca Raton, FL, 2010. ISBN 978-1-4398-4020-7. [ bib | ]
Too often in biostatistical research and clinical trials, a knowledge gap exists between developed statistical methods and the applications of these methods. Filling this gap, Clinical Trial Data Analysis Using R provides a thorough presentation of biostatistical analyses of clinical trial data and shows step by step how to implement the statistical methods using R. The book's practical, detailed approach draws on the authors' 30 years of real-world experience in biostatistical research and clinical development. Each chapter presents examples of clinical trials based on the authors' actual experiences in clinical drug development. Various biostatistical methods for analyzing the data are then identified. The authors develop analysis code step by step using appropriate R packages and functions. This approach enables readers to gain an understanding of the analysis methods and R implementation so that they can use R to analyze their own clinical trial data. With step-by-step illustrations of R implementations, this book shows how to easily use R to simulate and analyze data from a clinical trial. It describes numerous up-to-date statistical methods and offers sound guidance on the processes involved in clinical trials.
[72] Carlo Gaetan and Xavier Guyon. Spatial Statistics and Modeling. Springer Series in Statistics. Springer, 2010. ISBN 978-0-387-92256-0. [ bib | Discount Info | Publisher Info ]
Spatial statistics are useful in subjects as diverse as climatology, ecology, economics, environmental and earth sciences, epidemiology, image analysis and more. This book covers the best-known spatial models for three types of spatial data: geostatistical data (stationarity, intrinsic models, variograms, spatial regression and space-time models), areal data (Gibbs-Markov fields and spatial auto-regression) and point pattern data (Poisson, Cox, Gibbs and Markov point processes). The level is relatively advanced, and the presentation concise but complete. The most important statistical methods and their asymptotic properties are described, including estimation in geostatistics, autocorrelation and second-order statistics, maximum likelihood methods, approximate inference using the pseudo-likelihood or Monte-Carlo simulations, statistics for point processes and Bayesian hierarchical models. A chapter is devoted to Markov Chain Monte Carlo simulation (Gibbs sampler, Metropolis-Hastings algorithms and exact simulation). A large number of real examples are studied with R, and each chapter ends with a set of theoretical and applied exercises. While a foundation in probability and mathematical statistics is assumed, three appendices introduce some necessary background. The book is accessible to senior undergraduate students with a solid math background and Ph.D. students in statistics. Furthermore, experienced statisticians and researchers in the above-mentioned fields will find the book valuable as a mathematically sound reference. This book is the English translation of Modélisation et Statistique Spatiales published by Springer in the series Mathématiques & Applications, a series established by Société de Mathématiques Appliquées et Industrielles (SMAI).
[73] Andrew P. Robinson and Jeff D. Hamann. Forest Analytics with R. Use R! Springer, 2010. ISBN 978-1-4419-7761-8. [ bib | Discount Info | Publisher Info ]
Forest Analytics with R combines practical, down-to-earth forestry data analysis and solutions to real forest management challenges with state-of-the-art statistical and data-handling functionality. The authors adopt a problem-driven approach, in which statistical and mathematical tools are introduced in the context of the forestry problem that they can help to resolve. All the tools are introduced in the context of real forestry datasets, which provide compelling examples of practical applications. The modeling challenges covered within the book include imputation and interpolation for spatial data, fitting probability density functions to tree measurement data using maximum likelihood, fitting allometric functions using both linear and non-linear least-squares regression, and fitting growth models using both linear and non-linear mixed-effects modeling. The coverage also includes deploying and using forest growth models written in compiled languages, analysis of natural resources and forestry inventory data, and forest estate planning and optimization using linear programming. The book would be ideal for a one-semester class in forest biometrics or applied statistics for natural resources management. The text assumes no programming background, some introductory statistics, and very basic applied mathematics.
[74] Hrishikesh D. Vinod, editor. Advances in Social Science Research Using R. Lecture Notes in Statistics. Springer, 2010. ISBN 978-1-4419-1763-8. [ bib | Discount Info | Publisher Info ]
This book covers recent advances for quantitative researchers with practical examples from social sciences. The following twelve chapters written by distinguished authors cover a wide range of issues, all providing practical tools using the free R software. McCullough: R can be used for reliable statistical computing, whereas most statistical and econometric software cannot. This is illustrated by the effect of abortion on crime. Koenker: Additive models provide a clever compromise between parametric and non-parametric components illustrated by risk factors for Indian malnutrition. Gelman: R graphics in the context of voter participation in US elections. Vinod: New solutions to the old problem of efficient estimation despite autocorrelation and heteroscedasticity among regression errors are proposed and illustrated by the Phillips curve tradeoff between inflation and unemployment. Markus and Gu: New R tools for exploratory data analysis including bubble plots. Vinod, Hsu and Tian: New R tools for portfolio selection borrowed from computer scientists and data-mining experts, relevant to anyone with an investment portfolio. Foster and Kecojevic: Extends the usual analysis of covariance (ANCOVA) illustrated by growth charts for Saudi children. Imai, Keele, Tingley, and Yamamoto: New R tools for solving the age-old scientific problem of assessing the direction and strength of causation. Their job search illustration is of interest during current times of high unemployment. Haupt, Schnurbus, and Tschernig: consider the choice of functional form for an unknown, potentially nonlinear relationship, explaining a set of new R tools for model visualization and validation. Rindskopf: R methods to fit a multinomial based multivariate analysis of variance (ANOVA) with examples from psychology, sociology, political science, and medicine. Neath: R tools for Bayesian posterior distributions to study increased disease risk in proximity to a hazardous waste site. Numatsi and Rengifo: explain persistent discrete jumps in financial series subject to misspecification.
[75] Victor Bloomfield. Computer Simulation and Data Analysis in Molecular Biology and Biophysics: An Introduction Using R. Springer, 2009. ISBN 978-1-4419-0083-8. [ bib | Publisher Info ]
This book provides an introduction, suitable for advanced undergraduates and beginning graduate students, to two important aspects of molecular biology and biophysics: computer simulation and data analysis. It introduces tools to enable readers to learn and use fundamental methods for constructing quantitative models of biological mechanisms, both deterministic and with some elements of randomness, including complex reaction equilibria and kinetics, population models, and regulation of metabolism and development; to understand how concepts of probability can help in explaining important features of DNA sequences; and to apply a useful set of statistical methods to analysis of experimental data from spectroscopic, genomic, and proteomic sources. These quantitative tools are implemented using the free, open source software program R. R provides an excellent environment for general numerical and statistical computing and graphics, with capabilities similar to Matlab. Since R is increasingly used in bioinformatics applications such as the BioConductor project, it can serve students as their basic quantitative, statistical, and graphics tool as they develop their careers.
[76] Uwe Ligges. Programmieren mit R. Springer-Verlag, Heidelberg, 3rd edition, 2009. ISBN 978-3-540-79997-9. In German. [ bib | Publisher Info | ]
R is an object-oriented, interpreted language and programming environment for data analysis and graphics, freely available under the GPL. The book introduces the fundamentals of the R language and conveys a thorough understanding of the language's structure. R's extensive graphics capabilities are described in detail. Readers can easily implement their own methods, define object classes, and assemble complete packages of functions and accompanying documentation. Whether for a diploma thesis, research projects, or business data, the book supports everyone who wants to use R as a flexible tool for data analysis and visualization.
[77] Stano Pekar and Marek Brabec. Moderni analyza biologickych dat. 1. Zobecnene linearni modely v prostredi R [Modern Analysis of Biological Data. 1. Generalised Linear Models in R]. Biologie dnes. Scientia, Praha, 2009. ISBN 978-80-86960-44-9. In Czech. [ bib | Publisher Info ]
The book focuses on regression models, specifically univariate generalized linear models (GLM). It is intended primarily for students and colleagues from biological disciplines and requires only a basic statistical education, such as a one-semester course in biostatistics. The text contains the necessary minimum of statistical theory, but above all solutions to 18 real examples from biology. Each example is worked through from the description and statement of the goal, through development of the statistical model, to the conclusion. The popular and freely available statistical software R is used to analyze the data. The examples were deliberately chosen to draw attention to various problems and errors that can arise during data analysis. At the same time, they are meant to motivate readers to think about statistical models and how to use them. Readers can try out the solutions to the examples themselves on the data supplied with the book.
[78] Robert A. Muenchen. R for SAS and SPSS Users. Springer Series in Statistics and Computing. Springer, 2009. ISBN 978-1-4614-0685-3. [ bib | Discount Info | Publisher Info ]
This book demonstrates which of the add-on packages are most like SAS and SPSS and compares them to R's built-in functions. It steps through over 30 programs written in all three packages, comparing and contrasting the packages' differing approaches. The programs and practice datasets are available for download.
[79] Richard M. Heiberger and Erich Neuwirth. R Through Excel. Use R! Springer, 2009. ISBN 978-1-4419-0051-7. [ bib | Discount Info | Publisher Info ]
The primary focus of the book is on using menu systems on the Excel menu bar to access the capabilities provided by R. The presentation is designed as a computational supplement to introductory statistics texts. The authors provide RExcel examples for most topics in the introductory course. Data can be transferred from Excel to R and back. The clickable RExcel menu supplements the powerful R command language. Results from the analyses in R can be returned to the spreadsheet. Ordinary formulas in spreadsheet cells can use functions written in R.
[80] Peter D. Hoff. A First Course in Bayesian Statistical Methods. Springer Series in Statistics for Social and Behavioral Sciences. Springer, 2009. ISBN 978-0-387-92299-7. [ bib | Discount Info | Publisher Info ]
This book provides a compact self-contained introduction to the theory and application of Bayesian statistical methods. The book is accessible to readers with only a basic familiarity with probability, yet allows more advanced readers to quickly grasp the principles underlying Bayesian theory and methods. R code is provided throughout the text. Much of the example code can be run “as is” in R, and essentially all of it can be run after downloading the relevant datasets from the companion website for this book.
[81] Paul S. P. Cowpertwait and Andrew Metcalfe. Introductory Time Series with R. Springer Series in Statistics. Springer, 2009. ISBN 978-0-387-88697-8. [ bib | Discount Info | Publisher Info ]
This book gives you a step-by-step introduction to analysing time series using the open source software R. Once the model has been introduced it is used to generate synthetic data, using R code, and these generated data are then used to estimate its parameters. This sequence confirms understanding of both the model and the R routine for fitting it to the data. Finally, the model is applied to an analysis of a historical data set. By using R, the whole procedure can be reproduced by the reader. All the data sets used in the book are available on the book's website. The book is written for undergraduate students of mathematics, economics, business and finance, geography, engineering and related disciplines, and postgraduate students who may need to analyze time series as part of their taught program or their research.
[82] Owen Jones, Robert Maillardet, and Andrew Robinson. Introduction to Scientific Programming and Simulation Using R. Chapman & Hall/CRC, Boca Raton, FL, 2009. ISBN 978-1-4200-6872-6. [ bib | Publisher Info ]
This book teaches the skills needed to perform scientific programming while also introducing stochastic modelling. Stochastic modelling in particular, and mathematical modelling in general, are intimately linked to scientific programming because the numerical techniques of scientific programming enable the practical application of mathematical models to real-world problems.
[83] M. Henry H. Stevens. A Primer of Ecology with R. Use R! Springer, 2009. ISBN 978-0-387-89881-0. [ bib | Discount Info | Publisher Info ]
This book combines an introduction to the major theoretical concepts in general ecology with the programming language R, a cutting edge Open Source tool. Starting with geometric growth and proceeding through stability of multispecies interactions and species-abundance distributions, this book demystifies and explains fundamental ideas in population and community ecology. Graduate students in ecology, along with upper division undergraduates and faculty, will all find this to be a useful overview of important topics.
[84] Kurt Varmuza and Peter Filzmoser. Introduction to Multivariate Statistical Analysis in Chemometrics. CRC Press, Boca Raton, FL, 2009. ISBN 9781420059472. [ bib | Publisher Info | ]
Using formal descriptions, graphical illustrations, practical examples, and R software tools, Introduction to Multivariate Statistical Analysis in Chemometrics presents simple yet thorough explanations of the most important multivariate statistical methods for analyzing chemical data. It includes discussions of various statistical methods, such as principal component analysis, regression analysis, classification methods, and clustering. Written by a chemometrician and a statistician, the book reflects both the practical approach of chemometrics and the more formally oriented one of statistics. To enable a better understanding of the statistical methods, the authors apply them to real data examples from chemistry. They also examine results of the different methods, comparing traditional approaches with their robust counterparts. In addition, the authors use the freely available R package to implement methods, encouraging readers to go through the examples and adapt the procedures to their own problems. Focusing on the practicality of the methods and the validity of the results, this book offers concise mathematical descriptions of many multivariate methods and employs graphical schemes to visualize key concepts. It effectively imparts a basic understanding of how to apply statistical methods to multivariate scientific data.
[85] Karl W. Broman and Saunak Sen. A Guide to QTL Mapping with R/qtl. SBH/Statistics for Biology and Health. Springer, 2009. ISBN 978-0-387-92124-2. [ bib | Discount Info | Publisher Info ]
This book is a comprehensive guide to the practice of QTL mapping and the use of R/qtl, including study design, data import and simulation, data diagnostics, interval mapping and generalizations, two-dimensional genome scans, and the consideration of complex multiple-QTL models. Two moderately challenging case studies illustrate QTL analysis in its entirety. The book alternates between QTL mapping theory and examples illustrating the use of R/qtl. Novice readers will find detailed explanations of the important statistical concepts and, through the extensive software illustrations, will be able to apply these concepts in their own research. Experienced readers will find details on the underlying algorithms and the implementation of extensions to R/qtl.
[86] Kai Velten. Mathematical Modeling and Simulation: Introduction for Scientists and Engineers. Wiley-VCH, 2009. ISBN 978-3-527-40758-3. [ bib | Publisher Info ]
This introduction to mathematical modeling and simulation is based exclusively on open source software, and it includes many examples from such diverse fields as biology, ecology, economics, medicine, and agricultural, chemical, electrical, mechanical, and process engineering. Requiring only modest prerequisites in calculus and linear algebra, it is accessible to scientists, engineers, and students at the undergraduate level. The reader is introduced to CAELinux, Calc, Code-Saturne, Maxima, R, and Salome-Meca, and all of the book's software, including 3D CFD and structural mechanics simulation software, can be run from a free CAELinux-Live-DVD that is available on the Internet (works on most machines and operating systems).
[87] Jim Albert. Bayesian Computation with R. Springer Series in Statistics. Springer, 2nd edition, 2009. ISBN 978-0-387-92298-0. [ bib | Discount Info | Publisher Info ]
Bayesian Computation with R introduces Bayesian modeling by the use of computation using the R language. The early chapters present the basic tenets of Bayesian thinking by use of familiar one and two-parameter inferential problems. Bayesian computational methods such as Laplace's method, rejection sampling, and the SIR algorithm are illustrated in the context of a random effects model. The construction and implementation of Markov Chain Monte Carlo (MCMC) methods is introduced. These simulation-based algorithms are implemented for a variety of Bayesian applications such as normal and binary response regression, hierarchical modeling, order-restricted inference, and robust modeling. Algorithms written in R are used to develop Bayesian tests and assess Bayesian models by use of the posterior predictive distribution. The use of R to interface with WinBUGS, a popular MCMC computing language, is described with several illustrative examples. The second edition contains several new topics such as the use of mixtures of conjugate priors and the use of Zellner's g priors to choose between models in linear regression. There are more illustrations of the construction of informative prior distributions, such as the use of conditional means priors and multivariate normal priors in binary regressions. The new edition contains changes in the R code illustrations according to the latest edition of the LearnBayes package.
[88] J. O. Ramsay, Giles Hooker, and Spencer Graves. Functional Data Analysis with R and Matlab. Use R! Springer, 2009. ISBN 978-0-387-98184-0. [ bib | Discount Info | Publisher Info ]
This volume in the Use R! series is aimed at a wide range of readers, and especially those who would like to apply these techniques to their research problems. It complements Functional Data Analysis, Second Edition and Applied Functional Data Analysis: Methods and Case Studies by providing computer code in both the R and Matlab languages for a set of data analyses that showcase functional data analysis. The authors make it easy to get up and running in new applications: readers can adapt the code for the examples and look up the details of key functions within these pages. This book is accompanied by additional web-based support for applying existing functions and developing new ones in either language.
[89] Hadley Wickham. ggplot2: Elegant Graphics for Data Analysis. Use R! Springer, 2009. ISBN 978-0-98140-6. [ bib | Discount Info ]
This book will be useful to everyone who has struggled with displaying their data in an informative and attractive way. You will need some basic knowledge of R (i.e., you should be able to get your data into R), but ggplot2 is a mini-language specifically tailored for producing graphics, and you'll learn everything you need in the book. After reading this book you'll be able to produce graphics customized precisely for your problems, and you'll find it easy to get graphics out of your head and on to the screen or page.
[90] Günther Sawitzki. Computational Statistics. Chapman & Hall/CRC Press, Boca Raton, FL, 2009. ISBN 978-1-4200-8678-2. Includes bibliographical references and index. [ bib | Publisher Info ]
Suitable for a compact course or self-study, Computational Statistics: An Introduction to R illustrates how to use the freely available R software package for data analysis, statistical programming, and graphics. Integrating R code and examples throughout, the text only requires basic knowledge of statistics and computing. This introduction covers one-sample analysis and distribution diagnostics, regression, two-sample problems and comparison of distributions, and multivariate analysis. It uses a range of examples to demonstrate how R can be employed to tackle statistical problems. In addition, the handy appendix includes a collection of R language elements and functions, serving as a quick reference and starting point to access the rich information that comes bundled with R. Accessible to a broad audience, this book explores key topics in data analysis, regression, statistical distributions, and multivariate statistics. Full of examples and with a color insert, it helps readers become familiar with R.
[91] Giovanni Petris, Sonia Petrone, and Patrizia Campagnoli. Dynamic Linear Models with R. Use R! Springer, 2009. ISBN 978-0-387-77238-7. [ bib | Discount Info | Publisher Info ]
After a detailed introduction to general state space models, this book focuses on dynamic linear models, emphasizing their Bayesian analysis. Whenever possible it is shown how to compute estimates and forecasts in closed form; for more complex models, simulation techniques are used. A final chapter covers modern sequential Monte Carlo algorithms. The book illustrates all the fundamental steps needed to use dynamic linear models in practice, using R. Many detailed examples based on real data sets are provided to show how to set up a specific model, estimate its parameters, and use it for forecasting. All the code used in the book is available online. No prior knowledge of Bayesian statistics or time series analysis is required, although familiarity with basic statistics and R is assumed.
[92] Gael Millot. Comprendre et réaliser les tests statistiques à l'aide de R. de boeck université, Louvain-la-Neuve, Belgique, 1st edition, 2009. ISBN 2804101797. [ bib | ]
This book is aimed at students, physicians, and researchers who want to carry out statistical tests while still new to statistics. Its originality lies in offering not only a very detailed explanation of how to use the most classical tests, but also the means to carry them out with R. Illustrated with numerous figures and accompanied by exercises with solutions, the book treats in depth essential notions such as the checklist to work through before running a test, the handling of outliers, the origin of the p-value, and the power or conclusion of a test. It explains how to choose a test based on one's own data. It describes 35 statistical tests in the form of reference sheets, 24 of them nonparametric, which covers most tests involving one or two observed variables. It addresses all the subtleties of testing, such as continuity corrections, Welch corrections for the t test and ANOVA, and p-value corrections for multiple comparisons. It gives an example application of each test in R, including every step of the test and in particular the graphical analysis of the data. In short, this book should satisfy both those looking for a statistics manual that explains how tests work and those looking for a user manual for R.
[93] François Husson, Sébastien Lê, and Jérôme Pagès. Analyse de données avec R. Didact Statistiques. Presses Universitaires de Rennes, 2009. ISBN 978-2-7535-0938-2. [ bib | Publisher Info | ]
This book focuses on the four fundamental methods of data analysis, those with the widest potential for application: principal component analysis, correspondence analysis, multiple correspondence analysis, and agglomerative hierarchical clustering. The greater space given to the factorial methods reflects, on the one hand, the more numerous and more complex concepts needed to use them well and, on the other, the fact that it is through them that the specific features of the different types of data are addressed. The approach is the same for each method. An example introduces the problem and makes the theoretical elements concrete almost step by step. This presentation is followed by several examples worked through in detail to illustrate what the method contributes in applications. Throughout the text, each result is accompanied by the R command that produces it. All of these commands are accessible from FactoMineR, an R package developed by the authors. With this book, the reader thus has a complete toolkit (theoretical foundations, examples, software) for analysing multidimensional data.
[94] Ewout W. Steyerberg. Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating. SBH/Statistics for Biology and Health. Springer, 2009. ISBN 978-0-387-77243-1. [ bib | Discount Info | Publisher Info ]
This book provides insight and practical illustrations on how modern statistical concepts and regression methods can be applied in medical prediction problems, including diagnostic and prognostic outcomes. Many advances have been made in statistical approaches towards outcome prediction, but these innovations are insufficiently applied in medical research. Old-fashioned, data-hungry methods are often used in data sets of limited size, validation of predictions is not done or done simplistically, and updating of previously developed models is not considered. A sensible strategy is needed for model development, validation, and updating, such that prediction models can better support medical practice. Clinical Prediction Models presents a practical checklist with seven steps that need to be considered for development of a valid prediction model. These include preliminary considerations such as dealing with missing values; coding of predictors; selection of main effects and interactions for a multivariable model; estimation of model parameters with shrinkage methods and incorporation of external data; evaluation of performance and usefulness; internal validation; and presentation formats. The steps are illustrated with many small case studies and R code, with data sets made available in the public domain. The book further focuses on generalizability of prediction models, including patterns of invalidity that may be encountered in new settings, approaches to updating of a model, and comparisons of centers after case-mix adjustment by a prediction model. The text is primarily intended for clinical epidemiologists and biostatisticians. It can be used as a textbook for a graduate course on predictive modeling in diagnosis and prognosis. It is beneficial if readers are familiar with common statistical models in medicine: linear regression, logistic regression, and Cox regression. The book is practical in nature, but it provides a philosophical perspective on data analysis in medicine that goes beyond predictive modeling. In this era of evidence-based medicine, randomized clinical trials are the basis for assessment of treatment efficacy. Prediction models are key to individualizing diagnostic and treatment decision making.
[95] Detlev Reymann. Wettbewerbsanalysen für kleine und mittlere Unternehmen (KMUs) --- Theoretische Grundlagen und praktische Anwendung am Beispiel gartenbaulicher Betriebe. Verlag Detlev Reymann, Geisenheim, 2009. ISBN 978-3-00-027013-0. [ bib | Publisher Info | ]
This book presents the foundations of the essential components of company- and competitor-related competitive analyses. It focuses on the following sub-analyses: analysis of the catchment area; determination of market potential and market share; determination of strengths and weaknesses relative to the competition; and analysis of the customer structure (customer typology). For each sub-analysis, the presentation of the theoretical foundations is followed by guidance and instructions for practical implementation, and then by a more in-depth discussion. The book is aimed in particular at small and medium-sized enterprises (SMEs) that have no large specialised marketing departments. It uses methods whose time requirements stay within reasonable limits and analyses that can be carried out with freely available software or freely available data. The statistical part uses R scripts, all of which can be downloaded free of charge from the author's website. These are scripts for computing the breaking point according to Converse, for computing the shopping probability according to Huff, and for producing profile diagrams for SWOT analyses as well as image profiles. The chapter on customer typology explains how to carry out cluster and factor analyses for building typologies, and the appendix gives advice on installing and using R for the analyses described.
[96] Daniel B. Wright and Kamala London. Modern Regression Techniques Using R: A Practical Guide. SAGE, London, UK, 2009. ISBN 9781847879035. [ bib | Publisher Info ]
Techniques covered in this book include multilevel modeling, ANOVA and ANCOVA, path analysis, mediation and moderation, logistic regression (generalized linear models), generalized additive models, and robust methods. Every chapter tests these techniques on examples from real research conducted by the authors, and datasets are available from the book's web page. The authors are donating all royalties from the book to the American Partnership for Eosinophilic Disorders.
[97] Christian Ritz and Jens C. Streibig. Nonlinear Regression with R. Springer, New York, 2009. ISBN 978-0-387-09615-5. [ bib | Discount Info | Publisher Info ]
R is a rapidly evolving lingua franca of graphical display and statistical analysis of experiments from the applied sciences. Currently, R offers a wide range of functionality for nonlinear regression analysis, but the relevant functions, packages and documentation are scattered across the R environment. This book provides a coherent and unified treatment of nonlinear regression with R by means of examples from a diversity of applied sciences such as biology, chemistry, engineering, medicine and toxicology. The book starts with a basic introduction to fitting nonlinear regression models in R. Subsequent chapters explain the salient features of the main fitting function nls(), the use of model diagnostics, how to deal with various model departures, and how to carry out hypothesis testing. The final chapter considers grouped-data structures, including an example of a nonlinear mixed-effects regression model.
[98] Andrea S. Foulkes. Applied Statistical Genetics with R: For Population-Based Association Studies. Use R. Springer, 2009. ISBN 978-0-387-89554-3. [ bib | Discount Info | Publisher Info ]
In this introductory graduate level text, Dr. Foulkes elucidates core concepts that undergird the wide range of analytic techniques and software tools for the analysis of data derived from population-based genetic investigations. Applied Statistical Genetics with R offers a clear and cogent presentation of several fundamental statistical approaches that researchers from multiple disciplines, including medicine, public health, epidemiology, statistics and computer science, will find useful in exploring this emerging field.
[99] Alain Zuur, Elena N. Ieno, Neil Walker, Anatoly A. Saveliev, and Graham M. Smith. Mixed Effects Models and Extensions in Ecology with R. Springer, New York, 2009. ISBN 978-0-387-87457-9. [ bib | Discount Info | Publisher Info ]
Building on the successful Analysing Ecological Data (2007) by Zuur, Ieno and Smith, the authors now provide an expanded introduction to using regression and its extensions in analysing ecological data. As with the earlier book, real data sets from postgraduate ecological studies or research projects are used throughout. The first part of the book is a largely non-mathematical introduction to linear mixed effects modelling, GLM and GAM, zero inflated models, GEE, GLMM and GAMM. The second part provides ten case studies that range from koalas to deep sea research. These chapters provide an invaluable insight into analysing complex ecological datasets, including comparisons of different approaches to the same problem. By matching ecological questions and data structure to a case study, these chapters provide an excellent starting point to analysing your own data. Data and R code from all chapters are available online.
[100] Alain F. Zuur, Elena N. Ieno, and Erik Meesters. A Beginner's Guide to R. Use R. Springer, 2009. ISBN 978-0-387-93836-3. [ bib | Discount Info | Publisher Info ]
Based on their extensive experience with teaching R and statistics to applied scientists, the authors provide a beginner's guide to R. To avoid the difficulty of teaching R and statistics at the same time, statistical methods are kept to a minimum. The text covers how to download and install R, import and manage data, elementary plotting, an introduction to functions, advanced plotting, and common beginner mistakes. This book contains everything you need to know to get started with R.
[101] Stefano M. Iacus. Simulation and Inference for Stochastic Differential Equations: With R Examples. Springer, New York, 2008. ISBN 978-0-387-75838-1. [ bib | Discount Info | Publisher Info ]
This book is very different from any other publication in the field, and it is unique because of its focus on the practical implementation of the simulation and estimation methods presented. The book should be useful to practitioners and students with minimal mathematical background, but because of the many R programs, probably also to many mathematically well-educated practitioners. Many of the methods presented in the book have, so far, not been used much in practice because of the lack of an implementation in a unified framework. This book fills the gap. With the R code included in this book, a lot of useful methods become easy to use for practitioners and students. An R package called ‘sde’ provides functions with easy interfaces ready to be used on empirical data from real-life applications. Although it contains a wide range of results, the book has an introductory character and necessarily does not cover the whole spectrum of simulation and inference for general stochastic differential equations. The book is organized in four chapters. The first one introduces the subject and presents several classes of processes used in many fields of mathematics, computational biology, finance and the social sciences. The second chapter is devoted to simulation schemes and covers new methods not available in other milestone publications so far. The third one is focused on parametric estimation techniques. In particular, it includes exact likelihood inference, approximated and pseudo-likelihood methods, estimating functions, generalized method of moments and other techniques. The last chapter contains miscellaneous topics like nonparametric estimation, model identification and change point estimation. Readers who are not experts in the R language will find a concise introduction to this environment, focused on the subject of the book, which should allow for instant use of the proposed material. 
A documentation page for each R function presented in the book is available at the end of the book.
[102] Simon Sheather. A Modern Approach to Regression with R. Springer, New York, 2008. ISBN 978-0-387-09607-0. [ bib | Discount Info | Publisher Info ]
A Modern Approach to Regression with R focuses on tools and techniques for building regression models using real-world data and assessing their validity. When weaknesses in the model are identified, the next step is to address each of these weaknesses. A key theme throughout the book is that it makes sense to base inferences or conclusions only on valid models. The regression output and plots that appear throughout the book have been generated using R. On the book website you will find the R code used in each example in the text. You will also find SAS code and STATA code to produce the equivalent output on the book website. Primers containing expanded explanations of R, SAS and STATA and their use in this book are also available on the book website. The book contains a number of new real data sets from applications ranging from rating restaurants, rating wines, predicting newspaper circulation and magazine revenue, comparing the performance of NFL kickers, and comparing finalists in the Miss America pageant across states. One of the aspects of the book that sets it apart from many other regression books is that complete details are provided for each example. The book is aimed at first year graduate students in statistics and could also be used for a senior undergraduate class.
[103] Deepayan Sarkar. Lattice: Multivariate Data Visualization with R. Springer, New York, 2008. ISBN 978-0-387-75968-5. [ bib | Discount Info | Publisher Info | ]
R is rapidly growing in popularity as the environment of choice for data analysis and graphics both in academia and industry. Lattice brings the proven design of Trellis graphics (originally developed for S by William S. Cleveland and colleagues at Bell Labs) to R, considerably expanding its capabilities in the process. Lattice is a powerful and elegant high-level data visualization system that is sufficient for most everyday graphics needs, yet flexible enough to be easily extended to handle the demands of cutting-edge research. Written by the author of the lattice system, this book describes it in considerable depth, beginning with the essentials and systematically delving into specific low-level details as necessary. No prior experience with lattice is required to read the book, although basic familiarity with R is assumed. The book contains close to 150 figures produced with lattice. Many of the examples emphasize principles of good graphical design; almost all use real data sets that are publicly available in various R packages. All code and figures in the book are also available online, along with supplementary material covering more advanced topics.
[104] Roger S. Bivand, Edzer J. Pebesma, and Virgilio Gómez-Rubio. Applied Spatial Data Analysis with R. Springer, New York, 2008. ISBN 978-0-387-78170-9. [ bib | Discount Info | Publisher Info ]
Applied Spatial Data Analysis with R is divided into two basic parts, the first presenting R packages, functions, classes and methods for handling spatial data. This part is of interest to users who need to access and visualise spatial data. Data import and export for many file formats for spatial data are covered in detail, as is the interface between R and the open source GRASS GIS. The second part showcases more specialised kinds of spatial data analysis, including spatial point pattern analysis, interpolation and geostatistics, areal data analysis and disease mapping. The coverage of methods of spatial data analysis ranges from standard techniques to new developments, and the examples used are largely taken from the spatial statistics literature. All the examples can be run using R contributed packages available from the CRAN website, with code and additional data sets from the book's own website. This book will be of interest to researchers who intend to use R to handle, visualise, and analyse spatial data. It will also be of interest to spatial data analysts who do not use R, but who are interested in practical aspects of implementing software for spatial data analysis. It is a suitable companion book for introductory spatial statistics courses and for applied methods courses in a wide range of subjects using spatial data, including human and physical geography, geographical information systems, the environmental sciences, ecology, public health and disease control, economics, public administration and political science. The book has a website where coloured figures, complete code examples, data sets, and other support material may be found.
[105] Roger D. Peng and Francesca Dominici. Statistical Methods for Environmental Epidemiology with R: A Case Study in Air Pollution and Health. Springer, New York, 2008. ISBN 978-0-387-78166-2. [ bib | Discount Info | Publisher Info ]
Advances in statistical methodology and computing have played an important role in allowing researchers to more accurately assess the health effects of ambient air pollution. The methods and software developed in this area are applicable to a wide array of problems in environmental epidemiology. This book provides an overview of the methods used for investigating the health effects of air pollution and gives examples and case studies in R which demonstrate the application of those methods to real data. The book will be useful to statisticians, epidemiologists, and graduate students working in the area of air pollution and health and others analyzing similar data. The authors describe the different existing approaches to statistical modeling and cover basic aspects of analyzing and understanding air pollution and health data. The case studies in each chapter demonstrate how to use R to apply and interpret different statistical models and to explore the effects of potential confounding factors. A working knowledge of R and regression modeling is assumed. In-depth knowledge of R programming is not required to understand and run the examples. Researchers in this area will find the book useful as a “live” reference. Software for all of the analyses in the book is downloadable from the web and is available under a Free Software license. The reader is free to run the examples in the book and modify the code to suit their needs. In addition to providing the software for developing the statistical models, the authors provide the entire database from the National Morbidity, Mortality, and Air Pollution Study (NMMAPS) in a convenient R package. With the database, readers can run the examples and experiment with their own methods and ideas.
[106] Robert Gentleman. Bioinformatics with R. Chapman & Hall/CRC, Boca Raton, FL, 2008. ISBN 1-420-06367-7. [ bib ]
[107] Robert Gentleman. R Programming for Bioinformatics. Computer Science & Data Analysis. Chapman & Hall/CRC, Boca Raton, FL, 2008. ISBN 9781420063677. [ bib | Discount Info | Publisher Info | ]
Thanks to its data handling and modeling capabilities and its flexibility, R is becoming the most widely used software in bioinformatics. R Programming for Bioinformatics builds the programming skills needed to use R for solving bioinformatics and computational biology problems. Drawing on the author's experiences as an R expert, the book begins with coverage on the general properties of the R language, several unique programming aspects of R, and object-oriented programming in R. It presents methods for data input and output as well as database interactions. The author also examines different facets of string handling and manipulations, discusses the interfacing of R with other languages, and describes how to write software packages. He concludes with a discussion on the debugging and profiling of R code.
[108] Phil Spector. Data Manipulation with R. Springer, New York, 2008. ISBN 978-0-387-74730-9. [ bib | Discount Info | Publisher Info ]
Since its inception, R has become one of the preeminent programs for statistical computing and data analysis. The ready availability of the program, along with a wide variety of packages and the supportive R community make R an excellent choice for almost any kind of computing task related to statistics. However, many users, especially those with experience in other languages, do not take advantage of the full power of R. Because of the nature of R, solutions that make sense in other languages may not be very efficient in R. This book presents a wide array of methods applicable for reading data into R, and efficiently manipulating that data. In addition to the built-in functions, a number of readily available packages from CRAN (the Comprehensive R Archive Network) are also covered. All of the methods presented take advantage of the core features of R: vectorization, efficient use of subscripting, and the proper use of the varied functions in R that are provided for common data management tasks. Most experienced R users discover that, especially when working with large data sets, it may be helpful to use other programs, notably databases, in conjunction with R. Accordingly, the use of databases in R is covered in detail, along with methods for extracting data from spreadsheets and datasets created by other programs. Character manipulation, while sometimes overlooked within R, is also covered in detail, allowing problems that are traditionally solved by scripting languages to be carried out entirely within R. For users with experience in other languages, guidelines for the effective use of programming constructs like loops are provided. Since many statistical modeling and graphics functions need their data presented in a data frame, techniques for converting the output of commonly used functions to data frames are provided throughout the book. 
Using a variety of examples based on data sets included with R, along with easily simulated data sets, the book is recommended to anyone using R who wishes to advance from simple examples to practical real-life data manipulation solutions.
[109] Bernhard Pfaff. Analysis of Integrated and Cointegrated Time Series with R, Second Edition. Springer, New York, 2nd edition, 2008. ISBN 978-0-387-75966-1. [ bib | Discount Info | Publisher Info ]
The analysis of integrated and co-integrated time series can be considered the main methodology employed in applied econometrics. This book not only introduces readers to this topic but also enables them to conduct the various unit root tests and co-integration methods on their own using the free statistical programming environment R. The book encompasses seasonal unit roots, fractional integration, coping with structural breaks, and multivariate time series models. It is enriched by numerous programming examples applied to artificial and real data, so it is ideally suited as an accompanying textbook for computer lab classes. The second edition adds a discussion of vector auto-regressive, structural vector auto-regressive, and structural vector error-correction models. To analyze the interactions between the investigated variables, impulse response functions and forecast error variance decompositions are introduced, as well as forecasting. The author explains how these model types relate to each other. Bernhard Pfaff studied economics at the universities of Göttingen, Germany; Davis, California; and Freiburg im Breisgau, Germany. He obtained a diploma and a doctorate degree at the economics department of the latter, where he was employed as a research and teaching assistant. He has worked for many years as an economist and quantitative analyst in research departments of financial institutions, and he is the author and maintainer of the contributed R packages “urca” and “vars.”
[110] Peter Dalgaard. Introductory Statistics with R. Springer, 2nd edition, 2008. ISBN 978-0-387-79053-4. [ bib | Discount Info | Publisher Info ]
This book provides an elementary-level introduction to R, targeting both non-statistician scientists in various fields and students of statistics. The main mode of presentation is via code examples with liberal commenting of the code and the output, from the computational as well as the statistical viewpoint. A supplementary R package can be downloaded and contains the data sets. The statistical methodology includes statistical standard distributions, one- and two-sample tests with continuous data, regression analysis, one- and two-way analysis of variance, analysis of tabular data, and sample size calculations. In addition, the last six chapters contain introductions to multiple linear regression analysis, linear models in general, logistic regression, survival analysis, Poisson regression, and nonlinear regression.
[111] Maria L. Rizzo. Statistical Computing with R. Chapman & Hall/CRC, Boca Raton, FL, 2008. ISBN 9781584885450. [ bib | Discount Info ]
This book covers the traditional core material of computational statistics, with an emphasis on using the R language via an examples-based approach. Suitable for an introductory course in computational statistics or for self-study, it includes R code for all examples and R notes to help explain the R programming concepts.
[112] Luke Keele. Semiparametric Regression for the Social Sciences. Wiley, Chichester, UK, 2008. ISBN 978-0470319918. [ bib | Publisher Info | ]
Smoothing methods have been little used within the social sciences. Semiparametric Regression for the Social Sciences sets out to address this situation by providing an accessible introduction to the subject, filled with examples drawn from the social and political sciences. Readers are introduced to the principles of nonparametric smoothing and to a wide variety of smoothing methods. The author also explains how smoothing methods can be incorporated into parametric linear and generalized linear models. The use of smoothers with these standard statistical models allows the estimation of more flexible functional forms whilst retaining the interpretability of parametric models. The full potential of these techniques is highlighted via the use of detailed empirical examples drawn from the social and political sciences. Each chapter features exercises to aid in the understanding of the methods and applications. All examples in the book were estimated in R. The book contains an appendix with R commands to introduce readers to estimating these models in R. All the R code for the examples in the book is available from the author's website and the publisher's website.
[113] Jonathan D. Cryer and Kung-Sik Chan. Time Series Analysis With Applications in R. Springer, New York, 2008. ISBN 978-0-387-75958-6. [ bib | Discount Info | Publisher Info ]
Time Series Analysis With Applications in R, Second Edition, presents an accessible approach to understanding time series models and their applications. Although the emphasis is on time domain ARIMA models and their analysis, the new edition devotes two chapters to the frequency domain and three to time series regression models, models for heteroscedasticity, and threshold models. All of the ideas and methods are illustrated with both real and simulated data sets. A unique feature of this edition is its integration with the R computing environment. The tables and graphical displays are accompanied by the R commands used to produce them. An extensive R package, TSA, which contains many new or revised R functions and all of the data used in the book, accompanies the written text. Script files of R commands for each chapter are available for download. There is also an extensive appendix in the book that leads the reader through the use of R commands and the new R package to carry out the analyses.
[114] John M. Chambers. Software for Data Analysis: Programming with R. Springer, New York, 2008. ISBN 978-0-387-75935-7. [ bib | Discount Info | Publisher Info | ]
This book covers the R version of S4 and other R techniques. It guides the reader in programming with R, from interactive use and writing simple functions to the design of R packages and intersystem interfaces.
[115] Hrishikesh D. Vinod. Hands-on Intermediate Econometrics Using R: Templates for Extending Dozens of Practical Examples. World Scientific, Hackensack, NJ, 2008. ISBN-10 981-281-885-5. DOI 10.1142/6895. [ bib | DOI ]
This book explains how to use R software to teach econometrics by providing interesting examples, using actual data applied to important policy issues. It helps readers choose the best method from a wide array of tools and packages available. The data used in the examples, along with R program snippets, illustrate the economic theory and sophisticated statistical methods extending the usual regression. The R program snippets are included on a CD accompanying the book. These are not merely given as black boxes, but include detailed comments which help the reader better understand the software steps and use them as templates for possible extension and modification. The book has received endorsements from top econometricians.
[116] G. P. Nason. Wavelet Methods in Statistics with R. Springer, New York, 2008. ISBN 978-0-387-75960-9. [ bib | Discount Info | Publisher Info ]
Wavelet methods have recently undergone a rapid period of development with important implications for a number of disciplines including statistics. This book fulfils three purposes. First, it is a gentle introduction to wavelets and their uses in statistics. Second, it acts as a quick and broad reference to many recent developments in the area. The book concentrates on describing the essential elements and provides comprehensive source material references. Third, the book intersperses R code that explains and demonstrates both wavelet and statistical methods. The code permits the user to learn the methods, to carry out their own analyses and further develop their own methods. The book is designed to be read in conjunction with WaveThresh4, the freeware R package for wavelets. The book introduces the wavelet transform by starting with the simple Haar wavelet transform and then builds to consider more general wavelets such as the Daubechies compactly supported series. The book then describes the evolution of wavelets in the directions of complex-valued wavelets, non-decimated transforms, multiple wavelets and wavelet packets as well as giving consideration to boundary conditions initialization. Later chapters explain the role of wavelets in nonparametric regression problems via a variety of techniques including thresholding, cross-validation, SURE, false-discovery rate and recent Bayesian methods, and also consider how to deal with correlated and non-Gaussian noise structures. The book also looks at how nondecimated and packet transforms can improve performance. The penultimate chapter considers the role of wavelets in both stationary and non-stationary time series analysis. The final chapter describes recent work concerning the role of wavelets for variance stabilization for non-Gaussian intensity estimation. 
The book is aimed at final year undergraduate and Masters students in a numerate discipline (such as mathematics, statistics, physics, economics and engineering) and would also suit as a quick reference for postgraduate or research level activity. The book would be ideal for a researcher to learn about wavelets, to learn how to use wavelet software and then to adapt the ideas for their own purposes.
[117] Clemens Reimann, Peter Filzmoser, Robert Garrett, and Rudolf Dutter. Statistical Data Analysis Explained: Applied Environmental Statistics with R. Wiley, Chichester, UK, 2008. ISBN 978-0-470-98581-6. [ bib | Publisher Info | ]
Few books on statistical data analysis in the natural sciences are written at a level that a non-statistician will easily understand. This is a book written in colloquial language, avoiding mathematical formulae as much as possible, trying to explain statistical methods using examples and graphics instead. To use the book efficiently, readers should have some computer experience. The book starts with the simplest of statistical concepts and carries readers forward to a deeper and more extensive understanding of the use of statistics in environmental sciences. The book concerns the application of statistical and other computer methods to the management, analysis and display of spatial data. These data are characterised by including locations (geographic coordinates), which leads to the necessity of using maps to display the data and the results of the statistical methods. Although the book uses examples from applied geochemistry, and a large geochemical survey in particular, the principles and ideas equally well apply to other natural sciences, e.g., environmental sciences, pedology, hydrology, geography, forestry, ecology, and health sciences/epidemiology. The book is unique because it supplies direct access to software solutions (based on R, the Open Source version of the S-language for statistics) for applied environmental statistics. Executable R scripts are provided for all graphics and tables presented in the book. In addition, a graphical user interface for R, called DAS+R, was developed for convenient, fast and interactive data analysis. Statistical Data Analysis Explained: Applied Environmental Statistics with R provides, on an accompanying website, the software to undertake all the procedures discussed, and the data employed for their description in the book.
[118] Julien Claude. Morphometrics with R. Springer, New York, 2008. ISBN 978-0-387-77789-4. [ bib | Discount Info | Publisher Info ]
Quantifying shape and size variation is essential in evolutionary biology and in many other disciplines. Since the “morphometric revolution of the 90s,” an increasing number of publications in applied and theoretical morphometrics has emerged in the new discipline of statistical shape analysis. The R language and environment offers a single platform to perform a multitude of analyses from the acquisition of data to the production of static and interactive graphs. This offers an ideal environment to analyze shape variation and shape change. This open-source language is accessible for novices and for experienced users. Adopting R gives the user and developer several advantages for performing morphometrics: evolvability, adaptability, interactivity, a single and comprehensive platform, possibility of interfacing with other languages and software, custom analyses, and graphs. The book explains how to use R for morphometrics and provides a series of examples of code and displays covering approaches ranging from traditional morphometrics to modern statistical shape analysis such as the analysis of landmark data, Thin Plate Splines, and Fourier analysis of outlines. The book fills two gaps: the gap between theoreticians and students by providing worked examples from the acquisition of data to analyses and hypothesis testing, and the gap between users and developers by providing and explaining code for performing all the steps necessary for morphometrics rather than providing a manual for a given software or package. Students and scientists interested in shape analysis can use the book as a reference for performing applied morphometrics, while prospective researchers will learn how to implement algorithms or interface R for new methods. In addition, adopting the R philosophy will enhance exchanges within and outside the morphometrics community. Julien Claude is an evolutionary biologist and palaeontologist at the University of Montpellier 2, where he received his Ph.D. in 2003.
He works on biodiversity and phenotypic evolution of a variety of organisms, especially vertebrates. He teaches evolutionary biology and biostatistics to undergraduate and graduate students and has developed several functions in R for the package APE.
[119] Christian Kleiber and Achim Zeileis. Applied Econometrics with R. Springer, New York, 2008. ISBN 978-0-387-77316-2. [ bib | Discount Info | Publisher Info ]
This is the first book on applied econometrics using the R system for statistical computing and graphics. It presents hands-on examples for a wide range of econometric models, from classical linear regression models for cross-section, time series or panel data and the common non-linear models of microeconometrics such as logit, probit and tobit models, to recent semiparametric extensions. In addition, it provides a chapter on programming, including simulations, optimization, and an introduction to R tools enabling reproducible econometric research. An R package accompanying this book, AER, is available from the Comprehensive R Archive Network (CRAN). It contains some 100 data sets taken from a wide variety of sources, the full source code for all examples used in the text plus further worked examples, e.g., from popular textbooks. The data sets are suitable for illustrating, among other things, the fitting of wage equations, growth regressions, hedonic regressions, dynamic regressions and time series models as well as models of labor force participation or the demand for health care. The goal of this book is to provide a guide to R for users with a background in economics or the social sciences. Readers are assumed to have a background in basic statistics and econometrics at the undergraduate level. A large number of examples should make the book of interest to graduate students, researchers and practitioners alike.
[120] Benjamin M. Bolker. Ecological Models and Data in R. Princeton University Press, 2008. ISBN 978-0-691-12522-0. [ bib | Publisher Info | ]
This book is a truly practical introduction to modern statistical methods for ecology. In step-by-step detail, the book teaches ecology graduate students and researchers everything they need to know in order to use maximum likelihood, information-theoretic, and Bayesian techniques to analyze their own data using the programming language R. The book shows how to choose among and construct statistical models for data, estimate their parameters and confidence limits, and interpret the results. The book also covers statistical frameworks, the philosophy of statistical modeling, and critical mathematical functions and probability distributions. It requires no programming background--only basic calculus and statistics.
[121] W. John Braun and Duncan J. Murdoch. A First Course in Statistical Programming with R. Cambridge University Press, Cambridge, 2007. ISBN 978-0521872652. [ bib | ]
This book introduces students to statistical programming, using R as a basis. Unlike other introductory books on the R system, this book emphasizes programming, including the principles that apply to most computing languages, and techniques used to develop more complex projects.
[122] Scott M. Lynch. Introduction to Applied Bayesian Statistics and Estimation for Social Scientists. Springer, New York, 2007. ISBN 978-0-387-71265-9. [ bib | Discount Info | Publisher Info ]
Introduction to Applied Bayesian Statistics and Estimation for Social Scientists covers the complete process of Bayesian statistical analysis in great detail from the development of a model through the process of making statistical inference. The key feature of this book is that it covers models that are most commonly used in social science research (including the linear regression model, generalized linear models, hierarchical models, and multivariate regression models), and it thoroughly develops each real-data example in painstaking detail.
[123] Sandrine Dudoit and Mark J. van der Laan. Multiple Testing Procedures and Applications to Genomics. Springer Series in Statistics. Springer, 2007. ISBN 978-0-387-49317-6. [ bib | Discount Info | Publisher Info ]
This book provides a detailed account of the theoretical foundations of proposed multiple testing methods and illustrates their application to a range of testing problems in genomics.
[124] Philip J. Boland. Statistical and Probabilistic Methods in Actuarial Science. Chapman & Hall/CRC, Boca Raton, FL, 2007. ISBN 9781584886952. [ bib | Discount Info | Publisher Info ]
This book covers many of the diverse methods in applied probability and statistics for students aspiring to careers in insurance, actuarial science, and finance. It presents an accessible, sound foundation in both the theory and applications of actuarial science. It encourages students to use the statistical software package R to check examples and solve problems.
[125] Michael Greenacre. Correspondence Analysis in Practice, Second Edition. Chapman & Hall/CRC, Boca Raton, FL, 2007. ISBN 9781584886167. [ bib | Publisher Info ]
This book shows how the versatile method of correspondence analysis (CA) can be used for data visualization in a wide variety of situations. This completely revised, up-to-date edition features a didactic approach with self-contained chapters, extensive marginal notes, informative figure and table captions, and end-of-chapter summaries. It includes a computational appendix that provides the R commands that correspond to most of the analyses featured in the book.
[126] John Maindonald and John Braun. Data Analysis and Graphics Using R. Cambridge University Press, Cambridge, 2nd edition, 2007. ISBN 978-0-521-86116-8. [ bib | Publisher Info | ]
Following a brief introduction to R, this book has extensive examples that illustrate practical data analysis using R. There is extensive advice on practical data analysis. Topics covered include exploratory data analysis, tests and confidence intervals, regression, generalized linear models, survival analysis, time series, multi-level models, trees and random forests, classification, and ordination.
[127] Jean-Michel Marin and Christian P. Robert. Bayesian Core: A Practical Approach to Computational Bayesian Statistics. Springer, New York, 2007. ISBN 978-0-387-38979-0. [ bib | Discount Info | Publisher Info ]
This Bayesian modeling book is intended for practitioners and applied statisticians looking for a self-contained entry to computational Bayesian statistics. Focusing on standard statistical models and backed up by discussed real datasets available from the book website, it provides an operational methodology for conducting Bayesian inference, rather than focusing on its theoretical justifications. Special attention is paid to the derivation of prior distributions in each case and specific reference solutions are given for each of the models. Similarly, computational details are worked out to lead the reader towards an effective programming of the methods given in the book. While R programs are provided on the book website and R hints are given in the computational sections of the book, The Bayesian Core requires no knowledge of the R language and it can be read and used with any other programming language.
[128] Dianne Cook and Deborah F. Swayne. Interactive and Dynamic Graphics for Data Analysis. Springer, New York, 2007. ISBN 978-0-387-71761-6. [ bib | Discount Info | Publisher Info ]
This richly illustrated book describes the use of interactive and dynamic graphics as part of multidimensional data analysis. Chapters include clustering, supervised classification, and working with missing values. A variety of plots and interaction methods are used in each analysis, often starting with brushing linked low-dimensional views and working up to manual manipulation of tours of several variables. The role of graphical methods is shown at each step of the analysis, not only in the early exploratory phase, but in the later stages, too, when comparing and evaluating models. All examples are based on freely available software: GGobi for interactive graphics and R for static graphics, modeling, and programming. The printed book is augmented by a wealth of material on the web, encouraging readers to follow the examples themselves. The web site has all the data and code necessary to reproduce the analyses in the book, along with movies demonstrating the examples.
[129] David Siegmund and Benjamin Yakir. The Statistics of Gene Mapping. Springer, New York, 2007. ISBN 978-0-387-49684-9. [ bib | Discount Info | Publisher Info ]
This book details the statistical concepts used in gene mapping, first in the experimental context of crosses of inbred lines and then in outbred populations, primarily humans. It presents elementary principles of probability and statistics, which are implemented by computational tools based on the R programming language to simulate genetic experiments and evaluate statistical analyses. Each chapter contains exercises, both theoretical and computational, some routine and others that are more challenging. The R programming language is developed in the text.
[130] Simon N. Wood. Generalized Additive Models: An Introduction with R. Chapman & Hall/CRC, Boca Raton, FL, 2006. ISBN 9781584884743. [ bib | ]
This book imparts a thorough understanding of the theory and practical applications of GAMs and related advanced models, enabling informed use of these very flexible tools. The author bases his approach on a framework of penalized regression splines, and builds a well-grounded foundation through motivating chapters on linear and generalized linear models. While firmly focused on the practical aspects of GAMs, discussions include fairly full explanations of the theory underlying the methods. The treatment is rich with practical examples, and it includes an entire chapter on the analysis of real data sets using R and the author's add-on package mgcv. Each chapter includes exercises, for which complete solutions are provided in an appendix.
[131] Robert H. Shumway and David S. Stoffer. Time Series Analysis and Its Applications With R Examples. Springer, New York, 2006. ISBN 978-0-387-29317-2. [ bib | Discount Info | Publisher Info ]
Time Series Analysis and Its Applications presents a balanced and comprehensive treatment of both time and frequency domain methods with accompanying theory. Numerous examples using non-trivial data illustrate solutions to problems such as evaluating pain perception experiments using magnetic resonance imaging or monitoring a nuclear test ban treaty. The book is designed to be useful as a text for graduate level students in the physical, biological and social sciences and as a graduate level text in statistics. Some parts may also serve as an undergraduate introductory course. Theory and methodology are separated to allow presentations on different levels. Material from the earlier 1988 Prentice-Hall text Applied Statistical Time Series Analysis has been updated by adding modern developments involving categorical time series analysis and the spectral envelope, multivariate spectral methods, long memory series, nonlinear models, longitudinal data analysis, resampling techniques, ARCH models, stochastic volatility, wavelets and Markov chain Monte Carlo integration methods. These add to a classical coverage of time series regression, univariate and multivariate ARIMA models, spectral analysis and state-space models. The book is complemented by offering accessibility, via the World Wide Web, to the data and an exploratory time series analysis program ASTSA for Windows that can be downloaded as Freeware.
[132] Peter J. Diggle and Paulo Justiniano Ribeiro. Model-based Geostatistics. Springer, 2006. ISBN 978-0-387-48536-2. [ bib | Discount Info | Publisher Info ]
Geostatistics is concerned with estimation and prediction problems for spatially continuous phenomena, using data obtained at a limited number of spatial locations. The name reflects its origins in mineral exploration, but the methods are now used in a wide range of settings including public health and the physical and environmental sciences. Model-based geostatistics refers to the application of general statistical principles of modeling and inference to geostatistical problems. This volume is the first book-length treatment of model-based geostatistics.
[133] Nhu D. Le and James V. Zidek. Statistical Analysis of Environmental Space-Time Processes. Springer, 2006. ISBN 978-0-387-35429-3. [ bib | Discount Info | Publisher Info ]
This book provides a broad introduction to the subject of environmental space-time processes, addressing the role of uncertainty. It covers a spectrum of technical matters from measurement to environmental epidemiology to risk assessment. It showcases non-stationary vector-valued processes, while treating stationarity as a special case. In particular, the authors and members of their research group developed, within a hierarchical Bayesian framework, the new statistical approaches presented in the book for analyzing, modeling, and monitoring environmental spatio-temporal processes. Furthermore, they indicate new directions for development.
[134] Lothar Sachs and Jürgen Hedderich. Angewandte Statistik. Methodensammlung mit R. Springer, Berlin, Heidelberg, 12th (completely revised) edition, 2006. ISBN 978-3-540-32160-6. [ bib | Publisher Info ]
Today the application of statistical methods is generally supported by computers. The program R is an easy-to-learn and flexible tool with which the process of data analysis can be understood and shaped in a reproducible way. This 12th, completely revised edition illustrates the use and benefits of the program through numerous examples worked through in R. It explains statistical approaches and gives students, lecturers, and practitioners with varying levels of prior knowledge, in an accessible, vivid, and practical manner, the details they need to collect, describe, and assess data. In addition to guidance on planning and evaluating studies, many examples, cross-references, and a detailed subject index provide targeted access to statistics, particularly for physicians, engineers, and natural scientists.
[135] Julian J. Faraway. Extending Linear Models with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models. Chapman & Hall/CRC, Boca Raton, FL, 2006. ISBN 9781584884248. [ bib | Publisher Info | ]
This book surveys the techniques that grow from the regression model, presenting three extensions to that framework: generalized linear models (GLMs), mixed effect models, and nonparametric regression models. The author's treatment is thoroughly modern and covers topics that include GLM diagnostics, generalized linear mixed models, trees, and even the use of neural networks in statistics. To demonstrate the interplay of theory and practice, throughout the book the author weaves the use of the R software environment to analyze the data of real examples, providing all of the R commands necessary to reproduce the analyses.
[136] Jana Jureckova and Jan Picek. Robust Statistical Methods with R. Chapman & Hall/CRC, Boca Raton, FL, 2006. ISBN 9781584884545. [ bib | Discount Info ]
This book provides a systematic treatment of robust procedures with an emphasis on practical application. The authors work from underlying mathematical tools to implementation, paying special attention to the computational aspects. They cover the whole range of robust methods, including differentiable statistical functions, distance of measures, influence functions, and asymptotic distributions, in a rigorous yet approachable manner. Highlighting hands-on problem solving, many examples and computational algorithms using the R software supplement the discussion. The book examines the characteristics of robustness, estimators of a real parameter, large sample properties, and goodness-of-fit tests. It also includes a brief overview of R in an appendix for those with little experience using the software.
[137] Emmanuel Paradis. Analysis of Phylogenetics and Evolution with R. Use R. Springer, New York, 2006. ISBN 978-1-4614-1743-9. [ bib | Discount Info | Publisher Info ]
This book integrates a wide variety of data analysis methods into a single and flexible interface: the R language. This open source language is available for a wide range of computer systems and has been adopted as a computational environment by many authors of statistical software. Adopting R as a main tool for phylogenetic analyses eases the workflow in biologists' data analyses, ensures greater scientific repeatability, and enhances the exchange of ideas and methodological developments.
[138] Brian Everitt and Torsten Hothorn. A Handbook of Statistical Analyses Using R. Chapman & Hall/CRC, Boca Raton, FL, 2006. ISBN 1-584-88539-4. [ bib | Discount Info | ]
With emphasis on the use of R and the interpretation of results rather than the theory behind the methods, this book addresses particular statistical techniques and demonstrates how they can be applied to one or more data sets using R. The authors provide a concise introduction to R, including a summary of its most important features. They cover a variety of topics, such as simple inference, generalized linear models, multilevel models, longitudinal data, cluster analysis, principal components analysis, and discriminant analysis. With numerous figures and exercises, A Handbook of Statistical Analyses Using R provides useful information for students as well as statisticians and data analysts.
[139] Richard C. Deonier, Simon Tavaré, and Michael S. Waterman. Computational Genome Analysis: An Introduction. Springer, 2005. ISBN 978-0-387-28807-9. [ bib | Discount Info | Publisher Info ]
Computational Genome Analysis: An Introduction presents the foundations of key problems in computational molecular biology and bioinformatics. It focuses on computational and statistical principles applied to genomes, and introduces the mathematics and statistics that are crucial for understanding these applications. All computations are done with R.
[140] Paul Murrell. R Graphics. Chapman & Hall/CRC, Boca Raton, FL, 2005. ISBN 9781584884866. [ bib | Discount Info | Publisher Info | ]
A description of the core graphics features of R including: a brief introduction to R; an introduction to general R graphics features. The “base” graphics system of R: traditional S graphics. The power and flexibility of grid graphics. Building on top of the base or grid graphics: Trellis graphics and developing new graphics functions.
[141] Michael J. Crawley. Statistics: An Introduction using R. Wiley, 2005. ISBN 0-470-02297-3. [ bib | ]
The book is primarily aimed at undergraduate students in medicine, engineering, economics and biology, but will also appeal to postgraduates who have not previously covered this area, or wish to switch to using R.
[142] John Verzani. Using R for Introductory Statistics. Chapman & Hall/CRC, Boca Raton, FL, 2005. ISBN 9781584884507. [ bib | Publisher Info ]
There are few books covering introductory statistics using R, and this book fills a gap as a true “beginner” book. With emphasis on data analysis and practical examples, ‘Using R for Introductory Statistics’ encourages understanding rather than focusing on learning the underlying theory. It includes a large collection of exercises and numerous practical examples from a broad range of scientific disciplines. It comes complete with an online resource containing datasets, R functions, selected solutions to exercises, and updates to the latest features. A full solutions manual is available from Chapman & Hall/CRC.
[143] Fionn Murtagh. Correspondence Analysis and Data Coding with JAVA and R. Chapman & Hall/CRC, Boca Raton, FL, 2005. ISBN 9781420034943. [ bib | Discount Info | Publisher Info | ]
This book provides an introduction to methods and applications of correspondence analysis, with an emphasis on data coding, the first step in correspondence analysis. It features a practical presentation of the theory with a range of applications from data mining, financial engineering, and the biosciences. Implementation of the methods is presented using JAVA and R software.
[144] Brian S. Everitt. An R and S-Plus Companion to Multivariate Analysis. Springer, 2005. ISBN 978-1-84628-124-2. [ bib | Discount Info | ]
In this book the core multivariate methodology is covered along with some basic theory for each method described. The necessary R and S-Plus code is given for each analysis in the book, with any differences between the two highlighted.
[145] Andreas Behr. Einführung in die Statistik mit R. WiSo Kurzlehrbücher. Vahlen, München, 2005. ISBN 3-8006-3219-5. In German. [ bib ]
[146] Robert Gentleman, Vince Carey, Wolfgang Huber, Rafael Irizarry, and Sandrine Dudoit, editors. Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Statistics for Biology and Health. Springer, 2005. ISBN 978-0-387-29362-2. [ bib | Discount Info | Publisher Info ]
This volume's coverage is broad and ranges across most of the key capabilities of the Bioconductor project, including importation and preprocessing of high-throughput data from microarray, proteomic, and flow cytometry platforms.
[147] S. Mase, T. Kamakura, M. Jimbo, and K. Kanefuji. Introduction to Data Science for engineers: Data analysis using free statistical software R (in Japanese). Suuri-Kogaku-sha, Tokyo, April 2004. ISBN 4901683128. [ bib ]
[148] Richard M. Heiberger and Burt Holland. Statistical Analysis and Data Display: An Intermediate Course with Examples in S-Plus, R, and SAS. Springer Texts in Statistics. Springer, 2004. ISBN 978-1-4757-4284-8. [ bib | Discount Info | Publisher Info | ]
A contemporary presentation of statistical methods featuring 200 graphical displays for exploring data and displaying analyses. Many of the displays appear here for the first time. Discusses construction and interpretation of graphs, principles of graphical design, and relation between graphs and traditional tabular results. Can serve as a graduate-level standalone statistics text and as a reference book for researchers. In-depth discussions of regression analysis, analysis of variance, and design of experiments are followed by introductions to analysis of discrete bivariate data, nonparametrics, logistic regression, and ARIMA time series modeling. Concepts and techniques are illustrated with a variety of case studies. S-Plus, R, and SAS executable functions are provided and discussed. S functions are provided for each new graphical display format. All code, transcript and figure files are provided for readers to use as templates for their own analyses.
[149] Julian J. Faraway. Linear Models with R. Chapman & Hall/CRC, Boca Raton, FL, 2004. ISBN 9781584884255. [ bib | Publisher Info | ]
The book focuses on the practice of regression and analysis of variance. It clearly demonstrates the different methods available and in which situations each one applies. It covers all of the standard topics, from the basics of estimation to missing data, factorial designs, and block designs, but it also includes discussion of topics, such as model uncertainty, rarely addressed in books of this type. The presentation incorporates an abundance of examples that clarify both the use of each technique and the conclusions one can draw from the results.
[150] Dubravko Dolic. Statistik mit R. Einführung für Wirtschafts- und Sozialwissenschaftler. R. Oldenbourg, München, Wien, 2004. ISBN 3-486-27537-2. In German. [ bib ]
[151] Sylvie Huet, Annie Bouvier, Marie-Anne Gruet, and Emmanuel Jolivet. Statistical Tools for Nonlinear Regression. Springer, New York, 2003. ISBN 978-0-387-21574-7. [ bib | Discount Info | Publisher Info ]
[152] Stefano Iacus and Guido Masarotto. Laboratorio di statistica con R. McGraw-Hill, Milano, 2003. ISBN 88-386-6084-0. [ bib ]
[153] Giovanni Parmigiani, Elizabeth S. Garrett, Rafael A. Irizarry, and Scott L. Zeger. The Analysis of Gene Expression Data. Springer, New York, 2003. ISBN 978-0-387-21679-9. [ bib | Discount Info | Publisher Info ]
[154] William N. Venables and Brian D. Ripley. Modern Applied Statistics with S. Fourth Edition. Springer, New York, 2002. ISBN 978-0-387-21706-2. [ bib | Discount Info | Publisher Info | ]
A highly recommended book on how to do statistical data analysis using R or S-Plus. In the first chapters it gives an introduction to the S language. Then it covers a wide range of statistical methodology, including linear and generalized linear models, non-linear and smooth regression, tree-based methods, random and mixed effects, exploratory multivariate analysis, classification, survival analysis, time series analysis, spatial statistics, and optimization. The ‘on-line complements’ available at the book's homepage provide updates of the book, as well as further details of technical material.
[155] John Fox. An R and S-Plus Companion to Applied Regression. Sage Publications, Thousand Oaks, CA, USA, 2002. ISBN 0-761-92279-2. [ bib | ]
A companion book to a text or course on applied regression (such as “Applied Regression, Linear Models, and Related Methods” by the same author). It introduces S, and concentrates on how to use linear and generalized-linear models in S while assuming familiarity with the statistical methodology.
[156] Manuel Castejón Limas, Joaquín Ordieres Meré, Fco. Javier de Cos Juez, and Fco. Javier Martínez de Pisón Ascacibar. Control de Calidad. Metodologia para el analisis previo a la modelización de datos en procesos industriales. Fundamentos teóricos y aplicaciones con R. Servicio de Publicaciones de la Universidad de La Rioja, 2001. ISBN 84-95301-48-2. [ bib ]
This book, written in Spanish, is oriented to researchers interested in applying multivariate analysis techniques to real processes. It combines the theoretical basis with applied examples coded in R.
[157] Frank E. Harrell. Regression Modeling Strategies, with Applications to Linear Models, Survival Analysis and Logistic Regression. Springer, 2001. ISBN 978-3-319-19425-7. [ bib | Discount Info | Publisher Info | ]
There are many books that are excellent sources of knowledge about individual statistical tools (survival models, general linear models, etc.), but the art of data analysis is about choosing and using multiple tools. In the words of Chatfield “... students typically know the technical details of regression for example, but not necessarily when and how to apply it. This argues the need for a better balance in the literature and in statistical teaching between techniques and problem solving strategies.” Whether analyzing risk factors, adjusting for biases in observational studies, or developing predictive models, there are common problems that few regression texts address. For example, there are missing data in the majority of datasets one is likely to encounter (other than those used in textbooks!) but most regression texts do not include methods for dealing with such data effectively, and texts on missing data do not cover regression modeling.
[158] William N. Venables and Brian D. Ripley. S Programming. Springer, New York, 2000. ISBN 978-0-387-21856-4. [ bib | Discount Info | Publisher Info | ]
This provides an in-depth guide to writing software in the S language which forms the basis of both the commercial S-Plus and the Open Source R data analysis software systems.
[159] Terry M. Therneau and Patricia M. Grambsch. Modeling Survival Data: Extending the Cox Model. Statistics for Biology and Health. Springer, 2000. ISBN 978-1-4757-3294-8. [ bib | Discount Info | Publisher Info ]
This is a book for statistical practitioners, particularly those who design and analyze studies for survival and event history data. Its goal is to extend the toolkit beyond the basic triad provided by most statistical packages: the Kaplan-Meier estimator, log-rank test, and Cox regression model.
[160] Jose C. Pinheiro and Douglas M. Bates. Mixed-Effects Models in S and S-Plus. Springer, 2000. ISBN 978-0-387-22747-4. [ bib | Discount Info | Publisher Info ]
A comprehensive guide to the use of the ‘nlme’ package for linear and nonlinear mixed-effects models.
[161] Deborah Nolan and Terry Speed. Stat Labs: Mathematical Statistics Through Applications. Springer Texts in Statistics. Springer, 2000. ISBN 978-0-387-22743-6. [ bib | Discount Info | Publisher Info | ]
Integrates theory of statistics with the practice of statistics through a collection of case studies (“labs”), and uses R to analyze the data.
[162] John M. Chambers. Programming with Data. Springer, New York, 1998. ISBN 978-0-387-98503-9. [ bib | Discount Info | Publisher Info ]
This “Green Book” describes version 4 of S, a major revision of S designed by John Chambers to improve its usefulness at every stage of the programming process.
[163] John M. Chambers and Trevor J. Hastie. Statistical Models in S. Chapman & Hall, London, 1992. ISBN 9780412830402. [ bib | Discount Info | Publisher Info ]
This is also called the “White Book”. It described software for statistical modeling in S and introduced the S3 version of classes and methods.
[164] Richard A. Becker, John M. Chambers, and Allan R. Wilks. The New S Language. Chapman & Hall, London, 1988. [ bib ]
This book is often called the “Blue Book”, and introduced what is now known as S version 3, or S3.
