About this book

We believe that many foundational ideas of Probability and Statistics are best understood when their natural connection is emphasised. We feel that the interested student should learn, in an inclusive manner, the mathematical rigour of Probability, the motivating examples and techniques from Statistics, and an instructive technology for performing computations relating to both. These were our main motivations for writing this book. We have chosen the R software environment to demonstrate one available computational tool.

The book is intended to be an undergraduate text for a course on Probability Theory. We had in mind courses such as the one-year (two-semester) Probability course at many universities in India, such as the Indian Statistical Institute or Chennai Mathematical Institute, or a one-semester (or two-quarter) Probability course as is commonly offered as an upper-division, post-calculus elective at many North American universities. The Statistics material and the package R are introduced so as to emphasise motivations and applications of the probabilistic material. We assume that our readers are well-versed in calculus, have a basic understanding of the theory of sets and functions, combinatorics, and proof techniques, and have at least a passing awareness of the distinction between countable and uncountable infinities. We do not assume any particular experience of Linear Algebra or Real Analysis.

Using this book

The book is a work in progress. We are making a draft version of the book available for comments and feedback, which you may send by email to any of the authors listed below. The detailed contents and direct links to each chapter are given below. You are free to use this book for educational purposes.

Suggested BibTeX citation:

@misc{AST-2016,
      AUTHOR = {Siva Athreya and Deepayan Sarkar and Steve Tanner},
      TITLE = {Probability and Statistics with Examples using R},
      YEAR = {2024},
      NOTE = {Unfinished Book, Last Compilation November 18th, 2024, available 
      at \url{https://psweur.github.io}}}

Chapters

  • Table of contents

  • Preface

  • Chapter 1: Basic Concepts

    • 1.1 Definitions and Properties
      • 1.1.1 Definitions
      • 1.1.2 Basic Properties
    • 1.2 Equally Likely Outcomes
    • 1.3 Conditional Probability and Bayes' Theorem
    • 1.4 Bayes' Theorem
    • 1.5 Independence
    • 1.6 Using R for computation
  • Chapter 2: Sampling and Repeated Trials

    • 2.1 Bernoulli Trials
      • 2.1.1 Using R to compute probabilities
    • 2.2 Poisson Approximation
    • 2.3 Sampling With and Without Replacement
      • 2.3.1 The Hypergeometric Distribution
      • 2.3.2 Hypergeometric Distributions as a Series of Dependent Trials
      • 2.3.3 Binomial Approximation to the Hypergeometric Distribution
  • Chapter 3: Discrete Random Variables

    • 3.1 Random Variables as Functions
      • 3.1.1 Common Distributions
    • 3.2 Independent and Dependent Variables
      • 3.2.1 Independent Variables
      • 3.2.2 Conditional, Joint, and Marginal Distributions
      • 3.2.3 Memoryless Property of the Geometric Random Variable
      • 3.2.4 Multinomial Distributions
    • 3.3 Functions of Random Variables
      • 3.3.1 Distribution of $f(X)$ and $f(X_1, X_2, \dots , X_n)$
      • 3.3.2 Functions and Independence
  • Chapter 4: Summarizing Discrete Random Variables

    • 4.1 Expected Value
      • 4.1.1 Properties of the Expected Value
      • 4.1.2 Expected Value of a Product
      • 4.1.3 Expected Values of Common Distributions
      • 4.1.4 Expected Value of $f(X_1, X_2, \dots , X_n)$
    • 4.2 Variance and Standard Deviation
      • 4.2.1 Properties of Variance and Standard Deviation
      • 4.2.2 Variances of Common Distributions
      • 4.2.3 Standardized Variables
    • 4.3 Standard Units
      • 4.3.1 Markov and Chebyshev Inequalities
    • 4.4 Conditional Expectation and Conditional Variance
    • 4.5 Covariance and Correlation
      • 4.5.1 Covariance
      • 4.5.2 Correlation
    • 4.6 Exchangeable Random Variables
  • Chapter 5: Continuous Probabilities and Random Variables

    • 5.1 Uncountable Sample Spaces and Densities
      • 5.1.1 Probability Densities on $\mathbb R$
    • 5.2 Continuous Random Variables
      • 5.2.1 Common Distributions
      • 5.2.2 A word about individual outcomes
    • 5.3 Transformation of Continuous Random Variables
    • 5.4 Multiple Continuous Random Variables
      • 5.4.1 Marginal Distributions
      • 5.4.2 Independence
      • 5.4.3 Conditional Density
    • 5.5 Functions of Independent Random Variables
      • 5.5.1 Distributions of Sums of Independent Random Variables
      • 5.5.2 Distributions of Quotients of Independent Random Variables
  • Chapter 6: Summarising Continuous Random Variables

    • 6.1 Expectation and Variance
    • 6.2 Covariance, Correlation, Conditional Expectation and Conditional Variance
    • 6.3 Moment Generating Functions
    • 6.4 Bivariate Normals
  • Chapter 7: Sampling and Descriptive Statistics

    • 7.1 The empirical distribution
    • 7.2 Descriptive Statistics
      • 7.2.1 Sample Mean
      • 7.2.2 Sample Variance
      • 7.2.3 Sample proportion
    • 7.3 Simulation
    • 7.4 Plots
      • 7.4.1 Empirical Distribution Plot for Discrete Distributions
      • 7.4.2 Histograms for Continuous Distributions
      • 7.4.3 Hanging Rootograms for Comparing with Theoretical Distributions
      • 7.4.4 Q-Q Plots for Continuous Distributions
  • Chapter 8: Sampling Distributions and Limit Theorems

    • 8.1 Multi-dimensional continuous random variables
      • 8.1.1 Order Statistics and their Distributions
    • 8.2 Distribution of Sampling Statistics from a Normal population
    • 8.3 Weak Law of Large Numbers
    • 8.4 Convergence in Distribution
    • 8.5 Central Limit Theorem
    • 8.6 Normal Approximation and Continuity Correction
  • Chapter 9: Estimation

    • 9.1 Method of Moments
    • 9.2 Maximum Likelihood
    • 9.3 Confidence Intervals
      • 9.3.1 Pivotal Quantity Approach
      • 9.3.2 Empirical Coverage Probability of Confidence Intervals
      • 9.3.3 Approximate Confidence Intervals using CLT
      • 9.3.4 Confidence Intervals for the Population Median
  • Chapter 10: Hypothesis Testing

    • 10.1 Introduction
    • 10.2 The Goodness of Fit Problem in the Multinomial Model
    • 10.3 Independence of Two Categorical Attributes
    • 10.4 Testing in the Parametric Setup: The Intuitive Approach
    • 10.5 The General Approach: Likelihood Ratio Test
    • 10.6 Specific Examples
    • 10.7 Testing for Goodness of Fit
    • 10.8 Testing for Independence of Categorical Attributes
  • Appendix A: Some Mathematical Details

    • A.1 Transformation of Continuous Random Variables - Jacobian
    • A.2 Strong Law of Large Numbers
  • Appendix B: Tables

Contact

Siva Athreya
Survey No. 151 Shivakote
Hesaraghatta Hobli
Bengaluru, 560089.
Email: athreya@icts.res.in

Deepayan Sarkar
Indian Statistical Institute
7 S.J.S. Sansanwal Marg
New Delhi, 110016
Email: deepayan@isid.ac.in

Steve Tanner
Eastern Oregon University
One University Boulevard
La Grande, OR 97850-2807
Email: stanner@eou.edu