Topic 4: Statistics and Probability
Concepts
Essential understandings:
Statistics is concerned with the collection, analysis and interpretation of quantitative data and uses the theory of probability to estimate parameters, discover empirical laws, test hypotheses and predict the occurrence of events. Statistical representations and measures allow us to represent data in many different forms to aid interpretation.
Probability enables us to quantify the likelihood of events occurring and so to evaluate risk, make informed choices and make predictions about seemingly random events. Both statistics and probability provide important representations which enable us to make predictions, valid comparisons and informed decisions. These fields have power and limitations, and should be applied with care and questioned critically and in detail, so as to differentiate between the theoretical and the empirical (observed).
Suggested concepts embedded in this topic:
Quantity, validity, approximation, modelling, relationships, patterns.
AHL: Systems, representation.
Content-specific conceptual understandings:
Organizing, representing, analysing and interpreting data, and utilizing different statistical tools, facilitate prediction and the drawing of conclusions.
Different statistical techniques require justification and the identification of their limitations and validity.
Approximation in data can approach the truth but may not always achieve it.
Correlation and regression are powerful tools for identifying patterns and equivalence of systems.
Modelling and finding structure in seemingly random events facilitate prediction.
Different probability distributions provide a representation of the relationship between the theory and reality, allowing us to make predictions about what might happen.
AHL: Statistical literacy involves identifying reliability and validity of samples and whole populations in a closed system.
AHL: A systematic approach to hypothesis testing allows statistical inferences to be tested for validity.
AHL: Representation of probabilities using transition matrices enables us to efficiently predict long-term behaviour and outcomes.
SL Content
SL 4.1
Concepts of population, sample, random sample, discrete and continuous data.
Reliability of data sources and bias in sampling.
Interpretation of outliers.
Sampling techniques and their effectiveness.
SL 4.2
Presentation of data (discrete and continuous): frequency distributions (tables).
Histograms.
Cumulative frequency; cumulative frequency graphs; use to find median, quartiles, percentiles, range and interquartile range (IQR).
Production and understanding of box and whisker diagrams.
SL 4.3
Measures of central tendency (mean, median and mode).
Estimation of mean from grouped data.
Modal class.
Measures of dispersion (interquartile range, standard deviation and variance).
Effect of constant changes on the original data.
Quartiles of discrete data.
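As an illustration of the SL 4.3 items, the sketch below estimates a mean from grouped data using mid-interval values, picks out the modal class, and checks the effect of adding or multiplying by a constant. The class boundaries, frequencies and data values are hypothetical, and the modal class is chosen by highest frequency, which assumes equal class widths.

```python
from statistics import mean, pstdev

# Hypothetical grouped data: (lower bound, upper bound, frequency)
classes = [(0, 10, 4), (10, 20, 10), (20, 30, 6)]

def grouped_mean(classes):
    """Estimate the mean using each class's mid-interval value."""
    total = sum(f for _, _, f in classes)
    return sum((lo + hi) / 2 * f for lo, hi, f in classes) / total

def modal_class(classes):
    """Return the bounds of the class with the highest frequency
    (assumes all classes have equal width)."""
    lo, hi, _ = max(classes, key=lambda c: c[2])
    return lo, hi

# Hypothetical raw data for the constant-change check
data = [2, 4, 4, 6, 9]
shifted = [x + 3 for x in data]   # adding a constant shifts the mean only
scaled = [2 * x for x in data]    # multiplying scales mean and spread

print(grouped_mean(classes), modal_class(classes))
print(mean(data), mean(shifted), mean(scaled))
print(pstdev(data), pstdev(shifted), pstdev(scaled))
```

Adding 3 leaves the standard deviation unchanged, while multiplying by 2 doubles both the mean and the standard deviation (and so multiplies the variance by 4).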
SL 4.4
Linear correlation of bivariate data.
Scatter diagrams; lines of best fit, by eye, passing through the mean point.
Use of the equation of the regression line for prediction purposes.
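The regression line of y on x can be sketched as follows. This is a minimal least squares calculation on hypothetical paired data, not a substitute for the technology students are expected to use; the function name `regression_line` is an illustrative choice.

```python
from statistics import mean

def regression_line(xs, ys):
    """Least squares line y = a*x + b for paired data."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    a = sxy / sxx            # gradient
    b = y_bar - a * x_bar    # intercept, so the line passes through the mean point
    return a, b

# Hypothetical bivariate data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = regression_line(xs, ys)
predicted = a * 3.5 + b      # prediction inside the data range (interpolation)
print(a, b, predicted)
```

The line passes through the mean point by construction, and predictions are most trustworthy for x-values within the range of the data.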
SL 4.5
Concepts of trial, outcome, equally likely outcomes, relative frequency, sample space (U) and event.
Expected number of occurrences.
SL 4.6
Use of Venn diagrams, tree diagrams, sample space diagrams and tables of outcomes to calculate probabilities.
SL 4.7
Concept of discrete random variables and their probability distributions.
Applications.
SL 4.8
Binomial distribution.
Mean and variance of the binomial distribution.
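A short numerical check of the binomial results: the sketch below builds the distribution of X ~ B(n, p) from the probability formula and confirms that its mean and variance agree with np and np(1 - p). The values n = 10 and p = 0.3 are hypothetical.

```python
from math import comb

def binomial_pmf(n, p, k):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3   # hypothetical parameters
pmf = [binomial_pmf(n, p, k) for k in range(n + 1)]

# Mean and variance computed directly from the distribution
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(mean, n * p)                 # both approximately 3
print(variance, n * p * (1 - p))   # both approximately 2.1
```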
SL 4.9
The normal distribution and curve.
Properties of the normal distribution.
Diagrammatic representation.
Normal probability calculations.
Inverse normal calculations.
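Normal and inverse normal calculations are usually done with technology. One possible sketch uses `NormalDist` from Python's standard `statistics` module; the mean of 100 and standard deviation of 15 are hypothetical.

```python
from statistics import NormalDist

# Hypothetical distribution X ~ N(100, 15^2)
X = NormalDist(mu=100, sigma=15)

# Normal probability calculation: P(85 < X < 115),
# i.e. within one standard deviation of the mean
p_between = X.cdf(115) - X.cdf(85)

# Inverse normal calculation: the value x with P(X < x) = 0.90
x_90 = X.inv_cdf(0.90)

print(round(p_between, 4), round(x_90, 2))
```

The first result is close to 0.68, matching the property that roughly 68% of a normal distribution lies within one standard deviation of the mean.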
SL 4.10
Awareness of the appropriateness and limitations of Pearson’s product moment correlation coefficient and Spearman’s rank correlation coefficient, and the effect of outliers on each.
SL 4.11
Expected and observed frequencies.
Using one-tailed and two-tailed tests.
AHL Content
AHL 4.12
Design of valid data collection methods, such as surveys and questionnaires.
Selecting relevant variables from many variables.
Choosing relevant and appropriate data to analyse.
Definition of reliability and validity.
Reliability tests.
Validity tests.
AHL 4.13
Non-linear regression.
Evaluation of least squares regression curves using technology.
AHL 4.14
Linear transformation of a single random variable.
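For a linear transformation of a random variable, E(aX + b) = aE(X) + b and Var(aX + b) = a²Var(X). The sketch below verifies both results numerically for a hypothetical discrete distribution; the helper names are illustrative.

```python
def expectation(dist):
    """E(X) for a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(X) = E(X^2) - [E(X)]^2."""
    return sum(x**2 * p for x, p in dist.items()) - expectation(dist) ** 2

# Hypothetical distribution of X
dist = {0: 0.2, 1: 0.5, 2: 0.3}

a, b = 3, 2
transformed = {a * x + b: p for x, p in dist.items()}   # distribution of aX + b

print(expectation(transformed), a * expectation(dist) + b)
print(variance(transformed), a**2 * variance(dist))
```

Note that the constant b shifts the expectation but has no effect on the variance.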
AHL 4.15
Central limit theorem.
AHL 4.16
Confidence intervals for the mean of a normal population.
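As a sketch of one case only, the code below computes a confidence interval for the mean when the population standard deviation is assumed known, using the normal distribution. When the standard deviation is estimated from the sample, a t-interval would be used instead. The function name `z_interval` and the sample figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, level=0.95):
    """Confidence interval for a normal population mean, sigma known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # about 1.96 for 95%
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample: mean 50 from n = 64 observations, sigma = 8
low, high = z_interval(50, 8, 64)
print(round(low, 2), round(high, 2))
```

Raising the confidence level widens the interval, while increasing the sample size narrows it.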
AHL 4.17
Poisson distribution, its mean and variance.
Sum of two independent Poisson distributions has a Poisson distribution.
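The result on sums can be checked numerically: if X ~ Po(λ₁) and Y ~ Po(λ₂) are independent, then X + Y ~ Po(λ₁ + λ₂). The sketch below computes P(X + Y = k) by summing over the ways of splitting k and compares it with the Poisson formula; the rates 2 and 3 are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(lam, k):
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam**k / factorial(k)

def pmf_of_sum(lam1, lam2, k):
    """P(X + Y = k) for independent X ~ Po(lam1), Y ~ Po(lam2),
    found by summing P(X = i) * P(Y = k - i) over i."""
    return sum(poisson_pmf(lam1, i) * poisson_pmf(lam2, k - i)
               for i in range(k + 1))

# Hypothetical rates
for k in range(5):
    print(k, pmf_of_sum(2, 3, k), poisson_pmf(5, k))
```

The two columns agree up to floating-point rounding, which relies on X and Y being independent.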
AHL 4.18
Critical values and critical regions.
Test for population mean for normal distribution.
Test for proportion using binomial distribution.
Test for population mean using Poisson distribution.
Type I and II errors including calculations of their probabilities.
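Type I and Type II error probabilities can be calculated directly once a critical region is fixed. The sketch below uses a hypothetical one-tailed test for a proportion, H₀: p = 0.5 against H₁: p > 0.5 with n = 20 trials and critical region X ≥ 15. The alternative value p = 0.7 used for the Type II error is also an assumption.

```python
from math import comb

def binomial_cdf(n, p, k):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

n = 20
critical_value = 15            # reject H0 when X >= 15 (hypothetical)
p_null, p_true = 0.5, 0.7      # H0 value and an assumed true value

# Type I error: rejecting H0 when it is true
alpha = 1 - binomial_cdf(n, p_null, critical_value - 1)

# Type II error: failing to reject H0 when p is actually p_true
beta = binomial_cdf(n, p_true, critical_value - 1)

print(round(alpha, 4), round(beta, 4))
```

Moving the critical value changes both probabilities in opposite directions: a smaller critical region lowers the Type I error probability but raises the Type II error probability.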
AHL 4.19
Transition matrices.
Powers of transition matrices.
Regular Markov chains.
Initial state probability matrices.
Calculation of steady state and long-term probabilities by repeated multiplication of the transition matrix or by solving a system of linear equations.
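Finding the steady state by repeated multiplication can be sketched as follows. The two-state transition matrix is hypothetical, and the code assumes the convention in which each column sums to 1 and the state is a column vector, so that the next state is T multiplied by the current state.

```python
def mat_vec(T, s):
    """Multiply matrix T (list of rows) by column vector s."""
    return [sum(T[i][j] * s[j] for j in range(len(s))) for i in range(len(T))]

# Hypothetical transition matrix; each column sums to 1
T = [[0.9, 0.5],
     [0.1, 0.5]]

s = [1.0, 0.0]          # initial state probability vector
for _ in range(100):    # repeated multiplication: s_(n+1) = T s_n
    s = mat_vec(T, s)

print(s)
```

Solving the linear system Ts = s together with the condition that the entries of s sum to 1 gives the exact steady state (5/6, 1/6), which the repeated multiplication approaches.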