the sum of all the values divided by the number of values
A segment or Ray that joins a vertex to the midpoint of the opposite side
the most common value
The difference between the greatest number and the least number in a set of data.
Measures of Central Tendency
used to describe the center of a set of data: mean, median, mode
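As a quick illustration of the three measures of center, here is a minimal sketch using Python's standard `statistics` module. The `scores` list is made-up data, not taken from these notes.

```python
from statistics import mean, median, multimode

# Hypothetical quiz scores, chosen only for illustration.
scores = [2, 3, 3, 5, 7, 10]

center_mean = mean(scores)        # sum of the values / number of values
center_median = median(scores)    # middle of the sorted data (average of the two middle values here)
center_modes = multimode(scores)  # most common value(s), returned as a list

print(center_mean, center_median, center_modes)
```

For this data the mean is 5, the median is 4, and the only mode is 3.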
a data value that is either much greater or much less than the median
Relative Frequency
the ratio of the frequency of a category to the total frequency
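A relative frequency can be computed by dividing each category's count by the total count. The sketch below assumes a small invented list of survey responses.

```python
from collections import Counter

# Hypothetical survey responses, for illustration only.
responses = ["bus", "car", "car", "walk", "car", "bus", "walk", "car"]

counts = Counter(responses)   # frequency of each category
total = sum(counts.values())  # total frequency

# Relative frequency = category frequency / total frequency.
relative_frequency = {category: count / total for category, count in counts.items()}

print(relative_frequency)
```

The relative frequencies always add up to 1 (here 0.25 + 0.5 + 0.25).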
dot plot
graph with each individual entry
A bar graph depicting a frequency distribution. The height of the bars indicates the frequency of a group of scores.
box plot
also called a box and whisker plot
Flip the left side of the equation to the right side.
Skewed right
mean > median
Skewed left
mean < median
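The mean-versus-median comparison is a rule of thumb rather than a guarantee. The sketch below checks it on two invented data sets, one with a long right tail and one with a long left tail.

```python
from statistics import mean, median

# Hypothetical data: one large value stretches the right tail.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
# Hypothetical data: one small value stretches the left tail.
left_skewed = [1, 17, 18, 18, 19, 19, 19, 20]

for name, values in (("right", right_skewed), ("left", left_skewed)):
    # The mean is pulled toward the long tail; the median barely moves.
    print(name, "mean =", mean(values), "median =", median(values))
```

The tail value drags the mean toward it, which is why the mean lands above the median in the first set and below it in the second.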
normal distribution
Bell-shaped probability distribution where the frequencies start low, then increase to one or two high frequencies, then decrease to a low frequency. The distribution is approximately symmetric.
Standard Deviation
measures how spread out the data are relative to the mean
conditional relative frequency
based only on a specific row or column in a two-way table
We ________________ to eliminate units
standardized value
Value found by subtracting the mean and dividing by the standard deviation.
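Written as a formula, the standardized value is $z = (x - \text{mean}) / \text{SD}$. The sketch below applies it to invented data and assumes the population standard deviation (`pstdev`); with the sample standard deviation (`stdev`) the z-scores would differ slightly.

```python
from statistics import mean, pstdev

# Hypothetical data values, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mu = mean(data)       # 5
sigma = pstdev(data)  # 2.0 (population SD; an assumption of this sketch)

# Subtract the mean, then divide by the standard deviation.
z_scores = [(x - mu) / sigma for x in data]

print(z_scores)
```

The value 9 has z = 2.0, meaning it sits two standard deviations above the mean. Standardized values have no units, and as a set they have mean 0 and standard deviation 1.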
Adding a constant to each data value adds that constant to the mean, the median, and the quartiles, but does not change the standard deviation or IQR.
Multiplying each data value by a constant multiplies both the measures of position and the measures of spread by that constant (the spread by its absolute value).
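The effect of shifting and rescaling can be checked numerically. This is a small sketch on invented data; the shift of 10 and the scale factor of 3 are arbitrary choices.

```python
from statistics import mean, median, pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]     # hypothetical values
shifted = [x + 10 for x in data]    # add a constant to every value
scaled = [x * 3 for x in data]      # multiply every value by a constant

def summary(values):
    """Return (mean, median, standard deviation) for a list of numbers."""
    return mean(values), median(values), pstdev(values)

print("original:", summary(data))
print("shifted: ", summary(shifted))  # center moves by 10, spread unchanged
print("scaled:  ", summary(scaled))   # center and spread both multiplied by 3
```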
Normal Model
A useful family of models for unimodal, symmetric distributions.
Number that describes a population
A number that describes a sample
z is the distance from the mean of the normal distribution, expressed in units of standard deviations.
Displays the 5-number summary as a central box with whiskers that extend to the non-outlying data values.
Far Outlier
A point that is more than 3.0 IQRs from either end of the box in a boxplot.
Comparing Distributions
Consider: shape, center, spread
Comparing Boxplots
Compare Shapes; Compare Medians; Compare IQRs; Check for outliers
Displays data that change over time.
Standard Deviation
Square root of the variance.
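The relationship between variance and standard deviation can be confirmed with the `statistics` module. The data are invented; the sketch shows both the sample versions (divide by n - 1) and the population versions (divide by n), since textbooks differ on which one they mean.

```python
import math
from statistics import pstdev, pvariance, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values

sample_var = variance(data)       # sum of squared deviations / (n - 1)
sample_sd = stdev(data)           # square root of the sample variance

population_var = pvariance(data)  # sum of squared deviations / n
population_sd = pstdev(data)      # square root of the population variance

print(sample_var, sample_sd)
print(population_var, population_sd)
print(math.isclose(math.sqrt(sample_var), sample_sd))
```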
Measures of spread
A calculated summary is said to be ________________ if outliers have only a small effect on it.
5 Number Summary
Reports the min., Q1, the median, Q3 and the max.
The number that falls above i% of the data.
Interquartile Range (IQR)
The difference between the third and first quartiles: IQR = Q3 - Q1.
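The 5-number summary, the IQR, and the outlier rules can be sketched together. Quartile conventions vary between textbooks and software; this example assumes the `method="inclusive"` option of `statistics.quantiles`, and the 1.5 IQR fence for ordinary outliers is the usual boxplot convention rather than something stated in these notes. The data are invented.

```python
from statistics import quantiles

data = sorted([1, 3, 4, 5, 6, 7, 8, 9, 30])  # hypothetical values

# Quartiles (one convention among several).
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
five_number_summary = (min(data), q1, q2, q3, max(data))

iqr = q3 - q1  # interquartile range

# Points beyond 1.5 IQRs from the box are flagged as outliers;
# points beyond 3.0 IQRs are far outliers.
outliers = [x for x in data if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]
far_outliers = [x for x in data if x < q1 - 3.0 * iqr or x > q3 + 3.0 * iqr]

print(five_number_summary, iqr, outliers, far_outliers)
```

Here Q1 = 4, the median is 6, Q3 = 8, and the IQR is 4, so the value 30 is flagged as both an outlier and a far outlier.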
Values that are very unusual in the sense that they are very far away from most of the data.
Distribution is _____________________ if it's not symmetric and 1 tail stretches out farther than the other.
The parts that typically trail off on either side.
A distribution that's roughly flat.
1 mode
2 modes
More than 2 modes
A numerical summary of how tightly the values are clustered around the center. Measures: IQR, standard deviation.
Each regular polygon has a center because it can be inscribed in a circle.
To describe the _________ of a distribution, look for: single vs. multiple modes; symmetry vs. skewness; outliers and gaps.
Consists of a graph in which each data value is plotted as a point (or dot) along a scale of values. Dots representing equal values are stacked.
Stem and Leaf Display
Shows quantitative data values in a way that sketches the distribution of the data.
A region of the distribution where there are no values.
Frequency Table (Relative Frequency Table)
Lists the categories in a categorical variable and gives the count or percentage of observations in each category.
The _____________________________ of a variable gives: the possible values of the variable; the relative frequency of each value.
area principle
In a statistical display, each data value should be represented by the same amount of area.
bar chart
Shows a bar whose area represents the count (or percentage) of observations for each category of a categorical variable.
Pie Chart
Graphical representation of data in the form of a circle containing wedges.
Contingency Table
Displays counts and, sometimes, percentages of individuals falling into named categories on two or more variables.
Marginal Distribution
In a contingency table, the distribution of either variable alone.
Conditional Distribution
The distribution of a variable when the Who is restricted to a smaller group of individuals.
Variables are ________________ if the conditional distribution of one variable is the same for each category of the other.
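Marginal and conditional distributions can both be read off a contingency table. The sketch below uses an invented two-way table of counts (grade level by how students get to school), stored as nested dictionaries.

```python
# Hypothetical contingency table: rows are grade levels, columns are transport.
table = {
    "freshman": {"bus": 30, "car": 10},
    "senior": {"bus": 15, "car": 45},
}

grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of grade level: row totals / grand total.
marginal_grade = {
    grade: sum(row.values()) / grand_total for grade, row in table.items()
}

# Conditional distribution of transport within each grade level:
# each count / its own row total.
conditional_transport = {
    grade: {mode: count / sum(row.values()) for mode, count in row.items()}
    for grade, row in table.items()
}

print(marginal_grade)
print(conditional_transport)
```

In this table 75% of freshmen take the bus but only 25% of seniors do, so the conditional distributions of transport differ between the two grade levels.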
Segmented Bar Chart
Displays the conditional distribution of a categorical variable within each category of another variable.
Simpson's paradox
When averages are taken across different groups, they can appear to contradict the overall averages.
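A small numeric case makes the paradox concrete. The counts below are invented: two pilots, with on-time landings recorded separately for day and night flights.

```python
# Hypothetical (on_time, total) landing counts for each pilot and time of day.
flights = {
    "pilot_A": {"day": (90, 100), "night": (10, 20)},
    "pilot_B": {"day": (19, 20), "night": (60, 100)},
}

def rate(on_time, total):
    return on_time / total

# On-time rate within each group (day, night).
group_rates = {
    pilot: {period: rate(*counts) for period, counts in periods.items()}
    for pilot, periods in flights.items()
}

# Overall on-time rate, pooling day and night flights.
overall_rates = {
    pilot: rate(
        sum(on_time for on_time, _ in periods.values()),
        sum(total for _, total in periods.values()),
    )
    for pilot, periods in flights.items()
}

print(group_rates)
print(overall_rates)
```

Pilot B has the better on-time rate both by day (0.95 vs 0.90) and by night (0.60 vs 0.50), yet pilot A has the better overall rate (100/120 vs 79/120). The reversal happens because pilot A flew mostly by day, when landing on time is easier.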
Tells who was measured, what was measured, how the data were collected, where the data were collected, and when and why the study was performed.
A collection of information gathered for a purpose. Data may be in the form of either words or numbers.
Data Table
An arrangement of data in which each row represents a case and each column represents a variable.
Individual about whom or which we have data.
the entire group of items or individuals from which a sample is taken (the entire American _______________________, New Jersey is a sample)
a randomly selected group chosen for the purpose of collecting data, e.g., 3 middle schools in Monmouth County chosen to gather information about the typical middle school student in Monmouth County
an alphabetic character representing a number, called the value, which is either arbitrary or not fully specified or unknown. It is usually a letter like x or y.
A quantity or amount adopted as a standard of measurement, such as dollars, hours, or grams.
categorical variable
A variable that names categories (words/numbers)
Quantitative Variable
A variable in which the numbers act as numerical values; it always has units.
Random Phenomenon
If we know what outcomes could happen, but not which particular values will happen.
each result/observation of an experiment, such as one roll of a number cube.
any one of the possible results of an action
a single outcome or a group of outcomes
sample space
is the set of all possible outcomes
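For a concrete sample space, consider rolling two number cubes. The sketch below lists every outcome, picks out one event, and computes its probability under the assumption that all outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

# Sample space: every (first cube, second cube) outcome.
sample_space = list(product(range(1, 7), repeat=2))

# Event: the group of outcomes whose faces sum to 7.
event_sum_is_7 = [outcome for outcome in sample_space if sum(outcome) == 7]

# Assuming equally likely outcomes, P(event) = outcomes in event / outcomes in sample space.
probability = Fraction(len(event_sum_is_7), len(sample_space))

print(len(sample_space), event_sum_is_7, probability)
```

There are 36 outcomes in the sample space and 6 of them make up the event, so the probability is 1/6.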
Law of Large Numbers
As the number of trials in a probability experiment increases, the experimental probability approaches the theoretical probability.
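A simulation gives a feel for the Law of Large Numbers. This sketch rolls a simulated fair number cube and tracks the proportion of sixes; the seed and the trial counts are arbitrary choices made for repeatability.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def proportion_of_sixes(num_rolls):
    """Experimental probability of rolling a six in num_rolls simulated rolls."""
    sixes = sum(1 for _ in range(num_rolls) if rng.randint(1, 6) == 6)
    return sixes / num_rolls

theoretical = 1 / 6
for n in (10, 100, 10_000, 100_000):
    experimental = proportion_of_sixes(n)
    print(n, experimental, abs(experimental - theoretical))
```

With few rolls the experimental probability can be far from 1/6; with many rolls it typically settles close to it.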
Empirical Probability
The probability comes from the long-run relative frequency of the event's occurrence.
Theoretical Probability
The probability that comes from a model of what the outcomes should be in theory, such as equally likely outcomes.
Personal Probability
When the probability is subjective and represents your personal degree of belief.
observational study
The researcher observes the experimental units in their natural setting and records the variable(s) of interest. The researcher makes no attempt to control any aspect of the experimental units.
retrospective study
An observational study in which subjects are selected and then their previous conditions or behaviors are determined.
prospective study
An observational study in which subjects are followed to observe future outcomes.
an organized procedure for testing a hypothesis.
A number that divides evenly into another number. 3 is a factor of 15 (look at the chart!)
Dependent (response) variables
experimental units
Individuals on whom an experiment is performed.
The specific values that the experimenter chooses for a factor.
The process, intervention, or other controlled circumstance applied to randomly assigned experimental units.
Principles of Experimental Design
Control; Randomize; Replicate; Block
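Of these principles, randomization is the easiest to show in code. The sketch below randomly assigns ten experimental units to a treatment group and a control group; the unit names and the seed are hypothetical, and blocking is not shown.

```python
import random

# Hypothetical experimental units.
units = [f"plant_{i}" for i in range(1, 11)]

rng = random.Random(42)  # fixed seed so the assignment is repeatable
shuffled = units[:]      # copy so the original list is left untouched
rng.shuffle(shuffled)    # randomize the order of the units

half = len(shuffled) // 2
treatment_group = shuffled[:half]  # units that receive the treatment
control_group = shuffled[half:]    # units that do not

print(treatment_group)
print(control_group)
```

Each group has several units, which reflects replication, and chance alone decides which unit goes where.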
Control group
Consists of the units who are not to receive the treatment that is the focus of the experiment
placebo effect
The tendency of many human subjects to show a response even when administered a placebo.
Any individual associated with an experiment who is not aware of how subjects have been allocated to treatment groups.
A treatment known to have no effect.
Levels of one factor are associated with the levels of another factor in such a way that their effects cannot be separated.