
#### 58 words 0 ignored


Mode

the most common value
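As a small illustration of "the most common value", the sketch below uses Python's standard `statistics.multimode`, which returns every value tied for the highest count. The sample values are hypothetical and chosen only so that two values tie.

```python
from statistics import multimode

# Hypothetical sample values, chosen only for illustration.
values = [3, 5, 5, 6, 8, 8, 9]

# multimode returns every value tied for the highest count,
# in the order each was first encountered in the data.
modes = multimode(values)
print(modes)
```

A result with two entries, as here, corresponds to data with two most common values.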

Unimodal

1 mode

Bimodal

2 modes

multimodal

More than 2 modes

uniform

A distribution that's roughly flat.

Symmetric

A distribution is symmetric if the halves on either side of the center look approximately like mirror images of each other.

tails

The parts that typically trail off on either side.

skewed

Distribution is _____________________ if it's not symmetric and 1 tail stretches out farther than the other.

outliers

Values that are very unusual in the sense that they are very far away from most of the data.

timeplot

Displays data that change over time.

Center

The place in a distribution that you would point to as a typical value. Measures: median, mean.

Median

The middle value, with half of the data above it and half below it.

spread

A numerical summary of how tightly the values are clustered around the center. Measures: IQR, standard deviation.

Range

The difference between the greatest number and the least number in a set of data.

quartile

the lower of this is the value with a quarter of the data below it; the upper of this has a quarter of the data above it

interquartile range

the difference between the first and third quartiles

Percentile

The ith percentile is the number that falls above i% of the data.

5-number summary

consists of the minimum and maximum, the quartiles Q1 and Q3, and the median
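A minimal sketch of the 5-number summary and the interquartile range, using Python's standard `statistics` module. The data set is hypothetical, and `method="inclusive"` is only one of several quartile conventions; textbooks and software can differ slightly on Q1 and Q3.

```python
from statistics import median, quantiles

# Hypothetical data set, for illustration only.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
# The "inclusive" method interpolates between the sorted data values;
# other conventions can give slightly different quartiles.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

five_number_summary = (min(data), q1, median(data), q3, max(data))
iqr = q3 - q1  # interquartile range: Q3 minus Q1

print(five_number_summary, iqr)
```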

boxplot

Displays the 5-number summary as a central box with whiskers that extend to the non-outlying data values.
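The definition above does not say how "non-outlying" is decided. A common convention, assumed here rather than taken from this list, flags values more than 1.5 IQRs beyond the quartiles. The sketch below finds those fences and the whisker ends for a hypothetical data set.

```python
from statistics import quantiles

# Hypothetical data with one unusually large value.
data = [2, 3, 4, 5, 6, 7, 8, 9, 30]

q1, _, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Assumed convention: fences sit 1.5 IQRs beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in data if x < lower_fence or x > upper_fence]
inside = [x for x in data if lower_fence <= x <= upper_fence]

# Whiskers extend to the most extreme values that are not outliers.
whisker_low, whisker_high = min(inside), max(inside)

print(outliers, whisker_low, whisker_high)
```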

Mean

the sum of all the values divided by the number of values

Variance

A measure of spread: the sum of the squared deviations from the mean, divided by the count minus 1.

Standard Deviation

the square root of the variance
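The mean, variance, and standard deviation can be sketched as follows. The data are hypothetical, and the variance is assumed to be the sample version (dividing by n - 1), which is what `statistics.variance` computes.

```python
from statistics import mean, stdev, variance

# Hypothetical data set, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

m = mean(data)  # sum of the values divided by the number of values

# Sample variance by hand: squared deviations from the mean,
# summed and divided by n - 1.
n = len(data)
var_by_hand = sum((x - m) ** 2 for x in data) / (n - 1)

var = variance(data)  # library version of the same quantity
sd = stdev(data)      # square root of the sample variance

print(m, var, sd)
```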

Comparing Distributions

Consider: shape, center, spread

shifting

Adding a constant to each data value adds the same constant to the mean, the median, and the quartiles, but does not change the standard deviation or IQR.

Rescaling

Multiplying each data value by a constant multiplies both the measures of position (mean, median, quartiles) and the measures of spread (standard deviation, IQR) by that constant.
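A quick check of shifting and rescaling on a hypothetical data set: adding 5 moves the mean and median but leaves the standard deviation alone, while multiplying by 2 doubles all three. The constants 5 and 2 are arbitrary choices for illustration.

```python
from statistics import mean, median, stdev

# Hypothetical data set, for illustration only.
data = [10, 12, 15, 19, 24]

shifted = [x + 5 for x in data]   # shifting: add a constant
rescaled = [x * 2 for x in data]  # rescaling: multiply by a constant

for label, values in [("original", data),
                      ("shifted", shifted),
                      ("rescaled", rescaled)]:
    print(label, mean(values), median(values), round(stdev(values), 3))
```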

standardizing

done to eliminate units; values can be compared and combined even if the original variables had different units and magnitudes

standardized value

Value found by subtracting the mean and dividing by the standard deviation.
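A minimal sketch of standardizing, with a hypothetical helper `standardize` and made-up heights. It also illustrates why standardizing eliminates units: the same heights expressed in inches give the same standardized values.

```python
from statistics import mean, stdev

def standardize(values):
    """Subtract the mean and divide by the standard deviation (hypothetical helper)."""
    m = mean(values)
    s = stdev(values)
    return [(x - m) / s for x in values]

# Hypothetical heights, first in centimeters and then in inches.
heights_cm = [150, 160, 170, 180, 190]
heights_in = [h / 2.54 for h in heights_cm]

z_cm = standardize(heights_cm)
z_in = standardize(heights_in)

print([round(z, 3) for z in z_cm])
print([round(z, 3) for z in z_in])
```

Standardized values always have mean 0 and standard deviation 1, whatever the original units were.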

Normal Model

A useful family of models for unimodal, symmetric distributions.

parameter

Number that describes a population

Statistic

A number that describes a sample

Z-score

z is the distance from the mean of the normal distribution expressed in units of standard deviation.

standard normal model

a normal model with a mean of 0 and a standard deviation of 1

68-95-99.7 rule

in a normal model, about 68% of values fall within 1 standard deviation of the mean, about 95% fall within 2 standard deviations of the mean, and about 99.7% fall within 3 standard deviations of the mean
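The three percentages in the rule can be checked numerically with `statistics.NormalDist` from Python's standard library. The helper name `within` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def within(k):
    """Proportion of a normal model within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(within(k), 4))
```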

normal percentile

this corresponding to a z-score gives the percentage of values in a standard normal distribution found at that z-score or below
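As a sketch, a normal percentile is the cumulative proportion at a given z-score, and the inverse lookup recovers the z-score from the percentile. The z-score 1.5 below is an arbitrary example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

z = 1.5                                # arbitrary z-score for illustration
percentile = std_normal.cdf(z)         # proportion of values at or below z
z_back = std_normal.inv_cdf(percentile)  # z-score for that percentile

print(round(percentile, 4), round(z_back, 4))
```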

normal probability plot

a display to help assess whether a distribution of data is approximately normal; if it is nearly straight, the data satisfy the nearly normal condition

changing center and spread

doing this is equivalent to changing its units

scatterplots

show the relationship between two quantitative variables measured on the same cases

direction

a positive ________ or association means that, in general, as one variable increases, so does the other; when increases in one variable generally correspond to decreases in the other, the association is negative

form

the shape of the relationship shown in a scatterplot: straight (linear), curved, or with no identifiable pattern

strength

a scatterplot shows an association that is this if there is little scatter around the underlying relationship

correlation

summarizes the direction of the association between two quantitative variables and the strength of its linear trend
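One common way to compute the correlation coefficient r, assumed here since the definition above gives no formula, is to standardize both variables, multiply the paired z-scores, sum the products, and divide by n - 1. The data and the helper name `correlation` are hypothetical.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r as the sum of products of z-scores divided by n - 1."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    return sum(((x - mx) / sx) * ((y - my) / sy)
               for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical paired measurements on the same five cases.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)
print(round(r, 4))
```

The sign of r gives the direction of the association, and its magnitude (between 0 and 1) gives the strength of the linear trend.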

outlier

a data value that is either much greater or much less than the median

lurking variable

A variable other than x and y that simultaneously affects both variables, accounting for the correlation between the two.

Model

An equation or formula that simplifies and represents reality.

Linear Model

An equation of a line. To interpret a linear model, we need to know the variables and their units.

Residuals

The vertical deviations between the observations and the least-squares regression line (LSRL): each observed y-value minus its predicted value ŷ.

predicted value

The value of ŷ found for a given x-value in the data. It is found by substituting the x-value into the regression equation.

slope

(y₂-y₁)/(x₂-x₁)

Regression to the mean

Because the correlation is never greater than 1.0 in magnitude, each predicted ŷ tends to be fewer standard deviations from its mean than its corresponding x was from its mean.
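In standardized form the idea can be written compactly. The symbols below are assumptions for illustration: z_x is the standardized x-value, the hatted z_y is the predicted standardized y-value, and r is the correlation.

```latex
\hat{z}_y = r\, z_x,
\qquad |r| \le 1
\;\Longrightarrow\;
|\hat{z}_y| = |r|\,|z_x| \le |z_x|
```

For example, with r = 0.5, an x-value 2 standard deviations above its mean predicts a y-value only 1 standard deviation above its mean.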

regression line

predicts the value for the response variable y as a straight line function of the value x of the explanatory variable

Intercept

The intercept, b₀, gives a starting value in y-units. It's the ŷ-value when x = 0.

Least Squares

Specifies the unique line that minimizes the variance of the residuals or, equivalently, the sum of the squared residuals.
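A minimal sketch of a least-squares fit, assuming the usual closed-form solution for one explanatory variable: the slope is the sum of cross-deviations divided by the sum of squared x-deviations, and the line passes through the point of means. The data and the names `b0`, `b1`, and `predict` are hypothetical.

```python
from statistics import mean

# Hypothetical paired data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)

# Closed-form least-squares slope and intercept.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx           # slope, in y-units per x-unit
b0 = y_bar - b1 * x_bar    # intercept: predicted value when x = 0

def predict(x_value):
    """Predicted value from the fitted line."""
    return b0 + b1 * x_value

# Residual = observed value minus predicted value.
residuals = [yi - predict(xi) for xi, yi in zip(x, y)]

print(round(b0, 3), round(b1, 3), [round(e, 3) for e in residuals])
```

The residuals from a least-squares line with an intercept always sum to zero, which is a handy sanity check on a fit.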

subset

if data consist of two or more groups that have been thrown together, it is usually best to fit different linear models to each group than to try to fit a single model to all of the data

Leverage

Data points whose x-values are far from the mean of x are said to exert _____________________________ on a linear model.

Influential Point

A point that influences where the LSRL is located; if removed, it will significantly change the slope of the LSRL
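To see what an influential point can do, the sketch below fits a least-squares slope with and without one extra point whose x-value is far from the others. The data, the added point (20, 2), and the helper `ls_slope` are all hypothetical.

```python
from statistics import mean

def ls_slope(xs, ys):
    """Least-squares slope for one explanatory variable (hypothetical helper)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx

# Hypothetical data with a generally increasing trend.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Add one point with high leverage: x = 20 is far from the mean of x.
x_with = x + [20]
y_with = y + [2]

slope_without = ls_slope(x, y)
slope_with = ls_slope(x_with, y_with)

print(round(slope_without, 3), round(slope_with, 3))
```

In this made-up example the single added point is enough to flip the sign of the slope.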

re-express data

we do this by taking the logarithm, the square root, the reciprocal, or some other mathematical operation on all values in the data set
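A small sketch of re-expression with base-10 logarithms. The data are hypothetical and deliberately right-skewed, so the mean sits far above the median before the re-expression and the two agree afterward.

```python
import math
from statistics import mean, median

# Hypothetical right-skewed data: each value is 10 times the previous one.
raw = [1, 10, 100, 1000, 10000]

# Re-express every value in the data set with the same operation.
logged = [math.log10(v) for v in raw]

print(mean(raw), median(raw))        # mean pulled far above the median
print(mean(logged), median(logged))  # roughly equal after re-expression
```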

ladder of powers

Places in order the effects that many re-expressions have on the data.

random

Outcomes occur at random if each outcome is equally likely to occur.