Level 233

Statistics IV


58 words

Mode
the most common value
Unimodal
1 mode
Bimodal
2 modes
multimodal
More than 2 modes
uniform
A distribution that's roughly flat.
Symmetric
A distribution whose two halves on either side of the center look approximately like mirror images of each other.
tails
The parts that typically trail off on either side.
skewed
Distribution is _____________________ if it's not symmetric and 1 tail stretches out farther than the other.
outliers
Values that are very unusual in the sense that they are very far away from most of the data.
timeplot
Displays data that change over time.
Center
The middle of a distribution, the value around which the data cluster. Measures: mean, median.
Median
The middle value of the ordered data, with half of the data above it and half below it.
spread
A numerical summary of how tightly the values are clustered around the center. Measures: IQR, standard deviation.
Range
The difference between the greatest number and the least number in a set of data.
quartile
the lower of this is the value with a quarter of the data below it; the upper of this has a quarter of the data above it
interquartile range
the difference between the first and third quartiles
Percentile
The ith percentile is the number that falls above i% of the data.
5-number summary
consists of the minimum and maximum, the quartiles Q1 and Q3, and the median
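As a rough illustration of the range, quartiles, IQR, and 5-number summary defined above, the sketch below uses Python's standard `statistics` module. The function name `five_number_summary` and the data values are made up for this example, and the "inclusive" quartile method is only one of several conventions, so other tools may report slightly different quartiles.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # method="inclusive" is an assumption: quartile conventions differ
    # between textbooks and software, so Q1 and Q3 may vary slightly.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)

values = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up illustration data
low, q1, median, q3, high = five_number_summary(values)
data_range = high - low  # range: greatest value minus least value
iqr = q3 - q1            # interquartile range: Q3 minus Q1
print(low, q1, median, q3, high, data_range, iqr)
```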
boxplot
Displays the 5-number summary as a central box with whiskers that extend to the non-outlying data values.
Mean
the sum of all the values divided by the number of values
Variance
A measure of spread: the sum of the squared deviations from the mean, divided by the count minus 1.
Standard Deviation
the square root of the variance
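The mean, variance, and standard deviation above can be sketched in a few lines. This assumes the sample convention of dividing by n - 1; the function names and the data are hypothetical and chosen only for illustration.

```python
import math

def mean(data):
    # Sum of all the values divided by the number of values.
    return sum(data) / len(data)

def sample_variance(data):
    # Sum of squared deviations from the mean, divided by n - 1
    # (assumed sample convention; a population variance divides by n).
    m = mean(data)
    return sum((x - m) ** 2 for x in data) / (len(data) - 1)

def sample_std_dev(data):
    # The standard deviation is the square root of the variance.
    return math.sqrt(sample_variance(data))

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up illustration data
print(mean(scores), sample_variance(scores), sample_std_dev(scores))
```

The standard library functions `statistics.variance` and `statistics.stdev` use the same n - 1 convention, so they can serve as a cross-check.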
Comparing Distributions
Consider: shape, center, spread
shifting
Adding a constant to each data value adds that constant to the mean, the median, and the quartiles, but does not change the standard deviation or IQR.
Rescaling
Multiplying each data value by a constant multiplies both the measures of position and the measures of spread by that constant.
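A small numeric check of shifting and rescaling, as a sketch only: the data, the constants 10 and 3, and the helper name `summarize` are assumptions made for this example.

```python
import statistics

def summarize(data):
    """Return (mean, median, standard deviation) of the data."""
    return statistics.mean(data), statistics.median(data), statistics.stdev(data)

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up illustration data
shifted = [x + 10 for x in scores]  # shifting: add a constant to every value
rescaled = [x * 3 for x in scores]  # rescaling: multiply every value by a constant

# Shifting moves the mean and median by 10 and leaves the spread alone;
# rescaling multiplies the mean, median, and standard deviation by 3.
print(summarize(scores))
print(summarize(shifted))
print(summarize(rescaled))
```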
standardizing
done to eliminate units; values can be compared and combined even if the original variables had different units and magnitudes
standardized value
Value found by subtracting the mean and dividing by the standard deviation.
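A minimal sketch of standardizing, assuming the sample standard deviation is used; the function name `standardize` and the data are hypothetical.

```python
import statistics

def standardize(data):
    """Convert each value to a standardized value (z-score)."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation (assumed convention)
    # Subtract the mean, then divide by the standard deviation.
    return [(x - mean) / sd for x in data]

heights_cm = [150, 160, 165, 170, 185]  # made-up illustration data
z_values = standardize(heights_cm)
print(z_values)
```

The standardized values have no units, a mean of 0, and a standard deviation of 1, which is what lets values from different variables be compared.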
Normal Model
A useful family of models for unimodal, symmetric distributions.
parameter
Number that describes a population
Statistic
A number that describes a sample
Z-score
z is the distance from the mean of the normal distribution expressed in units of standard deviation.
standard normal model
a normal model with a mean of 0 and a standard deviation of 1
68-95-99.7 rule
in a normal model, about 68% of values fall within 1 standard deviation of the mean, about 95% fall within 2 standard deviations of the mean, and about 99.7% fall within 3 standard deviations of the mean
normal percentile
this corresponding to a z-score gives the percentage of values in a standard normal distribution found at that z-score or below
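The 68-95-99.7 rule and normal percentiles above can be checked numerically with `statistics.NormalDist` from the Python standard library. The helper names `fraction_within` and `normal_percentile` are invented for this sketch.

```python
from statistics import NormalDist

# Standard normal model: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of values within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

def normal_percentile(z):
    """Percentage of values at or below the given z-score."""
    return 100 * standard_normal.cdf(z)

for k in (1, 2, 3):
    print(k, round(fraction_within(k), 4))
print(normal_percentile(0))
```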
normal probability plot
a display to help assess whether a distribution of data is approximately normal; if it is nearly straight, the data satisfy the nearly normal condition
changing center and spread
doing this is equivalent to changing its units
scatterplots
shows the relationship between two quantitative variables measured on the same cases
direction
a positive ________ or association means that, in general, as one variable increases, so does the other; when increases in one variable generally correspond to decreases in the other, the association is negative
form
the shape of the relationship shown in a scatterplot: straight (linear), curved, or no clear pattern
strength
a scatterplot shows an association that is this if there is little scatter around the underlying relationship
correlation
summarizes the direction of the association between two quantitative variables and the strength of its linear trend
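One common way to compute the correlation is to average the products of the standardized x- and y-values, dividing by n - 1. The sketch below assumes that formula and sample standard deviations; the function name and the data are made up for illustration.

```python
import statistics

def correlation(xs, ys):
    """Correlation r: sum of z_x * z_y over all cases, divided by n - 1."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    z_products = [
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
        for x, y in zip(xs, ys)
    ]
    return sum(z_products) / (n - 1)

hours = [1, 2, 3, 4, 5]   # made-up illustration data
grades = [2, 4, 5, 4, 5]
print(correlation(hours, grades))
```

The sign of the result gives the direction of the association, and values closer to -1 or 1 indicate a stronger linear trend.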
outlier
a data value that is either much greater or much less than the median
lurking variable
A variable other than x and y that simultaneously affects both variables, accounting for the correlation between the two.
Model
An equation or formula that simplifies and represents reality.
Linear Model
An equation of a line. To interpret a linear model, we need to know the variables and their units.
Residuals
The vertical deviations between the observations and the least-squares regression line (LSRL): observed value minus predicted value.
predicted value
The value of ŷ found for a given x-value in the data. This is found by substituting the x-value into the regression equation.
slope
(y₂-y₁)/(x₂-x₁)
Regression to the mean
Because the correlation is always less than 1.0 in magnitude (except for a perfect linear relationship), each predicted ŷ tends to be fewer standard deviations from its mean than its corresponding x was from its mean.
regression line
predicts the value for the response variable y as a straight line function of the value x of the explanatory variable
Intercept
The intercept, b₀ (b-sub-zero), gives a starting value in y-units. It's the ŷ-value when x = 0.
Least Squares
Specifies the unique line that minimizes the variance of the residuals or, equivalently, the sum of the squared residuals.
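The slope, intercept, predicted values, and residuals of a least-squares line can be sketched as follows. This uses the standard closed-form solution for one explanatory variable; the function names and the data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The line passes through the point (mean of x, mean of y).
    intercept = mean_y - slope * mean_x
    return intercept, slope

def residuals(xs, ys, intercept, slope):
    """Observed y minus predicted y for each case."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

hours = [1, 2, 3, 4, 5]   # made-up illustration data
grades = [2, 4, 5, 4, 5]
b0, b1 = least_squares_line(hours, grades)
print(b0, b1, residuals(hours, grades, b0, b1))
```

For a least-squares line with an intercept, the residuals sum to zero (up to rounding), which is a quick sanity check on the fit.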
subset
if data consist of two or more groups that have been thrown together, it is usually best to fit different linear models to each group than to try to fit a single model to all of the data
Leverage
Data points whose x-values are far from the mean of x are said to exert _____________________________ on a linear model.
Influential Point
A point that influences where the LSRL is located; if removed, it will significantly change the slope of the LSRL
re-express data
we do this by taking the logarithm, the square root, the reciprocal, or some other mathematical operation on all values in the data set
ladder of powers
Places in order the effects that many re-expressions have on the data.
random
An outcome is random if we know the possible outcomes but not which one will occur; the outcomes do not have to be equally likely.