Level 232

Statistics III


What is a type 1 error?
when H₀ is true and you reject H₀
What is a type 2 error?
when H₀ is false and you fail to reject (accept) H₀
the significance level,
What is the value of α called? What values do we usually take as α?
t = r√((n-2)/(1-r²))
To use standard tables you should compute the t value, how do you do this?
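A minimal sketch of this calculation, assuming r is the sample correlation coefficient and n is the number of paired observations. The function name t_from_r and the example values are illustrative and do not come from the deck.

```python
import math

def t_from_r(r, n):
    """t statistic for a sample correlation r computed from n pairs."""
    if n <= 2:
        raise ValueError("n must be greater than 2")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    # t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Illustrative values: r = 0.6 from n = 18 pairs gives t close to 3.0
t = t_from_r(0.6, 18)
```

The resulting t is compared against the table value for n - 2 degrees of freedom.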
the critical value c which satisfies;
What do you find when looking in the t tables?
When would you reject H₀?
if the value of |t|>c at level α.
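The decision rule on this card can be sketched as a one-line comparison. The helper name reject_h0 is hypothetical, and the critical value below is an assumed table lookup (two-tailed, α = 0.05, 16 degrees of freedom), not a number given in the deck.

```python
def reject_h0(t, c):
    """Two-tailed decision rule: reject H0 when |t| exceeds the critical value c."""
    return abs(t) > c

# Assumed table value for alpha = 0.05 (two-tailed) with 16 degrees of freedom
c = 2.120
decision_large = reject_h0(3.0, c)    # |3.0| is greater than 2.120
decision_small = reject_h0(-1.5, c)   # |-1.5| is not greater than 2.120
```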
normally distributed data.
What kind of distribution uses a t test?
H₁: ρ ≠ 0
If H₀: ρ = 0, then what is the alternative H₁?
a two-tailed test.
If you include H₀ and H₁ what type of test is this?
When do you use a one-tailed test?
when your hypothesis states the direction of the connection between the data.
When do you use a two-tailed test?
when you have stated there is a connection but not the direction in your hypothesis.
Population
the entire group of items or individuals from which a sample is taken (the entire American _______________________, New Jersey is a sample)
parameter
Number that describes a population
GOAL
Estimation, confidence interval, hypothesis testing
Statistic
A number that describes a sample
Response variable
The variable being studied by the experimenter. The experiment will investigate how the response variable behaves when the investigator manipulates one or more explanatory variables or factors.
lurking variable
A variable other than x and y that simultaneously affects both variables, accounting for the correlation between the two.
Randomization
The best defense against bias; each individual is given a fair, random chance of selection.
Blocking
Using extraneous factors to create groups (blocks) that are similar. All experimental conditions are then tried in each block.
Control
Holding extraneous factors constant so that their effects are not confounded with those of the experimental conditions.
prospective study
An observational study in which subjects are followed to observe future outcomes.
retrospective study
An observational study in which subjects are selected and then their previous conditions or behaviors are determined.
Experimental study
assigns to each subject a treatment and then observes the outcome on the response variable
Treatments
experimental conditions which correspond to assigned values of the explanatory variable
observational study
The researcher observes the experimental units in their natural setting and records the variable(s) of interest. The researcher makes no attempt to control any aspect of the experimental units.
Simple random sample
each possible sample is equally likely
Cluster random sample
identify clusters of subjects, take simple random sample of the clusters
stratified random sample
A sampling design in which the population is divided into several subpopulations, or strata, and random samples are then drawn from each stratum.
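The three sampling designs above can be contrasted in a short sketch. The population, the "region" strata, and the clusters of ten consecutive ids are all made up for illustration; only the standard-library random module is used.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 100 units, 60 in the north and 40 in the south
population = [{"id": i, "region": "north" if i < 60 else "south"} for i in range(100)]

# Simple random sample: every possible sample of 10 units is equally likely
srs = rng.sample(population, 10)

# Stratified random sample: split into strata, then sample within each stratum
strata = {}
for unit in population:
    strata.setdefault(unit["region"], []).append(unit)
stratified = [u for group in strata.values() for u in rng.sample(group, 5)]

# Cluster random sample: sample whole clusters, then keep every unit in them
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
chosen_clusters = rng.sample(clusters, 2)
cluster_sample = [u for cluster in chosen_clusters for u in cluster]
```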
census
Used to measure a variable for every unit of a population.
survey
a question or set of questions designed to collect data
Observational
subjects' responses are recorded under various conditions that are not manipulated by the researcher
Designed
subjects' responses are recorded under various experimental conditions that are manipulated by the researcher
sampling frame
The list of possible subjects who could be selected in a sample.
experimental units
Individuals on whom an experiment is performed.
Randomized block
a block design with random assignment of treatments to units within blocks
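As a rough sketch of a randomized block design, the function below shuffles the units inside each block and then deals the treatments out in turn. The block names, unit labels, and treatments are hypothetical.

```python
import random

def randomized_block(blocks, treatments, seed=0):
    """Randomly assign treatments to units separately within each block."""
    rng = random.Random(seed)
    assignment = {}
    for units in blocks.values():
        shuffled = list(units)
        rng.shuffle(shuffled)  # random order within this block only
        for i, unit in enumerate(shuffled):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical blocks formed on an extraneous factor (age group)
blocks = {"young": ["a", "b", "c", "d"], "old": ["e", "f", "g", "h"]}
assignment = randomized_block(blocks, ["drug", "placebo"])
```

Because the shuffle happens inside each block, every block receives each treatment equally often when the block size is a multiple of the number of treatments.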
Variable
any characteristic observed in a study that can take different values from one subject to another.
Categorical
What type of data is the color of M&M's?
Quantitative
observations on it take numerical values that represent different magnitudes of the variable
Discrete
is a quantitative variable that is usually a count such as 0, 1, 2, 3
Continuous
is a quantitative variable that has a continuum of infinitely many possible values
frequency table
a data display that shows how often an item appears in a category
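A frequency table for a categorical variable can be sketched with collections.Counter. The list of colors is invented for illustration.

```python
from collections import Counter

# Hypothetical observations of a categorical variable
colors = ["red", "blue", "red", "green", "blue", "red"]

counts = Counter(colors)               # frequency of each category
total = sum(counts.values())
relative = {color: count / total for color, count in counts.items()}

# Print category, count, and relative frequency, most common first
for color, count in counts.most_common():
    print(f"{color:<6} {count}  {relative[color]:.2f}")
```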
Pie Chart
Graphical representation of data in the form of a circle containing wedges.
Bar graph
Bars do not touch; categorical data is typically on the horizontal axis; to describe: comment on which occurred the most often or least often
dot plot
graph with each individual entry
Stem-and-leaf plot
it displays individual observations
Histogram
A bar graph depicting a frequency distribution. The height of the bars indicates the frequency of a group of scores.
distribution
The _____________________________ of a variable gives: the possible values of the variable; the relative frequency of each value.
skew
a distribution is skewed when one of its tails extends farther from the center than the other
Skewed to the left
the left tail is longer than the right tail
Skewed to the right
the right tail is longer than the left tail
Mean
the sum of all the values divided by the number of values
Median
The middle value of the ordered data; half of the observations fall at or below it and half at or above it.
Range
The difference between the greatest number and the least number in a set of data.
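A short sketch of the mean, median, and range using Python's statistics module. The data set is made up and includes one large value so that the mean is pulled above the median, as happens in a right-skewed distribution.

```python
import statistics

# Hypothetical data with one large value in the right tail
data = [2, 4, 4, 5, 7, 9, 40]

mean = statistics.mean(data)          # sum of values divided by their count: 71 / 7
median = statistics.median(data)      # middle value of the sorted data
data_range = max(data) - min(data)    # greatest value minus least value
```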
Confounding variables
when we are uncertain which of two variables is causing an effect
Blind
subjects don't know the treatment to which they are assigned
Double blind
neither the subjects nor whoever has contact with the subjects is aware of the treatment assigned
outliers
Values that are very unusual in the sense that they are very far away from most of the data.
Z score
Standardized score
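A minimal sketch of standardizing, assuming the usual form z = (value - mean) / standard deviation and the sample standard deviation (n - 1 in the denominator). The function name z_scores and the data are illustrative.

```python
import statistics

def z_scores(values):
    """Return (value - mean) / standard deviation for each value."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumption)
    return [(v - mean) / sd for v in values]

scores = [10, 12, 14, 16, 18]
z = z_scores(scores)  # the middle value equals the mean, so its z score is 0
```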
Mean and Median
describe the center of a distribution
Range and Standard Deviation
describe the variability of the distribution
Standard Deviation
Square root of the variance.
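The relationship between variance and standard deviation can be checked by hand. This sketch uses the population form (dividing by n), which is an assumption; many courses divide by n - 1 for sample data. The data are invented.

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)                               # 40 / 8 = 5.0
variance = sum((x - mean) ** 2 for x in data) / len(data)  # 32 / 8 = 4.0
sd = math.sqrt(variance)                                   # square root of the variance
```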
Association
exists between two variables if a particular value for one variable is more likely to occur with certain values of the other variable
Contingency table
is a display for two categorical variables
Scatter plot
is a graph used to determine whether there is a relationship between paired data. Scatter plots can show trends in data.
Positive association
as x goes up, y tends to go up
Negative association
as x goes up, y tends to go down
correlation
summarizes the direction of the association between two quantitative variables and the strength of its linear trend
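A sketch of the correlation coefficient computed from its definition, r = Sxy / √(Sxx · Syy). The helper name correlation and both data sets are illustrative.

```python
import math

def correlation(xs, ys):
    """Pearson correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Positive association: y tends to go up as x goes up (r is about 0.77)
r_up = correlation([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])

# Perfect negative linear trend: r is -1
r_down = correlation([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])
```

The sign of r gives the direction of the association and its size gives the strength of the linear trend.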
regression line
predicts the value for the response variable y as a straight line function of the value x of the explanatory variable
Residuals
The vertical deviations between the observations and the least-squares regression line (LSRL)
Least Squares Method
method produces the line that has the smallest value for the residual sum of squares
R squared
it is the percentage of the response variable variation that is explained by a linear model
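The regression line, residuals, and R squared cards fit together in one small sketch. The function name least_squares and the data are hypothetical; the slope and intercept use the standard least-squares formulas.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares(xs, ys)

predicted = [a + b * x for x in xs]
residuals = [y - p for y, p in zip(ys, predicted)]   # observed minus predicted

sse = sum(e ** 2 for e in residuals)                 # residual sum of squares
mean_y = sum(ys) / len(ys)
sst = sum((y - mean_y) ** 2 for y in ys)             # total variation in y
r_squared = 1 - sse / sst                            # share of variation explained
```

For these values the line is roughly y = 2.2 + 0.6x and R squared is about 0.6.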
context
Tells who was measured, what was measured, how the data were collected, where the data were collected, and when and why the study was performed.
data
A collection of information gathered for a purpose. Data may be in the form of either words or numbers.
Data Table
An arrangement of data in which each row represents a case and each column represents a variable.
case
Individual about whom or which we have data.
categorical variable
A variable that names categories (words/numbers)
Quantitative Variable
A variable in which the numbers act as numerical values - always have units.
units
A quantity or amount adopted as a standard of measurement, such as dollars, hours, or grams.
area principle
In a statistical display, each data value should be represented by the same amount of area.
bar chart
Shows a bar whose area represents the count (or percentage) of observations for each category of a categorical variable.
Marginal Distribution
In a contingency table, the distribution of either variable alone.
Conditional Distribution
The distribution of a variable, restricting the who to consider only a smaller group of individuals.
Independence
Variables are ________________ if the conditional distribution of one variable is the same for each category of the other.
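Marginal and conditional distributions can be read off a contingency table as follows. The table counts and the group and answer labels are invented; the check at the end compares each conditional distribution with the marginal one.

```python
# Hypothetical contingency table: rows are groups, columns are answers
table = {
    "men":   {"yes": 30, "no": 70},
    "women": {"yes": 60, "no": 140},
}

def conditional(row):
    """Distribution of the answer within one row (one group only)."""
    total = sum(row.values())
    return {answer: count / total for answer, count in row.items()}

# Marginal distribution of the answer, ignoring the group
grand_total = sum(sum(row.values()) for row in table.values())
marginal_answer = {
    answer: sum(row[answer] for row in table.values()) / grand_total
    for answer in ("yes", "no")
}

# Conditional distribution of the answer for each group
conditionals = {group: conditional(row) for group, row in table.items()}

# The same conditional distribution in every group matches the definition above
same_in_every_group = all(
    abs(dist[answer] - marginal_answer[answer]) < 1e-9
    for dist in conditionals.values()
    for answer in marginal_answer
)
```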
Simpson's paradox
When averages are taken across different groups, they can appear to contradict the overall averages.
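A small numeric sketch of the paradox. The success counts below are illustrative: treatment A has the higher success rate inside each group, yet B has the higher rate once the groups are pooled, because A was used mostly on the harder cases.

```python
# Illustrative (successes, trials) for two treatments in two groups
results = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, trials):
    return successes / trials

# Success rate within each group
by_group = {
    treatment: {group: rate(*counts) for group, counts in groups.items()}
    for treatment, groups in results.items()
}

# Success rate after pooling the groups together
overall = {
    treatment: rate(
        sum(s for s, _ in groups.values()),
        sum(n for _, n in groups.values()),
    )
    for treatment, groups in results.items()
}

a_wins_each_group = all(by_group["A"][g] > by_group["B"][g] for g in ("small", "large"))
b_wins_overall = overall["B"] > overall["A"]
```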
stem-and-leaf display
shows quantitative data values in a way that sketches the distribution of the data
dotplot
Consists of a graph in which each data value is plotted as a point (or dot) along a scale of values. Dots representing equal values are stacked.
Shape
To describe the _________ of a distribution, look for: single vs. multiple modes; symmetry vs. skewness; outliers and gaps.
Center
A numerical summary of the typical or middle value of a distribution. Measures: mean, median.
spread
A numerical summary of how tightly the values are clustered around the center. Measures: IQR, Standard Dev.
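Both measures of spread named on this card can be computed with the statistics module. Quartile conventions differ between textbooks; this sketch uses the default method of statistics.quantiles, which is an assumption, and the data are made up.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Quartiles with the module's default method; IQR is Q3 minus Q1
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Sample standard deviation (n - 1 in the denominator)
sd = statistics.stdev(data)
```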