{"id":475,"date":"2022-11-02T20:43:05","date_gmt":"2022-11-02T20:43:05","guid":{"rendered":"https:\/\/pressbooks.palomar.edu\/introtostats\/back-matter\/glossary\/"},"modified":"2022-11-02T20:43:05","modified_gmt":"2022-11-02T20:43:05","slug":"glossary","status":"publish","type":"back-matter","link":"https:\/\/pressbooks.palomar.edu\/introtostats\/back-matter\/glossary\/","title":{"raw":"Glossary","rendered":"Glossary"},"content":{"raw":"","rendered":"<dl data-type=\"glossary\">\n<dt data-type=\"glossterm\"><dfn id=\"dfn-alternative-hypothesis\">alternative hypothesis<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>In hypothesis testing, the null hypothesis and an alternative hypothesis (the experimenter\u2019s prediction) are put forward. If the data are sufficiently strong to reject the null hypothesis, then the null hypothesis is rejected in favor of an alternative hypothesis.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-analysis-of-variance-anova\">analysis of variance (ANOVA)<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A hypothesis-testing procedure that is used to evaluate mean differences between multiple treatments (or populations).<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-area-in-the-tails-of-the-distribution\">area in the tails of the distribution<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The proportion of the distribution that falls in the tails of a normal curve. The area in the tail of the distribution associated with a particular z score can be found in Appendix A, column C.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-area-under-the-curve\">area under the curve<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The proportion of the distribution that is bounded by a single z score or a pair of z scores. 
The area under the curve bounded by a single z score can be found in Appendix A, column B.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-arithmetic-mean\">arithmetic mean<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Perhaps the most common measure of central tendency, the mean is the mathematical average of the scores in a sample.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-bell-curve\">bell curve<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The bell curve is a symmetrical distribution in which there is a single peak at the center and tails that extend equally out to each side. The bell curve represents a normal distribution.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-between-groups-variability\">between-groups variability<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The variability that arises from differences between groups, which includes treatment effects and error.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-bimodal-distribution\">bimodal distribution<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A distribution with two distinct peaks that lie roughly symmetrically on either side of the center point.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-bin-width\">bin width<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The width of class intervals in a histogram.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-bonferroni-test\">Bonferroni test<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A Bonferroni test is one of the simplest post hoc analyses. 
It is a series of t tests performed on each pair of group means with a modified alpha level.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-box-plots\">box plots<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>One of the more effective graphical summaries of a data set, the box plot generally shows the median, 25th and 75th percentiles, and outliers.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-categorical-variables\">categorical variables<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Also known as qualitative variables, categorical variables cannot be quantified, or measured numerically. Instead, they are measured on a nominal or ordinal scale.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-central-limit-theorem\">central limit theorem<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A mathematical theorem that states: for samples of a given sample size, drawn from a population with a given mean and variance, the sampling distribution of sample means will approximate a normal distribution as the sample size increases.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-central-tendency\">central tendency<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The center or middle of a distribution. There are many measures of central tendency. 
The most common are the mean, median, and mode.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-chi-square-test\">chi-square (test)<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A nonparametric test designed to understand the frequency distribution of a single categorical variable or find a relationship between two categorical variables.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-confidence-interval\">confidence interval<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A range of scores likely (i.e., with a certain degree of confidence) to contain the parameter being estimated.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-confound-variables\">confound variables<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Two or more variables are confounded if their effects cannot be separated because they vary together. For example, if a study on the effect of light inadvertently manipulated heat along with light, then light and heat would be confounded.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-contingency-table\">contingency table<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A frequency table that shows the frequency of each category in one variable, contingent upon the specific category or level of the other variable.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-continuous-variables\">continuous variables<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Numerical variables that can take on any value in a certain range. 
Time and distance are continuous; gender, SAT score, and \u201ctime rounded to the nearest second\u201d are not.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-control\">control<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The group in an experimental study that is not receiving the treatment being tested.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-convenience-sampling\">convenience sampling<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A sampling strategy in which participants are recruited for their easy availability (e.g., college students). A sample obtained through convenience sampling should not be considered a representative sample.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-correlation-matrices\">correlation matrices<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Tables displaying correlation coefficients that describe the relationships among multiple variables.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-covariance\">covariance<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The degree to which two variables vary together; that is, when one score changes, the other score also changes in a predictable or consistent way.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-critical-value\">critical value<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The value that marks the boundary of a specific rejection region (also called the critical region).<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-curvilinear-models\">curvilinear models<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Forms of regression that can explain curves in the data rather than straight lines.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-curvilinear-relationship\">curvilinear relationship<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A relationship in which a line through the middle of the points in a scatter plot will be curved rather than straight.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-data\">data<\/dfn><\/dt>\n<dd 
data-type=\"glossdef\">\n<p>A collection of values to be used for statistical analysis. Data is the plural form of datum.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-degrees-of-freedom\">degrees of freedom<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The number of independent pieces of information that go into the estimate. In general, the degrees of freedom for an estimate is equal to the number of values minus the number of parameters estimated en route to the estimate in question.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-dependent-variable\">dependent variable<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A variable that measures the experimental outcome. In most experiments, the effects of the independent variable on the dependent variables are observed. For example, if a study investigated the effectiveness of an experimental treatment for depression, then the measure of depression would be the dependent variable.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-descriptive-statistics\">descriptive statistics<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A set of statistics\u2014such as the mean, standard deviation, and skew\u2014that describe a distribution.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-difference-score\">difference score<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The change in a single variable over time: the score at Time 2 minus the score at Time 1.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-discrete-variables\">discrete variables<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A variable that exists in indivisible units. For quantitative variables, it is measured in whole numbers that are discrete points on the scale.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-dispersion\">dispersion<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The extent to which values differ from one another; that is, how much they vary. 
Dispersion is also called variability or spread.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-distribution-of-sample-means\">distribution of sample means<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The set of means from all the possible random samples of a specific size selected from a specific population. The distribution of sample means is an example of a sampling distribution.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-effect-size\">effect size<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A statistic that indicates how large, important, or meaningful a statistically significant effect is.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-error\">error<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The difference between a measured or calculated value and a true one.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-event\">event<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Any specific outcome that could happen.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-expected-values\">expected values<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The expected value of a statistic is the mean of the sampling distribution of the statistic.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-experimental\">experimental<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The group in a study that is receiving the treatment being tested.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-experimental-research\">experimental research<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Research that involves the use of random assignment to treatment conditions and manipulation of the independent variable.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-factorial-anova\">factorial ANOVA<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>An analysis of variance that uses multiple grouping variables, instead of just one, to look for group mean differences.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn 
id=\"dfn-frequency-polygons\">frequency polygons<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A frequency polygon is a graphical representation of a distribution that is similar in appearance to a line graph. Frequency polygons can be grouped or ungrouped.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-grand-mean\">grand mean<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The mean of a group of averages.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-group-mean-differences\">group mean differences<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Research studies concerned with group mean differences determine whether, on average, a person from Group A scores higher, lower, or simply differently on some variable than a person from Group B. A key criterion in such studies is that the groups must be mutually exclusive.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-grouping-variable\">grouping variable<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Also called the independent variable, a grouping variable is used to categorize data into groups. It predicts or explains the values in the outcome variable.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-histogram\">histogram<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A graphical representation of a distribution that is similar in appearance to a bar chart. It partitions the variable on the x-axis into various contiguous class intervals of (usually) equal widths. 
The heights of the bars represent the class frequencies.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-homogeneity-of-variance\">homogeneity of variance<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The assumption that the true population variance for each group is the same and any difference in the observed sample variances is due to random chance.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-hypothesis\">hypothesis<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A prediction that is tested in a research study.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-independent-samples-t-test\">independent samples (t test)<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The analysis of two samples that are selected from two populations, where the values from one population are not related in any way to the values from the other population.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-independent-variable\">independent variable<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Also called the predictor variable, the independent variable is the variable that is manipulated in an experiment or used to predict or explain the values of the dependent (outcome) variable. See also grouping variable.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-inferential-statistics\">inferential statistics<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The branch of statistics concerned with drawing conclusions about a population from a sample. This is generally done through random sampling, followed by inferences made about central tendency, or any of a number of other aspects of a distribution.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-interquartile-range-iqr\">interquartile range (IQR)<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The range of the middle 50% of the scores in a distribution; computed by subtracting the 25th percentile from the 75th percentile. 
The interquartile range is a robust measure of variability.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-interval-scale\">interval scale<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A numerical scale in which the distance between scores on the scale is consistent (equal) and for which the zero is relative (rather than absolute).<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-inverse-relationship\">inverse relationship<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>In an inverse relationship, variables are related but move in opposite directions when they change: as one variable goes up, the other variable goes down.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-law-of-large-numbers\">law of large numbers<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A mathematical theorem that states: as sample size increases, the probability that a sample mean is an accurate representation of the true population mean also increases.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-least-squares-error-solution-linear-regression\">least squares error solution (linear regression)<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The solution\u2014or equation\u2014of a line is the one that provides the smallest possible sum of the squared errors (squared so that they can be summed, just like in standard deviation) relative to any other straight line that could be drawn through the data.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-lie-factor\">lie factor<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The ratio of the size of the effect shown in a graph to the size of the effect shown in the data. This term was coined by Edward Tufte, who suggested that lie factors greater than 1.05 or less than 0.95 produce unacceptable distortion.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-line-of-best-fit\">line of best fit<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The central tendency of our scatter plot. 
The term best fit means that the line is as close to all points (with each point representing both variables for a single person) in the scatter plot as possible, with a balance of scores above and below the line.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-linear-relationship\">linear relationship<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>There is a perfect linear relationship between two variables if a scatter plot of the points falls on a straight line. The relationship is linear even if the points diverge from the line, as long as the divergence is random rather than being systematic.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-magnitude\">magnitude<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>How strong or how consistent the relationship between variables is. Higher numbers mean greater magnitude, which means a stronger relationship.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-margin-of-error\">margin of error<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The difference between the statistic used to estimate a parameter and the endpoints of the confidence interval. For example, if the statistic were 0.6 and the confidence interval ranged from 0.4 to 0.8, then the margin of error would be \u00b1.20. 
Unless otherwise specified, the 95% confidence interval is used.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-marginal-values\">marginal values<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>In a contingency table, marginal values are the total values for a single category of one variable, added up across levels of the other variable.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-matched-pairs\">matched pairs<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Two samples that are matched or dependent in some way.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-mean\">mean<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>See arithmetic mean.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-mean-square\">mean square<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A sample variance that measures the mean of the squared deviations.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-mean-squared-error-linear-regression\">mean squared error (linear regression)<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The average squared difference between the estimated values and the actual values.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-median\">median<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The median is a popular measure of central tendency. 
It is the 50th percentile of a distribution.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-mode\">mode<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A measure of central tendency, the mode is the most frequent value in a distribution.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-moderation-models\">moderation models<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Forms of regression that change the relationship between two variables based on levels of a third variable.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-multiple-regression\">multiple regression<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Linear regression in which two or more predictor (independent) variables are used to predict the dependent variable.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-negative-relationship\">negative relationship<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A negative or inverse relationship means that as the value of one variable increases, the other decreases.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-no-relationship\">no relationship<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>There is no relationship between variables X and Y if the hypothetical line drawn through points on a scatter plot has no slope; in other words, values of X are not associated with the values of Y.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-nominal-scale\">nominal scale<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A scale in which no ordering is implied, and addition\/subtraction and multiplication\/division would be inappropriate for a variable. 
Variables measured on a nominal scale have no natural ordering, even if they are coded using numbers (e.g., for eye color 1 = blue, 2 = brown, 3 = hazel, etc.).<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-non-experimental-research\">non-experimental research<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Research that involves observing things as they occur naturally and recording observations as data. Also known as correlational research.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-nonparametric-tests\">nonparametric tests<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Tests in which there are no population parameters to estimate or distributions to test against. All chi-square tests are nonparametric.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-normal-distribution\">normal distribution<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>One of the most common continuous distributions, a normal distribution is sometimes referred to as a bell-shaped distribution, a bell curve, or a Gaussian curve. If the mean is 0 and the standard deviation is 1, the distribution is referred to as the \u201cstandard normal distribution.\u201d<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-null-hypothesis\">null hypothesis<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A null hypothesis is a hypothesis tested in significance testing. It is typically the hypothesis that a parameter is zero or that a difference between parameters is zero. 
For example, the null hypothesis might be that the difference between population means is zero.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-observed-effect\">observed effect<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>What is observed in the sample versus what was expected based on the population from which that sample was drawn.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-ordinal-scale\">ordinal scale<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A set of ordered values in which there is no set distance between scale values; for example, asking someone to indicate how much education they completed by asking them to circle one of the following: did not complete high school, high school diploma, some college, college degree, professional degree.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-outcome-variable\">outcome variable<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Also known as the dependent variable, the outcome variable is thought to change as a function of changes in a predictor (independent) variable.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-outlier\">outlier<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>An atypical, infrequent observation; a value that has an extreme deviation from the center of the distribution. There is no universally agreed on criterion for defining an outlier, and outliers should only be discarded with extreme caution. However, one should always assess the effects of outliers on the statistical conclusions.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-p-value-or-probability-value\">p value or probability value<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>In significance testing, the probability value is the probability of obtaining a statistic as different or more different from the parameter specified in the null hypothesis as the statistic obtained in the experiment. The probability value is computed assuming the null hypothesis is true. 
The lower the probability value, the stronger the evidence that the null hypothesis is false. Traditionally, the null hypothesis is rejected if the probability value is below .05.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-point-estimate\">point estimate<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A single number (rather than a range of numbers) that is used to estimate a parameter.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-pooled-variance\">pooled variance<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A weighted average of two variances\u2014the weights being determined by sample size\u2014that can then be used when calculating standard error.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-population\">population<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The complete set of observations a researcher is interested in. Contrast this with a sample, which is a subset of a population. Inferential statistics are computed from sample data in order to make inferences about the population.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-positive-relationship\">positive relationship<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A positive relationship exists between variables X and Y when smaller values of X are associated with smaller values of Y and larger values of X are associated with larger values of Y. A positive relationship is indicated graphically when a regression line drawn through the center of the points on a scatter plot has a positive slope.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-post-hoc-test\">post hoc test<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A test that is conducted after an ANOVA with more than two treatment conditions where the null hypothesis was rejected. 
The purpose of post hoc tests is to determine exactly which treatment conditions are significantly different.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-probability\">probability<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The likelihood of a statistical result or the number of outcomes that satisfy specific criteria divided by the total number of possible outcomes.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-qualitative-variables\">qualitative variables<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Also known as categorical variables, qualitative variables cannot be quantified, or measured numerically. Instead, they are measured on a nominal or ordinal scale. Variables that are not qualitative are known as quantitative variables.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-quantitative-variables\">quantitative variables<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Variables that are measured on a numeric or quantitative scale or that can be ordered in some fashion. Ordinal, interval, and ratio scales are quantitative. A country\u2019s population, a person\u2019s shoe size, or a car\u2019s speed are all quantitative variables. Variables that are not quantitative are known as qualitative variables.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-quasi-experimental-research\">quasi-experimental research<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Research that involves manipulating the independent variable but not randomly assigning people to groups.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-random-error\">random error<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Any deviations between a person and that person\u2019s group mean caused only by chance. 
Random error is a component of within-groups variability.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-range\">range<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The difference between the maximum and minimum values of a variable or distribution. The range is the simplest measure of variability.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-range-restriction\">range restriction<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Failure to capture the full range of a variable\u2019s potential scores.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-ratio-scale\">ratio scale<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A numerical scale in which the distance between scores on the scale is consistent (equal) and for which the zero is absolute (rather than relative); a score of zero indicates a complete absence of the quantity being measured.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-rejection-region\">rejection region<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The region in which an experimenter can reject the null hypothesis, provided the test statistic falls into that region; also called the critical region.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-related-samples-t-test\">related samples (t test)<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The analysis of two scores that are related in a systematic way within people or within pairs. Also called paired samples, matched pairs, repeated measures, dependent measures, and dependent samples, among other names.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-repeated-measures-anova\">repeated measures ANOVA<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>An analysis of variance that measures each study subject three or more times to look for a change. 
A repeated measures ANOVA is an extension of a related samples t test.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-residual\">residual<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The distance between the actual value of the Y variable and the predicted value of the Y variable.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-robustness\">robustness<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Something is robust if it holds up well in the face of adversity. A measure of central tendency or variability is considered robust if it is not greatly affected by a few extreme scores. A statistical test is considered robust if it works well in spite of moderate violations of the assumptions on which it is based.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-sample\">sample<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A subset of a population, often taken for the purpose of statistical inference.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-sampling-bias\">sampling bias<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Sampling bias occurs when participants are not selected at random or when they have an unequal probability of being selected for participation in a study.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-sampling-distribution\">sampling distribution<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A distribution that is obtained through repeated sampling from a larger population.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-sampling-error\">sampling error<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The discrepancy between a parameter and the statistic used to estimate it.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-scale-of-a-distribution\">scale (of a distribution)<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>How far apart the values of a distribution are (their spread) and where they are located (their central tendency).<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn 
id=\"dfn-scheffe-test\">Scheff\u00e9 test<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A post hoc test for ANOVA that uses an F ratio to evaluate the significance of the difference between any two treatment conditions. The Scheff\u00e9 test is one of the safest of all possible post hoc tests.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-significance-level\">significance level<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>In significance testing, the significance level is the highest value of a probability value for which the null hypothesis is rejected. Common significance levels are .05 and .01. If the .05 level is used, then the null hypothesis is rejected if the probability value is less than or equal to .05. Also called the \u03b1 level or simply \u03b1 (\u201calpha\u201d).<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-simple-random-sampling\">simple random sampling<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A process of selecting a subset of a population for the purposes of statistical inference in which every member of the population is equally likely to be chosen.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-skew\">skew<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A distribution is skewed if one tail extends out further than the other, making the distribution asymmetrical. A distribution has a positive skew (is skewed to the right) if the tail to the right is longer. A distribution has a negative skew (is skewed to the left) if the tail to the left is longer.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-sources-of-variability\">sources of variability<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The reasons that scores differ from one another. 
For example, in an ANOVA, we define two sources of variability: between-groups and within-groups variability.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-spearmans-rho\">Spearman\u2019s rho<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A statistic that expresses the relationship between two variables on a scale from 1 to \u20131. This correlation coefficient is designed to be used with ordinal data rather than continuous data and, unlike Pearson correlation, does not assume a linear relationship.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-spread\">spread<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The extent to which values differ from one another; that is, how much they vary. Spread is also called variability or dispersion.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-standard-deviation\">standard deviation<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The standard deviation is a widely used measure of variability. It is computed by taking the square root of the variance.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-standard-error\">standard error<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The standard deviation of the sampling distribution of a statistic.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-standard-error-of-the-estimate\">standard error of the estimate<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The average size of the residual, or the average distance from a researcher\u2019s predictions to the actual observed values.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-standard-normal-distribution\">standard normal distribution<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A normal distribution that has a mean of 0 and a standard deviation of 1; also known as the unit normal distribution.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-standardization\">standardization<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The process of transforming any normal distribution into a 
standard normal distribution by converting all of the raw scores in the distribution into standard scores (z scores).<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-statistical-power\">statistical power<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The probability of correctly rejecting a false null hypothesis (i.e., not making a Type II error).<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-statistical-significance\">statistical significance<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A result is statistically significant when the probability of obtaining a result at least that extreme, if the null hypothesis is true, is no greater than the chosen significance level. Generally, in psychology we look for p &lt; .05 to indicate that a mean difference or relationship is statistically significant.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-statistics\">statistics<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A range of techniques and procedures for analyzing, interpreting, displaying, and making decisions based on sample data.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-stem-and-leaf-display\">stem-and-leaf display<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A quasi-graphical representation of numerical data. Generally, all but the final digit of each value is a stem, and the final digit is the leaf. The stems are placed in a vertical list, with each matched leaf on one side.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-stratified-random-sampling\">stratified random sampling<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>In stratified random sampling, the population is divided into a number of subgroups (or strata). 
Random samples are then taken from each subgroup with sample sizes proportional to the size of the subgroup in the population.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-sum-of-squares\">sum of squares<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The sum of squared deviations, or differences, between scores and the mean in a numeric dataset.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-systematic-variability\">systematic variability<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Variation in observations that is the result of factors related to the experimental differences between groups.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-test-for-goodness-of-fit\">test for goodness of fit<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A chi-square test that assesses whether the observed frequencies in a sample fit the frequencies in a known distribution.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-test-for-independence\">test for independence<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A chi-square test that assesses whether the values of each categorical variable (that is, the frequency of its levels) are related to or independent of the values of the other categorical variable. 
This type of analysis is performed on contingency tables.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-test-statistic\">test statistic<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>An inferential statistic used to test a null hypothesis.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-tukeys-honestly-significant-difference-hsd\">Tukey\u2019s honestly significant difference (HSD)<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>A popular post hoc analysis for ANOVA that makes adjustments based on the number of comparisons; however, unlike the Bonferroni test, it makes adjustments to the test statistic when running the comparisons of two groups.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-type-i-error\">Type I error<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Rejecting the null hypothesis when it is actually true; a false positive.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-type-ii-error\">Type II error<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Failing to reject the null hypothesis when it is actually false; a false negative.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-variability\">variability<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The extent to which values differ from one another; that is, how much they vary. Variability can also be thought of as how spread out or dispersed a distribution is.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-variable\">variable<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Something that can take on different values. For example, different subjects in an experiment weigh different amounts. Therefore \u201cweight\u201d is a variable in the experiment. Or, subjects may be given different doses of a drug. This would make \u201cdosage\u201d a variable. 
Variables can be dependent or independent, qualitative or quantitative, and continuous or discrete.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-variance\">variance<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The variance is a widely used measure of variability. It is defined as the mean squared deviation of scores from the mean.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-whiskers\">whiskers<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>Vertical lines ending in a horizontal stroke that are added to box plots to indicate the spread of the data points. Whiskers are drawn from the upper and lower hinges to the upper and lower adjacent values.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-within-groups-variability\">within-groups variability<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The variability that arises from differences that occur within each group due to individual differences among participants.<\/p>\n<\/dd>\n<dt data-type=\"glossterm\"><dfn id=\"dfn-z-score\">z score<\/dfn><\/dt>\n<dd data-type=\"glossdef\">\n<p>The number of standard deviations a score is from the mean of its population. 
When the scores (or sample means) in the population are normally distributed, the z table can be used to find probabilities for obtaining a given z score.<\/p>\n<\/dd>\n<\/dl>\n","protected":false},"author":7,"menu_order":8,"template":"","meta":{"pb_show_title":"","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"back-matter-type":[38],"contributor":[],"license":[],"class_list":["post-475","back-matter","type-back-matter","status-publish","hentry","back-matter-type-glossary"],"_links":{"self":[{"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/back-matter\/475","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/back-matter"}],"about":[{"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/wp\/v2\/types\/back-matter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/wp\/v2\/users\/7"}],"version-history":[{"count":0,"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/back-matter\/475\/revisions"}],"metadata":[{"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/back-matter\/475\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/wp\/v2\/media?parent=475"}],"wp:term":[{"taxonomy":"back-matter-type","embeddable":true,"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/back-matter-type?post=475"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/wp\/v2\/contributor?post=475"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/wp\/v2\/license?post=475"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}