{"id":39,"date":"2021-12-09T22:26:51","date_gmt":"2021-12-09T22:26:51","guid":{"rendered":"https:\/\/pressbooks.palomar.edu\/introtostats\/chapter\/chapter-1\/"},"modified":"2026-07-07T20:20:43","modified_gmt":"2026-07-07T20:20:43","slug":"chapter-1","status":"publish","type":"chapter","link":"https:\/\/pressbooks.palomar.edu\/introtostats\/chapter\/chapter-1\/","title":{"raw":"Chapter 1: Introduction","rendered":"Chapter 1: Introduction"},"content":{"raw":"<div id=\"_idContainer738\" class=\"_idGenObjectStyleOverride-1\">\r\n<div class=\"textbox textbox--sidebar textbox--learning-objectives\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Key Terms<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<p class=\"Key-terms\"><a href=\"#continuous-variables\"><span class=\"Hyperlink-underscore\">continuous variables<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#control\"><span class=\"Hyperlink-underscore\">control (group)<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#convenience-sampling\"><span class=\"Hyperlink-underscore\">convenience sampling<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#data\"><span class=\"Hyperlink-underscore\">data<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#dependent-variable\"><span class=\"Hyperlink-underscore\">dependent variable<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#descriptive-statistics\"><span class=\"Hyperlink-underscore\">descriptive statistics<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#discrete-variables\"><span class=\"Hyperlink-underscore\">discrete variables<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#experimental-group\"><span class=\"Hyperlink-underscore\">experimental (group)<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#experimental-research\"><span class=\"Hyperlink-underscore\">experimental research<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#independent-variable\"><span 
class=\"Hyperlink-underscore\">independent variable<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#inferential-statistics\"><span class=\"Hyperlink-underscore\">inferential statistics<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#interval-scale\"><span class=\"Hyperlink-underscore\">interval scale<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#nominal-scale\"><span class=\"Hyperlink-underscore\">nominal scale<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#non-experimental-research\"><span class=\"Hyperlink-underscore\">non-experimental research<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#ordinal-scale\"><span class=\"Hyperlink-underscore\">ordinal scale<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#population\"><span class=\"Hyperlink-underscore\">population<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#qualitative-variables\"><span class=\"Hyperlink-underscore\">qualitative variables<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#quantitative-variables\"><span class=\"Hyperlink-underscore\">quantitative variables<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#quasi-experimental-research\"><span class=\"Hyperlink-underscore\">quasi-experimental research<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#ratio-scale\"><span class=\"Hyperlink-underscore\">ratio scale<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#sample\"><span class=\"Hyperlink-underscore\">sample<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#sampling-bias\"><span class=\"Hyperlink-underscore\">sampling bias<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#sampling-error\"><span class=\"Hyperlink-underscore\">sampling error<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#simple-random-sampling\"><span class=\"Hyperlink-underscore\">simple random sampling<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#statistics\"><span 
class=\"Hyperlink-underscore\">statistics<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#stratified-random-sampling\"><span class=\"Hyperlink-underscore\">stratified random sampling<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#variable\"><span class=\"Hyperlink-underscore\">variable<\/span><\/a><\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<p class=\"Text\">This chapter provides an overview of statistics as a field of study and presents terminology that will be used throughout the course.<\/p>\r\n\r\n<h3 data-start=\"96\" data-end=\"122\"><a id=\"statistics\"><\/a>What Are Statistics?<\/h3>\r\n<p data-start=\"124\" data-end=\"231\">Statistics include numerical facts and figures that help us understand patterns in society. For instance:<\/p>\r\n\r\n<ul data-start=\"233\" data-end=\"578\">\r\n \t<li data-start=\"233\" data-end=\"322\">Black Americans are incarcerated at more than five times the rate of white Americans.<\/li>\r\n \t<li data-start=\"323\" data-end=\"412\">Women in the United States earn, on average, 82 cents for every dollar earned by men.<\/li>\r\n \t<li data-start=\"413\" data-end=\"491\">Nearly 1 in 5 transgender people in the U.S. has experienced homelessness.<\/li>\r\n \t<li data-start=\"492\" data-end=\"578\">By the year 2050, climate change could displace over 200 million people worldwide.<\/li>\r\n<\/ul>\r\n<p data-start=\"580\" data-end=\"970\">The study of statistics involves mathematics and relies on numerical calculations. However, it also heavily depends on how data is collected and how statistics are interpreted. Consider the following three examples where the numbers might be correct, but the conclusions drawn from them are misleading. 
Try to identify the major flaw in each interpretation before reading the explanation.<\/p>\r\n\r\n<ol data-start=\"972\" data-end=\"2503\">\r\n \t<li data-start=\"972\" data-end=\"1535\">\r\n<p data-start=\"975\" data-end=\"1194\"><strong data-start=\"975\" data-end=\"1192\">A city passes a new law restricting unhoused individuals from sleeping in public spaces. A year later, official reports show a 40% decrease in visible homelessness. Thus, the law successfully reduced homelessness.<\/strong><\/p>\r\n\r\n<ul data-start=\"1198\" data-end=\"1535\">\r\n \t<li data-start=\"1198\" data-end=\"1535\"><strong data-start=\"1200\" data-end=\"1215\">Major flaw:<\/strong> A reduction in <em data-start=\"1231\" data-end=\"1240\">visible<\/em> homelessness does not necessarily mean fewer people are unhoused. Instead, the law may have pushed people into less visible areas, such as encampments in wooded areas or abandoned buildings. This is an example of a measurement issue\u2014what is being counted does not necessarily reflect reality.<\/li>\r\n<\/ul>\r\n<\/li>\r\n \t<li data-start=\"1537\" data-end=\"1960\">\r\n<p data-start=\"1540\" data-end=\"1644\"><strong data-start=\"1540\" data-end=\"1642\">Cities with more social justice protests also have higher crime rates. Thus, protests cause crime.<\/strong><\/p>\r\n\r\n<ul data-start=\"1648\" data-end=\"1960\">\r\n \t<li data-start=\"1648\" data-end=\"1960\"><strong data-start=\"1650\" data-end=\"1665\">Major flaw:<\/strong> The presence of both protests and higher crime rates can often be explained by other factors, such as systemic inequality, police responses, or urban density. 
This is an example of the <em data-start=\"1851\" data-end=\"1875\">third-variable problem<\/em>, where two factors appear related but are actually influenced by another variable.<\/li>\r\n<\/ul>\r\n<\/li>\r\n \t<li data-start=\"1962\" data-end=\"2503\">\r\n<p data-start=\"1965\" data-end=\"2135\"><strong data-start=\"1965\" data-end=\"2133\">The percentage of women in leadership positions in Fortune 500 companies has doubled over the past decade. Thus, gender inequality in the workplace has been solved.<\/strong><\/p>\r\n\r\n<ul data-start=\"2139\" data-end=\"2503\">\r\n \t<li data-start=\"2139\" data-end=\"2503\"><strong data-start=\"2141\" data-end=\"2156\">Major flaw:<\/strong> While the percentage may have increased, the actual number might still be quite low. If only 3% of CEOs were women a decade ago and now it is 6%, that is still a significant disparity. Additionally, this statistic does not address other forms of workplace inequality, such as pay gaps, hiring discrimination, or lack of parental leave policies.<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ol>\r\n<p data-start=\"2505\" data-end=\"2801\">These examples illustrate that statistics are not just numbers; they are shaped by how they are collected, interpreted, and presented. In the broadest sense, \u201cstatistics\u201d refers to a range of techniques and procedures for analyzing, interpreting, displaying, and making decisions based on data.<\/p>\r\n<p data-start=\"2803\" data-end=\"3151\">Statistics is the language of social science and activism. Understanding and communicating with statistics enables researchers, policymakers, and activists to articulate their findings, challenge misconceptions, and advocate for meaningful social change. 
It is an objective, precise, and powerful tool for advancing justice and equity in society.<\/p>\r\n\r\n<h4 class=\"H2\">What a Statistics Course Is <span class=\"bold-italic\">Not<\/span><\/h4>\r\n<p class=\"Text-1st\">Many sociology students dread the idea of taking a statistics course, and more than a few have changed majors upon learning that it is a requirement. That is because many students view statistics as a math class, which is actually not true. While many of you will not believe this or agree with it, statistics isn\u2019t math.<\/p>\r\n<p class=\"Text\">Although math is a central component of it, statistics is a broader way of organizing, interpreting, and communicating information in an objective manner. Indeed, great care has been taken to eliminate as much math from this course as possible (students who do not believe this are welcome to ask the professor what matrix algebra is). Statistics is a way of viewing reality as it exists around us in a way that we otherwise could not.<\/p>\r\n\r\n<h3 class=\"H1\">Why Do We Study Statistics?<\/h3>\r\n<p class=\"Text-1st\">Virtually every student of the behavioral sciences takes some form of statistics class. This is because statistics is how we communicate in science. It serves as the link between a research idea and usable conclusions. Without statistics, we would be unable to interpret the massive amounts of information contained in <a id=\"data\"><\/a>data. Even small datasets contain hundreds\u2014if not thousands\u2014of numbers, each representing a specific observation we made. Without a way to organize these numbers into a more interpretable form, we would be lost, having wasted the time and money of our participants, ourselves, and the communities we serve.<\/p>\r\n<p class=\"Text\">Beyond its use in science, however, there is a more personal reason to study statistics. Like most people, you probably feel that it is important to \u201ctake control of your life.\u201d But what does this mean? 
Partly, it means being able to properly evaluate the data and claims that bombard you every day. If you cannot distinguish good from faulty reasoning, then you are vulnerable to manipulation and to decisions that are not in your best interest. Statistics provides tools that you need in order to react intelligently to information you hear or read. In this sense, statistics is one of the most important things that you can study.<\/p>\r\n<p class=\"Text\">To be more specific, here are some claims that we have heard on several occasions. (We are not saying that each one of these claims is true!)<\/p>\r\n\r\n<ul>\r\n \t<li data-start=\"160\" data-end=\"288\">\r\n<p data-start=\"162\" data-end=\"288\">Nearly 40% of unhoused individuals in the U.S. are Black, even though Black people make up only about 13% of the population.<\/p>\r\n<\/li>\r\n \t<li data-start=\"289\" data-end=\"376\">\r\n<p data-start=\"291\" data-end=\"376\">Latinx workers are twice as likely as white workers to earn less than $15 per hour.<\/p>\r\n<\/li>\r\n \t<li data-start=\"377\" data-end=\"500\">\r\n<p data-start=\"379\" data-end=\"500\">Transgender people are more than four times as likely to experience violent victimization compared to cisgender people.<\/p>\r\n<\/li>\r\n \t<li data-start=\"501\" data-end=\"589\">\r\n<p data-start=\"503\" data-end=\"589\">About 1 in 5 women report experiencing sexual harassment in the workplace each year.<\/p>\r\n<\/li>\r\n \t<li data-start=\"590\" data-end=\"669\">\r\n<p data-start=\"592\" data-end=\"669\">Only 5% of Fortune 500 CEOs are women, and less than 2% are women of color.<\/p>\r\n<\/li>\r\n \t<li data-start=\"670\" data-end=\"768\">\r\n<p data-start=\"672\" data-end=\"768\">Indigenous people in the U.S. 
are incarcerated at a rate 38% higher than the national average.<\/p>\r\n<\/li>\r\n \t<li data-start=\"769\" data-end=\"925\">\r\n<p data-start=\"771\" data-end=\"925\">A recent study shows that students from low-income families are nearly 30% less likely to graduate college within six years compared to wealthier peers.<\/p>\r\n<\/li>\r\n \t<li data-start=\"926\" data-end=\"1024\">\r\n<p data-start=\"928\" data-end=\"1024\">Black women are three times more likely to die from pregnancy-related causes than white women.<\/p>\r\n<\/li>\r\n \t<li data-start=\"1025\" data-end=\"1178\">\r\n<p data-start=\"1027\" data-end=\"1178\">There\u2019s about a 50% chance that in a group of 23 people, at least two share the same birthday, a classic statistics paradox that surprises many students.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<p class=\"Text\">All of these claims are statistical in character. We suspect that some of them sound familiar; if not, we bet that you have heard other claims like them. Notice how diverse the examples are. They come from housing, labor, criminal justice, health, education, and business. Indeed, data and data interpretation show up in discourse from virtually every facet of contemporary life.<\/p>\r\n<p class=\"Text\">Statistics are often presented in an effort to add credibility to an argument or advice. You can see this by paying attention to television advertisements. Many of the numbers thrown about in this way do not represent careful statistical analysis. They can be misleading and push you into decisions that you might find cause to regret. For these reasons, learning about statistics is a long step toward taking control of your life. (It is not, of course, the only step needed to do so.) The purpose of this course, beyond preparing you for a career in the social sciences, is to help you learn statistical essentials. It will make you into an intelligent consumer of statistical claims.<\/p>\r\n<p class=\"Text\">You can take the first step right away. 
To be an intelligent consumer of statistics, your first reflex must be to question the statistics you encounter. The British Prime Minister Benjamin Disraeli is quoted by Mark Twain as having said, \u201cThere are three kinds of lies\u2014lies, damned lies, and statistics.\u201d This quote reminds us why it is so important to understand statistics. So let us invite you to reform your statistical habits from now on. No longer will you blindly accept numbers or findings. Instead, you will begin to think about the numbers, their sources, and most importantly, the procedures used to generate them.<\/p>\r\n<p class=\"Text\">The above section puts an emphasis on defending ourselves against fraudulent claims wrapped up as statistics, but let us look at a more positive note. Just as important as detecting the deceptive use of statistics is the appreciation of the proper use of statistics. You must also learn to recognize statistical evidence that supports a stated conclusion. Statistics are all around you, sometimes used well, sometimes not. We must learn how to distinguish the two cases. In doing so, statistics will likely be the course you use most in your day-to-day life, even if you do not ever run a formal analysis again.<\/p>\r\n\r\n<h3 data-start=\"325\" data-end=\"368\">TYPES OF DATA AND HOW TO COLLECT THEM<\/h3>\r\n<p data-start=\"370\" data-end=\"1200\">In order to use statistics, we need data to analyze. Data come in an amazingly diverse range of formats, and each type gives us a unique type of information. In virtually any form, data represent the measured value of variables. In sociology and psychology, we are often interested in people, so we might get a group of people together and measure their levels of stress (one variable), their access to healthcare (a second variable), and their income level (a third variable). Once we have data on these three variables, we can use statistics to understand if and how they are related. 
Before we do so, we need to understand the nature of our data\u2014what they represent and where they came from.<\/p>\r\n\r\n\r\n<hr data-start=\"1202\" data-end=\"1205\" \/>\r\n\r\n<h3 data-start=\"1207\" data-end=\"1231\"><a id=\"variable\"><\/a>TYPES OF VARIABLES<\/h3>\r\n<p data-start=\"1233\" data-end=\"1869\">When conducting research, experimenters often manipulate variables. For example, an experimenter might compare the effectiveness of four types of tutoring programs. In this case, the variable is \u201ctype of program.\u201d When a variable is manipulated by an experimenter, it is called an <a id=\"independent-variable\"><\/a>independent variable. The experiment seeks to determine the effect of the independent variable on student performance. In this example, academic achievement is called a <a id=\"dependent-variable\"><\/a>dependent variable. In general, the independent variable is manipulated by the experimenter, and its effects on the dependent variable are measured.<\/p>\r\n\r\n\r\n<hr data-start=\"1871\" data-end=\"1874\" \/>\r\n<p data-start=\"1876\" data-end=\"2172\"><strong data-start=\"1876\" data-end=\"1936\">Example #1: Does raising the minimum wage reduce stress?<\/strong><br data-start=\"1936\" data-end=\"1939\" \/>Researchers could compare three groups of workers: those earning below $15\/hour, those earning exactly $15\/hour, and those earning above $20\/hour. 
After six months, surveys and health measures could be used to assess stress levels.<\/p>\r\n\r\n<ul data-start=\"2173\" data-end=\"2284\">\r\n \t<li data-start=\"2173\" data-end=\"2237\">\r\n<p data-start=\"2175\" data-end=\"2237\">Independent variable: wage level (below $15, $15, above $20)<\/p>\r\n<\/li>\r\n \t<li data-start=\"2238\" data-end=\"2284\">\r\n<p data-start=\"2240\" data-end=\"2284\">Dependent variable: measured stress levels<\/p>\r\n<\/li>\r\n<\/ul>\r\n\r\n<hr data-start=\"2286\" data-end=\"2289\" \/>\r\n<p data-start=\"2291\" data-end=\"2574\"><strong data-start=\"2291\" data-end=\"2360\">Example #2: Do police body cameras reduce use-of-force incidents?<\/strong><br data-start=\"2360\" data-end=\"2363\" \/>In a study of police departments, some officers are randomly assigned to wear body cameras while others are not. Researchers track the number of force-related complaints filed by community members over a year.<\/p>\r\n\r\n<ul data-start=\"2575\" data-end=\"2686\">\r\n \t<li data-start=\"2575\" data-end=\"2628\">\r\n<p data-start=\"2577\" data-end=\"2628\">Independent variable: body camera use (yes or no)<\/p>\r\n<\/li>\r\n \t<li data-start=\"2629\" data-end=\"2686\">\r\n<p data-start=\"2631\" data-end=\"2686\">Dependent variable: number of use-of-force complaints<\/p>\r\n<\/li>\r\n<\/ul>\r\n\r\n<hr data-start=\"2688\" data-end=\"2691\" \/>\r\n<p data-start=\"2693\" data-end=\"2945\"><strong data-start=\"2693\" data-end=\"2772\">Example #3: Does providing free school breakfast improve academic outcomes?<\/strong><br data-start=\"2772\" data-end=\"2775\" \/>A school district implements a free breakfast program in some schools but not others. After a year, researchers compare standardized test scores between the two groups.<\/p>\r\n\r\n<ul data-start=\"2946\" data-end=\"3053\">\r\n \t<li data-start=\"2946\" data-end=\"3017\">\r\n<p data-start=\"2948\" data-end=\"3017\">Independent variable: breakfast program (provided vs. 
not provided)<\/p>\r\n<\/li>\r\n \t<li data-start=\"3018\" data-end=\"3053\">\r\n<p data-start=\"3020\" data-end=\"3053\">Dependent variable: test scores<\/p>\r\n<\/li>\r\n<\/ul>\r\n\r\n<hr data-start=\"3055\" data-end=\"3058\" \/>\r\n\r\n<h3 data-start=\"3060\" data-end=\"3099\">LEVELS OF AN INDEPENDENT VARIABLE<\/h3>\r\n<p data-start=\"3101\" data-end=\"3553\">If an experiment compares an experimental treatment with a control treatment, then the independent variable (type of treatment) has two levels: <a id=\"experimental-group\"><\/a>experimental and [pb_glossary id=\"495\"]<a id=\"control-group\"><\/a><a id=\"control\"><\/a>control[\/pb_glossary]. If an experiment were comparing five types of health insurance coverage, then the independent variable (type of coverage) would have five levels. In general, the number of levels of an independent variable is the number of experimental conditions.<\/p>\r\n\r\n<h4 class=\"H2\">Qualitative and Quantitative Variables<\/h4>\r\n<p class=\"Text-1st\">An important distinction is the one between qualitative variables and quantitative variables. <span class=\"key-term\"><a id=\"qualitative-variables\"><\/a>Qualitative variables<\/span> are those that express a qualitative attribute such as hair color, eye color, religion, favorite movie, gender, and so on. The values of a qualitative variable do not imply a numerical ordering. Values of the variable \u201creligion\u201d differ qualitatively; no ordering of religions is implied. Qualitative variables are sometimes referred to as categorical or nominal variables. <span class=\"key-term\"><a id=\"quantitative-variables\"><\/a>Quantitative variables<\/span> are those variables that are measured in terms of numbers. Some examples of quantitative variables are height, weight, and shoe size.<\/p>\r\n<p class=\"Text\">Suppose a study of the effect of diet on memory used type of supplement as its independent variable, with four levels: none, strawberry, blueberry, and spinach. 
The variable \u201ctype of supplement\u201d is a qualitative variable; there is nothing quantitative about it. In contrast, the dependent variable \u201cmemory test\u201d is a quantitative variable since memory performance was measured on a quantitative scale (number correct).<\/p>\r\n\r\n<h4 class=\"H2\">Discrete and Continuous Variables<\/h4>\r\n<p class=\"Text-1st\">Variables such as number of children in a household are called <span class=\"key-term\"><a id=\"discrete-variables\"><\/a>[pb_glossary id=\"507\"]discrete variables[\/pb_glossary]<\/span> since the possible scores are discrete points on the scale. For example, a household could have three children or six children, but not 4.53 children. Other variables such as time to respond to a question are <span class=\"key-term\">[pb_glossary id=\"494\"]<a id=\"continuous-variables\"><\/a>continuous variables[\/pb_glossary]<\/span> since the scale is continuous and not made up of discrete steps. The response time could be 1.64 seconds, or it could be 1.64237123922121 seconds. Of course, the practicalities of measurement preclude most measured variables from being truly continuous.<\/p>\r\n\r\n<h3 data-start=\"158\" data-end=\"185\">LEVELS OF MEASUREMENT<\/h3>\r\n<p data-start=\"187\" data-end=\"861\">Before we can conduct a statistical analysis, we need to measure our dependent variable. Exactly how the measurement is carried out depends on the type of variable involved in the analysis. Different types of variables require different methods of measurement. For example, to measure how long it takes someone to complete a job-training program, you might use a calendar or clock. But to measure a community\u2019s sense of safety in their neighborhood, a survey with response options such as \u201cvery unsafe,\u201d \u201csomewhat unsafe,\u201d or \u201cvery safe\u201d would be more appropriate. 
And for a variable like racial\/ethnic identity, we would simply record the category the respondent selects.<\/p>\r\n<p data-start=\"863\" data-end=\"1243\">Although the procedures for measurement differ, they can be classified into a few fundamental categories. Each category captures specific properties of data that are important to understand if we want to analyze inequality, evaluate programs, or document disparities accurately. These categories are called <strong data-start=\"1170\" data-end=\"1185\">scale types<\/strong> (or just <strong data-start=\"1195\" data-end=\"1205\">scales<\/strong>) and are described in this section.<\/p>\r\n\r\n<h3 data-start=\"309\" data-end=\"333\">TYPES OF VARIABLES<\/h3>\r\n<p data-start=\"335\" data-end=\"1004\">When conducting research, experimenters often manipulate variables. For example, an experimenter might compare the effectiveness of different types of community programs. In this case, the variable is \u201ctype of program.\u201d When a variable is manipulated by an experimenter, it is called an [pb_glossary id=\"612\"]independent variable[\/pb_glossary]. The experiment seeks to determine the effect of the independent variable on outcomes such as health, education, or safety. In this example, the measurable result is called a dependent variable. In general, the independent variable is manipulated by the experimenter, and its effects on the dependent variable are measured.<\/p>\r\n\r\n\r\n<hr data-start=\"1006\" data-end=\"1009\" \/>\r\n<p data-start=\"1011\" data-end=\"1237\"><strong data-start=\"1011\" data-end=\"1081\">Example #1: Do school lunch programs improve academic performance?<\/strong><br data-start=\"1081\" data-end=\"1084\" \/>Researchers study students in schools with free lunch, reduced-price lunch, or no lunch program. 
After one year, they compare standardized test scores.<\/p>\r\n\r\n<ul data-start=\"1239\" data-end=\"1375\">\r\n \t<li data-start=\"1239\" data-end=\"1312\">\r\n<p data-start=\"1241\" data-end=\"1312\"><strong data-start=\"1241\" data-end=\"1266\">Independent variable:<\/strong> type of lunch program (free, reduced, none)<\/p>\r\n<\/li>\r\n \t<li data-start=\"1313\" data-end=\"1375\">\r\n<p data-start=\"1315\" data-end=\"1375\"><strong data-start=\"1315\" data-end=\"1338\">Dependent variable:<\/strong> academic performance (test scores)<\/p>\r\n<\/li>\r\n<\/ul>\r\n\r\n<hr data-start=\"1377\" data-end=\"1380\" \/>\r\n<p data-start=\"1382\" data-end=\"1658\"><strong data-start=\"1382\" data-end=\"1455\">Example #2: Does access to affordable housing reduce health problems?<\/strong><br data-start=\"1455\" data-end=\"1458\" \/>A study tracks families who receive housing vouchers compared to those who remain on a waiting list. Over five years, researchers measure health outcomes such as rates of asthma and hospital visits.<\/p>\r\n\r\n<ul data-start=\"1660\" data-end=\"1805\">\r\n \t<li data-start=\"1660\" data-end=\"1729\">\r\n<p data-start=\"1662\" data-end=\"1729\"><strong data-start=\"1662\" data-end=\"1687\">Independent variable:<\/strong> housing status (voucher vs. no voucher)<\/p>\r\n<\/li>\r\n \t<li data-start=\"1730\" data-end=\"1805\">\r\n<p data-start=\"1732\" data-end=\"1805\"><strong data-start=\"1732\" data-end=\"1755\">Dependent variable:<\/strong> health outcomes (asthma rates, hospital visits)<\/p>\r\n<\/li>\r\n<\/ul>\r\n\r\n<hr data-start=\"1807\" data-end=\"1810\" \/>\r\n<p data-start=\"1812\" data-end=\"2064\"><strong data-start=\"1812\" data-end=\"1881\">Example #3: Do body cameras reduce police use-of-force incidents?<\/strong><br data-start=\"1881\" data-end=\"1884\" \/>Police departments randomly assign some officers to wear body cameras and others not to. 
Researchers then record the number of use-of-force complaints filed by community members.<\/p>\r\n\r\n<ul data-start=\"2066\" data-end=\"2185\">\r\n \t<li data-start=\"2066\" data-end=\"2123\">\r\n<p data-start=\"2068\" data-end=\"2123\"><strong data-start=\"2068\" data-end=\"2093\">Independent variable:<\/strong> body camera use (yes or no)<\/p>\r\n<\/li>\r\n \t<li data-start=\"2124\" data-end=\"2185\">\r\n<p data-start=\"2126\" data-end=\"2185\"><strong data-start=\"2126\" data-end=\"2149\">Dependent variable:<\/strong> number of use-of-force complaints<\/p>\r\n<\/li>\r\n<\/ul>\r\n<h3 data-start=\"312\" data-end=\"332\"><a id=\"nominal-scale\"><\/a>NOMINAL SCALES<\/h3>\r\n<p data-start=\"334\" data-end=\"934\">When measuring using a nominal scale, one simply names or categorizes responses. Race\/ethnicity, gender identity, housing status, and immigration status are examples of variables measured on a nominal scale. The essential point about nominal scales is that they do not imply any ordering among the responses. For example, when classifying people by housing status (housed, unhoused, transitional housing), there is no sense in which \u201choused\u201d is placed \u201cahead of\u201d \u201cunhoused.\u201d Responses are merely categories. Nominal scales embody the lowest level of measurement.<\/p>\r\n\r\n\r\n<hr data-start=\"936\" data-end=\"939\" \/>\r\n\r\n<h3 data-start=\"941\" data-end=\"961\"><a id=\"ordinal-scale\"><\/a>ORDINAL SCALES<\/h3>\r\n<p data-start=\"963\" data-end=\"1556\">A researcher wishing to measure students\u2019 sense of belonging on campus might ask them to rate their experiences as \u201cvery excluded,\u201d \u201csomewhat excluded,\u201d \u201csomewhat included,\u201d or \u201cvery included.\u201d The items in this scale are ordered, ranging from least to most included. This is what distinguishes ordinal from nominal scales. 
Unlike a nominal scale, an ordinal scale allows a comparison of the degree to which two individuals report belonging. For example, our belonging scale makes it meaningful to assert that one student feels more included than another.<\/p>\r\n<p data-start=\"1558\" data-end=\"2320\">On the other hand, ordinal scales fail to capture important information that is present in interval and ratio scales. In particular, the difference between two levels of an ordinal scale cannot be assumed to be the same as the difference between two other levels. In our belonging scale, for example, the difference between \u201cvery excluded\u201d and \u201csomewhat excluded\u201d may not be equivalent to the difference between \u201csomewhat included\u201d and \u201cvery included.\u201d Nothing in our measurement procedure allows us to determine whether the two differences reflect the same change in belonging. Statisticians express this by saying that the differences between adjacent scale values do not necessarily represent equal intervals on the underlying scale giving rise to the measurements.<\/p>\r\n<p data-start=\"2322\" data-end=\"2536\">Even if we changed the response format to numbers (1 = very excluded, 2 = somewhat excluded, etc.), the meaning would remain ordinal. The jump from 1 to 2 is not guaranteed to be the same as the jump from 3 to 4.<\/p>\r\n\r\n\r\n<hr data-start=\"2538\" data-end=\"2541\" \/>\r\n\r\n<h3 data-start=\"2543\" data-end=\"2564\"><a id=\"interval-scale\"><\/a>INTERVAL SCALES<\/h3>\r\n<p data-start=\"2566\" data-end=\"2936\">An interval scale is a numerical scale in which intervals have the same interpretation throughout. A good example comes from educational testing: standardized test scores such as the SAT. The difference between a score of 1000 and 1100 is intended to represent the same difference in performance as the difference between 1200 and 1300.<\/p>\r\n<p data-start=\"2938\" data-end=\"3483\">Interval scales are not perfect, however. 
They do not have a true zero point even if one of the scaled values happens to carry the name \u201czero.\u201d For instance, in public opinion polling, \u201czero\u201d support for a candidate does not literally mean <em data-start=\"3178\" data-end=\"3186\">no one<\/em> supports them; it just reflects the limits of the measurement. Because an interval scale lacks a true zero, it does not make sense to compute ratios. We cannot say that an SAT score of 1200 means a student is \u201ctwice as smart\u201d as a student with a score of 600, since the zero point is arbitrary.<\/p>\r\n\r\n\r\n<hr data-start=\"3485\" data-end=\"3488\" \/>\r\n\r\n<h3 data-start=\"3490\" data-end=\"3508\"><a id=\"ratio-scale\"><\/a>RATIO SCALES<\/h3>\r\n<p data-start=\"3510\" data-end=\"3739\">The ratio scale of measurement is the most informative scale. It is an interval scale with the additional property that its zero position indicates the absence of the quantity being measured.<\/p>\r\n<p data-start=\"3741\" data-end=\"3988\">An example of a ratio scale is income. A person with $0 income truly has no earnings, and someone earning $40,000 makes twice as much as someone earning $20,000. This is what makes it a ratio scale: the zero means \u201cnone,\u201d and ratios are meaningful.<\/p>\r\n<p data-start=\"3990\" data-end=\"4288\">Another example is hours worked per week. Zero hours means no work at all, while 40 hours is twice as much as 20 hours. Measures such as number of arrests, years of education completed, or distance to the nearest grocery store also fall into the ratio category because they have true zero points.<\/p>\r\n<p data-start=\"3990\" data-end=\"4288\">In practice, researchers often treat <strong data-start=\"174\" data-end=\"217\">interval and ratio data in similar ways<\/strong> because both use numerical values with equal intervals between them. 
For example, a public opinion survey on immigration policy might use a 1\u20137 scale of attitudes (interval), while census data could record household income in dollars (ratio). Both can be averaged, graphed, or analyzed using many of the same statistical techniques. The main difference is that ratio data have a true zero point while interval data do not, but most statistical procedures, such as correlation, regression, or ANOVA, apply equally well to both. This is why you will often see interval and ratio data grouped together under the term <em data-start=\"843\" data-end=\"855\">scale data<\/em> in statistical software.<\/p>\r\n\r\n<h4 class=\"H2\">What Level of Measurement Is Used for Behavioral Science Variables?<\/h4>\r\n<p class=\"Text-1st\">Rating scales are used frequently in behavioral science research. For example, experimental subjects may be asked to rate their level of pain, how much they like a consumer product, their attitudes about capital punishment, or their confidence in an answer to a test question. Typically these ratings are made on a 5-point or a 7-point scale. Strictly speaking, these are ordinal scales. However, researchers often treat them as interval scales, which assumes that the scale values are equidistant. For example, we assume that a treatment that reduces pain from a rated pain level of 3 to a rated pain level of 2 represents the same level of relief as a treatment that reduces pain from a rated pain level of 7 to a rated pain level of 6.<\/p>\r\n<p class=\"Text\">In memory experiments, the dependent variable is often the number of items correctly recalled. What scale of measurement is this? You could reasonably argue that it is a ratio scale. First, there is a true zero point; some subjects may get no items correct at all. Moreover, a difference of one represents a difference of one item recalled across the entire scale. 
It is certainly valid to say that someone who recalled 12 items recalled twice as many items as someone who recalled only 6 items.<\/p>\r\n\r\n<h3 data-start=\"211\" data-end=\"253\">CONSEQUENCES OF LEVEL OF MEASUREMENT<\/h3>\r\n<p data-start=\"255\" data-end=\"744\">Why are we so interested in the type of scale that measures a dependent variable? The crux of the matter is the relationship between the variable\u2019s level of measurement and the statistics that can be meaningfully computed with that variable. For example, consider a study in which five students are asked to report their housing status, choosing from the categories: <em data-start=\"622\" data-end=\"697\">housed, temporarily doubled-up, shelter, street, or transitional housing.<\/em> The researcher codes the results as follows:<\/p>\r\n\r\n<div class=\"_tableContainer_1rjym_1\">\r\n<div class=\"group _tableWrapper_1rjym_13 flex w-fit flex-col-reverse\">\r\n<table class=\"w-fit min-w-(--thread-content-width)\" data-start=\"746\" data-end=\"990\">\r\n<thead data-start=\"746\" data-end=\"780\">\r\n<tr data-start=\"746\" data-end=\"780\">\r\n<th data-start=\"746\" data-end=\"770\" data-col-size=\"sm\">Housing Status<\/th>\r\n<th data-start=\"770\" data-end=\"780\" data-col-size=\"sm\">Code<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody data-start=\"816\" data-end=\"990\">\r\n<tr data-start=\"816\" data-end=\"850\">\r\n<td data-start=\"816\" data-end=\"840\" data-col-size=\"sm\">Housed<\/td>\r\n<td data-start=\"840\" data-end=\"850\" data-col-size=\"sm\">1<\/td>\r\n<\/tr>\r\n<tr data-start=\"851\" data-end=\"885\">\r\n<td data-start=\"851\" data-end=\"875\" data-col-size=\"sm\">Doubled-up<\/td>\r\n<td data-start=\"875\" data-end=\"885\" data-col-size=\"sm\">2<\/td>\r\n<\/tr>\r\n<tr data-start=\"886\" data-end=\"920\">\r\n<td data-start=\"886\" data-end=\"910\" data-col-size=\"sm\">Shelter<\/td>\r\n<td data-start=\"910\" data-end=\"920\" data-col-size=\"sm\">3<\/td>\r\n<\/tr>\r\n<tr data-start=\"921\" 
data-end=\"955\">\r\n<td data-start=\"921\" data-end=\"945\" data-col-size=\"sm\">Transitional housing<\/td>\r\n<td data-start=\"945\" data-end=\"955\" data-col-size=\"sm\">4<\/td>\r\n<\/tr>\r\n<tr data-start=\"956\" data-end=\"990\">\r\n<td data-start=\"956\" data-end=\"980\" data-col-size=\"sm\">Street<\/td>\r\n<td data-start=\"980\" data-end=\"990\" data-col-size=\"sm\">5<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\n<\/div>\r\n<p class=\"Text\">Because housing status is measured on a nominal scale, these codes are labels only. Averaging them would be meaningless: a mean code of 3 would not indicate that the typical student lives in a shelter, and a different, equally valid assignment of codes would produce a different mean. The level of measurement therefore limits which statistics can be meaningfully computed.<\/p>\r\n<h3 class=\"H1\">Collecting Data<\/h3>\r\n<p class=\"Text-1st\">We are usually interested in understanding a specific group of people. This group is known as the population of interest, or simply the population. The <span class=\"key-term\"><a id=\"population\"><\/a>population<\/span> is the collection of all people who have some characteristic in common; it can be as broad as \u201call people\u201d if we have a very general research question about human behavior, or it can be extremely narrow, such as \u201call freshmen psychology majors at Midwestern public universities\u201d if we have a specific group in mind.<\/p>\r\n\r\n<h3 data-start=\"238\" data-end=\"267\">POPULATIONS AND SAMPLES<\/h3>\r\n<p data-start=\"269\" data-end=\"510\">In statistics, we often rely on a <a id=\"sample\"><\/a>[pb_glossary id=\"576\"]sample[\/pb_glossary]\u2014that is, a small subset of a larger set of data\u2014to draw inferences about the larger set. 
The larger set is known as the population from which the sample is drawn.<\/p>\r\n\r\n\r\n<hr data-start=\"512\" data-end=\"515\" \/>\r\n<p data-start=\"517\" data-end=\"886\"><strong data-start=\"517\" data-end=\"553\">Example #1: Access to healthcare<\/strong><br data-start=\"553\" data-end=\"556\" \/>Suppose researchers want to know how adults in the United States feel about whether healthcare is affordable. It would not be practical to ask every single adult in the country, so researchers instead survey a smaller group of people. 
The group of adults surveyed is the <em data-start=\"827\" data-end=\"836\">sample,<\/em> while all U.S. adults make up the <em data-start=\"871\" data-end=\"884\">population.<\/em><\/p>\r\n<p data-start=\"888\" data-end=\"1500\">A sample is typically a small subset of the population. In the case of healthcare attitudes, we might sample a few thousand Americans drawn from the hundreds of millions in the population. But if our sample were made up entirely of people from urban hospitals, it would leave out the experiences of rural residents. Similarly, if the sample included only people with private insurance, it would fail to represent those on Medicaid or those who are uninsured. This is the problem of <strong data-start=\"1370\" data-end=\"1387\"><a id=\"sampling-bias\"><\/a>sampling bias<\/strong>: when our sample over-represents one kind of person, our results cannot be generalized to the full population.<\/p>\r\n\r\n\r\n<hr data-start=\"1502\" data-end=\"1505\" \/>\r\n<p data-start=\"1507\" data-end=\"2384\"><strong data-start=\"1507\" data-end=\"1544\">Example #2: College affordability<\/strong><br data-start=\"1544\" data-end=\"1547\" \/>Imagine we are interested in how many hours per week college students work, on average, while pursuing their degrees. The population in this case is <em data-start=\"1693\" data-end=\"1721\">all U.S. college students.<\/em> Because there are millions of students enrolled in thousands of institutions, it would be impossible to collect work-hour data from everyone. Instead, we select a sample of students from a mix of public and private colleges, community colleges, and universities. If we found in our sample that students work an average of 20 hours per week, we might infer that this is close to the true population average. But we must be cautious: if our sample leaned heavily toward community colleges (where students are more likely to work longer hours), then the estimate might overstate the work hours of all college students. 
Again, unrepresentative samples can mislead.<\/p>\r\n\r\n\r\n<hr data-start=\"2386\" data-end=\"2389\" \/>\r\n<p data-start=\"2391\" data-end=\"2586\">To solidify your understanding of sampling bias, consider the following examples. Identify the population and the sample, and then ask whether the sample is likely to give accurate information.<\/p>\r\n<p data-start=\"2588\" data-end=\"2841\"><strong data-start=\"2588\" data-end=\"2625\">Example #3: School climate survey<\/strong><br data-start=\"2625\" data-end=\"2628\" \/>A high school principal wants to know how safe students feel on campus. She distributes surveys, but only to students in the honors program. From their responses, she concludes that students generally feel safe.<\/p>\r\n\r\n<ul data-start=\"2842\" data-end=\"3077\">\r\n \t<li data-start=\"2842\" data-end=\"2891\">\r\n<p data-start=\"2844\" data-end=\"2891\"><em data-start=\"2844\" data-end=\"2857\">Population:<\/em> all students in the high school<\/p>\r\n<\/li>\r\n \t<li data-start=\"2892\" data-end=\"2929\">\r\n<p data-start=\"2894\" data-end=\"2929\"><em data-start=\"2894\" data-end=\"2903\">Sample:<\/em> honors program students<\/p>\r\n<\/li>\r\n \t<li data-start=\"2930\" data-end=\"3077\">\r\n<p data-start=\"2932\" data-end=\"3077\"><em data-start=\"2932\" data-end=\"2942\">Problem:<\/em> honors students may have different experiences of school climate than students in other tracks, so the sample is not representative.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<p data-start=\"3079\" data-end=\"3356\"><strong data-start=\"3079\" data-end=\"3123\">Example #4: Housing insecurity on campus<\/strong><br data-start=\"3123\" data-end=\"3126\" \/>A researcher wants to estimate how many students at a university have experienced housing insecurity. She asks for volunteers and receives responses from 30 students. 
She reports that 90% of students have struggled with housing.<\/p>\r\n\r\n<ul data-start=\"3357\" data-end=\"3571\">\r\n \t<li data-start=\"3357\" data-end=\"3405\">\r\n<p data-start=\"3359\" data-end=\"3405\"><em data-start=\"3359\" data-end=\"3372\">Population:<\/em> all students at the university<\/p>\r\n<\/li>\r\n \t<li data-start=\"3406\" data-end=\"3433\">\r\n<p data-start=\"3408\" data-end=\"3433\"><em data-start=\"3408\" data-end=\"3417\">Sample:<\/em> 30 volunteers<\/p>\r\n<\/li>\r\n \t<li data-start=\"3434\" data-end=\"3571\">\r\n<p data-start=\"3436\" data-end=\"3571\"><em data-start=\"3436\" data-end=\"3446\">Problem:<\/em> students experiencing housing insecurity are more likely to volunteer, so the estimate may exaggerate the true prevalence.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<h4 class=\"H2\">Simple Random Sampling<\/h4>\r\n<p class=\"Text-1st\">Researchers adopt a variety of sampling strategies. The most straightforward is <span class=\"key-term\"><a id=\"simple-random-sampling\"><\/a>simple random sampling<\/span>. Such sampling requires every member of the population to have an equal chance of being selected into the sample. In addition, the selection of one member must be independent of the selection of every other member. That is, picking one member from the population must not increase or decrease the probability of picking any other member (relative to the others). In this sense, we can say that simple random sampling chooses a sample by pure chance. To check your understanding of simple random sampling, consider the following example. What is the population? What is the sample? Was the sample picked by simple random sampling? Is it biased?<\/p>\r\n<p class=\"Example\"><span class=\"semibold\">Example #5:<\/span> A research scientist is interested in studying the experiences of twins raised together versus those raised apart. She obtains a list of twins from the National Twin Registry, and selects two subsets of individuals for her study. 
First, she chooses all those in the registry whose last name begins with <span class=\"italic\">Z<\/span>. Then she turns to all those whose last name begins with\u00a0<span class=\"italic\">B<\/span>. Because there are so many names that start with <span class=\"italic\">B<\/span>, however, our researcher decides to incorporate only every other name into her sample. Finally, she mails out a survey and compares characteristics of twins raised apart versus together.<\/p>\r\n<p class=\"Text\">In Example #5, the population consists of all twins recorded in the National Twin Registry. It is important that the researcher only make statistical generalizations to the twins on this list, not to all twins in the nation or world. That is, the National Twin Registry may not be representative of all twins. Even if inferences are limited to the Registry, a number of problems affect the sampling procedure we described. For instance, choosing only twins whose last names begin with <span class=\"italic\">Z <\/span>does not give every individual an equal chance of being selected into the sample. Moreover, such a procedure risks over-representing ethnic groups with many surnames that begin with <span class=\"italic\">Z<\/span>. There are other reasons why choosing just the <span class=\"italic\">Z<\/span>s may bias the sample. Perhaps such people are more patient than average because they often find themselves at the end of the line!<\/p>\r\n<p class=\"Text\">The same problem occurs with choosing twins whose last name begins with <span class=\"italic\">B<\/span>. An additional problem for the <span class=\"italic\">B<\/span>s is that the every-other-one procedure prevents two adjacent names on the <span class=\"italic\">B<\/span> part of the list from both being selected, so the selection of one member is not independent of the selection of another. 
Just this defect alone means the sample was not formed through simple random sampling.<\/p>\r\n\r\n<h4 class=\"H2\">Sample Size Matters<\/h4>\r\n<p class=\"Text-1st\">Recall that the definition of a random sample is a sample in which every member of the population has an equal chance of being selected. This means that the sampling procedure rather than the results of the procedure define what it means for a sample to be random. Random samples, especially if the sample size is small, are not necessarily representative of the entire population. For example, if a random sample of 20 subjects were taken from a population with an equal number of males and females, there would be a nontrivial probability (.06) that 70% or more of the sample would be female. Such a sample would not be representative, although it would be drawn randomly. Only a large sample size makes it likely that our sample is close to representative of the population. For this reason, inferential statistics take into account the sample size when generalizing results from samples to populations. In later chapters, you\u2019ll see what kinds of mathematical techniques ensure this sensitivity to sample size.<\/p>\r\n\r\n<h4 class=\"H2\">More Complex Sampling<\/h4>\r\n<p class=\"Text-1st\">Sometimes it is not feasible to build a sample using simple random sampling. To see the problem, consider the fact that both Dallas and Houston competed to be hosts of the 2012 Olympics. Imagine that you had been hired to assess whether most Texans preferred Houston to Dallas as the host, or the reverse. Given the impracticality of obtaining the opinion of every single Texan, you had to construct a sample of the Texas population. But notice how difficult it would have been to proceed by simple random sampling. For example, how would you have contacted those individuals who didn\u2019t vote and didn\u2019t have a phone? 
Even among people you found in the telephone book, how could you have identified those who had just relocated to another state (and had no reason to inform you of their move)? What would you have done about the fact that since the beginning of the study, an additional 4,212 people took up residence in the state of Texas? As you can see, it is sometimes very difficult to develop a truly random procedure. For this reason, other kinds of sampling techniques have been devised. We now discuss two of them.<\/p>\r\n\r\n<h5 class=\"H3\">Stratified Sampling<\/h5>\r\n<p class=\"Text-1st\">Since simple random sampling often does not ensure a representative sample, a sampling method called <span class=\"key-term\"><a id=\"stratified-random-sampling\"><\/a>stratified random sampling<\/span> is sometimes used to make the sample more representative of the population. This method can be used if the population has a number of distinct \u201cstrata\u201d or groups. In stratified sampling, you first identify the members of the population who belong to each group. Then you randomly sample from each of those subgroups in such a way that the sizes of the subgroups in the sample are proportional to their sizes in the population.<\/p>\r\n<p class=\"Text\">Let\u2019s take an example: Suppose you were interested in views of capital punishment at an urban university. You have the time and resources to interview 200 students. The student body is diverse with respect to age; many older people work during the day and enroll in night courses (average age is 39), while younger students generally enroll in day classes (average age of 19). It is possible that night students have different views about capital punishment than day students. If 70% of the students are day students, it makes sense to ensure that 70% of the sample consists of day students. Thus, your sample of 200 students would consist of 140 day students and 60 night students. 
The proportion of day students in the sample and in the population (the entire university) would be the same. Inferences to the entire population of students at the university would therefore be more secure.<\/p>\r\n\r\n<h5 class=\"H3\"><a id=\"convenience-sampling\"><\/a>Convenience Sampling<\/h5>\r\n<p class=\"Text-1st\">Not all sampling methods are perfect, and sometimes that\u2019s okay. For example, if we are beginning research into a completely unstudied area, we may sometimes take some shortcuts to quickly gather data and get a general idea of how things work before fully investing a lot of time and money into well-designed research projects with proper sampling. This is known as <span class=\"key-term\">convenience sampling<\/span>, named for its ease of use. In limited cases, such as the one just described, convenience sampling is okay because we intend to follow up with a representative sample. Unfortunately, sometimes convenience sampling is used due only to its convenience without the intent of improving on it in future work.<\/p>\r\n\r\n<h3 class=\"H1\">Types of Statistical Analyses<\/h3>\r\n<p class=\"Text-1st\">Now that we understand the nature of our data, let\u2019s turn to the types of statistics we can use to interpret them. There are two types of statistics: descriptive and inferential.<\/p>\r\n\r\n<h4 class=\"H2\">Descriptive Statistics<\/h4>\r\n<p class=\"Text-1st\"><span class=\"key-term\"><a id=\"descriptive-statistics\"><\/a>[pb_glossary id=\"505\"]Descriptive statistics[\/pb_glossary]<\/span> are numbers that are used to summarize and describe data. The word \u201cdata\u201d refers to the information that has been collected from an experiment, a survey, a historical record, etc. (By the way, <span class=\"italic\">data<\/span> is plural. One piece of information is called a <span class=\"italic\">datum<\/span>.) 
If we are analyzing birth certificates, for example, a descriptive statistic might be the percentage of certificates issued in New York State, or the average age of the mother. Any other number we choose to compute also counts as a descriptive statistic for the data from which the statistic is computed. Several descriptive statistics are often used at one time to give a full picture of the data.<\/p>\r\n<p class=\"Text\">Descriptive statistics are just descriptive. They do not involve generalizing beyond the data at hand. Generalizing from our data to another set of cases is the business of inferential statistics, which you\u2019ll be studying in another section. Here we focus on (mere) descriptive statistics.<\/p>\r\n<p class=\"Text\">Some descriptive statistics are shown in <a href=\"#_idTextAnchor030\"><span class=\"Fig-table-number-underscore\">Table 1.1<\/span><\/a>. The table shows the average salaries for various occupations in the United States in 1999. Descriptive statistics like these offer insight into American society. It is interesting to note, for example, that we pay the people who educate our children and who protect our citizens a great deal less than we pay people who take care of our feet or our teeth.<\/p>\r\n\r\n<div class=\"_idGenObjectLayout-1\">\r\n<div id=\"_idContainer005\" class=\"Basic-Text-Frame\">\r\n<p class=\"Table-title\"><span class=\"Fig-table-number\">Table 1.1.<\/span> Average salaries for various U.S. 
occupations in 1999.<\/p>\r\n\r\n<table id=\"table004\" class=\"Foster-table\"><colgroup> <col class=\"_idGenTableRowColumn-13\" \/> <col class=\"_idGenTableRowColumn-14\" \/><\/colgroup>\r\n<thead>\r\n<tr class=\"Foster-table _idGenTableRowColumn-5\">\r\n<th class=\"Foster-table Table-col-hd CellOverride-2\" scope=\"row\">\r\n<p class=\"Table-col-hd\">Occupation<\/p>\r\n<\/th>\r\n<th class=\"Foster-table Table-col-hd\" scope=\"row\">\r\n<p class=\"Table-col-hd ParaOverride-4\">Salary<\/p>\r\n<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr class=\"Foster-table _idGenTableRowColumn-6\">\r\n<td class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-1\">\r\n<p class=\"Table-body\">Pediatricians<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-3 _idGenCellOverride-1\">\r\n<p class=\"Table-body ParaOverride-5\">$112,760<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-7\">\r\n<td class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\">\r\n<p class=\"Table-body\">Dentists<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-3 _idGenCellOverride-2\">\r\n<p class=\"Table-body ParaOverride-5\">$106,130<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-6\">\r\n<td class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\">\r\n<p class=\"Table-body\">Podiatrists<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-3 _idGenCellOverride-2\">\r\n<p class=\"Table-body ParaOverride-5\">$100,090<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-7\">\r\n<td class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\">\r\n<p class=\"Table-body\">Physicists<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-3 _idGenCellOverride-2\">\r\n<p class=\"Table-body ParaOverride-5\">$76,140<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-6\">\r\n<td class=\"Foster-table Table-body 
CellOverride-2 _idGenCellOverride-2\">\r\n<p class=\"Table-body\">Architects<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-3 _idGenCellOverride-2\">\r\n<p class=\"Table-body ParaOverride-5\">$53,410<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-7\">\r\n<td class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\">\r\n<p class=\"Table-body\">School, clinical, and counseling psychologists<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-3 _idGenCellOverride-2\">\r\n<p class=\"Table-body ParaOverride-5\">$49,720<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-6\">\r\n<td class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\">\r\n<p class=\"Table-body\">Flight attendants<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-3 _idGenCellOverride-2\">\r\n<p class=\"Table-body ParaOverride-5\">$47,910<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-7\">\r\n<td class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\">\r\n<p class=\"Table-body\">Elementary school teachers<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-3 _idGenCellOverride-2\">\r\n<p class=\"Table-body ParaOverride-5\">$39,560<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-6\">\r\n<td class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\">\r\n<p class=\"Table-body\">Police officers<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-3 _idGenCellOverride-2\">\r\n<p class=\"Table-body ParaOverride-5\">$38,710<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-8\">\r\n<td class=\"Foster-table Table-body-last Table-body CellOverride-2\">\r\n<p class=\"Table-body\">Floral designers<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body-last Table-body CellOverride-3\">\r\n<p class=\"Table-body 
ParaOverride-5\">$18,980<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\n<\/div>\r\n<p class=\"Text\">For more descriptive statistics, consider <a href=\"#_idTextAnchor031\"><span class=\"Fig-table-number-underscore\">Table 1.2<\/span><\/a>. It shows the number of unmarried men per 100 unmarried women in U.S. metro areas in 1990. From this table we see that men outnumber women most in Jacksonville, North Carolina, and women outnumber men most in Sarasota, Florida. You can see that descriptive statistics can be useful if we are looking for an opposite-sex partner! (These data come from the <a href=\"https:\/\/www.infoplease.com\/us\/states\/the-top-ten-us-male-female-ratios\"><span class=\"Hyperlink-underscore\">Information Please Almanac<\/span><\/a>.)<\/p>\r\n\r\n<div class=\"_idGenObjectLayout-1\">\r\n<div id=\"_idContainer006\" class=\"_idGenObjectStyleOverride-1\">\r\n<p class=\"Table-title\"><span class=\"Fig-table-number\">Table 1.2.<\/span> Number of unmarried men per 100 unmarried women in U.S. metro areas in 1990. 
<span class=\"italic CharOverride-3\">note<\/span><span class=\"italic\">: Unmarried includes never<\/span>-<span class=\"italic\">married, widowed, and divorced persons, 15 years or older.<\/span><\/p>\r\n\r\n<table id=\"table005\" class=\"Foster-table\" style=\"height: 187px;\"><colgroup> <col class=\"_idGenTableRowColumn-15\" \/> <col class=\"_idGenTableRowColumn-16\" \/> <col class=\"_idGenTableRowColumn-17\" \/> <col class=\"_idGenTableRowColumn-18\" \/><\/colgroup>\r\n<thead>\r\n<tr class=\"Foster-table _idGenTableRowColumn-19\" style=\"height: 17px;\">\r\n<td class=\"Foster-table Table-col-hd CellOverride-4\" style=\"height: 17px; width: 324.312px;\">\r\n<p class=\"Table-col-hd\">Cities with Mostly Men<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-col-hd CellOverride-5\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-col-hd ParaOverride-4\">Men per 100 Women<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-col-hd CellOverride-4\" style=\"height: 17px; width: 215.656px;\">\r\n<p class=\"Table-col-hd\">Cities with Mostly Women<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-col-hd\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-col-hd ParaOverride-4\">Men per 100 Women<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr class=\"Foster-table _idGenTableRowColumn-6\" style=\"height: 17px;\">\r\n<th style=\"height: 17px; width: 324.312px;\" scope=\"row\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">1. Jacksonville, North Carolina<\/p>\r\n<\/th>\r\n<td class=\"Foster-table Table-body CellOverride-5 _idGenCellOverride-1\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">224<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-1\" style=\"height: 17px; width: 215.656px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">1. 
Sarasota, Florida<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body _idGenCellOverride-1\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">66<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-7\" style=\"height: 17px;\">\r\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 324.312px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">2. Killeen\u2013Temple, Texas<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-5 _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">123<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 215.656px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">2. Bradenton, Florida<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">68<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-6\" style=\"height: 17px;\">\r\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 324.312px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">3. Fayetteville, North Carolina<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-5 _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">118<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 215.656px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">3. 
Altoona, Pennsylvania<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">69<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-7\" style=\"height: 17px;\">\r\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 324.312px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">4. Brazoria, Texas<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-5 _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">117<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 215.656px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">4. Springfield, Illinois<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">70<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-6\" style=\"height: 17px;\">\r\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 324.312px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">5. Lawton, Oklahoma<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-5 _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">116<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 215.656px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">5. 
Jacksonville, Tennessee<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">70<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-7\" style=\"height: 17px;\">\r\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 324.312px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">6. State College, Pennsylvania<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-5 _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">113<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 215.656px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">6. Gadsden, Alabama<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">70<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-20\" style=\"height: 17px;\">\r\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 324.312px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">7. Clarksville\u2013Hopkinsville, Tennessee\u2013Kentucky<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-5 _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">113<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 215.656px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">7. 
Wheeling, West\u00a0Virginia\u2013Ohio<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">70<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-7\" style=\"height: 17px;\">\r\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 324.312px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">8. Anchorage, Alaska<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-5 _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">112<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 215.656px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">8. Charleston, West Virginia<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">71<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-6\" style=\"height: 17px;\">\r\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 324.312px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">9. Salinas\u2013Seaside\u2013Monterey, California<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-5 _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">112<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 215.656px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">9. St. 
Joseph, Missouri<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">71<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-8\" style=\"height: 17px;\">\r\n<td class=\"Foster-table Table-body-last Table-body CellOverride-4\" style=\"height: 17px; width: 324.312px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">10. Bryan\u2013College Station, Texas<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body-last Table-body CellOverride-5\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">111<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body-last Table-body CellOverride-4\" style=\"height: 17px; width: 215.656px;\">\r\n<p class=\"Table-numbered-list ParaOverride-6\">10. Lynchburg, Virginia<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body-last Table-body\" style=\"height: 17px; width: 136.523px;\">\r\n<p class=\"Table-body ParaOverride-4\">71<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\n<\/div>\r\n<p class=\"Text\">These descriptive statistics may make us ponder why the numbers are so disparate in these cities. One potential explanation, for instance, as to why there are more women in Florida than men may involve the fact that elderly individuals tend to move down to the Sarasota region and that women tend to outlive men. Thus, more women might live in Sarasota than men. However, in the absence of proper data, this is only speculation.<\/p>\r\n<p class=\"Text\">There are many descriptive statistics that we can compute from the data in these tables. To gain insight into the improvement in speed over the years, let us divide the men\u2019s times into two pieces, namely, the first 13 races (up to 1952) and the second 13 (starting from 1956). The mean winning time for the first 13 races is 2 hours, 44 minutes, and 22 seconds (written 2:44:22). 
The mean winning time for the second 13 races is 2:13:18. This is quite a difference (over half an hour). Does this prove that the fastest men are running faster? Or is the difference just due to chance, no more than what often emerges from chance differences in performance from year to year? We can\u2019t answer this question with descriptive statistics alone. All we can affirm is that the two means are \u201csuggestive.\u201d<\/p>\r\n<p class=\"Text\">It is also important to differentiate what we use to describe populations vs. what we use to describe samples. A population is described by a parameter; the parameter is the true value of the descriptive measure in the population, but one that we can never know for sure. For example, the Bureau of Labor Statistics reports that the average hourly wage of chefs is $23.87. However, even if this number were computed using information from every single chef in the United States (making it a parameter), it would quickly become slightly off as one chef retires and a new chef enters the job market. Additionally, as noted above, there is virtually no way to collect data from every single person in a population. In order to understand a variable, we estimate the population parameter using a sample statistic. Here, the term <span class=\"italic\">statistic<\/span> refers to the specific number we compute from the data (e.g., the average), not the field of statistics. A sample statistic is an estimate of the true population parameter, and if our sample is representative of the population, then the statistic is considered to be a good estimator of the parameter.<\/p>\r\n<p class=\"Text\">Even the best sample will be somewhat off from the full population (a mismatch referred to earlier as sampling bias), and as a result, there will always be some discrepancy between the parameter and the statistic we use to estimate it. 
This difference is known as <span class=\"key-term\"><a id=\"sampling-error\"><\/a>sampling error<\/span>, and, as we will see throughout the course, understanding sampling error is the key to understanding statistics. Every observation we make about a variable, be it a full research study or observing an individual\u2019s behavior, is incapable of being completely representative of all possibilities for that variable. Knowing where to draw the line between an unusual observation and a true difference is what statistics is all about.<\/p>\r\n\r\n<h4 class=\"H2\">Inferential Statistics<\/h4>\r\n<p class=\"Text-1st\">Descriptive statistics are wonderful at telling us what our data look like. However, what we often want to understand is how our data behave. What variables are related to other variables? Under what conditions will the value of a variable change? Are two groups different from each other, and if so, are people within each group different or similar? These are the questions answered by inferential statistics, and inferential statistics are how we generalize from our sample back up to our population. <a href=\"https:\/\/pressbooks.palomar.edu\/introtostats\/part\/unit-2-hypothesis-testing\/\"><span class=\"Hyperlink-underscore\">Unit 2<\/span><\/a> and <a href=\"https:\/\/pressbooks.palomar.edu\/introtostats\/part\/unit-3-additional-hypothesis-tests\/\"><span class=\"Hyperlink-underscore\">Unit 3<\/span><\/a> are all about <span class=\"key-term\"><a id=\"inferential-statistics\"><\/a>[pb_glossary id=\"526\"]inferential statistics[\/pb_glossary]<\/span>, the formal analyses and tests we run to make conclusions about our data.<\/p>\r\n<p class=\"Text\">For example, we will learn how to use a <span class=\"italic\">t <\/span>statistic to determine whether people change over time when enrolled in an intervention. 
We will also use an <span class=\"italic\">F\u00a0<\/span>statistic to determine if we can predict future values on a variable based on current known values of a variable. There are many types of inferential statistics, each allowing us insight into a different behavior of the data we collect. This course will only touch on a small subset (or a <span class=\"italic\">sample<\/span>) of them, but the principles we learn along the way will make it easier to learn new tests, as most inferential statistics follow the same structure and format.<\/p>\r\n\r\n<h3 class=\"H1\">A Note about Statistical Software<\/h3>\r\n<p class=\"Text-1st\">Many software packages support the statistical and quantitative data analysis done by psychologists. The statistical software we use is the proprietary Statistical Package for the Social Sciences (SPSS), which can be accessed through the virtual desktop at Palomar College.<\/p>\r\n\r\n<h3 class=\"H1\">Mathematical Notation<\/h3>\r\n<p class=\"Text-1st\">As noted earlier, statistics is not math. It does, however, use math as a tool. Many statistical formulas involve summing numbers. Fortunately, there is a convenient notation for expressing summation. 
This section covers the basics of this summation notation.<\/p>\r\n<p class=\"Text\">Let\u2019s say we have a variable <span class=\"italic\">X<\/span> that represents the weights (in grams) of 4 grapes:<\/p>\r\n\r\n<table id=\"table008\" class=\"Foster-table _idGenTablePara-1\" style=\"height: 78px; width: 194px;\"><colgroup> <col class=\"_idGenTableRowColumn-25\" \/> <col class=\"_idGenTableRowColumn-26\" \/><\/colgroup>\r\n<thead>\r\n<tr class=\"Foster-table _idGenTableRowColumn-5\" style=\"height: 10px;\">\r\n<th class=\"Foster-table Table-col-hd CellOverride-2\" style=\"height: 10px; width: 235.844px;\" scope=\"row\">\r\n<p class=\"Table-col-hd ParaOverride-4\">Grape<\/p>\r\n<\/th>\r\n<th class=\"Foster-table Table-col-hd\" style=\"height: 10px; width: 160px;\" scope=\"row\">\r\n<p class=\"Table-col-hd ParaOverride-4\"><span class=\"bold-italic\">X<\/span><\/p>\r\n<\/th>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-6\" style=\"height: 17px;\">\r\n<th class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-1\" style=\"height: 17px; width: 235.844px;\" scope=\"row\">\r\n<p class=\"Table-body ParaOverride-4\">Grape 1<\/p>\r\n<\/th>\r\n<th class=\"Foster-table Table-body _idGenCellOverride-1\" style=\"height: 17px; width: 160px;\" scope=\"row\">\r\n<p class=\"Table-body ParaOverride-4\">4.6<\/p>\r\n<\/th>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-7\" style=\"height: 17px;\">\r\n<th class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\" style=\"height: 17px; width: 235.844px;\" scope=\"row\">\r\n<p class=\"Table-body ParaOverride-4\">Grape 2<\/p>\r\n<\/th>\r\n<th class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 160px;\" scope=\"row\">\r\n<p class=\"Table-body ParaOverride-4\">5.1<\/p>\r\n<\/th>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-6\" style=\"height: 17px;\">\r\n<th class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\" style=\"height: 
17px; width: 235.844px;\" scope=\"row\">\r\n<p class=\"Table-body ParaOverride-4\">Grape 3<\/p>\r\n<\/th>\r\n<th class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 160px;\" scope=\"row\">\r\n<p class=\"Table-body ParaOverride-4\">4.9<\/p>\r\n<\/th>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-8\" style=\"height: 17px;\">\r\n<th class=\"Foster-table Table-body-last Table-body CellOverride-2\" style=\"height: 17px; width: 235.844px;\" scope=\"row\">\r\n<p class=\"Table-body ParaOverride-4\">Grape 4<\/p>\r\n<\/th>\r\n<th class=\"Foster-table Table-body-last Table-body\" style=\"height: 17px; width: 160px;\" scope=\"row\">\r\n<p class=\"Table-body ParaOverride-4\">4.4<\/p>\r\n<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<\/table>\r\n<p class=\"Text\">The Greek letter <img class=\"_idGenObjectAttribute-5\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2021\/12\/Eqn1.1-sigma-2.png\" alt=\"Upper Sigma\" \/> indicates summation.<\/p>\r\n<p class=\"Text\">When all the scores of a variable (such as <span class=\"italic\">X<\/span>) are to be summed, it is often convenient to use the following abbreviated notation:<\/p>\r\n<p class=\"Equation\"><img class=\"_idGenObjectAttribute-10\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.4-2.png\" alt=\"sigma-summation Upper X\" \/><strong> = 4.6 + 5.1 + 4.9 + 4.4 = 19<\/strong><\/p>\r\n<p class=\"Text\">Thus, this notation means to sum all the values of <span class=\"italic\">X<\/span>.<\/p>\r\n<p class=\"Text\">Many formulas involve squaring numbers before they are summed. This is indicated as<\/p>\r\n<p class=\"Equation\"><img class=\"_idGenObjectAttribute-11 alignnone\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.5-2.png\" alt=\"sigma-summation Upper X Sup 2 Base. Square and then add x values. 
Total is 90.54\" width=\"744\" height=\"42\" \/><\/p>\r\n<p class=\"Text\">Notice that:<\/p>\r\n<p class=\"Equation\"><img class=\"_idGenObjectAttribute-12\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.6-2.png\" alt=\"l-par sigma-summation Upper X r-par Sup 2 Base not-equals sigma-summation Upper X Sup 2\" \/><\/p>\r\n<p class=\"Text\">because the expression on the left means to sum up all the values of <span class=\"italic\">X<\/span> and then square the sum (<img class=\"_idGenObjectAttribute-13\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.6a-2.png\" alt=\"19 Sup 2 Base equals 361\" \/>), whereas the expression on the right means to square the numbers and then sum the squares (90.54, as shown).<\/p>\r\n<p class=\"Text\">Some formulas involve the sum of cross products. Below are the data for variables <span class=\"italic\">X<\/span> and <span class=\"italic\">Y<\/span>. The cross products (<span class=\"italic\">XY<\/span>) are shown in the third column. 
The sum of the cross products is <img class=\"_idGenObjectAttribute-14\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.6b-2.png\" alt=\"3 plus 4 plus 21 equals 28\" \/>.<\/p>\r\n\r\n<table id=\"table009\" class=\"Foster-table _idGenTablePara-1\"><colgroup> <col class=\"_idGenTableRowColumn-27\" \/> <col class=\"_idGenTableRowColumn-27\" \/> <col class=\"_idGenTableRowColumn-2\" \/><\/colgroup>\r\n<thead>\r\n<tr class=\"Foster-table _idGenTableRowColumn-5\">\r\n<th class=\"Foster-table Table-col-hd CellOverride-2\" scope=\"row\">\r\n<p class=\"Table-col-hd\"><span class=\"bold-italic\">X<\/span><\/p>\r\n<\/th>\r\n<th class=\"Foster-table Table-col-hd CellOverride-2\" scope=\"row\">\r\n<p class=\"Table-col-hd\"><span class=\"bold-italic\">Y<\/span><\/p>\r\n<\/th>\r\n<th class=\"Foster-table Table-col-hd\" scope=\"row\">\r\n<p class=\"Table-col-hd\"><span class=\"bold-italic\">XY<\/span><\/p>\r\n<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr class=\"Foster-table _idGenTableRowColumn-6\">\r\n<th class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-1\" scope=\"row\">\r\n<p class=\"Table-body\">1<\/p>\r\n<\/th>\r\n<th class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-1\" scope=\"row\">\r\n<p class=\"Table-body\">3<\/p>\r\n<\/th>\r\n<th class=\"Foster-table Table-body _idGenCellOverride-1\" scope=\"row\">\r\n<p class=\"Table-body\">3<\/p>\r\n<\/th>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-7\">\r\n<th class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\" scope=\"row\">\r\n<p class=\"Table-body\">2<\/p>\r\n<\/th>\r\n<th class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\" scope=\"row\">\r\n<p class=\"Table-body\">2<\/p>\r\n<\/th>\r\n<th class=\"Foster-table Table-body _idGenCellOverride-2\" scope=\"row\">\r\n<p class=\"Table-body\">4<\/p>\r\n<\/th>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-11\">\r\n<th class=\"Foster-table Table-body-last 
Table-body CellOverride-2\" scope=\"row\">\r\n<p class=\"Table-body\">3<\/p>\r\n<\/th>\r\n<th class=\"Foster-table Table-body-last Table-body CellOverride-2\" scope=\"row\">\r\n<p class=\"Table-body\">7<\/p>\r\n<\/th>\r\n<th class=\"Foster-table Table-body-last Table-body\" scope=\"row\">\r\n<p class=\"Table-body\">21<\/p>\r\n<\/th>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<p class=\"Text\">In summation notation, this is written as:<\/p>\r\n<p class=\"Equation\"><img class=\"_idGenObjectAttribute-15\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.7-2.png\" alt=\"sigma-summation Upper X Upper Y equals 28\" \/><\/p>\r\n\r\n<h3 class=\"H1\">Exercises<\/h3>\r\n<ol>\r\n \t<li class=\"Numbered-list-Exercises-1st\">In your own words, describe why we study statistics.<\/li>\r\n \t<li class=\"Numbered-list-Exercises\">For each of the following, determine if the variable is continuous or discrete:\r\n<ol>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">Time taken to read a book chapter<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">Favorite food<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">Cognitive ability<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">Temperature<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">Letter grade received in a class<\/li>\r\n<\/ol>\r\n<\/li>\r\n \t<li class=\"Numbered-list-Exercises\">For each of the following, determine the level of measurement:\r\n<ol>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">T-shirt size<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">Time taken to run 100-meter race<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">First, second, and third place in 100-meter race<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">Birthplace<\/li>\r\n \t<li 
class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">Temperature in Celsius<\/li>\r\n<\/ol>\r\n<\/li>\r\n \t<li class=\"Numbered-list-Exercises\">What is the difference between a population and a sample? Which is described by a parameter and which is described by a statistic?<\/li>\r\n \t<li class=\"Numbered-list-Exercises\">What is sampling bias? What is sampling error?<\/li>\r\n \t<li class=\"Numbered-list-Exercises\">What is the difference between a simple random sample and a stratified random sample?<\/li>\r\n \t<li class=\"Numbered-list-Exercises\"><a id=\"non-experimental-research\"><\/a><a id=\"experimental-research\"><\/a>What are the two key characteristics of a true experimental design?<\/li>\r\n \t<li class=\"Numbered-list-Exercises\"><a id=\"quasi-experimental-research\"><\/a>When would we use a quasi-experimental design?<\/li>\r\n \t<li class=\"Numbered-list-Exercises\">Use the following dataset for the computations below:\r\n<table id=\"table010\" class=\"Foster-table _idGenTablePara-1\" style=\"height: 102px;\"><colgroup> <col class=\"_idGenTableRowColumn-27\" \/> <col class=\"_idGenTableRowColumn-28\" \/><\/colgroup>\r\n<thead>\r\n<tr class=\"Foster-table _idGenTableRowColumn-5\" style=\"height: 17px;\">\r\n<td class=\"Foster-table Table-col-hd\" style=\"height: 17px; width: 194px;\">\r\n<p class=\"Table-col-hd\"><span class=\"bold-italic\">X<\/span><\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-col-hd\" style=\"height: 17px; width: 194px;\">\r\n<p class=\"Table-col-hd\"><span class=\"bold-italic\">Y<\/span><\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr class=\"Foster-table _idGenTableRowColumn-6\" style=\"height: 17px;\">\r\n<td class=\"Foster-table Table-body _idGenCellOverride-1\" style=\"height: 17px; width: 194px;\">\r\n<p class=\"Table-body\">2<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body _idGenCellOverride-1\" style=\"height: 17px; width: 194px;\">\r\n<p class=\"Table-body\">8<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr 
class=\"Foster-table _idGenTableRowColumn-7\" style=\"height: 17px;\">\r\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 194px;\">\r\n<p class=\"Table-body\">3<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 194px;\">\r\n<p class=\"Table-body\">8<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-6\" style=\"height: 17px;\">\r\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 194px;\">\r\n<p class=\"Table-body\">7<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 194px;\">\r\n<p class=\"Table-body\">4<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-7\" style=\"height: 17px;\">\r\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 194px;\">\r\n<p class=\"Table-body\">5<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 194px;\">\r\n<p class=\"Table-body\">1<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-11\" style=\"height: 17px;\">\r\n<td class=\"Foster-table Table-body-last Table-body\" style=\"height: 17px; width: 194px;\">\r\n<p class=\"Table-body\">9<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body-last Table-body\" style=\"height: 17px; width: 194px;\">\r\n<p class=\"Table-body\">4<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<ol>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\"><img class=\"_idGenObjectAttribute-10\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.4-2.png\" alt=\"sigma-summation Upper X\" \/><\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\"><span xml:lang=\"ar-SA\"><img class=\"_idGenObjectAttribute-16\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.9-2.png\" alt=\"sigma-summation Upper Y Sup 
2\" \/><\/span><\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\"><img class=\"_idGenObjectAttribute-17\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.10-2.png\" alt=\"sigma-summation Upper X Upper Y\" \/><\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\"><img class=\"_idGenObjectAttribute-18\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.11-2.png\" alt=\"l-par sigma-summation Upper Y r-par Sup 2\" \/><\/li>\r\n<\/ol>\r\n<\/li>\r\n \t<li class=\"Numbered-list-Exercises\">What are the most common measures of central tendency and spread?<\/li>\r\n<\/ol>\r\n<div class=\"textbox textbox--learning-objectives\"><header class=\"textbox__header\">\r\n<h3 class=\"H1\">Answers to Odd-Numbered Exercises<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\n1)\r\n\r\n<span style=\"font-size: 14pt;\">Your answer could take many forms but should include information about objectively interpreting information and\/or communicating results and research conclusions.<\/span>\r\n\r\n3)\r\n<ol>\r\n \t<li style=\"list-style-type: none;\">\r\n<ol>\r\n \t<li class=\"Numbered-list-Exercises-sub-odd _idGenParaOverride-2\">Ordinal<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub-odd _idGenParaOverride-2\">Ratio<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub-odd _idGenParaOverride-2\">Ordinal<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub-odd _idGenParaOverride-2\">Nominal<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub-odd _idGenParaOverride-2\">Interval<\/li>\r\n<\/ol>\r\n<\/li>\r\n<\/ol>\r\n5)\r\n\r\n<span style=\"font-size: 14pt;\">Sampling bias is the difference in demographic characteristics between a sample and the population it should represent. 
Sampling error is the difference between a population parameter and sample statistic that is caused by random chance due to sampling bias.<\/span>\r\n\r\n7)\r\n\r\n<span style=\"font-size: 14pt;\">Random assignment to treatment conditions and manipulation of the independent variable<\/span>\r\n\r\n9)\r\n<ol>\r\n \t<li>26<\/li>\r\n \t<li>161<\/li>\r\n \t<li>109<\/li>\r\n \t<li>625<\/li>\r\n<\/ol>\r\n<\/div>\r\n<\/div>\r\n<\/div>","rendered":"<div id=\"_idContainer738\" class=\"_idGenObjectStyleOverride-1\">\n<div class=\"textbox textbox--sidebar textbox--learning-objectives\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Key Terms<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p class=\"Key-terms\"><a href=\"#continuous-variables\"><span class=\"Hyperlink-underscore\">continuous variables<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#control\"><span class=\"Hyperlink-underscore\">control (group)<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#convenience-sampling\"><span class=\"Hyperlink-underscore\">convenience sampling<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#data\"><span class=\"Hyperlink-underscore\">data<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#dependent-variable\"><span class=\"Hyperlink-underscore\">dependent variable<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#descriptive-statistics\"><span class=\"Hyperlink-underscore\">descriptive statistics<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#discrete-variables\"><span class=\"Hyperlink-underscore\">discrete variables<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#experimental-group\"><span class=\"Hyperlink-underscore\">experimental (group)<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#experimental-research\"><span class=\"Hyperlink-underscore\">experimental research<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#independent-variable\"><span class=\"Hyperlink-underscore\">independent variable<\/span><\/a><\/p>\n<p 
class=\"Key-terms\"><a href=\"#inferential-statistics\"><span class=\"Hyperlink-underscore\">inferential statistics<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#interval-scale\"><span class=\"Hyperlink-underscore\">interval scale<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#nominal-scale\"><span class=\"Hyperlink-underscore\">nominal scale<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#non-experimental-research\"><span class=\"Hyperlink-underscore\">non-experimental research<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#ordinal-scale\"><span class=\"Hyperlink-underscore\">ordinal scale<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#population\"><span class=\"Hyperlink-underscore\">population<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#qualitative-variables\"><span class=\"Hyperlink-underscore\">qualitative variables<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#quantitative-variables\"><span class=\"Hyperlink-underscore\">quantitative variables<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#quasi-experimental-research\"><span class=\"Hyperlink-underscore\">quasi-experimental research<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#ratio-scale\"><span class=\"Hyperlink-underscore\">ratio scale<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#sample\"><span class=\"Hyperlink-underscore\">sample<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#sampling-bias\"><span class=\"Hyperlink-underscore\">sampling bias<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#sampling-error\"><span class=\"Hyperlink-underscore\">sampling error<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#simple-random-sampling\"><span class=\"Hyperlink-underscore\">simple random sampling<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#statistics\"><span class=\"Hyperlink-underscore\">statistics<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#stratified-random-sampling\"><span 
class=\"Hyperlink-underscore\">stratified random sampling<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#variable\"><span class=\"Hyperlink-underscore\">variable<\/span><\/a><\/p>\n<\/div>\n<\/div>\n<p class=\"Text\">This chapter provides an overview of statistics as a field of study and presents terminology that will be used throughout the course.<\/p>\n<h3 data-start=\"96\" data-end=\"122\"><a id=\"statistics\"><\/a>What Are Statistics?<\/h3>\n<p data-start=\"124\" data-end=\"231\">Statistics include numerical facts and figures that help us understand patterns in society. For instance:<\/p>\n<ul data-start=\"233\" data-end=\"578\">\n<li data-start=\"233\" data-end=\"322\">Black Americans are incarcerated at more than five times the rate of white Americans.<\/li>\n<li data-start=\"323\" data-end=\"412\">Women in the United States earn, on average, 82 cents for every dollar earned by men.<\/li>\n<li data-start=\"413\" data-end=\"491\">Nearly 1 in 5 transgender people in the U.S. has experienced homelessness.<\/li>\n<li data-start=\"492\" data-end=\"578\">By the year 2050, climate change could displace over 200 million people worldwide.<\/li>\n<\/ul>\n<p data-start=\"580\" data-end=\"970\">The study of statistics involves mathematics and relies on numerical calculations. However, it also heavily depends on how data is collected and how statistics are interpreted. Consider the following three examples where the numbers might be correct, but the conclusions drawn from them are misleading. Try to identify the major flaw in each interpretation before reading the explanation.<\/p>\n<ol data-start=\"972\" data-end=\"2503\">\n<li data-start=\"972\" data-end=\"1535\">\n<p data-start=\"975\" data-end=\"1194\"><strong data-start=\"975\" data-end=\"1192\">A city passes a new law restricting unhoused individuals from sleeping in public spaces. A year later, official reports show a 40% decrease in visible homelessness. 
Thus, the law successfully reduced homelessness.<\/strong><\/p>\n<ul data-start=\"1198\" data-end=\"1535\">\n<li data-start=\"1198\" data-end=\"1535\"><strong data-start=\"1200\" data-end=\"1215\">Major flaw:<\/strong> A reduction in <em data-start=\"1231\" data-end=\"1240\">visible<\/em> homelessness does not necessarily mean fewer people are unhoused. Instead, the law may have pushed people into less visible areas, such as encampments in wooded areas or abandoned buildings. This is an example of a measurement issue\u2014what is being counted does not necessarily reflect reality.<\/li>\n<\/ul>\n<\/li>\n<li data-start=\"1537\" data-end=\"1960\">\n<p data-start=\"1540\" data-end=\"1644\"><strong data-start=\"1540\" data-end=\"1642\">Cities with more social justice protests also have higher crime rates. Thus, protests cause crime.<\/strong><\/p>\n<ul data-start=\"1648\" data-end=\"1960\">\n<li data-start=\"1648\" data-end=\"1960\"><strong data-start=\"1650\" data-end=\"1665\">Major flaw:<\/strong> The presence of both protests and higher crime rates can often be explained by other factors, such as systemic inequality, police responses, or urban density. This is an example of the <em data-start=\"1851\" data-end=\"1875\">third-variable problem<\/em>, where two factors appear related but are actually influenced by another variable.<\/li>\n<\/ul>\n<\/li>\n<li data-start=\"1962\" data-end=\"2503\">\n<p data-start=\"1965\" data-end=\"2135\"><strong data-start=\"1965\" data-end=\"2133\">The percentage of women in leadership positions in Fortune 500 companies has doubled over the past decade. Thus, gender inequality in the workplace has been solved.<\/strong><\/p>\n<ul data-start=\"2139\" data-end=\"2503\">\n<li data-start=\"2139\" data-end=\"2503\"><strong data-start=\"2141\" data-end=\"2156\">Major flaw:<\/strong> While the percentage may have increased, the actual number might still be quite low. 
If only 3% of CEOs were women a decade ago and now it is 6%, that is still a significant disparity. Additionally, this statistic does not address other forms of workplace inequality, such as pay gaps, hiring discrimination, or lack of parental leave policies.<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<p data-start=\"2505\" data-end=\"2801\">These examples illustrate that statistics are not just numbers; they are shaped by how they are collected, interpreted, and presented. In the broadest sense, \u201cstatistics\u201d refers to a range of techniques and procedures for analyzing, interpreting, displaying, and making decisions based on data.<\/p>\n<p data-start=\"2803\" data-end=\"3151\">Statistics is the language of social science and activism. Understanding and communicating with statistics enables researchers, policymakers, and activists to articulate their findings, challenge misconceptions, and advocate for meaningful social change. It is an objective, precise, and powerful tool for advancing justice and equity in society.<\/p>\n<h4 class=\"H2\">What a Statistics Course Is <span class=\"bold-italic\">Not<\/span><\/h4>\n<p class=\"Text-1st\">Many sociology students dread the idea of taking a statistics course, and more than a few have changed majors upon learning that it is a requirement. That is because many students view statistics as a math class, which is actually not true. While many of you will not believe this or agree with it, statistics isn\u2019t math.<\/p>\n<p class=\"Text\">Although math is a central component of it, statistics is a broader way of organizing, interpreting, and communicating information in an objective manner. Indeed, great care has been taken to eliminate as much math from this course as possible (students who do not believe this are welcome to ask the professor what matrix algebra is). 
Statistics is a way of viewing reality as it exists around us in a way that we otherwise could not.<\/p>\n<h3 class=\"H1\">Why Do We Study Statistics?<\/h3>\n<p class=\"Text-1st\">Virtually every student of the behavioral sciences takes some form of statistics class. This is because statistics is how we communicate in science. It serves as the link between a research idea and usable conclusions. Without statistics, we would be unable to interpret the massive amounts of information contained in <a id=\"data\"><\/a>data. Even small datasets contain hundreds\u2014if not thousands\u2014of numbers, each representing a specific observation we made. Without a way to organize these numbers into a more interpretable form, we would be lost, having wasted the time and money of our participants, ourselves, and the communities we serve.<\/p>\n<p class=\"Text\">Beyond its use in science, however, there is a more personal reason to study statistics. Like most people, you probably feel that it is important to \u201ctake control of your life.\u201d But what does this mean? Partly, it means being able to properly evaluate the data and claims that bombard you every day. If you cannot distinguish good from faulty reasoning, then you are vulnerable to manipulation and to decisions that are not in your best interest. Statistics provides tools that you need in order to react intelligently to information you hear or read. In this sense, statistics is one of the most important things that you can study.<\/p>\n<p class=\"Text\">To be more specific, here are some claims that we have heard on several occasions. (We are not saying that each one of these claims is true!)<\/p>\n<ul>\n<li data-start=\"160\" data-end=\"288\">\n<p data-start=\"162\" data-end=\"288\">Nearly 40% of unhoused individuals in the U.S. 
are Black, even though Black people make up only about 13% of the population.<\/p>\n<\/li>\n<li data-start=\"289\" data-end=\"376\">\n<p data-start=\"291\" data-end=\"376\">Latinx workers are twice as likely as white workers to earn less than $15 per hour.<\/p>\n<\/li>\n<li data-start=\"377\" data-end=\"500\">\n<p data-start=\"379\" data-end=\"500\">Transgender people are more than four times as likely to experience violent victimization compared to cisgender people.<\/p>\n<\/li>\n<li data-start=\"501\" data-end=\"589\">\n<p data-start=\"503\" data-end=\"589\">About 1 in 5 women report experiencing sexual harassment in the workplace each year.<\/p>\n<\/li>\n<li data-start=\"590\" data-end=\"669\">\n<p data-start=\"592\" data-end=\"669\">Only 5% of Fortune 500 CEOs are women, and less than 2% are women of color.<\/p>\n<\/li>\n<li data-start=\"670\" data-end=\"768\">\n<p data-start=\"672\" data-end=\"768\">Indigenous people in the U.S. are incarcerated at a rate 38% higher than the national average.<\/p>\n<\/li>\n<li data-start=\"769\" data-end=\"925\">\n<p data-start=\"771\" data-end=\"925\">A recent study shows that students from low-income families are nearly 30% less likely to graduate college within six years compared to wealthier peers.<\/p>\n<\/li>\n<li data-start=\"926\" data-end=\"1024\">\n<p data-start=\"928\" data-end=\"1024\">Black women are three times more likely to die from pregnancy-related causes than white women.<\/p>\n<\/li>\n<li data-start=\"1025\" data-end=\"1178\">\n<p data-start=\"1027\" data-end=\"1178\">There\u2019s about a 50% chance that in a group of 23 people, at least two share the same birthday \u2014 a classic stats paradox that surprises many students.<\/p>\n<\/li>\n<\/ul>\n<p class=\"Text\">All of these claims are statistical in character. We suspect that some of them sound familiar; if not, we bet that you have heard other claims like them. Notice how diverse the examples are. 
They come from housing, labor, health, criminal justice, business, education, and even probability. Indeed, data and data interpretation show up in discourse from virtually every facet of contemporary life.<\/p>\n<p class=\"Text\">Statistics are often presented in an effort to add credibility to an argument or advice. You can see this by paying attention to television advertisements. Many of the numbers thrown about in this way do not represent careful statistical analysis. They can be misleading and push you into decisions that you might find cause to regret. For these reasons, learning about statistics is a long step toward taking control of your life. (It is not, of course, the only step needed to do so.) The purpose of this course, beyond preparing you for a career in the social sciences, is to help you learn statistical essentials. It will make you into an intelligent consumer of statistical claims.<\/p>\n<p class=\"Text\">You can take the first step right away. To be an intelligent consumer of statistics, your first reflex must be to question the statistics you encounter. The British Prime Minister Benjamin Disraeli is quoted by Mark Twain as having said, \u201cThere are three kinds of lies\u2014lies, damned lies, and statistics.\u201d This quote reminds us why it is so important to understand statistics. So let us invite you to reform your statistical habits from now on. No longer will you blindly accept numbers or findings. Instead, you will begin to think about the numbers, their sources, and most importantly, the procedures used to generate them.<\/p>\n<p class=\"Text\">The above section puts an emphasis on defending ourselves against fraudulent claims wrapped up as statistics, but let us turn to a more positive note. Just as important as detecting the deceptive use of statistics is the appreciation of the proper use of statistics. You must also learn to recognize statistical evidence that supports a stated conclusion. Statistics are all around you, sometimes used well, sometimes not. 
We must learn how to distinguish the two cases. In doing so, statistics will likely be the course you use most in your day-to-day life, even if you do not ever run a formal analysis again.<\/p>\n<h3 data-start=\"325\" data-end=\"368\">TYPES OF DATA AND HOW TO COLLECT THEM<\/h3>\n<p data-start=\"370\" data-end=\"1200\">In order to use statistics, we need data to analyze. Data come in an amazingly diverse range of formats, and each type gives us a unique type of information. In virtually any form, data represent the measured value of variables. In sociology and psychology, we are often interested in people, so we might get a group of people together and measure their levels of stress (one variable), their access to healthcare (a second variable), and their income level (a third variable). Once we have data on these three variables, we can use statistics to understand if and how they are related. Before we do so, we need to understand the nature of our data\u2014what they represent and where they came from.<\/p>\n<hr data-start=\"1202\" data-end=\"1205\" \/>\n<h3 data-start=\"1207\" data-end=\"1231\"><a id=\"variable\"><\/a>TYPES OF VARIABLES<\/h3>\n<p data-start=\"1233\" data-end=\"1869\">When conducting research, experimenters often manipulate variables. For example, an experimenter might compare the effectiveness of four types of tutoring programs. In this case, the variable is \u201ctype of program.\u201d When a variable is manipulated by an experimenter, it is called an <a id=\"independent-variable\"><\/a>independent variable. The experiment seeks to determine the effect of the independent variable on student performance. In this example, academic achievement is called a <a id=\"dependent-variable\"><\/a>dependent variable. 
In general, the independent variable is manipulated by the experimenter, and its effects on the dependent variable are measured.<\/p>\n<hr data-start=\"1871\" data-end=\"1874\" \/>\n<p data-start=\"1876\" data-end=\"2172\"><strong data-start=\"1876\" data-end=\"1936\">Example #1: Does raising the minimum wage reduce stress?<\/strong><br data-start=\"1936\" data-end=\"1939\" \/>Researchers could compare three groups of workers: those earning below $15\/hour, those earning exactly $15\/hour, and those earning above $20\/hour. After six months, surveys and health measures could be used to assess stress levels.<\/p>\n<ul data-start=\"2173\" data-end=\"2284\">\n<li data-start=\"2173\" data-end=\"2237\">\n<p data-start=\"2175\" data-end=\"2237\">Independent variable: wage level (below $15, $15, above $20)<\/p>\n<\/li>\n<li data-start=\"2238\" data-end=\"2284\">\n<p data-start=\"2240\" data-end=\"2284\">Dependent variable: measured stress levels<\/p>\n<\/li>\n<\/ul>\n<hr data-start=\"2286\" data-end=\"2289\" \/>\n<p data-start=\"2291\" data-end=\"2574\"><strong data-start=\"2291\" data-end=\"2360\">Example #2: Do police body cameras reduce use-of-force incidents?<\/strong><br data-start=\"2360\" data-end=\"2363\" \/>In a study of police departments, some officers are randomly assigned to wear body cameras while others are not. 
Researchers track the number of force-related complaints filed by community members over a year.<\/p>\n<ul data-start=\"2575\" data-end=\"2686\">\n<li data-start=\"2575\" data-end=\"2628\">\n<p data-start=\"2577\" data-end=\"2628\">Independent variable: body camera use (yes or no)<\/p>\n<\/li>\n<li data-start=\"2629\" data-end=\"2686\">\n<p data-start=\"2631\" data-end=\"2686\">Dependent variable: number of use-of-force complaints<\/p>\n<\/li>\n<\/ul>\n<hr data-start=\"2688\" data-end=\"2691\" \/>\n<p data-start=\"2693\" data-end=\"2945\"><strong data-start=\"2693\" data-end=\"2772\">Example #3: Does providing free school breakfast improve academic outcomes?<\/strong><br data-start=\"2772\" data-end=\"2775\" \/>A school district implements a free breakfast program in some schools but not others. After a year, researchers compare standardized test scores between the two groups.<\/p>\n<ul data-start=\"2946\" data-end=\"3053\">\n<li data-start=\"2946\" data-end=\"3017\">\n<p data-start=\"2948\" data-end=\"3017\">Independent variable: breakfast program (provided vs. not provided)<\/p>\n<\/li>\n<li data-start=\"3018\" data-end=\"3053\">\n<p data-start=\"3020\" data-end=\"3053\">Dependent variable: test scores<\/p>\n<\/li>\n<\/ul>\n<hr data-start=\"3055\" data-end=\"3058\" \/>\n<h3 data-start=\"3060\" data-end=\"3099\">LEVELS OF AN INDEPENDENT VARIABLE<\/h3>\n<p data-start=\"3101\" data-end=\"3553\">If an experiment compares an experimental treatment with a control treatment, then the independent variable (type of treatment) has two levels: <a id=\"experimental-group\"><\/a>experimental and <a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_39_495\"><a id=\"control-group\"><\/a><a id=\"control\"><\/a>control<\/a>. If an experiment were comparing five types of health insurance coverage, then the independent variable (type of coverage) would have 5 levels. 
In general, the number of levels of an independent variable is the number of experimental conditions.<\/p>\n<h4 class=\"H2\">Qualitative and Quantitative Variables<\/h4>\n<p class=\"Text-1st\">An important distinction is between qualitative variables and quantitative variables. <span class=\"key-term\"><a id=\"qualitative-variables\"><\/a>Qualitative variables<\/span> are those that express a qualitative attribute such as hair color, eye color, religion, favorite movie, gender, and so on. The values of a qualitative variable do not imply a numerical ordering. Values of the variable \u201creligion\u201d differ qualitatively; no ordering of religions is implied. Qualitative variables are sometimes referred to as categorical or nominal variables. <span class=\"key-term\"><a id=\"quantitative-variables\"><\/a>Quantitative variables<\/span> are those variables that are measured in terms of numbers. Some examples of quantitative variables are height, weight, and shoe size.<\/p>\n<p class=\"Text\">Suppose a study on the effect of diet on memory used type of supplement as its independent variable, with four levels: none, strawberry, blueberry, and spinach. The variable \u201ctype of supplement\u201d is a qualitative variable; there is nothing quantitative about it. In contrast, the dependent variable \u201cmemory test\u201d is a quantitative variable because memory performance is measured on a quantitative scale (number correct).<\/p>\n<h4 class=\"H2\">Discrete and Continuous Variables<\/h4>\n<p class=\"Text-1st\">Variables such as number of children in a household are called <span class=\"key-term\"><a id=\"discrete-variables\"><\/a><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_39_507\">discrete variables<\/a><\/span> since the possible scores are discrete points on the scale. For example, a household could have three children or six children, but not 4.53 children. 
Other variables such as time to respond to a question are <span class=\"key-term\"><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_39_494\"><a id=\"continuous-variables\"><\/a>continuous variables<\/a><\/span> since the scale is continuous and not made up of discrete steps. The response time could be 1.64 seconds, or it could be 1.64237123922121 seconds. Of course, the practicalities of measurement preclude most measured variables from being truly continuous.<\/p>\n<h3 data-start=\"158\" data-end=\"185\">LEVELS OF MEASUREMENT<\/h3>\n<p data-start=\"187\" data-end=\"861\">Before we can conduct a statistical analysis, we need to measure our dependent variable. Exactly how the measurement is carried out depends on the type of variable involved in the analysis. Different types of variables require different methods of measurement. For example, to measure how long it takes someone to complete a job-training program, you might use a calendar or clock. But to measure a community\u2019s sense of safety in their neighborhood, a survey with response options such as \u201cvery unsafe,\u201d \u201csomewhat unsafe,\u201d or \u201cvery safe\u201d would be more appropriate. And for a variable like racial\/ethnic identity, we would simply record the category the respondent selects.<\/p>\n<p data-start=\"863\" data-end=\"1243\">Although the procedures for measurement differ, they can be classified into a few fundamental categories. Each category captures specific properties of data that are important to understand if we want to analyze inequality, evaluate programs, or document disparities accurately. 
These categories are called <strong data-start=\"1170\" data-end=\"1185\">scale types<\/strong> (or just <strong data-start=\"1195\" data-end=\"1205\">scales<\/strong>) and are described in this section.<\/p>\n<h3 data-start=\"309\" data-end=\"333\">TYPES OF VARIABLES<\/h3>\n<p data-start=\"335\" data-end=\"1004\">When conducting research, experimenters often manipulate variables. For example, an experimenter might compare the effectiveness of different types of community programs. In this case, the variable is \u201ctype of program.\u201d When a variable is manipulated by an experimenter, it is called an <a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_39_612\">independent variable<\/a>. The experiment seeks to determine the effect of the independent variable on outcomes such as health, education, or safety. In this example, the measurable result is called a dependent variable. In general, the independent variable is manipulated by the experimenter, and its effects on the dependent variable are measured.<\/p>\n<hr data-start=\"1006\" data-end=\"1009\" \/>\n<p data-start=\"1011\" data-end=\"1237\"><strong data-start=\"1011\" data-end=\"1081\">Example #1: Do school lunch programs improve academic performance?<\/strong><br data-start=\"1081\" data-end=\"1084\" \/>Researchers study students in schools with free lunch, reduced-price lunch, or no lunch program. 
After one year, they compare standardized test scores.<\/p>\n<ul data-start=\"1239\" data-end=\"1375\">\n<li data-start=\"1239\" data-end=\"1312\">\n<p data-start=\"1241\" data-end=\"1312\"><strong data-start=\"1241\" data-end=\"1266\">Independent variable:<\/strong> type of lunch program (free, reduced, none)<\/p>\n<\/li>\n<li data-start=\"1313\" data-end=\"1375\">\n<p data-start=\"1315\" data-end=\"1375\"><strong data-start=\"1315\" data-end=\"1338\">Dependent variable:<\/strong> academic performance (test scores)<\/p>\n<\/li>\n<\/ul>\n<hr data-start=\"1377\" data-end=\"1380\" \/>\n<p data-start=\"1382\" data-end=\"1658\"><strong data-start=\"1382\" data-end=\"1455\">Example #2: Does access to affordable housing reduce health problems?<\/strong><br data-start=\"1455\" data-end=\"1458\" \/>A study tracks families who receive housing vouchers compared to those who remain on a waiting list. Over five years, researchers measure health outcomes such as rates of asthma and hospital visits.<\/p>\n<ul data-start=\"1660\" data-end=\"1805\">\n<li data-start=\"1660\" data-end=\"1729\">\n<p data-start=\"1662\" data-end=\"1729\"><strong data-start=\"1662\" data-end=\"1687\">Independent variable:<\/strong> housing status (voucher vs. no voucher)<\/p>\n<\/li>\n<li data-start=\"1730\" data-end=\"1805\">\n<p data-start=\"1732\" data-end=\"1805\"><strong data-start=\"1732\" data-end=\"1755\">Dependent variable:<\/strong> health outcomes (asthma rates, hospital visits)<\/p>\n<\/li>\n<\/ul>\n<hr data-start=\"1807\" data-end=\"1810\" \/>\n<p data-start=\"1812\" data-end=\"2064\"><strong data-start=\"1812\" data-end=\"1881\">Example #3: Do body cameras reduce police use-of-force incidents?<\/strong><br data-start=\"1881\" data-end=\"1884\" \/>Police departments randomly assign some officers to wear body cameras and others not to. 
Researchers then record the number of use-of-force complaints filed by community members.<\/p>\n<ul data-start=\"2066\" data-end=\"2185\">\n<li data-start=\"2066\" data-end=\"2123\">\n<p data-start=\"2068\" data-end=\"2123\"><strong data-start=\"2068\" data-end=\"2093\">Independent variable:<\/strong> body camera use (yes or no)<\/p>\n<\/li>\n<li data-start=\"2124\" data-end=\"2185\">\n<p data-start=\"2126\" data-end=\"2185\"><strong data-start=\"2126\" data-end=\"2149\">Dependent variable:<\/strong> number of use-of-force complaints<\/p>\n<\/li>\n<\/ul>\n<h3 data-start=\"312\" data-end=\"332\"><a id=\"nominal-scale\"><\/a>NOMINAL SCALES<\/h3>\n<p data-start=\"334\" data-end=\"934\">When measuring using a nominal scale, one simply names or categorizes responses. Race\/ethnicity, gender identity, housing status, and immigration status are examples of variables measured on a nominal scale. The essential point about nominal scales is that they do not imply any ordering among the responses. For example, when classifying people by housing status (housed, unhoused, transitional housing), there is no sense in which \u201choused\u201d is placed \u201cahead of\u201d \u201cunhoused.\u201d Responses are merely categories. Nominal scales embody the lowest level of measurement.<\/p>\n<hr data-start=\"936\" data-end=\"939\" \/>\n<h3 data-start=\"941\" data-end=\"961\"><a id=\"ordinal-scale\"><\/a>ORDINAL SCALES<\/h3>\n<p data-start=\"963\" data-end=\"1556\">A researcher wishing to measure students\u2019 sense of belonging on campus might ask them to rate their experiences as \u201cvery excluded,\u201d \u201csomewhat excluded,\u201d \u201csomewhat included,\u201d or \u201cvery included.\u201d The items in this scale are ordered, ranging from least to most included. This is what distinguishes ordinal from nominal scales. Unlike a nominal scale, an ordinal scale allows a comparison of the degree to which two individuals report belonging. 
For example, our belonging scale makes it meaningful to assert that one student feels more included than another.<\/p>\n<p data-start=\"1558\" data-end=\"2320\">On the other hand, ordinal scales fail to capture important information that will be present in other scales. In particular, the difference between two levels of an ordinal scale cannot be assumed to be the same as the difference between two other levels. In our belonging scale, for example, the difference between \u201cvery excluded\u201d and \u201csomewhat excluded\u201d may not be equivalent to the difference between \u201csomewhat included\u201d and \u201cvery included.\u201d Nothing in our measurement procedure allows us to determine whether the two differences reflect the same change in belonging. Statisticians express this by saying that the differences between adjacent scale values do not necessarily represent equal intervals on the underlying scale giving rise to the measurements.<\/p>\n<p data-start=\"2322\" data-end=\"2536\">Even if we changed the response format to numbers (1 = very excluded, 2 = somewhat excluded, etc.), the meaning would remain ordinal. The jump from 1 to 2 is not guaranteed to be the same as the jump from 3 to 4.<\/p>\n<hr data-start=\"2538\" data-end=\"2541\" \/>\n<h3 data-start=\"2543\" data-end=\"2564\"><a id=\"interval-scale\"><\/a>INTERVAL SCALES<\/h3>\n<p data-start=\"2566\" data-end=\"2936\">An interval scale is a numerical scale in which intervals have the same interpretation throughout. A good example comes from survey research: standardized test scores such as the SAT. The difference between a score of 1000 and 1100 is intended to represent the same difference in performance as the difference between 1200 and 1300.<\/p>\n<p data-start=\"2938\" data-end=\"3483\">Interval scales are not perfect, however. 
They do not have a true zero point even if one of the scaled values happens to carry the name \u201czero.\u201d For instance, in public opinion polling, \u201czero\u201d support for a candidate does not literally mean <em data-start=\"3178\" data-end=\"3186\">no one<\/em> supports them; it just reflects the limits of the measurement. Because an interval scale lacks a true zero, it does not make sense to compute ratios. We cannot say that an SAT score of 1200 means a student is \u201ctwice as smart\u201d as a student with a score of 600, since the zero point is arbitrary.<\/p>\n<hr data-start=\"3485\" data-end=\"3488\" \/>\n<h3 data-start=\"3490\" data-end=\"3508\"><a id=\"ratio-scale\"><\/a>RATIO SCALES<\/h3>\n<p data-start=\"3510\" data-end=\"3739\">The ratio scale of measurement is the most informative scale. It is an interval scale with the additional property that its zero position indicates the absence of the quantity being measured.<\/p>\n<p data-start=\"3741\" data-end=\"3988\">An example of a ratio scale is income. A person with $0 income truly has no income, and someone earning $40,000 makes twice as much as someone earning $20,000. This is what makes it a ratio scale: the zero means \u201cnone,\u201d and ratios are meaningful.<\/p>\n<p data-start=\"3990\" data-end=\"4288\">Another example is hours worked per week. Zero hours means no work at all, while 40 hours is twice as much as 20 hours. Measures such as number of arrests, years of education completed, or distance to the nearest grocery store also fall into the ratio category because they have true zero points.<\/p>\n<p data-start=\"3990\" data-end=\"4288\">In practice, researchers often treat <strong data-start=\"174\" data-end=\"217\">interval and ratio data in similar ways<\/strong> because both use numerical values with equal intervals between them. 
For example, a public opinion survey on immigration policy might use a 1\u20137 scale of attitudes (interval), while census data could record household income in dollars (ratio). Both can be averaged, graphed, or analyzed using many of the same statistical techniques. The main difference is that ratio data have a true zero point while interval data do not, but for most statistical procedures\u2014like correlation, regression, or ANOVA\u2014the methods apply equally well to both. This is why you will often see interval and ratio data grouped together under the term <em data-start=\"843\" data-end=\"855\">scale data<\/em> in statistical software.<\/p>\n<h4 class=\"H2\">What Level of Measurement Is Used for Behavioral Science Variables?<\/h4>\n<p class=\"Text-1st\">Rating scales are used frequently in behavioral science research. For example, experimental subjects may be asked to rate their level of pain, how much they like a consumer product, their attitudes about capital punishment, or their confidence in an answer to a test question. Typically, these ratings are made on a 5-point or a 7-point scale. These scales are often considered ordinal scales. However, researchers also commonly treat them as interval scales, which assumes that the scale values are equidistant. For example, we make the assumption that a treatment that reduces pain from a rated pain level of 3 to a rated pain level of 2 represents the same level of relief as a treatment that reduces pain from a rated pain level of 7 to a rated pain level of 6.<\/p>\n<p class=\"Text\">In memory experiments, the dependent variable is often the number of items correctly recalled. What scale of measurement is this? You could reasonably argue that it is a ratio scale. First, there is a true zero point; some subjects may get no items correct at all. Moreover, a difference of one represents a difference of one item recalled across the entire scale. 
It is certainly valid to say that someone who recalled 12 items recalled twice as many items as someone who recalled only 6 items.<\/p>\n<h3 data-start=\"211\" data-end=\"253\">CONSEQUENCES OF LEVEL OF MEASUREMENT<\/h3>\n<p data-start=\"255\" data-end=\"744\">Why are we so interested in the type of scale that measures a dependent variable? The crux of the matter is the relationship between the variable\u2019s level of measurement and the statistics that can be meaningfully computed with that variable. For example, consider a study in which five students are asked to report their housing status, choosing from the categories: <em data-start=\"622\" data-end=\"697\">housed, temporarily doubled-up, shelter, street, or transitional housing.<\/em> The researcher codes the results as follows:<\/p>\n<div class=\"_tableContainer_1rjym_1\">\n<div class=\"group _tableWrapper_1rjym_13 flex w-fit flex-col-reverse\">\n<table class=\"w-fit min-w-(--thread-content-width)\" data-start=\"746\" data-end=\"990\">\n<thead data-start=\"746\" data-end=\"780\">\n<tr data-start=\"746\" data-end=\"780\">\n<th data-start=\"746\" data-end=\"770\" data-col-size=\"sm\">Housing Status<\/th>\n<th data-start=\"770\" data-end=\"780\" data-col-size=\"sm\">Code<\/th>\n<\/tr>\n<\/thead>\n<tbody data-start=\"816\" data-end=\"990\">\n<tr data-start=\"816\" data-end=\"850\">\n<td data-start=\"816\" data-end=\"840\" data-col-size=\"sm\">Housed<\/td>\n<td data-start=\"840\" data-end=\"850\" data-col-size=\"sm\">1<\/td>\n<\/tr>\n<tr data-start=\"851\" data-end=\"885\">\n<td data-start=\"851\" data-end=\"875\" data-col-size=\"sm\">Doubled-up<\/td>\n<td data-start=\"875\" data-end=\"885\" data-col-size=\"sm\">2<\/td>\n<\/tr>\n<tr data-start=\"886\" data-end=\"920\">\n<td data-start=\"886\" data-end=\"910\" data-col-size=\"sm\">Shelter<\/td>\n<td data-start=\"910\" data-end=\"920\" data-col-size=\"sm\">3<\/td>\n<\/tr>\n<tr data-start=\"921\" data-end=\"955\">\n<td data-start=\"921\" data-end=\"945\" 
data-col-size=\"sm\">Transitional housing<\/td>\n<td data-start=\"945\" data-end=\"955\" data-col-size=\"sm\">4<\/td>\n<\/tr>\n<tr data-start=\"956\" data-end=\"990\">\n<td data-start=\"956\" data-end=\"980\" data-col-size=\"sm\">Street<\/td>\n<td data-start=\"980\" data-end=\"990\" data-col-size=\"sm\">5<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<\/div>\n<h3 class=\"H1\">Collecting Data<\/h3>\n<p class=\"Text-1st\">We are usually interested in understanding a specific group of people. This group is known as the population of interest, or simply the population. The <span class=\"key-term\"><a id=\"population\"><\/a>population<\/span> is the collection of all people who have some characteristic in common; it can be as broad as \u201call people\u201d if we have a very general research question about human behavior, or it can be extremely narrow, such as \u201call freshmen psychology majors at Midwestern public universities\u201d if we have a specific group in mind.<\/p>\n<h3 data-start=\"238\" data-end=\"267\">POPULATIONS AND SAMPLES<\/h3>\n<p data-start=\"269\" data-end=\"510\">In statistics, we often rely on a <a id=\"sample\"><\/a><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_39_576\">sample<\/a>\u2014that is, a small subset of a larger set of data\u2014to draw inferences about the larger set. The larger set is known as the population from which the sample is drawn.<\/p>\n<hr data-start=\"512\" data-end=\"515\" \/>\n<p data-start=\"517\" data-end=\"886\"><strong data-start=\"517\" data-end=\"553\">Example #1: Access to healthcare<\/strong><br data-start=\"553\" data-end=\"556\" \/>Suppose researchers want to know how adults in the United States feel about whether healthcare is affordable. It would not be practical to ask every single adult in the country, so researchers instead survey a smaller group of people. 
The group of adults surveyed is the <em data-start=\"827\" data-end=\"836\">sample,<\/em> while all U.S. adults make up the <em data-start=\"871\" data-end=\"884\">population.<\/em><\/p>\n<p data-start=\"888\" data-end=\"1500\">A sample is typically a small subset of the population. In the case of healthcare attitudes, we might sample a few thousand Americans drawn from the hundreds of millions in the population. But if our sample were made up entirely of people from urban hospitals, it would leave out the experiences of rural residents. Similarly, if the sample included only people with private insurance, it would fail to represent those on Medicaid or those who are uninsured. This is the problem of <strong data-start=\"1370\" data-end=\"1387\"><a id=\"sampling-bias\"><\/a>sampling bias<\/strong>: when our sample over-represents one kind of person, our results cannot be generalized to the full population.<\/p>\n<hr data-start=\"1502\" data-end=\"1505\" \/>\n<p data-start=\"1507\" data-end=\"2384\"><strong data-start=\"1507\" data-end=\"1544\">Example #2: College affordability<\/strong><br data-start=\"1544\" data-end=\"1547\" \/>Imagine we are interested in how many jobs college students are working, on average, while pursuing their degrees. The population in this case is <em data-start=\"1693\" data-end=\"1721\">all U.S. college students.<\/em> Because there are millions of students enrolled in thousands of institutions, it would be impossible to collect work-hour data from everyone. Instead, we select a sample of students from a mix of public and private colleges, community colleges, and universities. If we found in our sample that students work an average of 20 hours per week, we might infer that this is close to the true population average. But we must be cautious: if our sample leaned heavily toward community colleges (where students are more likely to work longer hours), then the estimate might overstate the work hours of all college students. 
Again, unrepresentative samples can mislead.<\/p>\n<hr data-start=\"2386\" data-end=\"2389\" \/>\n<p data-start=\"2391\" data-end=\"2586\">To solidify your understanding of sampling bias, consider the following examples. Identify the population and the sample, and then ask whether the sample is likely to give accurate information.<\/p>\n<p data-start=\"2588\" data-end=\"2841\"><strong data-start=\"2588\" data-end=\"2625\">Example #3: School climate survey<\/strong><br data-start=\"2625\" data-end=\"2628\" \/>A high school principal wants to know how safe students feel on campus. She distributes surveys, but only to students in the honors program. From their responses, she concludes that students generally feel safe.<\/p>\n<ul data-start=\"2842\" data-end=\"3077\">\n<li data-start=\"2842\" data-end=\"2891\">\n<p data-start=\"2844\" data-end=\"2891\"><em data-start=\"2844\" data-end=\"2857\">Population:<\/em> all students in the high school<\/p>\n<\/li>\n<li data-start=\"2892\" data-end=\"2929\">\n<p data-start=\"2894\" data-end=\"2929\"><em data-start=\"2894\" data-end=\"2903\">Sample:<\/em> honors program students<\/p>\n<\/li>\n<li data-start=\"2930\" data-end=\"3077\">\n<p data-start=\"2932\" data-end=\"3077\"><em data-start=\"2932\" data-end=\"2942\">Problem:<\/em> honors students may have different experiences of school climate than students in other tracks, so the sample is not representative.<\/p>\n<\/li>\n<\/ul>\n<p data-start=\"3079\" data-end=\"3356\"><strong data-start=\"3079\" data-end=\"3123\">Example #4: Housing insecurity on campus<\/strong><br data-start=\"3123\" data-end=\"3126\" \/>A researcher wants to estimate how many students at a university have experienced housing insecurity. She asks for volunteers and receives responses from 30 students. 
She reports that 90% of students have struggled with housing.<\/p>\n<ul data-start=\"3357\" data-end=\"3571\">\n<li data-start=\"3357\" data-end=\"3405\">\n<p data-start=\"3359\" data-end=\"3405\"><em data-start=\"3359\" data-end=\"3372\">Population:<\/em> all students at the university<\/p>\n<\/li>\n<li data-start=\"3406\" data-end=\"3433\">\n<p data-start=\"3408\" data-end=\"3433\"><em data-start=\"3408\" data-end=\"3417\">Sample:<\/em> 30 volunteers<\/p>\n<\/li>\n<li data-start=\"3434\" data-end=\"3571\">\n<p data-start=\"3436\" data-end=\"3571\"><em data-start=\"3436\" data-end=\"3446\">Problem:<\/em> students experiencing housing insecurity are more likely to volunteer, so the estimate may exaggerate the true prevalence.<\/p>\n<\/li>\n<\/ul>\n<h4 class=\"H2\">Simple Random Sampling<\/h4>\n<p class=\"Text-1st\">Researchers adopt a variety of sampling strategies. The most straightforward is <span class=\"key-term\"><a id=\"simple-random-sampling\"><\/a>simple random sampling<\/span>. Such sampling requires every member of the population to have an equal chance of being selected into the sample. In addition, the selection of one member must be independent of the selection of every other member. That is, picking one member from the population must not increase or decrease the probability of picking any other member (relative to the others). In this sense, we can say that simple random sampling chooses a sample by pure chance. To check your understanding of simple random sampling, consider the following example. What is the population? What is the sample? Was the sample picked by simple random sampling? Is it biased?<\/p>\n<p class=\"Example\"><span class=\"semibold\">Example #5:<\/span> A research scientist is interested in studying the experiences of twins raised together versus those raised apart. She obtains a list of twins from the National Twin Registry, and selects two subsets of individuals for her study. 
First, she chooses all those in the registry whose last name begins with <span class=\"italic\">Z<\/span>. Then she turns to all those whose last name begins with\u00a0<span class=\"italic\">B<\/span>. Because there are so many names that start with <span class=\"italic\">B<\/span>, however, our researcher decides to incorporate only every other name into her sample. Finally, she mails out a survey and compares characteristics of twins raised apart versus together.<\/p>\n<p class=\"Text\">In Example #5, the population consists of all twins recorded in the National Twin Registry. It is important that the researcher generalize only to the twins on this list, not to all twins in the nation or world, because the National Twin Registry may not be representative of all twins. Even if inferences are limited to the Registry, a number of problems affect the sampling procedure we described. For instance, choosing only twins whose last names begin with <span class=\"italic\">Z <\/span>does not give every individual an equal chance of being selected into the sample. Moreover, such a procedure risks over-representing ethnic groups with many surnames that begin with <span class=\"italic\">Z<\/span>. There are other reasons why choosing just the <span class=\"italic\">Z<\/span>s may bias the sample. Perhaps such people are more patient than average because they often find themselves at the end of the line! The same problem occurs with choosing twins whose last name begins with <span class=\"italic\">B<\/span>. An additional problem for the <span class=\"italic\">B<\/span>s is that the every-other-one procedure prevents adjacent names in the <span class=\"italic\">B<\/span> part of the list from both being selected, so the selection of one member is not independent of the selection of another. 
This defect alone means the sample was not formed through simple random sampling.<\/p>\n<h4 class=\"H2\">Sample Size Matters<\/h4>\n<p class=\"Text-1st\">Recall that the definition of a random sample is a sample in which every member of the population has an equal chance of being selected. This means that the sampling procedure, rather than the results of the procedure, defines what it means for a sample to be random. Random samples, especially if the sample size is small, are not necessarily representative of the entire population. For example, if a random sample of 20 subjects were taken from a population with an equal number of males and females, there would be a nontrivial probability (about .06) that 70% or more of the sample would be female. Such a sample would not be representative, even though it was drawn randomly. Only a large sample size makes it likely that our sample is close to representative of the population. For this reason, inferential statistics take into account the sample size when generalizing results from samples to populations. In later chapters, you\u2019ll see what kinds of mathematical techniques ensure this sensitivity to sample size.<\/p>\n<h4 class=\"H2\">More Complex Sampling<\/h4>\n<p class=\"Text-1st\">Sometimes it is not feasible to build a sample using simple random sampling. To see the problem, consider the fact that both Dallas and Houston competed to be hosts of the 2012 Olympics. Imagine that you had been hired to assess whether most Texans preferred Houston to Dallas as the host, or the reverse. Given the impracticality of obtaining the opinion of every single Texan, you had to construct a sample of the Texas population. But notice how difficult it would have been to proceed by simple random sampling. For example, how would you have contacted those individuals who didn\u2019t vote and didn\u2019t have a phone? 
Even among people you found in the telephone book, how could you have identified those who had just relocated to another state (and had no reason to inform you of their move)? What would you have done about the fact that since the beginning of the study, an additional 4,212 people took up residence in the state of Texas? As you can see, it is sometimes very difficult to develop a truly random procedure. For this reason, other kinds of sampling techniques have been devised. We now discuss two of them.<\/p>\n<h5 class=\"H3\">Stratified Sampling<\/h5>\n<p class=\"Text-1st\">Since simple random sampling often does not ensure a representative sample, a sampling method called <span class=\"key-term\"><a id=\"stratified-random-sampling\"><\/a>stratified random sampling<\/span> is sometimes used to make the sample more representative of the population. This method can be used if the population has a number of distinct \u201cstrata\u201d or groups. In stratified sampling, you first identify the members of the population who belong to each group. Then you randomly sample from each of those subgroups in such a way that the sizes of the subgroups in the sample are proportional to their sizes in the population.<\/p>\n<p class=\"Text\">Let\u2019s take an example: Suppose you were interested in views of capital punishment at an urban university. You have the time and resources to interview 200 students. The student body is diverse with respect to age; many older people work during the day and enroll in night courses (average age of 39), while younger students generally enroll in day classes (average age of 19). It is possible that night students have different views about capital punishment than day students. If 70% of the students are day students, it makes sense to ensure that 70% of the sample consists of day students. Thus, your sample of 200 students would consist of 140 day students and 60 night students. 
The proportion of day students in the sample and in the population (the entire university) would be the same. Inferences to the entire population of students at the university would therefore be more secure.<\/p>\n<h5 class=\"H3\"><a id=\"convenience-sampling\"><\/a>Convenience Sampling<\/h5>\n<p class=\"Text-1st\">Not all sampling methods are perfect, and sometimes that\u2019s okay. For example, if we are beginning research into a completely unstudied area, we may sometimes take some shortcuts to quickly gather data and get a general idea of how things work before fully investing a lot of time and money into well-designed research projects with proper sampling. This is known as <span class=\"key-term\">convenience sampling<\/span>, named for its ease of use. In limited cases, such as the one just described, convenience sampling is okay because we intend to follow up with a representative sample. Unfortunately, sometimes convenience sampling is used due only to its convenience without the intent of improving on it in future work.<\/p>\n<h3 class=\"H1\">Types of Statistical Analyses<\/h3>\n<p class=\"Text-1st\">Now that we understand the nature of our data, let\u2019s turn to the types of statistics we can use to interpret them. There are two types of statistics: descriptive and inferential.<\/p>\n<h4 class=\"H2\">Descriptive Statistics<\/h4>\n<p class=\"Text-1st\"><span class=\"key-term\"><a id=\"descriptive-statistics\"><\/a><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_39_505\">Descriptive statistics<\/a><\/span> are numbers that are used to summarize and describe data. The word \u201cdata\u201d refers to the information that has been collected from an experiment, a survey, a historical record, etc. (By the way, <span class=\"italic\">data<\/span> is plural. One piece of information is called a <span class=\"italic\">datum<\/span>.) 
If we are analyzing birth certificates, for example, a descriptive statistic might be the percentage of certificates issued in New York State, or the average age of the mother. Any other number we choose to compute also counts as a descriptive statistic for the data from which the statistic is computed. Several descriptive statistics are often used at one time to give a full picture of the data.<\/p>\n<p class=\"Text\">Descriptive statistics are just descriptive. They do not involve generalizing beyond the data at hand. Generalizing from our data to another set of cases is the business of inferential statistics, which you\u2019ll be studying in another section. Here we focus on (mere) descriptive statistics.<\/p>\n<p class=\"Text\">Some descriptive statistics are shown in <a href=\"#_idTextAnchor030\"><span class=\"Fig-table-number-underscore\">Table 1.1<\/span><\/a>. The table shows the average salaries for various occupations in the United States in 1999. Descriptive statistics like these offer insight into American society. It is interesting to note, for example, that we pay the people who educate our children and who protect our citizens a great deal less than we pay people who take care of our feet or our teeth.<\/p>\n<div class=\"_idGenObjectLayout-1\">\n<div id=\"_idContainer005\" class=\"Basic-Text-Frame\">\n<p class=\"Table-title\"><span class=\"Fig-table-number\">Table 1.1.<\/span> Average salaries for various U.S. 
occupations in 1999.<\/p>\n<table id=\"table004\" class=\"Foster-table\">\n<colgroup>\n<col class=\"_idGenTableRowColumn-13\" \/>\n<col class=\"_idGenTableRowColumn-14\" \/><\/colgroup>\n<thead>\n<tr class=\"Foster-table _idGenTableRowColumn-5\">\n<th class=\"Foster-table Table-col-hd CellOverride-2\" scope=\"row\">\n<p class=\"Table-col-hd\">Occupation<\/p>\n<\/th>\n<th class=\"Foster-table Table-col-hd\" scope=\"row\">\n<p class=\"Table-col-hd ParaOverride-4\">Salary<\/p>\n<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr class=\"Foster-table _idGenTableRowColumn-6\">\n<td class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-1\">\n<p class=\"Table-body\">Pediatricians<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-3 _idGenCellOverride-1\">\n<p class=\"Table-body ParaOverride-5\">$112,760<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-7\">\n<td class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\">\n<p class=\"Table-body\">Dentists<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-3 _idGenCellOverride-2\">\n<p class=\"Table-body ParaOverride-5\">$106,130<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-6\">\n<td class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\">\n<p class=\"Table-body\">Podiatrists<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-3 _idGenCellOverride-2\">\n<p class=\"Table-body ParaOverride-5\">$100,090<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-7\">\n<td class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\">\n<p class=\"Table-body\">Physicists<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-3 _idGenCellOverride-2\">\n<p class=\"Table-body ParaOverride-5\">$76,140<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-6\">\n<td class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\">\n<p class=\"Table-body\">Architects<\/p>\n<\/td>\n<td 
class=\"Foster-table Table-body CellOverride-3 _idGenCellOverride-2\">\n<p class=\"Table-body ParaOverride-5\">$53,410<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-7\">\n<td class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\">\n<p class=\"Table-body\">School, clinical, and counseling psychologists<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-3 _idGenCellOverride-2\">\n<p class=\"Table-body ParaOverride-5\">$49,720<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-6\">\n<td class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\">\n<p class=\"Table-body\">Flight attendants<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-3 _idGenCellOverride-2\">\n<p class=\"Table-body ParaOverride-5\">$47,910<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-7\">\n<td class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\">\n<p class=\"Table-body\">Elementary school teachers<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-3 _idGenCellOverride-2\">\n<p class=\"Table-body ParaOverride-5\">$39,560<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-6\">\n<td class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\">\n<p class=\"Table-body\">Police officers<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-3 _idGenCellOverride-2\">\n<p class=\"Table-body ParaOverride-5\">$38,710<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-8\">\n<td class=\"Foster-table Table-body-last Table-body CellOverride-2\">\n<p class=\"Table-body\">Floral designers<\/p>\n<\/td>\n<td class=\"Foster-table Table-body-last Table-body CellOverride-3\">\n<p class=\"Table-body ParaOverride-5\">$18,980<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<\/div>\n<p class=\"Text\">For more descriptive statistics, consider <a href=\"#_idTextAnchor031\"><span class=\"Fig-table-number-underscore\">Table 
1.2<\/span><\/a>. It shows the number of unmarried men per 100 unmarried women in U.S. metro areas in 1990. From this table we see that men outnumber women most in Jacksonville, North Carolina, and women outnumber men most in Sarasota, Florida. You can see that descriptive statistics can be useful if we are looking for an opposite-sex partner! (These data come from the <a href=\"https:\/\/www.infoplease.com\/us\/states\/the-top-ten-us-male-female-ratios\"><span class=\"Hyperlink-underscore\">Information Please Almanac<\/span><\/a>.)<\/p>\n<div class=\"_idGenObjectLayout-1\">\n<div id=\"_idContainer006\" class=\"_idGenObjectStyleOverride-1\">\n<p class=\"Table-title\"><span class=\"Fig-table-number\">Table 1.2.<\/span> Number of unmarried men per 100 unmarried women in U.S. metro areas in 1990. <span class=\"italic CharOverride-3\">note<\/span><span class=\"italic\">: Unmarried includes never<\/span>&#8211;<span class=\"italic\">married, widowed, and divorced persons, 15 years or older.<\/span><\/p>\n<table id=\"table005\" class=\"Foster-table\" style=\"height: 187px;\">\n<colgroup>\n<col class=\"_idGenTableRowColumn-15\" \/>\n<col class=\"_idGenTableRowColumn-16\" \/>\n<col class=\"_idGenTableRowColumn-17\" \/>\n<col class=\"_idGenTableRowColumn-18\" \/><\/colgroup>\n<thead>\n<tr class=\"Foster-table _idGenTableRowColumn-19\" style=\"height: 17px;\">\n<td class=\"Foster-table Table-col-hd CellOverride-4\" style=\"height: 17px; width: 324.312px;\">\n<p class=\"Table-col-hd\">Cities with Mostly Men<\/p>\n<\/td>\n<td class=\"Foster-table Table-col-hd CellOverride-5\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-col-hd ParaOverride-4\">Men per 100 Women<\/p>\n<\/td>\n<td class=\"Foster-table Table-col-hd CellOverride-4\" style=\"height: 17px; width: 215.656px;\">\n<p class=\"Table-col-hd\">Cities with Mostly Women<\/p>\n<\/td>\n<td class=\"Foster-table Table-col-hd\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-col-hd 
ParaOverride-4\">Men per 100 Women<\/p>\n<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr class=\"Foster-table _idGenTableRowColumn-6\" style=\"height: 17px;\">\n<th style=\"height: 17px; width: 324.312px;\" scope=\"row\">\n<p class=\"Table-numbered-list ParaOverride-6\">1. Jacksonville, North Carolina<\/p>\n<\/th>\n<td class=\"Foster-table Table-body CellOverride-5 _idGenCellOverride-1\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">224<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-1\" style=\"height: 17px; width: 215.656px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">1. Sarasota, Florida<\/p>\n<\/td>\n<td class=\"Foster-table Table-body _idGenCellOverride-1\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">66<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-7\" style=\"height: 17px;\">\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 324.312px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">2. Killeen\u2013Temple, Texas<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-5 _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">123<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 215.656px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">2. Bradenton, Florida<\/p>\n<\/td>\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">68<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-6\" style=\"height: 17px;\">\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 324.312px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">3. 
Fayetteville, North Carolina<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-5 _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">118<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 215.656px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">3. Altoona, Pennsylvania<\/p>\n<\/td>\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">69<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-7\" style=\"height: 17px;\">\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 324.312px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">4. Brazoria, Texas<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-5 _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">117<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 215.656px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">4. Springfield, Illinois<\/p>\n<\/td>\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">70<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-6\" style=\"height: 17px;\">\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 324.312px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">5. 
Lawton, Oklahoma<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-5 _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">116<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 215.656px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">5. Jacksonville, Tennessee<\/p>\n<\/td>\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">70<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-7\" style=\"height: 17px;\">\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 324.312px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">6. State College, Pennsylvania<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-5 _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">113<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 215.656px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">6. Gadsden, Alabama<\/p>\n<\/td>\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">70<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-20\" style=\"height: 17px;\">\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 324.312px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">7. 
Clarksville\u2013Hopkinsville, Tennessee\u2013Kentucky<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-5 _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">113<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 215.656px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">7. Wheeling, West\u00a0Virginia\u2013Ohio<\/p>\n<\/td>\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">70<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-7\" style=\"height: 17px;\">\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 324.312px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">8. Anchorage, Alaska<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-5 _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">112<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 215.656px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">8. Charleston, West Virginia<\/p>\n<\/td>\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">71<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-6\" style=\"height: 17px;\">\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 324.312px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">9. 
Salinas\u2013Seaside\u2013Monterey, California<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-5 _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">112<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-4 _idGenCellOverride-2\" style=\"height: 17px; width: 215.656px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">9. St. Joseph, Missouri<\/p>\n<\/td>\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">71<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-8\" style=\"height: 17px;\">\n<td class=\"Foster-table Table-body-last Table-body CellOverride-4\" style=\"height: 17px; width: 324.312px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">10. Bryan\u2013College Station, Texas<\/p>\n<\/td>\n<td class=\"Foster-table Table-body-last Table-body CellOverride-5\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">111<\/p>\n<\/td>\n<td class=\"Foster-table Table-body-last Table-body CellOverride-4\" style=\"height: 17px; width: 215.656px;\">\n<p class=\"Table-numbered-list ParaOverride-6\">10. Lynchburg, Virginia<\/p>\n<\/td>\n<td class=\"Foster-table Table-body-last Table-body\" style=\"height: 17px; width: 136.523px;\">\n<p class=\"Table-body ParaOverride-4\">71<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<\/div>\n<p class=\"Text\">These descriptive statistics may make us ponder why the numbers are so disparate in these cities. One potential explanation, for instance, as to why there are more women in Florida than men may involve the fact that elderly individuals tend to move down to the Sarasota region and that women tend to outlive men. Thus, more women might live in Sarasota than men. 
However, in the absence of proper data, this is only speculation.<\/p>\n<p class=\"Text\">Descriptive statistics can also summarize change over time. Consider the winning times in the men\u2019s Olympic marathon (the full table of times is not reproduced here). To gain insight into the improvement in speed over the years, let us divide the men\u2019s times into two groups: the first 13 races (up to 1952) and the second 13 (starting from 1956). The mean winning time for the first 13 races is 2 hours, 44 minutes, and 22 seconds (written 2:44:22). The mean winning time for the second 13 races is 2:13:18. This is quite a difference (over half an hour). Does this prove that the fastest men are running faster? Or is the difference just due to chance, no more than what often emerges from chance differences in performance from year to year? We can\u2019t answer this question with descriptive statistics alone. All we can affirm is that the two means are \u201csuggestive.\u201d<\/p>\n<p class=\"Text\">It is also important to differentiate what we use to describe populations vs. what we use to describe samples. A population is described by a parameter; the parameter is the true value of a descriptive measure in the population, but one that we can rarely know for sure. For example, the Bureau of Labor Statistics reports that the average hourly wage of chefs is $23.87. However, even if this number were computed using information from every single chef in the United States (making it a parameter), it would quickly become slightly off as one chef retires and a new chef enters the job market. Additionally, as noted above, there is virtually no way to collect data from every single person in a population. In order to understand a variable, we estimate the population parameter using a sample statistic. Here, the term <span class=\"italic\">statistic<\/span> refers to the specific number we compute from the data (e.g., the average), not the field of statistics. 
A sample statistic is an estimate of the true population parameter, and if our sample is representative of the population, then the statistic is considered to be a good estimator of the parameter.<\/p>\n<p class=\"Text\">Even the best sample, one free of the sampling bias described earlier, will differ somewhat from the full population simply by chance. As a result, there will almost always be some discrepancy between the parameter and the statistic we use to estimate it. This difference is known as <span class=\"key-term\"><a id=\"sampling-error\"><\/a>sampling error<\/span>, and, as we will see throughout the course, understanding sampling error is the key to understanding statistics. Every observation we make about a variable, be it a full research study or observing an individual\u2019s behavior, is incapable of being completely representative of all possibilities for that variable. Knowing where to draw the line between an unusual observation and a true difference is what statistics is all about.<\/p>\n<h4 class=\"H2\">Inferential Statistics<\/h4>\n<p class=\"Text-1st\">Descriptive statistics are wonderful at telling us what our data look like. However, what we often want to understand is how our data behave. What variables are related to other variables? Under what conditions will the value of a variable change? Are two groups different from each other, and if so, are people within each group different or similar? These are the questions answered by inferential statistics, and inferential statistics are how we generalize from our sample back up to our population. 
<a href=\"https:\/\/pressbooks.palomar.edu\/introtostats\/part\/unit-2-hypothesis-testing\/\"><span class=\"Hyperlink-underscore\">Unit 2<\/span><\/a> and <a href=\"https:\/\/pressbooks.palomar.edu\/introtostats\/part\/unit-3-additional-hypothesis-tests\/\"><span class=\"Hyperlink-underscore\">Unit 3<\/span><\/a> are all about <span class=\"key-term\"><a id=\"inferential-statistics\"><\/a><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_39_526\">inferential statistics<\/a><\/span>, the formal analyses and tests we run to draw conclusions from our data.<\/p>\n<p class=\"Text\">For example, we will learn how to use a <span class=\"italic\">t <\/span>statistic to determine whether people change over time when enrolled in an intervention. We will also use an <span class=\"italic\">F\u00a0<\/span>statistic to determine if we can predict future values on a variable based on current known values of a variable. There are many types of inferential statistics, each allowing us insight into a different behavior of the data we collect. This course will only touch on a small subset (or a <span class=\"italic\">sample<\/span>) of them, but the principles we learn along the way will make it easier to learn new tests, as most inferential statistics follow the same structure and format.<\/p>\n<h3 class=\"H1\">A Note about Statistical Software<\/h3>\n<p class=\"Text-1st\">Many software tools support the statistical and quantitative data analysis done by psychologists. The statistical software we use is the proprietary Statistical Package for the Social Sciences (SPSS), which can be accessed through the virtual desktop at Palomar College.<\/p>\n<h3 class=\"H1\">Mathematical Notation<\/h3>\n<p class=\"Text-1st\">As noted earlier, statistics is not math. It does, however, use math as a tool. Many statistical formulas involve summing numbers. Fortunately, there is a convenient notation for expressing summation. 
This section covers the basics of this summation notation.<\/p>\n<p class=\"Text\">Let\u2019s say we have a variable <span class=\"italic\">X<\/span> that represents the weights (in grams) of 4 grapes:<\/p>\n<table id=\"table008\" class=\"Foster-table _idGenTablePara-1\" style=\"height: 78px; width: 194px;\">\n<colgroup>\n<col class=\"_idGenTableRowColumn-25\" \/>\n<col class=\"_idGenTableRowColumn-26\" \/><\/colgroup>\n<thead>\n<tr class=\"Foster-table _idGenTableRowColumn-5\" style=\"height: 10px;\">\n<th class=\"Foster-table Table-col-hd CellOverride-2\" style=\"height: 10px; width: 235.844px;\" scope=\"row\">\n<p class=\"Table-col-hd ParaOverride-4\">Grape<\/p>\n<\/th>\n<th class=\"Foster-table Table-col-hd\" style=\"height: 10px; width: 160px;\" scope=\"row\">\n<p class=\"Table-col-hd ParaOverride-4\"><span class=\"bold-italic\">X<\/span><\/p>\n<\/th>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-6\" style=\"height: 17px;\">\n<th class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-1\" style=\"height: 17px; width: 235.844px;\" scope=\"row\">\n<p class=\"Table-body ParaOverride-4\">Grape 1<\/p>\n<\/th>\n<th class=\"Foster-table Table-body _idGenCellOverride-1\" style=\"height: 17px; width: 160px;\" scope=\"row\">\n<p class=\"Table-body ParaOverride-4\">4.6<\/p>\n<\/th>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-7\" style=\"height: 17px;\">\n<th class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\" style=\"height: 17px; width: 235.844px;\" scope=\"row\">\n<p class=\"Table-body ParaOverride-4\">Grape 2<\/p>\n<\/th>\n<th class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 160px;\" scope=\"row\">\n<p class=\"Table-body ParaOverride-4\">5.1<\/p>\n<\/th>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-6\" style=\"height: 17px;\">\n<th class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\" style=\"height: 17px; width: 235.844px;\" scope=\"row\">\n<p 
class=\"Table-body ParaOverride-4\">Grape 3<\/p>\n<\/th>\n<th class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 160px;\" scope=\"row\">\n<p class=\"Table-body ParaOverride-4\">4.9<\/p>\n<\/th>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-8\" style=\"height: 17px;\">\n<th class=\"Foster-table Table-body-last Table-body CellOverride-2\" style=\"height: 17px; width: 235.844px;\" scope=\"row\">\n<p class=\"Table-body ParaOverride-4\">Grape 4<\/p>\n<\/th>\n<th class=\"Foster-table Table-body-last Table-body\" style=\"height: 17px; width: 160px;\" scope=\"row\">\n<p class=\"Table-body ParaOverride-4\">4.4<\/p>\n<\/th>\n<\/tr>\n<\/thead>\n<\/table>\n<p class=\"Text\">The Greek letter <img decoding=\"async\" class=\"_idGenObjectAttribute-5\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2021\/12\/Eqn1.1-sigma-2.png\" alt=\"Upper Sigma\" \/> indicates summation.<\/p>\n<p class=\"Text\">When all the scores of a variable (such as <span class=\"italic\">X<\/span>) are to be summed, it is often convenient to use the following abbreviated notation:<\/p>\n<p class=\"Equation\"><img decoding=\"async\" class=\"_idGenObjectAttribute-10\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.4-2.png\" alt=\"sigma-summation Upper X\" \/><strong> = 4.6 + 5.1 + 4.9 + 4.4 = 19<\/strong><\/p>\n<p class=\"Text\">This notation means to sum all the values of <span class=\"italic\">X<\/span>.<\/p>\n<p class=\"Text\">Many formulas involve squaring numbers before they are summed. This is indicated as:<\/p>\n<p class=\"Equation\"><img loading=\"lazy\" decoding=\"async\" class=\"_idGenObjectAttribute-11 alignnone\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.5-2.png\" alt=\"sigma-summation Upper X Sup 2 Base. Square and then add x values. 
Total is 90.54\" width=\"744\" height=\"42\" \/><\/p>\n<p class=\"Text\">Notice that:<\/p>\n<p class=\"Equation\"><img decoding=\"async\" class=\"_idGenObjectAttribute-12\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.6-2.png\" alt=\"l-par sigma-summation Upper X r-par Sup 2 Base not-equals sigma-summation Upper X Sup 2\" \/><\/p>\n<p class=\"Text\">because the expression on the left means to sum up all the values of <span class=\"italic\">X<\/span> and then square the sum (<img decoding=\"async\" class=\"_idGenObjectAttribute-13\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.6a-2.png\" alt=\"19 Sup 2 Base equals 361\" \/>), whereas the expression on the right means to square the numbers and then sum the squares (90.54, as shown).<\/p>\n<p class=\"Text\">Some formulas involve the sum of cross products. Below are the data for variables <span class=\"italic\">X<\/span> and <span class=\"italic\">Y<\/span>. The cross products (<span class=\"italic\">XY<\/span>) are shown in the third column. 
The sum of the cross products is <img decoding=\"async\" class=\"_idGenObjectAttribute-14\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.6b-2.png\" alt=\"3 plus 4 plus 21 equals 28\" \/>.<\/p>\n<table id=\"table009\" class=\"Foster-table _idGenTablePara-1\">\n<colgroup>\n<col class=\"_idGenTableRowColumn-27\" \/>\n<col class=\"_idGenTableRowColumn-27\" \/>\n<col class=\"_idGenTableRowColumn-2\" \/><\/colgroup>\n<thead>\n<tr class=\"Foster-table _idGenTableRowColumn-5\">\n<th class=\"Foster-table Table-col-hd CellOverride-2\" scope=\"row\">\n<p class=\"Table-col-hd\"><span class=\"bold-italic\">X<\/span><\/p>\n<\/th>\n<th class=\"Foster-table Table-col-hd CellOverride-2\" scope=\"row\">\n<p class=\"Table-col-hd\"><span class=\"bold-italic\">Y<\/span><\/p>\n<\/th>\n<th class=\"Foster-table Table-col-hd\" scope=\"row\">\n<p class=\"Table-col-hd\"><span class=\"bold-italic\">XY<\/span><\/p>\n<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr class=\"Foster-table _idGenTableRowColumn-6\">\n<th class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-1\" scope=\"row\">\n<p class=\"Table-body\">1<\/p>\n<\/th>\n<th class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-1\" scope=\"row\">\n<p class=\"Table-body\">3<\/p>\n<\/th>\n<th class=\"Foster-table Table-body _idGenCellOverride-1\" scope=\"row\">\n<p class=\"Table-body\">3<\/p>\n<\/th>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-7\">\n<th class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\" scope=\"row\">\n<p class=\"Table-body\">2<\/p>\n<\/th>\n<th class=\"Foster-table Table-body CellOverride-2 _idGenCellOverride-2\" scope=\"row\">\n<p class=\"Table-body\">2<\/p>\n<\/th>\n<th class=\"Foster-table Table-body _idGenCellOverride-2\" scope=\"row\">\n<p class=\"Table-body\">4<\/p>\n<\/th>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-11\">\n<th class=\"Foster-table Table-body-last Table-body CellOverride-2\" scope=\"row\">\n<p 
class=\"Table-body\">3<\/p>\n<\/th>\n<th class=\"Foster-table Table-body-last Table-body CellOverride-2\" scope=\"row\">\n<p class=\"Table-body\">7<\/p>\n<\/th>\n<th class=\"Foster-table Table-body-last Table-body\" scope=\"row\">\n<p class=\"Table-body\">21<\/p>\n<\/th>\n<\/tr>\n<\/tbody>\n<\/table>\n<p class=\"Text\">In summation notation, this is written as:<\/p>\n<p class=\"Equation\"><img decoding=\"async\" class=\"_idGenObjectAttribute-15\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.7-2.png\" alt=\"sigma-summation Upper X Upper Y equals 28\" \/><\/p>\n<h3 class=\"H1\">Exercises<\/h3>\n<ol>\n<li class=\"Numbered-list-Exercises-1st\">In your own words, describe why we study statistics.<\/li>\n<li class=\"Numbered-list-Exercises\">For each of the following, determine if the variable is continuous or discrete:\n<ol>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">Time taken to read a book chapter<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">Favorite food<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">Cognitive ability<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">Temperature<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">Letter grade received in a class<\/li>\n<\/ol>\n<\/li>\n<li class=\"Numbered-list-Exercises\">For each of the following, determine the level of measurement:\n<ol>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">T-shirt size<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">Time taken to run 100-meter race<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">First, second, and third place in 100-meter race<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">Birthplace<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">Temperature in Celsius<\/li>\n<\/ol>\n<\/li>\n<li class=\"Numbered-list-Exercises\">What is 
the difference between a population and a sample? Which is described by a parameter and which is described by a statistic?<\/li>\n<li class=\"Numbered-list-Exercises\">What is sampling bias? What is sampling error?<\/li>\n<li class=\"Numbered-list-Exercises\">What is the difference between a simple random sample and a stratified random sample?<\/li>\n<li class=\"Numbered-list-Exercises\"><a id=\"non-experimental-research\"><\/a><a id=\"experimental-research\"><\/a>What are the two key characteristics of a true experimental design?<\/li>\n<li class=\"Numbered-list-Exercises\"><a id=\"quasi-experimental-research\"><\/a>When would we use a quasi-experimental design?<\/li>\n<li class=\"Numbered-list-Exercises\">Use the following dataset for the computations below:<br \/>\n<table id=\"table010\" class=\"Foster-table _idGenTablePara-1\" style=\"height: 102px;\">\n<colgroup>\n<col class=\"_idGenTableRowColumn-27\" \/>\n<col class=\"_idGenTableRowColumn-28\" \/><\/colgroup>\n<thead>\n<tr class=\"Foster-table _idGenTableRowColumn-5\" style=\"height: 17px;\">\n<td class=\"Foster-table Table-col-hd\" style=\"height: 17px; width: 194px;\">\n<p class=\"Table-col-hd\"><span class=\"bold-italic\">X<\/span><\/p>\n<\/td>\n<td class=\"Foster-table Table-col-hd\" style=\"height: 17px; width: 194px;\">\n<p class=\"Table-col-hd\"><span class=\"bold-italic\">Y<\/span><\/p>\n<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr class=\"Foster-table _idGenTableRowColumn-6\" style=\"height: 17px;\">\n<td class=\"Foster-table Table-body _idGenCellOverride-1\" style=\"height: 17px; width: 194px;\">\n<p class=\"Table-body\">2<\/p>\n<\/td>\n<td class=\"Foster-table Table-body _idGenCellOverride-1\" style=\"height: 17px; width: 194px;\">\n<p class=\"Table-body\">8<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-7\" style=\"height: 17px;\">\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 194px;\">\n<p class=\"Table-body\">3<\/p>\n<\/td>\n<td 
class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 194px;\">\n<p class=\"Table-body\">8<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-6\" style=\"height: 17px;\">\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 194px;\">\n<p class=\"Table-body\">7<\/p>\n<\/td>\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 194px;\">\n<p class=\"Table-body\">4<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-7\" style=\"height: 17px;\">\n<td class=\"Foster-table Table-body _idGenCellOverride-2\" style=\"height: 17px; width: 194px;\">\n<p class=\"Table-body\">5<\/p>\n<\/td>\n<th style=\"height: 17px; width: 194px;\" scope=\"row\">\n<p class=\"Table-body\">1<\/p>\n<\/th>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-11\" style=\"height: 17px;\">\n<td class=\"Foster-table Table-body-last Table-body\" style=\"height: 17px; width: 194px;\">\n<p class=\"Table-body\">9<\/p>\n<\/td>\n<td class=\"Foster-table Table-body-last Table-body\" style=\"height: 17px; width: 194px;\">\n<p class=\"Table-body\">4<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<ol>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\"><img decoding=\"async\" class=\"_idGenObjectAttribute-10\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.4-2.png\" alt=\"sigma-summation Upper X\" \/><\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\"><span xml:lang=\"ar-SA\"><img decoding=\"async\" class=\"_idGenObjectAttribute-16\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.9-2.png\" alt=\"sigma-summation Upper Y Sup 2\" \/><\/span><\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\"><img decoding=\"async\" class=\"_idGenObjectAttribute-17\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.10-2.png\" 
alt=\"sigma-summation Upper X Upper Y\" \/><\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\"><img decoding=\"async\" class=\"_idGenObjectAttribute-18\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn1.11-2.png\" alt=\"l-par sigma-summation Upper Y r-par Sup 2\" \/><\/li>\n<\/ol>\n<\/li>\n<li class=\"Numbered-list-Exercises\">What are the most common measures of central tendency and spread?<\/li>\n<\/ol>\n<div class=\"textbox textbox--learning-objectives\">\n<header class=\"textbox__header\">\n<h3 class=\"H1\">Answers to Odd-Numbered Exercises<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p>1)<\/p>\n<p><span style=\"font-size: 14pt;\">Your answer could take many forms but should include information about objectively interpreting information and\/or communicating results and research conclusions.<\/span><\/p>\n<p>3)<\/p>\n<ol>\n<li style=\"list-style-type: none;\">\n<ol>\n<li class=\"Numbered-list-Exercises-sub-odd _idGenParaOverride-2\">Ordinal<\/li>\n<li class=\"Numbered-list-Exercises-sub-odd _idGenParaOverride-2\">Ratio<\/li>\n<li class=\"Numbered-list-Exercises-sub-odd _idGenParaOverride-2\">Ordinal<\/li>\n<li class=\"Numbered-list-Exercises-sub-odd _idGenParaOverride-2\">Nominal<\/li>\n<li class=\"Numbered-list-Exercises-sub-odd _idGenParaOverride-2\">Interval<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p>5)<\/p>\n<p><span style=\"font-size: 14pt;\">Sampling bias is the difference in demographic characteristics between a sample and the population it should represent. 
Sampling error is the difference between a population parameter and a sample statistic that is caused by random chance.<\/span><\/p>\n<p>7)<\/p>\n<p><span style=\"font-size: 14pt;\">Random assignment to treatment conditions and manipulation of the independent variable<\/span><\/p>\n<p>9)<\/p>\n<ol>\n<li>26<\/li>\n<li>161<\/li>\n<li>109<\/li>\n<li>625<\/li>\n<\/ol>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"glossary\"><span class=\"screen-reader-text\" id=\"definition\">definition<\/span><template id=\"term_39_495\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_39_495\"><div tabindex=\"-1\"><p>The group in an experimental study that is not receiving the treatment being tested.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_39_507\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_39_507\"><div tabindex=\"-1\"><p>A variable that exists in indivisible units. For quantitative variables, it is measured in whole numbers that are discrete points on the scale.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_39_494\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_39_494\"><div tabindex=\"-1\"><p>Numerical variables that can take on any value in a certain range. 
Time and distance are continuous; gender, SAT score, and \u201ctime rounded to the nearest second\u201d are not.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_39_612\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_39_612\"><div tabindex=\"-1\"><\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_39_576\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_39_576\"><div tabindex=\"-1\"><p>A subset of a population, often taken for the purpose of statistical inference.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_39_505\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_39_505\"><div tabindex=\"-1\"><p>A set of statistics\u2014such as the mean, standard deviation, and skew\u2014that describe a distribution.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_39_526\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_39_526\"><div tabindex=\"-1\"><p>The branch of statistics concerned with drawing conclusions about a population from a sample. 
This is generally done through random sampling, followed by inferences made about central tendency, or any of a number of other aspects of a<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><\/div>","protected":false},"author":7,"menu_order":1,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-39","chapter","type-chapter","status-publish","hentry"],"part":21,"_links":{"self":[{"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/chapters\/39","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/wp\/v2\/users\/7"}],"version-history":[{"count":38,"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/chapters\/39\/revisions"}],"predecessor-version":[{"id":1188,"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/chapters\/39\/revisions\/1188"}],"part":[{"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/parts\/21"}],"metadata":[{"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/chapters\/39\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/wp\/v2\/media?parent=39"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/chapter-type?post=39"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/wp\/v2\/contributor?post=39"},{"taxon
omy":"license","embeddable":true,"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/wp\/v2\/license?post=39"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}