
{"id":153,"date":"2021-12-15T18:19:59","date_gmt":"2021-12-15T18:19:59","guid":{"rendered":"https:\/\/pressbooks.palomar.edu\/introtostats\/chapter\/chapter-4\/"},"modified":"2025-08-28T01:02:04","modified_gmt":"2025-08-28T01:02:04","slug":"chapter-4","status":"publish","type":"chapter","link":"https:\/\/pressbooks.palomar.edu\/introtostats\/chapter\/chapter-4\/","title":{"raw":"Chapter 4: z Scores and the Standard Normal Distribution","rendered":"Chapter 4: z Scores and the Standard Normal Distribution"},"content":{"raw":"<div class=\"textbox textbox--sidebar textbox--learning-objectives\"><header class=\"textbox__header\">\r\n<h3 class=\"Chapter-element-head\">Key Terms<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\n&nbsp;\r\n<p class=\"Key-terms\"><a href=\"#_idTextAnchor139\"><span class=\"Hyperlink-underscore\">area in the tails of the distribution<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#_idTextAnchor138\"><span class=\"Hyperlink-underscore\">area under the curve<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#_idTextAnchor129\"><span class=\"Hyperlink-underscore\">normal distribution<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#_idTextAnchor136\"><span class=\"Hyperlink-underscore\">scale<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#_idTextAnchor137\"><span class=\"Hyperlink-underscore\">standardization<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#_idTextAnchor130\"><span class=\"Hyperlink-underscore\">standard normal distribution<\/span><\/a><\/p>\r\n<p class=\"Key-terms\"><a href=\"#_idTextAnchor132\"><span class=\"Hyperlink-underscore CharOverride-12\">z<\/span><span class=\"Hyperlink-underscore\"> score<\/span><\/a><\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<p class=\"Text-1st\">We now understand how to describe and present our data visually and numerically. 
These simple tools, and the principles behind them, will help you interpret information presented to you and understand the basics of a variable. Moving forward, we now turn our attention to how scores within a distribution are related to one another, how to precisely describe a score\u2019s location within the distribution, and how to compare scores from different distributions.<\/p>\r\n\r\n<h3 class=\"H1\"><a id=\"_idTextAnchor128\"><\/a>Normal Distributions<\/h3>\r\n<p class=\"Text-1st\">The [pb_glossary id=\"650\"]<a id=\"_idTextAnchor129\"><\/a>[\/pb_glossary]<span class=\"key-term\">normal distribution<\/span> is the most important and most widely used distribution in statistics. It is sometimes called the \u201cbell curve,\u201d although the tonal qualities of such a bell would be less than pleasing. It is also called the \u201cGaussian curve\u201d or Gaussian distribution after the mathematician Carl Friedrich Gauss.<\/p>\r\n<p class=\"Text\">Strictly speaking, it is not correct to talk about \u201cthe normal distribution\u201d since there are many normal distributions. Normal distributions can differ in their means and in their standard deviations. <a href=\"#_idTextAnchor131\"><span class=\"Fig-table-number-underscore\">Figure 4.1<\/span><\/a> shows three normal distributions. The blue (left-most) distribution has a mean of \u22123 and a standard deviation of 0.5, the distribution in red (the middle distribution) has a mean of 0 and a standard deviation of 1, and the black (right-most) distribution has a mean of 2 and a standard deviation of 3. These as well as all other normal distributions are symmetric with relatively more values at the center of the distribution and relatively few in the tails. What is consistent about all normal distributions is the shape and the proportion of scores within a given distance from the mean, measured in standard deviations along the <span class=\"italic\">x<\/span>-axis. 
We will focus on the [pb_glossary id=\"653\"]<a id=\"_idTextAnchor130\"><\/a>[\/pb_glossary]<span class=\"key-term\">standard normal distribution<\/span> (also known as the unit normal distribution), which has a mean of 0 and a standard deviation of 1 (i.e., the red distribution in <a href=\"#_idTextAnchor131\"><span class=\"Fig-table-number-underscore\">Figure 4.1<\/span><\/a>).<\/p>\r\n\r\n<div class=\"_idGenObjectLayout-2\">\r\n<div id=\"_idContainer199\" class=\"Side-legend\">\r\n<p class=\"Fig-legend\"><span class=\"Fig-table-number\"><a id=\"_idTextAnchor131\"><\/a>Figure 4.1.<\/span> Normal distributions differing in mean and standard deviation. <span class=\"Fig-source\">(\u201c<\/span><a href=\"https:\/\/irl.umsl.edu\/oer-img\/51\"><span class=\"Fig-source\"><span class=\"Hyperlink-underscore\">Normal Distributions with Different Means and Standard Deviations<\/span><\/span><\/a><span class=\"Fig-source\">\u201d by Judy Schmitt is licensed under <\/span><a href=\"https:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/\"><span class=\"Fig-source\"><span class=\"Hyperlink-underscore\">CC BY-NC-SA 4.0<\/span><\/span><\/a><span class=\"Fig-source\">.)<\/span><\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"_idGenObjectLayout-1\">\r\n<div id=\"_idContainer200\" class=\"_idGenObjectStyleOverride-1\"><img class=\"_idGenObjectAttribute-19\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2021\/12\/Normal_Distributions_with_Different_Means_and_Standard_Deviations-2.png\" alt=\"\" \/><\/div>\r\n<\/div>\r\n<p class=\"Text\">Seven features of normal distributions are listed below.<\/p>\r\n\r\n<ol>\r\n \t<li class=\"Numbered-list-first\">Normal distributions are symmetric around their mean.<\/li>\r\n \t<li class=\"Numbered-list ParaOverride-16\">The mean, median, and mode of a normal distribution are equal.<\/li>\r\n \t<li class=\"Numbered-list ParaOverride-16\">The area under the normal curve is equal to 1.0.<\/li>\r\n \t<li 
class=\"Numbered-list ParaOverride-16\">Normal distributions are denser in the center and less dense in the tails.<\/li>\r\n \t<li class=\"Numbered-list ParaOverride-16\">Normal distributions are defined by two parameters, the mean (<img class=\"_idGenObjectAttribute-31\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn2.14-mu-3.png\" alt=\"mu\" \/>) and the standard deviation (<span class=\"Symbol\">s<\/span>).<\/li>\r\n \t<li class=\"Numbered-list ParaOverride-16\">Approximately 68% of the area of a normal distribution is within one standard deviation of the mean.<\/li>\r\n \t<li class=\"Numbered-list ParaOverride-16\">Approximately 95% of the area of a normal distribution is within two standard deviations of the mean.<\/li>\r\n<\/ol>\r\n<strong data-start=\"333\" data-end=\"384\">Social Justice Example (Housing Affordability):<\/strong><br data-start=\"384\" data-end=\"387\" \/>Consider rental housing costs across a large metropolitan area. If the costs follow an approximately normal distribution, most renters will pay amounts close to the average, while very high or very low rents will fall in the tails. But in cities where gentrification and displacement are occurring, the distribution of rents may become skewed rather than normal, with many tenants pushed into unaffordable extremes. Understanding the normal curve helps us recognize when the \u201ctypical\u201d experience no longer reflects people\u2019s lived reality.\r\n<p class=\"Text\">These properties enable us to use the normal distribution to understand how scores relate to one another within and across distributions. 
But first, we need to learn how to calculate the standardized score that makes up a standard normal distribution.<\/p>\r\n\r\n<h3 class=\"H1\"><span class=\"bold-italic CharOverride-3\">Z<\/span> Scores<\/h3>\r\n<p class=\"Text-1st\">A [pb_glossary id=\"654\"]<a id=\"_idTextAnchor132\"><\/a>[\/pb_glossary]<span class=\"key-term CharOverride-2\">z<\/span><span class=\"key-term\"> score<\/span> is a standardized version of a raw score (<span class=\"italic\">x<\/span>) that gives information about the relative location of that score within its distribution. The formula for converting a raw score into a <span class=\"italic\">z<\/span> score is<\/p>\r\n<p class=\"Equation\"><img class=\"_idGenObjectAttribute-60\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.1-2.png\" alt=\"\" \/><\/p>\r\n<p class=\"Text\">for values from a population and<\/p>\r\n<p class=\"Equation\"><img class=\"_idGenObjectAttribute-61\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.2-2.png\" alt=\"\" \/><\/p>\r\n<p class=\"Text\">for values from a sample.<\/p>\r\n<p class=\"Text\">As you can see, <span class=\"italic\">z<\/span> scores combine information about where the distribution is located (the mean\/center) with how wide the distribution is (the standard deviation\/spread) to interpret a raw score\u00a0(<span class=\"italic\">x<\/span>). Specifically, <span class=\"italic\">z<\/span> scores will tell us how far the score is away from the mean in units of standard deviations and in what direction.<\/p>\r\n<p class=\"Text\">The value of a <span class=\"italic\">z<\/span> score has two parts: the sign (positive or negative) and the magnitude (the actual number). 
The sign of the <span class=\"italic\">z<\/span> score tells you in which half of the distribution the <span class=\"italic\">z<\/span> score falls: a positive sign (or no sign) indicates that the score is above the mean and on the right-hand side or upper end of the distribution, and a negative sign tells you the score is below the mean and on the left-hand side or lower end of the distribution. The magnitude of the number tells you, in units of standard deviations, how far away the score is from the center or mean. A <span class=\"italic\">z<\/span> score can take on any value between negative and positive infinity, but for reasons we will see soon, <span class=\"italic\">z<\/span> scores generally fall between \u22123 and 3.<\/p>\r\n<p class=\"Text\">Let\u2019s look at some examples. A <span class=\"italic\">z<\/span> score value of \u22121.0 tells us that this <span class=\"italic\">z<\/span> score is 1 standard deviation (because of the magnitude 1.0) below (because of the negative sign) the mean. Similarly, a <span class=\"italic\">z<\/span> score value of 1.0 tells us that this <span class=\"italic\">z<\/span> score is 1 standard deviation above the mean. Thus, these two scores are the same distance away from the mean but in opposite directions. A <span class=\"italic\">z<\/span> score of \u22122.5 is two-and-a-half standard deviations below the mean and is therefore farther from the center than both of the previous scores, and a <span class=\"italic\">z<\/span> score of 0.25 is closer than all of the ones before. In <a href=\"https:\/\/pressbooks.palomar.edu\/introtostats\/part\/unit-2-hypothesis-testing\/\"><span class=\"Hyperlink-underscore\">Unit 2<\/span><\/a>, we will learn to formalize the distinction between what we consider \u201cclose to\u201d the center or \u201cfar from\u201d the center. 
For now, we will use a rough cut-off of 1.5 standard deviations in either direction as the difference between close scores (those within 1.5 standard deviations or between <span class=\"italic\">z <\/span>= \u22121.5 and <span class=\"italic\">z <\/span>= 1.5) and extreme scores (those farther than 1.5 standard deviations\u2014below <span class=\"italic\">z <\/span>= \u22121.5 or above <span class=\"italic\">z <\/span>= 1.5).<\/p>\r\n<p class=\"Text\">We can also convert raw scores into <span class=\"italic\">z<\/span> scores to get a better idea of where in the distribution those scores fall. Let\u2019s say we get a score of 68 on an exam. We may be disappointed to have scored so low, but perhaps it was just a very hard exam. Having information about the distribution of all scores in the class would be helpful to put some perspective on ours. We find out that the class got an average score of 54 with a standard deviation of 8. To find out our relative location within this distribution, we simply convert our test score into a <span class=\"italic\">z<\/span> score.<\/p>\r\n<p class=\"Equation\"><img class=\"_idGenObjectAttribute-62\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.3-2.png\" alt=\"\" \/><\/p>\r\n<p class=\"Text\">We find that we are 1.75 standard deviations above the average, above our rough cut-off for close and far. Suddenly our 68 is looking pretty good!<\/p>\r\n<strong data-start=\"1138\" data-end=\"1199\">Social Justice Example (Standardized Testing Inequities):<\/strong><br data-start=\"1199\" data-end=\"1202\" \/>Imagine two students take different standardized exams. A Latinx student scores 480 on the SAT Math section (mean = 511, SD = 120), and a white student scores 21 on the ACT Math section (mean = 20, SD = 5). At first glance, the raw scores suggest different outcomes, but converting them into z scores allows us to compare fairly across distributions. 
The SAT score converts to z = \u22120.26 (slightly below average), while the ACT score converts to z = 0.20 (slightly above average). This illustrates how z scores are crucial when comparing performance across systems that use different metrics\u2014and why it is important to examine how different groups fare relative to their peers, not just in absolute terms.\r\n<p class=\"Text\"><a href=\"#_idTextAnchor133\"><span class=\"Fig-table-number-underscore\">Figure 4.2<\/span><\/a> shows both the raw score and the <span class=\"italic\">z<\/span> score on their respective distributions. Notice that the red line indicating where each score lies is in the same relative spot for both. This is because transforming a raw score into a <span class=\"italic\">z<\/span> score does not change its relative location; it only makes it easier to know precisely where it is.<\/p>\r\n\r\n<div class=\"_idGenObjectLayout-2\">\r\n<div id=\"_idContainer205\" class=\"Side-legend\">\r\n<p class=\"Fig-legend\"><span class=\"Fig-table-number\"><a id=\"_idTextAnchor133\"><\/a>Figure 4.2.<\/span> Raw and standardized versions of a single score. 
<span class=\"Fig-source\">(\u201c<\/span><a href=\"https:\/\/irl.umsl.edu\/oer-img\/52\"><span class=\"Fig-source\"><span class=\"Hyperlink-underscore\">Raw and Standardized Versions of a Score<\/span><\/span><\/a><span class=\"Fig-source\">\u201d by Judy Schmitt is licensed under <\/span><a href=\"https:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/\"><span class=\"Fig-source\"><span class=\"Hyperlink-underscore\">CC BY-NC-SA 4.0<\/span><\/span><\/a><span class=\"Fig-source\">.)<\/span><\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"_idGenObjectLayout-1\">\r\n<div id=\"_idContainer206\" class=\"_idGenObjectStyleOverride-1\"><img class=\"_idGenObjectAttribute-19\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Raw_and_Standardized_Versions_of_a_Score-2.png\" alt=\"\" \/><\/div>\r\n<\/div>\r\n<p class=\"Text\"><span class=\"italic\">z<\/span> Scores are also useful for comparing scores from different distributions. Let\u2019s say we take the SAT and score 501 on both the math and critical reading sections. Does that mean we did equally well on both? Scores on the math portion are distributed normally with a mean of 511 and standard deviation of 120, so our <span class=\"italic\">z<\/span> score on the math section is<\/p>\r\n<p class=\"Equation\"><img class=\"_idGenObjectAttribute-63\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.4-2.png\" alt=\"\" \/><\/p>\r\n<p class=\"Text\">which is just slightly below average (note the use of \u201cmath\u201d as a subscript; subscripts are used when presenting multiple versions of the same statistic in order to know which one is which and have no bearing on the actual calculation). 
The critical reading section has a mean of 495 and standard deviation of 116, so<\/p>\r\n<p class=\"Equation\"><img class=\"_idGenObjectAttribute-64\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.5-2.png\" alt=\"\" \/><\/p>\r\n<p class=\"Text\">So even though we were almost exactly average on both tests, we did a little bit better on the critical reading portion relative to other people.<\/p>\r\n<p class=\"Text\">Finally, <span class=\"italic\">z<\/span> scores are incredibly useful if we need to combine information from different measures that are on different scales. Let\u2019s say we give a set of employees a series of tests on things like job knowledge, personality, and leadership. We may want to combine these into a single score we can use to rate employees for development or promotion, but look what happens when we take the average of raw scores from different scales, as shown in <a href=\"#_idTextAnchor134\"><span class=\"Fig-table-number-underscore\">Table 4.1<\/span><\/a>.<\/p>\r\n\r\n<div class=\"_idGenObjectLayout-1\">\r\n<div id=\"_idContainer209\" class=\"_idGenObjectStyleOverride-1\">\r\n<p class=\"Table-title\"><span class=\"Fig-table-number\"><a id=\"_idTextAnchor134\"><\/a>Table 4.1.<\/span> Raw test scores on different scales (ranges in parentheses).<\/p>\r\n\r\n<table id=\"table027\" class=\"Foster-table\"><colgroup> <col class=\"_idGenTableRowColumn-57\" \/> <col class=\"_idGenTableRowColumn-58\" \/> <col class=\"_idGenTableRowColumn-33\" \/> <col class=\"_idGenTableRowColumn-59\" \/> <col class=\"_idGenTableRowColumn-60\" \/> <\/colgroup>\r\n<thead>\r\n<tr class=\"Foster-table _idGenTableRowColumn-19\">\r\n<td class=\"Foster-table Table-col-hd CellOverride-19\">\r\n<p class=\"Table-col-hd\">Employee<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-col-hd CellOverride-19\">\r\n<p class=\"Table-col-hd ParaOverride-4\">Job Knowledge\r\n(0\u2013100)<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-col-hd 
CellOverride-19\">\r\n<p class=\"Table-col-hd ParaOverride-4\">Personality\r\n(1\u20135)<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-col-hd CellOverride-19\">\r\n<p class=\"Table-col-hd ParaOverride-4\">Leadership\r\n(1\u20135)<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-col-hd\">\r\n<p class=\"Table-col-hd ParaOverride-4\">Average<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr class=\"Foster-table _idGenTableRowColumn-6\">\r\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-1\">\r\n<p class=\"Table-body\">Employee 1<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-1\">\r\n<p class=\"Table-body ParaOverride-4\">98<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-1\">\r\n<p class=\"Table-body ParaOverride-4\">4.2<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-1\">\r\n<p class=\"Table-body ParaOverride-4\">1.1<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body _idGenCellOverride-1\">\r\n<p class=\"Table-body ParaOverride-4\">34.43<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-7\">\r\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-2\">\r\n<p class=\"Table-body\">Employee 2<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-2\">\r\n<p class=\"Table-body ParaOverride-4\">96<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-2\">\r\n<p class=\"Table-body ParaOverride-4\">3.1<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-2\">\r\n<p class=\"Table-body ParaOverride-4\">4.5<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body _idGenCellOverride-2\">\r\n<p class=\"Table-body ParaOverride-4\">34.53<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-11\">\r\n<td class=\"Foster-table Table-body-last Table-body 
CellOverride-19\">\r\n<p class=\"Table-body\">Employee 3<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body-last Table-body CellOverride-19\">\r\n<p class=\"Table-body ParaOverride-4\">97<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body-last Table-body CellOverride-19\">\r\n<p class=\"Table-body ParaOverride-4\">2.9<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body-last Table-body CellOverride-19\">\r\n<p class=\"Table-body ParaOverride-4\">3.6<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body-last Table-body\">\r\n<p class=\"Table-body ParaOverride-4\">34.50<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\n<\/div>\r\n<p class=\"Text\">Because the job knowledge scores were so big and the scores were so similar, they overpowered the other scores and removed almost all variability in the average. However, if we standardize these scores into <span class=\"italic\">z<\/span> scores, our averages retain more variability and it is easier to assess differences between employees, as shown in <a href=\"#_idTextAnchor135\"><span class=\"Fig-table-number-underscore\">Table 4.2<\/span><\/a>.<\/p>\r\n<p class=\"Text\">To convert all these scores into z scores, we find the mean and standard deviation for each category and use our z score formula to convert each raw score into a z score.<\/p>\r\n<p class=\"Text\"><img src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.2-2.png\" alt=\"image\" \/><\/p>\r\nFor example, Employee 1\u2019s raw Job Knowledge score is 98. The mean Job Knowledge score is 97 and the standard deviation is 1, so the z score is (98 \u2212 97)\/1 = 1.00. You can standardize the Job Knowledge score for each employee in the same way, as shown in Table 4.2. You can do the same for the other categories (Personality and Leadership) using the means and standard deviations for those categories. Then, for each employee, average the three z scores to get the Average column. It is now easier to compare overall scores.\r\n<div class=\"_idGenObjectLayout-1\">\r\n<div 
id=\"_idContainer210\" class=\"_idGenObjectStyleOverride-1\">\r\n<p class=\"Table-title\"><span class=\"Fig-table-number\"><a id=\"_idTextAnchor135\"><\/a>Table 4.2.<\/span> Standardized scores.<\/p>\r\n\r\n<table id=\"table028\" class=\"Foster-table\"><colgroup> <col class=\"_idGenTableRowColumn-57\" \/> <col class=\"_idGenTableRowColumn-58\" \/> <col class=\"_idGenTableRowColumn-33\" \/> <col class=\"_idGenTableRowColumn-59\" \/> <col class=\"_idGenTableRowColumn-60\" \/> <\/colgroup>\r\n<thead>\r\n<tr class=\"Foster-table _idGenTableRowColumn-19\">\r\n<td class=\"Foster-table Table-col-hd CellOverride-19\">\r\n<p class=\"Table-col-hd\">Employee<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-col-hd CellOverride-19\">\r\n<p class=\"Table-col-hd ParaOverride-4\">Job Knowledge\r\n(0\u2013100)<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-col-hd CellOverride-19\">\r\n<p class=\"Table-col-hd ParaOverride-4\">Personality\r\n(1\u20135)<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-col-hd CellOverride-19\">\r\n<p class=\"Table-col-hd ParaOverride-4\">Leadership\r\n(1\u20135)<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-col-hd\">\r\n<p class=\"Table-col-hd ParaOverride-4\">Average<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr class=\"Foster-table _idGenTableRowColumn-6\">\r\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-1\">\r\n<p class=\"Table-body\">Employee 1<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-1\">\r\n<p class=\"Table-body\">1.00<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-1\">\r\n<p class=\"Table-body\">1.14<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-1\">\r\n<p class=\"Table-body\">\u22121.12<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body _idGenCellOverride-1\">\r\n<p class=\"Table-body\">0.34<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table 
_idGenTableRowColumn-7\">\r\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-2\">\r\n<p class=\"Table-body\">Employee 2<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-2\">\r\n<p class=\"Table-body\">\u22121.00<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-2\">\r\n<p class=\"Table-body\">\u22120.43<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-2\">\r\n<p class=\"Table-body\">0.81<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body _idGenCellOverride-2\">\r\n<p class=\"Table-body\">\u22120.20<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"Foster-table _idGenTableRowColumn-11\">\r\n<td class=\"Foster-table Table-body-last Table-body CellOverride-19\">\r\n<p class=\"Table-body\">Employee 3<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body-last Table-body CellOverride-19\">\r\n<p class=\"Table-body\">0.00<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body-last Table-body CellOverride-19\">\r\n<p class=\"Table-body\">\u22120.71<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body-last Table-body CellOverride-19\">\r\n<p class=\"Table-body\">0.30<\/p>\r\n<\/td>\r\n<td class=\"Foster-table Table-body-last Table-body\">\r\n<p class=\"Table-body\">\u22120.14<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\n<\/div>\r\n<strong data-start=\"2141\" data-end=\"2198\">Social Justice Example (Pay Equity in the Workplace):<\/strong><br data-start=\"2198\" data-end=\"2201\" \/>Raw salaries often come in very different scales depending on job type (hourly vs. annual salaries, part-time vs. full-time). Simply averaging them can make inequities invisible. Standardizing into z scores lets us meaningfully compare across roles. For example, two workers may each earn $20\/hour, but in one workplace that\u2019s well above the mean (positive z score), while in another it\u2019s below average (negative z score). 
In social justice research, this allows us to identify whether women, people of color, or other marginalized groups are consistently clustered below the mean even when their absolute wages seem similar.\r\n<h4 class=\"H2\">Setting the Scale of a Distribution<\/h4>\r\n<p class=\"Text-1st\">Another convenient characteristic of <span class=\"italic\">z<\/span> scores is that they can be converted into any \u201cscale\u201d that we would like. Here, the term [pb_glossary id=\"651\"]<a id=\"_idTextAnchor136\"><\/a>[\/pb_glossary]<span class=\"key-term\">scale<\/span> means how far apart the scores are (their spread) and where they are located (their central tendency). This can be very useful if we don\u2019t want to work with negative numbers or if we have a specific range we would like to present. The formulas for transforming <span class=\"italic\">z <\/span>to <span class=\"italic\">x<\/span> are:<\/p>\r\n<p class=\"Equation\"><img class=\"_idGenObjectAttribute-65\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.6-2.png\" alt=\"\" \/><\/p>\r\n<p class=\"Text\">for a population and<\/p>\r\n<p class=\"Equation\"><img class=\"_idGenObjectAttribute-66\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.7-2.png\" alt=\"\" \/><\/p>\r\n<p class=\"Text\">for a sample. Notice that these are just simple rearrangements of the original formulas for calculating <span class=\"italic\">z <\/span>from raw scores.<\/p>\r\n<p class=\"Text\">Let\u2019s say we create a new measure of intelligence, and initial calibration finds that our scores have a mean of 40 and standard deviation of 7. Three people who have scores of 52, 43, and 34 want to know how well they did on the measure. 
We can convert their raw scores into <span class=\"italic\">z<\/span> scores:<\/p>\r\n<p class=\"Equation\"><img class=\"_idGenObjectAttribute-67\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.8-2.png\" alt=\"\" \/><\/p>\r\n<p class=\"Equation\"><img class=\"_idGenObjectAttribute-68\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.9-2.png\" alt=\"\" \/><\/p>\r\n<p class=\"Equation\"><img class=\"_idGenObjectAttribute-69\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.10-2.png\" alt=\"\" \/><\/p>\r\n<p class=\"Text\">A problem is that these new <span class=\"italic\">z<\/span> scores aren\u2019t exactly intuitive for many people. We can give people information about their relative location in the distribution (for instance, the first person scored well above average), or we can translate these <span class=\"italic\">z<\/span> scores into the more familiar metric of IQ scores, which have a mean of 100 and standard deviation of 16:<\/p>\r\n<p class=\"Text ParaOverride-4\">IQ = 1.71(16) + 100 = 127.36<\/p>\r\n<p class=\"Text ParaOverride-4\">IQ = 0.43(16) + 100 = 106.88<\/p>\r\n<p class=\"Text ParaOverride-4\">IQ = \u22120.86(16) + 100 = 86.24<\/p>\r\n<p class=\"Text\">We would also likely round these values to 127, 107, and 86, respectively, for convenience.<\/p>\r\n\r\n<h3 class=\"H1\"><span class=\"bold-italic CharOverride-3\">Z<\/span> Scores and the Area under the Curve<\/h3>\r\n<p class=\"Text-1st\"><span class=\"italic\">z<\/span> Scores and the standard normal distribution go hand-in-hand. 
A <span class=\"italic\">z<\/span> score will tell you exactly where in the standard normal distribution a value is located, and any normal distribution can be converted into a standard normal distribution by converting all of the scores in the distribution into <span class=\"italic\">z<\/span> scores, a process known as [pb_glossary id=\"652\"]<a id=\"_idTextAnchor137\"><\/a>[\/pb_glossary]<span class=\"key-term\">standardization<\/span>.<\/p>\r\n<p class=\"Text\">We saw in <a href=\"https:\/\/pressbooks.palomar.edu\/introtostats\/chapter\/chapter-3\/\"><span class=\"Hyperlink-underscore\">Chapter 3<\/span><\/a> that standard deviations can be used to divide the normal distribution: 68% of the distribution falls within 1 standard deviation of the mean, 95% within (roughly) 2 standard deviations, and 99.7% within 3 standard deviations. Because <span class=\"italic\">z<\/span> scores are in units of standard deviations, this means that 68% of scores fall between <span class=\"italic\">z <\/span>= \u22121.0 and <span class=\"italic\">z <\/span>= 1.0 and so on. We call this 68% (or any percentage we have based on our <span class=\"italic\">z<\/span> scores) the proportion of the [pb_glossary id=\"649\"]<a id=\"_idTextAnchor138\"><\/a>[\/pb_glossary]<span class=\"key-term\">area under the curve<\/span>. Any area under the curve is bounded by (defined by, delineated by, etc.) a single <span class=\"italic\">z<\/span> score or pair of <span class=\"italic\">z<\/span> scores.<\/p>\r\n<p class=\"Text\">An important property to point out here is that, because the total area under the curve of a distribution is always equal to 1.0 (see <a href=\"#_idTextAnchor128\"><span class=\"Hyperlink-underscore\">section on Normal Distributions<\/span><\/a> at the beginning of this chapter), these areas under the curve can be added together or subtracted from 1 to find the proportion in other areas. 
For example, we know that the area between <span class=\"italic\">z <\/span>= \u22121.0 and <span class=\"italic\">z <\/span>= 1.0 (i.e., within one standard deviation of the mean) contains 68% of the area under the curve, which can be represented in decimal form as .6800. (To change a percentage to a decimal, simply move the decimal point 2 places to the left.) Because the total area under the curve is equal to 1.0, that means that the proportion of the area outside <span class=\"italic\">z <\/span>= \u22121.0 and <span class=\"italic\">z <\/span>= 1.0 is equal to 1.0 \u2212 .6800 = .3200 or 32% (see <a href=\"#_idTextAnchor140\"><span class=\"Fig-table-number-underscore\">Figure 4.3<\/span><\/a>). This area is called the [pb_glossary id=\"648\"]<a id=\"_idTextAnchor139\"><\/a>[\/pb_glossary]<span class=\"key-term\">area in the tails of the distribution<\/span>. Because this area is split between two tails and because the normal distribution is symmetrical, each tail has exactly one-half of that 32%, or 16% of the area under the curve.<\/p>\r\n\r\n<div class=\"_idGenObjectLayout-2\">\r\n<div id=\"_idContainer216\" class=\"Side-legend\">\r\n<p class=\"Fig-legend\"><span class=\"Fig-table-number\"><a id=\"_idTextAnchor140\"><\/a>Figure 4.3.<\/span> Shaded areas represent the area under the curve in the tails. 
<span class=\"Fig-source\">(\u201c<\/span><a href=\"https:\/\/irl.umsl.edu\/oer-img\/53\"><span class=\"Fig-source\"><span class=\"Hyperlink-underscore\">Area under the Curve in the Tails<\/span><\/span><\/a><span class=\"Fig-source\">\u201d by Judy Schmitt is licensed under <\/span><a href=\"https:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/\"><span class=\"Fig-source\"><span class=\"Hyperlink-underscore\">CC BY-NC-SA 4.0<\/span><\/span><\/a><span class=\"Fig-source\">.)<\/span><\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"_idGenObjectLayout-1\">\r\n<div id=\"_idContainer217\" class=\"_idGenObjectStyleOverride-1\"><img class=\"_idGenObjectAttribute-19\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Area_under_the_Curve_in_the_Tails-2.png\" alt=\"\" \/><\/div>\r\n<\/div>\r\n<strong data-start=\"3013\" data-end=\"3061\">Social Justice Example (Health Disparities):<\/strong><br data-start=\"3061\" data-end=\"3064\" \/>In public health, z scores and the normal curve are often used to identify whether outcomes fall within expected ranges. For instance, birth weights in a healthy population typically follow a normal distribution. Babies born more than 2 standard deviations below the mean are considered low birth weight, a risk factor for health complications. Social justice researchers use these cut-offs to reveal disparities: in many U.S. cities, Black and Indigenous mothers are far more likely to have infants in the \u201ctails\u201d of the distribution, reflecting systemic inequities in access to prenatal care and safe living conditions.\r\n<p class=\"Text\">We will have much more to say about this concept in the coming chapters. As it turns out, this is quite a powerful idea that enables us to make statements about how likely an outcome is and what that means for research questions we would like to answer and hypotheses we would like to test. 
But first, we need to make a brief foray into some ideas about probability in <a href=\"https:\/\/pressbooks.palomar.edu\/introtostats\/chapter\/chapter-5\/\"><span class=\"Hyperlink-underscore\">Chapter 5<\/span><\/a>.<\/p>\r\n\r\n<h3 class=\"H1\">Exercises<\/h3>\r\n<ol>\r\n \t<li class=\"Numbered-list-Exercises-1st\">What are the two pieces of information contained in a <span class=\"italic\">z<\/span> score?<\/li>\r\n \t<li class=\"Numbered-list-Exercises\">A <span class=\"italic\">z<\/span> score takes a raw score and standardizes it into units of ________.<\/li>\r\n \t<li class=\"Numbered-list-Exercises\">Assume the following five scores represent a sample: 2, 3, 5, 5, 6. Transform these scores into <span class=\"italic\">z<\/span>\u00a0scores.<\/li>\r\n \t<li class=\"Numbered-list-Exercises\">True or false:\r\n<ol>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">All normal distributions are symmetrical.<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">All normal distributions have a mean of 1.0.<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">All normal distributions have a standard deviation of 1.0.<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">The total area under the curve of all normal distributions is equal to 1.<\/li>\r\n<\/ol>\r\n<\/li>\r\n \t<li class=\"Numbered-list-Exercises\">Interpret the location, direction, and distance (near or far) of the following <span class=\"italic\">z<\/span> scores:\r\n<ol>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">\u22122.00<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">1.25<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">3.50<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">\u22120.34<\/li>\r\n<\/ol>\r\n<\/li>\r\n \t<li class=\"Numbered-list-Exercises\">Transform the following <span class=\"italic\">z<\/span> scores 
into a distribution with a mean of 10 and standard deviation of\u00a02: \u22121.75, 2.20, 1.65, \u22120.95<\/li>\r\n \t<li class=\"Numbered-list-Exercises\">Calculate <span class=\"italic\">z<\/span> scores for the following raw scores taken from a population with a mean of 100 and standard deviation of 16: 112, 109, 56, 88, 135, 99<\/li>\r\n \t<li class=\"Numbered-list-Exercises\">What does a <span class=\"italic\">z<\/span> score of 0.00 represent?<\/li>\r\n \t<li class=\"Numbered-list-Exercises\">For a distribution with a standard deviation of 20, find <span class=\"italic\">z<\/span> scores that correspond to:\r\n<ol>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">One-half of a standard deviation below the mean<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">5 points above the mean<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">Three standard deviations above the mean<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">22 points below the mean<\/li>\r\n<\/ol>\r\n<\/li>\r\n \t<li class=\"Numbered-list-Exercises\">Calculate the raw score for the following <span class=\"italic\">z<\/span> scores from a distribution with a mean of 15 and standard deviation of 3:\r\n<ol>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">4.0<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">2.2<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">\u22121.3<\/li>\r\n \t<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">0.46<\/li>\r\n<\/ol>\r\n<\/li>\r\n<\/ol>\r\n<div class=\"textbox textbox--learning-objectives\"><header class=\"textbox__header\">\r\n<h3 class=\"H1\">Answers to Odd-Numbered Exercises<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\n1)\r\n\r\nThe location above or below the mean (from the sign of the number) and the distance in standard deviations away from the mean (from the 
magnitude of the number)\r\n\r\n&nbsp;\r\n\r\n3)\r\n\r\n<span class=\"italic\" style=\"font-size: 0.8em;font-weight: lighter\">M<\/span><span style=\"font-size: 0.8em;font-weight: lighter\"> = 4.2, <\/span><span class=\"italic\" style=\"font-size: 0.8em;font-weight: lighter\">s<\/span><span style=\"font-size: 0.8em;font-weight: lighter\"> = 1.64; <\/span><span class=\"italic\" style=\"font-size: 0.8em;font-weight: lighter\">z <\/span><span style=\"font-size: 0.8em;font-weight: lighter\">= \u22121.34, \u22120.73, 0.49, 0.49, 1.10<\/span>\r\n\r\n&nbsp;\r\n\r\n5)\r\n\r\na)\r\n\r\n2 standard deviations below the mean, far\r\n\r\nb)\r\n\r\n1.25 standard deviations above the mean, near\r\n\r\nc)\r\n\r\n3.5 standard deviations above the mean, far\r\n\r\nd)\r\n\r\n0.34 standard deviations below the mean, near\r\n\r\n&nbsp;\r\n\r\n7)\r\n\r\n<span class=\"italic\" style=\"font-size: 0.8em;font-weight: lighter\">z <\/span><span style=\"font-size: 0.8em;font-weight: lighter\">= 0.75, 0.56, \u22122.75, \u22120.75, 2.19, \u22120.06<\/span>\r\n\r\n&nbsp;\r\n\r\n9)\r\n\r\na)\r\n\r\n<span style=\"font-size: 0.8em;font-weight: lighter\">\u22120.50<\/span>\r\n\r\nb)\r\n\r\n<span style=\"font-size: 0.8em;font-weight: lighter\">0.25<\/span>\r\n\r\nc)\r\n\r\n<span style=\"font-size: 0.8em;font-weight: lighter\">3.00<\/span>\r\n\r\nd)\r\n\r\n<span style=\"font-size: 0.8em;font-weight: lighter\">\u22121.10<\/span>\r\n\r\n<\/div>\r\n<\/div>\r\n&nbsp;","rendered":"<div class=\"textbox textbox--sidebar textbox--learning-objectives\">\n<header class=\"textbox__header\">\n<h3 class=\"Chapter-element-head\">Key Terms<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p>&nbsp;<\/p>\n<p class=\"Key-terms\"><a href=\"#_idTextAnchor139\"><span class=\"Hyperlink-underscore\">area in the tails of the distribution<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#_idTextAnchor138\"><span class=\"Hyperlink-underscore\">area under the curve<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a 
href=\"#_idTextAnchor129\"><span class=\"Hyperlink-underscore\">normal distribution<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#_idTextAnchor136\"><span class=\"Hyperlink-underscore\">scale<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#_idTextAnchor137\"><span class=\"Hyperlink-underscore\">standardization<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#_idTextAnchor130\"><span class=\"Hyperlink-underscore\">standard normal distribution<\/span><\/a><\/p>\n<p class=\"Key-terms\"><a href=\"#_idTextAnchor132\"><span class=\"Hyperlink-underscore CharOverride-12\">z<\/span><span class=\"Hyperlink-underscore\"> score<\/span><\/a><\/p>\n<\/div>\n<\/div>\n<p class=\"Text-1st\">We now understand how to describe and present our data visually and numerically. These simple tools, and the principles behind them, will help you interpret information presented to you and understand the basics of a variable. Moving forward, we now turn our attention to how scores within a distribution are related to one another, how to precisely describe a score\u2019s location within the distribution, and how to compare scores from different distributions.<\/p>\n<h3 class=\"H1\"><a id=\"_idTextAnchor128\"><\/a>Normal Distributions<\/h3>\n<p class=\"Text-1st\">The <a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_153_650\"><a id=\"_idTextAnchor129\"><\/a><\/a><span class=\"key-term\">normal distribution<\/span> is the most important and most widely used distribution in statistics. It is sometimes called the \u201cbell curve,\u201d although the tonal qualities of such a bell would be less than pleasing. It is also called the \u201cGaussian curve\u201d or Gaussian distribution after the mathematician Karl Friedrich Gauss.<\/p>\n<p class=\"Text\">Strictly speaking, it is not correct to talk about \u201cthe normal distribution\u201d since there are many normal distributions. 
Normal distributions can differ in their means and in their standard deviations. <a href=\"#_idTextAnchor131\"><span class=\"Fig-table-number-underscore\">Figure 4.1<\/span><\/a> shows three normal distributions. The blue (left-most) distribution has a mean of \u22123 and a standard deviation of 0.5, the distribution in red (the middle distribution) has a mean of 0 and a standard deviation of 1, and the black (right-most) distribution has a mean of 2 and a standard deviation of 3. These as well as all other normal distributions are symmetric with relatively more values at the center of the distribution and relatively few in the tails. What is consistent about all normal distributions is the shape and the proportion of scores within a given distance along the <span class=\"italic\">x<\/span>-axis. We will focus on the <a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_153_653\"><a id=\"_idTextAnchor130\"><\/a><\/a><span class=\"key-term\">standard normal distribution<\/span> (also known as the unit normal distribution), which has a mean of 0 and a standard deviation of 1 (i.e., the red distribution in <a href=\"#_idTextAnchor131\"><span class=\"Fig-table-number-underscore\">Figure 4.1<\/span><\/a>).<\/p>\n<div class=\"_idGenObjectLayout-2\">\n<div id=\"_idContainer199\" class=\"Side-legend\">\n<p class=\"Fig-legend\"><span class=\"Fig-table-number\"><a id=\"_idTextAnchor131\"><\/a>Figure 4.1.<\/span> Normal distributions differing in mean and standard deviation. 
<span class=\"Fig-source\">(\u201c<\/span><a href=\"https:\/\/irl.umsl.edu\/oer-img\/51\"><span class=\"Fig-source\"><span class=\"Hyperlink-underscore\">Normal Distributions with Different Means and Standard Deviations<\/span><\/span><\/a><span class=\"Fig-source\">\u201d by Judy Schmitt is licensed under <\/span><a href=\"https:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/\"><span class=\"Fig-source\"><span class=\"Hyperlink-underscore\">CC BY-NC-SA 4.0<\/span><\/span><\/a><span class=\"Fig-source\">.)<\/span><\/p>\n<\/div>\n<\/div>\n<div class=\"_idGenObjectLayout-1\">\n<div id=\"_idContainer200\" class=\"_idGenObjectStyleOverride-1\"><img decoding=\"async\" class=\"_idGenObjectAttribute-19\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2021\/12\/Normal_Distributions_with_Different_Means_and_Standard_Deviations-2.png\" alt=\"\" \/><\/div>\n<\/div>\n<p class=\"Text\">Seven features of normal distributions are listed below.<\/p>\n<ol>\n<li class=\"Numbered-list-first\">Normal distributions are symmetric around their mean.<\/li>\n<li class=\"Numbered-list ParaOverride-16\">The mean, median, and mode of a normal distribution are equal.<\/li>\n<li class=\"Numbered-list ParaOverride-16\">The area under the normal curve is equal to 1.0.<\/li>\n<li class=\"Numbered-list ParaOverride-16\">Normal distributions are denser in the center and less dense in the tails.<\/li>\n<li class=\"Numbered-list ParaOverride-16\">Normal distributions are defined by two parameters, the mean (<img decoding=\"async\" class=\"_idGenObjectAttribute-31\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn2.14-mu-3.png\" alt=\"mu\" \/>) and the standard deviation (<span class=\"Symbol\">s<\/span>).<\/li>\n<li class=\"Numbered-list ParaOverride-16\">68% of the area of a normal distribution is within one standard deviation of the mean.<\/li>\n<li class=\"Numbered-list ParaOverride-16\">Approximately 95% of the area of a normal 
distribution is within two standard deviations of the mean.<\/li>\n<\/ol>\n<p><strong data-start=\"333\" data-end=\"384\">Social Justice Example (Housing Affordability):<\/strong><br data-start=\"384\" data-end=\"387\" \/>Consider rental housing costs across a large metropolitan area. If the costs follow an approximately normal distribution, most renters will pay amounts close to the average, while very high or very low rents will fall in the tails. But in cities where gentrification and displacement are occurring, the distribution of rents may become skewed rather than normal, with many tenants pushed into unaffordable extremes. Understanding the normal curve helps us recognize when the \u201ctypical\u201d experience no longer reflects people\u2019s lived reality.<\/p>\n<p class=\"Text\">These properties enable us to use the normal distribution to understand how scores relate to one another within and across a distribution. But first, we need to learn how to calculate the standardized score that makes up a standard normal distribution.<\/p>\n<h3 class=\"H1\"><span class=\"bold-italic CharOverride-3\">Z<\/span> Scores<\/h3>\n<p class=\"Text-1st\">A <a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_153_654\"><a id=\"_idTextAnchor132\"><\/a><\/a><span class=\"key-term CharOverride-2\">z<\/span><span class=\"key-term\"> score<\/span> is a standardized version of a raw score (<span class=\"italic\">x<\/span>) that gives information about the relative location of that score within its distribution. 
The formula for converting a raw score into a <span class=\"italic\">z<\/span> score is<\/p>\n<p class=\"Equation\"><img decoding=\"async\" class=\"_idGenObjectAttribute-60\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.1-2.png\" alt=\"\" \/><\/p>\n<p class=\"Text\">for values from a population and<\/p>\n<p class=\"Equation\"><img decoding=\"async\" class=\"_idGenObjectAttribute-61\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.2-2.png\" alt=\"\" \/><\/p>\n<p class=\"Text\">for values from a sample.<\/p>\n<p class=\"Text\">As you can see, <span class=\"italic\">z<\/span> scores combine information about where the distribution is located (the mean\/center) with how wide the distribution is (the standard deviation\/spread) to interpret a raw score\u00a0(<span class=\"italic\">x<\/span>). Specifically, <span class=\"italic\">z<\/span> scores will tell us how far the score is away from the mean in units of standard deviations and in what direction.<\/p>\n<p class=\"Text\">The value of a <span class=\"italic\">z<\/span> score has two parts: the sign (positive or negative) and the magnitude (the actual number). The sign of the <span class=\"italic\">z<\/span> score tells you in which half of the distribution the <span class=\"italic\">z<\/span> score falls: a positive sign (or no sign) indicates that the score is above the mean and on the right-hand side or upper end of the distribution, and a negative sign tells you the score is below the mean and on the left-hand side or lower end of the distribution. The magnitude of the number tells you, in units of standard deviations, how far away the score is from the center or mean. A <span class=\"italic\">z<\/span> score can take on any value between negative and positive infinity, but for reasons we will see soon, <span class=\"italic\">z<\/span> scores generally fall between \u22123 and 3.<\/p>\n<p class=\"Text\">Let\u2019s look at some examples. 
A <span class=\"italic\">z<\/span> score value of \u22121.0 tells us that this <span class=\"italic\">z<\/span> score is 1 standard deviation (because of the magnitude 1.0) below (because of the negative sign) the mean. Similarly, a <span class=\"italic\">z<\/span> score value of 1.0 tells us that this <span class=\"italic\">z<\/span> score is 1 standard deviation above the mean. Thus, these two scores are the same distance away from the mean but in opposite directions. A <span class=\"italic\">z<\/span> score of \u22122.5 is two-and-a-half standard deviations below the mean and is therefore farther from the center than both of the previous scores, and a <span class=\"italic\">z<\/span> score of 0.25 is closer than all of the ones before. In <a href=\"https:\/\/pressbooks.palomar.edu\/introtostats\/part\/unit-2-hypothesis-testing\/\"><span class=\"Hyperlink-underscore\">Unit 2<\/span><\/a>, we will learn to formalize the distinction between what we consider \u201cclose to\u201d the center or \u201cfar from\u201d the center. For now, we will use a rough cut-off of 1.5 standard deviations in either direction as the difference between close scores (those within 1.5 standard deviations or between <span class=\"italic\">z <\/span>= \u22121.5 and <span class=\"italic\">z <\/span>= 1.5) and extreme scores (those farther than 1.5 standard deviations\u2014below <span class=\"italic\">z <\/span>= \u22121.5 or above <span class=\"italic\">z <\/span>= 1.5).<\/p>\n<p class=\"Text\">We can also convert raw scores into <span class=\"italic\">z<\/span> scores to get a better idea of where in the distribution those scores fall. Let\u2019s say we get a score of 68 on an exam. We may be disappointed to have scored so low, but perhaps it was just a very hard exam. Having information about the distribution of all scores in the class would be helpful to put some perspective on ours. We find out that the class got an average score of 54 with a standard deviation of 8. 
To find out our relative location within this distribution, we simply convert our test score into a <span class=\"italic\">z<\/span> score.<\/p>\n<p class=\"Equation\"><img decoding=\"async\" class=\"_idGenObjectAttribute-62\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.3-2.png\" alt=\"\" \/><\/p>\n<p class=\"Text\">We find that we are 1.75 standard deviations above the average, above our rough cut-off for close and far. Suddenly our 68 is looking pretty good!<\/p>\n<p><strong data-start=\"1138\" data-end=\"1199\">Social Justice Example (Standardized Testing Inequities):<\/strong><br data-start=\"1199\" data-end=\"1202\" \/>Imagine two students take different standardized exams. A Latinx student scores 480 on the SAT Math section (mean = 511, SD = 120), and a white student scores 21 on the ACT Math section (mean = 20, SD = 5). At first glance, the raw scores suggest different outcomes, but converting them into z scores allows us to compare fairly across distributions. The SAT score converts to z = \u22120.26 (slightly below average), while the ACT score converts to z = 0.20 (slightly above average). This illustrates how z scores are crucial when comparing performance across systems that use different metrics\u2014and why it is important to examine how different groups fare relative to their peers, not just in absolute terms.<\/p>\n<p class=\"Text\"><a href=\"#_idTextAnchor133\"><span class=\"Fig-table-number-underscore\">Figure 4.2<\/span><\/a> shows both the raw score and the <span class=\"italic\">z<\/span> score on their respective distributions. Notice that the red line indicating where each score lies is in the same relative spot for both. 
This is because transforming a raw score into a <span class=\"italic\">z<\/span> score does not change its relative location; it only makes it easier to know precisely where it is.<\/p>\n<div class=\"_idGenObjectLayout-2\">\n<div id=\"_idContainer205\" class=\"Side-legend\">\n<p class=\"Fig-legend\"><span class=\"Fig-table-number\"><a id=\"_idTextAnchor133\"><\/a>Figure 4.2.<\/span> Raw and standardized versions of a single score. <span class=\"Fig-source\">(\u201c<\/span><a href=\"https:\/\/irl.umsl.edu\/oer-img\/52\"><span class=\"Fig-source\"><span class=\"Hyperlink-underscore\">Raw and Standardized Versions of a Score<\/span><\/span><\/a><span class=\"Fig-source\">\u201d by Judy Schmitt is licensed under <\/span><a href=\"https:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/\"><span class=\"Fig-source\"><span class=\"Hyperlink-underscore\">CC BY-NC-SA 4.0<\/span><\/span><\/a><span class=\"Fig-source\">.)<\/span><\/p>\n<\/div>\n<\/div>\n<div class=\"_idGenObjectLayout-1\">\n<div id=\"_idContainer206\" class=\"_idGenObjectStyleOverride-1\"><img decoding=\"async\" class=\"_idGenObjectAttribute-19\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Raw_and_Standardized_Versions_of_a_Score-2.png\" alt=\"\" \/><\/div>\n<\/div>\n<p class=\"Text\"><span class=\"italic\">z<\/span> Scores are also useful for comparing scores from different distributions. Let\u2019s say we take the SAT and score 501 on both the math and critical reading sections. Does that mean we did equally well on both? 
Scores on the math portion are distributed normally with a mean of 511 and standard deviation of 120, so our <span class=\"italic\">z<\/span> score on the math section is<\/p>\n<p class=\"Equation\"><img decoding=\"async\" class=\"_idGenObjectAttribute-63\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.4-2.png\" alt=\"\" \/><\/p>\n<p class=\"Text\">which is just slightly below average (note the use of \u201cmath\u201d as a subscript; subscripts are used when presenting multiple versions of the same statistic in order to know which one is which and have no bearing on the actual calculation). The critical reading section has a mean of 495 and standard deviation of 116, so<\/p>\n<p class=\"Equation\"><img decoding=\"async\" class=\"_idGenObjectAttribute-64\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.5-2.png\" alt=\"\" \/><\/p>\n<p class=\"Text\">So even though we were almost exactly average on both tests, we did a little bit better on the critical reading portion relative to other people.<\/p>\n<p class=\"Text\">Finally, <span class=\"italic\">z<\/span> scores are incredibly useful if we need to combine information from different measures that are on different scales. Let\u2019s say we give a set of employees a series of tests on things like job knowledge, personality, and leadership. 
We may want to combine these into a single score we can use to rate employees for development or promotion, but look what happens when we take the average of raw scores from different scales, as shown in <a href=\"#_idTextAnchor134\"><span class=\"Fig-table-number-underscore\">Table 4.1<\/span><\/a>.<\/p>\n<div class=\"_idGenObjectLayout-1\">\n<div id=\"_idContainer209\" class=\"_idGenObjectStyleOverride-1\">\n<p class=\"Table-title\"><span class=\"Fig-table-number\"><a id=\"_idTextAnchor134\"><\/a>Table 4.1.<\/span> Raw test scores on different scales (ranges in parentheses).<\/p>\n<table id=\"table027\" class=\"Foster-table\">\n<colgroup>\n<col class=\"_idGenTableRowColumn-57\" \/>\n<col class=\"_idGenTableRowColumn-58\" \/>\n<col class=\"_idGenTableRowColumn-33\" \/>\n<col class=\"_idGenTableRowColumn-59\" \/>\n<col class=\"_idGenTableRowColumn-60\" \/> <\/colgroup>\n<thead>\n<tr class=\"Foster-table _idGenTableRowColumn-19\">\n<td class=\"Foster-table Table-col-hd CellOverride-19\">\n<p class=\"Table-col-hd\">Employee<\/p>\n<\/td>\n<td class=\"Foster-table Table-col-hd CellOverride-19\">\n<p class=\"Table-col-hd ParaOverride-4\">Job Knowledge<br \/>\n(0\u2013100)<\/p>\n<\/td>\n<td class=\"Foster-table Table-col-hd CellOverride-19\">\n<p class=\"Table-col-hd ParaOverride-4\">Personality<br \/>\n(1\u20135)<\/p>\n<\/td>\n<td class=\"Foster-table Table-col-hd CellOverride-19\">\n<p class=\"Table-col-hd ParaOverride-4\">Leadership<br \/>\n(1\u20135)<\/p>\n<\/td>\n<td class=\"Foster-table Table-col-hd\">\n<p class=\"Table-col-hd ParaOverride-4\">Average<\/p>\n<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr class=\"Foster-table _idGenTableRowColumn-6\">\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-1\">\n<p class=\"Table-body\">Employee 1<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-1\">\n<p class=\"Table-body ParaOverride-4\">98<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-19 
_idGenCellOverride-1\">\n<p class=\"Table-body ParaOverride-4\">4.2<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-1\">\n<p class=\"Table-body ParaOverride-4\">1.1<\/p>\n<\/td>\n<td class=\"Foster-table Table-body _idGenCellOverride-1\">\n<p class=\"Table-body ParaOverride-4\">34.43<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-7\">\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-2\">\n<p class=\"Table-body\">Employee 2<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-2\">\n<p class=\"Table-body ParaOverride-4\">96<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-2\">\n<p class=\"Table-body ParaOverride-4\">3.1<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-2\">\n<p class=\"Table-body ParaOverride-4\">4.5<\/p>\n<\/td>\n<td class=\"Foster-table Table-body _idGenCellOverride-2\">\n<p class=\"Table-body ParaOverride-4\">34.53<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-11\">\n<td class=\"Foster-table Table-body-last Table-body CellOverride-19\">\n<p class=\"Table-body\">Employee 3<\/p>\n<\/td>\n<td class=\"Foster-table Table-body-last Table-body CellOverride-19\">\n<p class=\"Table-body ParaOverride-4\">97<\/p>\n<\/td>\n<td class=\"Foster-table Table-body-last Table-body CellOverride-19\">\n<p class=\"Table-body ParaOverride-4\">2.9<\/p>\n<\/td>\n<td class=\"Foster-table Table-body-last Table-body CellOverride-19\">\n<p class=\"Table-body ParaOverride-4\">3.6<\/p>\n<\/td>\n<td class=\"Foster-table Table-body-last Table-body\">\n<p class=\"Table-body ParaOverride-4\">34.50<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<\/div>\n<p class=\"Text\">Because the job knowledge scores were so big and the scores were so similar, they overpowered the other scores and removed almost all variability in the average. 
However, if we standardize these scores into <span class=\"italic\">z<\/span> scores, our averages retain more variability and it is easier to assess differences between employees, as shown in <a href=\"#_idTextAnchor135\"><span class=\"Fig-table-number-underscore\">Table 4.2<\/span><\/a>.<\/p>\n<p class=\"Text\">To convert all these scores into z scores, we simply find the mean and standard deviation for each category and use our z score formula to convert each raw score.<\/p>\n<p class=\"Text\"><img decoding=\"async\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.2-2.png\" alt=\"image\" \/><\/p>\n<p>Employee 1\u2019s raw job knowledge score is 98. The mean job knowledge score is 97 and the standard deviation is 1, so the z score is (98 \u2212 97)\/1 = 1.00. You can standardize the score for each employee in the same way, as shown in Table 4.2, and do the same for the other categories (Personality and Leadership) using the means and standard deviations for those categories. Then average each employee\u2019s three z scores to get the Average column. It is now easier to compare overall scores.<\/p>\n<div class=\"_idGenObjectLayout-1\">\n<div id=\"_idContainer210\" class=\"_idGenObjectStyleOverride-1\">\n<p class=\"Table-title\"><span class=\"Fig-table-number\"><a id=\"_idTextAnchor135\"><\/a>Table 4.2.<\/span> Standardized scores.<\/p>\n<table id=\"table028\" class=\"Foster-table\">\n<colgroup>\n<col class=\"_idGenTableRowColumn-57\" \/>\n<col class=\"_idGenTableRowColumn-58\" \/>\n<col class=\"_idGenTableRowColumn-33\" \/>\n<col class=\"_idGenTableRowColumn-59\" \/>\n<col class=\"_idGenTableRowColumn-60\" \/> <\/colgroup>\n<thead>\n<tr class=\"Foster-table _idGenTableRowColumn-19\">\n<td class=\"Foster-table Table-col-hd CellOverride-19\">\n<p class=\"Table-col-hd\">Employee<\/p>\n<\/td>\n<td class=\"Foster-table Table-col-hd CellOverride-19\">\n<p class=\"Table-col-hd ParaOverride-4\">Job Knowledge<br \/>\n(0\u2013100)<\/p>\n<\/td>\n<td class=\"Foster-table 
Table-col-hd CellOverride-19\">\n<p class=\"Table-col-hd ParaOverride-4\">Personality<br \/>\n(1\u20135)<\/p>\n<\/td>\n<td class=\"Foster-table Table-col-hd CellOverride-19\">\n<p class=\"Table-col-hd ParaOverride-4\">Leadership<br \/>\n(1\u20135)<\/p>\n<\/td>\n<td class=\"Foster-table Table-col-hd\">\n<p class=\"Table-col-hd ParaOverride-4\">Average<\/p>\n<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr class=\"Foster-table _idGenTableRowColumn-6\">\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-1\">\n<p class=\"Table-body\">Employee 1<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-1\">\n<p class=\"Table-body\">1.00<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-1\">\n<p class=\"Table-body\">1.14<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-1\">\n<p class=\"Table-body\">\u22121.12<\/p>\n<\/td>\n<td class=\"Foster-table Table-body _idGenCellOverride-1\">\n<p class=\"Table-body\">0.34<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-7\">\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-2\">\n<p class=\"Table-body\">Employee 2<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-2\">\n<p class=\"Table-body\">\u22121.00<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-2\">\n<p class=\"Table-body\">\u22120.43<\/p>\n<\/td>\n<td class=\"Foster-table Table-body CellOverride-19 _idGenCellOverride-2\">\n<p class=\"Table-body\">0.81<\/p>\n<\/td>\n<td class=\"Foster-table Table-body _idGenCellOverride-2\">\n<p class=\"Table-body\">\u22120.20<\/p>\n<\/td>\n<\/tr>\n<tr class=\"Foster-table _idGenTableRowColumn-11\">\n<td class=\"Foster-table Table-body-last Table-body CellOverride-19\">\n<p class=\"Table-body\">Employee 3<\/p>\n<\/td>\n<td class=\"Foster-table Table-body-last Table-body CellOverride-19\">\n<p 
class=\"Table-body\">0.00<\/p>\n<\/td>\n<td class=\"Foster-table Table-body-last Table-body CellOverride-19\">\n<p class=\"Table-body\">\u22120.71<\/p>\n<\/td>\n<td class=\"Foster-table Table-body-last Table-body CellOverride-19\">\n<p class=\"Table-body\">0.30<\/p>\n<\/td>\n<td class=\"Foster-table Table-body-last Table-body\">\n<p class=\"Table-body\">\u22120.14<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<\/div>\n<p><strong data-start=\"2141\" data-end=\"2198\">Social Justice Example (Pay Equity in the Workplace):<\/strong><br data-start=\"2198\" data-end=\"2201\" \/>Raw salaries often come in very different scales depending on job type (hourly vs. annual salaries, part-time vs. full-time). Simply averaging them can make inequities invisible. Standardizing into z scores lets us meaningfully compare across roles. For example, two workers may each earn $20\/hour, but in one workplace that\u2019s well above the mean (positive z score), while in another it\u2019s below average (negative z score). In social justice research, this allows us to identify whether women, people of color, or other marginalized groups are consistently clustered below the mean even when their absolute wages seem similar.<\/p>\n<h4 class=\"H2\">Setting the Scale of a Distribution<\/h4>\n<p class=\"Text-1st\">Another convenient characteristic of <span class=\"italic\">z<\/span> scores is that they can be converted into any \u201cscale\u201d that we would like. Here, the term <a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_153_651\"><a id=\"_idTextAnchor136\"><\/a><\/a><span class=\"key-term\">scale<\/span> means how far apart the scores are (their spread) and where they are located (their central tendency). This can be very useful if we don\u2019t want to work with negative numbers or if we have a specific range we would like to present. 
The formulas for transforming <span class=\"italic\">z <\/span>to <span class=\"italic\">x<\/span> are:<\/p>\n<p class=\"Equation\"><img decoding=\"async\" class=\"_idGenObjectAttribute-65\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.6-2.png\" alt=\"\" \/><\/p>\n<p class=\"Text\">for a population and<\/p>\n<p class=\"Equation\"><img decoding=\"async\" class=\"_idGenObjectAttribute-66\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.7-2.png\" alt=\"\" \/><\/p>\n<p class=\"Text\">for a sample. Notice that these are just simple rearrangements of the original formulas for calculating <span class=\"italic\">z <\/span>from raw scores.<\/p>\n<p class=\"Text\">Let\u2019s say we create a new measure of intelligence, and initial calibration finds that our scores have a mean of 40 and standard deviation of 7. Three people who have scores of 52, 43, and 34 want to know how well they did on the measure. We can convert their raw scores into <span class=\"italic\">z<\/span> scores:<\/p>\n<p class=\"Equation\"><img decoding=\"async\" class=\"_idGenObjectAttribute-67\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.8-2.png\" alt=\"\" \/><\/p>\n<p class=\"Equation\"><img decoding=\"async\" class=\"_idGenObjectAttribute-68\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.9-2.png\" alt=\"\" \/><\/p>\n<p class=\"Equation\"><img decoding=\"async\" class=\"_idGenObjectAttribute-69\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Eqn4.10-2.png\" alt=\"\" \/><\/p>\n<p class=\"Text\">A problem is that these new <span class=\"italic\">z<\/span> scores aren\u2019t exactly intuitive for many people. 
We can give people information about their relative location in the distribution (for instance, the first person scored well above average), or we can translate these <span class=\"italic\">z<\/span> scores into the more familiar metric of IQ scores, which have a mean of 100 and standard deviation of 16:<\/p>\n<p class=\"Text ParaOverride-4\">IQ = 1.71(16) + 100 = 127.36<\/p>\n<p class=\"Text ParaOverride-4\">IQ = 0.43(16) + 100 = 106.88<\/p>\n<p class=\"Text ParaOverride-4\">IQ = \u22120.86(16) + 100 = 86.24<\/p>\n<p class=\"Text\">We would also likely round these values to 127, 107, and 86, respectively, for convenience.<\/p>\n<h3 class=\"H1\"><span class=\"bold-italic CharOverride-3\">Z<\/span> Scores and the Area under the Curve<\/h3>\n<p class=\"Text-1st\"><span class=\"italic\">z<\/span> Scores and the standard normal distribution go hand in hand. A <span class=\"italic\">z<\/span> score will tell you exactly where in the standard normal distribution a value is located, and any normal distribution can be converted into a standard normal distribution by converting all of the scores in the distribution into <span class=\"italic\">z<\/span> scores, a process known as <a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_153_652\"><a id=\"_idTextAnchor137\"><\/a><\/a><span class=\"key-term\">standardization<\/span>.<\/p>\n<p class=\"Text\">We saw in <a href=\"https:\/\/pressbooks.palomar.edu\/introtostats\/chapter\/chapter-3\/\"><span class=\"Hyperlink-underscore\">Chapter <\/span><\/a><span class=\"Hyperlink-underscore\">3<\/span> that standard deviations can be used to divide the normal distribution: 68% of the distribution falls within 1 standard deviation of the mean, 95% within (roughly) 2 standard deviations, and 99.7% within 3 standard deviations. 
Because <span class=\"italic\">z<\/span> scores are in units of standard deviations, this means that 68% of scores fall between <span class=\"italic\">z <\/span>= \u22121.0 and <span class=\"italic\">z <\/span>= 1.0 and so on. We call this 68% (or any percentage we have based on our <span class=\"italic\">z<\/span> scores) the proportion of the <a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_153_649\"><a id=\"_idTextAnchor138\"><\/a><\/a><span class=\"key-term\">area under the curve<\/span>. Any area under the curve is bounded (defined, delineated, etc.) by a single <span class=\"italic\">z<\/span> score or pair of <span class=\"italic\">z<\/span> scores.<\/p>\n<p class=\"Text\">An important property to point out here is that, because the total area under the curve of a distribution is always equal to 1.0 (see <a href=\"#_idTextAnchor128\"><span class=\"Hyperlink-underscore\">section on Normal Distributions<\/span><\/a> at the beginning of this chapter), these areas under the curve can be added together or subtracted from 1 to find the proportion in other areas. For example, we know that the area between <span class=\"italic\">z <\/span>= \u22121.0 and <span class=\"italic\">z <\/span>= 1.0 (i.e., within one standard deviation of the mean) contains 68% of the area under the curve, which can be represented in decimal form as .6800. (To change a percentage to a decimal, simply move the decimal point 2 places to the left.) Because the total area under the curve is equal to 1.0, the proportion of the area outside the range from <span class=\"italic\">z <\/span>= \u22121.0 to <span class=\"italic\">z <\/span>= 1.0 is equal to 1.0 \u2212 .6800 = .3200 or 32% (see <a href=\"#_idTextAnchor140\"><span class=\"Fig-table-number-underscore\">Figure 4.3<\/span><\/a>). 
This area is called the <a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_153_648\"><a id=\"_idTextAnchor139\"><\/a><\/a><span class=\"key-term\">area in the tails of the distribution<\/span>. Because this area is split between two tails and because the normal distribution is symmetrical, each tail contains exactly one-half of that 32%, or 16% of the total area under the curve.<\/p>\n<div class=\"_idGenObjectLayout-2\">\n<div id=\"_idContainer216\" class=\"Side-legend\">\n<p class=\"Fig-legend\"><span class=\"Fig-table-number\"><a id=\"_idTextAnchor140\"><\/a>Figure 4.3.<\/span> Shaded areas represent the area under the curve in the tails. <span class=\"Fig-source\">(\u201c<\/span><a href=\"https:\/\/irl.umsl.edu\/oer-img\/53\"><span class=\"Fig-source\"><span class=\"Hyperlink-underscore\">Area under the Curve in the Tails<\/span><\/span><\/a><span class=\"Fig-source\">\u201d by Judy Schmitt is licensed under <\/span><a href=\"https:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/\"><span class=\"Fig-source\"><span class=\"Hyperlink-underscore\">CC BY-NC-SA 4.0<\/span><\/span><\/a><span class=\"Fig-source\">.)<\/span><\/p>\n<\/div>\n<\/div>\n<div class=\"_idGenObjectLayout-1\">\n<div id=\"_idContainer217\" class=\"_idGenObjectStyleOverride-1\"><img decoding=\"async\" class=\"_idGenObjectAttribute-19\" src=\"https:\/\/pressbooks.palomar.edu\/wp-content\/uploads\/sites\/8\/2024\/10\/Area_under_the_Curve_in_the_Tails-2.png\" alt=\"\" \/><\/div>\n<\/div>\n<p><strong data-start=\"3013\" data-end=\"3061\">Social Justice Example (Health Disparities):<\/strong><br data-start=\"3061\" data-end=\"3064\" \/>In public health, <span class=\"italic\">z<\/span> scores and the normal curve are often used to identify whether outcomes fall within expected ranges. For instance, birth weights in a healthy population typically follow an approximately normal distribution. Babies whose weight falls more than 2 standard deviations below the mean can be flagged as unusually low in birth weight, a risk factor for health complications. 
Social justice researchers use these cut-offs to reveal disparities: in many U.S. cities, Black and Indigenous mothers are far more likely to have infants in the \u201ctails\u201d of the distribution, reflecting systemic inequities in access to prenatal care and safe living conditions.<\/p>\n<p class=\"Text\">We will have much more to say about the area under the curve in the coming chapters. As it turns out, this is quite a powerful idea that enables us to make statements about how likely an outcome is and what that means for research questions we would like to answer and hypotheses we would like to test. But first, we need to make a brief foray into some ideas about probability in <a href=\"https:\/\/pressbooks.palomar.edu\/introtostats\/chapter\/chapter-5\/\"><span class=\"Hyperlink-underscore\">Chapter 5<\/span><\/a>.<\/p>\n<h3 class=\"H1\">Exercises<\/h3>\n<ol>\n<li class=\"Numbered-list-Exercises-1st\">What are the two pieces of information contained in a <span class=\"italic\">z<\/span> score?<\/li>\n<li class=\"Numbered-list-Exercises\">A <span class=\"italic\">z<\/span> score takes a raw score and standardizes it into units of ________.<\/li>\n<li class=\"Numbered-list-Exercises\">Assume the following five scores represent a sample: 2, 3, 5, 5, 6. 
Transform these scores into <span class=\"italic\">z<\/span>\u00a0scores.<\/li>\n<li class=\"Numbered-list-Exercises\">True or false:\n<ol>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">All normal distributions are symmetrical.<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">All normal distributions have a mean of 1.0.<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">All normal distributions have a standard deviation of 1.0.<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">The total area under the curve of all normal distributions is equal to 1.<\/li>\n<\/ol>\n<\/li>\n<li class=\"Numbered-list-Exercises\">Interpret the location, direction, and distance (near or far) of the following <span class=\"italic\">z<\/span> scores:\n<ol>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">\u22122.00<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">1.25<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">3.50<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">\u22120.34<\/li>\n<\/ol>\n<\/li>\n<li class=\"Numbered-list-Exercises\">Transform the following <span class=\"italic\">z<\/span> scores into a distribution with a mean of 10 and standard deviation of\u00a02: \u22121.75, 2.20, 1.65, \u22120.95<\/li>\n<li class=\"Numbered-list-Exercises\">Calculate <span class=\"italic\">z<\/span> scores for the following raw scores taken from a population with a mean of 100 and standard deviation of 16: 112, 109, 56, 88, 135, 99<\/li>\n<li class=\"Numbered-list-Exercises\">What does a <span class=\"italic\">z<\/span> score of 0.00 represent?<\/li>\n<li class=\"Numbered-list-Exercises\">For a distribution with a standard deviation of 20, find <span class=\"italic\">z<\/span> scores that correspond to:\n<ol>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">One-half of a standard deviation below the mean<\/li>\n<li 
class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">5 points above the mean<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">Three standard deviations above the mean<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">22 points below the mean<\/li>\n<\/ol>\n<\/li>\n<li class=\"Numbered-list-Exercises\">Calculate the raw score for the following <span class=\"italic\">z<\/span> scores from a distribution with a mean of 15 and standard deviation of 3:\n<ol>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">4.0<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">2.2<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">\u22121.3<\/li>\n<li class=\"Numbered-list-Exercises-sub _idGenParaOverride-1\">0.46<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<div class=\"textbox textbox--learning-objectives\">\n<header class=\"textbox__header\">\n<h3 class=\"H1\">Answers to Odd-Numbered Exercises<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p>1)<\/p>\n<p>The location above or below the mean (from the sign of the number) and the distance in standard deviations away from the mean (from the magnitude of the number)<\/p>\n<p>&nbsp;<\/p>\n<p>3)<\/p>\n<p><span class=\"italic\" style=\"font-size: 0.8em;font-weight: lighter\">M<\/span><span style=\"font-size: 0.8em;font-weight: lighter\"> = 4.2, <\/span><span class=\"italic\" style=\"font-size: 0.8em;font-weight: lighter\">s<\/span><span style=\"font-size: 0.8em;font-weight: lighter\"> = 1.64; <\/span><span class=\"italic\" style=\"font-size: 0.8em;font-weight: lighter\">z <\/span><span style=\"font-size: 0.8em;font-weight: lighter\">= \u22121.34, \u22120.73, 0.49, 0.49, 1.10<\/span><\/p>\n<p>&nbsp;<\/p>\n<p>5)<\/p>\n<p>a)<\/p>\n<p>2 standard deviations below the mean, far<\/p>\n<p>b)<\/p>\n<p>1.25 standard deviations above the mean, near<\/p>\n<p>c)<\/p>\n<p>3.5 standard deviations above the mean, far<\/p>\n<p>d)<\/p>\n<p>0.34 standard deviations 
below the mean, near<\/p>\n<p>&nbsp;<\/p>\n<p>7)<\/p>\n<p><span class=\"italic\" style=\"font-size: 0.8em;font-weight: lighter\">z <\/span><span style=\"font-size: 0.8em;font-weight: lighter\">= 0.75, 0.56, \u22122.75, \u22120.75, 2.19, \u22120.06<\/span><\/p>\n<p>&nbsp;<\/p>\n<p>9)<\/p>\n<p>a)<\/p>\n<p><span style=\"font-size: 0.8em;font-weight: lighter\">\u22120.50<\/span><\/p>\n<p>b)<\/p>\n<p><span style=\"font-size: 0.8em;font-weight: lighter\">0.25<\/span><\/p>\n<p>c)<\/p>\n<p><span style=\"font-size: 0.8em;font-weight: lighter\">3.00<\/span><\/p>\n<p>d)<\/p>\n<p><span style=\"font-size: 0.8em;font-weight: lighter\">\u22121.10<\/span><\/p>\n<\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<div class=\"glossary\"><span class=\"screen-reader-text\" id=\"definition\">definition<\/span><template id=\"term_153_650\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_153_650\"><div tabindex=\"-1\"><\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_153_653\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_153_653\"><div tabindex=\"-1\"><\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_153_654\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_153_654\"><div tabindex=\"-1\"><\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_153_651\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_153_651\"><div tabindex=\"-1\"><\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_153_652\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_153_652\"><div 
tabindex=\"-1\"><\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_153_649\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_153_649\"><div tabindex=\"-1\"><\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_153_648\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_153_648\"><div tabindex=\"-1\"><\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><\/div>","protected":false},"author":7,"menu_order":4,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-153","chapter","type-chapter","status-publish","hentry"],"part":21,"_links":{"self":[{"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/chapters\/153","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/wp\/v2\/users\/7"}],"version-history":[{"count":8,"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/chapters\/153\/revisions"}],"predecessor-version":[{"id":969,"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/chapters\/153\/revisions\/969"}],"part":[{"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/parts\/21"}],"metadata":[{"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/chapters\/153\/metadata\/"}]
,"wp:attachment":[{"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/wp\/v2\/media?parent=153"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/pressbooks\/v2\/chapter-type?post=153"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/wp\/v2\/contributor?post=153"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.palomar.edu\/introtostats\/wp-json\/wp\/v2\/license?post=153"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}