Suppose you are a student and you took 100 tests in the current academic year. In the first test you scored 50%, in the next 75%, in the next 33%, and so on. Suppose we plot these scores on a chart as shown below.
Here each blue line represents one test score. The higher the line, the higher the score. You can create a similar chart in MS Excel. The data for the test scores is attached to this lecture. Just select the "scores" column, go to Insert Chart, and select a column chart.
Now, let's talk about a different type of chart.
This is a histogram. On the horizontal axis, each column represents an interval of test scores. On the vertical axis, the height of the column represents the number of times (out of 100) a score in that interval was achieved. For instance, a test score between 1 and 24 was achieved 30 times out of 100 tests.
You can create this chart by selecting the data and then inserting a "histogram" chart from the charts option.
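The counting that a histogram performs can also be sketched in a few lines of Python. This is only an illustration: the ten scores and the interval edges below are made-up values, not the data attached to the lecture, and `histogram_counts` is a hypothetical helper name.

```python
from bisect import bisect_right

def histogram_counts(scores, edges):
    """Count how many scores fall in each interval [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for s in scores:
        # Scores outside the overall range are ignored.
        if edges[0] <= s < edges[-1]:
            counts[bisect_right(edges, s) - 1] += 1
    return counts

# Hypothetical scores (assumption, for illustration only).
scores = [50, 75, 33, 12, 90, 24, 61, 48, 5, 77]
# Intervals 1-24, 25-49, 50-74, 75-100 for whole-number scores.
edges = [1, 25, 50, 75, 101]

counts = histogram_counts(scores, edges)
for lo, hi, n in zip(edges, edges[1:], counts):
    print(f"{lo}-{hi - 1}: {n}")
```

Each printed count corresponds to the height of one column in the histogram.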
The reason we are discussing this second type of chart is that it shows the frequency of scores within a given range. In mathematics, this frequency is closely related to "probability". For instance, we can say that you scored between 1 and 24 marks more frequently than in the other intervals. The chart also tells us about the distribution of the data.
For instance, consider the following chart:
Here too, the blue columns represent frequency. As you can see, the central value (the middle value on the horizontal axis) has the highest frequency, and the shape is symmetric. We call this a "normal distribution". The central value on the horizontal axis, about which the shape is symmetric, is the mean of the data.
We can replace "frequency" with "probability". For instance, suppose that in your 100 tests you scored 50 marks in 5 tests. The frequency is 5. The probability of scoring 50 marks in a test is the frequency divided by the total number of tests taken, i.e. 5/100 = 0.05.
If we add up the probabilities of all the possible scores, the total should be 1. Why?
To understand this, consider a simpler data set. Suppose you scored 50 marks in 5 tests, 40 marks in 60 tests, and 70 marks in the remaining 35 tests. The probability of scoring 50 marks is 5/100 = 0.05. Similarly, the probability of 40 marks is 60/100 = 0.6 and that of 70 marks is 35/100 = 0.35. The frequencies add up to the total number of tests (5 + 60 + 35 = 100), so the probabilities add up to 0.05 + 0.6 + 0.35 = 1.
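The same arithmetic can be checked with a short Python sketch. It uses the frequencies from the example above; the variable names are only illustrative, and `Fraction` is used so that the sum is exact rather than subject to floating-point rounding.

```python
from fractions import Fraction

# Frequency table from the example: marks -> number of tests.
frequency = {50: 5, 40: 60, 70: 35}
total_tests = sum(frequency.values())  # 5 + 60 + 35 = 100

# Probability of each score = frequency / total number of tests.
probability = {
    marks: Fraction(count, total_tests)
    for marks, count in frequency.items()
}
total_probability = sum(probability.values())

for marks, p in probability.items():
    print(marks, float(p))
print("total:", float(total_probability))
```

Because every test falls into exactly one of the score groups, dividing each frequency by the total and adding the results always gives 1.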
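In the same way, the probabilities of all possible values in a normal distribution must add up to 1. Thus, the total area under a normal distribution curve is 1, because the area represents probability of occurrence.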
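As a rough numerical check, the area under a normal curve can be approximated by summing thin trapezoids under it. The sketch below is an illustration under assumed values: the mean of 50 and standard deviation of 15 are hypothetical, and `normal_pdf` and `trapezoid_area` are helper names introduced here, not part of the lecture.

```python
import math

def normal_pdf(x, mean, std):
    """Height of the normal distribution curve at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))

def trapezoid_area(f, lo, hi, steps):
    """Approximate the area under f between lo and hi with the trapezoid rule."""
    width = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * width)
    return total * width

# Hypothetical mean and standard deviation (assumptions for illustration).
mean, std = 50.0, 15.0

# Integrate far enough on both sides that the remaining tails are negligible.
area = trapezoid_area(
    lambda x: normal_pdf(x, mean, std),
    mean - 8 * std,
    mean + 8 * std,
    10_000,
)
print(round(area, 6))
```

Changing the assumed mean or standard deviation moves or stretches the curve, but the computed area stays very close to 1.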