Lecture No: 1

Date: 21/05/2013

Introduction to Statistics

By Dr. V. Sutahar
Assistant Professor, Department of Statistics
Sindh Agriculture University, Tandojam
E-mail: vsutahar@yahoo.co.uk

The Nature of Statistics

The word “statistics” first appeared in the English language in 1787.

“Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.” ~ H. G. Wells

Learning Objectives

- Understand and be able to distinguish the different meanings and uses of the word ‘statistics’.
- Be able to describe the nature of statistics as a scientific discipline.
- Be aware that statistics can be presented and used in misleading ways, whether unintentionally or intentionally.
- Appreciate the importance of statistics in science and in society in general.
- Appreciate the need for statistics in a business and finance environment.

Goals

After completing this note, you should be able to:
- Explain key definitions:
    Population vs. Sample
    Primary vs. Secondary Data
    Parameter vs. Statistic
    Descriptive vs. Inferential Statistics
- Describe key data collection methods
- Describe different sampling methods (Probability Samples vs. Nonprobability Samples)
- Select a random sample using a random numbers table
- Identify types of data and levels of measurement
- Describe the different types of survey error

Meaning of Statistics

The word ‘statistics’ has a few different (but related) meanings and is used in different ways depending on the context. Most people have heard of statistics in the context of football, cricket or political opinion polls.

What to Do in Statistics?

1. Collecting data, e.g., surveys
2. Presenting data, e.g., charts and tables
3. Characterizing data, e.g., averages

Why? Data analysis, leading to decision-making.

Key Terms

1. Population (Universe): all items of interest
2. Sample: a portion of the population
3. Parameter: a summary measure about the population
4. Statistic: a summary measure about the sample

Why Do We Need to Collect Data?
- To provide input to a survey
- To provide input to a research study
- To measure the performance of a service or production process
- To evaluate conformance to standards
- To assist in formulating alternative courses of action
- To satisfy curiosity
- Knowledge for the sake of knowledge

Data and Data Sets

Data are the facts and figures collected, summarized, analyzed, and interpreted. The data collected in a particular study are referred to as the data set.

Elements, Variables, and Observations

The elements are the entities on which data are collected. A variable is a characteristic of interest for the elements. The set of measurements collected for a particular element is called an observation. The total number of data values in a data set is the number of elements multiplied by the number of variables (assuming every variable is recorded for every element).
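The relationship between elements, variables, and observations can be illustrated with a small sketch. The students, variable names, and values below are hypothetical and are not taken from the lecture; they only show how the count of data values follows from the definitions.

```python
# Hypothetical data set (illustrative values, not from the lecture).
# Elements: three students. Variables: age, height_cm, major.
data_set = {
    "Student A": {"age": 20, "height_cm": 165, "major": "Statistics"},
    "Student B": {"age": 22, "height_cm": 172, "major": "Agronomy"},
    "Student C": {"age": 21, "height_cm": 158, "major": "Economics"},
}

elements = list(data_set)                # entities on which data are collected
variables = list(data_set["Student A"])  # characteristics of interest
observation = data_set["Student B"]      # all measurements for one element

# Count the data values directly; with no missing values this
# equals (number of elements) x (number of variables).
total_values = sum(len(obs) for obs in data_set.values())

print(len(elements), "elements x", len(variables), "variables =",
      total_values, "data values")
```

Here each inner dictionary is one observation, so the data set holds 3 x 3 = 9 data values.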

Lecture No: 2

Date: 01/06/2013

FREQUENCY DISTRIBUTION

OBJECTIVES:

- Acquire knowledge of the basic concepts of a frequency distribution table: range, class width, class limits, class boundaries, and class marks.
- Identify the class size, class marks, class boundaries, and class limits for a given frequency distribution table.
- Construct a frequency distribution table.

RECALL

Classify the following as discrete or continuous data:
- Shoe sizes
- Actual lengths of feet
- Number of students in AC high school
- Male teachers in AC
- Temperature of the room
- Among campus vending machines, 14 are found to be defective.
- Today's records show that 5 students were absent.
- The car weighs 1430 kilograms.
- Among all SAT scores last year, 23 were perfect.
- Radar indicated that the driver was going 72.4 mph.

Essential Questions: What is a frequency distribution table? What are the basic concepts needed in constructing a frequency distribution table?

A frequency distribution table lists categories of scores along with their corresponding frequencies.

The frequency for a particular category or class is the number of original scores that fall into that class.

The classes or categories refer to the groupings of a frequency table.

The range is the difference between the highest value and the lowest value.
R = highest value – lowest value

The class width is the difference between two consecutive lower class limits or two consecutive lower class boundaries.

The class limits are the smallest and the largest numbers that can actually belong to the different classes. Lower class limits are the smallest numbers that can actually belong to the different classes. Upper class limits are the largest numbers that can actually belong to the different classes.

The class boundaries are obtained by increasing the upper class limits and decreasing the lower class limits by the same amount so that there are no gaps between consecutive classes. The amount to be added or subtracted is ½ the difference between the upper limit of one class and the lower limit of the following class.
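As a minimal sketch of the class width and class boundary definitions above, the snippet below uses three hypothetical classes of integer data (the limits are chosen only for illustration) and applies the "half the gap" rule.

```python
# Hypothetical class limits (lower, upper) for integer data; illustration only.
classes = [(16, 24), (25, 33), (34, 42)]

# Class width: difference between two consecutive lower class limits.
class_width = classes[1][0] - classes[0][0]  # 25 - 16

# Gap between the upper limit of one class and the lower limit of the next.
gap = classes[1][0] - classes[0][1]          # 25 - 24
adjust = gap / 2                             # amount added or subtracted

# Class boundaries: lower limits decreased and upper limits increased by `adjust`.
boundaries = [(lo - adjust, hi + adjust) for lo, hi in classes]

for (lo, hi), (lb, ub) in zip(classes, boundaries):
    print(f"limits {lo}-{hi}  boundaries {lb}-{ub}")
```

With these limits the adjustment is 0.5, so the upper boundary of one class coincides with the lower boundary of the next and no gaps remain.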
Class marks are the midpoints of the classes: class mark = (lower class limit + upper class limit) / 2.

Essential Question: How do we construct a frequency distribution table?

Process of Constructing a Frequency Table

STEP 1. Determine the range.
R = Highest Value – Lowest Value

STEP 2. Determine the tentative number of classes (k).
k = 1 + 3.322 log N
where log is the base-10 logarithm and N is the number of observations. Always round up to the next integer.
Note: The number of classes should be between 5 and 20. The actual number of classes may be affected by convenience or other subjective factors.

STEP 3. Find the class width by dividing the range by the number of classes. (Always round up.)

STEP 4. Write the classes or categories starting with the lowest score. Add the class width to the starting point to get the second lower class limit. Add the class width to the second lower class limit to get the third, and so on. Stop when a class includes the highest score. List the lower class limits in a vertical column and enter the upper class limits, which can be easily identified at this stage.

STEP 5. Determine the frequency for each class by referring to the tally column and present the results in a table.

When constructing frequency tables, the following guidelines should be followed:
- The classes must be mutually exclusive. That is, each score must belong to exactly one class.
- Include all classes, even if the frequency might be zero.
- All classes should have the same width, although it is sometimes impossible to avoid open-ended intervals such as “65 years or older”.
- The number of classes should be between 5 and 20.

Let’s Try!

Time magazine collected information on all 464 people who died from gunfire in the Philippines during one week. Here are the ages of 50 men randomly selected from that population. Construct a frequency distribution table.

19 18 30 40 41 33 73 25 23 25
21 33 65 17 20 76 47 69 20 31
18 24 35 24 17 36 65 70 22 25
65 16 24 29 42 37 26 46 27 63
21 27 23 25 71 37 75 25 27 23

Determine the range.
R = Highest Value – Lowest Value
R = 76 – 16 = 60

Determine the tentative number of classes (K).
K = 1 + 3.322 log N
  = 1 + 3.322 log 50
  = 1 + 3.322 (1.69897)
  = 6.64
* Round up the result to the next integer if the decimal part exceeds 0.
K = 7

Find the class width (c).
c = R / K = 60 / 7 = 8.57
* Round up the quotient to the next integer if the decimal part exceeds 0.
c = 9

Write the classes starting with the lowest score.

Class (age)    Frequency
16 – 24        17
25 – 33        14
34 – 42         7
43 – 51         2
52 – 60         0
61 – 69         5
70 – 78         5
Total          50

Using the table:
- What is the lower class limit of the highest class?
- What is the upper class limit of the lowest class?
- Find the class mark of the class 43 – 51.
- What is the frequency of the class 16 – 24?

CUMULATIVE FREQUENCY DISTRIBUTION

The less-than cumulative frequency distribution (F<) is constructed by adding the frequencies from the lowest class interval to the highest, while the more-than cumulative frequency distribution (F>) is constructed by adding the frequencies from the highest class interval to the lowest.

RELATIVE FREQUENCY DISTRIBUTION

A relative frequency distribution indicates the proportion of the total number of observations that occurs in each interval. That is,

relative frequency = class frequency / total number of observations

Relative frequencies may be expressed in percent. Hence a relative frequency table is also called a percentage frequency distribution.

Note: A relative cumulative frequency distribution may be constructed by dividing the “less than” or “more than” cumulative frequencies by the total number of observations.
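The worked example and the cumulative and relative frequency definitions above can be reproduced with a short Python sketch. It is only one possible implementation: it assumes integer data (so each upper class limit is the next lower limit minus 1), and the variable names are illustrative rather than part of the lecture.

```python
import math
from itertools import accumulate

# Ages of the 50 men from the exercise.
ages = [
    19, 18, 30, 40, 41, 33, 73, 25, 23, 25,
    21, 33, 65, 17, 20, 76, 47, 69, 20, 31,
    18, 24, 35, 24, 17, 36, 65, 70, 22, 25,
    65, 16, 24, 29, 42, 37, 26, 46, 27, 63,
    21, 27, 23, 25, 71, 37, 75, 25, 27, 23,
]
n = len(ages)

# Step 1: range.
value_range = max(ages) - min(ages)

# Step 2: tentative number of classes, rounded up to the next integer.
k = math.ceil(1 + 3.322 * math.log10(n))

# Step 3: class width, rounded up to the next integer.
width = math.ceil(value_range / k)

# Step 4: class limits, starting from the lowest score.
# Assumption: integer data, so upper limit = lower limit + width - 1.
classes = [(min(ages) + i * width, min(ages) + (i + 1) * width - 1)
           for i in range(k)]

# Step 5: frequency of each class, plus class marks (midpoints).
freq = [sum(lo <= a <= hi for a in ages) for lo, hi in classes]
marks = [(lo + hi) / 2 for lo, hi in classes]

# Cumulative frequencies: "less than" runs from the lowest class up,
# "more than" runs from the highest class down.
less_than = list(accumulate(freq))
more_than = list(accumulate(reversed(freq)))[::-1]

# Relative frequency = class frequency / total number of observations.
relative = [f / n for f in freq]

print("class    f   F<   F>   rel%")
for (lo, hi), f, cl, cm, r in zip(classes, freq, less_than, more_than, relative):
    print(f"{lo}-{hi}  {f:3d}  {cl:3d}  {cm:3d}  {r * 100:5.1f}")
```

Running the sketch gives K = 7 and c = 9, a frequency of 17 for the class 16 – 24, and a class mark of 47 for the class 43 – 51. The class 52 – 60 is kept even though its frequency is zero, in line with the guidelines.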