Admin: Zahid Hussain Gopang 2k10-IT

## All the Lectures of Probability & Statistics

Lecturer: Dr. Velo Sutahar
Subject Title: Probability & Statistics
Code: STAT-501

Lecture No: 1
Date: 21/05/2013
```
Introduction to Statistics

By

Dr. V. Sutahar
Assistant Professor
Department of Statistics
Sindh Agriculture University, Tandojam
E-mail: vsutahar@yahoo.co.uk

The Nature of Statistics

The word “statistics” first appeared in the English language in 1787.

Statistical Thinking Will One Day Be As Necessary
For Efficient Citizenship as the Ability to Read & Write.
~ H. G. Wells

Learning Objectives

Understand and be able to distinguish different meanings and uses of the word ‘statistics’;
Be able to describe the nature of statistics as a scientific discipline;
Be aware of the fact that statistics can be presented and used in misleading ways, unintentionally or even intentionally;
Appreciate the importance of statistics in science and in society in general;
Appreciate the need for statistics in a business and finance environment.

Goals

After completing this note, you should be able to:
Explain key definitions:

Population vs. Sample

Primary vs. Secondary Data

Parameter vs. Statistic

Descriptive vs. Inferential Statistics

Describe key data collection methods
Describe different sampling methods
Probability Samples vs. Nonprobability Samples
Select a random sample using a random numbers table
Identify types of data and levels of measurement
Describe the different types of survey error

Meaning of Statistics

The word ‘statistics’ has a few different (but related) meanings and is used in different ways depending on the context.
Most people have heard of statistics in the context of football, cricket or political opinion polls.

What to Do in Statistics?

1.	Collecting Data
e.g., Survey
2.	Presenting Data
e.g., Charts & Tables
3.	Characterizing Data
e.g., Average

Data Analysis (why)
Decision-Making

Key Terms

1.	Population (Universe)
All Items of Interest
2.	Sample
Portion of Population
3.	Parameter
Summary Measure about Population
4.	Statistic
Summary Measure about Sample

Why Do We Need to Collect Data?

To provide input to a survey
To provide input to a research study
To measure the performance of a service or production process
To evaluate conformance to standards
To assist in formulating alternative courses of action
To satisfy curiosity
To gain knowledge for the sake of knowledge

Data and Data Sets

Data are the facts and figures collected, summarized,
analyzed, and interpreted.

The data collected in a particular study are referred
to as the data set.

Elements, Variables, and Observations

The elements are the entities on which data are
collected.
A variable is a characteristic of interest for the elements.
The set of measurements collected for a particular
element is called an observation.
The total number of data values in a data set is the
number of elements multiplied by the number of
variables.
```
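The key terms from Lecture 1 can be illustrated with a small Python sketch. Everything in it is hypothetical and made up for illustration: the exam scores, the student labels, the variable names, and the random seed are assumptions, not data from the lecture.

```python
import random
import statistics

# Hypothetical population: exam scores of all 200 students in a course.
random.seed(501)  # arbitrary seed so the sketch is repeatable
population = [random.randint(40, 100) for _ in range(200)]

# A parameter is a summary measure about the population.
population_mean = statistics.mean(population)

# A sample is a portion of the population; here, a simple random sample of 20.
sample = random.sample(population, k=20)

# A statistic is a summary measure about the sample.
sample_mean = statistics.mean(sample)

# A small hypothetical data set.
# Elements: the students. Variables: age, height_cm, score.
# The values recorded for one student form that student's observation.
data_set = {
    "Student A": {"age": 20, "height_cm": 170, "score": 71},
    "Student B": {"age": 21, "height_cm": 165, "score": 84},
    "Student C": {"age": 19, "height_cm": 180, "score": 66},
    "Student D": {"age": 22, "height_cm": 158, "score": 90},
}

n_elements = len(data_set)
n_variables = len(next(iter(data_set.values())))

# Total number of data values = number of elements x number of variables.
total_values = n_elements * n_variables

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
print("data values in the data set:", total_values)
```

The sample mean will generally differ from the population mean, which is why a statistic is treated as an estimate of the corresponding parameter.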

Lecture No: 2
Date: 01/06/2013
```

FREQUENCY DISTRIBUTION

OBJECTIVES:

Understand the basic concepts of a frequency distribution table: range, class width, class limits, class boundaries, and class marks.

Identify the class size, class marks, class boundaries, and class limits for a given frequency distribution table.
Construct a frequency distribution table.

RECALL

Classify the following as discrete or continuous data:

Shoe sizes
Actual lengths of feet
Number of students in AC – high school
Male teachers in AC
Temperature of the room

Among campus vending machines, 14 are found to be defective.
Today's records show that 5 students were absent.
The car weighs 1430 kilograms.
Among all SAT scores last year, 23 were perfect.
Radar indicated that the driver was going 72.4 mph.

Essential Questions:

What is a frequency distribution table?
What are the basic concepts needed in constructing a frequency distribution table?

A frequency distribution table lists categories of scores along with their corresponding frequencies.

The frequency for a particular category or class is the number of original scores that fall into that class.

The classes or categories refer to the groupings of a frequency table

The range is the difference between the highest value and the lowest value.

R = highest value – lowest value

The class width is the difference between two consecutive lower class limits or two consecutive lower class boundaries.

The class limits are the smallest or the largest numbers that can actually belong to different classes.

Lower class limits are the smallest numbers that can actually belong to the different classes.
Upper class limits are the largest numbers that can actually belong to the different classes.

The class boundaries are obtained by increasing the upper class limits
and decreasing the lower class limits by the same amount so that
there are no gaps between consecutive classes. The amount
to be added or subtracted is ½ the difference between the upper
limit of one class and the lower limit of the following class.

Class marks are the midpoints of the classes:
class mark = (lower class limit + upper class limit) / 2

Essential Question :

How do we construct a frequency distribution table?

Process of Constructing a Frequency Table

STEP 1.  Determine the range.

R = Highest Value – Lowest Value

STEP 2.  Determine the tentative number of classes (k).

k = 1 + 3.322 log N     (log is the base-10 logarithm; N is the number of observations)

Always round up to the next integer.
Note:  The number of classes should be between 5 and 20.
The actual number of classes may be affected by convenience or other subjective factors.

STEP 3.  Find the class width by dividing the range by the number of classes.

class width = R / k     (always round up to the next integer)

STEP 4.  Write the classes or categories starting with the lowest score.
Use the lowest score as the first lower class limit.
Add the class width to the starting point to get the second lower class limit.
Add the class width to the second lower class limit to get the third, and so on.
Stop when a class includes the highest score.
List the lower class limits in a vertical column and enter the upper class limits,
which can be easily identified at this stage.

STEP 5.  Determine the frequency for each class by referring to the tally columns and present the results in a table.

When constructing frequency tables, the following guidelines should be followed.

The classes must be mutually exclusive.  That is, each score must belong to exactly one class.
Include all classes, even if the frequency might be zero.

All classes should have the same width, although it is sometimes impossible to avoid open – ended intervals such as “65 years or older”.
The number of classes should be between 5 and 20.

Let’s Try!!!

Time magazine collected information on all 464 people who died from gunfire in the
Philippines during one week.  Here are the ages of 50 men randomly selected from
that population.  Construct a frequency distribution table.

19  18  30  40  41  33  73  25
23  25  21  33  65  17  20  76
47  69  20  31  18  24  35  24
17  36  65  70  22  25  65  16
24  29  42  37  26  46  27  63
21  27  23  25  71  37  75  25
27  23

Determine the range.
R = Highest Value – Lowest Value
R = 76 – 16 = 60

Determine the tentative number of classes (K).
K = 1 + 3.322 log N
  = 1 + 3.322 log 50
  = 1 + 3.322 (1.69897)
  = 6.64
*Round up the result to the next integer if the decimal part exceeds 0.
K = 7

Find the class width (c).
c = R / K = 60 / 7 = 8.57
*Round up the quotient to the next integer if the decimal part exceeds 0.
c = 9

Write the classes starting with the lowest score:
16 – 24, 25 – 33, 34 – 42, 43 – 51, 52 – 60, 61 – 69, 70 – 78

Using the completed table:
What is the lower class limit of the highest class? Upper class limit of the lowest class?
Find the class mark of the class 43 – 51.
What is the frequency of the class 16 – 24?

CUMULATIVE FREQUENCY DISTRIBUTION

The less than cumulative frequency distribution (F<)
is constructed by adding the frequencies from the lowest
to the highest class interval, while the more than cumulative frequency distribution
(F>) is constructed by adding the frequencies from the highest class interval
to the lowest class interval.

RELATIVE FREQUENCY DISTRIBUTION

A relative frequency distribution indicates the proportion of the
total number of observations that occurs in each interval.  That is,

relative frequency = class frequency / total number of observations

Relative frequencies may be expressed in percent.
Hence a relative frequency table is also called a percentage frequency distribution.

Note:  A relative cumulative frequency distribution may be
constructed by dividing the “less than” or “more than” cumulative frequencies by the total number of observations.

```
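The five construction steps, together with the cumulative and relative frequency columns, can be sketched in Python using the 50 ages from the exercise. This is a minimal sketch, not the lecturer's own method: the function name `build_frequency_table` and the dictionary keys are hypothetical, and the code assumes integer data, so that each upper class limit is the next lower limit minus 1 and each boundary lies 0.5 beyond a limit.

```python
import math
from itertools import accumulate

# Ages of the 50 men from the "Let's Try" exercise.
AGES = [
    19, 18, 30, 40, 41, 33, 73, 25,
    23, 25, 21, 33, 65, 17, 20, 76,
    47, 69, 20, 31, 18, 24, 35, 24,
    17, 36, 65, 70, 22, 25, 65, 16,
    24, 29, 42, 37, 26, 46, 27, 63,
    21, 27, 23, 25, 71, 37, 75, 25,
    27, 23,
]


def build_frequency_table(data):
    """Return one dict per class, following Steps 1-5 (integer data assumed)."""
    n = len(data)
    lowest, highest = min(data), max(data)

    data_range = highest - lowest                  # Step 1: R
    k = math.ceil(1 + 3.322 * math.log10(n))       # Step 2: round up
    width = max(1, math.ceil(data_range / k))      # Step 3: round up (at least 1)

    # Step 4: advance the lower limit by the class width
    # until a class includes the highest score.
    limits = []
    lower = lowest
    while True:
        upper = lower + width - 1
        limits.append((lower, upper))
        if upper >= highest:
            break
        lower += width

    # Step 5: count the scores that fall in each class.
    freqs = [sum(lo <= x <= hi for x in data) for lo, hi in limits]

    # Cumulative frequencies: F< adds from the lowest class, F> from the highest.
    less_than = list(accumulate(freqs))
    more_than = list(accumulate(reversed(freqs)))[::-1]

    table = []
    for (lo, hi), f, f_lt, f_gt in zip(limits, freqs, less_than, more_than):
        table.append({
            "limits": (lo, hi),
            "boundaries": (lo - 0.5, hi + 0.5),
            "mark": (lo + hi) / 2,        # class mark = midpoint of the class
            "f": f,
            "F<": f_lt,
            "F>": f_gt,
            "percent": 100 * f / n,       # relative frequency in percent
        })
    return table


if __name__ == "__main__":
    for row in build_frequency_table(AGES):
        lo, hi = row["limits"]
        print(f"{lo:>2} - {hi:<2}  f={row['f']:>2}  F<={row['F<']:>2}  "
              f"F>={row['F>']:>2}  mark={row['mark']:>4}  {row['percent']:>5.1f}%")
```

For this data the sketch produces seven classes of width 9, starting at 16 – 24, which agrees with R = 60 and K = 7 in the worked solution. The loop in Step 4 keeps adding classes until the highest score is covered, so the final number of classes can differ from the tentative k for other data sets.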