statistics
- (plural form) data or numerical computations derived
from a set of sample data
- (singular form, more general definition) branch of science
that deals with developing methods for a more effective
way of collecting, organizing, presenting, and analyzing
data
statistical methods – existing procedures and techniques used
from the collection of data to the proper presentation and
analysis of results
statistical theory – development of the formulas used in
computation and of the scientific procedures that
constitute the basis of statistical methods
data
- basic element of statistical analysis
- expressed either as a numerical value or as a description of
its quality or kind
- 2 types of data:
- quantitative data – expressed in numbers
(measured or counted)
- age, height, weight, income, number of
students
- qualitative data – expressed in categories or kind
- gender, educational attainment, civil status,
vehicle plate numbers
population data
- collection of all the units from which data are collected
- each unit in the population is called an element
sample data
- subset of the population
- listing of all the elements of the population, from which
the sample is drawn, is called the frame
census
- information is gathered for all units in the population
- complete enumeration of all the units in the population
- expensive and time-consuming
- (e.g., the PH census, which occurs every 10 years)
sampling – only a part of the population is used to obtain data
parameter – numerical measure computed from the population
statistic – numerical measure computed from the sample
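The difference between a parameter and a statistic can be sketched in a few lines of Python. The population values below are made up for illustration, and the sample size of 4 and the fixed seed are assumptions chosen only to keep the draw repeatable.

```python
import random
import statistics

# Hypothetical population: ages of 10 people (made-up values).
population = [18, 21, 19, 35, 42, 27, 30, 24, 22, 32]

# Parameter: computed from every unit in the population (as in a census).
population_mean = statistics.mean(population)

# Statistic: computed from a sample, i.e. only part of the population.
rng = random.Random(0)               # fixed seed so the draw is repeatable
sample = rng.sample(population, 4)   # 4 distinct units drawn at random
sample_mean = statistics.mean(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

The parameter is a single fixed number for this population, while the statistic changes from one sample to another.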
two major areas of statistics:
1. descriptive
- summarizing and describing the important features of a
set of data through summary calculations and graphical
and tabular displays
2. inductive or inferential
- making generalizations for the population based on
the information drawn from the sample
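A minimal sketch of the descriptive side, using only the Python standard library. The data set is hypothetical (number of siblings reported by 9 students); the choice of summary measures is an illustration, not a required set.

```python
import statistics
from collections import Counter

# Hypothetical sample: number of siblings reported by 9 students.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5]

# Summary calculations
mean_value = statistics.mean(data)
median_value = statistics.median(data)
mode_value = statistics.mode(data)
sample_variance = statistics.variance(data)  # divides by n - 1

# Tabular display: frequency of each distinct value
frequency = Counter(data)
for value in sorted(frequency):
    print(value, frequency[value])
```

Using the sample mean computed here as an estimate of the unknown population mean would be a step into inferential statistics.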
variables – characteristics or properties measured from
objects, persons, or things
- two types:
1. discrete – counted, takes whole-number values
2. continuous – measured, can take any value in a range,
including decimals
four scales of measurement
1. nominal – lowest form, identification only
2. ordinal – has both identity and order
3. interval – identity, order, and equality of scale
4. ratio – identity, order, equality of scale, and absolute zero
absolute zero – a value of zero means that none of the
characteristic being measured is present
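One way to see why absolute zero matters is to compare temperature in Celsius (interval scale) with temperature in Kelvin (ratio scale). The two readings below are arbitrary values picked for illustration.

```python
# Interval scale: Celsius has equal steps but no absolute zero,
# so differences are meaningful and ratios are not.
c_low, c_high = 10.0, 20.0
difference_c = c_high - c_low      # a 10-degree difference is meaningful
naive_ratio = c_high / c_low       # 2.0, but 20 C is not "twice as hot" as 10 C

# Ratio scale: Kelvin starts at absolute zero, so ratios are meaningful.
k_low, k_high = c_low + 273.15, c_high + 273.15
difference_k = k_high - k_low      # same 10-degree difference as in Celsius
true_ratio = k_high / k_low        # about 1.035

print(naive_ratio, round(true_ratio, 3))
```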
symbols used
1. summation – compact way to write the sum of a set of
values
2. factorial – compact way of writing the product of the
positive integers from 1 to n
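Both symbols can be checked with a short Python sketch. The values in x and the choice n = 5 are hypothetical.

```python
import math

x = [4, 7, 1, 8]  # hypothetical values x_1, ..., x_n

# Summation: the sum of x_i for i = 1 to n, i.e. x_1 + x_2 + ... + x_n
total = sum(x)

# Factorial: n! = n * (n - 1) * ... * 2 * 1
n = 5
product = 1
for k in range(1, n + 1):
    product *= k

print(total)                        # 20
print(product, math.factorial(n))   # 120 120
```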
data collection methods
- can be done through interviews (in person or by telephone),
questionnaires, observation, or existing records
1. Interviews
- in-person interviews allow researchers to collect
more information
- telephone interviews, while less costly, restrict the
sample to those who have phones and free
schedules
2. Questionnaires
- can only be used when respondents are available
and willing to participate as subjects
- can be mailed or handed over personally
- reliability and validity of the data collected depend on
the respondents’ memory and forthrightness
- low and differential response rates lead to
loss of information
3. Observation
- researcher directly observes rather than relying on
respondent’s memory or truthfulness
- may use videotapes, audio tapes, or other data
collection equipment, alone or in combination, to collect
observational data
- time-consuming, samples tend to be small and
unrepresentative of the population, and there may
be observation errors
4. Records
- utilization of existing records
- economical and requires less cooperation from
those who dislike interviews and questionnaires
- records can be published or unpublished; some of the
information needed may not be found
sampling methods
1. probability
a. simple random sampling
- used when a homogeneous population is not
large and a frame is available
- every element, and every possible sample of the
chosen size, has an equal probability of being
selected
- steps: list all elements, assign each a sequential
number, choose the sample size, then use a random
number generator to pick the elements
b. systematic sampling
- gives elements an equal probability of selection
without being dependent on the frame
- elements are assigned a number from 1 to N
and the sampling interval k is the ratio of
N to the sample size n (k = N/n)
- a random number from 1 to k is selected from a list
or sequential file (random start); the unit with that
number and every kth unit after it are then included
in the sample
c. stratified sampling
- extension of simple random sampling; divides
the entire population into strata then probability
samples are selected in each stratum
- simple random sampling is used in selecting
samples in each stratum
- used when the population is extremely
heterogeneous
- improves the quality of inferences made, especially
when the units within each stratum are as
homogeneous as possible
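The three probability sampling methods above can be sketched with Python's random module. This is an illustration under stated assumptions, not a prescribed procedure: the function names, the frame of 20 numbered units, the two strata, and the equal sample size per stratum are all hypothetical, and the systematic version assumes n is at most N (with N a multiple of n so that every unit can be reached).

```python
import random

def simple_random_sample(frame, n, rng):
    """Pick n distinct elements; every element is equally likely to be chosen."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Take every k-th element after a random start, with k = N // n."""
    k = len(frame) // n        # sampling interval
    start = rng.randrange(k)   # random start, 0-based (units 1..k in the notes)
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of the same size from each stratum."""
    return {name: rng.sample(units, n_per_stratum)
            for name, units in strata.items()}

rng = random.Random(42)       # fixed seed so results are repeatable
frame = list(range(1, 21))    # hypothetical frame: units numbered 1 to N = 20

srs = simple_random_sample(frame, 5, rng)
systematic = systematic_sample(frame, 5, rng)   # interval k = 20 // 5 = 4

# Hypothetical strata: the same 20 units split into two groups.
strata = {"group_a": frame[:10], "group_b": frame[10:]}
stratified = stratified_sample(strata, 2, rng)

print(srs)
print(systematic)
print(stratified)
```

Taking the same number of units from each stratum is only one possible allocation; the notes do not say how the sample size should be divided among strata.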