
What is Frequency Distribution in Data Statistics?

  • Pragya Soni
  • May 06, 2022

Statistics is an important data analysis technique. The term statistics refers to the process of collecting, classifying, and comparing data. Statistics helps in deriving meaningful interpretations from raw data.

 

It plays an important role in handling numerical data. Statistical techniques involve different procedures like central tendency, tabulation, frequency calculation, average, dispersion and many more. 

 

Analysis of raw data is essential for measuring and interpreting it. The results are then used for drawing inferences, testing hypotheses, and making suggestions and recommendations. In this blog, we will read about frequency distribution, one of the basic statistical techniques.

 

 

What is the meaning of Statistics?

 

Before actually considering the frequency distribution, first let us look at the overview of the term statistics.

 

The term statistics is defined as the techniques and processes of collecting, describing, analyzing and interpreting numerical data. It also refers to aggregates of facts that are affected to a marked extent by a multiplicity of causes and are expressed numerically.

 

Components of Statistics

 

There are four main components of statistics that are also known as the process of statistics. These four components are defined by Croxton and Cowden:

 

  1. Collection of data.
  2. Presentation of data.
  3. Analysis of data.
  4. Interpretation of data.

 

Branches of Statistics

 

Statistics is further classified into two branches: descriptive statistics and inferential statistics.

  1. Descriptive statistics

Descriptive statistics, as the name suggests, describes the data in hand. It includes collecting, organizing, summarizing and presenting data.

  2. Inferential statistics

Inferential statistics helps in making predictions from the collected data. It includes estimation, hypothesis testing, studying relationships and drawing inferences from the data.

 

 

Importance of Statistics

 

Statistics serves the following purposes:

  1. To effectively conduct research.
  2. For accessing numerical information.
  3. For reading the journals and other data easily.
  4. For developing critical and analytic research skills.
  5. For quick decision making.

 

 

How is biostatistics different from statistics?

Biostatistics is a branch of statistics. The term is used when the tools of statistics are applied to the biological sciences, such as drug development, medicine or clinical research. Biostatistics is also known as biometry. It is a fast-growing and well-recognized subject.

Biostatistics has been widely applied to find the relative potency of a new drug, to compare the efficacy of a particular drug, to find the association between two medical attributes and to identify the signs and symptoms of diseases.

 

 

What is Frequency Distribution?

 

Frequency distribution is usually the first method used to organize raw data in an effective way. It arranges the raw data systematically: the observations are first grouped by value or by class, and the resulting counts are then set out as a frequency table.

 

Frequency distribution is defined as the systematic representation of different values of variables along with the corresponding frequencies; it is classified on the basis of class interval.

 

Class interval is defined as the size of each class into which a range of variables is divided and represented as histogram or bar graph.
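As a minimal sketch of the idea, the following Python snippet builds a frequency table from raw observations. The data in `raw_items` is hypothetical and only serves as an illustration; `collections.Counter` is a standard-library class that tallies how often each value occurs.

```python
from collections import Counter

# Hypothetical raw data: number of items found in each of ten packets.
raw_items = [2, 1, 3, 2, 5, 1, 2, 3, 5, 2]

# Counter tallies how many times each distinct value occurs.
frequency = Counter(raw_items)

# Print the frequency table in ascending order of the value.
print("Value  Frequency")
for value in sorted(frequency):
    print(f"{value:>5}  {frequency[value]:>9}")
```

The sum of all frequencies always equals the number of raw observations, which is a quick check that no observation was lost while tabulating.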

 

Types of Class Intervals

 

Class intervals are divided into two categories, exclusive and inclusive class intervals. Here is an example of each:

 

  1. Exclusive Class Interval

 

A class interval where the upper limit of one class is the same as the lower limit of the next class is called an exclusive class interval. An observation equal to the upper limit is counted in the next class. For consideration,

 

S. No   Marks   No. of students
1       0-20    8
2       20-40   7
3       40-60   3

 

  2. Inclusive Class Interval

A class interval where both the lower and the upper limit belong to the class, so that the upper limit of one class differs from the lower limit of the next, is called an inclusive class interval. For consideration,

 

S. No   Marks   Number of students
1       1-20    7
2       21-40   9
3       41-60   8
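The difference between the two kinds of class interval can be sketched in Python. The helper names `exclusive_class` and `inclusive_class` and the fixed class width of 20 are assumptions made for illustration, chosen to match the two tables above; marks are assumed to be whole numbers.

```python
def exclusive_class(mark, width=20):
    """Return (lower, upper) such that lower <= mark < upper."""
    lower = (mark // width) * width
    return (lower, lower + width)


def inclusive_class(mark, width=20):
    """Return (lower, upper) such that lower <= mark <= upper, for mark >= 1."""
    lower = ((mark - 1) // width) * width + 1
    return (lower, lower + width - 1)


# A mark of 20 sits on a class boundary, so the two schemes differ.
print(exclusive_class(20))  # (20, 40): counted in the next class
print(inclusive_class(20))  # (1, 20): counted in the class that ends at 20
```

With exclusive intervals a boundary value moves to the next class, while with inclusive intervals it stays in the class whose upper limit it equals.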

 


 

What is Discrete and Continuous Frequency Table Distribution?

 

Frequency distribution is further classified into two types based upon class interval: the discrete frequency table and the continuous frequency table. Here are the examples:

 

  1. Discrete Frequency Table

 

If the class interval of data is not given, it is termed as a discrete frequency distribution. For example,

 

S. No   Number of items   Number of packets
1       1                 23
2       2                 12
3       3                 34
4       4                 20
5       5                 72
        Total             161

 

  2. Continuous Frequency Table

 

When the class intervals are available within the data, it is called a continuous frequency distribution. For consideration,

 

S. No   Marks   Number of students
1       0-10    5
2       20-30   7
3       30-40   12
4       40-50   32
5       50-60   4
        Total   60

 


 

 

Types of Frequency Distribution Methods

 

There are two types of frequency distribution methods:

 

  1. Grouped frequency distribution.
  2. Ungrouped frequency distribution.

 

  1. Grouped Frequency Distribution

 

As the name suggests, grouped frequency distribution is well defined and distributed into groups. When the variables are continuous the data is gathered as grouped frequency distribution. Different measures are taken during data collection, such as age, salary, etc. The entire data is classified into class intervals. For consideration,

 

Family Income     Number of persons
Below 20,000      52
20,001-30,000     14
30,001-40,000     6
40,001-50,000     8
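Grouping raw observations into class intervals can be sketched as follows. The `ages` list is hypothetical, and the sketch assumes exclusive classes of equal width 10 (for example 20-30 holds ages from 20 up to, but not including, 30); it is one possible way to build a grouped table, not the only one.

```python
from collections import Counter

# Hypothetical raw ages of ten persons.
ages = [23, 35, 41, 29, 38, 52, 27, 33, 45, 31]
width = 10  # assumed class width

# Map every age to the lower limit of its class, then count per class.
grouped = Counter((age // width) * width for age in ages)

# Print one line per class interval in ascending order.
for lower in sorted(grouped):
    print(f"{lower}-{lower + width}: {grouped[lower]}")
```

For this data the classes 20-30, 30-40, 40-50 and 50-60 receive 3, 4, 2 and 1 persons respectively.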

 

  2. Ungrouped Frequency Distribution

As the name suggests, ungrouped frequency distribution does not use class intervals. It is applied to discrete or categorical data rather than continuous data. Examples usually include data related to gender, marital status, medical data etc. For consideration,

 

Variable          Number of persons
GENDER
  Female          19
  Male            22
MARITAL STATUS
  Single          32
  Married         4
  Divorced        4

 

 

Other Types of Frequency Distribution

 

  1. Cumulative Frequency Distribution

 

Cumulative frequency distribution is often presented together with percentage frequency distribution. A percentage distribution gives the percentage of the sample whose scores fall in each group. The cumulative frequency of a score is the running total of the frequencies up to and including that score.

 

This type of distribution is quite useful for comparison of data with the findings of other studies having different sample sizes. In this type of distribution, percentages and frequencies are summed up in a single table. For consideration,

 

Score   Frequency   Percentage   Cumulative frequency   Cumulative percentage
1       4           8            4                      8
2       14          28           18                     36
4       6           12           24                     48
5       8           16           32                     64
7       8           16           40                     80
8       6           12           46                     92
9       4           8            50                     100
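The percentage and cumulative columns can be derived from the frequencies alone. The sketch below uses the scores and frequencies from the table above; the variable names are illustrative, and `itertools.accumulate` from the standard library produces the running totals.

```python
from itertools import accumulate

scores = [1, 2, 4, 5, 7, 8, 9]
frequencies = [4, 14, 6, 8, 8, 6, 4]

total = sum(frequencies)

# Percentage of the sample at each score.
percentages = [100 * f / total for f in frequencies]

# Running total of frequencies, listed in ascending order of score.
cumulative = list(accumulate(frequencies))

# Running total expressed as a percentage of the sample.
cumulative_pct = [100 * c / total for c in cumulative]

for row in zip(scores, frequencies, percentages, cumulative, cumulative_pct):
    print(*row, sep="\t")
```

The last cumulative frequency always equals the sample size and the last cumulative percentage is always 100, which makes both columns easy to check.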

 

  2. Bivariate Frequency Distribution

 

Bivariate frequency distribution is a frequency distribution where the number of variables is fixed to two. Bivariate distribution has two marginal distributions. For consideration,

 

Age      Salary per month
20-30    15
30-40    5
40-50    7
Total    27
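To illustrate the two marginal distributions mentioned above, here is a small sketch that counts pairs of values. The observations, the age groups and the salary bands `"low"` and `"high"` are all hypothetical and are not taken from the table.

```python
from collections import Counter

# Hypothetical (age group, salary band) pairs, one per person.
observations = [
    ("20-30", "low"), ("20-30", "low"), ("20-30", "high"),
    ("30-40", "low"), ("30-40", "high"), ("30-40", "high"),
    ("40-50", "high"),
]

# Joint (bivariate) frequencies: how often each pair occurs.
joint = Counter(observations)

# Marginal frequencies: count one variable while ignoring the other.
age_marginal = Counter(age for age, _ in observations)
salary_marginal = Counter(salary for _, salary in observations)

print(joint[("20-30", "low")])  # 2
print(dict(age_marginal))       # {'20-30': 3, '30-40': 3, '40-50': 1}
print(dict(salary_marginal))    # {'low': 3, 'high': 4}
```

Each marginal distribution adds up to the same total as the joint table, because every person is counted exactly once in each.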

 

 

  3. Multivariate Frequency Distribution

 

As the name suggests, multivariate frequency distribution is the frequency distribution where there are more than two variables in the frequency distribution table.

 


 

 

Graphical Presentation of Frequency Distribution


[Figure: Graphical representation of frequency distribution]


Data representation is the next step after data gathering. The data organized as a frequency distribution is then represented in different forms of figures and graphs. Important forms of frequency distribution graphs are as follows:

 

  1. Line frequency graph
  2. Histogram
  3. Frequency polygon
  4. Frequency curve
  5. Cumulative frequency curve

 

Here is the brief introduction to all of them:

 

  1. Line Frequency Graph

 

Line frequency graph is the graphical representation of data in the form of lines. This graph is used to depict discrete data. The data is represented on the x-axis and frequencies are represented on the y-axis of the graph. The length of lines is drawn as per the sizes of frequency distribution.

 

  2. Histogram

A histogram is a graphical representation of the frequency distribution of data. The data is represented in the form of adjacent rectangular bars drawn without gaps between them. The classes are represented on the x-axis and frequencies on the y-axis.

 

There exist four types of histograms:

 

  • Histogram for equal class interval.
  • Histogram for unequal class interval.
  • Histogram for inclusive data.
  • Histogram for mid value series.
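A histogram for equal class intervals can be sketched in plain text, without any plotting library. The classes and frequencies below are the ones from the exclusive class interval table earlier in this blog; the function name `text_histogram` is hypothetical, and the bars are drawn sideways with the classes down the page instead of along the x-axis.

```python
classes = ["0-20", "20-40", "40-60"]
frequencies = [8, 7, 3]


def text_histogram(labels, counts, symbol="#"):
    """Return one text line per class; the bar length equals the frequency."""
    return [
        f"{label:>6} | {symbol * count} ({count})"
        for label, count in zip(labels, counts)
    ]


lines = text_histogram(classes, frequencies)
for line in lines:
    print(line)
```

Because all classes have the same width, the length of each bar alone carries the frequency. With unequal class intervals the bar height would need to be adjusted for the class width.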

 

  3. Frequency Polygon

A polygon is a closed figure bounded by straight sides. A frequency polygon is obtained by joining the mid-points of the tops of the rectangles of a histogram with straight lines. Like other graphs, variables are taken on the x-axis and frequencies on the y-axis.
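The points of a frequency polygon can be computed directly from the class limits. This sketch reuses the exclusive class intervals 0-20, 20-40 and 40-60 with frequencies 8, 7 and 3 from earlier in this blog; the variable names are illustrative.

```python
class_limits = [(0, 20), (20, 40), (40, 60)]
frequencies = [8, 7, 3]

# Each point is (class mid-point, frequency); joining consecutive points
# with straight lines gives the frequency polygon.
polygon_points = [
    ((lower + upper) / 2, frequency)
    for (lower, upper), frequency in zip(class_limits, frequencies)
]

print(polygon_points)  # [(10.0, 8), (30.0, 7), (50.0, 3)]
```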

 

  4. Frequency Curve

A frequency curve is a smooth curve obtained by joining the points of a frequency polygon with a freehand curve.

 

  5. Cumulative Frequency Curve

 

Cumulative frequency curve is also known as ogive. It is the cumulative frequency graph that is plotted corresponding to the upper limits of the classes. The cumulative frequency of each upper limit of the classes is joined by a free hand curve.

 

Ogive is further divided into two types:

 

  1. Less than ogive
  2. More than ogive
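The points for both types of ogive can be sketched as follows, again using the classes 0-20, 20-40 and 40-60 with frequencies 8, 7 and 3 as an assumed example. The less than ogive is plotted against the upper class limits; by the usual convention, which is assumed here, the more than ogive is plotted against the lower class limits.

```python
from itertools import accumulate

class_limits = [(0, 20), (20, 40), (40, 60)]
frequencies = [8, 7, 3]

upper_limits = [upper for _, upper in class_limits]
lower_limits = [lower for lower, _ in class_limits]

# Less than ogive: number of observations below each upper limit.
less_than = list(zip(upper_limits, accumulate(frequencies)))

# More than ogive: number of observations at or above each lower limit.
# Accumulate from the last class backwards, then restore the class order.
more_counts = list(accumulate(reversed(frequencies)))[::-1]
more_than = list(zip(lower_limits, more_counts))

print(less_than)  # [(20, 8), (40, 15), (60, 18)]
print(more_than)  # [(0, 18), (20, 10), (40, 3)]
```

The less than ogive rises from left to right while the more than ogive falls, and both reach the total number of observations at one end.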

 


 

 

How is the Central Tendency of Data Measured?

 

When we talk about statistics, we just can’t escape the term central tendency. Central tendency is used to represent the whole data series with one typical value, commonly called the average of the series.

A measure of central tendency gives the central value around which the data is concentrated. It is an attempt to find one single figure that describes the whole data.

 

Objectives of Measure of Central Tendency

 

The main purposes for measuring central tendency are as follows:

 

  1. For comparing two data quantities.
  2. To derive a quantitative relationship between different group averages.
  3. For quicker decision making.
  4. For obtaining a single value for the entire data series.

 

 

Types of Measure of Central Tendency

 

The central tendency of data is measured in three ways: mean, median and mode.

 

Mean

 

The mean of the data is also known as the arithmetic mean or the mathematical average. The arithmetic mean is defined as the sum of all the observations divided by the total number of observations. The mathematical average is further classified into the following types:

  1. Simple arithmetic mean
  2. Weighted arithmetic mean
  3. Geometric mean
  4. Harmonic mean

 

For instance,

 

Consider a data set where the heights of five students (in cm) are 160, 162, 175, 158, and 166.

Arithmetic mean = sum of observations / number of observations

= (160+162+175+158+166)/5

= 821/5

= 164.2 cm
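The four types of mean listed above can be compared on the same five heights using Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). The `weights` list is a hypothetical assumption added only to show how a weighted mean differs from the simple one.

```python
import statistics

heights = [160, 162, 175, 158, 166]

# Simple arithmetic mean: sum of observations / number of observations.
simple_mean = statistics.mean(heights)

# Weighted arithmetic mean with hypothetical weights (175 counts twice).
weights = [1, 1, 2, 1, 1]
weighted_mean = sum(h * w for h, w in zip(heights, weights)) / sum(weights)

# Geometric and harmonic means of the same data.
geometric = statistics.geometric_mean(heights)
harmonic = statistics.harmonic_mean(heights)

print(simple_mean)    # 164.2
print(weighted_mean)  # 166.0
print(geometric, harmonic)
```

For positive values that are not all equal, the harmonic mean is smaller than the geometric mean, which in turn is smaller than the arithmetic mean.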

 

Median

 

The median is defined as the middle value when the data set is arranged in either ascending or descending order of magnitude. The median is also called the positional average. For consideration, take the data set of heights (in cm) 168, 173, 153, 163, and 158.

In ascending order, the data is 153, 158, 163, 168, and 173.

The median of the data is the third (middle) value, 163 cm.
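The same steps can be sketched in Python: sort the data, then take the middle value. The second data set with four values is an added assumption, included to show that for an even number of observations the median is usually taken as the average of the two middle values, which is what `statistics.median` does.

```python
import statistics

heights = [168, 173, 153, 163, 158]

# Arrange in ascending order and pick the middle position.
ordered = sorted(heights)
middle_value = ordered[len(ordered) // 2]

# The standard library gives the same result for an odd count.
median_odd = statistics.median(heights)

# Even count: the average of the two middle values (158 and 163).
median_even = statistics.median([153, 158, 163, 168])

print(middle_value, median_odd, median_even)  # 163 163 160.5
```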

 

Mode

 

Mode is defined as the value which has the highest frequency. In other words, the item that has occurred the maximum number of times is called the mode of data. For example, consider the following data:

 

Height (in cm)     145   160   165   168   170
No. of students    3     16    8     20    6

 

For the given data, the mode is 168 cm, as the maximum number of students (20) have this height.
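Finding the mode from a frequency table amounts to picking the value with the largest frequency. The sketch below uses the heights and student counts from the table above; the variable names are illustrative.

```python
heights = [145, 160, 165, 168, 170]
students = [3, 16, 8, 20, 6]

# Pair each height with its frequency and keep the pair whose
# frequency (the second element) is the largest.
mode_height, mode_count = max(zip(heights, students), key=lambda pair: pair[1])

print(mode_height, mode_count)  # 168 20
```

If two heights shared the highest frequency, `max` would return only the first of them, so data with more than one mode needs an extra check.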

 


 

These are the basics of statistics that you need to know. Statistics is an important part of data management, and it is an essential element of organizational and project management in all spheres.
