Understanding Maths used in Data Science / Machine Learning


Machine Learning and Data Science are all about understanding your data through concepts from Mathematics and Statistics. Before you start writing Python code, you should know how to do a few basic calculations. In this blog I have tried to cover the Maths used in Data Science.

● Understanding Mean, Median and Mode:

The mean is nothing but the average: add up all the values and divide by how many values there are.

Example: Dataset (1,2,3,4,5,6,7)

Mean = (1+2+3+4+5+6+7) / 7 = 28 / 7 = 4
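
If you want to check this in Python, here is a small sketch. The variable names are only for illustration; `statistics.mean` comes from Python's standard library.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7]

# Mean by hand: sum of all values divided by the number of values.
mean_by_hand = sum(data) / len(data)

# The same calculation using the standard library.
mean_stdlib = statistics.mean(data)

print(mean_by_hand)  # 4.0
```

Both approaches give the same answer, so you can use whichever reads better in your code.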

The median is the central value of a dataset. There are two ways to calculate it, one for an odd number of values and one for an even number. In both cases, first arrange the values in ascending order.

For an even number of values:

  • Sample Dataset: 1,58,34,56,23,2,45,8
  • Arrange dataset in Ascending order (1,2,8,23,34,45,56,58)
  • Take the two middle values, 23 and 34, and average them: (23 + 34) / 2 = 28.5. This is the median of the above dataset.

For an odd number of values:

  • Sample Dataset: 1,2,8,23,34,45,56
  • Arrange the dataset in ascending order (here it is already sorted).
  • 23 divides the dataset into two equal parts, with three values on each side.
  • So in this case the median is 23.
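
The two cases above can be sketched in Python as follows. The `median` function here is a hypothetical helper written just for this blog; in practice `statistics.median` from the standard library does the same job.

```python
import statistics

def median(values):
    """Return the middle value of a dataset (illustrative helper)."""
    ordered = sorted(values)          # step 1: ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value.
        return ordered[mid]
    # Even count: average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

even_data = [1, 58, 34, 56, 23, 2, 45, 8]
odd_data = [1, 2, 8, 23, 34, 45, 56]

print(median(even_data))  # 28.5
print(median(odd_data))   # 23
```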

The mode is the number that occurs most often in the dataset.

Sample Dataset (1,2,4,3,2,5,6,2,7,8,2)

2 occurs most often in the dataset (four times), so the mode is 2.
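
One possible way to find the mode in Python is to count how often each value appears. This is only a sketch using `collections.Counter` from the standard library; the variable names are my own.

```python
from collections import Counter

data = [1, 2, 4, 3, 2, 5, 6, 2, 7, 8, 2]

# Count how many times each value occurs.
counts = Counter(data)

# most_common(1) returns a list with the single most frequent (value, count) pair.
mode_value, mode_count = counts.most_common(1)[0]

print(mode_value, mode_count)  # 2 4
```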

● Variance and Standard Deviation:

Variance and standard deviation are used to understand how scattered your data is. Variance measures how far the data points are spread out around the mean. Standard deviation is the square root of the variance. Let's see how to calculate the variance.

  • Calculate the mean of your dataset.
  • Subtract the mean from each data point and square the result.
  • Take the average of all the squared differences from step 2.
  • This is the variance of your dataset.
  • The square root of the variance is the standard deviation.
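
The steps above can be sketched in Python like this. I am reusing the dataset (1,2,3,4,5,6,7) from the mean example as an assumption, since no dataset is given for this section. Note that averaging over all the values gives what is usually called the population variance; the sample variance divides by one less than the number of values instead.

```python
import math
import statistics

data = [1, 2, 3, 4, 5, 6, 7]  # assumed example dataset

# Step 1: mean of the dataset.
mean = sum(data) / len(data)

# Step 2: subtract the mean from each data point and square the result.
squared_diffs = [(x - mean) ** 2 for x in data]

# Step 3: average of the squared differences (population variance).
variance = sum(squared_diffs) / len(data)

# Step 4: standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(variance)  # 4.0
print(std_dev)   # 2.0
```

The standard library offers `statistics.pvariance` and `statistics.pstdev` for the same population calculation, and `statistics.variance` and `statistics.stdev` for the sample version.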

Data scientists use these values for further analysis.

Blog Written By: Shital Nagarare