### Statistical Measures and Techniques

**Description of Data:**

Data, such as a company's annual sales of a product, can be described in several ways. For instance, a sales manager might want to inform the general manager about the past year's sales. One option is to photocopy every dispatch note issued over the year; another is to list each customer with their corresponding purchases. However, these methods are time-consuming and present raw detail rather than a summary, so they are rarely the best way to present the data.

**Graphical Representation of Frequencies:**

Sales can be represented graphically by their frequency. For example, annual sales to 31 customers, in thousands of tons, can be shown as a bar chart in which each bar covers one sales-volume category and its height gives the number of customers whose purchases fall in that category. This representation shows how the customers are distributed across the sales-volume categories.
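The grouping behind such a chart can be sketched in a few lines of Python. The 31 sales figures, the 10-unit category width, and the helper name `frequency_table` are all hypothetical, chosen only to illustrate the idea; they are not data from the text.

```python
from collections import Counter

# Hypothetical annual sales (thousand tons) for 31 customers; illustrative only.
sales = [
    3, 7, 8, 12, 14, 15, 15, 17, 18, 19, 21, 22, 22, 23, 24, 25,
    26, 27, 28, 29, 31, 32, 34, 35, 36, 38, 41, 43, 46, 52, 57,
]
BIN_WIDTH = 10  # assumed width of each sales-volume category


def frequency_table(values, bin_width):
    """Count how many values fall into each [low, low + bin_width) category."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return {low: counts[low] for low in sorted(counts)}


table = frequency_table(sales, BIN_WIDTH)

# Text version of the bar chart: one row per category, one '#' per customer.
for low, n in table.items():
    print(f"{low:>3}-{low + BIN_WIDTH - 1:<3} | {'#' * n} ({n})")
```

Each printed row plays the role of one bar: the label is the sales-volume category and the bar length is the number of customers in it.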

### Measures of Central Tendency

**Mean (Arithmetic Average):**

The mean is the sum of all data points divided by the number of data points. It is a widely used statistical measure in everyday applications and represents a central value for a set of data. It is calculated using the formula:

\[ \mu = \frac{\sum x}{N} \]

where \( \mu \) is the mean, \( \sum x \) is the sum of data points, and \( N \) is the number of data points.
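As a minimal sketch, the formula translates directly into Python. The sample values are made up for illustration, and the function name `mean` is an arbitrary choice.

```python
def mean(values):
    """Arithmetic mean: sum of the data points divided by their count."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical sales figures, for illustration only.
sales = [12, 15, 11, 18, 14]
print(mean(sales))  # 70 / 5 = 14.0
```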

**Median:**

The median is the value that separates the higher half from the lower half of a data set once the values are sorted. If the number of data points is odd, the median is the middle value; if it is even, the median is the average of the two middle values. Because it depends only on the middle of the ordered data, it is largely unaffected by extreme values and measures the central position of the data.
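The odd and even cases can be sketched as follows; the data are hypothetical, and the last call shows that replacing the largest value with an extreme one leaves the median unchanged.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([12, 15, 11, 18, 14]))   # odd count: 14
print(median([12, 15, 11, 18]))       # even count: (12 + 15) / 2 = 13.5
print(median([12, 15, 11, 180, 14]))  # extreme value present: still 14
```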

**Mode:**

The mode is the value that appears most frequently in a data set. There can be more than one mode if multiple values have the highest frequency. The mode is useful for categorical data where we want to know the most common category.
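One way to find the mode, including the multi-modal case, is to count occurrences and keep every value that reaches the highest count. The sketch below uses hypothetical categorical data (shipping methods) and a helper named `modes` introduced only for this example.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]


# Hypothetical categorical data, for illustration only.
shipments = ["rail", "road", "road", "sea", "rail"]
print(modes(shipments))           # two modes: ['rail', 'road']
print(modes([3, 5, 5, 7, 5, 3]))  # single mode: [5]
```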

### Measures of Dispersion

**Range:**

The range is the difference between the maximum and minimum values in a data set. It gives a crude measure of variability but does not provide information about the distribution of values within the range.
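The short sketch below, using two made-up data sets, illustrates why the range is only a crude measure: both sets have the same range even though their values are spread very differently inside it.

```python
def value_range(values):
    """Difference between the largest and smallest value."""
    if not values:
        raise ValueError("range of an empty data set is undefined")
    return max(values) - min(values)


# Hypothetical data sets, for illustration only.
clustered = [10, 50, 50, 50, 90]  # most values sit in the middle
polarized = [10, 10, 50, 90, 90]  # most values sit at the extremes

print(value_range(clustered))  # 80
print(value_range(polarized))  # 80
```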

**Mean Deviation:**

Mean deviation is the average of the absolute differences between each data point and the mean. It indicates how much, on average, each data point deviates from the mean.
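The definition above can be sketched directly: compute the mean, take each absolute difference from it, and average those differences. The data and the function name `mean_deviation` are illustrative assumptions.

```python
def mean_deviation(values):
    """Average absolute difference between each data point and the mean."""
    if not values:
        raise ValueError("mean deviation of an empty data set is undefined")
    mu = sum(values) / len(values)
    return sum(abs(x - mu) for x in values) / len(values)


# Hypothetical data, for illustration only. Mean = 14.
data = [12, 15, 11, 18, 14]
# Absolute deviations: 2, 1, 3, 4, 0 -> sum 10 -> 10 / 5
print(mean_deviation(data))  # 2.0
```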

**Standard Deviation:**

Standard deviation measures the amount of variation or dispersion in a set of values; it is the square root of the variance. It is denoted \( \sigma \) for a population and \( s \) for a sample. The sample version replaces \( \mu \) with the sample mean and conventionally divides by \( n - 1 \) rather than \( N \). The population standard deviation is calculated using the formula:

\[ \sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}} \]

where \( \sigma \) is the standard deviation, \( x_i \) are the data points, \( \mu \) is the mean, and \( N \) is the number of data points.
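A minimal sketch of the population formula follows, together with the sample version \( s \). The \( n - 1 \) divisor used for \( s \) is the usual convention rather than something the formula above states, and the data set is hypothetical, chosen so the population result is a round number.

```python
import math


def population_std(values):
    """Population standard deviation: sqrt of the mean squared deviation from the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("standard deviation of an empty data set is undefined")
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


def sample_std(values):
    """Sample standard deviation: same sum of squares, divided by n - 1 (assumed convention)."""
    n = len(values)
    if n < 2:
        raise ValueError("sample standard deviation needs at least two data points")
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))


# Hypothetical data, for illustration only. Mean = 5, squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_std(data))  # sqrt(32 / 8) = 2.0
print(sample_std(data))      # sqrt(32 / 7), slightly larger than the population value
```

Python's standard `statistics` module provides `pstdev` and `stdev` for the same two quantities, which can serve as a cross-check.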

**Coefficient of Variation:**

The coefficient of variation (CV) is the ratio of the standard deviation to the mean, expressed as a percentage. It shows the extent of variability in relation to the mean of the population:

\[ CV = \frac{\sigma}{\mu} \times 100\% \]

It is useful for comparing the degree of variation from one data series to another, even if the means are drastically different.
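The comparison can be sketched with two made-up series whose means differ by a factor of 100. The series with the smaller standard deviation turns out to have the larger relative variability, which is exactly what the CV is meant to expose. The function name and data are assumptions for illustration.

```python
from statistics import fmean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation as a percentage of the mean."""
    mu = fmean(values)
    if mu == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return pstdev(values) / mu * 100


# Hypothetical series, for illustration only.
small_scale = [2, 4, 4, 4, 5, 5, 7, 9]      # mean 5, standard deviation 2
large_scale = [480, 490, 500, 510, 520]     # mean 500, standard deviation about 14.1

for name, series in [("small_scale", small_scale), ("large_scale", large_scale)]:
    print(f"{name}: sd = {pstdev(series):.2f}, CV = {coefficient_of_variation(series):.1f}%")
```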

### Other Statistical Measures

**Normal Distribution:**

In a normal distribution, the mean, median, and mode are all equal and located at the center of the distribution. The total area under the curve is 1.0, and it is symmetric about the mean. The distribution follows the empirical rule, where approximately 68% of the data falls within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations.
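The percentages in the empirical rule can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module (Python 3.8+); the helper name `share_within` and the example parameters are assumptions for illustration.

```python
from statistics import NormalDist


def share_within(k, mu=0.0, sigma=1.0):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")

# The result does not depend on the particular mean or standard deviation.
print(f"{share_within(1, mu=100, sigma=15):.4f}")
```

The three values come out close to 0.68, 0.95, and 0.997, matching the rule.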

### Importance of Mean and Standard Deviation

Understanding the mean and standard deviation of a data set is crucial because together they describe its central tendency and its dispersion. They show how much variation exists and how reliable the mean is as a representative value: the smaller the standard deviation relative to the mean, the more closely the data cluster around it.