# Overview

In specific data-mining applications such as clustering, it is essential to find how similar or dissimilar objects are to each other.


A **similarity measure** for two objects $(i,j)$ returns `1` if the objects are similar and `0` if they are dissimilar; when several attributes are compared, it takes a value between `0` and `1`.

A **dissimilarity measure** works the opposite way: it returns `1` if the objects are dissimilar and `0` if they are similar.


Similarity and dissimilarity measures also help with outlier detection: an object that is highly dissimilar to the others is flagged as a potential outlier. They can help spot redundant data as well, since highly similar objects may be duplicates.

Measures of similarity and dissimilarity are collectively referred to as measures of **proximity**.



Similarity can often be expressed as a function of dissimilarity, and vice versa.

*Similarity and dissimilarity measures can be calculated as:*
$$dis(i,j) = 1 - \frac{m}{p} = \frac{p-m}{p}$$
$$sim(i,j) = 1 - dis(i,j) = \frac{m}{p}$$

* $i,j$ identify the two objects being compared, i.e., the row and column of the **dissimilarity matrix**.
* $m$ is the number of matches, i.e., the number of attributes for which $i$ and $j$ are in the same state.
* $p$ is the total number of attributes.
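The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names `dissimilarity` and `similarity` and the two sample objects are hypothetical, and each object is assumed to be a non-empty tuple of attribute values compared by simple equality.

```python
def dissimilarity(obj_i, obj_j):
    """Return dis(i, j) = (p - m) / p for two equal-length attribute tuples."""
    if len(obj_i) != len(obj_j) or len(obj_i) == 0:
        raise ValueError("objects must have the same, non-zero number of attributes")
    p = len(obj_i)  # total number of attributes
    m = sum(a == b for a, b in zip(obj_i, obj_j))  # attributes in the same state
    return (p - m) / p


def similarity(obj_i, obj_j):
    """Return sim(i, j) = 1 - dis(i, j) = m / p."""
    return 1 - dissimilarity(obj_i, obj_j)


# Hypothetical objects with two attributes each (made up for illustration).
x = ("A", "Excellent")
y = ("B", "Excellent")

print(dissimilarity(x, y))  # 0.5: one match out of two attributes
print(similarity(x, y))     # 0.5
```

With a single attribute ($p = 1$), both functions can only return `0` or `1`, which is the case used in the rest of this article.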

        



## Example

Let's look at an example and try to find similarity and dissimilarity measures.

| **Obj Id** | **Grade** | **Progress** | **Numeric** |
| - | - | - | - |
| 1 | A | Excellent | 45 |
| 2 | B | Fair | 22 |
| 3 | C | Good | 64 |
| 4 | A | Excellent | 28 |

While constructing a *dissimilarity matrix*, we assign the value `1` to *dissimilar* objects and `0` to *similar* objects.
For a *similarity matrix*, the values are reversed: `1` for similar objects and `0` for dissimilar ones.

The proximity measure for the grade attribute is calculated below.

## Calculating proximity measures 
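Because only the Grade attribute is compared, $p = 1$, so each value is `0` when the two grades match and `1` otherwise. The *dissimilarity matrix* values are calculated as shown below, with the grades of objects $i$ and $j$ given in parentheses:
$$dis(2,1)=(B,A)=1$$
$$dis(3,1)=(C,A)=1$$
$$dis(3,2)=(C,B)=1$$
$$dis(4,1)=(A,A)=0$$
$$dis(4,2)=(A,B)=1$$
$$dis(4,3)=(A,C)=1$$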
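The pairwise values for the Grade attribute can also be generated programmatically. The sketch below is only an illustration: the names `grades`, `grade_dissimilarity`, and `dis_matrix` are hypothetical, and grades are assumed to be compared by plain equality.

```python
# Grade column of the example table, keyed by object id.
grades = {1: "A", 2: "B", 3: "C", 4: "A"}


def grade_dissimilarity(i, j):
    """Return 0 if objects i and j have the same grade, otherwise 1."""
    return 0 if grades[i] == grades[j] else 1


ids = sorted(grades)
# Full symmetric matrix; row and column k correspond to object id ids[k].
dis_matrix = [[grade_dissimilarity(i, j) for j in ids] for i in ids]

# Print the lower triangle (including the zero diagonal), one row per object.
for k, row in enumerate(dis_matrix):
    print(row[: k + 1])
# [0]
# [1, 0]
# [1, 1, 0]
# [0, 1, 1, 0]
```

Only the lower triangle is printed because the matrix is symmetric and every object has dissimilarity `0` with itself.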
  

The *similarity matrix*  values for this are shown below:
$$sim(2,1)=1-dis(2,1) =0$$
$$sim(3,1)=1-dis(3,1)=0$$
$$sim(3,2)=1-dis(3,2) =0$$
$$sim(4,1)=1-dis(4,1) =1$$
$$sim(4,2)=1-dis(4,2) =0$$
$$sim(4,3)=1-dis(4,3) =0$$
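As a small check of the relation $sim(i,j) = 1 - dis(i,j)$, the following sketch derives the similarity values from the dissimilarity values of the Grade attribute. The dictionary layout and the names `dis`, `sim`, and `most_similar` are assumptions made for illustration.

```python
# Pairwise dissimilarities for the Grade attribute, keyed by (i, j).
dis = {(2, 1): 1, (3, 1): 1, (3, 2): 1, (4, 1): 0, (4, 2): 1, (4, 3): 1}

# Similarity is the complement of dissimilarity.
sim = {pair: 1 - d for pair, d in dis.items()}

# Pairs of objects that share the same grade.
most_similar = [pair for pair, s in sim.items() if s == 1]
print(most_similar)  # [(4, 1)]
```

Objects 4 and 1 are the only pair with the same grade, so they are the only pair with a similarity of `1`.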






