šŸ“‰

Cosine Affinity

Graph machine learning is a method in which the distance and geometry of nodes play an important role in filling information gaps with machine learning. Cosine affinity, which is derived from the dot product, is an essential way of understanding the geometric space the vectors inhabit.

Eater graphs

During my time at Uber, I worked on building two keystone products: personalization of the home feed and relevance of search results. When we added restaurant dishes, cuisine categories, or any other bits of information, the question became how one dish or cuisine relates to another. This was essential for our recommendation algorithm ā€” we would create and deploy different models that would predict things like: what dish you would like from a given location, or what cuisine you would like that is adjacent to the one you have searched for. The notion of ā€œsimilarityā€ was something we needed to grasp. Enter: graph machine learning.

The organization I worked for was called the ā€œEater teamā€, where we focused on the consumer side of the product (the app you use to place orders). Our team created something called an Eater Graph: picture an incredibly large network, say of cuisines, with each node placed a certain distance from the others. Over time, as the system learned more about each cuisine, it adjusted the distances between them. The proximity of one cuisine to another indicated how similar the cuisines were; for example, the ā€œAfghanā€ cuisine node would sit closer to the ā€œPersianā€ cuisine node than to ā€œKoreanā€, while ā€œKoreanā€ would sit closer to ā€œJapaneseā€ than to ā€œRussianā€.

This notion of similarity is what is known as cosine affinity, or cosine similarity, and it is how we determine how similar one node is to another. The graph learned from the cuisine preferences of Eaters: its efficacy rested on how often someone who orders ā€œEthiopianā€ cuisine also orders another given cuisine. That shouldnā€™t come as a surprise, because machine learning is all about generalizing knowledge from information provided in large, aggregate volumes.

Cosine affinity & distance

One can derive the cosine affinity from the Euclidean dot product formula, which relates two vectors \textbf{A} and \textbf{B} to the angle \theta between them:

\textbf{A} \cdot \textbf{B} = \|\textbf{A}\| \, \|\textbf{B}\| \cos\theta

Rearranging, the cosine similarity is the dot product divided by the product of the Euclidean norms:

\cos\theta = \frac{\textbf{A} \cdot \textbf{B}}{\|\textbf{A}\| \, \|\textbf{B}\|}

This is the quantity we use to determine the missing edge between the two nodes. The value ranges from -1 to 1: a value of -1 means the two vectors point in exactly opposite directions, 0 means they are orthogonal (no correlation between the vectors), and 1 means they point in the same direction.
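To make this concrete, here is a minimal sketch in Python using numpy; the vectors are arbitrary examples chosen for illustration, not values from any real system:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Dot product divided by the product of the Euclidean norms.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])     # same direction as a
c = np.array([-1.0, -2.0, -3.0])  # opposite direction to a

print(cosine_similarity(a, b))  # ā‰ˆ 1.0  -> same direction
print(cosine_similarity(a, c))  # ā‰ˆ -1.0 -> opposite directions
```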

While cosine affinity measures how similar two vectors are to each other, cosine distance measures how far apart and different they are. This comes in handy in machine learning as a loss function, should one choose to use it that way. To derive the cosine distance, we use the formula:

\text{cosine distance} = 1 - \text{cosine similarity}

The cosine distance ranges from 0 to 2: 0 means the two vectors point in the same direction, 1 means there is no correlation between them, and 2 means they point in exactly opposite directions.
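Continuing the sketch above (it reuses the cosine_similarity function and numpy import from the earlier block), cosine distance is just one minus the similarity; the example vectors are again arbitrary:

```python
def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    # Ranges from 0 (same direction) through 1 (orthogonal) to 2 (opposite).
    return 1.0 - cosine_similarity(a, b)

x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])   # orthogonal to x
z = np.array([-1.0, 0.0])  # opposite to x

print(cosine_distance(x, x))  # 0.0 -> same direction
print(cosine_distance(x, y))  # 1.0 -> orthogonal
print(cosine_distance(x, z))  # 2.0 -> opposite directions
```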

With this simple concept, you can begin to develop and maintain an ongoing graph that uncovers the ontology of, and relationships between, the entities you are looking to represent. Along the way you may discover that certain entities are more similar to each other than one might think, which makes cosine affinity a handy tool for recommendation algorithms.
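As a final illustration, consider a handful of made-up cuisine embeddings; the numbers below are invented purely for this sketch and bear no relation to Uber's actual Eater Graph. Ranking neighbors by cosine similarity recovers the kind of proximity described earlier:

```python
# Hypothetical 3-dimensional cuisine embeddings, invented for illustration.
cuisines = {
    "Afghan":   np.array([0.9, 0.8, 0.1]),
    "Persian":  np.array([0.8, 0.9, 0.2]),
    "Korean":   np.array([0.1, 0.3, 0.9]),
    "Japanese": np.array([0.2, 0.2, 0.8]),
}

def nearest(name):
    # Rank every other cuisine by cosine similarity to the query cuisine.
    query = cuisines[name]
    scores = [(other, cosine_similarity(query, vec))
              for other, vec in cuisines.items() if other != name]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

print(nearest("Afghan"))  # Persian ranks first with these toy vectors
```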