Embeddings

Microsoft.Extensions.AI

I am a recent graduate at the beginning of my software development career. I enjoy documenting my learnings through my blogs.

Embeddings are a way to represent information, such as text or images, as a list of numbers known as a vector. The vector captures the concepts within the input rather than the raw characters or pixels. What the numbers represent depends on the kind of input:

Words - for a word, the vector might represent its meaning or its relationship to other words

Images - for an image, the vector might represent colours, shapes or patterns

I have mainly used embeddings for search, for example searching indexes or text. This is a good use case in the modern world of AI because the relatedness of two inputs can be measured by the distance between their vectors, which lets you match a query to the most relevant documents. A small distance suggests the inputs are closely related, whereas a large distance suggests they are not.

Words like “cat”, “dog” and “rabbit” will be close together because they are all animals, just as “car”, “truck” and “bus” will be close together because they are all vehicles. By contrast, “truck”, “apple” and “mouse” would be far apart because they are not related to each other.
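To make the idea of "close together" concrete, here is a minimal sketch of two common ways to compare vectors: Euclidean distance and cosine similarity. The three-dimensional vectors are made up for illustration (real embeddings come from a model and have many more dimensions), and the `VectorMath` helper is a hypothetical name, not part of any library.

```csharp
using System;

// Made-up 3-dimensional vectors, for illustration only.
float[] cat   = { 0.90f, 0.80f, 0.10f };
float[] dog   = { 0.85f, 0.75f, 0.20f };
float[] truck = { 0.10f, 0.20f, 0.95f };

Console.WriteLine($"cat vs dog:   distance {VectorMath.EuclideanDistance(cat, dog):F3}, similarity {VectorMath.CosineSimilarity(cat, dog):F3}");
Console.WriteLine($"cat vs truck: distance {VectorMath.EuclideanDistance(cat, truck):F3}, similarity {VectorMath.CosineSimilarity(cat, truck):F3}");

// Hypothetical helper class written for this example.
static class VectorMath
{
    // Straight-line distance between two vectors: smaller means more related.
    public static double EuclideanDistance(float[] a, float[] b)
    {
        RequireSameLength(a, b);
        double sum = 0;
        for (int i = 0; i < a.Length; i++)
        {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.Sqrt(sum);
    }

    // Cosine of the angle between two vectors: closer to 1 means more related.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        RequireSameLength(a, b);
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    static void RequireSameLength(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same number of dimensions.");
    }
}
```

With these invented values, "cat" and "dog" end up with a small distance and a similarity close to 1, while "cat" and "truck" are much further apart.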

Other use cases may be:

  1. Clustering - grouping text strings by their similarity

  2. Recommendations - items can be suggested based on text strings that are closely related

  3. Classification - assigning text strings to their most related label
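As a rough sketch of the classification use case, the snippet below assigns an item to whichever label vector it is most similar to. All of the vectors are invented for illustration, and `NearestLabel` is a hypothetical helper; in practice both the labels and the items would be embedded by the same model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Invented label vectors, for illustration only.
var labels = new Dictionary<string, float[]>
{
    ["animal"]  = new[] { 0.9f, 0.1f, 0.1f },
    ["vehicle"] = new[] { 0.1f, 0.9f, 0.1f },
    ["food"]    = new[] { 0.1f, 0.1f, 0.9f },
};

float[] rabbit = { 0.80f, 0.20f, 0.30f };  // invented vector standing in for "rabbit"
float[] bus    = { 0.20f, 0.85f, 0.10f };  // invented vector standing in for "bus"

Console.WriteLine($"rabbit -> {NearestLabel.Classify(rabbit, labels)}");  // rabbit -> animal
Console.WriteLine($"bus -> {NearestLabel.Classify(bus, labels)}");        // bus -> vehicle

// Hypothetical helper class written for this example.
static class NearestLabel
{
    // Returns the label whose vector has the highest cosine similarity to the item.
    public static string Classify(float[] item, IReadOnlyDictionary<string, float[]> labels) =>
        labels.OrderByDescending(label => Cosine(item, label.Value)).First().Key;

    static double Cosine(float[] a, float[] b)
    {
        double dot = a.Zip(b, (x, y) => (double)x * y).Sum();
        double normA = Math.Sqrt(a.Sum(x => (double)x * x));
        double normB = Math.Sqrt(b.Sum(x => (double)x * x));
        return dot / (normA * normB);
    }
}
```

Clustering and recommendations build on the same comparison: instead of picking one best label, you group or suggest items whose vectors are close to each other.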

The numbers in an embedding look similar to this:

 0.009120221     -0.011264438    -0.009196174    0.0696432       0.048002396     
 0.0050245933    -0.044683825    0.00040386632   -0.013437866    0.007174652     
 0.023650644     -0.008611919    0.022260118     0.027015954    0.0236857        
 0.014991985     -0.025613742    -0.033325907    -0.014162343    0.041832663     
 0.0064793886    0.03145629      0.01658116      -0.022937853    0.026712142     
 0.028324686     -0.0040342812   0.026782252     0.045478415    -0.012327782

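If you query an HR handbook, for example, you should get back the entries most relevant to your question. Below is a snippet showing the candidate texts that were embedded, followed by a query and the closest matches it returned. The number in brackets is a similarity score, so a higher value means a closer match.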

var candidates = new string[]
{
    "Onboarding process for new employees", "Understanding Our Company Values",
    "Navigating the Office Layout", "Accessing Car Park E", "Dress Code Guidelines",
    "Using the Company Intranet", "Employee Benefits Overview", "Requesting Time Off",
    "Reporting Workplace Incidents", "Office Etiquette and Conduct"
};

Query: company values

(0.6882966): Understanding Our Company Values
(0.38679683): Employee Benefits Overview
(0.31320179): Using the Company Intranet
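The ranking step behind output like this can be sketched as follows. This is not the code from the referenced samples: the vectors are invented three-dimensional stand-ins (a real run would get them from an embedding model), `Ranker` is a hypothetical helper, and cosine similarity is assumed as the scoring function. The scores it prints will therefore differ from the ones above.

```csharp
using System;
using System.Linq;

// Invented vectors standing in for what an embedding model would return.
var candidates = new (string Text, float[] Vector)[]
{
    ("Understanding Our Company Values", new[] { 0.9f, 0.3f, 0.1f }),
    ("Employee Benefits Overview",       new[] { 0.5f, 0.8f, 0.2f }),
    ("Using the Company Intranet",       new[] { 0.4f, 0.2f, 0.9f }),
    ("Accessing Car Park E",             new[] { 0.0f, 0.1f, 1.0f }),
};
float[] query = { 1.0f, 0.2f, 0.1f };  // stands in for the query "company values"

// Print the three closest candidates in the same "(score): text" format.
foreach (var (score, text) in Ranker.TopMatches(query, candidates, 3))
    Console.WriteLine($"({score}): {text}");

// Hypothetical helper class written for this example.
static class Ranker
{
    // Scores every candidate against the query and keeps the best `count`.
    public static (float Score, string Text)[] TopMatches(
        float[] query, (string Text, float[] Vector)[] candidates, int count) =>
        candidates
            .Select(c => (Score: CosineSimilarity(query, c.Vector), c.Text))
            .OrderByDescending(r => r.Score)
            .Take(count)
            .ToArray();

    static float CosineSimilarity(float[] a, float[] b)
    {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }
        return (float)(dot / (Math.Sqrt(normA) * Math.Sqrt(normB)));
    }
}
```

Sorting by descending score and taking the top few is what turns a set of raw similarity values into a short list of search results.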

References

.NET Conf 2024 - Day 1 https://www.youtube.com/watch?v=hM4ifrqF_lQ&t=7268s

The code samples I reference can be found here: https://github.com/rachkeenan/samples-microsoft-extensions-ai