From Text to Intelligence: My Fresno DevFest Keynote on Embeddings

By Antonio Perez

At **Fresno DevFest**, I had the opportunity to present a keynote titled “Leveraging LLMs for Search: Exploring Embeddings.” In this talk, I walked through how embeddings—one of the most powerful tools behind LLMs—enable smarter, more meaningful interactions with data.

🤖 What Are Embeddings?

Embeddings show us what the model "sees" in a piece of data. By converting text (or even images) into arrays of numbers, they help LLMs understand relationships, meaning, and context.

These vector representations allow us to:

  • Perform semantic search (sketched after this list)
  • Group similar content via clustering
  • Power intelligent classification
  • Provide focused context to LLMs without full fine-tuning
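
Here's what the first of those looks like in practice. This is my own toy sketch (not code from the talk), using the sentence-transformers library and a made-up two-document corpus: embed the documents once, embed the query, and rank by cosine similarity.

```python
# Toy semantic-search sketch: rank documents by cosine similarity to a query.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model works

docs = [
    "Pikachu is an electric mouse Pokemon",
    "Charmander is a fire-type lizard Pokemon",
]
doc_vecs = model.encode(docs)                 # one vector per document
query_vec = model.encode("electric rodent")   # same model, same vector space

# Compare the query vector against every document vector.
scores = util.cos_sim(query_vec, doc_vecs)    # shape (1, 2)
print(docs[int(scores.argmax())])             # -> the Pikachu sentence
```

No keyword overlaps between "electric rodent" and the winning document; the match comes entirely from meaning.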

🧮 “King” – “Man” + “Woman” = “Queen” is a famous example of how relationships are preserved in vector space.

🔗 Try word2vec live: https://turbomaze.github.io/word2vecjson/
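
You can reproduce that arithmetic in a few lines. This sketch uses gensim with pretrained GloVe vectors (my choice for illustration; the demo above uses its own word2vec vectors):

```python
# "king" - "man" + "woman" ~= "queen", checked against real word vectors.
import gensim.downloader

vectors = gensim.downloader.load("glove-wiki-gigaword-50")  # downloads once

# most_similar adds the "positive" vectors, subtracts the "negative" ones,
# then returns the nearest words by cosine similarity.
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
# "queen" should appear at or near the top of the results.
```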

🛠 Creating Embeddings: APIs, OSS, and LangChain

We explored multiple paths for generating embeddings:

  • OpenAI Embedding API — Simple but vulnerable to deprecations or API changes
  • Llama 2 + Hugging Face — Open-source, hostable, customizable
  • LangChain — A powerful abstraction layer for connecting and switching between models

LangChain simplifies embedding workflows by offering a consistent interface, while Hugging Face provides plug-and-play flexibility with models like CodeLlama and word2vec.
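
Here's a sketch of what that consistent interface buys you. The package names assume a recent LangChain release (older versions import these classes from langchain.embeddings), and the model names are just examples:

```python
# Swapping embedding backends behind LangChain's common interface.
from langchain_openai import OpenAIEmbeddings
from langchain_huggingface import HuggingFaceEmbeddings

embedders = [
    OpenAIEmbeddings(model="text-embedding-3-small"),  # hosted API
    HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2"),  # local OSS
]

# The calling code never changes, whichever backend is plugged in.
for embedder in embedders:
    vector = embedder.embed_query("electric rodent")
    print(type(embedder).__name__, len(vector))  # dimensionality differs per model
```

That interchangeability is exactly the hedge against the API-deprecation risk noted above: if a hosted embedding model disappears, you re-embed with another backend and leave the rest of the pipeline alone.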

My personal dive into embeddings began with a simple semantic search app—a Pokedex. Rather than keyword-based querying, we can now ask things like "electric rodent" and get Pikachu back thanks to vector-based matching.

We used:

  • 🐘 pgvector with Postgres
  • 🔎 Elasticsearch for full-text + vector hybrid search

SQL-style similarity query (pgvector's `<=>` operator computes cosine distance, so subtracting it from 1 yields cosine similarity):

```sql
-- cosine similarity between each row's embedding and the literal vector [3,1,2]
SELECT 1 - (embedding <=> '[3,1,2]') AS cosine_similarity FROM items;
```
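
To turn that into an "electric rodent" lookup, the app embeds the query text and lets Postgres order rows by vector distance. A rough sketch with psycopg 3 (the table, columns, and connection string are illustrative, not the Pokedex's actual schema):

```python
# Hypothetical Pokedex-style nearest-neighbor lookup: psycopg 3 + pgvector.
import psycopg
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model would do
query_vec = model.encode("electric rodent").tolist()

with psycopg.connect("dbname=pokedex") as conn:  # connection string is illustrative
    rows = conn.execute(
        """
        SELECT name, 1 - (embedding <=> %s::vector) AS cosine_similarity
        FROM pokemon
        ORDER BY embedding <=> %s::vector
        LIMIT 5
        """,
        (str(query_vec), str(query_vec)),
    ).fetchall()

print(rows)  # nearest matches first -- ideally Pikachu at the top
```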

👉 Check out the live demo: https://pokedex-seven-sigma.vercel.app

🔮 What's Next?

Exciting improvements are on the horizon:

  • OpenAI's CLIP — Embed both images and text in the same vector space (see the sketch below)
  • In-browser embeddings — No API required: CLIP Demo
  • RAG Pipelines — Use embeddings to inject relevant context into LLM queries
  • Smaller, local models — Embeddings and chat agents on your laptop

This opens up creative possibilities across search, personalization, and AI workflows—especially for those working with private data.
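
As a taste of the CLIP item above, here's a sketch using the public openai/clip-vit-base-patch32 checkpoint via Hugging Face transformers (the image file name is a placeholder):

```python
# Embedding an image and a caption into CLIP's shared vector space.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("pikachu.png")  # placeholder file name
inputs = processor(text=["electric rodent"], images=image,
                   return_tensors="pt", padding=True)
outputs = model(**inputs)

# Text and image land in the same space, so cosine similarity across
# modalities is meaningful -- the basis for text-to-image search.
print(outputs.text_embeds.shape, outputs.image_embeds.shape)  # (1, 512) each
```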

🧠 Fine-Tuning vs Context

Embedding-powered retrieval enables a context-first approach. Instead of fine-tuning large models, we can (see the sketch after this list):

  • Narrow the context
  • Inject relevant, up-to-date info
  • Keep cost and tokens under control
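
A compact sketch of that pattern, using the official openai client. The `retrieve` helper stands in for a vector lookup like the pgvector query earlier; all names here are illustrative:

```python
# Context-first answering: retrieve narrow context via embeddings, then
# inject it into the prompt instead of fine-tuning.
from openai import OpenAI

client = OpenAI()

def answer(question: str, retrieve) -> str:
    # 1. Narrow the context: fetch only the most relevant snippets.
    snippets = retrieve(question, top_k=3)  # hypothetical vector-search helper
    context = "\n".join(snippets)
    # 2. Inject relevant, up-to-date info; 3. only these few snippets
    # are sent, keeping cost and tokens under control.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # model name is illustrative
        messages=[
            {"role": "system", "content": f"Answer using only this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content
```

The LLM never sees the whole corpus, only the handful of rows the embedding lookup judged relevant, which is what keeps this approach cheap and current.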

The future of search and AI is contextual, efficient, and deeply personal. Embeddings help make that future possible—and accessible.