Spectral vector search that augments nearest-neighbour search with graph Laplacian features. Computes a Laplacian over the item graph and uses the Rayleigh quotient to produce a λτ (lambda-tau) score per item, enabling search that respects both semantic similarity and structural role.
pip install arrowspace
from arrowspace import ArrowSpaceBuilder
import numpy as np
Pass an (N, d) float64 NumPy array of embedding vectors:
items = np.array([[0.1, 0.2, 0.3],
                  [0.0, 0.5, 0.1],
                  [0.9, 0.1, 0.0]], dtype=np.float64)
graph_params = {"eps": 0.2, "k": 6, "topk": 3, "p": 2.0, "sigma": 1.0}
builder = ArrowSpaceBuilder(items, graph_params=graph_params)
aspace = builder.build()
lambdas = aspace.lambdas() # array indexed by insertion order
sorted_res = aspace.lambdas_sorted() # (score, index) pairs ascending
Higher λτ values indicate items that are both semantically close and structurally central.
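For intuition about the Rayleigh quotient mentioned in the description, here is a minimal NumPy sketch of the textbook definition, R(x) = (xᵀ L x) / (xᵀ x), on a three-node path graph. It is not the library's internal λτ computation, which may add further terms or normalisation. The helper names `graph_laplacian` and `rayleigh_quotient` are made up for this illustration.

```python
import numpy as np

def graph_laplacian(adjacency: np.ndarray) -> np.ndarray:
    """Unnormalised Laplacian L = D - W of a symmetric weighted graph."""
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency

def rayleigh_quotient(laplacian: np.ndarray, x: np.ndarray) -> float:
    """R(x) = (x^T L x) / (x^T x); small when x varies little across edges."""
    return float(x @ laplacian @ x) / float(x @ x)

# Path graph 0 - 1 - 2 with unit edge weights.
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
L = graph_laplacian(W)

smooth = np.array([1.0, 1.0, 1.0])   # constant over the graph
rough = np.array([1.0, -1.0, 1.0])   # flips sign across every edge

r_smooth = rayleigh_quotient(L, smooth)  # 0.0: no variation along any edge
r_rough = rayleigh_quotient(L, rough)    # 8/3: each edge contributes (1 - (-1))^2 = 4
```

The quotient is zero for a signal that is constant over a connected graph and grows as the signal changes more sharply between neighbours.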
items = np.random.randn(100, 64).astype(np.float64)
builder = ArrowSpaceBuilder(items, graph_params={"eps": 0.5, "k": 10, "topk": 5, "p": 2.0, "sigma": None})
aspace = builder.build()
scores = aspace.lambdas()
top_indices = np.argsort(scores)[-5:]  # indices of the five highest scores, in ascending score order
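from sklearn.metrics.pairwise import cosine_similarity
cos_sim = cosine_similarity(items)  # (N, N) pairwise cosine similarity matrix
cosine_order = np.argsort(cos_sim[0])[::-1]  # items ranked by similarity to item 0 (includes item 0 itself)
spectral_order = np.argsort(aspace.lambdas())[::-1]  # items ranked by λτ score, highest first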
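One way to quantify how far the two orderings diverge is the overlap of their top-k sets. The sketch below uses plain NumPy; `topk_overlap` is a hypothetical helper, not part of arrowspace, and the stand-in arrays only illustrate the call. Note that `cosine_order` ranks items relative to item 0 while `spectral_order` is a global ranking, so a low overlap between them is expected rather than a sign of a problem.

```python
import numpy as np

def topk_overlap(order_a, order_b, k):
    """Fraction of the first k entries shared by two rankings (1.0 = same set)."""
    top_a = set(np.asarray(order_a)[:k].tolist())
    top_b = set(np.asarray(order_b)[:k].tolist())
    return len(top_a & top_b) / k

# Stand-in rankings; in practice pass cosine_order and spectral_order.
example_cosine = np.array([0, 3, 1, 4, 2])
example_spectral = np.array([3, 0, 4, 2, 1])

overlap_top2 = topk_overlap(example_cosine, example_spectral, k=2)
overlap_top3 = topk_overlap(example_cosine, example_spectral, k=3)
```

The measure ignores order within the top k, so it only reports whether the same items surface, not whether they are ranked identically.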
Problem: eps is too small, producing a disconnected graph.
Solution: Increase eps, or set it proportional to 1/sqrt(embedding_dim).

Problem: k is too large, producing a dense graph with washed-out spectral features.
Solution: Keep k ≤ 25 for most datasets.
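To check whether a given eps is likely to leave the graph disconnected, one option is to count the connected components of a simple eps-neighbourhood graph before building. The sketch below assumes Euclidean distance and a plain threshold; the metric and graph construction arrowspace uses internally may differ, so treat the result as a rough diagnostic only. `suggest_eps` and `count_components` are hypothetical helpers, and `scale` is an illustrative tuning constant.

```python
import numpy as np

def suggest_eps(embedding_dim: int, scale: float = 1.0) -> float:
    """Hypothetical helper: eps = scale / sqrt(embedding_dim)."""
    return float(scale / np.sqrt(embedding_dim))

def count_components(points: np.ndarray, eps: float) -> int:
    """Connected components of the graph linking points at distance <= eps.

    Assumption: Euclidean distance; the library's own metric may differ.
    """
    n = len(points)
    diffs = points[:, None, :] - points[None, :, :]
    adjacent = np.linalg.norm(diffs, axis=-1) <= eps
    seen = np.zeros(n, dtype=bool)
    components = 0
    for start in range(n):
        if seen[start]:
            continue
        components += 1
        stack = [start]
        seen[start] = True
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            for nbr in np.flatnonzero(adjacent[node] & ~seen):
                seen[nbr] = True
                stack.append(int(nbr))
    return components

# Two tight pairs of points, far apart from each other.
points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [5.1, 0.0]])
isolated = count_components(points, eps=0.05)    # no pair is within 0.05
two_groups = count_components(points, eps=0.2)   # pairs link, groups do not
connected = count_components(points, eps=6.0)    # every point links
```

A component count above 1 suggests eps is too small for the data. The pairwise distance matrix is O(N²) in memory, so run the check on a sample for large collections.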
vector-database-engineer: General vector database expertise
embedding-strategies: Embedding model selection and chunking
similarity-search-patterns: Semantic search implementation patterns
hybrid-search-implementation: Combined semantic + keyword search