Vector-Search

3 articles tagged with "Vector-Search"

Graph Meets Vector

Knowledge Arbitrage’s architecture for combining graph relationships with semantic search.

The Filesystem Problem

Traditional filesystems give you hierarchy. But knowledge doesn’t work that way — a document might belong to multiple projects, reference concepts across folders, and have relationships that evolve over time.

Graph databases model these relationships naturally. But pure graph traversal can’t answer “show me documents similar to this one.”

The Hybrid Approach

Knowledge Arbitrage combines two databases:

Knowledge Arbitrage

A graph-based filesystem that combines Neo4j’s relationship modeling with LanceDB’s vector search capabilities. Files and directories exist as interconnected nodes, while content is chunked and embedded for semantic querying.

Features

  • Graph-Based Storage — Files and directories stored as Neo4j nodes with CONTAINS relationships
  • Hierarchical Structure — Traditional filesystem operations (create, read, write, move, delete)
  • Semantic Search — LanceDB stores embeddings for content-based querying
  • Hybrid Queries — Combine graph traversal with vector similarity
  • Text Optimization — Cleans HTML, normalizes formatting, semantic chunking
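The text-optimization step can be sketched with the standard library alone. This is not the project's actual pipeline (which uses chonkie for semantic chunking); it is a minimal stand-in showing the same three stages: strip HTML, normalize whitespace, and pack sentences into size-bounded chunks. All function names here are illustrative.

```python
import re
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects text content from HTML, ignoring the tags themselves."""
    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def clean_html(raw: str) -> str:
    """Strip tags and collapse runs of whitespace into single spaces."""
    parser = TextExtractor()
    parser.feed(raw)
    return re.sub(r"\s+", " ", " ".join(parser.parts)).strip()

def chunk_sentences(text: str, max_chars: int = 200) -> list[str]:
    """Greedy sentence packing: a crude stand-in for semantic chunking."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

A semantic chunker would split on topic shifts rather than a character budget, but the interface (text in, chunk list out) is the same.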

Technical Highlights

  • Neo4j for graph structure and relationship queries
  • LanceDB for vector storage and similarity search
  • Cohere embed-english-v3.0 for embeddings
  • chonkie for semantic text chunking
  • FastAPI web framework

Tech Stack

Component          Technology
Graph Database     Neo4j 5.0+
Vector Database    LanceDB
Embeddings         Cohere
Text Chunking      chonkie
Web Framework      FastAPI

Architecture

Filesystem (Neo4j)          Embeddings (LanceDB)
├── /projects              → chunk_1 → [embedding]
│   └── /notes             → chunk_2 → [embedding]
└── /docs                  → chunk_3 → [embedding]

Query: "show me files about AI projects"
→ Vector search in LanceDB
→ Returns matching chunks with file paths from Neo4j
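The query flow above can be mocked in a few lines of plain Python. In the real system the embeddings live in LanceDB and the file nodes in Neo4j; here both stores are dicts, and the chunk IDs, paths, and vectors are invented for illustration.

```python
import math

# Toy stand-ins for the two stores: LanceDB (chunk embeddings) and
# Neo4j (chunk -> file-node mapping) become plain dicts.
CHUNK_EMBEDDINGS = {
    "chunk_1": [0.9, 0.1, 0.0],
    "chunk_2": [0.1, 0.9, 0.0],
    "chunk_3": [0.0, 0.2, 0.9],
}
CHUNK_TO_PATH = {
    "chunk_1": "/projects/ai-roadmap.md",
    "chunk_2": "/projects/notes/meeting.md",
    "chunk_3": "/docs/manual.md",
}

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_query(query_embedding: list[float], top_k: int = 2):
    """Rank chunks by similarity, then resolve each hit to its file path."""
    scored = sorted(
        CHUNK_EMBEDDINGS.items(),
        key=lambda item: cosine(query_embedding, item[1]),
        reverse=True,
    )
    return [(CHUNK_TO_PATH[cid], round(cosine(query_embedding, emb), 3))
            for cid, emb in scored[:top_k]]
```

The payoff of the hybrid design is the last step: a pure vector store would return chunks, while the graph side turns them back into filesystem locations (and could just as well walk CONTAINS relationships to return the enclosing directory).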

Dynamic RAG

A Retrieval-Augmented Generation system with a multi-agent architecture for intelligent question-answering over custom knowledge bases.

Features

  • Multi-Agent Architecture — Specialized agents working together:
    • QA Agent — Main agent for answering questions using the knowledge base
    • Personal Agent — For personal/general questions not requiring RAG
    • Validation Agent — Classifies queries as on-context, personal, bad, or unknown
    • Prompt Agent — Generates diverse query perspectives for comprehensive retrieval
    • Reranking Agent — Reorders retrieved chunks for better relevance
  • Hybrid Search — Combines FAISS vector similarity with keyword matching
  • Streaming Responses — Token-by-token streaming of LLM outputs
  • Multi-language Support — Uses multilingual embeddings via Cohere
  • Document Processing — PDF parsing with OCR support via Mistral
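The hybrid-search idea — blending vector similarity with keyword matching — can be sketched without FAISS. This is a simplified stand-in, not the project's implementation: a weighted sum of cosine similarity and query-term overlap, with the `alpha` weight and function names chosen here for illustration.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def keyword_score(query: str, text: str) -> float:
    """Fraction of query terms that appear in the chunk."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0

def hybrid_rank(query: str, query_emb: list[float], chunks, alpha: float = 0.7):
    """Blend vector similarity with keyword overlap; alpha weights the vector side."""
    scored = [
        (alpha * cosine(query_emb, emb) + (1 - alpha) * keyword_score(query, text),
         text)
        for text, emb in chunks
    ]
    return [text for _, text in sorted(scored, key=lambda s: s[0], reverse=True)]
```

Keyword overlap catches exact terms (names, error codes) that embeddings can blur, while the vector side catches paraphrases; blending the two is the standard rationale for hybrid search.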

Technical Highlights

  • LangChain + LangGraph for orchestrating the RAG pipeline
  • FAISS vector store for similarity search
  • Cohere embeddings for multilingual support
  • SQLite for session persistence and analytics
  • Configurable agents via JSON configuration files

Tech Stack

Component          Technology
Web Framework      FastAPI + Uvicorn
AI Integration     Together AI, LangChain, LangGraph
Vector Store       FAISS
Embeddings         Cohere
Database           SQLite + SQLAlchemy 2.0

Architecture

ValidationAgent → classifies query type
       ↓
PromptAgent → generates specialized prompts
       ↓
QA Agent → retrieves chunks, reranks, generates answer
       ↓
StreamingResponse
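The routing logic of the pipeline above can be sketched as plain functions. The real system orchestrates LLM-backed agents with LangGraph; here each "agent" is an ordinary function, classification is keyword-based, the "bad" query class is omitted for brevity, and the topic list and return strings are invented for illustration.

```python
# Hypothetical topic vocabulary standing in for the knowledge base.
KNOWLEDGE_BASE_TOPICS = {"rag", "retrieval", "embedding", "vector"}

def validation_agent(query: str) -> str:
    """Classify the query: on-context, personal, or unknown ('bad' omitted here)."""
    words = set(query.lower().split())
    if words & KNOWLEDGE_BASE_TOPICS:
        return "on-context"
    if words & {"you", "your", "hello", "hi"}:
        return "personal"
    return "unknown"

def prompt_agent(query: str) -> list[str]:
    """Generate alternative phrasings to broaden retrieval."""
    return [query, f"Explain: {query}", f"Summarize: {query}"]

def qa_agent(prompts: list[str]) -> str:
    """Would retrieve, rerank, and generate; here it just reports its inputs."""
    return f"(answer composed from {len(prompts)} retrieval prompts)"

def personal_agent(query: str) -> str:
    """Handles personal/general questions without touching the knowledge base."""
    return "(answered directly, no retrieval)"

def run_pipeline(query: str) -> str:
    route = validation_agent(query)
    if route == "on-context":
        return qa_agent(prompt_agent(query))
    if route == "personal":
        return personal_agent(query)
    return "Sorry, I can't help with that."
```

The point of routing first is cost and quality: personal chit-chat skips retrieval entirely, while on-context questions fan out into multiple prompts before the QA agent reranks and answers.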