LLM/RAG: Knowledge Graphs, Multi-Agents, Ultrafast Fine-tuning, No Latency
2 min read · Jun 21, 2024
In this presentation, I explain all the components of a groundbreaking architecture with applications to local, enterprise LLMs and high-quality search, targeted at advanced users and busy professionals. The main features include:
- Multi-LLM architecture with 2,000 small, specialized sub-LLMs covering the full spectrum of human knowledge.
- LLM router as the top layer, deciding which sub-LLM to call, or letting the user choose (see the router sketch after this list).
- Smart crawling to recover taxonomies and knowledge graphs embedded in carefully selected, high-quality input sources.
- Augmentation with synonym and abbreviation maps built from glossaries, indexes, and similar sources (a small example follows the list).
- Ultrafast: no neural network; instead, parametric weights governed by a few explainable parameters, rather than billions of optimized weights as in deep neural networks.
- Customizable relevancy scores attached to each item returned to the user (URLs, related concepts, tables, and so on), helping the user decide what to explore next.
- Self-tuning (global, or local to a sub-LLM) based on favorite hyperparameters chosen by users and on customized results. Local self-tuning is very fast and serves as a first step before global optimization (sketched after the list).
- Fast embedding search with a probabilistic algorithm. Variable-length embeddings built from contextual and multi-token entries. No dot product or cosine distance, but better-suited metrics instead (see the embedding sketch below).
- Using the model evaluation metric as the loss function to achieve better relevancy, introducing the concept of an adaptive loss function.
- Augmentation and refinement based on integrating user prompt elements into back-end tables.
- Application to content clustering and predictive analytics based on text only, using nested hashes that leverage the sparsity of keyword association tables (no huge, sparse similarity matrix involved; see the nested-hash sketch below).
- Model evaluation based on reconstructing the knowledge graph (category assignments) and comparing it with the native one (see the evaluation sketch below).
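
To make the routing layer concrete, here is a minimal sketch of a keyword-based router. The sub-LLM names, keyword sets, and scoring rule are hypothetical illustrations, not taken from the presentation.

```python
# Minimal sketch of a keyword-based LLM router (hypothetical registry and scoring).
# Each sub-LLM advertises the keywords it covers; the router returns a ranked
# list, so the top match can be called automatically or shown to the user.

import re
from collections import Counter

SUB_LLM_KEYWORDS = {
    "statistics": {"probability", "regression", "sampling", "variance"},
    "databases":  {"sql", "index", "schema", "query"},
    "genomics":   {"dna", "genome", "sequencing", "allele"},
}

def route(prompt: str, top_n: int = 3):
    """Score each sub-LLM by keyword overlap with the prompt."""
    tokens = set(re.findall(r"[a-z]+", prompt.lower()))
    scores = Counter({name: len(tokens & kw)
                      for name, kw in SUB_LLM_KEYWORDS.items()})
    return scores.most_common(top_n)

print(route("How do I build an index to speed up a SQL query?"))
# [('databases', 3), ('statistics', 0), ('genomics', 0)]
```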
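
The synonym and abbreviation augmentation can be pictured as a simple token expansion step. The map entries below are illustrative only, since the real maps are built from glossaries and indexes.

```python
# Sketch of query augmentation with synonym and abbreviation maps
# (illustrative entries; real maps come from glossaries, indexes, and so on).

SYNONYMS = {"car": ["automobile", "vehicle"]}
ABBREVIATIONS = {"knn": "k-nearest neighbors", "svd": "singular value decomposition"}

def augment(tokens: list[str]) -> list[str]:
    """Expand each token with its synonyms and spelled-out abbreviations."""
    out = []
    for t in tokens:
        out.append(t)
        out.extend(SYNONYMS.get(t, []))
        if t in ABBREVIATIONS:
            out.append(ABBREVIATIONS[t])
    return out

print(augment(["knn", "car"]))
# ['knn', 'k-nearest neighbors', 'car', 'automobile', 'vehicle']
```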
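
Local self-tuning can be read as a small grid search over a handful of explainable parameters for one sub-LLM. The parameter names and the toy evaluation function below are assumptions for illustration only.

```python
# Sketch of local self-tuning: grid-search two explainable parameters of one
# sub-LLM and keep the combination with the best evaluation score.
# Parameter names and the evaluation function are hypothetical stand-ins.

from itertools import product

def evaluate(sub_llm: str, min_score: float, max_results: int) -> float:
    """Toy stand-in for the real metric (e.g., knowledge graph reconstruction)."""
    return 1.0 - abs(min_score - 0.3) - abs(max_results - 20) / 100

def self_tune(sub_llm: str):
    grid = product([0.1, 0.2, 0.3, 0.4], [10, 20, 50])  # few values per parameter
    return max(grid, key=lambda params: evaluate(sub_llm, *params))

print(self_tune("statistics"))   # (0.3, 20)
```

Because only a few explainable parameters are involved, the whole grid fits in memory and the search is nearly instant, which is why local tuning can serve as a first step before global optimization.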
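
The presentation does not spell out the replacement metric, so the sketch below assumes variable-length embeddings stored as sparse token-to-weight maps, compared with a normalized weighted overlap rather than a dot product or cosine distance.

```python
# Sketch of variable-length embeddings as sparse token->weight maps, compared
# with normalized weighted overlap (an assumed metric, not the one from the
# presentation) instead of dot product or cosine distance.

def overlap_similarity(e1: dict, e2: dict) -> float:
    """Weighted overlap on shared keys, normalized by the smaller total weight."""
    shared = sum(min(e1[k], e2[k]) for k in e1.keys() & e2.keys())
    norm = min(sum(e1.values()), sum(e2.values()))
    return shared / norm if norm else 0.0

# Embeddings have different lengths and may use multi-token ("contextual") keys.
doc = {"gradient descent": 0.9, "loss": 0.4, "learning rate": 0.7}
query = {"gradient descent": 1.0, "learning rate": 0.5}

print(round(overlap_similarity(doc, query), 3))   # 0.933
```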
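
Nested hashes are a natural fit for sparse keyword associations: only pairs that actually co-occur are ever stored. Here is a minimal sketch of that data structure.

```python
# Sketch of nested hashes for keyword associations: assoc[a][b] counts how often
# keywords a and b co-occur in a document. Only nonzero pairs are stored,
# avoiding a huge, mostly empty similarity matrix.

from collections import defaultdict
from itertools import combinations

assoc = defaultdict(lambda: defaultdict(int))

def add_document(keywords):
    """Update pairwise co-occurrence counts for one document's keywords."""
    for a, b in combinations(sorted(set(keywords)), 2):
        assoc[a][b] += 1
        assoc[b][a] += 1

add_document(["cluster", "centroid", "distance"])
add_document(["cluster", "distance", "outlier"])

print(dict(assoc["cluster"]))
# {'centroid': 1, 'distance': 2, 'outlier': 1}
```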
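
Finally, evaluation by knowledge graph reconstruction can be illustrated by comparing reconstructed category assignments against the native ones. Plain agreement is used below as an assumed stand-in for the actual comparison.

```python
# Sketch of evaluation via knowledge graph reconstruction: compare the category
# assigned to each item against its native (crawled) category. Plain agreement
# is an illustrative stand-in for the actual metric.

native        = {"item1": "statistics", "item2": "databases", "item3": "genomics"}
reconstructed = {"item1": "statistics", "item2": "statistics", "item3": "genomics"}

def agreement(native: dict, reconstructed: dict) -> float:
    """Fraction of items whose reconstructed category matches the native one."""
    hits = sum(reconstructed.get(item) == cat for item, cat in native.items())
    return hits / len(native)

print(round(agreement(native, reconstructed), 2))   # 0.67
```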
➡️ Download the free PowerPoint presentation here. It includes links to the full source code on GitHub, datasets, documentation, related books and articles, and free training on the topic.