LLM/RAG: Knowledge Graphs, Multi-Agents, Ultrafast Fine-tuning, No Latency
2 min read · Jun 21, 2024
In this presentation, I explain all the components of a groundbreaking architecture with applications to local, enterprise LLMs and high-quality search, targeted at advanced users and busy professionals. The main features include:
- Multi-LLM architecture with 2,000 small, specialized sub-LLMs covering the full spectrum of human knowledge.
- LLM router as the top layer, deciding which sub-LLM to call, or letting the user choose (see the router sketch after this list).
- Smart crawling to recover taxonomies and knowledge graphs embedded in carefully selected, high-quality input sources.
- Augmentation with synonym and abbreviation maps built from glossaries, indexes, and similar sources (a small example follows the list).
- Ultrafast: no neural network; instead, parametric weights governed by a few explainable parameters, rather than billions of optimized weights as in deep neural networks.
- Customizable relevancy scores attached to each item returned to the user (URLs, related concepts, tables, and so on), helping the user decide what to explore next.
- Self-tuning (global, or local to a sub-LLM) based on favorite hyperparameters chosen by users and on customized results. Local self-tuning is very fast and serves as a first step before global optimization (sketched after the list).
- Fast embedding search with a probabilistic algorithm. Variable-length embeddings built from contextual and multi-token entries. No dot product or cosine distance, but better-suited metrics instead (see the embedding sketch below).
- Using the model evaluation metric as the loss function to achieve better relevancy, introducing the concept of an adaptive loss function.
- Augmentation and refinement based on integrating user prompt elements into back-end tables.
- Application to content clustering and predictive analytics based on text only, using nested hashes that leverage the sparsity of keyword association tables (no huge, sparse similarity matrix involved; see the nested-hash sketch below).
- Model evaluation based on reconstructing the knowledge graph (category assignments) and comparing it with the native one (see the evaluation sketch below).
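
To make the routing layer concrete, here is a minimal sketch of a keyword-based router. The sub-LLM names, keyword sets, and scoring rule are hypothetical illustrations, not taken from the presentation.

```python
# Minimal sketch of a keyword-based LLM router (hypothetical registry and scoring).
# Each sub-LLM advertises the keywords it covers; the router returns a ranked
# list, so the top match can be called automatically or shown to the user.

import re
from collections import Counter

SUB_LLM_KEYWORDS = {
    "statistics": {"probability", "regression", "sampling", "variance"},
    "databases":  {"sql", "index", "schema", "query"},
    "genomics":   {"dna", "genome", "sequencing", "allele"},
}

def route(prompt: str, top_n: int = 3):
    """Score each sub-LLM by keyword overlap with the prompt."""
    tokens = set(re.findall(r"[a-z]+", prompt.lower()))
    scores = Counter({name: len(tokens & kw)
                      for name, kw in SUB_LLM_KEYWORDS.items()})
    return scores.most_common(top_n)

print(route("How do I build an index to speed up a SQL query?"))
# [('databases', 3), ('statistics', 0), ('genomics', 0)]
```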
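
The synonym and abbreviation augmentation can be pictured as a simple token expansion step. The map entries below are illustrative only, since the real maps are built from glossaries and indexes.

```python
# Sketch of query augmentation with synonym and abbreviation maps
# (illustrative entries; real maps come from glossaries, indexes, and so on).

SYNONYMS = {"car": ["automobile", "vehicle"]}
ABBREVIATIONS = {"knn": "k-nearest neighbors", "svd": "singular value decomposition"}

def augment(tokens: list[str]) -> list[str]:
    """Expand each token with its synonyms and spelled-out abbreviations."""
    out = []
    for t in tokens:
        out.append(t)
        out.extend(SYNONYMS.get(t, []))
        if t in ABBREVIATIONS:
            out.append(ABBREVIATIONS[t])
    return out

print(augment(["knn", "car"]))
# ['knn', 'k-nearest neighbors', 'car', 'automobile', 'vehicle']
```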
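
Local self-tuning can be read as a small grid search over a handful of explainable parameters for one sub-LLM. The parameter names and the toy evaluation function below are assumptions for illustration only.

```python
# Sketch of local self-tuning: grid-search two explainable parameters of one
# sub-LLM and keep the combination with the best evaluation score.
# Parameter names and the evaluation function are hypothetical stand-ins.

from itertools import product

def evaluate(sub_llm: str, min_score: float, max_results: int) -> float:
    """Toy stand-in for the real metric (e.g., knowledge graph reconstruction)."""
    return 1.0 - abs(min_score - 0.3) - abs(max_results - 20) / 100

def self_tune(sub_llm: str):
    grid = product([0.1, 0.2, 0.3, 0.4], [10, 20, 50])  # few values per parameter
    return max(grid, key=lambda params: evaluate(sub_llm, *params))

print(self_tune("statistics"))   # (0.3, 20)
```

Because only a few explainable parameters are involved, the whole grid fits in memory and the search is nearly instant, which is why local tuning can serve as a first step before global optimization.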
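
The presentation does not spell out the replacement metric, so the sketch below assumes variable-length embeddings stored as sparse token-to-weight maps, compared with a normalized weighted overlap rather than a dot product or cosine distance.

```python
# Sketch of variable-length embeddings as sparse token->weight maps, compared
# with normalized weighted overlap (an assumed metric, not the one from the
# presentation) instead of dot product or cosine distance.

def overlap_similarity(e1: dict, e2: dict) -> float:
    """Weighted overlap on shared keys, normalized by the smaller total weight."""
    shared = sum(min(e1[k], e2[k]) for k in e1.keys() & e2.keys())
    norm = min(sum(e1.values()), sum(e2.values()))
    return shared / norm if norm else 0.0

# Embeddings have different lengths and may use multi-token ("contextual") keys.
doc = {"gradient descent": 0.9, "loss": 0.4, "learning rate": 0.7}
query = {"gradient descent": 1.0, "learning rate": 0.5}

print(round(overlap_similarity(doc, query), 3))   # 0.933
```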
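
Nested hashes are a natural fit for sparse keyword associations: only pairs that actually co-occur are ever stored. Here is a minimal sketch of that data structure.

```python
# Sketch of nested hashes for keyword associations: assoc[a][b] counts how often
# keywords a and b co-occur in a document. Only nonzero pairs are stored,
# avoiding a huge, mostly empty similarity matrix.

from collections import defaultdict
from itertools import combinations

assoc = defaultdict(lambda: defaultdict(int))

def add_document(keywords):
    """Update pairwise co-occurrence counts for one document's keywords."""
    for a, b in combinations(sorted(set(keywords)), 2):
        assoc[a][b] += 1
        assoc[b][a] += 1

add_document(["cluster", "centroid", "distance"])
add_document(["cluster", "distance", "outlier"])

print(dict(assoc["cluster"]))
# {'centroid': 1, 'distance': 2, 'outlier': 1}
```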
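
Finally, evaluation by knowledge graph reconstruction can be illustrated by comparing reconstructed category assignments against the native ones. Plain agreement is used below as an assumed stand-in for the actual comparison.

```python
# Sketch of evaluation via knowledge graph reconstruction: compare the category
# assigned to each item against its native (crawled) category. Plain agreement
# is an illustrative stand-in for the actual metric.

native        = {"item1": "statistics", "item2": "databases", "item3": "genomics"}
reconstructed = {"item1": "statistics", "item2": "statistics", "item3": "genomics"}

def agreement(native: dict, reconstructed: dict) -> float:
    """Fraction of items whose reconstructed category matches the native one."""
    hits = sum(reconstructed.get(item) == cat for item, cat in native.items())
    return hits / len(native)

print(round(agreement(native, reconstructed), 2))   # 0.67
```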
➡️ Download the free PowerPoint presentation here. It includes links to the full source code on GitHub, datasets, documentation, related books and articles, and free training on the topic.