Skip to content

Version 0.5

Compare
Choose a tag to compare
@andylau-55 andylau-55 released this 04 Nov 07:27
· 61 commits to master since this release
0ff65c2

Version 0.5 (2024-10-25)

retrieval Augmentation Generation (RAG) technology promotes the integration of domain applications with large models. However, RAG has problems such as a large gap between vector similarity and knowledge reasoning correlation, and insensitivity to knowledge logic (such as numerical values, time relationships, expert rules, etc.), which hinder the implementation of professional knowledge services. On October 25, officially releasing the professional domain knowledge Service Framework for knowledge enhancement generation (KAG) .

Highlights of the Release Version:

1. KAG: Knowledge Augmented Generation

KAG aims to make full use of the advantages of Knowledge Graph and vector retrieval, and bi-directionally enhance large language models and knowledge graphs through four aspects to solve RAG challenges
(1) LLM-friendly semantic knowledge management
(2) Mutual indexing between the knowledge map and the original snippet.
(3) Logical symbol-guided hybrid inference engine
(4) Knowledge alignment based on semantic reasoning
KAG is significantly better than NaiveRAG, HippoRAG and other methods in multi-hop question and answer tasks. The F1 score on hotpotQA is relatively improved by 19.6, and the F1 score on 2wiki is relatively improved by 33.5

The KAG framework includes three parts: kg-builder, kg-solver, and kag-model. This release only involves the first two parts, kag-model will be gradually open source release in the future.

kg-builder

implements a knowledge representation that is friendly to large-scale language models (LLM). Based on the hierarchical structure of DIKW (data, information, knowledge and wisdom), IT upgrades SPG knowledge representation ability, and is compatible with information extraction without schema constraints and professional knowledge construction with schema constraints on the same knowledge type (such as entity type and event type), it also supports the mutual index representation between the graph structure and the original text block, which supports the efficient retrieval of the reasoning question and answer stage.

kg-solver

uses a logical symbol-guided hybrid solving and reasoning engine that includes three types of operators: planning, reasoning, and retrieval, to transform natural language problems into a problem-solving process that combines language and symbols. In this process, each step can use different operators, such as exact match retrieval, text retrieval, numerical calculation or semantic reasoning, so as to realize the integration of four different problem solving processes: Retrieval, Knowledge Graph reasoning, language reasoning and numerical calculation.

Version 0.5 (2024-10-25)

检索增强生成(RAG)技术推动了领域应用与大模型结合。然而,RAG 存在着向量相似度与知识推理相关性差距大、对知识逻辑(如数值、时间关系、专家规则等)不敏感等问题,这些都阻碍了专业知识服务的落地。10月25日正式发布了知识增强生成(KAG)的专业领域知识服务框架

版本亮点

1. KAG 专业领域知识服务框架

KAG 旨在充分利用知识图谱和向量检索的优势,并通过四个方面双向增强大型语言模型和知识图谱,以解决 RAG 挑战
(1) 对 LLM 友好的语义化知识管理
(2) 知识图谱与原文片段之间的互索引
(3) 逻辑符号引导的混合推理引擎
(4) 基于语义推理的知识对齐
KAG 在多跳问答任务中显著优于 NaiveRAG、HippoRAG 等方法,在 hotpotQA 上的 F1 分数相对提高了 19.6%,在 2wiki 上的 F1 分数相对提高了33.5%

kag 框架包括 kg-builder、kg-solver、kag-model 三部分。本次发布只涉及前两部分,kag-model 将在后续逐步开源发布。

kg-builder

实现了一种对大型语言模型(LLM)友好的知识表示,在 DIKW(数据、信息、知识和智慧)的层次结构基础上,升级 SPG 知识表示能力,在同一知识类型(如实体类型、事件类型)上兼容无 schema 约束的信息提取和有 schema 约束的专业知识构建,并支持图结构与原始文本块之间的互索引表示,为推理问答阶段的高效检索提供支持。

kg-solver

采用逻辑符号引导的混合求解和推理引擎,该引擎包括三种类型的运算符:规划、推理和检索,将自然语言问题转化为结合语言和符号的问题求解过程。在这个过程中,每一步都可以利用不同的运算符,如精确匹配检索、文本检索、数值计算或语义推理,从而实现四种不同问题求解过程的集成:检索、知识图谱推理、语言推理和数值计算。