Knowledge Engineering Ⅲ | Reasoning

KG reasoning infers new knowledge from a given knowledge graph (KG).

logical reasoning

deductive reasoning (deriving conclusions)

If it rains, the grass gets wet. It is raining today, so the grass is wet today.

forward reasoning

Start with the available data and apply inference rules to derive more data until a goal is reached.

RDFS rules:

a rdfs:domain x .  u a y .
__________________________
      u rdf:type x .
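
A minimal sketch of forward reasoning with the rdfs:domain rule above: keep applying the rule until no new triple appears. The triples and the `ex:` names are made up for illustration, and `forward_chain` is a hypothetical helper, not part of any RDF library.

```python
def forward_chain(triples):
    """Apply the rdfs:domain rule repeatedly until a fixpoint is reached."""
    facts = set(triples)
    while True:
        # All (property, class) pairs declared with rdfs:domain.
        domains = {(s, o) for (s, p, o) in facts if p == "rdfs:domain"}
        # a rdfs:domain x .  u a y .  =>  u rdf:type x .
        derived = {
            (u, "rdf:type", x)
            for (a, x) in domains
            for (u, p, _y) in facts
            if p == a
        }
        if derived <= facts:  # nothing new: stop
            return facts
        facts |= derived


# Hypothetical example data.
kg = {
    ("ex:teaches", "rdfs:domain", "ex:Teacher"),
    ("ex:alice", "ex:teaches", "ex:math"),
}
closure = forward_chain(kg)
print(("ex:alice", "rdf:type", "ex:Teacher") in closure)
```

The loop is the "until a goal is reached" part in its simplest form: here the goal is saturation, i.e. deriving everything the rule can produce.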

OWL rules:
C⊓D⊑E, C(a), D(a)
________________
E(a)
With the help of axioms, reasoning over the existing assertions derives additional assertions.
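
As a rough illustration of the rule C⊓D⊑E, C(a), D(a) ⊢ E(a), the sketch below saturates a set of class assertions under conjunction axioms. The encoding (an axiom as a pair of body classes and head class, an assertion as a pair of class and individual) is an assumption made for this example only.

```python
def saturate(axioms, assertions):
    """Add E(a) whenever all body classes of an axiom hold for individual a."""
    facts = set(assertions)
    changed = True
    while changed:
        changed = False
        individuals = {ind for (_cls, ind) in facts}
        for body, head in axioms:
            for ind in individuals:
                holds = all((cls, ind) in facts for cls in body)
                if holds and (head, ind) not in facts:
                    facts.add((head, ind))
                    changed = True
    return facts


# Axiom: C ⊓ D ⊑ E. Assertions: C(a), D(a), C(b).
axioms = [({"C", "D"}, "E")]
assertions = {("C", "a"), ("D", "a"), ("C", "b")}
inferred = saturate(axioms, assertions)
print(sorted(inferred - assertions))
```

Only `a` is classified as `E`, because `b` lacks the assertion `D(b)`.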

classification:

u rdfs:subPropertyOf v . v rdfs:subPropertyOf x.
________________________________________________
              u rdfs:subPropertyOf x.
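
The transitivity rule above can be sketched as a closure computation over (sub-property, super-property) pairs. The property names are hypothetical.

```python
def subproperty_closure(pairs):
    """Transitive closure of rdfs:subPropertyOf given as (sub, super) pairs."""
    closure = set(pairs)
    while True:
        # u subPropertyOf v . v subPropertyOf x .  =>  u subPropertyOf x .
        new = {(u, x) for (u, v) in closure for (v2, x) in closure if v == v2}
        if new <= closure:
            return closure
        closure |= new


# Hypothetical property hierarchy.
pairs = {
    ("ex:hasSon", "ex:hasChild"),
    ("ex:hasChild", "ex:hasDescendant"),
}
hierarchy = subproperty_closure(pairs)
print(("ex:hasSon", "ex:hasDescendant") in hierarchy)
```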

backward reasoning

An inference method that works backward from the goal: start from the statement to be proved and look for rules and facts that support it.

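Often used for query rewriting: a query against the KG only reaches the instances attached directly to the node named in the question, so the question has to be rewritten to cover the nodes below it as well.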
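
One way to picture query rewriting, assuming a simple subclass hierarchy: the goal class is expanded into itself plus all of its subclasses, and the instance lookup runs over the expanded set. The data and the function names (`rewrite`, `instances_of`) are illustrative assumptions.

```python
# Hypothetical schema: subclass -> direct superclass.
SUBCLASS_OF = {
    "ex:Student": "ex:Person",
    "ex:Teacher": "ex:Person",
    "ex:PhDStudent": "ex:Student",
}
# Hypothetical explicit type assertions: (instance, class).
TYPE_ASSERTIONS = {
    ("ex:alice", "ex:Teacher"),
    ("ex:bob", "ex:PhDStudent"),
    ("ex:carol", "ex:Person"),
}


def rewrite(goal_class):
    """Expand the goal class into itself plus all transitive subclasses."""
    classes = {goal_class}
    changed = True
    while changed:
        changed = False
        for sub, sup in SUBCLASS_OF.items():
            if sup in classes and sub not in classes:
                classes.add(sub)
                changed = True
    return classes


def instances_of(goal_class):
    """Answer the rewritten query using only explicit assertions."""
    targets = rewrite(goal_class)
    return {inst for (inst, cls) in TYPE_ASSERTIONS if cls in targets}


print(sorted(instances_of("ex:Person")))
```

Without the rewriting step, the query for `ex:Person` would return only `ex:carol`; no new facts are materialized, the question itself is changed.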

practice(Datalog)

Datalog is a declarative logic programming language designed for knowledge bases and databases.

syntax:
- Atom: has_child(X, Y)
- Rule: has_child(X, Y) :- has_son(X, Y)
- Fact: has_child(alice, bob) :- (a rule with an empty body; constants are lowercase, variables uppercase)
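%% Facts
friend(joe, sue).
friend(ann, sue).
friend(sue, max).
friend(max, ann).

%% Rules: fof is the transitive closure of friend
fof(X, Y) :- friend(X, Y).
fof(X, Z) :- friend(X, Y), fof(Y, Z).

%% Query: everyone connected to ann through a chain of friends
query(X) :- fof(X, ann).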
fof(max, ann) fof(sue, ann) fof(joe, ann) fof(ann, ann)
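
To check the result by hand, the program can be evaluated bottom-up in a few lines of Python. This is a naive fixpoint sketch of the two `fof` rules, not how a Datalog engine is actually implemented.

```python
friend = {("joe", "sue"), ("ann", "sue"), ("sue", "max"), ("max", "ann")}


def evaluate_fof(friend):
    """Naive bottom-up evaluation of the two fof rules."""
    fof = set(friend)  # fof(X, Y) :- friend(X, Y).
    while True:
        # fof(X, Z) :- friend(X, Y), fof(Y, Z).
        new = {(x, z) for (x, y) in friend for (y2, z) in fof if y == y2}
        if new <= fof:
            return fof
        fof |= new


fof = evaluate_fof(friend)
# query(X) :- fof(X, ann).
answers = sorted(x for (x, y) in fof if y == "ann")
print(answers)  # ['ann', 'joe', 'max', 'sue']
```

`ann` appears in her own answer set because the friend graph contains the cycle ann → sue → max → ann.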
Operations:
- Intersection: Q(x1, x2, …, xn) :- R(x1, x2, …, xn), S(x1, x2, …, xn)
- Union: Q(x1, x2, …, xn) :- R(x1, x2, …, xn)
  
  Q(x1, x2, …, xn) :- S(x1, x2, …, xn)
- Difference: Q(x1, x2, …, xn) :- R(x1, x2, …, xn), NOT S(x1, x2, …, xn)
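
Treating each relation as a set of tuples, the three rule patterns above correspond to ordinary set operations. The relations `R` and `S` below are made-up examples.

```python
# Two hypothetical relations with the same arity.
R = {(1, "a"), (2, "b"), (3, "c")}
S = {(2, "b"), (3, "c"), (4, "d")}

intersection = R & S  # Q(...) :- R(...), S(...)
union = R | S         # Q(...) :- R(...)   and   Q(...) :- S(...)
difference = R - S    # Q(...) :- R(...), NOT S(...)

print(sorted(intersection), sorted(union), sorted(difference))
```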
Tools

RDFox, Drools, Jena

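inductive reasoning (learning rules)

Every time it rains, the grass gets wet. Therefore, rain makes the grass wet.

abductive reasoning (finding premises)

If it rains, the grass gets wet. The grass is wet, so the likely explanation is that it is raining.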

statistical reasoning

Find suitable statistical models to fit the samples and predict the expected probabilities of the inferred knowledge.

knowledge graph embedding based reasoning

How to encode the meaning of a word?
1. One-hot Representation
2. Distributional Representation (fixed-size window): build a word's representation from the many contexts in which it occurs
3. Word Vectors
Representation learning

Objects are represented as dense, real-valued, low-dimensional vectors.
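
A small sketch of why dense vectors are preferred over one-hot vectors: distinct one-hot vectors are always orthogonal, so they carry no notion of similarity, while dense vectors can place related words close together. The vocabulary and the dense values are invented for illustration, not learned.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


vocab = ["rain", "drizzle", "grass"]
# One-hot: one dimension per word, a single 1.0 per vector.
one_hot = {
    w: [1.0 if i == j else 0.0 for j in range(len(vocab))]
    for i, w in enumerate(vocab)
}
# Dense: hand-picked 2-dimensional values (an assumption, not trained).
dense = {"rain": [0.9, 0.1], "drizzle": [0.8, 0.2], "grass": [0.1, 0.9]}

print(cosine(one_hot["rain"], one_hot["drizzle"]))
print(cosine(dense["rain"], dense["drizzle"]))
print(cosine(dense["rain"], dense["grass"]))
```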

Translation-Based Models

TransE

For each triple <h, r, t>, the head h is translated to the tail t by the relation r, i.e. h + r ≈ t.

Training TransE:

min ‖h + r − t‖ (measured with the Manhattan (L1) or Euclidean (L2) distance)

Loss Function:
$L=\sum_{(h,l,t)\in S}\sum_{(h',l,t')\in S'}[\lambda + d(h+l,t) - d(h'+l,t')]_+$

where $S$ is the set of true triples, $S'$ the set of false (corrupted) triples, $\lambda$ the margin, $d$ the distance function, and $[x]_+=\max(0,x)$. Here $l$ denotes the relation (written r above).
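
A worked instance of one term of this loss, as a sketch in plain Python with the L1 distance. The vectors are toy values chosen so the arithmetic is easy to follow; `margin_loss` is a hypothetical helper name.

```python
def l1_distance(u, v):
    """Manhattan distance between two vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def translate(h, l):
    """h + l, the translated head."""
    return [a + b for a, b in zip(h, l)]


def margin_loss(pos, neg, margin, dist=l1_distance):
    """[margin + d(h + l, t) - d(h' + l, t')]_+ for one true/false pair."""
    h, l, t = pos
    h_neg, l_neg, t_neg = neg
    d_pos = dist(translate(h, l), t)
    d_neg = dist(translate(h_neg, l_neg), t_neg)
    return max(0.0, margin + d_pos - d_neg)


h, l, t = [0.0, 0.0], [1.0, 0.0], [1.0, 0.0]  # h + l == t, so d = 0
far_tail = [0.0, 3.0]                          # d(h + l, t') = 4
near_tail = [1.0, 0.5]                         # d(h + l, t') = 0.5

print(margin_loss((h, l, t), (h, l, far_tail), margin=1.0))   # 0.0
print(margin_loss((h, l, t), (h, l, near_tail), margin=1.0))  # 0.5
```

The first false triple is already farther away than the margin requires, so it contributes nothing; the second is too close to the true triple and is penalized.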

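1. Define the known parameters and hyperparameters
2. Initialize the entities and relations (assign each one a vector)
3. Normalize the embeddings
4. Iteratively update the parameters according to the loss function
5. Compute the evaluation metrics: Mean Rank and Hits@10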
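
The five steps can be sketched in plain Python as follows. Several choices here are assumptions made to keep the example short: squared L2 distance as `d`, one corrupted triple per true triple, plain SGD, evaluation on the training triples, and a four-entity toy graph. All function names are hypothetical.

```python
import math
import random


def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def score(ent, rel, h, l, t):
    """Squared L2 distance d(h + l, t); lower means more plausible."""
    return sum((a + b - c) ** 2 for a, b, c in zip(ent[h], rel[l], ent[t]))


# Step 1: data and hyperparameters are the function arguments.
def train_transe(triples, entities, relations,
                 dim=8, margin=1.0, lr=0.05, epochs=200, seed=0):
    rng = random.Random(seed)
    bound = 6 / math.sqrt(dim)
    # Step 2: one random vector per entity and per relation.
    ent = {e: [rng.uniform(-bound, bound) for _ in range(dim)] for e in entities}
    rel = {r: l2_normalize([rng.uniform(-bound, bound) for _ in range(dim)])
           for r in relations}
    for _ in range(epochs):
        # Step 3: normalize entity embeddings at the start of each epoch.
        for e in entities:
            ent[e] = l2_normalize(ent[e])
        for h, l, t in triples:
            # Build a false triple by corrupting the head or the tail.
            if rng.random() < 0.5:
                h2, t2 = rng.choice(entities), t
            else:
                h2, t2 = h, rng.choice(entities)
            loss = margin + score(ent, rel, h, l, t) - score(ent, rel, h2, l, t2)
            if loss <= 0:
                continue  # margin already satisfied, no update
            # Step 4: gradient step on the margin loss, one dimension at a time.
            for i in range(dim):
                g_pos = 2 * (ent[h][i] + rel[l][i] - ent[t][i])
                g_neg = 2 * (ent[h2][i] + rel[l][i] - ent[t2][i])
                ent[h][i] -= lr * g_pos
                ent[t][i] += lr * g_pos
                ent[h2][i] += lr * g_neg
                ent[t2][i] -= lr * g_neg
                rel[l][i] -= lr * (g_pos - g_neg)
    return ent, rel


def tail_ranks(test_triples, entities, ent, rel):
    """Rank of the true tail among all entities, per test triple (1 = best)."""
    ranks = []
    for h, l, t in test_triples:
        ordered = sorted(entities, key=lambda e: score(ent, rel, h, l, e))
        ranks.append(ordered.index(t) + 1)
    return ranks


def hits_at(ranks, k=10):
    """Fraction of test triples whose true tail is ranked in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)


entities = ["joe", "sue", "ann", "max"]
relations = ["friend"]
triples = [("joe", "friend", "sue"), ("ann", "friend", "sue"),
           ("sue", "friend", "max"), ("max", "friend", "ann")]

ent, rel = train_transe(triples, entities, relations)
# Step 5: evaluation metrics.
ranks = tail_ranks(triples, entities, ent, rel)
mean_rank = sum(ranks) / len(ranks)
print(mean_rank, hits_at(ranks, 10))
```

With only four entities Hits@10 is trivially 1.0; on a real benchmark the ranking runs over all entities and is computed on held-out triples.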

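Problems: TransE cannot represent symmetric relations, because for such a relation the constraints force r = 0 and h = t, so the two entities coincide.

It also cannot represent 1-to-N, N-to-1 or N-to-N relations: given two elements of a triple, the third is determined uniquely.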
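
Both limitations follow directly from the ideal TransE constraint h + r = t. A short derivation, assuming the constraint holds exactly:

$\text{symmetric } r:\quad h+r=t,\ \ t+r=h \ \Rightarrow\ h+2r=h \ \Rightarrow\ r=0,\ h=t$

$\text{1-to-N } r:\quad h+r=t_1,\ \ h+r=t_2 \ \Rightarrow\ t_1=t_2$

In practice the constraint only holds approximately, so the embeddings are pushed toward these degenerate solutions rather than reaching them exactly.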

TransH

TransH models a relation as a hyperplane together with a translation operation on it.

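A hyperplane W_r with unit normal vector w_r is defined for each relation, and entities are projected onto it. The entities of a symmetric relation then only need to coincide on the hyperplane, and 1-to-N, N-to-1 and N-to-N relations can be handled as well.
$h_\perp = h - w_r^\top h\, w_r \\ t_\perp = t - w_r^\top t\, w_r$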
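
A small numeric sketch of the projection. The translation vector on the hyperplane is called `d_r` here, which is notation assumed for this example; the vectors are toy values.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def project(v, w):
    """v_perp = v - (w^T v) w, where w is the unit normal of the hyperplane."""
    s = dot(w, v)
    return [a - s * b for a, b in zip(v, w)]


w_r = [0, 0, 1]   # unit normal: the hyperplane is the x-y plane
d_r = [1, 0, 0]   # translation lying on the hyperplane
h = [1, 2, 3]
t1 = [2, 2, -5]   # two different tails ...
t2 = [2, 2, 7]

h_perp = project(h, w_r)
translated = [a + b for a, b in zip(h_perp, d_r)]
print(translated, project(t1, w_r), project(t2, w_r))  # ... with one projection
```

`t1` and `t2` are distinct entities, yet both satisfy h⊥ + d_r = t⊥, which is how a 1-to-N relation fits without forcing the tails to be identical.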
TransR

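Each entity can have many aspects of semantics, and different relations focus on different aspects (similar to polysemy).
1. Build entity embeddings in an entity space and relation embeddings in relation spaces (one space per relation)
2. Map the entities from the entity space into the corresponding relation space with a matrix M_r, and build the translation (vector) between the mapped entities
  $h_r = hM_r \\ t_r = tM_r \\ f_r(h,t) = \parallel h_r + r - t_r \parallel_2^2$

  Training minimizes the score $f_r(h,t)$ for true triples.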
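
The mapping can be sketched with a toy projection matrix, assuming a 3-dimensional entity space and a 2-dimensional relation space. The matrix and vectors are invented for illustration, and `transr_score` is a hypothetical helper.

```python
def project(v, M):
    """Row vector v (length k) times matrix M (k rows, d columns)."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]


def transr_score(h, r, t, M):
    """Squared L2 norm of h_r + r - t_r in the relation space."""
    h_r, t_r = project(h, M), project(t, M)
    return sum((a + b - c) ** 2 for a, b, c in zip(h_r, r, t_r))


# Toy M_r: keeps the first two coordinates and drops the third.
M_r = [[1, 0],
       [0, 1],
       [0, 0]]
r = [1, 0]
h = [1, 2, 9]
t = [2, 2, -4]    # plausible tail: h_r + r == t_r
t_bad = [0, 0, 0]

print(transr_score(h, r, t, M_r), transr_score(h, r, t_bad, M_r))  # 0 8
```

The third coordinate, which this relation does not care about, differs a lot between `h` and `t` but does not affect the score.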

inductive rule learning based reasoning
multi-hop reasoning