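py2neo V4 Quick-Start Guide: Working with the Neo4j Graph Database from Python

For an introduction to Neo4j, see the article 《知识图谱技术与应用指南(转)》 (A Guide to Knowledge Graph Technology and Applications, reposted).

There are two ways to work with Neo4j from Python: neo4j and py2neo. The former is the official Neo4j driver, but py2neo has been under development for longer and has already reached V4.

Official documentation: https://py2neo.org/v4/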


0. Installation

Download Neo4j: https://neo4j.com/download/

Install with pip: pip install py2neo

Install from the GitHub source: pip install git+https://github.com/technige/py2neo.git#egg=py2neo
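
After installing, it can be useful to confirm which version pip actually put in place, since this guide targets the V4 line. The sketch below uses only the standard library; the helper name installed_version is made up for illustration and is not part of py2neo.

```python
from importlib import metadata


def installed_version(package="py2neo"):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("py2neo is not installed")
else:
    print("py2neo", version, "is installed")
```

A version string beginning with "4." indicates the release series described here.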


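1. Data Types

1.1 Node and Relationship Objects

In a graph database, the core concepts are nodes, edges (relationships), and properties. The most important classes in py2neo are Node and Relationship.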

from py2neo.data import Node, Relationship
a = Node("Person", name="Alice")
b = Node("Person", name="Bob")
ab = Relationship(a, "KNOWS", b)
print(ab)
# (Alice)-[:KNOWS]->(Bob)

If no relationship type is specified, it defaults to TO. You can also define your own subclass of Relationship, as follows:

c = Node("Person", name="Carol")
class WorksWith(Relationship): pass
ac = WorksWith(a, c)
print(type(ac))
# <class '__main__.WorksWith'> when run as a script; the relationship type is named after the class

1.2 Subgraph Objects

Set operations are the simplest way to build a subgraph:

s = ab | ac
print(s)
# {(alice:Person {name:"Alice"}),  (bob:Person {name:"Bob"}),  (carol:Person {name:"Carol"}),  (Alice)-[:KNOWS]->(Bob),  (Alice)-[:WORKS_WITH]->(Carol)}
print(s.nodes())
# frozenset({(alice:Person {name:"Alice"}), (bob:Person {name:"Bob"}), (carol:Person {name:"Carol"})})
print(s.relationships())
# frozenset({(Alice)-[:KNOWS]->(Bob), (Alice)-[:WORKS_WITH]->(Carol)})
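
To make the set semantics concrete, here is a plain-Python stand-in that does not use py2neo at all. It assumes nodes are just strings and relationships are (start, type, end) tuples; the helpers subgraph and union are hypothetical names. The point is only that a union keeps a shared node such as Alice once.

```python
def subgraph(*relationships):
    """Build a (nodes, relationships) pair of frozensets from relationship tuples."""
    rels = frozenset(relationships)
    # Every relationship contributes its start and end node to the node set.
    nodes = frozenset(n for start, _, end in rels for n in (start, end))
    return nodes, rels


def union(s1, s2):
    """Union two subgraphs component-wise, mirroring the | operator."""
    return s1[0] | s2[0], s1[1] | s2[1]


ab = subgraph(("Alice", "KNOWS", "Bob"))
ac = subgraph(("Alice", "WORKS_WITH", "Carol"))
nodes, rels = union(ab, ac)

print(sorted(nodes))  # ['Alice', 'Bob', 'Carol']
print(len(rels))      # 2
```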

1.3 Path Objects and the Walkable Type

A walkable is a subgraph with added traversal information.

w = ab + Relationship(b, "LIKES", c) + ac
print(w)
# (Alice)-[:KNOWS]->(Bob)-[:LIKES]->(Carol)<-[:WORKS_WITH]-(Alice)
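
The idea of "traversal information" can be sketched without py2neo: a walk is an alternating sequence of nodes and relationships, and a relationship may be crossed against its direction as long as it touches the node where the walk currently stands. The function walk and the tuple representation below are assumptions made for illustration, not the library's implementation.

```python
def walk(*relationships):
    """Chain (start, type, end) tuples into an alternating node/relationship list."""
    first = relationships[0]
    steps = [first[0], first, first[2]]
    for rel in relationships[1:]:
        here = steps[-1]
        start, _, end = rel
        if here == start:
            steps += [rel, end]      # traversed along its direction
        elif here == end:
            steps += [rel, start]    # traversed against its direction
        else:
            raise ValueError("relationship does not touch the current node")
    return steps


ab = ("Alice", "KNOWS", "Bob")
bc = ("Bob", "LIKES", "Carol")
ac = ("Alice", "WORKS_WITH", "Carol")

w = walk(ab, bc, ac)
print(w[0::2])  # nodes in visiting order: ['Alice', 'Bob', 'Carol', 'Alice']
```

The last step crosses WORKS_WITH backwards, which corresponds to the `<-[:WORKS_WITH]-` segment in the printed walkable above.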

1.4 Record Objects

A Record is an ordered, keyed collection of values, much like a named tuple.
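
For readers unfamiliar with named tuples, the standard-library analogy looks like this. The type name PersonRecord and its fields are invented for the example; a py2neo Record is a different class, and later sections of this guide access its fields with record["name"] rather than attribute syntax.

```python
from collections import namedtuple

# A named tuple keeps its values in a fixed order and also labels each one.
PersonRecord = namedtuple("PersonRecord", ["name", "born"])
record = PersonRecord(name="Keanu Reeves", born=1964)

print(record[0])               # access by position
print(record.name)             # access by key
print(dict(record._asdict()))  # convert to a plain dict
```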

1.5 Table Objects

A Table is a list containing Record objects.
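
As a rough picture of "a list of records that can be displayed as a grid", the following standard-library sketch renders rows under a header line. The function render_table is hypothetical and only approximates the kind of layout a table printout has; it is not how py2neo formats its Table.

```python
def render_table(keys, rows):
    """Render a header and rows as right-aligned, pipe-separated text."""
    # Each column is as wide as its longest cell, header included.
    widths = [
        max(len(str(v)) for v in [key] + [row[i] for row in rows])
        for i, key in enumerate(keys)
    ]

    def fmt(values):
        return " | ".join(str(v).rjust(w) for v, w in zip(values, widths))

    lines = [fmt(keys), "-|-".join("-" * w for w in widths)]
    lines += [fmt(row) for row in rows]
    return "\n".join(lines)


rows = [(n, n * n) for n in range(1, 4)]
print(render_table(["n", "n_sq"], rows))
```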


2 Graph Database

The Graph class is the most important class for interacting with Neo4j.

from py2neo import Graph
graph = Graph(password="password")
print(graph.run("UNWIND range(1, 3) AS n RETURN n, n * n as n_sq").to_table())
#    n | n_sq
# -----|------
#    1 |    1
#    2 |    4
#    3 |    9

2.1 Database

Used to connect to a graph database:

from py2neo import Database
db = Database("bolt://camelot.example.com:7687")

The default URI is bolt://localhost:7687

>>> default_db = Database()
>>> default_db
<Database uri='bolt://localhost:7687'>

2.2 Graph

The Graph class represents the graph data storage space within Neo4j.

>>> from py2neo import Graph
>>> graph_1 = Graph()
>>> graph_2 = Graph(host="localhost")
>>> graph_3 = Graph("bolt://localhost:7687")

Matching relationships with match:

for rel in graph.match((alice, ), r_type="FRIEND"):
    print(rel.end_node["name"])

Merging with merge:

>>> from py2neo import Graph, Node, Relationship
>>> g = Graph()
>>> a = Node("Person", name="Alice", age=33)
>>> b = Node("Person", name="Bob", age=44)
>>> KNOWS = Relationship.type("KNOWS")
>>> g.merge(KNOWS(a, b), "Person", "name")

Next, create a third node:

>>> c = Node("Company", name="ACME")
>>> c.__primarylabel__ = "Company"
>>> c.__primarykey__ = "name"
>>> WORKS_FOR = Relationship.type("WORKS_FOR")
>>> g.merge(WORKS_FOR(a, c) | WORKS_FOR(b, c))
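
The merge calls above identify each node by a primary label and a primary key ("Person" and "name", or the values set through __primarylabel__ and __primarykey__). As a simplified mental model, and not py2neo's actual implementation, merging behaves like an upsert keyed on that pair: an existing match is updated, otherwise a new node is created. The store dictionary and merge_node helper below are hypothetical.

```python
def merge_node(store, label, key, properties):
    """Insert or update the node identified by (label, properties[key])."""
    identity = (label, properties[key])
    node = store.setdefault(identity, {})  # create the node only if it is missing
    node.update(properties)                # then copy the local properties over it
    return node


store = {}
merge_node(store, "Person", "name", {"name": "Alice", "age": 33})
merge_node(store, "Person", "name", {"name": "Alice", "age": 34})  # same identity
merge_node(store, "Person", "name", {"name": "Bob", "age": 44})

print(len(store))                         # 2
print(store[("Person", "Alice")]["age"])  # 34
```

Merging Alice twice leaves a single Alice entry, which is the property that makes merge safe to re-run.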

The nodes attribute looks up nodes by ID or by matching criteria:

>>> graph = Graph()
>>> graph.nodes[1234]
(_1234:Person {name: 'Alice'})
>>> graph.nodes.get(1234)
(_1234:Person {name: 'Alice'})
>>> graph.nodes.match("Person", name="Alice").first()
(_1234:Person {name: 'Alice'})

2.3 Transactions

commit(): commits the transaction

create(subgraph): creates nodes and relationships

>>> from py2neo import Graph, Node, Relationship
>>> g = Graph()
>>> tx = g.begin()
>>> a = Node("Person", name="Alice")
>>> tx.create(a)
>>> b = Node("Person", name="Bob")
>>> ab = Relationship(a, "KNOWS", b)
>>> tx.create(ab)
>>> tx.commit()
>>> g.exists(ab)
True

2.4 Query Results

Cursor

Advance the cursor one record at a time and print the name from each record:

while cursor.forward():
    print(cursor.current["name"])

Because a Cursor is iterable, you can also write:

for record in cursor:
    print(record["name"])

If only a single record is of interest:

if cursor.forward():
    print(cursor.current["name"])

Or:

print(next(cursor)["name"])

To return just one value from a single record:

print(cursor.evaluate())
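
The navigation patterns above can be tried without a database by using a small stand-in. The class ListCursor below is hypothetical and only mimics the forward()/current/iteration behaviour described here over a list of dicts; it is not py2neo's Cursor.

```python
class ListCursor:
    """Minimal stand-in for a forward-only cursor over a list of records."""

    def __init__(self, records):
        self._records = iter(records)
        self.current = None  # no record is selected until forward() succeeds

    def forward(self):
        """Move one record ahead; return 1 on success, 0 when exhausted."""
        try:
            self.current = next(self._records)
        except StopIteration:
            return 0
        return 1

    def __iter__(self):
        return self

    def __next__(self):
        if self.forward():
            return self.current
        raise StopIteration


cursor = ListCursor([{"name": "Alice"}, {"name": "Bob"}])
names = []
while cursor.forward():
    names.append(cursor.current["name"])

print(names)             # ['Alice', 'Bob']
print(cursor.forward())  # 0, because the cursor is exhausted
```

Because forward() consumes records, a cursor can be walked only once; looping over it a second time yields nothing.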

data() extracts all the data as a list of dictionaries:

>>> from py2neo import Graph
>>> graph = Graph()
>>> graph.run("MATCH (a:Person) RETURN a.name, a.born LIMIT 4").data()
[{'a.born': 1964, 'a.name': 'Keanu Reeves'},
 {'a.born': 1967, 'a.name': 'Carrie-Anne Moss'},
 {'a.born': 1961, 'a.name': 'Laurence Fishburne'},
 {'a.born': 1960, 'a.name': 'Hugo Weaving'}]
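
Since data() yields ordinary dictionaries, the result can be processed with plain Python. The sketch below hard-codes the four rows shown above instead of querying a database, then sorts them and picks the earliest birth year.

```python
# The rows exactly as returned in the example above.
rows = [
    {"a.born": 1964, "a.name": "Keanu Reeves"},
    {"a.born": 1967, "a.name": "Carrie-Anne Moss"},
    {"a.born": 1961, "a.name": "Laurence Fishburne"},
    {"a.born": 1960, "a.name": "Hugo Weaving"},
]

# Keys are the RETURN expressions, so they contain a dot and need [] access.
by_birth = sorted(rows, key=lambda row: row["a.born"])
names_by_birth = [row["a.name"] for row in by_birth]
oldest = min(rows, key=lambda row: row["a.born"])

print(names_by_birth)
print(oldest["a.name"])  # Hugo Weaving
```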

evaluate(field=0) returns the first field from the next record:

>>> from py2neo import Graph
>>> g = Graph()
>>> g.run("MATCH (a) WHERE a.email=$x RETURN a.name", x="bob@acme.com").evaluate()
'Bob Robertson'

stats() returns query statistics:

>>> from py2neo import Graph
>>> g = Graph()
>>> g.run("CREATE (a:Person) SET a.name = 'Alice'").stats()
constraints_added: 0
constraints_removed: 0
contained_updates: True
indexes_added: 0
indexes_removed: 0
labels_added: 1
labels_removed: 0
nodes_created: 1
nodes_deleted: 0
properties_set: 1
relationships_created: 0
relationships_deleted: 0

to_data_frame(index=None, columns=None, dtype=None) returns the data as a pandas DataFrame:

>>> from py2neo import Graph
>>> graph = Graph()
>>> graph.run("MATCH (a:Person) RETURN a.name, a.born LIMIT 4").to_data_frame()
   a.born              a.name
0    1964        Keanu Reeves
1    1967    Carrie-Anne Moss
2    1961  Laurence Fishburne
3    1960        Hugo Weaving

3 py2neo.matching: Entity Matching

3.1 Node Matching

Use NodeMatcher to match nodes:

>>> from py2neo import Graph, NodeMatcher
>>> graph = Graph()
>>> matcher = NodeMatcher(graph)
>>> matcher.match("Person", name="Keanu Reeves").first()
(_224:Person {born:1964,name:"Keanu Reeves"})

Matching with a where clause:

>>> list(matcher.match("Person").where("_.name =~ 'K.*'"))
[(_57:Person {born: 1957, name: 'Kelly McGillis'}),
 (_80:Person {born: 1958, name: 'Kevin Bacon'}),
 (_83:Person {born: 1962, name: 'Kelly Preston'}),
 (_224:Person {born: 1964, name: 'Keanu Reeves'}),
 (_226:Person {born: 1966, name: 'Kiefer Sutherland'}),
 (_243:Person {born: 1957, name: 'Kevin Pollak'})]

Sort with order_by() and cap the number of results with limit():

>>> list(matcher.match("Person").where("_.name =~ 'K.*'").order_by("_.name").limit(3))
[(_224:Person {born: 1964, name: 'Keanu Reeves'}),
 (_57:Person {born: 1957, name: 'Kelly McGillis'}),
 (_83:Person {born: 1962, name: 'Kelly Preston'})]

To count the matches only, use len():

>>> len(matcher.match("Person").where("_.name =~ 'K.*'"))
6
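
To see what the where/order_by/limit chain computes, here is a pure-Python equivalent over the six people listed above, plus one non-matching row (Tom Hanks) added purely for illustration. It assumes the Cypher =~ operator must match the whole string, hence re.fullmatch. The matcher itself is not used, so this only demonstrates the logic, not how the query is executed.

```python
import re

people = [
    {"name": "Kelly McGillis", "born": 1957},
    {"name": "Kevin Bacon", "born": 1958},
    {"name": "Kelly Preston", "born": 1962},
    {"name": "Keanu Reeves", "born": 1964},
    {"name": "Kiefer Sutherland", "born": 1966},
    {"name": "Kevin Pollak", "born": 1957},
    {"name": "Tom Hanks", "born": 1956},  # illustrative row that should be filtered out
]

# where("_.name =~ 'K.*'"): keep names that match the pattern in full
matched = [p for p in people if re.fullmatch(r"K.*", p["name"])]

# order_by("_.name").limit(3): sort by name, then take the first three
top3 = sorted(matched, key=lambda p: p["name"])[:3]

print(len(matched))               # 6
print([p["name"] for p in top3])  # ['Keanu Reeves', 'Kelly McGillis', 'Kelly Preston']
```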

3.2 Relationship Matching with RelationshipMatcher

Its methods closely mirror those of node matching:

first()

order_by(*fields)

where(*conditions, **properties)

4 Object-Graph Mapping

Used to bind Python objects to the underlying graph data:

from py2neo.ogm import GraphObject, Property, RelatedTo, RelatedFrom


class Movie(GraphObject):
    __primarykey__ = "title"

    title = Property()
    tag_line = Property("tagline")
    released = Property()

    actors = RelatedFrom("Person", "ACTED_IN")
    directors = RelatedFrom("Person", "DIRECTED")
    producers = RelatedFrom("Person", "PRODUCED")


class Person(GraphObject):
    __primarykey__ = "name"

    name = Property()
    born = Property()

    acted_in = RelatedTo(Movie)
    directed = RelatedTo(Movie)
    produced = RelatedTo(Movie)

4.1 Graph Objects

GraphObject serves as the base class.

4.2 Property()

>>> class Person(GraphObject):
...     name = Property()
...
>>> alice = Person()
>>> alice.name = "Alice Smith"
>>> alice.name
'Alice Smith'

4.3 Label()

A label is exposed as a boolean value that defaults to False.

>>> class Food(GraphObject):
...     hot = Label()
...
>>> pizza = Food()
>>> pizza.hot
False
>>> pizza.hot = True
>>> pizza.hot
True
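
One way to picture a boolean label is as an attribute that is True exactly when a label name is present in the object's set of labels. The descriptor below is a standard-library sketch of that idea; BoolLabel and the labels set are invented for illustration, and py2neo's own Label class may work differently internally.

```python
class BoolLabel:
    """Descriptor that reads and writes membership in the owner's label set."""

    def __set_name__(self, owner, name):
        self.name = name  # remember the attribute name, e.g. "hot"

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return self.name in obj.labels

    def __set__(self, obj, value):
        if value:
            obj.labels.add(self.name)
        else:
            obj.labels.discard(self.name)


class Food:
    hot = BoolLabel()

    def __init__(self):
        self.labels = set()  # starts empty, so every label reads as False


pizza = Food()
print(pizza.hot)     # False
pizza.hot = True
print(pizza.hot)     # True
print(pizza.labels)  # {'hot'}
```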

4.4 Related Objects

class Person(GraphObject):
    __primarykey__ = "name"

    name = Property()

    likes = RelatedTo("Person")

# person is assumed to be a Person instance already loaded from the graph
for friend in person.likes:
    print(friend.name)

4.5 Object Matching

>>> Person.match(graph, "Keanu Reeves").first()
<Person name='Keanu Reeves'>
>>> list(Person.match(graph).where("_.name =~ 'K.*'"))
[<Person name='Keanu Reeves'>,
 <Person name='Kevin Bacon'>,
 <Person name='Kiefer Sutherland'>,
 <Person name='Kevin Pollak'>,
 <Person name='Kelly McGillis'>,
 <Person name='Kelly Preston'>]

4.6 Object Operations

>>> alice = Person()
>>> alice.name = "Alice Smith"
>>> graph.push(alice)
>>> alice.__node__
(_123:Person {name: 'Alice Smith'})