Setting Up a Graphical Environment for Running Gremlin

苏黎世黄昏
2018.09.07 17:42

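Background

Gremlin is the graph traversal language of the Apache TinkerPop framework. It supports both OLTP and OLAP and is the mainstream query language in the graph database field, comparable to what SQL is for relational databases.

HugeGraph is an open-source graph database developed in China that fully supports Gremlin. This article describes how to set up a graphical environment for running Gremlin on top of HugeGraph.

HugeGraph's GitHub repository hosts many subprojects; we only need two of them here: hugegraph and hugegraph-studio.

Deploying HugeGraphServer

Preparing the package

Option 1: build and package from source

Go to the hugegraph project and clone the repository.

Open a terminal: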

$ git clone git@github.com:hugegraph/hugegraph.git

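When it finishes, a hugegraph subdirectory appears in the current directory. It contains only source code, though, so we need to compile and package it to produce a runnable distribution.

Enter the hugegraph directory and run: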

$ git checkout release-0.7
$ mvn package -DskipTests

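Note: be sure to switch branches first. The master branch of hugegraph has already moved to 0.8.0, but studio does not appear to have been upgraded yet, so to avoid pitfalls we stick with the released version.

After a long stream of console output, seeing BUILD SUCCESS at the end means the packaging succeeded.

Figure: log of a successful build

The current directory now contains a new subdirectory hugegraph-0.7.4 and an archive hugegraph-0.7.4.tar.gz. This is the runnable package we are about to use.

I'm a bit of a neat freak and don't like keeping source code and binary packages together because they are easy to mix up, so I move hugegraph-0.7.4 up one level and delete the source directory. That leaves the parent directory nice and tidy again.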

$ mv hugegraph-0.7.4 ../hugegraph-0.7.4
$ cd ..
$ rm -rf hugegraph

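The package is now ready. This approach does require the jdk, git, and maven command-line tools to be installed locally, though. If you don't have them, that's fine: we can also download the official hugegraph release package directly.

Option 2: download the release directly

On the GitHub repository page, click releases in the navigation bar above the code.

You can see that hugegraph currently has two releases. Click hugegraph-0.7.4.tar.gz to start the download.

Figure: the releases page

Once the download finishes, just unpack it: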

$ tar -zxvf hugegraph-0.7.4.tar.gz

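After unpacking, you will see a hugegraph-0.7.4 directory, identical to the one produced by building from source.

Next, let's look at how to configure the parameters.

Configuring parameters

Although this section is titled "configuring parameters", hugegraph's default configuration already works out of the box in most environments. Still, a few important settings are worth explaining.

Enter the hugegraph-0.7.4 directory and edit the url (host + port) on which HugeGraphServer serves requests: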

$ vim conf/rest-server.properties
# bind url
restserver.url=http://127.0.0.1:8080

# gremlin url to connect
gremlinserver.url=http://127.0.0.1:8182

# graphs list with pair NAME:CONF_PATH
graphs=[hugegraph:conf/hugegraph.properties]

# authentication
#auth.require_authentication=
#auth.admin_token=
#auth.user_tokens=[]

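restserver.url is the address at which HugeGraphServer exposes its RESTful API. With the host set to 127.0.0.1, it can only be reached from the local machine; change the host and port parts as needed. Since I'm going to start studio locally as well and nothing else is using port 8080, I leave it unchanged.

graphs is a list of key-value pairs mapping graph names to configuration files. hugegraph:conf/hugegraph.properties means that HugeGraphServer exposes a graph instance named hugegraph whose configuration file is conf/hugegraph.properties. We can leave the graph's configuration file alone and just change the graph name if needed. Again, I leave it as is.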
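
To make the structure of these two settings concrete, here is a minimal Python sketch that pulls the host, port, and graph list out of the configuration text. The helper names parse_properties and parse_graphs are hypothetical, and the parser is deliberately simplified (plain key=value lines only); it is not how HugeGraphServer itself reads the file.

```python
from urllib.parse import urlparse


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def parse_graphs(value):
    """Turn '[name:path, ...]' into a {name: path} dict."""
    graphs = {}
    for pair in value.strip().strip("[]").split(","):
        if pair.strip():
            name, _, path = pair.strip().partition(":")
            graphs[name] = path
    return graphs


# The relevant lines of conf/rest-server.properties shown above.
SAMPLE = """
# bind url
restserver.url=http://127.0.0.1:8080
# graphs list with pair NAME:CONF_PATH
graphs=[hugegraph:conf/hugegraph.properties]
"""

props = parse_properties(SAMPLE)
url = urlparse(props["restserver.url"])
print(url.hostname, url.port)
print(parse_graphs(props["graphs"]))
```

With the default file, this reports host 127.0.0.1, port 8080, and a single graph named hugegraph backed by conf/hugegraph.properties.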

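Initializing the backend

Before the service can be started, hugegraph's backend has to be initialized manually. Don't let the word "manually" scare you, though; it just means running one command.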

$ bin/init-store.sh
Initing HugeGraph Store...
2018-09-07 16:02:12 1082  [main] [INFO ] com.baidu.hugegraph.cmd.InitStore [] - Init graph with config file: conf/hugegraph.properties
2018-09-07 16:02:12 1201  [main] [INFO ] com.baidu.hugegraph.HugeGraph [] - Opening backend store 'rocksdb' for graph 'hugegraph'
2018-09-07 16:02:12 1258  [main] [INFO ] com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore [] - Opening RocksDB with data path: rocksdb-data/schema
2018-09-07 16:02:12 1417  [main] [INFO ] com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore [] - Failed to open RocksDB 'rocksdb-data/schema' with database 'hugegraph', try to init CF later
2018-09-07 16:02:12 1445  [main] [INFO ] com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore [] - Opening RocksDB with data path: rocksdb-data/system
2018-09-07 16:02:12 1450  [main] [INFO ] com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore [] - Failed to open RocksDB 'rocksdb-data/system' with database 'hugegraph', try to init CF later
2018-09-07 16:02:12 1456  [main] [INFO ] com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore [] - Opening RocksDB with data path: rocksdb-data/graph
2018-09-07 16:02:12 1461  [main] [INFO ] com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore [] - Failed to open RocksDB 'rocksdb-data/graph' with database 'hugegraph', try to init CF later
2018-09-07 16:02:12 1491  [main] [INFO ] com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore [] - Store initialized: schema
2018-09-07 16:02:12 1511  [main] [INFO ] com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore [] - Store initialized: system
2018-09-07 16:02:12 1543  [main] [INFO ] com.baidu.hugegraph.backend.store.rocksdb.RocksDBStore [] - Store initialized: graph
2018-09-07 16:02:13 1804  [pool-3-thread-1] [INFO ] com.baidu.hugegraph.backend.Transaction [] - Clear cache on event 'store.init'

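Here we can see that hugegraph initialized the rocksdb backend. Why rocksdb and not something else? Because that is what is configured in conf/hugegraph.properties, the file mentioned in the previous step.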

$ vim conf/hugegraph.properties
# gremlin entrence to create graph
gremlin.graph=com.baidu.hugegraph.HugeFactory

# cache config
#schema.cache_capacity=1048576
#graph.cache_capacity=10485760
#graph.cache_expire=600

# schema illegal name template
#schema.illegal_name_regex=\s+|~.*

#vertex.default_label=vertex

backend=rocksdb
serializer=binary

store=hugegraph

# rocksdb backend config
#rocksdb.data_path=/path/to/disk
#rocksdb.wal_path=/path/to/disk
...

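Here, backend=rocksdb is the setting that selects rocksdb as the backend.

Other backends include memory, cassandra, scylladb, hbase, mysql, and palo. We don't need to worry about them here; the default rocksdb is fine.

Once initialization completes, a rocksdb-data directory appears in the current directory. This is where the backend data is stored, so don't delete or move it casually.

Note: the backend only needs to be initialized once, before the service is started for the first time; don't run it on every start. That said, running it again does no harm: hugegraph detects that the backend has already been initialized and skips it.

Starting the service

At last we can start the service, again with a single command: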

$ bin/start-hugegraph.sh
Starting HugeGraphServer...
Connecting to HugeGraphServer (http://127.0.0.1:8080/graphs)....OK

Seeing OK above means the service started successfully. We can check the process with jps.

$ jps
...
4101 HugeGraphServer
4233 Jps
...

If you still want more reassurance, send an HTTP request and see what comes back.

$ curl http://127.0.0.1:8080/graphs
{"graphs":["hugegraph"]}
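
The same check can be scripted. The following is a minimal Python sketch that uses only the standard library; the function names parse_graph_names and list_graphs are hypothetical, and the default base URL assumes the unmodified restserver.url from above.

```python
import json
from urllib.request import urlopen


def parse_graph_names(body):
    """Extract the list of graph names from the /graphs JSON response."""
    return json.loads(body)["graphs"]


def list_graphs(base_url="http://127.0.0.1:8080", timeout=5):
    """Fetch /graphs from a running HugeGraphServer and return the graph names."""
    with urlopen(base_url + "/graphs", timeout=timeout) as resp:
        return parse_graph_names(resp.read().decode("utf-8"))


# Parsing the response shown above needs no running server.
print(parse_graph_names('{"graphs":["hugegraph"]}'))
```

Calling list_graphs() against a running server should return the same list of names that curl printed; if the server is down, urlopen raises an error instead.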

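That completes the deployment of HugeGraphServer. Next, let's deploy HugeGraphStudio.

Deploying HugeGraphStudio

The steps are broadly the same as for HugeGraphServer, so I'll be less long-winded this time.

Remember to go back to the top-level directory first so the directories don't end up nested inside each other.

Preparing the package

Clone the repository: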

$ git clone git@github.com:hugegraph/hugegraph-studio.git
Cloning into 'hugegraph-studio'...
mux_client_request_session: read from master failed: Broken pipe
remote: Counting objects: 326, done.
remote: Compressing objects: 100% (189/189), done.
remote: Total 326 (delta 115), reused 324 (delta 113), pack-reused 0
Receiving objects: 100% (326/326), 1.60 MiB | 350.00 KiB/s, done.
Resolving deltas: 100% (115/115), done.

Build and package

Studio includes a front end implemented with react.js, so building it yourself requires tools such as npm and webpack.

$ cd hugegraph-studio
$ mvn package -DskipTests

Packaging studio takes a little longer.

...
[INFO] Reactor Summary:
[INFO]
[INFO] hugegraph-studio ................................... SUCCESS [  0.003 s]
[INFO] studio-api ......................................... SUCCESS [  4.683 s]
[INFO] studio-dist ........................................ SUCCESS [01:42 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:47 min
[INFO] Finished at: 2018-09-07T16:32:44+08:00
[INFO] Final Memory: 34M/390M
[INFO] ------------------------------------------------------------------------

Move the packaged directory up one level and delete the source directory (purely a personal preference).

$ mv hugegraph-studio-0.7.0 ../
$ cd ..
$ rm -rf hugegraph-studio

My top-level directory now contains just the two packages:

$ ls
hugegraph-0.7.4        hugegraph-studio-0.7.0

Configuring parameters

Enter the hugegraph-studio-0.7.0 directory and edit its one and only configuration file.

$ cd  hugegraph-studio-0.7.0
$ vim conf/hugegraph-studio.properties
studio.server.port=8088
studio.server.host=localhost

graph.server.host=localhost
graph.server.port=8080
graph.name=hugegraph

# the directory name released by react
studio.server.ui=ui
# the file location of studio-api.war
studio.server.api.war=war/studio-api.war
# default folder in your home directory, set to a non-empty value to override
data.base_directory=~/.hugegraph-studio

show.limit.data=250
show.limit.edge.total=1000
show.limit.edge.increment=20

# separator ','
gremlin.limit_suffix=[.V(),.E(),.hasLabel(STR),.hasLabel(NUM),.path()]

The parameters that may need changing are graph.server.host=localhost, graph.server.port=8080, and graph.name=hugegraph. They correspond to entries in HugeGraphServer's configuration file conf/rest-server.properties:

  • graph.server.host=localhost corresponds to the host in restserver.url=http://127.0.0.1:8080;
  • graph.server.port=8080 corresponds to the port in restserver.url=http://127.0.0.1:8080;
  • graph.name=hugegraph corresponds to the graph name in graphs=[hugegraph:conf/hugegraph.properties].

Since I didn't change HugeGraphServer's conf/rest-server.properties earlier, there is no need to change HugeGraphStudio's conf/hugegraph-studio.properties here either.
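
The correspondence between the two files can be expressed as a small consistency check. This is only a rough Python sketch: check_studio_config and same_host are hypothetical helpers, the settings are passed in as plain dicts, and treating localhost and 127.0.0.1 as the same host is my own simplifying assumption.

```python
from urllib.parse import urlparse

LOOPBACK = {"localhost", "127.0.0.1"}


def same_host(a, b):
    """Treat localhost and 127.0.0.1 as the same machine (an assumption)."""
    return a == b or (a in LOOPBACK and b in LOOPBACK)


def check_studio_config(server, studio):
    """Return a list of mismatches between the server and studio settings."""
    problems = []
    url = urlparse(server["restserver.url"])
    if not same_host(studio["graph.server.host"], url.hostname):
        problems.append("graph.server.host does not match the host in restserver.url")
    if int(studio["graph.server.port"]) != url.port:
        problems.append("graph.server.port does not match the port in restserver.url")
    # graphs looks like [name:path, ...]; keep only the names.
    names = [pair.split(":", 1)[0].strip()
             for pair in server["graphs"].strip("[]").split(",")]
    if studio["graph.name"] not in names:
        problems.append("graph.name is not listed in graphs")
    return problems


server = {
    "restserver.url": "http://127.0.0.1:8080",
    "graphs": "[hugegraph:conf/hugegraph.properties]",
}
studio = {
    "graph.server.host": "localhost",
    "graph.server.port": "8080",
    "graph.name": "hugegraph",
}
print(check_studio_config(server, studio))
```

With the default values of both files the list of problems is empty, which matches the observation that nothing needs to be changed.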

Starting the service

$ bin/hugegraph-studio.sh

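By default studio does not run in the background, so a long stream of logs appears on the console. The following lines at the very bottom indicate a successful start: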

信息: Starting ProtocolHandler [http-nio-127.0.0.1-8088]
16:56:24.507 [main] INFO  com.baidu.hugegraph.studio.HugeGraphStudio ID:  TS: - HugeGraphStudio is now running on: http://localhost:8088

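Then, following the hint, open http://localhost:8088 in a browser to reach the studio interface:

Figure: the studio interface

The box under Gremlin in the figure is where we enter gremlin statements to operate on hugegraph. An example follows.

Creating a relationship graph

The following content is based on the CSDN blog post "通过Gremlin语言构建关系图并进行图分析" (Building a relationship graph with Gremlin and analyzing it).

Enter the following code in the input box to create a "TinkerPop relationship graph":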

// PropertyKey
graph.schema().propertyKey("name").asText().ifNotExist().create()
graph.schema().propertyKey("age").asInt().ifNotExist().create()
graph.schema().propertyKey("addr").asText().ifNotExist().create()
graph.schema().propertyKey("lang").asText().ifNotExist().create()
graph.schema().propertyKey("tag").asText().ifNotExist().create()
graph.schema().propertyKey("weight").asFloat().ifNotExist().create()

// VertexLabel
graph.schema().vertexLabel("person").properties("name", "age", "addr", "weight").useCustomizeStringId().ifNotExist().create()
graph.schema().vertexLabel("software").properties("name", "lang", "tag", "weight").primaryKeys("name").ifNotExist().create()
graph.schema().vertexLabel("language").properties("name", "lang", "weight").primaryKeys("name").ifNotExist().create()

// EdgeLabel
graph.schema().edgeLabel("knows").sourceLabel("person").targetLabel("person").properties("weight").ifNotExist().create()
graph.schema().edgeLabel("created").sourceLabel("person").targetLabel("software").properties("weight").ifNotExist().create()
graph.schema().edgeLabel("contains").sourceLabel("software").targetLabel("software").properties("weight").ifNotExist().create()
graph.schema().edgeLabel("define").sourceLabel("software").targetLabel("language").properties("weight").ifNotExist().create()
graph.schema().edgeLabel("implements").sourceLabel("software").targetLabel("software").properties("weight").ifNotExist().create()
graph.schema().edgeLabel("supports").sourceLabel("software").targetLabel("language").properties("weight").ifNotExist().create()

// TinkerPop
okram = graph.addVertex(T.label, "person", T.id, "okram", "name", "Marko A. Rodriguez", "age", 29, "addr", "Santa Fe, New Mexico", "weight", 1)
spmallette = graph.addVertex(T.label, "person", T.id, "spmallette", "name", "Stephen Mallette", "age", 0, "addr", "", "weight", 1)

tinkerpop = graph.addVertex(T.label, "software", "name", "TinkerPop", "lang", "java", "tag", "Graph computing framework", "weight", 1)
tinkergraph = graph.addVertex(T.label, "software", "name", "TinkerGraph", "lang", "java", "tag", "In-memory property graph", "weight", 1)
gremlin = graph.addVertex(T.label, "language", "name", "Gremlin", "lang", "groovy/python/javascript", "weight", 1)

okram.addEdge("created", tinkerpop, "weight", 1)
spmallette.addEdge("created", tinkerpop, "weight", 1)

okram.addEdge("knows", spmallette, "weight", 1)

tinkerpop.addEdge("define", gremlin, "weight", 1)
tinkerpop.addEdge("contains", tinkergraph, "weight", 1)
tinkergraph.addEdge("supports", gremlin, "weight", 1)

// Titan
dalaro = graph.addVertex(T.label, "person", T.id, "dalaro", "name", "Dan LaRocque", "age", 0, "addr", "", "weight", 1)
mbroecheler = graph.addVertex(T.label, "person", T.id, "mbroecheler", "name", "Matthias Broecheler", "age", 29, "addr", "San Francisco", "weight", 1)

titan = graph.addVertex(T.label, "software", "name", "Titan", "lang", "java", "tag", "Graph Database", "weight", 1)

dalaro.addEdge("created", titan, "weight", 1)
mbroecheler.addEdge("created", titan, "weight", 1)
okram.addEdge("created", titan, "weight", 1)

dalaro.addEdge("knows", mbroecheler, "weight", 1)

titan.addEdge("implements", tinkerpop, "weight", 1)
titan.addEdge("supports", gremlin, "weight", 1)

// HugeGraph
javeme = graph.addVertex(T.label, "person", T.id, "javeme", "name", "Jermy Li", "age", 29, "addr", "Beijing", "weight", 1)
zhoney = graph.addVertex(T.label, "person", T.id, "zhoney", "name", "Zhoney Zhang", "age", 29, "addr", "Beijing", "weight", 1)
linary = graph.addVertex(T.label, "person", T.id, "linary", "name", "Linary Li", "age", 28, "addr", "Wuhan, Hubei", "weight", 1)

hugegraph = graph.addVertex(T.label, "software", "name", "HugeGraph", "lang", "java", "tag", "Graph Database", "weight", 1)

javeme.addEdge("created", hugegraph, "weight", 1)
zhoney.addEdge("created", hugegraph, "weight", 1)
linary.addEdge("created", hugegraph, "weight", 1)

javeme.addEdge("knows", zhoney, "weight", 1)
javeme.addEdge("knows", linary, "weight", 1)

hugegraph.addEdge("implements", tinkerpop, "weight", 1)
hugegraph.addEdge("supports", gremlin, "weight", 1)

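Click the triangle button in the upper right corner, and the graph is created.

Querying the graph

Enter the following in the input box: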

g.V()

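This brings up all the vertices of the graph created above, which studio displays together with the edges between them.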
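
The same query can in principle be sent without studio. The Python sketch below rests on assumptions that are not part of this walkthrough: that HugeGraphServer exposes a /gremlin endpoint accepting a JSON body with gremlin, bindings, language, and aliases fields, and that the traversal alias for a graph named hugegraph is __g_hugegraph. Check both against the REST documentation for your version before relying on them. The function names are hypothetical.

```python
import json
from urllib.request import Request, urlopen


def build_gremlin_payload(query, graph_name="hugegraph"):
    """Build the request body; the field names and alias format are assumptions."""
    return {
        "gremlin": query,
        "bindings": {},
        "language": "gremlin-groovy",
        # Assumed convention: 'graph' maps to the graph name and
        # 'g' maps to '__g_' + graph name.
        "aliases": {"graph": graph_name, "g": "__g_" + graph_name},
    }


def run_gremlin(query, base_url="http://127.0.0.1:8080", graph_name="hugegraph"):
    """POST the query to the assumed /gremlin endpoint and return the parsed JSON."""
    data = json.dumps(build_gremlin_payload(query, graph_name)).encode("utf-8")
    req = Request(base_url + "/gremlin", data=data,
                  headers={"Content-Type": "application/json"})
    with urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the payload needs no running server.
print(json.dumps(build_gremlin_payload("g.V()"), indent=2))
```

Only build_gremlin_payload runs here; run_gremlin("g.V()") requires the server started earlier to be up and listening on port 8080.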

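Figure: the TinkerPop relationship graph

With that, the graphical environment for running Gremlin is fully set up, and you can go on to try all kinds of cool gremlin queries.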
