Neo4j: Database Management

Reading notes on Chapter 5 of 《Neo4j权威指南》 (The Definitive Guide to Neo4j)

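Monitoring

[Screenshot]

Explanation:

  • Store Size: storage size, obtained by asking the operating system for the size of the Neo4j database store files

    As covered earlier when looking at how Neo4j stores data, nodes, relationships, properties, long strings, and indexes are all kept in separate files. As the screenshot shows, the current total size is 1.4GB

  • IDAllocation: the number of IDs allocated

  • PageCache: page cache (why does this one show no numbers?)

  • Transactions: transaction statistics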

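Metric             Meaning
ArrayStore         Array store size
IndexStore         Index store size
LabelStore         Label store size
NodeStore          Node store size
RelationshipStore  Relationship store size
PropertyStore      Property store size
Last Tx ID         ID of the last committed transaction
Current            Number of currently open transactions
Peak               Peak number of concurrent transactions
Opened             Total number of transactions opened
Committed          Total number of transactions committed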
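
Since Store Size is simply what the operating system reports for the store files, the same figure can be approximated from outside Neo4j. The following is a minimal sketch, not how Neo4j itself computes the number; the directory path in the usage part is an assumption and should be replaced with your own graph.db location.

```python
import os


def store_size_bytes(store_dir):
    """Sum the sizes of all regular files under store_dir, as reported by the OS."""
    total = 0
    for root, _dirs, files in os.walk(store_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total


def human_readable(num_bytes):
    """Format a byte count using binary units (B, KiB, MiB, GiB)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024 or unit == "GiB":
            return f"{size:.1f} {unit}"
        size /= 1024


if __name__ == "__main__":
    # Hypothetical path; a missing directory simply yields a size of 0.
    print(human_readable(store_size_bytes("data/databases/graph.db")))
```

Running it against the real store directory should give a total close to the Store Size shown in the monitoring view.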

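Query Management

Enable query timeout protection:

dbms.transaction.timeout=10s  # the default, 0, disables query timeout protection; timeouts passed explicitly through the Java API are not affected

Operations on running queries (I could not find these in Community Edition):

dbms.listQueries() lists the queries that are currently running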

dbms.killQuery(queryId)

dbms.killQueries([queryId, queryId, ....])

Usage Data Collector

dbms.udc.enabled=false disables data collection (the default is true)

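Security Management

Community Edition does not provide security management features such as roles or permission control for users.

It only provides:

dbms.security.listUsers

dbms.security.showCurrentUser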

dbms.security.changePassword

dbms.security.createUser

dbms.security.deleteUser

Operations and Tuning

Memory Configuration

./bin/neo4j-admin memrec  # prints recommended memory settings

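  • Operating system memory

    OS memory = available memory - (page cache + heap size)

    Enough memory must be left for the operating system's file buffers to hold the index and schema directories. Otherwise the index files cannot be fully loaded into memory, which hurts query performance. The basic calculation is:

    OS memory = 1GB + (size of graph.db/index) + (size of graph.db/schema)

    Our graph.db is currently about 1.5GB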

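  • Page cache size:

The page cache holds Neo4j data that is stored on disk. Keeping most of the data cached in memory improves query performance.

dbms.memory.pagecache.size: make it large enough to hold all the data, plus a 20% reserve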

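  • Heap size

A sufficiently large heap is very helpful for sustaining concurrent operations; 8 to 16GB is the general recommendation

dbms.memory.heap.initial_size

dbms.memory.heap.max_size

Set both to the same value (e.g. 16000) to avoid unnecessary garbage collection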
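
The three sizing rules above can be combined into a small worked calculation. This is only a sketch of the arithmetic in these notes: the 32 GiB machine and the index and schema directory sizes are made-up illustration values, and only the 1.5GB store size comes from the text.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def os_reserve_bytes(index_bytes, schema_bytes):
    """OS memory = 1GB + size of graph.db/index + size of graph.db/schema."""
    return 1 * GIB + index_bytes + schema_bytes


def page_cache_bytes(store_bytes, headroom_percent=20):
    """Page cache = store size plus a reserve (20% by default)."""
    return store_bytes * (100 + headroom_percent) // 100


def heap_budget_bytes(total_bytes, index_bytes, schema_bytes, store_bytes):
    """Rearranged rule: heap = available memory - OS memory - page cache."""
    return (total_bytes
            - os_reserve_bytes(index_bytes, schema_bytes)
            - page_cache_bytes(store_bytes))


if __name__ == "__main__":
    total = 32 * GIB            # hypothetical machine size
    store = 1536 * MIB          # about 1.5GB, as in the notes
    index = 200 * MIB           # hypothetical size of graph.db/index
    schema = 100 * MIB          # hypothetical size of graph.db/schema

    print("OS reserve MiB:", os_reserve_bytes(index, schema) // MIB)
    print("page cache MiB:", page_cache_bytes(store) // MIB)
    print("heap budget MiB:", heap_budget_bytes(total, index, schema, store) // MIB)
```

The heap budget is an upper bound rather than a target: the notes recommend a heap of 8 to 16GB, so any remaining memory is better left to the page cache or the operating system.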

  • JVM tuning for the heap:

    dbms.jvm.additional: additional JVM parameters

        #**************************************************************
        # JVM Parameters
        #**********************************************************
        
        # G1GC generally strikes a good balance between throughput and tail
        # latency, without too much tuning.
        dbms.jvm.additional=-XX:+UseG1GC

        # Have common exceptions keep producing stack traces, so they can be
        # debugged regardless of how often logs are rotated.
        dbms.jvm.additional=-XX:-OmitStackTraceInFastThrow
        
        # Make sure that `initmemory` is not only allocated, but committed to
        # the process, before starting the database. This reduces memory
        # fragmentation, increasing the effectiveness of transparent huge
        # pages. It also reduces the possibility of seeing performance drop
        # due to heap-growing GC events, where a decrease in available page
        # cache leads to an increase in mean IO response time.
        # Try reducing the heap memory, if this flag degrades performance.
        dbms.jvm.additional=-XX:+AlwaysPreTouch
        
        # Trust that non-static final fields are really final.
        # This allows more optimizations and improves overall performance.
        # NOTE: Disable this if you use embedded mode, or have extensions or dependencies that may use reflection or
        # serialization to change the value of final fields!
        dbms.jvm.additional=-XX:+UnlockExperimentalVMOptions
        dbms.jvm.additional=-XX:+TrustFinalNonStaticFields
        
        # Disable explicit garbage collection, which is occasionally invoked by the JDK itself.
        dbms.jvm.additional=-XX:+DisableExplicitGC
        dbms.jvm.additional=-Xss100M
        
        # Remote JMX monitoring, uncomment and adjust the following lines as needed. Absolute paths to jmx.access and
        # jmx.password files are required.
        # Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
        # the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
        # For more details, see: http://download.oracle.com/javase/8/docs/technotes/guides/management/agent.html
        # On Unix based systems the jmx.password file needs to be owned by the user that will run the server,
        # and have permissions set to 0600.
        # For details on setting these file permissions on Windows see:
        #     http://docs.oracle.com/javase/8/docs/technotes/guides/management/security-windows.html
        #dbms.jvm.additional=-Dcom.sun.management.jmxremote.port=3637
        #dbms.jvm.additional=-Dcom.sun.management.jmxremote.authenticate=true
        #dbms.jvm.additional=-Dcom.sun.management.jmxremote.ssl=false
        #dbms.jvm.additional=-Dcom.sun.management.jmxremote.password.file=/absolute/path/to/conf/jmx.password
        #dbms.jvm.additional=-Dcom.sun.management.jmxremote.access.file=/absolute/path/to/conf/jmx.access
        
        # Some systems cannot discover host name automatically, and need this line configured:
        #dbms.jvm.additional=-Djava.rmi.server.hostname=$THE_NEO4J_SERVER_HOSTNAME
        
        # Expand Diffie Hellman (DH) key size from default 1024 to 2048 for DH-RSA cipher suites used in server TLS handshakes.
        # This is to protect the server from any potential passive eavesdropping.
        dbms.jvm.additional=-Djdk.tls.ephemeralDHKeySize=2048

        # This mitigates a DDoS vector.
        dbms.jvm.additional=-Djdk.tls.rejectClientInitiatedRenegotiation=true
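
Because dbms.jvm.additional may appear many times, copy-pasted configuration can easily end up with the same flag listed twice. Here is a minimal sketch that collects the active (uncommented) flags and reports repeats. The function names and the sample text are illustrative only and are not part of any Neo4j tooling.

```python
from collections import Counter


def jvm_additional_flags(conf_text):
    """Return every active dbms.jvm.additional value, in file order."""
    flags = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split on the first '=' only; the flag itself may contain '='.
        key, sep, value = line.partition("=")
        if sep and key.strip() == "dbms.jvm.additional":
            flags.append(value.strip())
    return flags


def duplicated_flags(conf_text):
    """Return the flags that occur more than once, in first-seen order."""
    counts = Counter(jvm_additional_flags(conf_text))
    return [flag for flag, n in counts.items() if n > 1]


SAMPLE = """\
# JVM Parameters
dbms.jvm.additional=-XX:+UseG1GC
dbms.jvm.additional=-XX:+DisableExplicitGC
dbms.jvm.additional=-Xss100M
dbms.jvm.additional=-XX:+DisableExplicitGC
dbms.jvm.additional=-Xss100M
#dbms.jvm.additional=-Dcom.sun.management.jmxremote.port=3637
dbms.jvm.additional=-Djdk.tls.ephemeralDHKeySize=2048
"""

if __name__ == "__main__":
    for flag in duplicated_flags(SAMPLE):
        print("duplicate:", flag)
```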

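(1) Ratio between the old and young generations:

   -XX:NewRatio=N  # old generation / young generation = N, with N typically between 2 and 8. A larger young generation suits workloads that modify or sort large amounts of data, and running many concurrent threads also calls for more young-generation space, but an overly large young generation tends to cause frequent full GCs

(2) Heap size:

   The ultimate goal is to reduce GC time. An oversized heap rarely triggers GC, but when GC does run it takes a long time

(3) Concurrent garbage collection:

   -XX:+UseG1GC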
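
To make the NewRatio setting concrete, here is the arithmetic for a fixed heap. It only illustrates the ratio definition (old / young = N). The 16000 MB heap reuses the example value from the heap-size notes, and the function name is made up for this sketch.

```python
def generation_sizes(heap_mb, new_ratio):
    """Split a heap into (young_mb, old_mb) so that old / young == new_ratio."""
    if new_ratio < 1:
        raise ValueError("new_ratio must be at least 1")
    young = heap_mb / (new_ratio + 1)
    old = heap_mb - young
    return young, old


if __name__ == "__main__":
    for n in (2, 3, 7, 8):
        young, old = generation_sizes(16000, n)
        print(f"NewRatio={n}: young={young:.0f} MB, old={old:.0f} MB")
```

With N=3 the young generation takes a quarter of the heap, and with N=7 it takes an eighth, so a smaller N is what gives the young generation more room.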

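System tuning:

Neo4j performs a large number of read and write operations. Linux defaults to the CFQ (Completely Fair Queuing) I/O scheduler. The deadline scheduler is a better fit for database I/O workloads: it prioritizes reads, reducing read wait time at the cost of longer write wait time.

Change the scheduler for drive sda: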

echo 'deadline' > /sys/block/sda/queue/scheduler

cat  /sys/block/sda/queue/scheduler
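
The cat command prints every available scheduler on one line, with the active one in square brackets (for example `noop [deadline] cfq`). A small sketch for checking this from a script follows. The helper names are invented for illustration, and newer kernels may list different scheduler names, such as mq-deadline.

```python
import os
import re


def active_scheduler(sysfs_text):
    """Return the scheduler name shown in square brackets, or None if absent."""
    match = re.search(r"\[([^\]]+)\]", sysfs_text)
    return match.group(1) if match else None


def available_schedulers(sysfs_text):
    """Return all scheduler names listed, with the brackets stripped."""
    return [name.strip("[]") for name in sysfs_text.split()]


if __name__ == "__main__":
    path = "/sys/block/sda/queue/scheduler"  # same device as in the notes
    if os.path.exists(path):
        with open(path) as fh:
            text = fh.read()
        print("active:", active_scheduler(text))
        print("available:", available_schedulers(text))
```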

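Disk and memory:

Fast SSD > SSD > HDD

A freshly started instance needs a warm-up to load indexes into memory: CALL apoc.warmup.run(true);

Use tools such as dstat or vmstat to monitor disk and memory activity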

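Backup and restore:

Enterprise Edition provides the neo4j-backup tool, which backs up to a remote or local location and supports both full and incremental backups

dbms.backup.enabled=true # true is already the default
dbms.backup.address=<hostname/IP>:6362

With Community Edition you have to write your own backup script. Copying the data files directly also works, but the official recommendation is the safer dump and load tooling (which requires stopping the server)

neo4j-admin dump --database=graph.db --to=${file}  # back up to a compressed archive

neo4j-admin load --from=${file_catalog}/${file} --database=graph.db --force  # restore the data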
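
For a home-grown Community Edition backup script, the two commands above can be wrapped so that every dump gets a timestamped file name. This is a sketch under assumptions: the backup directory, the file naming scheme, and the helper names are all hypothetical. The commands are only printed here, because the server has to be stopped before they are actually run.

```python
import datetime
import subprocess


def dump_command(database, backup_dir, now=None, admin="neo4j-admin"):
    """Build the argv for `neo4j-admin dump` with a timestamped target file."""
    now = now or datetime.datetime.now()
    target = f"{backup_dir}/{database}-{now:%Y%m%d-%H%M%S}.dump"
    return [admin, "dump", f"--database={database}", f"--to={target}"]


def load_command(database, dump_file, admin="neo4j-admin"):
    """Build the argv for `neo4j-admin load`, overwriting the existing database."""
    return [admin, "load", f"--from={dump_file}",
            f"--database={database}", "--force"]


def run(cmd):
    """Execute a command and raise CalledProcessError if it fails."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical backup directory; stop Neo4j before calling run() on these.
    print(" ".join(dump_command("graph.db", "/var/backups/neo4j")))
    print(" ".join(load_command("graph.db", "/var/backups/neo4j/example.dump")))
```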
