Beginner Q&A: nodeos keeps shutting down by itself after connecting to the mainnet. What is the optimal configuration?

tim · 2018-09-17 · Last reply by JasonPan on 2018-10-17 · 1030 views

Here are the error details:

2018-09-17T07:14:52.270 thread-0   net_plugin.cpp:2133           operator()           ] Error reading message from peer.main.alohaeos.com:9876: Bad file descriptor
2018-09-17T07:14:53.600 thread-0   net_plugin.cpp:2133           operator()           ] Error reading message from m.eosvibes.io:9876: Connection reset by peer
2018-09-17T07:14:53.693 thread-0   chain_plugin.cpp:915          log_guard_exception  ] Database has reached an unsafe level of usage, shutting down to avoid corrupting the database.  Please increase the value set for "chain-state-db-size-mb" and restart the process!
2018-09-17T07:14:53.693 thread-0   chain_plugin.cpp:921          log_guard_exception  ] Details: 3060101 database_guard_exception: Database usage is at unsafe levels
database free: 134203520, guard size: 1073741824
    {"f":134203520,"g":1073741824}
    thread-0  controller.cpp:1843 validate_db_available_size
2018-09-17T07:14:53.693 thread-0   net_plugin.cpp:3042           plugin_shutdown      ] shutdown..
2018-09-17T07:14:53.693 thread-0   net_plugin.cpp:3045           plugin_shutdown      ] close acceptor
2018-09-17T07:14:53.693 thread-0   net_plugin.cpp:3048           plugin_shutdown      ] close 35 connections
2018-09-17T07:14:53.693 thread-0   net_plugin.cpp:1366           request_next_chunk   ] Unable to continue syncing at this time
2018-09-17T07:14:53.694 thread-0   net_plugin.cpp:3056           plugin_shutdown      ] exit shutdown

4 replies

Do you have enough memory?

Did you solve it? I ran into the same problem; any help would be appreciated.

Same here; my node dies frequently.

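Problem analysis

EOSIO keeps its state database in shared memory (which is also why the build requires 7 GB of RAM). In essence it is a memory-mapped file that is read and written through the mapping, and its default size is currently 1 GB. For performance, the code tries to lock this memory in RAM; if locking fails, it only prints a warning (code location: libraries\chainbase\src\chainbase.cpp, database::database):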

#ifndef _WIN32
         int r = mlock( _segment->get_address(), _segment->get_size() );
         if (r != 0 ) {
            //we cannot use fc library here, which means that this message doesn't go to graylog even if you have configure it
            //also it doesn't looks as nice as warnings generated by fc
            //this message is for 1.0.2 because failing here would be incompatibel with 1.0
            //for 1.1 it probably will be changed to throw an exception
            std::cerr << "CHAINBASE:   Failed to pin chainbase shared memory (of size " << (_segment->get_size() / (1024.0*1024.0))
                      << " MB) in RAM. Performance degradation is possible." << std::endl;
         }
#endif
      }

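In short, the state database is a fixed-size memory-mapped file, so it has to be sized generously up front; otherwise, once the file is full, allocating space for new data fails with an error.

Solution

The size of the state database is controlled by the chain-state-db-size-mb option in the node's configuration file, as shown below: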

# Maximum size (in MiB) of the chain state database (eosio::chain_plugin)
chain-state-db-size-mb = 4096

The official recommendation for this size is now 4 GB. When you set chain-state-db-size-mb, it is also recommended to set the state database guard size, chain-state-db-guard-size-mb:

# Safely shut down node when free space remaining in the chain state database drops below this size (in MiB). (eosio::chain_plugin)
chain-state-db-guard-size-mb = 2048

With this setting, the node shuts itself down safely once the free space in the state database drops below the guard size.
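
To make the guard behaviour concrete, here is a minimal Python sketch of the comparison that the log in the question implies. It is not the nodeos code: the helper name below_guard is hypothetical, and it assumes the node shuts down whenever free space is below the guard size, as the option description says.

```python
MIB = 1024 * 1024  # chain-state-db-guard-size-mb is expressed in MiB


def below_guard(free_bytes: int, guard_size_mb: int) -> bool:
    """Hypothetical helper: True when free space has dropped below the guard."""
    return free_bytes < guard_size_mb * MIB


# Values from the log in the question:
# "database free: 134203520, guard size: 1073741824" (1073741824 bytes = 1024 MiB)
print(below_guard(134203520, 1024))  # True
```

With only about 128 MiB free against a 1024 MiB guard, the check trips, which matches the shutdown in the log.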
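This configuration is still not optimal. If every node used the same settings, most of them could stop at around the same time, which is very risky: once half of the nodes are down, reversible blocks pile up and cannot be confirmed as irreversible, and forks can occur when the nodes start again.
So a more thorough solution is needed.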

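The ultimate solution

Enable the db_size_api_plugin plugin in the configuration file (plugin = eosio::db_size_api_plugin). It exposes an HTTP API endpoint, v1/db_size/get, that reports the current usage of the node's state database, for example: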

{
    "free_bytes": 1066210352,
    "indices": [
        {
            "index": "eosio::account_control_history_object",
            "row_count": 0
        },
        ...
    ],
    "size": 1073741808,
    "used_bytes": 7531456
}

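So the operations engineer should set up a scheduled job that periodically reads the node's state database usage. Once free space comes within a chosen warning percentage of the guard size, the job alerts the engineer, who stops the node, raises the database size and guard size, restarts it, and lets it finish syncing blocks before it resumes producing.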
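
A rough Python sketch of such a scheduled check follows. Several things here are assumptions rather than anything stated in this thread: the address http://127.0.0.1:8888, that the endpoint answers a plain GET, the helper names fetch_db_size and should_alert, and reading "warning percentage" as "alert while free space is still 1.5 times the guard size". Only the endpoint path and the free_bytes field come from the response shown above.

```python
import json
import urllib.request

MIB = 1024 * 1024


def fetch_db_size(url: str = "http://127.0.0.1:8888/v1/db_size/get") -> dict:
    """Read usage figures from db_size_api_plugin (host and port are assumptions)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def should_alert(stats: dict, guard_size_mb: int, warn_factor: float = 1.5) -> bool:
    """Hypothetical rule: alert while free space is still warn_factor times the
    guard size, so there is time to act before the node shuts itself down."""
    free = int(stats["free_bytes"])
    return free < guard_size_mb * MIB * warn_factor


def check_once(guard_size_mb: int) -> None:
    """One run of the check; call this from cron or another scheduler."""
    stats = fetch_db_size()
    if should_alert(stats, guard_size_mb):
        # Replace print with your own notification channel (mail, chat, pager).
        print("WARNING: state database free space is close to the guard size:",
              stats["free_bytes"], "bytes free")
```

For example, a cron entry could run a small script that calls check_once(2048) every few minutes, with the argument kept in step with chain-state-db-guard-size-mb.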
