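During stress testing, a front-service node was temporarily changed to an observer node because its block height differed, and it cannot be restored to a consensus node #525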

Open
AndyLvVip opened this issue Nov 20, 2024 · 0 comments


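Background and symptoms:

  1. webase-node-manager version: v1.5.5
  2. FISCO BCOS blockchain version: v2.9.0
  3. Test scenario: there are 3 groups (group1, group2, group3), and group3 is the one under stress test. group3 has 5 nodes, all of which are consensus nodes.
  4. While stress-testing on-chain data writes against front node node A, the block height of the front node became inconsistent with the highest block height of the group (a temporary inconsistency may be allowed, or it may be a code bug), which caused front node node A to be updated to an observer node.

Problem:

  1. After node A is updated to an observer node, it can never be restored to a consensus node.

Issue located so far, with a preliminary root-cause analysis:

  1. For node A to be restored to a consensus node, its block height must first be updated correctly. The scheduled task that updates node A's height fails with the following error: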
2024-11-20 14:52:55.566 [node-mgr-task-12] INFO  ChainService(ChainService.java:518) - Run task:[DeployType:0, isChainRunning:false]
2024-11-20 14:52:55.611 [node-mgr-task-12] ERROR FrontRestTools(FrontRestTools.java:382) - fail restTemplateExchange. frontList is empty groupId:3
2024-11-20 14:52:55.611 [node-mgr-task-12] ERROR NodeStatusMonitorTask(NodeStatusMonitorTask.java:103) - in checkNodeStatusByGroup checkAndUpdateNodeStatus error: []
com.webank.webase.node.mgr.base.exception.NodeMgrException: all front of group: 3 is stopped
        at com.webank.webase.node.mgr.front.frontinterface.FrontRestTools.restTemplateExchange(FrontRestTools.java:383) ~[main/:?]
        at com.webank.webase.node.mgr.front.frontinterface.FrontRestTools.getForEntity(FrontRestTools.java:343) ~[main/:?]
        at com.webank.webase.node.mgr.front.frontinterface.FrontRestTools$$FastClassBySpringCGLIB$$a5c6faad.invoke(<generated>) ~[main/:?]
        at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218) ~[spring-core-5.3.26.jar:5.3.26]
        at org.springframework.aop.framework.CglibAopProxy.invokeMethod(CglibAopProxy.java:386) ~[spring-aop-5.3.26.jar:5.3.26]
        at org.springframework.aop.framework.CglibAopProxy.access$000(CglibAopProxy.java:85) ~[spring-aop-5.3.26.jar:5.3.26]
        at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:704) ~[spring-aop-5.3.26.jar:5.3.26]
        at com.webank.webase.node.mgr.front.frontinterface.FrontRestTools$$EnhancerBySpringCGLIB$$e50c39d8.getForEntity(<generated>) ~[main/:?]
        at com.webank.webase.node.mgr.front.frontinterface.FrontInterfaceService.getConsensusStatus(FrontInterfaceService.java:411) ~[main/:?]
        at com.webank.webase.node.mgr.node.NodeService.getPeerOfConsensusStatus(NodeService.java:327) ~[main/:?]
        at com.webank.webase.node.mgr.node.NodeService.checkAndUpdateNodeStatus(NodeService.java:215) ~[main/:?]
        at com.webank.webase.node.mgr.node.NodeService$$FastClassBySpringCGLIB$$2a65b731.invoke(<generated>) ~[main/:?]
        at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218) ~[spring-core-5.3.26.jar:5.3.26]
        at org.springframework.aop.framework.CglibAopProxy.invokeMethod(CglibAopProxy.java:386) ~[spring-aop-5.3.26.jar:5.3.26]
        at org.springframework.aop.framework.CglibAopProxy.access$000(CglibAopProxy.java:85) ~[spring-aop-5.3.26.jar:5.3.26]
        at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:704) ~[spring-aop-5.3.26.jar:5.3.26]
        at com.webank.webase.node.mgr.node.NodeService$$EnhancerBySpringCGLIB$$d27355a2.checkAndUpdateNodeStatus(<generated>) ~[main/:?]
        at com.webank.webase.node.mgr.alert.task.NodeStatusMonitorTask.checkNodeStatusByGroup(NodeStatusMonitorTask.java:101) ~[main/:?]
        at com.webank.webase.node.mgr.alert.task.NodeStatusMonitorTask.lambda$checkAllNodeStatusForAlert$0(NodeStatusMonitorTask.java:88) ~[main/:?]
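  2. The problem was finally traced to the code logic below: once the front node has been temporarily updated to an observer node, the information for this front node can no longer be retrieved, so there is no front left to communicate with the chain.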
public class FrontGroupMapCache {
    @Transactional(isolation= Isolation.READ_COMMITTED)
    public List<FrontGroup> getSealerOrObserverMap() {
        MapListParam param = new MapListParam();
        param.setType(ConsensusType.SEALER.getValue());
        List<FrontGroup> targetMap = null;
        targetMap = mapService.getList(param);
        log.debug("get sealer map:{} param:{}", targetMap, param);
        if (targetMap == null || targetMap.isEmpty()) {
            param.setType(ConsensusType.OBSERVER.getValue());
            targetMap = mapService.getList(param);
            log.debug("get observer map:{} param:{}", targetMap, param);
        }
        log.debug("getSealerOrObserverMap targetMap:{}", targetMap);
        return targetMap;
    }
}
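  3. The code above has a logic flaw: when the front nodes of group 1 and group 2 are consensus nodes, the if (targetMap == null || targetMap.isEmpty()) branch is never entered, so the node information for group 3 cannot be retrieved. As a result, the block height of group 3's front node cannot be updated, and group 3's front node stays an observer node and cannot automatically recover to a consensus node.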
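
The flaw described above can be reproduced outside WeBASE with a small stand-alone sketch. Everything in it is hypothetical and simplified for illustration: the Row class stands in for the real FrontGroup mapping, the type strings "sealer" and "observer" are assumed values, and sealersAndObservers is only one possible direction for a fix, not a change confirmed by the maintainers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FrontGroupLookupSketch {

    // Hypothetical, simplified stand-in for one front/group mapping row.
    static final class Row {
        final int frontId;
        final int groupId;
        final String type; // assumed values: "sealer" or "observer"

        Row(int frontId, int groupId, String type) {
            this.frontId = frontId;
            this.groupId = groupId;
            this.type = type;
        }
    }

    // Returns the rows whose type equals the given consensus type.
    static List<Row> byType(List<Row> rows, String type) {
        List<Row> out = new ArrayList<>();
        for (Row r : rows) {
            if (r.type.equals(type)) {
                out.add(r);
            }
        }
        return out;
    }

    // Same shape as the lookup quoted in the issue: observer rows are only
    // considered when no sealer row exists in ANY group.
    static List<Row> globalFallback(List<Row> rows) {
        List<Row> sealers = byType(rows, "sealer");
        return sealers.isEmpty() ? byType(rows, "observer") : sealers;
    }

    // One possible alternative (an assumption, not a confirmed fix):
    // always return sealer rows followed by observer rows.
    static List<Row> sealersAndObservers(List<Row> rows) {
        List<Row> out = new ArrayList<>(byType(rows, "sealer"));
        out.addAll(byType(rows, "observer"));
        return out;
    }

    // True if at least one returned row belongs to the given group.
    static boolean hasFrontForGroup(List<Row> rows, int groupId) {
        for (Row r : rows) {
            if (r.groupId == groupId) {
                return true;
            }
        }
        return false;
    }

    // The scenario from the issue: one front that is a sealer in group 1
    // and group 2 but has been changed to an observer in group 3.
    static List<Row> sampleRows() {
        return Arrays.asList(
                new Row(1, 1, "sealer"),
                new Row(1, 2, "sealer"),
                new Row(1, 3, "observer"));
    }

    public static void main(String[] args) {
        List<Row> rows = sampleRows();
        System.out.println("global fallback finds a front for group 3: "
                + hasFrontForGroup(globalFallback(rows), 3));
        System.out.println("sealers and observers finds a front for group 3: "
                + hasFrontForGroup(sealersAndObservers(rows), 3));
    }
}
```

With the sample rows, the global fallback returns no front for group 3, which matches the "frontList is empty groupId:3" error in the log, while the alternative lookup does return one. Whether other callers of the real method can tolerate observer rows in the result has not been checked here.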