Publication number: CN103995901 A
Publication type: Application
Application number: CN 201410254980
Publication date: 20 Aug 2014
Filing date: 10 Jun 2014
Priority date: 10 Jun 2014
Publication number: 201410254980.8, CN 103995901 A, CN 103995901A, CN 201410254980, CN-A-103995901, CN103995901 A, CN103995901A, CN201410254980, CN201410254980.8
Inventors: 赵晓平, 唐超, 马丽伟, 秦波, 王 锋
Applicant: 北京京东尚科信息技术有限公司, 北京京东世纪贸易有限公司
External Links: SIPO, Espacenet
Method for determining data node failure
CN 103995901 A
Abstract
The invention discloses a method for determining data node failure in a distributed database. Among all application nodes that access the distributed database, when any application node cannot connect to a data node of the distributed database, it broadcasts to the other application nodes that the data node cannot be reached. On receiving the broadcast, the other application nodes each send a connection request to the data node to determine whether they can connect to it. When the number of application nodes that cannot connect to the data node reaches a set threshold, the data node is determined to have failed. Because the application nodes belong to different IPs, the method avoids the effect that network fluctuation on a single IP has when synchronization requests are sent to the data node from one IP, so the cause of a data node failure can be judged more accurately.
Claims (8)  translated from Chinese
1. A method for determining data node failure, for use in a distributed database, the method comprising: among all application nodes accessing the distributed database, when any one application node cannot connect to a data node in the distributed database, broadcasting to the other application nodes that the data node cannot be connected; after receiving the broadcast, the other application nodes each sending a connection request to the data node to determine whether the data node can be connected; and when the number of application nodes unable to connect to the data node reaches a set threshold, determining that the data node has failed.
2. The method for determining data node failure according to claim 1, characterized in that: among all application nodes accessing the distributed database, any one application node is selected as an arbitration node to count the number of application nodes unable to connect to the data node.
3. The method for determining data node failure according to claim 2, characterized in that: a decision value is set in the arbitration node and initialized to 0; after the other application nodes send connection requests to the data node, each sends to the arbitration node information on whether it can connect to the data node; the arbitration node receives from all application nodes the information on whether they can connect to the data node, and each time the arbitration node receives a message from an application node that it cannot connect to the data node, it increments the decision value by 1; after the arbitration node has received this information from all application nodes: if the decision value reaches the set threshold, the data node is determined to have failed; if the decision value does not reach the set threshold, the data node is determined to be valid.
4. The method for determining data node failure according to claim 1, characterized in that: the threshold is half the number of all application nodes accessing the distributed database.
5. The method for determining data node failure according to claim 1, characterized in that, after the data node is determined to have failed, the method further comprises: removing the data node from the distributed database; and enabling the backup node of the data node.
6. The method for determining data node failure according to claim 3, characterized in that, after the data node is determined to be valid, the method further comprises: resetting the decision value to its initial value 0; and the application node that cannot connect to the data node periodically sending connection requests to the data node, waiting for the data node to restore the connection.
7. The method for determining data node failure according to claim 1, characterized in that, when any one application node cannot connect to a data node in the distributed database, the connection from that application node to that data node is masked.
8. The method for determining data node failure according to claim 1, characterized in that the application nodes belong to different IPs.
Description  translated from Chinese

A method for determining data node failure

Technical Field

[0001] The present invention relates to the field of distributed databases, and in particular to a method for determining data node failure.

Background

[0002] With the continuing development of network technology, the demands on data storage and access keep rising, and distributed databases emerged in response. Their high scalability and high availability solve a hard problem for many websites that must run without interruption.

[0003] A distributed database consists of sub-databases distributed over multiple computer nodes. Each sub-database on a computer node is called a data node; the data nodes are logically related and equal in status. To keep the whole distributed database running normally, the running state of every data node must be known in real time to determine whether it can serve requests normally, that is, whether the data node is valid. Network fluctuation, hardware faults, and other causes can all make a data node fail: for example, network fluctuation causes a temporary failure of a data node, while a hardware fault causes a permanent failure. An effective means of determining whether a data node has currently failed is therefore needed.

[0004] Cassandra is an open-source distributed NoSQL database system. Thanks to its good scalability, it has been adopted by many well-known websites and has become a popular distributed structured data storage solution. In Cassandra, node failure is determined by suspicion-level-based detection (Accrual Failure Detection). The basic idea is to judge, in a distributed environment, whether a data node has failed by means of a value representing the suspicion level of failure. Within a certain time window, synchronization requests are sent continually to the data node; each time the data node fails to respond to a synchronization message, its failure suspicion value is incremented by 1, and once the value reaches a set threshold, the data node is determined to have failed permanently.
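
The counting scheme described in paragraph [0004] can be sketched as follows. This is a minimal illustration of the paragraph's description only, not Cassandra's actual code; the function name `suspicion_detect` and the `send_sync` callable are hypothetical.

```python
from typing import Callable


def suspicion_detect(send_sync: Callable[[], bool], attempts: int, threshold: int) -> bool:
    """Return True if the data node is judged permanently failed.

    send_sync is a hypothetical callable that returns True when the data
    node answers one synchronization request.
    """
    suspicion = 0
    for _ in range(attempts):      # requests sent within one time window
        if not send_sync():
            suspicion += 1         # each missed response adds 1
        if suspicion >= threshold:
            return True            # threshold reached: judged permanently failed
    return False


# A short burst of lost responses on one network path crosses the threshold,
# even though the node answers again afterwards.
responses = iter([False, False, False, True, True])
print(suspicion_detect(lambda: next(responses), attempts=5, threshold=3))
```

Because every request leaves from the same sender, a transient fault on that one path is indistinguishable from a dead node, which is the weakness the next paragraph describes.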

[0005] Because this suspicion-based detection sends synchronization requests to the data node from a single IP, it cannot avoid well the effect of network fluctuation on those requests. For a period of time, network fluctuation may cause loss of the synchronization request data and/or the data node's response data. As a result, during that period the failure suspicion value of the data node may rise sharply, even reaching the set threshold, so that the node is judged permanently failed, although after that period the data node is in fact still available and has not truly failed permanently. The existing suspicion-based detection method can therefore misjudge data node failure in use.

Summary of the Invention

[0006] In view of this, the present invention provides a method for determining data node failure, so as to judge accurately whether a data node has failed temporarily because of the network or permanently because of hardware.

[0007] The technical solution of the present application is implemented as follows:

[0008] A method for determining data node failure, for use in a distributed database, the method comprising:

[0009] among all application nodes accessing the distributed database, when any one application node cannot connect to a data node in the distributed database, broadcasting to the other application nodes that the data node cannot be connected;

[0010] after receiving the broadcast, the other application nodes sending connection requests to the data node to determine whether the data node can be connected;

[0011] when the number of application nodes unable to connect to the data node reaches the set threshold, determining that the data node has failed.
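
The three steps in paragraphs [0009] to [0011] can be sketched as a single function. This is a simplified, single-process illustration: the broadcast is modeled as a loop over the other application nodes, `can_connect` is a hypothetical probe, and counting the reporting node itself among the unreachable nodes is an assumption, since the text does not say so explicitly.

```python
from typing import Callable, Iterable


def data_node_failed(
    reporter: str,
    app_nodes: Iterable[str],
    can_connect: Callable[[str], bool],
    threshold: float,
) -> bool:
    """Decide whether the data node has failed after `reporter` lost contact."""
    unreachable = 1  # assumption: the reporting node counts as one failure
    for node in app_nodes:
        if node == reporter:
            continue
        # Each other application node, on receiving the broadcast,
        # tries to connect to the data node from its own IP.
        if not can_connect(node):
            unreachable += 1
    return unreachable >= threshold


# Hypothetical reachability of the data node as seen from four application nodes.
reach = {"app1": False, "app2": False, "app3": True, "app4": True}
print(data_node_failed("app1", reach, lambda n: reach[n], threshold=2))
```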

[0012] Further, among all application nodes accessing the distributed database, any one application node is selected as an arbitration node to count the number of application nodes unable to connect to the data node.

[0013] Further:

[0014] a decision value is set in the arbitration node and initialized to 0;

[0015] after the other application nodes send connection requests to the data node, each sends to the arbitration node information on whether it can connect to the data node;

[0016] the arbitration node receives from all application nodes the information on whether they can connect to the data node, and each time the arbitration node receives a message from an application node that it cannot connect to the data node, it increments the decision value by 1;

[0017] after the arbitration node has received from all application nodes the information on whether they can connect to the data node:

[0018] if the decision value reaches the set threshold, the data node is determined to have failed;

[0019] if the decision value does not reach the set threshold, the data node is determined to be valid.
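
The arbitration node's bookkeeping in paragraphs [0014] to [0019] might look like the sketch below. The class name `Arbiter`, its method names, and the `expected_reports` parameter are hypothetical; message transport between nodes is left out.

```python
from typing import Optional


class Arbiter:
    """Counts 'cannot connect' reports for one data node."""

    def __init__(self, expected_reports: int, threshold: float) -> None:
        self.expected_reports = expected_reports  # how many reports to wait for
        self.threshold = threshold
        self.decision_value = 0                   # initialized to 0
        self.received = 0

    def report(self, can_connect: bool) -> None:
        """Record one application node's connectivity report."""
        self.received += 1
        if not can_connect:
            self.decision_value += 1              # add 1 per 'cannot connect'

    def verdict(self) -> Optional[str]:
        """Return 'failed' or 'valid' once all reports are in, else None."""
        if self.received < self.expected_reports:
            return None
        return "failed" if self.decision_value >= self.threshold else "valid"


arbiter = Arbiter(expected_reports=4, threshold=2)
for ok in (False, True, False, True):
    arbiter.report(ok)
print(arbiter.verdict())
```

Deferring the verdict until every report has arrived follows paragraph [0017]; an implementation could also decide as soon as the threshold is reached, but the text does not describe that variant.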

[0020] Further, the threshold is half the number of all application nodes accessing the distributed database.

[0021] Further, after the data node is determined to have failed, the method further comprises:

[0022] removing the data node from the distributed database;

[0023] enabling the backup node of the data node.
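
The replacement in paragraphs [0022] and [0023] can be pictured as a membership update. In this sketch the set of active data nodes and the `backups` mapping are hypothetical data structures; the text does not say how the database tracks membership.

```python
from typing import Dict, Set


def fail_over(active: Set[str], backups: Dict[str, str], failed: str) -> Set[str]:
    """Return the active data nodes after replacing `failed` with its backup."""
    remaining = active - {failed}      # remove the failed data node
    backup = backups.get(failed)
    if backup is not None:
        remaining.add(backup)          # enable its backup node
    return remaining


print(sorted(fail_over({"j", "k"}, {"j": "j'"}, "j")))
```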

[0024] Further, after the data node is determined to be valid, the method further comprises:

[0025] resetting the decision value to its initial value 0;

[0026] the application node that cannot connect to the data node periodically sending connection requests to the data node, waiting for the data node to restore the connection.
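
The periodic retry in paragraph [0026] can be sketched as a simple loop. The interval, the `max_attempts` cap, and the `try_connect` callable are assumptions for illustration; the text only says that requests are sent periodically until the connection is restored.

```python
import time
from typing import Callable, Optional


def wait_for_reconnect(
    try_connect: Callable[[], bool],
    interval_s: float,
    max_attempts: int,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Retry at a fixed interval; return the attempt number that succeeded."""
    for attempt in range(1, max_attempts + 1):
        if try_connect():
            return attempt             # connection restored
        sleep(interval_s)              # wait before the next request
    return None                        # gave up after max_attempts


# Hypothetical data node that answers on the third attempt.
outcomes = iter([False, False, True])
print(wait_for_reconnect(lambda: next(outcomes), interval_s=0.01, max_attempts=5))
```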

[0027] Further, when any one application node cannot connect to a data node in the distributed database, the connection from that application node to that data node is masked.

[0028] Further, the application nodes belong to different IPs.

[0029] As the above solution shows, in the method of the present invention, when an application node cannot connect to a data node, multiple application nodes send connection requests to that data node to determine whether it can be connected, and thereby whether it has failed. Because the application nodes belong to different IPs, this avoids the effect of network fluctuation on a single IP that arises in the prior art when synchronization requests are sent to the data node from one IP. The present invention judges more accurately than the prior art whether a data node has failed temporarily because of the network or permanently because of hardware.

Brief Description of the Drawings

[0030] FIG. 1 is a flowchart of the method for determining data node failure according to the present invention;

[0031] FIG. 2 is a flowchart of an embodiment of the present invention.

Detailed Description

[0032] To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments.

[0033] The method for determining data node failure according to the present invention is used in a distributed database. As shown in FIG. 1, the method comprises:

[0034] among all application nodes accessing the distributed database, when any one application node cannot connect to a data node in the distributed database, broadcasting to the other application nodes that the data node cannot be connected;

[0035] after receiving the broadcast, the other application nodes sending connection requests to the data node to determine whether the data node can be connected;

[0036] when the number of application nodes unable to connect to the data node reaches the set threshold, determining that the data node has failed.
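
The text does not specify how an application node tests whether it can connect to the data node. One plausible way, shown here purely as an assumption, is a TCP connection attempt with a timeout; the function name `probe` and the one-second default are hypothetical.

```python
import socket


def probe(host: str, port: int, timeout_s: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False
```

Each application node would run such a probe from its own IP, so a fault on one network path affects only that node's report.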

[0037] The number of application nodes unable to connect to the data node is counted in an arbitration node. The arbitration node is chosen as follows: among all application nodes accessing the distributed database, any one application node is selected as the arbitration node.

[0038] The arbitration node counts the application nodes unable to connect to the data node as follows:

[0039] a decision value is set in the arbitration node and initialized to 0;

[0040] after the other application nodes send connection requests to the data node, each sends to the arbitration node information on whether it can connect to the data node;

[0041] the arbitration node receives from all application nodes the information on whether they can connect to the data node, and each time the arbitration node receives a message from an application node that it cannot connect to the data node, it increments the decision value by 1;

[0042] after the arbitration node has received from all application nodes the information on whether they can connect to the data node:

[0043] if the decision value reaches the set threshold, the data node is determined to have failed;

[0044] if the decision value does not reach the set threshold, the data node is determined to be valid.

[0045] Unlike the prior art, in the method of the present invention, when an application node cannot connect to a data node, multiple application nodes send connection requests to that data node to determine whether it can be connected, and thereby whether it has failed. Because the application nodes belong to different IPs, this avoids the effect of network fluctuation on a single IP that arises in the prior art when synchronization requests are sent to the data node from one IP. The method therefore judges more accurately than the prior art whether a data node has failed temporarily because of the network or permanently because of hardware.

[0046] In the above method of the present invention, after the data node is determined to have failed, the method further comprises:

[0047] removing the data node from the distributed database;

[0048] enabling the backup node of the data node.

[0049] The failed data node is thereby replaced.

[0050] After the data node is determined to be valid, the method of the present invention further comprises:

[0051] resetting the decision value to its initial value 0;

[0052] the application node that cannot connect to the data node periodically sending connection requests to the data node, waiting for the data node to restore the connection.

[0053] In real network deployments, the number of application nodes accessing a distributed database is large, each application node has a different IP address, and the distributed database contains a large number of data nodes. The method of the present invention is described below with a specific embodiment. In this embodiment, assume that N application nodes access the distributed database (N > 1), that the distributed database has M data nodes (M > 1), and that application node i (1 ≤ i ≤ N) among the N application nodes cannot connect to data node j in the distributed database (data node j being any one of the M data nodes). As shown in FIG. 2, the embodiment comprises the following steps:

[0054] Step 1: Select any one of the N application nodes as the arbitration node; set a decision value in the arbitration node and initialize it to "0"; set a threshold and set it to N/2; then go to step 2.

[0055] Step 2: When application node i cannot connect to data node j in the distributed database, it broadcasts to the other application nodes that data node j cannot be connected; then go to step 3.

[0056] When any one application node cannot connect to a data node in the distributed database, the method may further include masking the connection from that application node to that data node. In step 2, for example, when application node i cannot connect to data node j, application node i masks its connection to data node j. This avoids the network resource overhead of application node i repeatedly initiating connections to data node j that cannot succeed.
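
The masking described in paragraph [0056] could be kept as a per-application-node set of data nodes that are skipped when routing requests. The class `AppNode` and its method names are hypothetical, and the `unmask` method is an assumption about what would happen once the connection is restored.

```python
from typing import Iterable, List, Set


class AppNode:
    """Tracks which data nodes this application node will try to use."""

    def __init__(self, data_nodes: Iterable[str]) -> None:
        self.data_nodes: List[str] = list(data_nodes)
        self.masked: Set[str] = set()

    def mask(self, data_node: str) -> None:
        """Stop routing requests to a data node that cannot be reached."""
        self.masked.add(data_node)

    def unmask(self, data_node: str) -> None:
        """Resume routing once the connection is restored (assumed behavior)."""
        self.masked.discard(data_node)

    def usable(self) -> List[str]:
        """Data nodes this application node may still send requests to."""
        return [n for n in self.data_nodes if n not in self.masked]


node_i = AppNode(["j", "k", "m"])
node_i.mask("j")
print(node_i.usable())
```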

[0057] Step 3: After receiving the broadcast that data node j cannot be connected, the other application nodes send connection requests to data node j; then go to step 4.

[0058] Step 4: The other application nodes each send to the arbitration node information on whether they can connect to data node j; then go to step 5.

[0059] Step 5: The arbitration node receives from all application nodes the information on whether they can connect to data node j, and each time it receives a message from one application node that it cannot connect to data node j, it increments the decision value by 1; then go to step 6.

[0060] Step 6: The arbitration node checks whether the accumulated decision value has reached the set threshold N/2. If it has, data node j is determined to have failed; go to step 7. If it has not, the data node is determined to be valid; go to step 9.

[0061] Step 7: Remove data node j from the distributed database; then go to step 8.

[0062] Step 8: Enable backup node j' of data node j to replace data node j.

[0063] Step 9: The arbitration node resets the decision value to the initial value 0 and notifies application node i that data node j is valid; then go to step 10.

[0064] Step 10: After receiving the message from the arbitration node that data node j is valid, application node i periodically sends connection requests to data node j, waiting for data node j to restore the connection.
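
The branching among steps 1 to 10 can be summarized with a small trace function. It is a sketch under the assumption that "reaches the threshold" means greater than or equal to N/2; the function name `embodiment_trace` and its parameters are hypothetical.

```python
from typing import List


def embodiment_trace(cannot_connect: int, n: int) -> List[int]:
    """Return the step numbers visited for N application nodes.

    cannot_connect is the final decision value, that is, the number of
    'cannot connect to data node j' messages the arbitration node counted.
    """
    threshold = n / 2                  # step 1: threshold set to N/2
    steps = [1, 2, 3, 4, 5, 6]         # steps 1 to 6 always run
    if cannot_connect >= threshold:    # step 6: threshold reached
        steps += [7, 8]                # remove node j, enable backup j'
    else:
        steps += [9, 10]               # reset value, node i keeps retrying
    return steps


print(embodiment_trace(cannot_connect=3, n=5))
print(embodiment_trace(cannot_connect=2, n=5))
```

With N = 5 the threshold is 2.5, so three failed connections lead to replacement while two do not; with N = 4, two failed connections already reach the threshold of 2.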

[0065] With the method for determining data node failure according to the present invention, when an application node cannot connect to a data node, multiple application nodes send connection requests to that data node to determine whether it can be connected, and thereby whether it has failed. Because the application nodes belong to different IPs, this avoids the effect of network fluctuation on a single IP that arises in the prior art when synchronization requests are sent to the data node from one IP. The present invention judges more accurately than the prior art whether a data node has failed temporarily because of the network or permanently because of hardware.

[0066] The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Patent Citations
Cited Patent | Filing date | Publication date | Applicant | Title
CN102231681A * | 27 Jun 2011 | 2 Nov 2011 | 中国建设银行股份有限公司 | High availability cluster computer system and fault treatment method thereof
CN102882792A * | 20 Jun 2012 | 16 Jan 2013 | 杜小勇 | Method for simplifying internet propagation path diagram
US20120101987 * | 25 Oct 2010 | 26 Apr 2012 | Paul Allen Bottorff | Distributed database synchronization
US20130246608 * | 15 Mar 2012 | 19 Sep 2013 | Microsoft Corporation | Count tracking in distributed environments
US20130297976 * | 7 Mar 2013 | 7 Nov 2013 | Paraccel, Inc. | Network Fault Detection and Reconfiguration
Classifications
International Classification: G06F17/30
Cooperative Classification: G06F17/30575
Legal Events
Date | Code | Event | Description
20 Aug 2014 | C06 | Publication |
17 Sep 2014 | C10 | Entry into substantive examination |