1、集群容错的配置项
- failover - 失败自动切换,当出现失败,重试其他服务器(缺省),通常用于读操作,但重试会带来更长的延时。
- failfast - 快速失效,只发起一次调用,失败立即报错。通常用于非幂等性写操作,比如说新增记录
- failsafe - 失败安全,出现异常时,直接忽略,通常用于写入审计日志等操作
- failback - 失败自动恢复,后台记录失败请求,定时重发,通常用于消息通知操作
- forking - 并行调用多个服务器,只要一个成功即返回。通常用于实时性要求较高的读操作,但需要浪费更多的服务器资源。
如果没有设置集群的cluster,则默认值为:failover;在cluster默认为failover时如果没有设置retries的值,则默认使用default.retries+1 = 0+1,即重试1次,如果设置了retries的值,则使用配置的值(比如:retries=2则重试2次)。
如果dubbo的provider所提供的服务中涉及到数据库操作,最好使用failfast,避免幂等操作抛出异常,或者使用默认cluster时可能导致的数据重试插入多条等情况。
FailoverCluster相关代码:@SuppressWarnings({"unchecked", "rawtypes"}) public Result doInvoke(Invocation invocation, final List<Invoker<T>> invokers, LoadBalance loadbalance) throws RpcException { List<Invoker<T>> copyinvokers = invokers; checkInvokers(copyinvokers, invocation); int len = getUrl().getMethodParameter(invocation.getMethodName(), Constants.RETRIES_KEY, Constants.DEFAULT_RETRIES) + 1; if (len <= 0) { len = 1; } // retry loop. RpcException le = null; // last exception. List<Invoker<T>> invoked = new ArrayList<Invoker<T>>(copyinvokers.size()); // invoked invokers. Set<String> providers = new HashSet<String>(len); for (int i = 0; i < len; i++) { //Reselect before retry to avoid a change of candidate `invokers`. //NOTE: if `invokers` changed, then `invoked` also lose accuracy. if (i > 0) { checkWhetherDestroyed(); copyinvokers = list(invocation); // check again checkInvokers(copyinvokers, invocation); } Invoker<T> invoker = select(loadbalance, invocation, copyinvokers, invoked); invoked.add(invoker); RpcContext.getContext().setInvokers((List) invoked); try { Result result = invoker.invoke(invocation); if (le != null && logger.isWarnEnabled()) { logger.warn("Although retry the method " + invocation.getMethodName() + " in the service " + getInterface().getName() + " was successful by the provider " + invoker.getUrl().getAddress() + ", but there have been failed providers " + providers + " (" + providers.size() + "/" + copyinvokers.size() + ") from the registry " + directory.getUrl().getAddress() + " on the consumer " + NetUtils.getLocalHost() + " using the dubbo version " + Version.getVersion() + ". Last error is: " + le.getMessage(), le); } return result; } catch (RpcException e) { if (e.isBiz()) { // biz exception. throw e; } le = e; } catch (Throwable e) { le = new RpcException(e.getMessage(), e); } finally { providers.add(invoker.getUrl().getAddress()); } } throw new RpcException(le != null ? le.getCode() : 0, "Failed to invoke the method " + invocation.getMethodName() + " in the service " + getInterface().getName() + ". Tried " + len + " times of the providers " + providers + " (" + providers.size() + "/" + copyinvokers.size() + ") from the registry " + directory.getUrl().getAddress() + " on the consumer " + NetUtils.getLocalHost() + " using the dubbo version " + Version.getVersion() + ". Last error is: " + (le != null ? le.getMessage() : ""), le != null && le.getCause() != null ? le.getCause() : le); }
2、集群容错的配置
- failover
<!-- Provider side --> <dubbo:service retries="2" /> <!-- Consumer side --> <dubbo:reference retries="2" /> <!-- Consumer side,Method --> <dubbo:reference> <dubbo:method name="findFoo" retries="2" /> </dubbo:reference>
- failfast
<!-- Provider side --> <dubbo:service cluster="failfast"/> <!-- Consumer side --> <dubbo:reference cluster="failfast"/>
- failsafe
<!-- Provider side --> <dubbo:service cluster="failsafe"/> <!-- Consumer side --> <dubbo:reference cluster="failsafe"/>
- failback
<!-- Provider side --> <dubbo:service cluster="failback"/> <!-- Consumer side --> <dubbo:reference cluster="failback"/>
- forking
<!-- Provider side --> <dubbo:service cluster="forking"/> <!-- Consumer side --> <dubbo:reference cluster="forking"/>