On the importance of unified configuration: the Eureka cluster address farce

Root of the problem

A project used Eureka Client to call a service provider. It had been running well until operations took the servers down for maintenance, at which point the service could no longer be called, as if the entire server side had gone down. The server side is a cluster of three servers, and the client calls it through Ribbon load balancing. Operations stopped and serviced the servers one at a time rather than all at once, so the service should never have become completely unavailable. After a long search, I finally found that the Eureka cluster address was wrong, and had been wrong since the first day in production.

The correct address is:

eureka.client.serviceUrl.defaultZone=http://ip1:11111/eureka/,http://ip2:11111/eureka/,http://ip3:11111/eureka/

The caller was configured with:

eureka.client.serviceUrl.defaultZone=http://ip1:11111/eureka/,http://ip2:22222/eureka/,http://ip3:33333/eureka/

The ports of two of the three servers were configured incorrectly. Because Ribbon's load balancing normally routed calls to the one server that could be reached, nobody noticed the problem for a long time. Only when that server was taken down for maintenance did the client lose its last available server. As the amount of configuration grows, more problems of this kind are exposed, to the confusion of both operations and developers. For example, an ActiveMQ cluster address was configured as:

tcp://ip1::61616,tcp://ip2:61616,tcp://ip3:61616

There is an extra colon in the first address. Under normal load balancing everything works, but once the latter two servers go down, the service stops working. This reminds me of a joke, quoted here, that vividly shows how information drifts in strange directions as it is passed along.

 It is said that orders in a certain unit are passed down like this:
The battalion commander to the officer on duty: At about 8 o'clock tomorrow night, Halley's comet may be visible in this area. This comet can be seen only once every 76 years. Order all soldiers to assemble on the playground in field uniform, and I will explain this rare phenomenon to them. If it rains, assemble them in the auditorium, and I will show them a film about the comet.

The officer on duty to the company commander: By order of the battalion commander, Halley's comet will appear over the playground at 8 o'clock tomorrow night. If it rains, have the soldiers line up in field uniform and go to the auditorium, where this rare phenomenon will appear.

The company commander to the platoon leader: By order of the battalion commander, at 8 o'clock tomorrow night the extraordinary Halley's comet will appear in the auditorium wearing a field uniform. If it rains on the playground, the battalion commander will issue another order, something that happens only once every 76 years.

The platoon leader to the squad leader: At 8 o'clock tomorrow night, the battalion commander will appear in the auditorium with Halley's comet, something that happens once every 76 years. If it rains, the battalion commander will order the comet to put on a field uniform and go to the playground.

The squad leader to the soldiers: At 8 o'clock tomorrow night, the famous 76-year-old General Halley will be accompanied by the battalion commander in a field uniform, driving his star-studded car across the playground to the auditorium.

First, an address list of the form (ip1:port1, ip2:port1, ip3:port1), that is, address plus port separated by English commas, does not suit the way people read. Second, as operations and developers copy and paste it back and forth, errors are inevitable. Today there are three services. Imagine 30 services down the road; I am afraid everyone's head would spin.
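Mistakes like the two above can be caught mechanically before a configuration reaches production. The following is only a minimal sketch of such a check, not something the project actually used. The function name `check_cluster_addresses` is hypothetical, and the rule that every node in a cluster listens on the same port is an assumption that happens to hold for the Eureka and ActiveMQ clusters described here.

```python
from urllib.parse import urlsplit


def check_cluster_addresses(value):
    """Return a list of problems found in a comma-separated address list.

    Hypothetical helper. Assumes every node in the cluster should
    listen on the same port, as in the clusters described above.
    """
    problems = []
    ports = {}
    for raw in value.split(","):
        entry = raw.strip()
        if not entry:
            problems.append("empty entry (stray comma?)")
            continue
        try:
            parts = urlsplit(entry)
            # Reading .port raises ValueError when the port is not a number,
            # which is what happens with a doubled colon such as ip1::61616.
            host, port = parts.hostname, parts.port
        except ValueError as exc:
            problems.append(f"{entry}: {exc}")
            continue
        if not host or port is None:
            problems.append(f"{entry}: missing host or port")
            continue
        ports[entry] = port
    distinct = sorted(set(ports.values()))
    if len(distinct) > 1:
        problems.append(f"nodes use different ports: {distinct}")
    return problems


if __name__ == "__main__":
    wrong = ("http://ip1:11111/eureka/,"
             "http://ip2:22222/eureka/,"
             "http://ip3:33333/eureka/")
    for problem in check_cluster_addresses(wrong):
        print(problem)
```

Run against the caller's Eureka setting, the check reports the mismatched ports. Run against the ActiveMQ address, it reports the entry with the doubled colon. A check like this only narrows the room for error; it does not remove the underlying problem of many copies of the same address being maintained by hand.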

Solution

Callers no longer maintain cluster addresses themselves. The addresses are encapsulated by the company's internal framework, which in turn obtains them from a distributed configuration center, where each cluster address is maintained by a dedicated person. Spring Cloud already provides a distributed configuration center component, Spring Cloud Config, but we use Apollo, the distributed configuration center open-sourced by Ctrip. Apollo centrally manages the configuration of applications across different environments and clusters, and pushes configuration changes to applications in real time. It also provides standardized permission and workflow management, which makes it suitable for microservice configuration management.
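The shape of this arrangement can be sketched in a few lines. The sketch below is purely illustrative: `ConfigCenter` is a hypothetical in-memory stand-in, not Apollo's API, and `EurekaAddressProvider` is an invented name for the part the internal framework would play. It shows only the idea that the address lives in one place and that changes are pushed to callers.

```python
class ConfigCenter:
    """Hypothetical in-memory stand-in for a distributed configuration center."""

    def __init__(self):
        self._values = {}
        self._listeners = {}

    def get(self, key):
        return self._values[key]

    def subscribe(self, key, callback):
        self._listeners.setdefault(key, []).append(callback)

    def publish(self, key, value):
        """Used by the person who maintains the address; notifies subscribers."""
        self._values[key] = value
        for callback in self._listeners.get(key, []):
            callback(value)


class EurekaAddressProvider:
    """Framework-side wrapper: callers ask for the address, never store it."""

    KEY = "eureka.client.serviceUrl.defaultZone"

    def __init__(self, center):
        self._urls = self._parse(center.get(self.KEY))
        # Keep the local copy in step with whatever the center pushes.
        center.subscribe(self.KEY, self._on_change)

    @staticmethod
    def _parse(value):
        return [url.strip() for url in value.split(",") if url.strip()]

    def _on_change(self, value):
        self._urls = self._parse(value)

    def urls(self):
        return list(self._urls)


if __name__ == "__main__":
    center = ConfigCenter()
    center.publish(EurekaAddressProvider.KEY,
                   "http://ip1:11111/eureka/,http://ip2:11111/eureka/")
    provider = EurekaAddressProvider(center)
    print(provider.urls())
```

With this split, a wrong address is corrected once in the configuration center, and every caller picks up the fix without editing its own properties file.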