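Step16-Scripts-Service Check
# 1. Service check declaration in metainfo.xml
In Ambari's integration framework, a service's health check script must be declared in metainfo.xml in advance, so that the front-end UI and the back-end scheduler automatically associate the service with the corresponding check method.
Example configuration fragment: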
<commandScript>
  <script>scripts/service_check.py</script>
  <scriptType>PYTHON</scriptType>
  <timeout>300</timeout>
</commandScript>
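# 1.1 Visual overview: backend configuration structure
- The metainfo.xml fragment above explicitly designates service_check.py as the service check entry point and declares the script type and timeout.
- Once the configuration takes effect, Ambari automatically calls the specified Python script during the service "check" step.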
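To make the three declared fields concrete, the following sketch parses the same fragment with Python's standard `xml.etree.ElementTree` and reads back the script path, script type, and timeout. The helper name `read_command_script` is hypothetical and exists only for illustration; this is not how Ambari itself loads metainfo.xml.

```python
import xml.etree.ElementTree as ET

# The same fragment as declared in metainfo.xml.
COMMAND_SCRIPT_XML = """
<commandScript>
  <script>scripts/service_check.py</script>
  <scriptType>PYTHON</scriptType>
  <timeout>300</timeout>
</commandScript>
"""


def read_command_script(xml_text):
    """Return (script, script_type, timeout) from a <commandScript> element.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text.strip())
    script = root.findtext("script")
    script_type = root.findtext("scriptType")
    timeout = int(root.findtext("timeout"))
    return script, script_type, timeout


if __name__ == "__main__":
    print(read_command_script(COMMAND_SCRIPT_XML))
```

A quick check like this can catch a misspelled tag or a non-numeric timeout before the service definition is deployed.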
# 2. service_check.py: implementation and linkage
# 2.1 Script location and purpose
Code path:
ambari-server/src/main/resources/stacks/BIGTOP/3.2.0/services/REDIS/package/scripts/service_check.py
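Ambari's backend schedules this script automatically after service installation, after cluster changes, or when a check is triggered manually, to verify the service's health.
# 2.2 Typical implementation example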
import time

from resource_management.core.exceptions import Fail
from resource_management.core.logger import Logger
from resource_management.core.resources.system import Execute
from resource_management.libraries.functions.format import format
from resource_management.libraries.script.script import Script


class RedisServiceCheck(Script):
    def service_check(self, env):
        # Load the service parameters (port, client path, password) from params.py.
        import params
        env.set_params(params)

        redis_port = params.redis_port
        client_bin = params.client_bin
        redis_password = params.redis_password
        retries = 5
        retry_delay = 5  # seconds to wait between attempts

        self.check_redis_cluster(redis_port, client_bin, redis_password, retries, retry_delay)

    def check_redis_cluster(self, redis_port, client_bin, redis_password, retries, retry_delay):
        Logger.info("Checking Redis cluster health...")

        # Pass -a to redis-cli only when a password is configured.
        if redis_password:
            password_option = format("-a {redis_password}")
        else:
            password_option = ""

        for attempt in range(retries):
            try:
                # Execute raises when the command exits with a non-zero status.
                Execute(format("{client_bin}/redis-cli -p {redis_port} {password_option} cluster nodes"), user="redis")
                Logger.info("Redis cluster nodes check passed.")
                Execute(format("{client_bin}/redis-cli -p {redis_port} {password_option} cluster info"), user="redis")
                Logger.info("Redis cluster info check passed.")
                return
            except Exception as e:
                Logger.warning("Redis cluster health check failed (attempt {0}): {1}".format(attempt + 1, str(e)))
                if attempt < retries - 1:
                    Logger.info("Retrying in {0} seconds...".format(retry_delay))
                    time.sleep(retry_delay)
                else:
                    # All attempts are used up: report the failure to Ambari.
                    Logger.error("Failed to pass Redis cluster health check after {0} attempts.".format(retries))
                    raise Fail("Redis cluster health check failed.")


if __name__ == "__main__":
    RedisServiceCheck().execute()
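The retry loop in the script can be isolated into a small helper, which makes it testable without an Ambari agent or a Redis node. The sketch below is an illustration under assumptions, not part of the original script: `run_with_retries`, `parse_cluster_info`, and `assert_cluster_ok` are hypothetical names, and the sample text mimics the `key:value` lines that `redis-cli cluster info` prints. It also inspects `cluster_state`, because `Execute` in the script only checks the exit status of the command.

```python
import time


def run_with_retries(check, retries=5, retry_delay=5, sleep=time.sleep):
    """Call check() until it succeeds or the attempts are used up.

    Returns the result of check(); re-raises the last error after the
    final attempt. The sleep function is injectable to keep tests fast.
    """
    for attempt in range(1, retries + 1):
        try:
            return check()
        except Exception:
            if attempt == retries:
                raise
            sleep(retry_delay)


def parse_cluster_info(text):
    """Turn 'key:value' lines into a dict, skipping lines without a colon."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:
            info[key] = value
    return info


def assert_cluster_ok(text):
    """Raise RuntimeError unless the text reports cluster_state:ok."""
    info = parse_cluster_info(text)
    state = info.get("cluster_state")
    if state != "ok":
        raise RuntimeError("cluster_state is {0!r}, expected 'ok'".format(state))
    return info


if __name__ == "__main__":
    # Simulated command outputs: two failing attempts, then a healthy cluster.
    outputs = iter([
        "cluster_state:fail\r\n",
        "cluster_state:fail\r\n",
        "cluster_state:ok\r\ncluster_slots_assigned:16384\r\n",
    ])
    info = run_with_retries(lambda: assert_cluster_ok(next(outputs)),
                            retries=5, retry_delay=0)
    print(info["cluster_state"])
```

In a real service check, the text would come from the captured output of `redis-cli`; how to capture it inside an Ambari script is left out here on purpose.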
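# 3. How configuration and implementation are linked
- service_check.py only needs to implement the service_check method; Ambari's backend then schedules it automatically, with no manual invocation required.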
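The idea that implementing a method named `service_check` is enough can be pictured as name-based dispatch. The sketch below is a simplified stand-in, not Ambari's actual `Script` class: `MiniScript`, `DemoServiceCheck`, the `execute(command_name)` signature, and the returned string are all assumptions made for illustration. It only shows that a base class can look up a method by the command name it receives and call it.

```python
class MiniScript(object):
    """Simplified stand-in for a script base class (hypothetical, not Ambari code)."""

    def execute(self, command_name, env=None):
        # Look up a method whose name matches the requested command.
        method = getattr(self, command_name.lower(), None)
        if not callable(method):
            raise ValueError("Script does not implement '{0}'".format(command_name))
        return method(env)


class DemoServiceCheck(MiniScript):
    def service_check(self, env):
        # A real check would run redis-cli here; this stub only reports that it ran.
        return "service_check executed"


if __name__ == "__main__":
    print(DemoServiceCheck().execute("SERVICE_CHECK"))
```

With this kind of dispatch, the subclass never calls `service_check` itself; the base class decides which method runs based on the command it was asked to execute.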