Ambari-Metrics Monitor fails to start on RedHat8
Known caveat
The scenario in this article occurred on RedHat8. Whether other systems behave identically must be verified on your own.
# I. Background
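Ambari-Metrics Monitor is the node agent in the AMS architecture. It collects node-level system metrics such as CPU, memory, disk, and network, and reports them to the Collector.
At runtime it depends on Python 2.6/2.7 and the psutil library. Because psutil triggers compilation of its C extension at runtime, startup on RedHat8 often fails due to missing dependencies or incorrect directory permissions.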
⚠️ RedHat8 no longer ships python2 or its development libraries by default; they must be installed manually.
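A quick pre-check along these lines can show whether a node has the interpreter and headers before the Monitor is started. This is only a sketch: the function names `has_python_header` and `report_python2` are hypothetical, and the default include directory `/usr/include/python2.7` is an assumption about where python2-devel places `Python.h`; adjust it if your system differs.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when <include_dir>/Python.h exists.
# The default directory is an assumption, not something stated by Ambari.
has_python_header() {
    include_dir=${1:-/usr/include/python2.7}
    [ -f "$include_dir/Python.h" ]
}

# Print the state of this node without changing anything.
# An optional first argument overrides the include directory.
report_python2() {
    if command -v python2 >/dev/null 2>&1; then
        echo "python2: found"
    else
        echo "python2: missing"
    fi
    if has_python_header "$@"; then
        echo "Python.h: found"
    else
        echo "Python.h: missing"
    fi
}
```

Calling `report_python2` with no arguments checks the assumed default location; a "missing" line for either item suggests the node will hit the compile errors described in this article.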
# II. Reproducing the problem
# Error 1: missing Python.h
resource_management.core.exceptions.ExecutionFailed: Execution of '/usr/sbin/ambari-metrics-monitor --config /etc/ambari-metrics-monitor/conf start' returned 255.
ls: cannot access '/usr/lib/python2.6/site-packages/resource_monitoring/psutil/build': No such file or directory
Building psutil...
psutil/_psutil_linux.c:12:10: fatal error: Python.h: No such file or directory
#include <Python.h>
^~~~~~~~~~
compilation terminated.
error: command 'gcc' failed with exit status 1
Verifying Python version compatibility...
Using python /usr/bin/python2
Checking for previously running Metric Monitor...
/var/run/ambari-metrics-monitor/ambari-metrics-monitor.pid found with no process. Removing 958553...
Starting ambari-metrics-monitor
Verifying ambari-metrics-monitor process status with PID : 962294
Output of PID check :
ERROR: ambari-metrics-monitor start failed. For more details, see /var/log/ambari-metrics-monitor/ambari-metrics-monitor.out:
====================
from core.controller import Controller
File "/usr/lib/python2.6/site-packages/resource_monitoring/core/controller.py", line 29, in <module>
from host_info import HostInfo
File "/usr/lib/python2.6/site-packages/resource_monitoring/core/host_info.py", line 25, in <module>
import psutil
File "/usr/lib/python2.6/site-packages/resource_monitoring/psutil/build/lib.linux-x86_64-2.7/psutil/__init__.py", line 89, in <module>
import psutil._pslinux as _psplatform
File "/usr/lib/python2.6/site-packages/resource_monitoring/psutil/build/lib.linux-x86_64-2.7/psutil/_pslinux.py", line 20, in <module>
from psutil import _common
ImportError: cannot import name _common
====================
Monitor out at: /var/log/ambari-metrics-monitor/ambari-metrics-monitor.out
# Error 2: missing redhat-hardened-cc1
gcc: error: /usr/lib/rpm/redhat/redhat-hardened-cc1: No such file or directory
error: command 'gcc' failed with exit status 1
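# Error 3: insufficient directory permissions
Can be ignored
This error appears when the start phase triggers the psutil build a second time; it is caused by leftovers from an interrupted build. Running start again will usually fail as well, so the leftovers need to be cleaned up.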
Task Log
File "/usr/lib/ambari-agent/lib/ambari_commons/os_family_impl.py", line 91, in thunk
File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/3.0.0/package/scripts/ams_service.py", line 114, in ams_service
File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 168, in __init__
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 171, in run
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 137, in run_action
File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 350, in action_run
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 95, in inner
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 161, in checked_call
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 278, in _call_wrapper
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 493, in _call
raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of '/usr/sbin/ambari-metrics-monitor --configs /etc/ambari-metrics-monitor/conf start' returned 255.
ls: cannot access '/usr/lib/python2.6/site-packages/resource_monitoring/psutil/build': No such file or directory
Building psutil...
error: could not create `build`: Permission denied
Verifying Python version compatibility...
Using python /usr/bin/python2
Checking for previously running Metric Monitor...
/var/run/ambari-metrics-monitor/ambari-metrics-monitor.pid found with no process. Removing 650572...
Starting ambari-metrics-monitor
Verifying ambari-metrics-monitor process status with PID : 657311
Output of PID check :
ERROR: ambari-metrics-monitor start failed. For more details, see /var/log/ambari-metrics-monitor/ambari-metrics-monitor.out
====================
running build
running build_py
running build_ext
-- Finished building psutil
Traceback (most recent call last):
File "/usr/lib/python2.6/site-packages/resource_monitoring/core/controller.py", line 26, in <module>
from core.controller import Controller
File "/usr/lib/python2.6/site-packages/resource_monitoring/core/__init__.py", line 29, in <module>
for dir in os.walk(path).next()[1]:
StopIteration
====================
Monitor out at: /var/log/ambari-metrics-monitor/ambari-metrics-monitor.out
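Since Error 1 and Error 3 both involve a psutil build directory left in an incomplete state, a check like the following may help decide whether cleanup is needed. It is a sketch under assumptions: `psutil_build_state` is a hypothetical name, the default path is taken from the logs above, and the compiled extension is assumed to be named `_psutil_linux*.so` after the `_psutil_linux.c` source shown in the compiler error.

```shell
#!/bin/sh
# Hypothetical helper: classify the psutil build directory.
#   absent  - the build directory does not exist
#   partial - the directory exists but no compiled extension was found
#   built   - a _psutil_linux*.so file was found (this does not prove the import works)
psutil_build_state() {
    build_dir=${1:-/usr/lib/python2.6/site-packages/resource_monitoring/psutil/build}
    if [ ! -d "$build_dir" ]; then
        echo "absent"
    elif find "$build_dir" -name '_psutil_linux*.so' | grep -q .; then
        echo "built"
    else
        echo "partial"
    fi
}
```

A result of `partial` would be consistent with the half-built package behind `ImportError: cannot import name _common`.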
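# III. Root-cause breakdown
Symptom | Root cause | Component involved |
---|---|---|
Python.h: No such file | python2-devel is missing, so psutil cannot be compiled | Python development headers |
redhat-hardened-cc1 missing | redhat-rpm-config / annobin is missing | GCC auxiliary toolchain |
ImportError: cannot import name _common | Some psutil modules failed to compile, leaving a half-built package | psutil build |
Permission denied | /usr/lib/python2.6 has the wrong owner, so the ams user has no write permission | Ambari-Metrics run-as user |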
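The mapping in the table can be sketched as a small triage function that reads log text and suggests which fix applies. The function name `classify_monitor_error` and the wording of the suggestions are illustrative, not part of Ambari. The match order is a design choice: the header and toolchain errors are checked first because the `_common` import error is a downstream effect of a failed build.

```shell
#!/bin/sh
# Hypothetical helper: read Monitor log text on stdin and print a suggested fix.
# Patterns come from the error messages quoted in this article.
classify_monitor_error() {
    log=$(cat)
    case $log in
        *"Python.h: No such file"*)
            echo "install python2-devel" ;;
        *"redhat-hardened-cc1"*)
            echo "install redhat-rpm-config and annobin" ;;
        *"Permission denied"*)
            echo "fix ownership of /usr/lib/python2.6" ;;
        *"cannot import name _common"*)
            echo "clean the partial psutil build and reinstall" ;;
        *)
            echo "unknown" ;;
    esac
}
```

For example, `classify_monitor_error < /var/log/ambari-metrics-monitor/ambari-metrics-monitor.out` would classify the Monitor's own output file.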
# IV. Fix steps
# 1. Clean up leftover directories and reinstall the Monitor
rm -rf /usr/lib/python2.6/*
yum reinstall -y ambari-metrics-monitor
# 2. Install the full dependency toolchain
Run on all nodes:
yum install -y gcc gcc-c++ make \
redhat-rpm-config annobin \
glibc-headers glibc-devel libstdc++-devel \
python2 python2-devel
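To confirm on each node which of these packages are still absent, something like the following could be used. It is a sketch: `missing_packages` is a hypothetical name, and `PKG_QUERY` is an illustrative override hook (not an rpm feature) that defaults to `rpm -q` so the check can be exercised on a machine without rpm.

```shell
#!/bin/sh
# Hypothetical helper: print each named package that the query command does not find.
# PKG_QUERY defaults to `rpm -q`; if rpm itself is unavailable, every package is reported.
missing_packages() {
    for pkg in "$@"; do
        if ! ${PKG_QUERY:-rpm -q} "$pkg" >/dev/null 2>&1; then
            echo "$pkg"
        fi
    done
}
```

Running `missing_packages gcc gcc-c++ make redhat-rpm-config annobin glibc-headers glibc-devel libstdc++-devel python2 python2-devel` prints nothing when the whole toolchain is installed.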
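# 3. Fix directory permissions
Reminder
Packages at ttr-2.1.0 or later already include this fix and need no action; the step below applies to older versions.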
chown -R ams:hadoop /usr/lib/python2.6
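Before changing ownership, it may be worth checking whether the directory owner actually differs from the expected one. The helper below is a sketch with a hypothetical name, `needs_chown`; it relies on GNU `stat -c`, which is what RedHat8 provides.

```shell
#!/bin/sh
# Hypothetical helper: compare a directory's owner with an expected "user:group".
# Exit status: 0 = owner differs (chown needed), 1 = already matches,
#              2 = the directory could not be inspected.
needs_chown() {
    actual=$(stat -c '%U:%G' "$1" 2>/dev/null) || return 2
    [ "$actual" != "$2" ]
}
```

With it, the step above could be made conditional: `needs_chown /usr/lib/python2.6 ams:hadoop && chown -R ams:hadoop /usr/lib/python2.6`. Note that this only inspects the top-level directory, not its contents.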
# V. Result
After completing the steps above, click Restart on the Ambari page. The system recompiles and loads psutil, and the service starts normally.
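To double-check the result from the command line, one option is to verify that the PID file points at a live process, which is the condition the start script complained about in the logs ("pid found with no process"). This is a sketch: `monitor_running` is a hypothetical name, and the default PID file path is taken from those logs.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the PID recorded in the file belongs to a live process.
# kill -0 sends no signal, but it needs permission to signal the process,
# so run this as root or as the user that owns the Monitor.
monitor_running() {
    pid_file=${1:-/var/run/ambari-metrics-monitor/ambari-metrics-monitor.pid}
    [ -r "$pid_file" ] || return 1
    pid=$(cat "$pid_file")
    case $pid in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric content
    esac
    kill -0 "$pid" 2>/dev/null
}
```

For instance, `monitor_running && echo "Monitor is up"` prints the message only when the recorded process exists.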