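Fixing the HDFS "snappy-devel package missing" error
# Symptom recap
On Rocky Linux 8.x, when the HDFS service is installed through Ambari, DataNode initialization pulls in dependency packages automatically and fails with the typical error shown below.
Main symptom: the yum/dnf call made by ambari-agent fails outright, reporting that the snappy-devel package cannot be found, and DataNode initialization is aborted.
# Tracing the error stack
Excerpt of the core exception stack: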
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/BIGTOP/3.2.0/services/HDFS/package/scripts/datanode.py", line 175, in <module>
    DataNode().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 413, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/BIGTOP/3.2.0/services/HDFS/package/scripts/datanode.py", line 52, in install
    self.install_packages(env)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 1005, in install_packages
    retry_count=agent_stack_retry_count,
  File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 168, in __init__
    self.env.run()
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 171, in run
    self.run_action(resource, action)
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 137, in run_action
    provider_action()
  File "/usr/lib/ambari-agent/lib/resource_management/core/providers/packaging.py", line 32, in action_install
    self._pkg_manager.install_package(package_name, self.__create_context())
  File "/usr/lib/ambari-agent/lib/ambari_commons/repo_manager/yum_manager.py", line 254, in install_package
    shell.repository_manager_executor(cmd, self.properties, context)
  File "/usr/lib/ambari-agent/lib/ambari_commons/shell.py", line 823, in repository_manager_executor
    raise RuntimeError(message)
RuntimeError: Failed to execute command '/usr/bin/yum -y install snappy-devel', exited with code '1', message: 'Error: Unable to find a match: snappy-devel
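Warning
Key diagnostic point: the error does not come from the Python script or from the big-data component itself. The underlying package manager fails while installing a dependency, and that directly blocks the automated cluster deployment.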
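Because the failure comes from the package manager rather than from Ambari, it can help to pull the unmatched package names out of the task log and retry them by hand. The following is a minimal sketch: the function name `unmatched_packages` is hypothetical, and it assumes the log contains dnf's `Unable to find a match:` message in the same form as the traceback above.

```shell
# Print the package names that dnf/yum could not match, one per line.
# Reads log text (for example a saved Ambari task log) from stdin.
# Assumption: the message looks like "Unable to find a match: pkg1 pkg2".
unmatched_packages() {
  grep -o 'Unable to find a match: .*' \
    | sed -e 's/^Unable to find a match: //' \
    | tr ' ' '\n' \
    | sed -e "s/'\$//" -e '/^$/d' \
    | sort -u
}
```

Feeding the saved error text to the function on stdin lists each missing package once, which gives a short list to check against the enabled repositories.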
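# Scenario analysis and troubleshooting path
# 1. Which environments are most affected?
- Rocky Linux 8.x (for example Rocky 8.6, 8.8, and 8.10, all common big-data deployment targets)
- Automated installation, through platforms such as Ambari or Bigtop, of components that need snappy support, such as HDFS, Hive, HBase, and Spark
- Fresh installs from online or offline images, or minimal base images from cloud vendors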
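# 2. Why is the snappy-devel package missing?
snappy-devel belongs to the PowerTools repository on Rocky 8 (renamed CRB in newer releases such as Rocky Linux 9). Many images, including cloud marketplace images, do not enable powertools by default, so a whole set of dependency packages used by the big-data ecosystem cannot be found.
Additional examples:
- Parquet/Snappy compression support in the Hadoop ecosystem depends on this package
- Other components that depend on snappy-devel include Hive, HBase, Impala, and Spark
- If you see `Unable to find a match: snappy-devel`, there is roughly an 80% chance that the repository is simply not enabled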
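Since the repository name differs between major releases, a script that targets more than one release needs to pick the right repo id first. A minimal sketch, assuming the stock repo ids `powertools` (Rocky 8) and `crb` (Rocky 9); the function name `devel_repo_id` is hypothetical:

```shell
# Map a Rocky Linux version string to the repo id that carries snappy-devel.
# Assumption: stock repo ids are "powertools" on 8.x and "crb" on 9.x.
devel_repo_id() {
  case "$1" in
    8|8.*) echo powertools ;;
    9|9.*) echo crb ;;
    *) echo "unknown release: $1" >&2; return 1 ;;
  esac
}

# Usage on a live host (VERSION_ID comes from /etc/os-release):
#   . /etc/os-release && devel_repo_id "$VERSION_ID"
```

The function only covers the two releases mentioned here and fails loudly for anything else, so an unexpected release does not silently fall back to the wrong repository.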
# 3. How to verify and pinpoint the cause
You can check it like this:
sudo dnf repolist all | grep -i powertools
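If powertools is listed as disabled (that is, `enabled=0` in its .repo file), or the repository does not appear in the list at all, the system cannot see the snappy-devel package.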
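The check above can be turned into a small helper that distinguishes the three cases (enabled, disabled, missing). This is a sketch under the assumption that `dnf repolist all` prints the repo id in the first column and the status in the last column, as dnf 4 on Rocky 8 does; the function name `repo_status` is hypothetical.

```shell
# Read `dnf repolist all` output on stdin and print the status of one repo:
# the value of the last column ("enabled" or "disabled"), or "missing"
# when no line starts with the given repo id (case-insensitive match).
repo_status() {
  awk -v id="$1" '
    tolower($1) == tolower(id) { print $NF; found = 1; exit }
    END { if (!found) print "missing" }
  '
}
```

On a live host this would be called as `sudo dnf repolist all | repo_status powertools`. If the result is `disabled`, `sudo dnf config-manager --set-enabled powertools` (provided by the dnf-plugins-core package) enables the repository without editing the .repo file by hand; if it is `missing`, the repository definition itself has to be added.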
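# Example repository layout and completed configuration
In this investigation, a correct powertools repository configuration should look like the following (using the Aliyun mirror for Rocky 8.10 as an example):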
[powertools]
baseurl=https://mirrors.aliyun.com/rockylinux/8.10/PowerTools/x86_64/os/
enabled=1
gpgcheck=1
gpgkey=https://mirrors.aliyun.com/rockylinux/RPM-GPG-KEY-rockyofficial
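When the repository definition is missing entirely, the configuration above can be written to a .repo file from a script. A minimal sketch: the function name `write_powertools_repo` and the target path in the usage comment are hypothetical, and it assumes no other file under /etc/yum.repos.d already defines a `[powertools]` section.

```shell
# Write the powertools repo definition shown above to the given file.
# Assumption: the caller picks a path that does not clash with an
# existing [powertools] definition.
write_powertools_repo() {
  target="$1"
  cat > "$target" <<'EOF'
[powertools]
baseurl=https://mirrors.aliyun.com/rockylinux/8.10/PowerTools/x86_64/os/
enabled=1
gpgcheck=1
gpgkey=https://mirrors.aliyun.com/rockylinux/RPM-GPG-KEY-rockyofficial
EOF
}

# Usage (as root; the file name is only an example):
#   write_powertools_repo /etc/yum.repos.d/powertools-aliyun.repo
```

Taking the target path as a parameter makes it possible to generate the file in a scratch directory first and review it before copying it into place.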
[Figure: the actual repository configuration]
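# One-step fix commands
After completing the repository configuration, clear the local cache and rebuild the repository metadata, then install the missing package: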
sudo dnf clean all
sudo dnf makecache
sudo dnf install -y snappy-devel
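The same three steps can be wrapped in a function that stops at the first failure and then confirms the package is installed. This is a sketch, not a tested deployment script: `fix_snappy_devel` and the `RUN` hook are hypothetical names, and the final `rpm -q` check is an addition to the commands above.

```shell
# Run the fix steps in order, stopping at the first failure.
# RUN is a hypothetical hook: leave it empty to execute directly,
# set RUN=sudo to elevate, or RUN=echo to preview the commands.
fix_snappy_devel() {
  ${RUN:-} dnf clean all &&
  ${RUN:-} dnf makecache &&
  ${RUN:-} dnf install -y snappy-devel &&
  ${RUN:-} rpm -q snappy-devel
}
```

`RUN=echo fix_snappy_devel` prints the four commands without touching the system, and `RUN=sudo fix_snappy_devel` runs them with elevated privileges. Once `rpm -q` reports the package, the failed DataNode install can be retried from Ambari.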
[Figure: the result after applying the fix]