[O] Impala Version Adaptation (Part 2)
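# Background and Core Goals
As Hadoop 3.3.4 sees wider use on big data platforms, some of Impala's native dependency packages, classpaths, and third-party library references either conflict directly with the new Hadoop version or are incompatible with it. This patch focuses on resolving the Hadoop 3.3.4 incompatibilities so that Impala compiles and runs cleanly in the new environment.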
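Tip
The dependency layout of Hadoop 3.x, and parts of its shading strategy, differ considerably from older versions. When adapting, keep third-party package relocations and class reference paths in sync.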
# Main Changes and Engineering Value
# 1. Updating dependency import paths
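- Some of the shaded or relocated package paths used previously no longer apply and must be updated to the standard third-party library paths.
- For example, `org.apache.iceberg.relocated.com.google.common.collect.Lists` is replaced with `com.google.common.collect.Lists`, so that the code matches the new dependency tree.
- Likewise, references to protobuf, commons-lang, and similar libraries are unified to the standard locations of the mainstream third-party libraries.

This avoids compile errors and runtime `NoClassDefFoundError` failures caused by changes in how the shading mechanism rewrites dependency paths.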
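
The path update described above is mechanical, so it can be sketched as a small helper. This is a minimal illustration, not part of the patch: the class name `RelocatedImportRewriter` and its method are hypothetical, and the two prefix mappings are simply the ones that appear in the diff in this article.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: rewrites relocated/shaded import lines to the
// standard third-party package paths. Not part of Impala or the patch.
final class RelocatedImportRewriter {
    // Prefix mappings taken from the diff in this article.
    private static final Map<String, String> PREFIXES = new LinkedHashMap<>();
    static {
        PREFIXES.put("org.apache.iceberg.relocated.com.google.common.", "com.google.common.");
        PREFIXES.put("com.cloudera.cloud.storage.relocated.protobuf.", "com.google.protobuf.");
    }

    // Returns the line unchanged unless it is an import of a relocated class.
    static String rewriteLine(String line) {
        if (!line.trim().startsWith("import ")) {
            return line;
        }
        for (Map.Entry<String, String> e : PREFIXES.entrySet()) {
            int idx = line.indexOf(e.getKey());
            if (idx >= 0) {
                return line.substring(0, idx) + e.getValue()
                        + line.substring(idx + e.getKey().length());
            }
        }
        return line;
    }

    public static void main(String[] args) {
        // Prints: import com.google.common.collect.Lists;
        System.out.println(rewriteLine(
                "import org.apache.iceberg.relocated.com.google.common.collect.Lists;"));
    }
}
```

Matching on the full prefix, including the trailing dot, keeps the rewrite from touching unrelated packages that merely share a leading segment.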
# 2. Moving dependencies and completing the pom
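- For third-party packages that must be declared separately under the new dependency set (such as `protobuf-java`), add the dependency explicitly in `fe/pom.xml`, pin a suitable version, and prevent conflicts with the copy pulled in by the main Hadoop packages.
- Also make sure each dependency's `scope` and `version` match the main dependency chain, to rule out runtime exceptions caused by multi-version conflicts.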
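Note
Declaring the dependency explicitly in the pom this way effectively avoids missing dependencies and version mismatches, and makes both compilation and runtime more stable.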
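
One way to confirm which jar actually supplies a class after changing the pom is to ask the JVM where the class was loaded from. The sketch below uses only standard JDK calls; the class name `ClassOrigin` is hypothetical, and `com.google.protobuf.Struct` is used as the default only because it is the class this patch imports. Run it with the same classpath as the frontend to get a meaningful answer.

```java
import java.security.CodeSource;

// Hypothetical diagnostic: reports which jar (or directory) a class is loaded from.
final class ClassOrigin {
    static String locate(String className) {
        try {
            // Load without initializing, so no static initializers run.
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes from the JDK itself have no code source.
            return src == null ? "(JDK / bootstrap class loader)"
                               : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "com.google.protobuf.Struct";
        System.out.println(name + " -> " + locate(name));
    }
}
```

If the reported location is not the `protobuf-java` jar of the version declared in `fe/pom.xml`, another dependency is probably shadowing it, and the dependency tree is worth inspecting.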
# 3. Old versus new API differences and package migration
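- Third-party libraries such as Avro and Iceberg often change internal APIs and exception package names across major versions. For example: `org.apache.commons.lang.NotImplementedException` → `org.apache.commons.lang3.NotImplementedException`
- This patch fixes the affected package references accordingly, keeping the code compatible with the new dependencies.

In production, migrations like this deserve close attention. Keep API usage in sync, especially where shaded packages or relocated classes are involved.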
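
To find leftover references before they fail at compile time, source files can be scanned for imports from the old package. The following is a minimal sketch under stated assumptions: `LegacyImportScanner` is a hypothetical name, and the only legacy prefix it knows is the commons-lang one from this patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical scanner: lists import lines that still reference a legacy package.
final class LegacyImportScanner {
    // The trailing dot matters: "org.apache.commons.lang." does not match
    // "org.apache.commons.lang3.", so already-migrated imports are not flagged.
    private static final String[] LEGACY_PREFIXES = {
        "org.apache.commons.lang.",
    };

    static List<String> findLegacyImports(String source) {
        List<String> hits = new ArrayList<>();
        for (String line : source.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.startsWith("import ")) {
                continue;
            }
            for (String prefix : LEGACY_PREFIXES) {
                if (trimmed.contains(prefix)) {
                    hits.add(trimmed);
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String source = String.join("\n",
                "import org.apache.commons.lang.NotImplementedException;",
                "import org.apache.commons.lang3.StringUtils;");
        // Only the first import is reported.
        findLegacyImports(source).forEach(System.out::println);
    }
}
```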
The complete diff follows:
Subject: [PATCH] optimized: fix Hadoop 3.3.4 incompatibilities
---
Index: java/TableFlattener/src/main/java/org/apache/impala/infra/tableflattener/SchemaFlattener.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/java/TableFlattener/src/main/java/org/apache/impala/infra/tableflattener/SchemaFlattener.java b/java/TableFlattener/src/main/java/org/apache/impala/infra/tableflattener/SchemaFlattener.java
--- a/java/TableFlattener/src/main/java/org/apache/impala/infra/tableflattener/SchemaFlattener.java (revision 57b8347ef66abdb645badf925d8150e2a32f0e44)
+++ b/java/TableFlattener/src/main/java/org/apache/impala/infra/tableflattener/SchemaFlattener.java (date 1740982071128)
@@ -23,7 +23,7 @@
import org.apache.avro.Schema.Field;
import org.apache.avro.Schema.Type;
import org.apache.avro.generic.GenericRecord;
-import org.apache.commons.lang.NotImplementedException;
+import org.apache.commons.lang3.NotImplementedException;
import org.apache.hadoop.conf.Configuration;
import org.kitesdk.data.Dataset;
import org.kitesdk.data.DatasetDescriptor;
Index: fe/src/main/java/org/apache/impala/util/IcebergMetadataScanner.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/fe/src/main/java/org/apache/impala/util/IcebergMetadataScanner.java b/fe/src/main/java/org/apache/impala/util/IcebergMetadataScanner.java
--- a/fe/src/main/java/org/apache/impala/util/IcebergMetadataScanner.java (revision 57b8347ef66abdb645badf925d8150e2a32f0e44)
+++ b/fe/src/main/java/org/apache/impala/util/IcebergMetadataScanner.java (date 1740966368026)
@@ -17,7 +17,7 @@
package org.apache.impala.util;
-import com.cloudera.cloud.storage.relocated.protobuf.Struct;
+import com.google.protobuf.Struct;
import com.google.common.base.Preconditions;
import java.util.Iterator;
Index: fe/src/main/java/org/apache/impala/util/IcebergSchemaConverter.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/fe/src/main/java/org/apache/impala/util/IcebergSchemaConverter.java b/fe/src/main/java/org/apache/impala/util/IcebergSchemaConverter.java
--- a/fe/src/main/java/org/apache/impala/util/IcebergSchemaConverter.java (revision 57b8347ef66abdb645badf925d8150e2a32f0e44)
+++ b/fe/src/main/java/org/apache/impala/util/IcebergSchemaConverter.java (date 1740970508500)
@@ -30,7 +30,7 @@
import org.apache.iceberg.PartitionSpec;
import org.apache.iceberg.Schema;
import org.apache.iceberg.hive.HiveSchemaUtil;
-import org.apache.iceberg.relocated.com.google.common.collect.Lists;
+import com.google.common.collect.Lists;
import org.apache.iceberg.types.Types;
import org.apache.impala.analysis.IcebergPartitionField;
import org.apache.impala.analysis.IcebergPartitionSpec;
Index: fe/src/main/java/org/apache/impala/util/MigrateTableUtil.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/fe/src/main/java/org/apache/impala/util/MigrateTableUtil.java b/fe/src/main/java/org/apache/impala/util/MigrateTableUtil.java
--- a/fe/src/main/java/org/apache/impala/util/MigrateTableUtil.java (revision 57b8347ef66abdb645badf925d8150e2a32f0e44)
+++ b/fe/src/main/java/org/apache/impala/util/MigrateTableUtil.java (date 1740970508491)
@@ -40,7 +40,7 @@
import org.apache.iceberg.Table;
import org.apache.iceberg.catalog.TableIdentifier;
import org.apache.iceberg.data.TableMigrationUtil;
-import org.apache.iceberg.relocated.com.google.common.collect.Lists;
+import com.google.common.collect.Lists;
import org.apache.impala.analysis.TableName;
import org.apache.impala.catalog.FeCatalog;
import org.apache.impala.catalog.FeFsPartition;
Index: fe/pom.xml
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/fe/pom.xml b/fe/pom.xml
--- a/fe/pom.xml (revision 57b8347ef66abdb645badf925d8150e2a32f0e44)
+++ b/fe/pom.xml (date 1740984830952)
@@ -613,6 +613,12 @@
<version>1.72</version>
<scope>test</scope>
</dependency>
+
+ <dependency>
+ <groupId>com.google.protobuf</groupId>
+ <artifactId>protobuf-java</artifactId>
+ <version>3.19.1</version>
+ </dependency>
</dependencies>
<reporting>