
Using Ranger on HDFS

HDFS is the foundation of the entire Hadoop cluster. To keep the data stored in Hadoop safe, the security of the cluster itself must be taken care of first.

HDFS security breaks down into three parts: user authentication, user-to-group mapping, and authorization.

User authentication

HDFS supports two authentication modes: Simple and Kerberos.

  • Simple is not a user/password scheme; it effectively turns authentication off
  • Kerberos uses a ticket-granting (KDC) model; tickets must be renewed periodically to keep authentication valid.
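
For reference, Kerberos authentication is switched on in core-site.xml. The sketch below only shows the mode switch; it assumes principals, keytabs and a KDC are already set up:

<!-- core-site.xml: switch from the default "simple" mode to Kerberos -->
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>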

User mapping

HDFS maps users to groups in much the same way as other big-data components, mainly in the following ways:

  • Native, i.e. mapping against the local server: the Unix (or local file) user/group structure is applied to HDFS users
  • LDAP, i.e. synchronizing users and groups over the LDAP protocol
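
For example, LDAP-based group mapping is enabled in core-site.xml. This is a minimal sketch; the LDAP URL below is a placeholder:

<!-- core-site.xml: replace the default shell/Unix group mapping with LDAP -->
<property>
  <name>hadoop.security.group.mapping</name>
  <value>org.apache.hadoop.security.LdapGroupsMapping</value>
</property>
<property>
  <name>hadoop.security.group.mapping.ldap.url</name>
  <value>ldap://ldap.example.com:389</value>
</property>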

Authorization

HDFS authorization is enforced while the NameNode handles requests. By default it uses the built-in ACL system, and it exposes an extension point through org.apache.hadoop.hdfs.server.namenode.INodeAttributeProvider; widely used authorization systems such as Sentry and Ranger implement this interface in their plugins.
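
A custom provider is wired into the NameNode via configuration; for the Ranger plugin this typically looks like the following sketch in hdfs-site.xml:

<!-- hdfs-site.xml: point the NameNode at an external INodeAttributeProvider -->
<property>
  <name>dfs.namenode.inode.attributes.provider.class</name>
  <value>org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer</value>
</property>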

Analyzing Ranger's HDFS plugin shows how this extension point is used. The plugin lives in its ranger-hdfs-plugin submodule, and the interface is implemented by org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer; its AccessControlEnforcer implementation carries out the actual permission check. HDFS authorization here follows a delegation model.

@Override
public AccessControlEnforcer getExternalAccessControlEnforcer(AccessControlEnforcer defaultEnforcer) {
    RangerAccessControlEnforcer rangerAce = new RangerAccessControlEnforcer(defaultEnforcer);
    return rangerAce;
}
class RangerAccessControlEnforcer implements AccessControlEnforcer {
    // the enforcer to which undecided checks are delegated (HDFS's default)
    private INodeAttributeProvider.AccessControlEnforcer defaultEnforcer = null;

    public RangerAccessControlEnforcer(AccessControlEnforcer defaultEnforcer) {
        this.defaultEnforcer = defaultEnforcer;
    }
}

HDFS's default ACL check is implemented by org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker, and the delegation happens like this:

AccessControlEnforcer enforcer =
        getAttributesProvider().getExternalAccessControlEnforcer(this);
enforcer.checkPermission(fsOwner, supergroup, callerUgi, inodeAttrs, inodes,
        pathByNameArr, snapshotId, path, ancestorIndex, doCheckOwner,
        ancestorAccess, parentAccess, access, subAccess, ignoreEmptyDir);

When Ranger authorization is configured, the check is delegated through RangerHdfsAuthorizer. As the code below shows, RangerHdfsAuthorizer follows the delegation model throughout; its checkPermission proceeds as follows:

@Override
public void checkPermission(String fsOwner, String superGroup, UserGroupInformation ugi,
INodeAttributes[] inodeAttrs, INode[] inodes,
byte[][]pathByNameArr, int snapshotId, String path,
int ancestorIndex, boolean doCheckOwner,
FsAction ancestorAccess, FsAction parentAccess,
FsAction access, FsAction subAccess, boolean ignoreEmptyDir)
throws AccessControlException {
final boolean isTraverseOnlyCheck = access == null && parentAccess == null && ancestorAccess == null && subAccess == null;
INode ancestor = null;
INode parent = null;
INode inode = null;

if(plugin != null && !ArrayUtils.isEmpty(inodes)) {
int sz = inodeAttrs.length;
byte[][] components = new byte[sz][];

int i = 0;
for (; i < sz; i++) {
if (inodeAttrs[i] != null) {
components[i] = inodeAttrs[i].getLocalNameBytes();
} else {
break;
}
}

resourcePath = DFSUtil.byteArray2PathString(components, 0, i);

if(ancestorIndex >= inodes.length) {
ancestorIndex = inodes.length - 1;
}

for(; ancestorIndex >= 0 && inodes[ancestorIndex] == null; ancestorIndex--);

authzStatus = AuthzStatus.ALLOW;

ancestor = inodes.length > ancestorIndex && ancestorIndex >= 0
? inodes[ancestorIndex]
: null;
parent = inodes.length > 1
? inodes[inodes.length - 2]
: null;
inode = inodes[inodes.length - 1]; // could be null while creating a new file
auditHandler = new RangerHdfsAuditHandler(resourcePath, isTraverseOnlyCheck);

/*
This explains how Ranger handles Hadoop versions before and after 2.8.0 differently:
before 2.8.0 there was no permission check on the parent directories, after 2.8.0 there is.
*/
if(isTraverseOnlyCheck) {
authzStatus = traverseOnlyCheck(inode, inodeAttrs, resourcePath,
components, parent, ancestor,
ancestorIndex, user, groups, plugin,
auditHandler);
}

// checkStickyBit
if (authzStatus == AuthzStatus.ALLOW && parentAccess != null
&& parentAccess.implies(FsAction.WRITE)
&& parent != null && inode != null) {
if (parent.getFsPermission() != null
&& parent.getFsPermission().getStickyBit()) {
// user should be owner of the parent or the inode
authzStatus = (StringUtils.equals(parent.getUserName(), user)
|| StringUtils.equals(inode.getUserName(), user))
? AuthzStatus.ALLOW
: AuthzStatus.NOT_DETERMINED;
}
}

// checkAncestorAccess
if(authzStatus == AuthzStatus.ALLOW
&& ancestorAccess != null
&& ancestor != null) {
INodeAttributes ancestorAttribs =
inodeAttrs.length > ancestorIndex ? inodeAttrs[ancestorIndex] : null;
String ancestorPath = ancestorAttribs != null
? DFSUtil.byteArray2PathString(components, 0, ancestorIndex + 1)
: null;

authzStatus = isAccessAllowed(ancestor, ancestorAttribs, ancestorPath,
ancestorAccess, user, groups, plugin,
auditHandler);
if (authzStatus == AuthzStatus.NOT_DETERMINED) {
authzStatus = checkDefaultEnforcer(fsOwner, superGroup, ugi,
inodeAttrs, inodes,
pathByNameArr, snapshotId, path,
ancestorIndex, doCheckOwner,
ancestorAccess, null, null, null,
ignoreEmptyDir,
isTraverseOnlyCheck, ancestor,
parent, inode, auditHandler);
}
}

// checkParentAccess
if(authzStatus==AuthzStatus.ALLOW && parentAccess!=null && parent != null){
INodeAttributes parentAttribs = inodeAttrs.length > 1
? inodeAttrs[inodeAttrs.length - 2] : null;
String parentPath = parentAttribs != null
? DFSUtil.byteArray2PathString(components, 0, inodeAttrs.length - 1)
: null;

authzStatus = isAccessAllowed(parent, parentAttribs, parentPath, parentAccess, user, groups, plugin, auditHandler);
if (authzStatus == AuthzStatus.NOT_DETERMINED) {
authzStatus = checkDefaultEnforcer(fsOwner, superGroup, ugi, inodeAttrs, inodes,
pathByNameArr, snapshotId, path, ancestorIndex, doCheckOwner,
null, parentAccess, null, null, ignoreEmptyDir,
isTraverseOnlyCheck, ancestor, parent, inode, auditHandler);
}
}

// checkINodeAccess
if(authzStatus == AuthzStatus.ALLOW && access != null && inode != null) {
INodeAttributes inodeAttribs = inodeAttrs.length > 0
? inodeAttrs[inodeAttrs.length - 1]
: null;

authzStatus = isAccessAllowed(inode, inodeAttribs, resourcePath, access,
user, groups, plugin, auditHandler);
if (authzStatus == AuthzStatus.NOT_DETERMINED)
authzStatus = checkDefaultEnforcer(fsOwner, superGroup, ugi,
inodeAttrs, inodes,
pathByNameArr, snapshotId, path,
ancestorIndex, doCheckOwner,
null, null, access, null,
ignoreEmptyDir,
isTraverseOnlyCheck, ancestor,
parent, inode, auditHandler);
}

// checkSubAccess
if(authzStatus == AuthzStatus.ALLOW && subAccess != null
&& inode != null && inode.isDirectory()) {
Stack<SubAccessData> directories = new Stack<>();

for(directories.push(new SubAccessData(inode.asDirectory(),
resourcePath));
!directories.isEmpty(); ) {
SubAccessData data = directories.pop();
ReadOnlyList<INode> cList = data.dir.getChildrenList(snapshotId);

if (!(cList.isEmpty() && ignoreEmptyDir)) {
INodeAttributes dirAttribs=data.dir.getSnapshotINode(snapshotId);
authzStatus = isAccessAllowed(data.dir, dirAttribs,
data.resourcePath, subAccess, user,
groups, plugin, auditHandler);

if(authzStatus != AuthzStatus.ALLOW) {
break;
}

AuthzStatus subDirAuthStatus = AuthzStatus.NOT_DETERMINED;

boolean optimizeSubAccessAuthEnabled = RangerHdfsPlugin.
isOptimizeSubAccessAuthEnabled();

if (optimizeSubAccessAuthEnabled)
subDirAuthStatus =
isAccessAllowedForHierarchy(data.dir, dirAttribs,
data.resourcePath, subAccess,
user, groups, plugin);

if (subDirAuthStatus != AuthzStatus.ALLOW)
for(INode child : cList) {
if (child.isDirectory())
directories.push(
new SubAccessData(child.asDirectory(),
resourcePath + org.apache.hadoop.fs.Path.SEPARATOR_CHAR + child.getLocalName()));
}
}
if (authzStatus == AuthzStatus.NOT_DETERMINED) {
authzStatus = checkDefaultEnforcer(fsOwner, superGroup, ugi,
inodeAttrs, inodes,
pathByNameArr, snapshotId, path,
ancestorIndex, doCheckOwner,
null, null, null, subAccess,
ignoreEmptyDir,
isTraverseOnlyCheck, ancestor,
parent, inode, auditHandler);

}
}

// checkOwnerAccess
if(authzStatus == AuthzStatus.ALLOW && doCheckOwner) {
INodeAttributes inodeAttribs = inodeAttrs.length > 0
? inodeAttrs[inodeAttrs.length - 1]
: null;
String owner = inodeAttribs != null
? inodeAttribs.getUserName()
: null;

authzStatus = StringUtils.equals(user, owner)
? AuthzStatus.ALLOW
: AuthzStatus.NOT_DETERMINED;
}
}

// When Ranger has no policy covering this permission, the check is delegated to the default enforcer.
if (authzStatus == AuthzStatus.NOT_DETERMINED) {
authzStatus = checkDefaultEnforcer(fsOwner, superGroup, ugi, inodeAttrs,
inodes, pathByNameArr, snapshotId, path,
ancestorIndex, doCheckOwner,
ancestorAccess, parentAccess, access,
subAccess, ignoreEmptyDir,
isTraverseOnlyCheck, ancestor, parent,
inode, auditHandler);
}

if(authzStatus != AuthzStatus.ALLOW) {
FsAction action = access;

if(action == null) {
if(parentAccess != null) {
action = parentAccess;
} else if(ancestorAccess != null) {
action = ancestorAccess;
} else {
action = FsAction.EXECUTE;
}
}

throw new RangerAccessControlException("Permission denied: user=" + user + ", access=" + action + ", inode=\"" + resourcePath + "\"");
}
}

To sum up:

[figure: summary of the HDFS/Ranger authorization flow]

(How Ranger takes effect depends on the specific component; for Hive and HBase, for example, Ranger acts as the sole authorization rule.)
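
The flow above boils down to the fallback pattern sketched below. This is an illustrative, self-contained sketch rather than Ranger source; the names PolicyDecision, Enforcer and FallbackEnforcer are invented for the example:

// A condensed model of what RangerAccessControlEnforcer.checkPermission does:
// Ranger policies are evaluated first, and anything Ranger cannot decide
// (NOT_DETERMINED) is handed to the default HDFS enforcer (FSPermissionChecker).
enum PolicyDecision { ALLOW, DENY, NOT_DETERMINED }

interface Enforcer {
    PolicyDecision check(String user, String path, String access);
}

class FallbackEnforcer implements Enforcer {
    private final Enforcer ranger;       // stands in for the Ranger policy engine
    private final Enforcer hdfsDefault;  // stands in for HDFS's FSPermissionChecker

    FallbackEnforcer(Enforcer ranger, Enforcer hdfsDefault) {
        this.ranger = ranger;
        this.hdfsDefault = hdfsDefault;
    }

    @Override
    public PolicyDecision check(String user, String path, String access) {
        PolicyDecision decision = ranger.check(user, path, access);
        // Delegate to the built-in HDFS ACL check only when Ranger makes no decision.
        return decision == PolicyDecision.NOT_DETERMINED
                ? hdfsDefault.check(user, path, access)
                : decision;
    }
}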

Using Ranger on HDFS in practice

The Ranger HDFS plugin is presumably designed this way for ease of use and extensibility: directories with high security requirements can be managed through Ranger, while directories such as /tmp and /user can fall back to the native HDFS ACLs.

The fallback mode is controlled by the following property in the Ranger HDFS plugin's security configuration (typically ranger-hdfs-security.xml):

xasecure.add-hadoop-authorization: true
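
As an XML sketch (the exact configuration file name may vary by Ranger version):

<!-- ranger-hdfs-security.xml: fall back to native HDFS ACLs
     when no Ranger policy decides the request -->
<property>
  <name>xasecure.add-hadoop-authorization</name>
  <value>true</value>
</property>
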
At the same time, the permissions of the directories managed by Ranger need to be adjusted so that the default HDFS ACLs no longer grant access. The steps are as follows:

  1. In hdfs-site.xml, change fs.permissions.umask-mode to 077 (the default is 022) so that newly created files cannot be read by accounts other than the owner (see the snippet after this list).

  2. Identify the directories managed by Ranger and change their permissions in the HDFS file system:

    $ hdfs dfs -chmod -R 000 /app/hive
    $ hdfs dfs -chown -R hdfs:hdfs /app/hive
  3. Grant the appropriate user permissions on the data through the Ranger Admin UI.

  4. Identify the directories that will fall back to HDFS ACLs and check whether they are affected.

  5. Check the audit results for each.
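
The umask change from step 1 above, as a sketch (placed in hdfs-site.xml as described in that step):

<!-- hdfs-site.xml: newly created files default to owner-only permissions -->
<property>
  <name>fs.permissions.umask-mode</name>
  <value>077</value>
</property>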