Gssapi dynamic lib #127

Merged
merged 8 commits into from
Jun 17, 2024
17 changes: 2 additions & 15 deletions .github/workflows/python-release.yml
Original file line number Diff line number Diff line change
@@ -28,19 +28,12 @@ jobs:
target: [x86_64, aarch64]
steps:
- uses: actions/checkout@v4
- name: Setup QEMU
uses: docker/setup-qemu-action@v3
if: ${{ matrix.target }} != 'x86_64'
- name: Build wheels
uses: PyO3/maturin-action@v1
with:
target: ${{ matrix.target }}
args: --release --out dist --find-interpreter --manifest-path python/Cargo.toml --features kerberos
args: --release --out dist --find-interpreter --manifest-path python/Cargo.toml
sccache: 'true'
container: quay.io/pypa/manylinux2014_${{ matrix.target }}:latest
docker-options: -e LD_LIBRARY_PATH=/opt/rh/llvm-toolset-7.0/root/usr/lib64 -e LLVM_CONFIG_PATH=/opt/rh/llvm-toolset-7.0/root/usr/bin/llvm-config
before-script-linux: |
yum install -y epel-release && yum install -y krb5-devel llvm-toolset-7.0-clang llvm-toolset-7.0-llvm-devel
- name: Upload wheels
uses: actions/upload-artifact@v4
with:
@@ -54,18 +47,12 @@ jobs:
target: [x86_64, aarch64]
steps:
- uses: actions/checkout@v4
- name: Install native libs
run:
brew install krb5
- name: Build wheels
uses: PyO3/maturin-action@v1
with:
target: ${{ matrix.target }}
args: --release --out dist --find-interpreter --manifest-path python/Cargo.toml --features kerberos
args: --release --out dist --find-interpreter --manifest-path python/Cargo.toml
sccache: 'true'
env:
BINDGEN_EXTRA_CLANG_ARGS: "-I/usr/local/include"
LIBRARY_PATH: /usr/local/lib
- name: Upload wheels
uses: actions/upload-artifact@v4
with:
12 changes: 3 additions & 9 deletions .github/workflows/python-test.yml
@@ -29,9 +29,6 @@ jobs:
distribution: "temurin"
java-version: "17"

- name: Install native libs
run: sudo apt-get install -y libkrb5-dev krb5-user

- name: Download Hadoop
run: |
wget -q https://dlcdn.apache.org/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz
@@ -53,7 +50,7 @@ jobs:
sccache: 'true'
container: 'off'
working-directory: ./python
args: --features kerberos --extras devel
args: --extras devel

- name: Run lints
run: |
@@ -74,12 +71,9 @@ jobs:
- name: Build wheel
uses: PyO3/maturin-action@v1
with:
args: --release --out dist --find-interpreter --manifest-path python/Cargo.toml --features kerberos
args: --release --out dist --find-interpreter --manifest-path python/Cargo.toml
sccache: 'true'
manylinux: '2014'
docker-options: -e LD_LIBRARY_PATH=/opt/rh/llvm-toolset-7.0/root/usr/lib64 -e LLVM_CONFIG_PATH=/opt/rh/llvm-toolset-7.0/root/usr/bin/llvm-config
before-script-linux: |
yum install -y epel-release && yum install -y krb5-devel llvm-toolset-7.0-clang llvm-toolset-7.0-llvm-devel

- name: Upload wheels
if: github.ref == 'refs/heads/master'
uses: actions/upload-artifact@v4
12 changes: 3 additions & 9 deletions .github/workflows/rust-test.yml
@@ -40,23 +40,17 @@ jobs:

- uses: Swatinem/rust-cache@v2

- name: Install native libs
run: sudo apt-get install -y libkrb5-dev

- name: build and lint with clippy
run: cargo clippy --all-targets --features kerberos,integration-test,benchmark -- -D warnings
run: cargo clippy --all-targets --features integration-test,benchmark -- -D warnings

- name: Check docs
run: cargo doc

- name: Check no features
run: cargo check --tests

- name: Check kerberos
run: cargo check --tests --features kerberos

- name: Check all features
run: cargo check --all-targets --features kerberos,integration-test,benchmark
run: cargo check --all-targets --features integration-test,benchmark

test:
strategy:
@@ -99,4 +93,4 @@ jobs:
echo "$GITHUB_WORKSPACE/hadoop-3.4.0/bin" >> $GITHUB_PATH

- name: Run tests
run: cargo test --features kerberos,integration-test
run: cargo test --features integration-test
56 changes: 3 additions & 53 deletions Cargo.lock

Some generated files are not rendered by default.

38 changes: 21 additions & 17 deletions README.md
@@ -22,13 +22,31 @@ Here is a list of currently supported and unsupported but possible future featur
- RS schema only, no support for RS-Legacy or XOR

### Security Features
- [x] Kerberos authentication (GSSAPI SASL support)
- [x] Kerberos authentication (GSSAPI SASL support) (requires libgssapi_krb5, see below)
- [x] Token authentication (DIGEST-MD5 SASL support)
- [x] NameNode SASL connection
- [x] DataNode SASL connection
- [x] DataNode data transfer encryption
- [ ] Encryption at rest (KMS support)

### Kerberos Support
The Kerberos (SASL GSSAPI) mechanism is supported by dynamically loading `libgssapi_krb5` at runtime. The library must be installed separately, but it is likely already present on your system. If not, you can install it as follows:

#### Debian-based systems
```bash
apt-get install libgssapi-krb5-2
```

#### RHEL-based systems
```bash
yum install krb5-libs
```

#### MacOS
```bash
brew install krb5
```
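
Because the library is resolved at run time rather than at link time, an application may want to check up front whether it can be loaded. The following is a minimal sketch of such a check, assuming a Unix-like system. The helper names `can_load` and `gssapi_available` and the candidate file names are illustrative assumptions, not part of hdfs-native, and the sketch calls the POSIX loader directly instead of going through the `libloading` crate that this change adds.

```rust
use std::ffi::{c_char, c_int, c_void, CString};

// Declarations for the POSIX dynamic loader, which Rust's standard library
// already links against on typical Linux and macOS targets.
unsafe extern "C" {
    fn dlopen(filename: *const c_char, flags: c_int) -> *mut c_void;
    fn dlclose(handle: *mut c_void) -> c_int;
}

// RTLD_LAZY has the value 1 on both Linux and macOS.
const RTLD_LAZY: c_int = 1;

/// Hypothetical helper: returns true if the dynamic loader can open `name`.
pub fn can_load(name: &str) -> bool {
    let Ok(c_name) = CString::new(name) else {
        return false; // an interior NUL byte cannot be a valid library name
    };
    // SAFETY: `c_name` is a valid NUL-terminated string for the whole call.
    let handle = unsafe { dlopen(c_name.as_ptr(), RTLD_LAZY) };
    if handle.is_null() {
        return false;
    }
    // SAFETY: `handle` came from a successful `dlopen` call.
    unsafe { dlclose(handle) };
    true
}

/// Hypothetical helper: tries a few assumed file names for the GSSAPI library.
/// Actual names vary by platform and package version.
pub fn gssapi_available() -> bool {
    ["libgssapi_krb5.so.2", "libgssapi_krb5.so", "libgssapi_krb5.dylib"]
        .iter()
        .any(|name| can_load(name))
}
```

Calling `gssapi_available()` before attempting a Kerberos connection would let a program report a missing library with an install hint instead of a generic authentication failure.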

## Supported HDFS Settings
The client will attempt to read Hadoop configs `core-site.xml` and `hdfs-site.xml` in the directories `$HADOOP_CONF_DIR` or if that doesn't exist, `$HADOOP_HOME/etc/hadoop`. Currently the supported configs that are used are:
- `fs.defaultFS` - Client::default() support
@@ -41,32 +59,18 @@ All other settings are generally assumed to be the defaults currently. For insta

## Building

### Mac
```
brew install krb5
# You might need these env vars on newer Macs
export BINDGEN_EXTRA_CLANG_ARGS="-I/opt/homebrew/include"
export LIBRARY_PATH=/opt/homebrew/lib
cargo build --features kerberos
```

### Ubuntu
```
apt-get install clang libkrb5-dev
cargo build --features kerberos
cargo build
```

## Crate features
- `kerberos` - enables kerberos GSSAPI authentication support. This uses the `libgssapi` crate and supports integrity as well as confidentiality

## Object store implementation
An object_store implementation for HDFS is provided in the [hdfs-native-object-store](./crates/hdfs-native-object-store/) crate.

## Running tests
The tests are mostly integration tests that utilize a small Java application in `rust/minidfs/` that runs a custom `MiniDFSCluster`. To run the tests, you need to have Java, Maven, Hadoop binaries, and Kerberos tools available and on your path. Any Java version between 8 and 17 should work.

```bash
cargo test -p hdfs-native --features kerberos,integration-test
cargo test -p hdfs-native --features integration-test
```

### Python tests
3 changes: 0 additions & 3 deletions python/Cargo.toml
@@ -32,6 +32,3 @@ log = { workspace = true }
pyo3 = { version = "0.20", features = ["extension-module", "abi3", "abi3-py38"] }
thiserror = { workspace = true }
tokio = { workspace = true, features = ["rt-multi-thread"] }

[features]
kerberos = ["hdfs-native/kerberos"]
18 changes: 18 additions & 0 deletions python/README.md
@@ -15,6 +15,24 @@ client = Client("hdfs://localhost:9000")
status = client.get_file_info("/file.txt")
```

## Kerberos support
Kerberos (SASL GSSAPI) is supported by dynamically loading `libgssapi_krb5` at runtime. The library must be installed separately, but it is likely already present on your system. If not, you can install it as follows:

#### Debian-based systems
```bash
apt-get install libgssapi-krb5-2
```

#### RHEL-based systems
```bash
yum install krb5-libs
```

#### MacOS
```bash
brew install krb5
```

## Running tests
The same requirements apply as the Rust tests, requiring Java, Maven, Hadoop, and Kerberos tools to be on your path. Then you can:

5 changes: 2 additions & 3 deletions rust/Cargo.toml
@@ -13,6 +13,7 @@ license = "Apache-2.0"
[dependencies]
aes = "0.8"
base64 = "0.21"
bitflags = "2"
bytes = { workspace = true }
cbc = "0.1"
chrono = "0.4"
@@ -27,7 +28,7 @@ g2p = "1"
hex = "0.4"
hmac = "0.12"
libc = "0.2"
libgssapi = { version = "0.7", default-features = false, optional = true }
libloading = "0.8"
log = { workspace = true }
md-5 = "0.10"
num-traits = "0.2"
@@ -57,8 +58,6 @@ tempfile = "3"
which = "4"

[features]
kerberos = ["libgssapi"]

generate-protobuf = ["prost-build", "protobuf-src"]
integration-test = ["which"]
benchmark = ["fs-hdfs3", "which"]
34 changes: 34 additions & 0 deletions rust/c_src/gssapi_mit.h
@@ -0,0 +1,34 @@
#include <gssapi/gssapi.h>
#include <gssapi/gssapi_krb5.h>

const OM_uint32 _GSS_C_INDEFINITE = GSS_C_INDEFINITE;
const OM_uint32 _GSS_C_CALLING_ERROR_MASK = GSS_C_CALLING_ERROR_MASK;
const OM_uint32 _GSS_C_ROUTINE_ERROR_MASK = GSS_C_ROUTINE_ERROR_MASK;
const OM_uint32 _GSS_C_SUPPLEMENTARY_MASK = GSS_C_SUPPLEMENTARY_MASK;
const OM_uint32 _GSS_S_CALL_INACCESSIBLE_READ = GSS_S_CALL_INACCESSIBLE_READ;
const OM_uint32 _GSS_S_CALL_INACCESSIBLE_WRITE = GSS_S_CALL_INACCESSIBLE_WRITE;
const OM_uint32 _GSS_S_CALL_BAD_STRUCTURE = GSS_S_CALL_BAD_STRUCTURE;
const OM_uint32 _GSS_S_BAD_MECH = GSS_S_BAD_MECH;
const OM_uint32 _GSS_S_BAD_NAME = GSS_S_BAD_NAME;
const OM_uint32 _GSS_S_BAD_NAMETYPE = GSS_S_BAD_NAMETYPE;
const OM_uint32 _GSS_S_BAD_BINDINGS = GSS_S_BAD_BINDINGS;
const OM_uint32 _GSS_S_BAD_STATUS = GSS_S_BAD_STATUS;
const OM_uint32 _GSS_S_BAD_SIG = GSS_S_BAD_SIG;
const OM_uint32 _GSS_S_BAD_MIC = GSS_S_BAD_MIC;
const OM_uint32 _GSS_S_NO_CRED = GSS_S_NO_CRED;
const OM_uint32 _GSS_S_NO_CONTEXT = GSS_S_NO_CONTEXT;
const OM_uint32 _GSS_S_DEFECTIVE_TOKEN = GSS_S_DEFECTIVE_TOKEN;
const OM_uint32 _GSS_S_DEFECTIVE_CREDENTIAL = GSS_S_DEFECTIVE_CREDENTIAL;
const OM_uint32 _GSS_S_CREDENTIALS_EXPIRED = GSS_S_CREDENTIALS_EXPIRED;
const OM_uint32 _GSS_S_CONTEXT_EXPIRED = GSS_S_CONTEXT_EXPIRED;
const OM_uint32 _GSS_S_FAILURE = GSS_S_FAILURE;
const OM_uint32 _GSS_S_BAD_QOP = GSS_S_BAD_QOP;
const OM_uint32 _GSS_S_UNAUTHORIZED = GSS_S_UNAUTHORIZED;
const OM_uint32 _GSS_S_UNAVAILABLE = GSS_S_UNAVAILABLE;
const OM_uint32 _GSS_S_DUPLICATE_ELEMENT = GSS_S_DUPLICATE_ELEMENT;
const OM_uint32 _GSS_S_NAME_NOT_MN = GSS_S_NAME_NOT_MN;
const OM_uint32 _GSS_S_CONTINUE_NEEDED = GSS_S_CONTINUE_NEEDED;
const OM_uint32 _GSS_S_DUPLICATE_TOKEN = GSS_S_DUPLICATE_TOKEN;
const OM_uint32 _GSS_S_OLD_TOKEN = GSS_S_OLD_TOKEN;
const OM_uint32 _GSS_S_UNSEQ_TOKEN = GSS_S_UNSEQ_TOKEN;
const OM_uint32 _GSS_S_GAP_TOKEN = GSS_S_GAP_TOKEN;
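
The header above re-exports the GSS-API status macros as `const` variables, presumably so that bindgen's `--allowlist-var` can pick them up. As a sketch of how such values are interpreted on the Rust side, the code below splits a major status word into its three fields using the offsets and masks defined in RFC 2744. The names `MajorStatus`, `split_major`, and `is_error` are illustrative assumptions, not taken from this PR.

```rust
// Field layout of a GSS-API major status word, per RFC 2744.
const CALLING_ERROR_OFFSET: u32 = 24;
const ROUTINE_ERROR_OFFSET: u32 = 16;
const CALLING_ERROR_MASK: u32 = 0o377;
const ROUTINE_ERROR_MASK: u32 = 0o377;
const SUPPLEMENTARY_MASK: u32 = 0o177777;

/// Hypothetical helper type holding the three fields of a major status.
#[derive(Debug, PartialEq, Eq)]
pub struct MajorStatus {
    pub calling_error: u32,
    pub routine_error: u32,
    pub supplementary: u32,
}

/// Splits a raw major status into calling error, routine error,
/// and supplementary information bits.
pub fn split_major(major: u32) -> MajorStatus {
    MajorStatus {
        calling_error: (major >> CALLING_ERROR_OFFSET) & CALLING_ERROR_MASK,
        routine_error: (major >> ROUTINE_ERROR_OFFSET) & ROUTINE_ERROR_MASK,
        supplementary: major & SUPPLEMENTARY_MASK,
    }
}

/// Mirrors the C macro `GSS_ERROR(x)`: true if either error field is non-zero.
/// Supplementary bits alone (such as "continue needed") are not errors.
pub fn is_error(major: u32) -> bool {
    let error_bits = (CALLING_ERROR_MASK << CALLING_ERROR_OFFSET)
        | (ROUTINE_ERROR_MASK << ROUTINE_ERROR_OFFSET);
    major & error_bits != 0
}
```

For example, `GSS_S_FAILURE` is routine error 13, so `split_major(13 << 16)` yields a `routine_error` of 13 with the other fields zero, while a status of 1 (`GSS_S_CONTINUE_NEEDED`) is not an error.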
10 changes: 10 additions & 0 deletions rust/generate_gssapi_bindings.sh
@@ -0,0 +1,10 @@
#!/usr/bin/env bash
# Need main branch until next release for var support in dynamic loading
cargo install --git https://github.com/rust-lang/rust-bindgen --branch main bindgen-cli

bindgen c_src/gssapi_mit.h \
--allowlist-type "OM_.+|gss_.+" \
--allowlist-var "_?GSS_.+|gss_.+" \
--allowlist-function "gss_.*" \
--dynamic-loading GSSAPI \
-o src/security/gssapi_bindings.rs
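
With `--dynamic-loading GSSAPI`, bindgen emits a struct named `GSSAPI` whose functions are looked up in a shared library at run time instead of being resolved by the linker. The sketch below hand-rolls the same pattern with the POSIX loader to show the idea. It is not the generated code: `DynLib` is a hypothetical stand-in, and `strlen` from the running process stands in for a `gss_*` function so that the example does not need Kerberos installed. It assumes a dynamically linked Unix-like target.

```rust
use std::ffi::{c_char, c_int, c_void, CStr};
use std::ptr;

unsafe extern "C" {
    fn dlopen(filename: *const c_char, flags: c_int) -> *mut c_void;
    fn dlsym(handle: *mut c_void, symbol: *const c_char) -> *mut c_void;
}

// RTLD_LAZY has the value 1 on both Linux and macOS.
const RTLD_LAZY: c_int = 1;

type StrlenFn = unsafe extern "C" fn(*const c_char) -> usize;

/// Illustrative stand-in for a generated bindings struct:
/// each dynamically resolved symbol becomes a field.
pub struct DynLib {
    pub strlen: StrlenFn,
}

impl DynLib {
    /// Resolves symbols from the already-loaded process image
    /// (`dlopen` with a null path). Returns `None` if lookup fails.
    pub fn load() -> Option<Self> {
        // SAFETY: a null path is valid and refers to the main program.
        let handle = unsafe { dlopen(ptr::null(), RTLD_LAZY) };
        if handle.is_null() {
            return None;
        }
        // SAFETY: the symbol name is a NUL-terminated byte string.
        let sym = unsafe { dlsym(handle, b"strlen\0".as_ptr().cast::<c_char>()) };
        if sym.is_null() {
            return None;
        }
        // SAFETY: `strlen` has exactly this C signature.
        let strlen = unsafe { std::mem::transmute::<*mut c_void, StrlenFn>(sym) };
        Some(DynLib { strlen })
    }

    /// Safe wrapper around the dynamically resolved function.
    pub fn len_of(&self, s: &CStr) -> usize {
        // SAFETY: `s` is a valid NUL-terminated C string.
        unsafe { (self.strlen)(s.as_ptr()) }
    }
}
```

The practical consequence of this design is that a missing library becomes a run-time condition the caller can handle, which is what allows the `kerberos` feature flag and the build-time `krb5` development packages to be dropped from the workflows above.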
5 changes: 1 addition & 4 deletions rust/src/error.rs
@@ -1,7 +1,5 @@
use std::io;

#[cfg(feature = "kerberos")]
use libgssapi::error::Error as GssapiError;
use prost::DecodeError;
use thiserror::Error;

@@ -45,9 +43,8 @@ pub enum HdfsError {
FatalRPCError(String, String),
#[error("SASL error")]
SASLError(String),
#[cfg(feature = "kerberos")]
#[error("GSSAPI error")]
GSSAPIError(#[from] GssapiError),
GSSAPIError(crate::security::gssapi::GssMajorCodes, u32, String),
#[error("No valid SASL mechanism found")]
NoSASLMechanism,
}
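
The new `GSSAPIError` variant carries a major code, a minor code, and a message instead of wrapping the `libgssapi` error type. A minimal sketch of how a caller might match on a variant of that shape follows. `MajorCode`, `ClientError`, and `describe` are hypothetical stand-ins for illustration; the real `GssMajorCodes` type lives in `crate::security::gssapi` and is not shown in this diff.

```rust
/// Stand-in for `GssMajorCodes`, assumed here to wrap the raw major status.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MajorCode(pub u32);

/// Hypothetical error enum mirroring the shape of the two SASL-related variants.
#[derive(Debug)]
pub enum ClientError {
    Sasl(String),
    Gssapi(MajorCode, u32, String),
}

/// Formats an error for logging, exposing the GSSAPI status codes when present.
pub fn describe(err: &ClientError) -> String {
    match err {
        ClientError::Gssapi(major, minor, msg) => format!(
            "GSSAPI failure (major=0x{:08x}, minor={}): {}",
            major.0, minor, msg
        ),
        ClientError::Sasl(msg) => format!("SASL failure: {}", msg),
    }
}
```

Keeping the numeric codes in the variant means the error type no longer depends on an optional crate, so it can be compiled unconditionally without the `#[cfg(feature = "kerberos")]` gate.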