[GPU] Kubeflow & GCP & K8S

아르비스 2021. 1. 27. 21:36

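There are plenty of guides for using GPUs on GKE and for installing other versions of Kubeflow, but I could not find one that runs GPUs on a self-managed Kubernetes cluster on GCP.

So I wrote up the steps here.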

Installed versions

Google Cloud Compute Engine
Ubuntu 18.04.5 LTS
Nvidia driver 460
docker-CE 20.10.2
kubernetes v1.16.15
Weave Net
kubeflow 1.1

1. Server specs

Role	CPU	RAM	Storage	GPU
Master	4 vCore	15 GB	300 GB	-
Node-1	8 vCore	36 GB	300 GB	1x NVIDIA Tesla T4
Node-2	8 vCore	36 GB	300 GB	1x NVIDIA Tesla T4

OS: Ubuntu 18.04.5 LTS (Bionic Beaver)

(Kubeflow did not yet seem to support 20.04; I ran into problems when setting it up on that release.)

2. Installing the NVIDIA GPU driver

cloud.google.com/compute/docs/gpus/install-drivers-gpu#ubuntu-driver-steps

Ubuntu 18.04 LTS [run on the worker nodes]

curl -O https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo add-apt-repository "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"


sudo apt update

sudo apt install cuda

Run the commands above on the worker nodes (the VMs that have a GPU).

Verify the driver installation

root@gpu-n1:~# nvidia-smi
Wed Jan 27 08:10:32 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   76C    P0    33W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
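
When repeating this check on several nodes, it can help to pull just the driver version out of the nvidia-smi banner. The following is a minimal sketch; the helper names driver_version and require_driver_major are made up for this post, and the parsing assumes the banner format shown above.

```shell
# driver_version: read nvidia-smi output on stdin and print the driver version.
# (hypothetical helper; assumes the "Driver Version: X.Y.Z" banner shown above)
driver_version() {
    sed -n 's/.*Driver Version: \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# require_driver_major MAJOR: succeed only if the version on stdin has that major number.
require_driver_major() {
    want="$1"
    got="$(driver_version)"
    [ "${got%%.*}" = "$want" ]
}

# On a GPU node:
#   nvidia-smi | driver_version
#   nvidia-smi | require_driver_major 460 && echo "driver OK"
```

If the driver is not loaded, nvidia-smi itself fails, so the helper prints nothing and the check returns a non-zero status.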

3. Installing Docker

Add the Docker repository

sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common
	
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
	
sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"
   
sudo apt-get update

Install Docker Engine. GPU support is available from v19 (19.03) onward, so the latest version at the time of writing (v20.10.2) is installed.

sudo apt-get install docker-ce docker-ce-cli containerd.io
sudo apt-mark hold docker-ce docker-ce-cli

Check the Docker version

root@gpu-mst:~# docker version
Client: Docker Engine - Community
 Version:           20.10.2
 API version:       1.41
 Go version:        go1.13.15
 Git commit:        2291f61
 Built:             Mon Dec 28 16:17:32 2020
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.2
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       8891c58
  Built:            Mon Dec 28 16:15:09 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.3
  GitCommit:        269548fa27e0089a8b8278fc4fc781d7f65a939b
 runc:
  Version:          1.0.0-rc92
  GitCommit:        ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

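4. Installing nvidia-docker2

Installing nvidia-container-runtime

Add the package repo

Add the package repository that matches your distribution. Check the instructions on NVIDIA's GitHub for the exact details; the commands below are the repo setup as of the date this post was written.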

curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
sudo apt-get update

Install the package

sudo apt-get install -y nvidia-container-runtime

Verify the installation

which nvidia-container-runtime-hook

If everything has worked up to this point, Docker containers are ready to use the GPU.

Checking GPU access from a container

Adding the --gpus flag gives a container access to GPU resources when it starts.

docker run -it --rm --gpus all ubuntu nvidia-smi

If the ubuntu container starts and nvidia-smi runs inside it, the setup is working.

If running nvidia-smi through docker run fails with the error below, handle it as follows.

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
ERRO[0000] error waiting for container: context canceled

Run the following:

# Add the package repositories
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker

Older option (nvidia-docker2)

# Add the package repositories

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list


sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo apt-get install nvidia-docker2
sudo systemctl restart docker

How you run a GPU container depends on the Docker version; there are two forms.

# Docker 19.03 or later
docker run --gpus all nvidia/cuda:10.0-base nvidia-smi

# Docker earlier than 19.03
docker run --runtime=nvidia nvidia/cuda:10.0-base nvidia-smi

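To use nvidia-docker from Kubernetes, you must change Docker's default runtime and install the NVIDIA device plugin.

Change the runtime as follows.

If nvidia-docker2 installed correctly, /etc/docker/daemon.json will have been created (something is wrong if it is missing).

Open that file and add the following entry.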

"default-runtime": "nvidia",

> vi /etc/docker/daemon.json

{
  "default-runtime": "nvidia", 
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}

After editing the file, restart the Docker daemon.

sudo systemctl restart docker
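
Before moving on to Kubernetes, it is worth confirming that the edit actually landed in daemon.json. Below is a rough sketch using grep; the function names are hypothetical, and a pattern match like this is only a sanity check, not a JSON parser.

```shell
# has_nvidia_default_runtime FILE: succeed if FILE sets "default-runtime" to "nvidia".
# (hypothetical helper; a grep-based sanity check, not a real JSON parse)
has_nvidia_default_runtime() {
    grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$1"
}

# has_nvidia_runtime_entry FILE: succeed if FILE points a runtime path at nvidia-container-runtime.
has_nvidia_runtime_entry() {
    grep -Eq '"path"[[:space:]]*:[[:space:]]*"[^"]*nvidia-container-runtime"' "$1"
}

# On each GPU node:
#   has_nvidia_default_runtime /etc/docker/daemon.json && echo "default runtime is nvidia"
#   has_nvidia_runtime_entry   /etc/docker/daemon.json && echo "nvidia runtime is registered"
```

After the restart, the "Default Runtime" line in the output of docker info should also report nvidia.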

5. Installing Kubernetes

# update package repository
$ apt update && apt upgrade -y

# register the Google k8s package source
$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -

$ cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
#deb https://apt.kubernetes.io/ kubernetes-xenial main
deb https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial main
EOF

$ apt-get update
$ apt install linux-image-extra-virtual ca-certificates curl software-properties-common -y

# install a pinned version and hold it (apt-mark hold takes package names, not versions)
$ apt-get install -y kubelet=1.16.15-00 kubeadm=1.16.15-00 kubectl=1.16.15-00
$ apt-mark hold kubelet kubeadm kubectl

$ systemctl daemon-reload
$ systemctl restart kubelet

Check the installed version

# kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.15", GitCommit:"2adc8d7091e89b6e3ca8d048140618ec89b39369", GitTreeState:"clean", BuildDate:"2020-09-02T11:40:00Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.15", GitCommit:"2adc8d7091e89b6e3ca8d048140618ec89b39369", GitTreeState:"clean", BuildDate:"2020-09-02T11:31:21Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

Disable swap (by default the kubelet refuses to start with swap enabled)

swapoff -a

[Create the cluster]

#master 
sudo kubeadm init --pod-network-cidr=192.168.0.0/16

# node
kubeadm join 10.146.0.10:6443 --token srzpse.0j4lkl0ycz1kmffw \
    --discovery-token-ca-cert-hash sha256:1f4c6eabc7a22699bba990b3461ace3f6082bf88479f877f5ad413f591f612eb
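
The join command above is specific to my cluster; kubeadm init prints the one for yours, and bootstrap tokens expire after 24 hours by default. As a small sketch, the check below validates the shape of a token before it is pasted into a script. The function name is hypothetical; the 6-character ID plus 16-character secret format is the documented kubeadm token format.

```shell
# valid_bootstrap_token TOKEN: succeed if TOKEN looks like a kubeadm bootstrap token,
# i.e. "<6 chars>.<16 chars>" using lowercase letters and digits.
# (hypothetical helper; it checks only the shape, not whether the token is still valid)
valid_bootstrap_token() {
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# On the master, a fresh join command can be printed with:
#   kubeadm token create --print-join-command
```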

Apply the kubeconfig

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
export KUBECONFIG=$HOME/.kube/config
echo 'export KUBECONFIG=$HOME/.kube/config' | tee -a ~/.bashrc
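
Re-running these steps appends the export to ~/.bashrc again each time. A small guard avoids the duplicates; this is only a sketch, and append_once is a name invented here.

```shell
# append_once LINE FILE: append LINE to FILE unless an identical line is already there.
# (hypothetical helper)
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -x matches whole lines only, -F treats the line as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage:
#   append_once 'export KUBECONFIG=$HOME/.kube/config' "$HOME/.bashrc"
```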

Check the cluster info

$ kubectl cluster-info
Kubernetes master is running at https://10.146.0.10:6443
KubeDNS is running at https://10.146.0.10:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

Verify the Kubernetes installation

kubectl only works after the kubeconfig above has been applied.

At this point 'kubectl get nodes' reports the nodes as NotReady, and the coredns Pods stay in Pending. With current Kubernetes versions, a network add-on has to be installed separately.

$ kubectl get pods -A

NAMESPACE     NAME                             READY   STATUS    RESTARTS   AGE
kube-system   coredns-5c98db65d4-5jmmv         0/1     Pending   0          4h38m
kube-system   coredns-5c98db65d4-tqdpj         0/1     Pending   0          4h38m
kube-system   etcd-master                      1/1     Running   0          4h37m
kube-system   kube-apiserver-master            1/1     Running   0          4h38m
kube-system   kube-controller-manager-master   1/1     Running   0          4h38m
kube-system   kube-proxy-n8f76                 1/1     Running   0          4h38m
kube-system   kube-scheduler-master            1/1     Running   0          4h37m
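
On a larger cluster the Pending Pods are easy to miss in the full listing. The filter below is a sketch that prints only the Pods that are not Running or Completed; it assumes the column layout of 'kubectl get pods -A' shown above, and not_running is a hypothetical name.

```shell
# not_running: read `kubectl get pods -A` output on stdin and print
# "namespace/name status" for every Pod that is neither Running nor Completed.
# (hypothetical helper; relies on STATUS being the 4th column)
not_running() {
    awk 'NR > 1 && NF >= 4 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}

# Usage:
#   kubectl get pods -A | not_running
```

An empty result after installing the network add-on means that coredns has left the Pending state.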

Kubernetes lists several networking add-ons to choose from.

kubernetes.io/docs/concepts/cluster-administration/addons/

Many guides on the web use Calico, but I ran into problems while testing several versions of it, so Weave Net is used here.

kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

If you prefer Calico, apply the manifest below instead (do not install both).

# kubectl apply -f https://docs.projectcalico.org/v3.17/manifests/calico.yaml

Installing the Kubernetes Dashboard

kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0/aio/deploy/recommended.yaml

Check the Kubernetes Pods

kubectl get pods --all-namespaces

6. Installing the Kubernetes device plugin

To use GPUs (NVIDIA devices) from Kubernetes, the nvidia-device-plugin must be installed.

$ kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.12/nvidia-device-plugin.yml

error: unable to recognize "https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.12/nvidia-device-plugin.yml": no matches for kind "DaemonSet" in version "extensions/v1beta1"

Kubernetes no longer serves DaemonSet under extensions/v1beta1 (it was removed in v1.16). Change the apiVersion to apps/v1 and add a selector, which apps/v1 requires. The modified manifest:

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-device-plugin-daemonset-1.12
  namespace: kube-system
spec:
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      name: nvidia-device-plugin-ds
  template:
    metadata:
      # Mark this pod as a critical add-on; when enabled, the critical add-on scheduler
      # reserves resources for critical add-on pods so that they can be rescheduled after
      # a failure.  This annotation works in tandem with the toleration below.
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ""
      labels:
        name: nvidia-device-plugin-ds
    spec:
      tolerations:
      # Allow this pod to be rescheduled while the node is in "critical add-ons only" mode.
      # This, along with the annotation above marks this pod as a critical add-on.
      - key: CriticalAddonsOnly
        operator: Exists
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule
      containers:
      - image: nvidia/k8s-device-plugin:1.11
        name: nvidia-device-plugin-ctr
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["ALL"]
        volumeMounts:
          - name: device-plugin
            mountPath: /var/lib/kubelet/device-plugins
      volumes:
        - name: device-plugin
          hostPath:
            path: /var/lib/kubelet/device-plugins
EOF

Check that the device-plugin Pods are running.

$ kubectl -n kube-system get pod -l name=nvidia-device-plugin-ds
NAME                                        READY   STATUS    RESTARTS   AGE
nvidia-device-plugin-daemonset-1.12-mrml7   1/1     Running   0          36s
nvidia-device-plugin-daemonset-1.12-pdwsg   1/1     Running   0          36s
nvidia-device-plugin-daemonset-1.12-qswmg   1/1     Running   0          36s
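
With the plugin running, a throwaway Pod that requests one GPU gives an end-to-end check. This is a sketch only: the Pod name gpu-smoke-test and the wrapper function are made up, and the image tag is simply the one used in the Docker check earlier in this post, so it may need updating.

```shell
# gpu_test_pod: print a manifest for a one-shot Pod that requests a single GPU
# and runs nvidia-smi. (hypothetical name; image tag reused from the Docker test above)
gpu_test_pod() {
    cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:10.0-base
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
EOF
}

# Usage:
#   gpu_test_pod | kubectl apply -f -
#   kubectl logs gpu-smoke-test
#   kubectl delete pod gpu-smoke-test
```

If the Pod stays Pending, the scheduler found no node with an allocatable nvidia.com/gpu resource.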

7. Installing Kubeflow

Storage has to be set up before installing Kubeflow.

1. Installing a Kubernetes StorageClass

NFS (Network File System) is installed so that PVs can be provisioned.

[Master Node]

# Install the NFS server on the master node (the master doubles as the NFS server)
$ sudo apt install -y nfs-common nfs-kernel-server portmap
$ sudo mkdir /nfs      # used as the storage directory
$ sudo chmod 777 /nfs
# use sudo tee: with "sudo cat > file" the redirect runs as the calling user
$ sudo tee /etc/exports > /dev/null << EOF
/nfs 10.146.0.10(rw,sync,no_root_squash,no_subtree_check) # master node internal IP
/nfs 10.146.0.8(rw,sync,no_root_squash,no_subtree_check) # worker node 1 internal IP
/nfs 10.146.0.9(rw,sync,no_root_squash,no_subtree_check) # worker node 2 internal IP
EOF
$ sudo /etc/init.d/nfs-kernel-server restart  # restart the server
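
With more workers, typing one export line per node gets tedious. The sketch below generates the lines from a list of IPs, using the same export options as above; make_exports is a hypothetical helper name.

```shell
# make_exports DIR IP...: print one /etc/exports line per client IP,
# with the same options used in this post. (hypothetical helper)
make_exports() {
    dir="$1"
    shift
    for ip in "$@"; do
        printf '%s %s(rw,sync,no_root_squash,no_subtree_check)\n' "$dir" "$ip"
    done
}

# Usage on the master:
#   make_exports /nfs 10.146.0.10 10.146.0.8 10.146.0.9 | sudo tee /etc/exports > /dev/null
#   sudo exportfs -ra        # reload the export table
#   showmount -e localhost   # list what is currently exported
```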

[Worker Node]

# Install the NFS client on the worker nodes (both of them)
$ sudo apt install nfs-common
$ sudo mkdir /nfs
$ sudo chmod 777 /nfs

2. Installing the nfs-client StorageClass

[Master Node]

$ sudo curl https://raw.githubusercontent.com/helm/helm/master/scripts/get > get_helm.sh
$ sudo chmod 700 get_helm.sh
$ sudo ./get_helm.sh
 
$ sudo kubectl -n kube-system create sa tiller
$ sudo kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
$ sudo helm init --service-account tiller
$ sudo helm repo update
 
$ helm repo add stable https://charts.helm.sh/stable
$ helm repo update

# address of the master node
$ helm install --name nfs-client-provisioner --set nfs.server=10.146.0.10 --set nfs.path=/nfs stable/nfs-client-provisioner 
$ kubectl patch storageclass nfs-client -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

$ kubectl get sc
NAME                   PROVISIONER                            AGE
nfs-client (default)   cluster.local/nfs-client-provisioner   42s
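
The PVCs listed at the end of this post are all bound through nfs-client, so it is worth confirming the default class before running kfctl. A sketch that reads the 'kubectl get sc' output shown above; default_storage_class is a hypothetical name and the parsing assumes that output format.

```shell
# default_storage_class: read `kubectl get sc` output on stdin and print the
# name of the class marked "(default)". (hypothetical helper)
default_storage_class() {
    awk 'NR > 1 && $2 == "(default)" { print $1 }'
}

# Usage:
#   [ "$(kubectl get sc | default_storage_class)" = "nfs-client" ] || echo "no default nfs-client class"
```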

3. Deploying a private registry

# deploy the private registry
wget https://raw.githubusercontent.com/mojokb/handson-kubeflow/master/registry/kubeflow-registry-deploy.yaml

Newer Kubernetes versions no longer support the extensions/v1beta1 API version, so edit the downloaded manifest to use apps/v1 before applying it.

kubectl apply -f kubeflow-registry-deploy.yaml

Create the Service

kubectl apply -f https://raw.githubusercontent.com/mojokb/handson-kubeflow/master/registry/kubeflow-registry-svc.yaml

# Register the private registry in /etc/hosts
Add the master node's address to /etc/hosts so that the private registry name can be resolved when Kubernetes creates Pods.

[run on the master and all workers]

cat << EO_HOSTS >> /etc/hosts
10.146.0.10    kubeflow-registry.default.svc.cluster.local
EO_HOSTS
cat /etc/hosts

4. Installing Kubeflow

The latest version is 1.2, but v1.1 is installed here for stability.

www.kubeflow.org/docs/started/k8s/overview/

export KF_HOME=~/kubeflow
export KF_NAME=sds-kubeflow
rm -rf ${KF_HOME}
mkdir -p $KF_HOME
cd $KF_HOME
rm -f ./kfctl*
wget https://github.com/kubeflow/kfctl/releases/download/v1.1.0/kfctl_v1.1.0-0-g9a3621e_linux.tar.gz
tar -xvf kfctl_*.tar.gz
export PATH=$PATH:$KF_HOME
export KF_DIR=${KF_HOME}/${KF_NAME}
#export CONFIG_URI=https://github.com/kubeflow/manifests/raw/master/kfdef/kfctl_k8s_istio.v1.1.0.yaml
export CONFIG_URI=https://github.com/kubeflow/manifests/raw/master/distributions/kfdef/kfctl_k8s_istio.v1.1.0.yaml
mkdir -p ${KF_DIR}
cd ${KF_DIR}
kfctl apply -V -f ${CONFIG_URI}


kubectl get pods -A

To install the dex-enabled variant of Kubeflow:

export CONFIG_URI=https://raw.githubusercontent.com/kubeflow/manifests/v1.1-branch/kfdef/kfctl_istio_dex.v1.1.0.yaml

Use this CONFIG_URI in place of the one above and apply.

5. Enabling the istio token

Kubeflow uses istio for authentication and authorization, so the istio components are installed into the istio-system namespace. Without the change below, the installation stalls with istio-token related errors, so the feature has to be enabled by adding the following flags.

Edit /etc/kubernetes/manifests/kube-apiserver.yaml

$ sudo vi /etc/kubernetes/manifests/kube-apiserver.yaml
...
        - --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
        - --service-account-issuer=api
        - --service-account-api-audiences=api,vault
        
...

Once the manifest file is edited as above, the kube-apiserver Pod restarts automatically.
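
A typo in this manifest can keep the API server from coming back, so a quick check that all three flags are present is cheap insurance. This is a sketch; missing_apiserver_flags is a hypothetical helper that only greps for the flag names listed above.

```shell
# missing_apiserver_flags FILE: print each of the three service-account flags
# from this section that does not appear in FILE. (hypothetical helper)
missing_apiserver_flags() {
    file="$1"
    for flag in \
        --service-account-signing-key-file \
        --service-account-issuer \
        --service-account-api-audiences
    do
        grep -q -- "$flag=" "$file" || printf '%s\n' "$flag"
    done
}

# Usage on the master:
#   sudo cat /etc/kubernetes/manifests/kube-apiserver.yaml > /tmp/apiserver.yaml
#   missing_apiserver_flags /tmp/apiserver.yaml
```

No output means all three flags were found.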

# check the installation
$ kubectl get pods -n kubeflow -o wide

To check that the GPUs are exposed correctly, query the nodes as follows.

kubectl get nodes "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"

If GPU resources are available, the response looks like this.

NAME GPU
mortar 1

If the GPUs are not recognized, it looks like this.

# kubectl get nodes "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"
NAME            GPU
a20-01-master   <none>
a20-01-n1       <none>
a20-01-n2       <none>

When they are recognized correctly:

kubectl get nodes "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"
NAME      GPU
gpu-mst   <none>
gpu-n1    1
gpu-n2    1
root@gpu-mst:~#
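
To use this check in a script, the node names with at least one allocatable GPU can be filtered out of that output. A sketch, assuming the two-column NAME/GPU layout shown above; gpu_nodes is a hypothetical name.

```shell
# gpu_nodes: read the custom-columns output (NAME, GPU) on stdin and print
# the names of nodes whose GPU column is a positive number. (hypothetical helper)
gpu_nodes() {
    awk 'NR > 1 && $2 ~ /^[0-9]+$/ && $2 > 0 { print $1 }'
}

# Usage:
#   kubectl get nodes "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu" | gpu_nodes
```

Nodes that show <none> are skipped, so an empty result means the device plugin has not registered any GPU.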

If no Pod is stuck in Pending or in an error state, open the Kubeflow port to confirm that it is running.

http://{server IP}:31380/

admin@kubeflow.org:12341234

www.kubeflow.org/docs/started/k8s/kfctl-istio-dex/

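If the Kubeflow screen appears, the installation succeeded.

Kubeflow installed this way supports multiple users. User authentication is handled by dex, an identity service that supports OpenID Connect and can integrate with several authentication backends such as LDAP, GitHub, and SAML.

dex keeps its user information directly in a file, and that file is stored in a ConfigMap.

dex is installed in the Kubernetes namespace named auth. The ConfigMap named dex holds its configuration.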

$ kubectl -n auth get cm dex -o yaml
apiVersion: v1
data:
  config.yaml: |
    issuer: http://dex.auth.svc.cluster.local:5556/dex
    storage:
      type: kubernetes
      config:
        inCluster: true
    web:
      http: 0.0.0.0:5556
    logger:
      level: "debug"
      format: text
    oauth2:
      skipApprovalScreen: true
    enablePasswordDB: true
    staticPasswords:
    - email: admin@kubeflow.org
      hash: $2y$12$ruoM7FqXrpVgaol44eRZW.4HWS8SAvg6KYVVSCIwKQPBmTpCm.EeO
      username: admin
      userID: 08a8684b-db88-4b73-90a9-3cd1661f5466
...
kind: ConfigMap
metadata:
  name: dex
  namespace: auth

The default account has the email "admin@kubeflow.org" and the password "12341234".

A password hash can be generated in several ways.

If you use Python, the bcrypt library works.

The following example installs the bcrypt package and generates the hash of "PASSWORD".

pip install bcrypt
python -c 'import bcrypt; print(bcrypt.hashpw(b"PASSWORD", bcrypt.gensalt(rounds=10)).decode("ascii"))'
$2b$10$y0qsW5zqhKobi4rsNMxqceG5zFqop27Z3wgdF/wjmmlF0ib53xwTS
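
A bcrypt hash is only usable when the whole string is copied, including the leading prefix such as $2y$ or $2b$ and the two-digit cost. The check below is a sketch for catching a truncated copy before it goes into the ConfigMap; is_bcrypt_hash is a hypothetical name.

```shell
# is_bcrypt_hash STRING: succeed if STRING has the shape of a complete bcrypt hash:
# "$2a$", "$2b$" or "$2y$", a two-digit cost, "$", then 53 salt-and-hash characters.
# (hypothetical helper; it checks the format only, not the password)
is_bcrypt_hash() {
    printf '%s\n' "$1" | grep -Eq '^\$2[aby]\$[0-9]{2}\$[./A-Za-z0-9]{53}$'
}

# Usage (single quotes keep the shell from expanding the $ signs):
#   is_bcrypt_hash '<hash>' || echo "hash looks truncated"
#
# Without Python, htpasswd from apache2-utils can also emit a bcrypt hash:
#   htpasswd -bnBC 10 "" PASSWORD | tr -d ':\n'
```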

Now add a new user to dex.

The change is small, so edit the ConfigMap directly.

$ kubectl -n auth edit cm dex

    staticPasswords:
    - email: admin@kubeflow.org
      hash: $2y$12$ruoM7FqXrpVgaol44eRZW.4HWS8SAvg6KYVVSCIwKQPBmTpCm.EeO
      username: admin
      userID: 08a8684b-db88-4b73-90a9-3cd1661f5466
    - email: user@gmail.com
      hash: $2b$10$y0qsW5zqhKobi4rsNMxqceG5zFqop27Z3wgdF/wjmmlF0ib53xwTS
      username: user
      userID: user

Restart dex so that the changed configuration takes effect.

kubectl -n auth rollout restart deployment dex

Check the PVC status

# kubectl -n kubeflow get pvc
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
katib-mysql      Bound    pvc-a0d87ee7-837a-49db-b6c2-26d8b0ed12e5   10Gi       RWO            nfs-client     64m
metadata-mysql   Bound    pvc-4ed336cd-e374-4419-bbcf-369780972353   10Gi       RWO            nfs-client     64m
minio-pvc        Bound    pvc-634c8bf6-6a14-458a-a54f-e9c030726290   20Gi       RWO            nfs-client     64m
mysql-pv-claim   Bound    pvc-609c1b3c-941d-48fe-a57a-7c3e30bd103f   20Gi       RWO            nfs-client     64m
