15. Cluster Maintenance and Security Troubleshooting (1)
This post was written while taking the Inflearn course "Kubernetes Master for DevOps (데브옵스를 위한 쿠버네티스 마스터)". My goal is to summarize the lectures as briefly as possible, so if you want a friendlier and more thorough explanation, I recommend purchasing the course. => Course link
What if you need to update a single node?
This section is done on GKE. While operating a Kubernetes cluster, you occasionally have to take running nodes down one at a time, update them (an OS update, a Kubernetes version upgrade, and so on), and bring them back up. The procedure is as follows (a command sketch follows the list).
- Drain the node to be updated (evict the resources running on it)
- Update the node
- Uncordon the updated node (let it rejoin the cluster for scheduling)
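As a quick reference, a minimal sketch of the whole sequence might look like this (the node name is a placeholder, and --ignore-daemonsets is usually required on managed clusters such as GKE):
# 1. evict workloads and mark the node unschedulable
$ kubectl drain <node-name> --ignore-daemonsets
# 2. update the node (OS patch, kubelet/Kubernetes upgrade, ...)
# 3. allow the node to receive Pods again
$ kubectl uncordon <node-name>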
Let's check the node information.
$ kubectl get nodes
gke-cluster-1-default-pool-66348328-7xbq Ready <none> 7m47s v1.20.8-gke.900
gke-cluster-1-default-pool-66348328-r8j5 Ready <none> 7m47s v1.20.8-gke.900
gke-cluster-1-default-pool-66348328-z61k Ready <none> 7m48s v1.20.8-gke.900
First, let's create a batch of resources and spread them across the nodes.
$ kubectl create deployment nginx --image=nginx --replicas=15
deployment.apps/nginx created
Now let's see how the 15 Pods are placed.
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-6799fc88d8-2747x 1/1 Running 0 19s 10.32.2.8 gke-cluster-1-default-pool-66348328-z61k <none> <none>
nginx-6799fc88d8-2vlh8 1/1 Running 0 19s 10.32.2.9 gke-cluster-1-default-pool-66348328-z61k <none> <none>
nginx-6799fc88d8-654r7 1/1 Running 0 19s 10.32.0.8 gke-cluster-1-default-pool-66348328-7xbq <none> <none>
nginx-6799fc88d8-6872f 1/1 Running 0 19s 10.32.1.7 gke-cluster-1-default-pool-66348328-r8j5 <none> <none>
nginx-6799fc88d8-6mljv 1/1 Running 0 19s 10.32.1.6 gke-cluster-1-default-pool-66348328-r8j5 <none> <none>
nginx-6799fc88d8-8h8f6 1/1 Running 0 19s 10.32.0.10 gke-cluster-1-default-pool-66348328-7xbq <none> <none>
nginx-6799fc88d8-jc89p 1/1 Running 0 19s 10.32.2.6 gke-cluster-1-default-pool-66348328-z61k <none> <none>
nginx-6799fc88d8-knkcz 1/1 Running 0 19s 10.32.2.7 gke-cluster-1-default-pool-66348328-z61k <none> <none>
nginx-6799fc88d8-m7jpb 1/1 Running 0 19s 10.32.1.5 gke-cluster-1-default-pool-66348328-r8j5 <none> <none>
nginx-6799fc88d8-nxnb8 1/1 Running 0 19s 10.32.0.11 gke-cluster-1-default-pool-66348328-7xbq <none> <none>
nginx-6799fc88d8-p8mc8 1/1 Running 0 20s 10.32.2.5 gke-cluster-1-default-pool-66348328-z61k <none> <none>
nginx-6799fc88d8-qqm54 1/1 Running 0 19s 10.32.0.9 gke-cluster-1-default-pool-66348328-7xbq <none> <none>
nginx-6799fc88d8-tlmdl 1/1 Running 0 19s 10.32.1.8 gke-cluster-1-default-pool-66348328-r8j5 <none> <none>
nginx-6799fc88d8-wkxkq 1/1 Running 0 19s 10.32.0.7 gke-cluster-1-default-pool-66348328-7xbq <none> <none>
nginx-6799fc88d8-zrkft 1/1 Running 0 19s 10.32.1.4 gke-cluster-1-default-pool-66348328-r8j5 <none> <none>
To summarize, the Pods end up distributed per node as follows (a one-liner for counting them is sketched after the list).
- Node gke-cluster-1-default-pool-66348328-7xbq: 5 Pods
- Node gke-cluster-1-default-pool-66348328-r8j5: 5 Pods
- Node gke-cluster-1-default-pool-66348328-z61k: 5 Pods
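If you would rather not count the rows by hand, a one-liner like the following (my own addition, not from the lecture) tallies Pods per node from the same output; the NODE column is field 7 of kubectl get pod -o wide:
# count Pods per node
$ kubectl get pod -o wide --no-headers | awk '{print $7}' | sort | uniq -c
      5 gke-cluster-1-default-pool-66348328-7xbq
      5 gke-cluster-1-default-pool-66348328-r8j5
      5 gke-cluster-1-default-pool-66348328-z61k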
Now suppose node gke-cluster-1-default-pool-66348328-z61k needs to be taken down for a version update. With the drain command we can evict its resources so they get rescheduled onto the remaining nodes.
$ kubectl drain gke-cluster-1-default-pool-66348328-z61k
node/gke-cluster-1-default-pool-66348328-z61k cordoned
DEPRECATED WARNING: Aborting the drain command in a list of nodes will be deprecated in v1.23. The new behavior will make the drain command go through all nodes even if one or more nodes failed during the drain. For now, users can try such experience via: --ignore-errors
error: unable to drain node "gke-cluster-1-default-pool-66348328-z61k", aborting command...
There are pending nodes to be drained:
gke-cluster-1-default-pool-66348328-z61k
error: cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): kube-system/fluentbit-gke-hk644, kube-system/gke-metrics-agent-4wmhh, kube-system/pdcsi-node-kxvll
An error came up. Because DaemonSet-managed Pods are running on the node, evicting them could affect the cluster, so drain aborted after only cordoning the node (making it unschedulable). To get past this error, add the --ignore-daemonsets option.
$ kubectl drain gke-cluster-1-default-pool-66348328-z61k --ignore-daemonsets
node/gke-cluster-1-default-pool-66348328-z61k already cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/fluentbit-gke-hk644, kube-system/gke-metrics-agent-4wmhh, kube-system/pdcsi-node-kxvll
evicting pod kube-system/stackdriver-metadata-agent-cluster-level-86bfdb4cfc-6f9qx
evicting pod default/nginx-6799fc88d8-p8mc8
evicting pod default/nginx-6799fc88d8-knkcz
evicting pod default/nginx-6799fc88d8-2vlh8
evicting pod kube-system/metrics-server-v0.3.6-9c5bbf784-fdmr9
evicting pod default/nginx-6799fc88d8-2747x
evicting pod default/nginx-6799fc88d8-jc89p
pod/nginx-6799fc88d8-2vlh8 evicted
pod/stackdriver-metadata-agent-cluster-level-86bfdb4cfc-6f9qx evicted
pod/nginx-6799fc88d8-jc89p evicted
pod/nginx-6799fc88d8-2747x evicted
pod/nginx-6799fc88d8-p8mc8 evicted
pod/metrics-server-v0.3.6-9c5bbf784-fdmr9 evicted
pod/nginx-6799fc88d8-knkcz evicted
node/gke-cluster-1-default-pool-66348328-z61k evicted
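A note of my own, not from the lecture: drain is essentially cordon (mark the node unschedulable) plus eviction, which is why the failed attempt above still left the node cordoned. The lighter-weight pair by itself looks like this:
# only mark the node unschedulable; Pods already running there stay put
$ kubectl cordon gke-cluster-1-default-pool-66348328-z61k
# undo the cordon
$ kubectl uncordon gke-cluster-1-default-pool-66348328-z61k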
This evicts every resource on the node and reschedules it onto the other nodes. Let's check the nodes.
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-cluster-1-default-pool-66348328-7xbq Ready <none> 19m v1.20.8-gke.900
gke-cluster-1-default-pool-66348328-r8j5 Ready <none> 19m v1.20.8-gke.900
gke-cluster-1-default-pool-66348328-z61k Ready,SchedulingDisabled <none> 19m v1.20.8-gke.900
Now let's see how the Pods have been redistributed.
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-6799fc88d8-4t98q 1/1 Running 0 92s 10.32.1.11 gke-cluster-1-default-pool-66348328-r8j5 <none> <none>
nginx-6799fc88d8-5fqmq 1/1 Running 0 92s 10.32.1.9 gke-cluster-1-default-pool-66348328-r8j5 <none> <none>
nginx-6799fc88d8-654r7 1/1 Running 0 9m47s 10.32.0.8 gke-cluster-1-default-pool-66348328-7xbq <none> <none>
nginx-6799fc88d8-6872f 1/1 Running 0 9m47s 10.32.1.7 gke-cluster-1-default-pool-66348328-r8j5 <none> <none>
nginx-6799fc88d8-6mljv 1/1 Running 0 9m47s 10.32.1.6 gke-cluster-1-default-pool-66348328-r8j5 <none> <none>
nginx-6799fc88d8-8h8f6 1/1 Running 0 9m47s 10.32.0.10 gke-cluster-1-default-pool-66348328-7xbq <none> <none>
nginx-6799fc88d8-gbpz8 1/1 Running 0 92s 10.32.0.12 gke-cluster-1-default-pool-66348328-7xbq <none> <none>
nginx-6799fc88d8-m7jpb 1/1 Running 0 9m47s 10.32.1.5 gke-cluster-1-default-pool-66348328-r8j5 <none> <none>
nginx-6799fc88d8-mkjjn 1/1 Running 0 91s 10.32.0.13 gke-cluster-1-default-pool-66348328-7xbq <none> <none>
nginx-6799fc88d8-nxnb8 1/1 Running 0 9m47s 10.32.0.11 gke-cluster-1-default-pool-66348328-7xbq <none> <none>
nginx-6799fc88d8-qqm54 1/1 Running 0 9m47s 10.32.0.9 gke-cluster-1-default-pool-66348328-7xbq <none> <none>
nginx-6799fc88d8-sl92n 1/1 Running 0 91s 10.32.1.12 gke-cluster-1-default-pool-66348328-r8j5 <none> <none>
nginx-6799fc88d8-tlmdl 1/1 Running 0 9m47s 10.32.1.8 gke-cluster-1-default-pool-66348328-r8j5 <none> <none>
nginx-6799fc88d8-wkxkq 1/1 Running 0 9m47s 10.32.0.7 gke-cluster-1-default-pool-66348328-7xbq <none> <none>
nginx-6799fc88d8-zrkft 1/1 Running 0 9m47s 10.32.1.4 gke-cluster-1-default-pool-66348328-r8j5 <none> <none>
To summarize, the Pods per node are now distributed as follows.
- Node gke-cluster-1-default-pool-66348328-7xbq: 7 Pods
- Node gke-cluster-1-default-pool-66348328-r8j5: 8 Pods
- Node gke-cluster-1-default-pool-66348328-z61k: 0 Pods
Now assume the node has been updated. How do we bring it back? Use the uncordon command.
$ kubectl uncordon gke-cluster-1-default-pool-66348328-z61k
node/gke-cluster-1-default-pool-66348328-z61k uncordoned
Let's check the nodes again.
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-cluster-1-default-pool-66348328-7xbq Ready <none> 20m v1.20.8-gke.900
gke-cluster-1-default-pool-66348328-r8j5 Ready <none> 20m v1.20.8-gke.900
gke-cluster-1-default-pool-66348328-z61k Ready <none> 20m v1.20.8-gke.900
With that, the node has been restored to normal operation.
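One caveat worth knowing (my addition, not from the lecture): uncordon only makes the node schedulable again; the Pods that were evicted do not move back on their own. If you want the nginx Pods spread over all three nodes again, one option is to restart the Deployment so the scheduler can place the recreated Pods on the restored node as well:
# recreate the nginx Pods under the existing Deployment
$ kubectl rollout restart deployment nginx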
Backup and Restore
This section is done on a VM. While operating a Kubernetes cluster, you sometimes have to back up and restore all of your resources. There are three approaches to backup and restore:
1. Using kubectl get all --all-namespaces -o yaml > backup.yaml
2. Using etcd
3. Using Persistent Volumes
Back up in the order 1 -> 2 -> 3 and restore in the order 3 -> 2 -> 1. We will not cover step 3 here. First, create a directory for the backups.
# backup directory
$ mkdir -p ~/k8s-master/backup
Let's dump all resources to a yaml file as a backup. Enter the following commands.
$ kubectl get all --all-namespaces -o yaml > backup.yaml
$ mv backup.yaml ~/k8s-master/backup
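Note that kubectl get all does not literally return every object; ConfigMaps, Secrets, PersistentVolumeClaims, and several other types are skipped. A rough sketch of a broader dump (the resource list here is my own choice, adjust it to your cluster) could be:
# also capture a few resource types that "get all" leaves out
$ kubectl get configmaps,secrets,pvc,serviceaccounts --all-namespaces -o yaml > backup-extra.yaml
$ mv backup-extra.yaml ~/k8s-master/backup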
Now back up the data stored in etcd. Enter the following commands.
# etcd directory
$ cd ~/k8s-master/etcd/
$ sudo ETCDCTL_API=3 ./etcdctl --endpoints 127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key snapshot save snapshot.db
{"level":"info","ts":1630498443.8052633,"caller":"snapshot/v3_snapshot.go:68","msg":"created temporary db file","path":"snapshot.db.part"}
{"level":"info","ts":1630498443.8163116,"logger":"client","caller":"v3/maintenance.go:211","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":1630498443.8167007,"caller":"snapshot/v3_snapshot.go:76","msg":"fetching snapshot","endpoint":"127.0.0.1:2379"}
{"level":"info","ts":1630498443.9060197,"logger":"client","caller":"v3/maintenance.go:219","msg":"completed snapshot read; closing"}
{"level":"info","ts":1630498443.9323158,"caller":"snapshot/v3_snapshot.go:91","msg":"fetched snapshot","endpoint":"127.0.0.1:2379","size":"5.4 MB","took":"now"}
{"level":"info","ts":1630498443.9324648,"caller":"snapshot/v3_snapshot.go:100","msg":"saved","path":"snapshot.db"}
Snapshot saved at snapshot.db
You can get a quick summary of the saved snapshot with the following command.
$ sudo ./etcdutl snapshot status snapshot.db --write-out=table
+----------+----------+------------+------------+
|   HASH   | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| 7253ecf8 |    79013 |       2501 |     5.4 MB |
+----------+----------+------------+------------+
Now move this snapshot.db into the backup directory as well.
$ sudo mv snapshot.db ~/k8s-master/backup
The backup step is now complete. Delete all resources.
$ kubectl delete all --all
pod "nginx-6799fc88d8-8wt5d" deleted
pod "nginx-6799fc88d8-bdjm9" deleted
pod "nginx-6799fc88d8-cb55f" deleted
pod "nginx-6799fc88d8-cxxnd" deleted
pod "nginx-6799fc88d8-g68zc" deleted
pod "nginx-6799fc88d8-gkprf" deleted
pod "nginx-6799fc88d8-pcc5d" deleted
pod "nginx-6799fc88d8-v2ktn" deleted
pod "nginx-6799fc88d8-wkmkq" deleted
pod "nginx-6799fc88d8-zwthq" deleted
service "kubernetes" deleted
deployment.apps "nginx" deleted
replicaset.apps "nginx-6799fc88d8" deleted
Now move back to the etcd directory and restore the data into the /var/lib/etcd-restore path.
$ cd ~/k8s-master/etcd/
$ sudo ETCDCTL_API=3 ./etcdctl --endpoints 127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key --data-dir=/var/lib/etcd-restore snapshot restore ~/k8s-master/backup/snapshot.db
Deprecated: Use `etcdutl snapshot restore` instead.
2021-09-01T21:17:42+09:00 info snapshot/v3_snapshot.go:251 restoring snapshot {"path": "/home/gurumee/k8s-master/backup/snapshot.db", "wal-dir": "/var/lib/etcd-restore/member/wal", "data-dir": "/var/lib/etcd-restore", "snap-dir": "/var/lib/etcd-restore/member/snap", "stack": "go.etcd.io/etcd/etcdutl/v3/snapshot.(*v3Manager).Restore\n\t/tmp/etcd-release-3.5.0/etcd/release/etcd/etcdutl/snapshot/v3_snapshot.go:257\ngo.etcd.io/etcd/etcdutl/v3/etcdutl.SnapshotRestoreCommandFunc\n\t/tmp/etcd-release-3.5.0/etcd/release/etcd/etcdutl/etcdutl/snapshot_command.go:147\ngo.etcd.io/etcd/etcdctl/v3/ctlv3/command.snapshotRestoreCommandFunc\n\t/tmp/etcd-release-3.5.0/etcd/release/etcd/etcdctl/ctlv3/command/snapshot_command.go:128\ngithub.com/spf13/cobra.(*Command).execute\n\t/home/remote/sbatsche/.gvm/pkgsets/go1.16.3/global/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/home/remote/sbatsche/.gvm/pkgsets/go1.16.3/global/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:960\ngithub.com/spf13/cobra.(*Command).Execute\n\t/home/remote/sbatsche/.gvm/pkgsets/go1.16.3/global/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:897\ngo.etcd.io/etcd/etcdctl/v3/ctlv3.Start\n\t/tmp/etcd-release-3.5.0/etcd/release/etcd/etcdctl/ctlv3/ctl.go:107\ngo.etcd.io/etcd/etcdctl/v3/ctlv3.MustStart\n\t/tmp/etcd-release-3.5.0/etcd/release/etcd/etcdctl/ctlv3/ctl.go:111\nmain.main\n\t/tmp/etcd-release-3.5.0/etcd/release/etcd/etcdctl/main.go:59\nruntime.main\n\t/home/remote/sbatsche/.gvm/gos/go1.16.3/src/runtime/proc.go:225"}
2021-09-01T21:17:42+09:00 info membership/store.go:119 Trimming membership information from the backend...
2021-09-01T21:17:42+09:00 info membership/cluster.go:393 added member{"cluster-id": "cdf818194e3a8c32", "local-member-id": "0", "added-peer-id": "8e9e05c52164694d", "added-peer-peer-urls": ["http://localhost:2380"]}
2021-09-01T21:17:42+09:00 info snapshot/v3_snapshot.go:272 restored snapshot {"path": "/home/gurumee/k8s-master/backup/snapshot.db", "wal-dir": "/var/lib/etcd-restore/member/wal", "data-dir": "/var/lib/etcd-restore", "snap-dir": "/var/lib/etcd-restore/member/snap"}
To confirm the restore worked, enter the following command.
$ sudo ls /var/lib/etcd-restore
member
If you see the output above, it succeeded. Next, the etcd static Pod has to be modified. Currently etcd points its data directory at /var/lib/etcd, and this has to be changed to /var/lib/etcd-restore. Enter the following in the terminal.
$ sudo vim /etc/kubernetes/manifests/etcd.yaml
In the manifest, change every place where /var/lib/etcd is assigned so that it reads /var/lib/etcd-restore (the original post shows this in a screenshot). If you issue a kubectl command immediately after the change, it will not work because etcd is not up yet. After a short while you can run commands again.
$ kubectl get pod
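For reference, a typical kubeadm-generated etcd.yaml references /var/lib/etcd in roughly three places: the --data-dir flag, the etcd-data volumeMount, and the etcd-data hostPath volume. Exact field names can differ by version, so treat the following as a sketch and check with grep first:
# list every line that still points at the old data directory
$ sudo grep -n "/var/lib/etcd" /etc/kubernetes/manifests/etcd.yaml
# rewrite only values that end exactly with /var/lib/etcd
$ sudo sed -i 's|/var/lib/etcd$|/var/lib/etcd-restore|' /etc/kubernetes/manifests/etcd.yaml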
Hmm... perhaps because no token was passed, the resources were not restored here, unlike in the instructor's demo. That's fine, we have another option. Enter the following in the terminal.
$ kubectl create -f ~/k8s-master/backup/backup.yaml
pod/nginx-6799fc88d8-8wt5d created
pod/nginx-6799fc88d8-bdjm9 created
pod/nginx-6799fc88d8-cb55f created
pod/nginx-6799fc88d8-cxxnd created
pod/nginx-6799fc88d8-g68zc created
pod/nginx-6799fc88d8-gkprf created
pod/nginx-6799fc88d8-pcc5d created
pod/nginx-6799fc88d8-v2ktn created
pod/nginx-6799fc88d8-wkmkq created
pod/nginx-6799fc88d8-zwthq created
deployment.apps/nginx created
Error from server (ServerTimeout): The POST operation against Pod could not be completed at this time, please try again.
Error from server (ServerTimeout): The POST operation against Pod could not be completed at this time, please try again.
Error from server (AlreadyExists): pods "etcd-master" already exists
Error from server (AlreadyExists): pods "kube-apiserver-master" already exists
Error from server (AlreadyExists): pods "kube-controller-manager-master" already exists
Error from server (ServerTimeout): The POST operation against Pod could not be completed at this time, please try again.
Error from server (ServerTimeout): The POST operation against Pod could not be completed at this time, please try again.
Error from server (ServerTimeout): The POST operation against Pod could not be completed at this time, please try again.
Error from server (AlreadyExists): pods "kube-scheduler-master" already exists
Error from server (ServerTimeout): The POST operation against Pod could not be completed at this time, please try again.
Error from server (ServerTimeout): The POST operation against Pod could not be completed at this time, please try again.
Error from server (ServerTimeout): The POST operation against Pod could not be completed at this time, please try again.
Error from server (ServerTimeout): The POST operation against Pod could not be completed at this time, please try again.
Error from server (ServerTimeout): The POST operation against Pod could not be completed at this time, please try again.
Error from server (ServerTimeout): The POST operation against Pod could not be completed at this time, please try again.
Error from server (Invalid): Service "kubernetes" is invalid: spec.clusterIPs: Invalid value: []string{"10.96.0.1"}: failed to allocated ip:10.96.0.1 with error:provided IP is already allocated
Error from server (Invalid): Service "kube-dns" is invalid: spec.clusterIPs: Invalid value: []string{"10.96.0.10"}: failed to allocated ip:10.96.0.10 with error:provided IP is already allocated
Error from server (Invalid): Service "metrics-server" is invalid: spec.clusterIPs: Invalid value: []string{"10.99.203.219"}: failed to allocated ip:10.99.203.219 with error:provided IP is already allocated
Error from server (Invalid): Service "dashboard-metrics-scraper" is invalid: spec.clusterIPs: Invalid value: []string{"10.105.56.24"}: failed to allocated ip:10.105.56.24 with error:provided IP is already allocated
Error from server (Invalid): Service "kubernetes-dashboard" is invalid: spec.clusterIPs: Invalid value: []string{"10.107.20.201"}: failed to allocated ip:10.107.20.201 with error:provided IP is already allocated
Error from server (AlreadyExists): daemonsets.apps "kube-proxy" already exists
Error from server (AlreadyExists): daemonsets.apps "weave-net" already exists
Error from server (AlreadyExists): deployments.apps "coredns" already exists
Error from server (AlreadyExists): deployments.apps "metrics-server" already exists
Error from server (AlreadyExists): deployments.apps "dashboard-metrics-scraper" already exists
Error from server (AlreadyExists): deployments.apps "kubernetes-dashboard" already exists
Error from server (AlreadyExists): replicasets.apps "nginx-6799fc88d8" already exists
Error from server (AlreadyExists): replicasets.apps "coredns-558bd4d5db" already exists
Error from server (AlreadyExists): replicasets.apps "metrics-server-6dfddc5fb8" already exists
Error from server (AlreadyExists): replicasets.apps "metrics-server-77946fbff9" already exists
Error from server (AlreadyExists): replicasets.apps "dashboard-metrics-scraper-856586f554" already exists
Error from server (AlreadyExists): replicasets.apps "kubernetes-dashboard-67484c44f6" already exists
Now let's check the resources.
$ kubectl get pod --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default nginx-6799fc88d8-6t79v 1/1 Running 0 113s
default nginx-6799fc88d8-bkl4c 1/1 Running 0 113s
default nginx-6799fc88d8-hn8xh 1/1 Running 0 113s
default nginx-6799fc88d8-phhcp 1/1 Running 0 113s
default nginx-6799fc88d8-r8x5h 1/1 Running 0 113s
default nginx-6799fc88d8-t5czq 1/1 Running 0 113s
default nginx-6799fc88d8-vcp9x 1/1 Running 0 113s
default nginx-6799fc88d8-z5dvz 1/1 Running 0 113s
default nginx-6799fc88d8-z5wnq 1/1 Running 0 113s
default nginx-6799fc88d8-zkgqk 1/1 Running 0 113s
kube-system coredns-558bd4d5db-5z4jv 1/1 Running 0 60d
kube-system coredns-558bd4d5db-z45ck 1/1 Running 0 50d
kube-system etcd-master 1/1 Running 1 6m36s
kube-system kube-apiserver-master 1/1 Running 13 13d
kube-system kube-controller-manager-master 1/1 Running 34 60d
kube-system kube-proxy-hvpz5 1/1 Running 1 60d
kube-system kube-proxy-kdtb5 1/1 Running 0 60d
kube-system kube-proxy-qthkm 1/1 Running 0 60d
kube-system kube-scheduler-master 1/1 Running 31 60d
kube-system metrics-server-77946fbff9-pqk2d 1/1 Running 0 9d
kube-system weave-net-46bsq 2/2 Running 1 60d
kube-system weave-net-fgxbv 2/2 Running 1 60d
kube-system weave-net-qr5v6 2/2 Running 4 60d
kubernetes-dashboard dashboard-metrics-scraper-856586f554-9dmfk 1/1 Running 0 8d
kubernetes-dashboard kubernetes-dashboard-67484c44f6-dnn6t 1/1 Running 0 8d
All the resources have been created again.
Creating a user with a static token
This section is done on a VM. There are two main kinds of user-related security resources:
- ServiceAccount
- StaticToken
A ServiceAccount is used when an application running in a Pod needs to talk to the kube-apiserver. A static token lets you create users and control their access. Here we will only set up a simple static token. First, create /etc/kubernetes/pki/token.csv as follows.
password1,user1,uid001,"group1"
password2,user2,uid002
password3,user3,uid003
password4,user4,uid004
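Each row of a static token file is token,user name,user uid, followed by an optional quoted, comma-separated list of groups. One way to write the file (a sketch; sudo tee is simply my choice for writing into /etc/kubernetes/pki) is:
$ sudo tee /etc/kubernetes/pki/token.csv > /dev/null <<'EOF'
password1,user1,uid001,"group1"
password2,user2,uid002
password3,user3,uid003
password4,user4,uid004
EOF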
Next, pass the --token-auth-file option to kube-apiserver.yaml. First, enter the following in the terminal.
$ sudo vim /etc/kubernetes/manifests/kube-apiserver.yaml
Then add the option --token-auth-file=/etc/kubernetes/pki/token.csv to the command line of the kube-apiserver Pod. Right after the change the API server will not respond; wait a little and you can run kubectl commands again.
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
nginx-6799fc88d8-6t79v 1/1 Running 0 13m
nginx-6799fc88d8-bkl4c 1/1 Running 0 13m
nginx-6799fc88d8-hn8xh 1/1 Running 0 13m
nginx-6799fc88d8-phhcp 1/1 Running 0 13m
nginx-6799fc88d8-r8x5h 1/1 Running 0 13m
nginx-6799fc88d8-t5czq 1/1 Running 0 13m
nginx-6799fc88d8-vcp9x 1/1 Running 0 13m
nginx-6799fc88d8-z5dvz 1/1 Running 0 13m
nginx-6799fc88d8-z5wnq 1/1 Running 0 13m
nginx-6799fc88d8-zkgqk 1/1 Running 0 13m
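For orientation, the flag goes into the command list of the kube-apiserver container inside the static Pod manifest; roughly (a sketch, field order and the surrounding flags will differ):
#   spec:
#     containers:
#     - command:
#       - kube-apiserver
#       - --token-auth-file=/etc/kubernetes/pki/token.csv   # newly added line
#       - ... (existing flags)
# verify the flag is in place:
$ sudo grep -n "token-auth-file" /etc/kubernetes/manifests/kube-apiserver.yaml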
Now let's log in as the user "user1". Enter the following in the terminal.
$ kubectl config set-credentials user1 --token=password1
User "user1" set.
After that, we need to specify the cluster and the namespace this user will work against by creating a context.
$ kubectl config set-context user1-context --cluster=kubernetes --namespace=frontend --user=user1
Context "user1-context" created.
Here --cluster=kubernetes refers to the cluster entry named kubernetes in the kubeconfig, i.e. the cluster whose control plane we are working on. Now switch to this context so that kubectl acts as the new user.
$ kubectl config use-context user1-context
Switched to context "user1-context".
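To double-check what has just been configured (my addition, not part of the lecture), kubectl config can list the contexts and show which one is active:
# list all contexts; the active one is marked with *
$ kubectl config get-contexts
# print only the name of the active context
$ kubectl config current-context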
Now let's check the Pods again.
$ kubectl get pod
Error from server (Forbidden): pods is forbidden: User "user1" cannot list resource "pods" in API group "" in the namespace "frontend"
This error appears because "user1" has not been granted any permissions. The context points at the "frontend" namespace, but since no role has been bound to the user for any API group, it cannot access any resources at all.
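Authentication and authorization are separate steps: the static token only proves who user1 is, while RBAC decides what it may do. A hedged sketch of how access could be granted (the role and binding names are mine, and this is not covered in this section):
# from the admin context: create the namespace and let user1 read Pods in it
$ kubectl create namespace frontend
$ kubectl create role pod-reader --verb=get,list,watch --resource=pods -n frontend
$ kubectl create rolebinding user1-pod-reader --role=pod-reader --user=user1 -n frontend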
Now let's switch back to the admin context.
$ kubectl config use-context kubernetes-admin@kubernetes
Then check the resources again.
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
nginx-6799fc88d8-6t79v 1/1 Running 0 21m
nginx-6799fc88d8-bkl4c 1/1 Running 0 21m
nginx-6799fc88d8-hn8xh 1/1 Running 0 21m
nginx-6799fc88d8-phhcp 1/1 Running 0 21m
nginx-6799fc88d8-r8x5h 1/1 Running 0 21m
nginx-6799fc88d8-t5czq 1/1 Running 0 21m
nginx-6799fc88d8-vcp9x 1/1 Running 0 21m
nginx-6799fc88d8-z5dvz 1/1 Running 0 21m
nginx-6799fc88d8-z5wnq 1/1 Running 0 21m
nginx-6799fc88d8-zkgqk 1/1 Running 0 21m