
Notes on building an ES cluster

신수동탈곡기 2022. 5. 26. 13:19

0. Overview


This is less a manual or how-to for building an Elasticsearch cluster than a personal record of how I built one, what mistakes I made along the way, and what I still need to study.

  • node 0: Ubuntu 20.04.4 LTS / elasticsearch 8.2.0
  • node 1: CentOS 7 / elasticsearch 8.2.0
  • node 2: CentOS 7 / elasticsearch 8.2.0

1. Cluster installation


1-1. Download the tar.gz distribution

$ wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.2.0-linux-x86_64.tar.gz
$ tar -zxvf elasticsearch-8.2.0-linux-x86_64.tar.gz

1-2. Configure elasticsearch.yml

# config/elasticsearch.yml

...
cluster.name: "LAB-ES-1"
...
node.name: "lab-ES-0"  # on the other nodes: "lab-ES-1", "lab-ES-2"
...
path.data: ["my/personalized/path"]
...
network.bind_host: ${ES_PRIVATE_IP}
network.publish_host: ${ES_PUBLIC_IP}
...
discovery.seed_hosts: ["node_2_ip", "node_3_ip"]  # on the other nodes: ["node_1_ip", "node_3_ip"], ["node_1_ip", "node_2_ip"]
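
Because the three nodes differ only in node.name and discovery.seed_hosts, it can help to confirm that each node's file really sets the expected keys before starting it. The following is a minimal sketch, not part of the original setup: the helper name missing_es_keys and the list of keys passed to it are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print every key from the argument list that is NOT set
# as an uncommented, top-level "key:" line in the given elasticsearch.yml.
missing_es_keys() {
    config_file=$1
    shift
    for key in "$@"; do
        # Escape the dots so that grep treats them literally.
        pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
        if ! grep -q "^${pattern}:" "$config_file"; then
            printf '%s\n' "$key"
        fi
    done
}

# Example with keys used in this post (adjust the path to your install):
# missing_es_keys config/elasticsearch.yml cluster.name node.name discovery.seed_hosts
```

An empty result means every listed key is present; it says nothing about whether the values are correct.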

1-3. Run

$ ./bin/elasticsearch

 

2-1. Troubleshooting

(vm.max_map_count, xpack.security.transport.ssl.enabled, max file descriptors)


Error Log

$ ./bin/elasticsearch
...
...
...
# on node 0
ERROR: [2] bootstrap checks failed. You must address the points described in the following [2] lines before starting Elasticsearch.
bootstrap check failure [1] of [2]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
bootstrap check failure [2] of [2]: Transport SSL must be enabled if security is enabled. Please set [xpack.security.transport.ssl.enabled] to [true] or disable security by setting [xpack.security.enabled] to [false]
ERROR: Elasticsearch did not exit normally - check the logs at /home/ubuntu/distributed-pipeline/elasticsearch-8.2.0/logs/LAB-ES-1.log
[2022-05-10T11:14:18,620][INFO ][o.e.n.Node               ] [lab-ES-0] stopping ...
[2022-05-10T11:14:18,676][INFO ][o.e.n.Node               ] [lab-ES-0] stopped
[2022-05-10T11:14:18,677][INFO ][o.e.n.Node               ] [lab-ES-0] closing ...
[2022-05-10T11:14:18,686][INFO ][o.e.n.Node               ] [lab-ES-0] closed
[2022-05-10T11:14:18,688][INFO ][o.e.x.m.p.NativeController] [lab-ES-0] Native controller process has stopped - no new native processes can be started
# on nodes 1 and 2
bootstrap check failure [1] of [2]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65535]

Cause of error 1

max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
max file descriptors [4096] for elasticsearch process is too low, increase to at least [65535]

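vm.max_map_count is too low. vm.max_map_count is a kernel parameter exposed as the file /proc/sys/vm/max_map_count, and it sets the maximum number of memory map areas a process may have. (A memory map maps a file into a process's memory, so the program can read or write the file's data through ordinary variables without calling read and write.)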

# ubuntu
$ sudo vi /etc/sysctl.conf # permanent change: add the line below
vm.max_map_count=262144
$ sudo sysctl -w vm.max_map_count=262144 # applies to the running kernel immediately (lost on reboot)
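
To confirm that the change took effect, the live value can be compared with the minimum from the bootstrap check. This is a small sketch under the assumption of a Linux host; the names REQUIRED_MAP_COUNT and map_count_ok are made up for this example.

```shell
#!/bin/sh
# Minimum demanded by the bootstrap check message quoted above.
REQUIRED_MAP_COUNT=262144

# Hypothetical helper: succeeds (exit status 0) when the value meets the minimum.
map_count_ok() {
    [ "$1" -ge "$REQUIRED_MAP_COUNT" ]
}

# Read the live kernel value; this /proc file exists only on Linux.
if [ -r /proc/sys/vm/max_map_count ]; then
    current=$(cat /proc/sys/vm/max_map_count)
    if map_count_ok "$current"; then
        echo "vm.max_map_count=$current (ok)"
    else
        echo "vm.max_map_count=$current (too low, need $REQUIRED_MAP_COUNT)"
    fi
fi
```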

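The file descriptor limit must be raised. (A file descriptor is the handle a process uses to refer to any open file, whatever its type: pipe, FIFO, socket, terminal, device, regular file, and so on.)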

$ sudo vi /etc/security/limits.conf

user    soft    nofile  65536 -> edit this line
user    hard    nofile  65536 -> edit this line
user    soft    nproc   65536 -> edit this line
user    hard    nproc   65536 -> edit this line
user    soft    memlock 65536 -> edit this line
user    hard    memlock 65536 -> edit this line

# End of file
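
Limits from limits.conf are applied when a session starts, so they only show up after logging in again. As a rough sketch for double-checking the file, the helper below lists the nofile entries that apply to one user; the name nofile_entries is hypothetical and the four-column layout is the one shown above.

```shell
#!/bin/sh
# Hypothetical helper: print "<type> <value>" for every nofile entry that
# applies to the given user in a limits.conf-style file.
nofile_entries() {
    limits_file=$1
    user=$2
    # Columns: <domain> <type> <item> <value>; lines starting with # are comments.
    awk -v u="$user" '$1 !~ /^#/ && $1 == u && $3 == "nofile" { print $2, $4 }' "$limits_file"
}

# Example (run in a fresh login session so the new limits are active):
# nofile_entries /etc/security/limits.conf "$USER"
# ulimit -n   # limit on open files for the current shell
```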

Cause of error 2

Transport SSL must be enabled if security is enabled. Please set [xpack.security.transport.ssl.enabled] to [true] or disable security by setting [xpack.security.enabled] to [false]

Transport SSL is disabled. It has to be enabled.

Create a temporary directory

$ pwd
/home/user/elasticsearch-8.2.0
$ mkdir tmp && cd tmp
$ mkdir cert && cd cert
$ vi instance.yml

Create instance.yml for certificate generation

# vi instance.yml
instances:
  - name: "lab-ES-node-0"
    ip:
      - "<node 0 ip>"
  - name: "lab-ES-node-1"
    ip:
      - "<node 1 ip>"
  - name: "lab-ES-node-2"
    ip: 
      - "<node 2 ip>"

Generate the certificates -> in the end, I regenerated different certificates later

$ ./bin/elasticsearch-certutil cert --pem --self-signed --in ./tmp/cert/instance.yml --out ./tmp/cert/certs.zip
This tool assists you in the generation of X.509 certificates and certificate
signing requests for use with SSL/TLS in the Elastic stack.

The 'cert' mode generates X.509 certificate and private keys.
    * By default, this generates a single certificate and key for use
       on a single instance.
    * The '-multiple' option will prompt you to enter details for multiple
       instances and will generate a certificate and key for each one
    * The '-in' option allows for the certificate generation to be automated by describing
       the details of each instance in a YAML file

    * An instance is any piece of the Elastic Stack that requires an SSL certificate.
      Depending on your configuration, Elasticsearch, Logstash, Kibana, and Beats
      may all require a certificate and private key.
    * The minimum required value for each instance is a name. This can simply be the
      hostname, which will be used as the Common Name of the certificate. A full
      distinguished name may also be used.
    * A filename value may be required for each instance. This is necessary when the
      name would result in an invalid file or directory name. The name provided here
      is used as the directory name (within the zip) and the prefix for the key and
      certificate files. The filename is required if you are prompted and the name
      is not displayed in the prompt.
    * IP addresses and DNS names are optional. Multiple values can be specified as a
      comma separated string. If no IP addresses or DNS names are provided, you may
      disable hostname verification in your SSL configuration.


    * All certificates generated by this tool will be signed by a certificate authority (CA)
      unless the --self-signed command line option is specified.
      The tool can automatically generate a new CA for you, or you can provide your own with
      the --ca or --ca-cert command line options.


By default the 'cert' mode produces a single PKCS#12 output file which holds:
    * The instance certificate
    * The private key for the instance certificate
    * The CA certificate

If you specify any of the following options:
    * -pem (PEM formatted output)
    * -multiple (generate multiple certificates)
    * -in (generate certificates from an input file)
then the output will be be a zip file containing individual certificate/key files


Certificates written to /home/ubuntu/distributed-pipeline/elasticsearch-8.2.0/tmp/cert/certs.zip

This file should be properly secured as it contains the private keys for 
all instances
After unzipping the file, there will be a directory for each instance.
Each instance has a certificate and private key.
For each Elastic product that you wish to configure, you should copy
the certificate, key, and CA certificate to the relevant configuration directory
and then follow the SSL configuration instructions in the product guide.

For client applications, you may only need to copy the CA certificate and
configure the client to trust this certificate.
$
$ ./bin/elasticsearch-certutil ca --pem --out tmp/cert/ca.zip
This tool assists you in the generation of X.509 certificates and certificate
signing requests for use with SSL/TLS in the Elastic stack.

The 'ca' mode generates a new 'certificate authority'
This will create a new X.509 certificate and private key that can be used
to sign certificate when running in 'cert' mode.

Use the 'ca-dn' option if you wish to configure the 'distinguished name'
of the certificate authority

By default the 'ca' mode produces a single PKCS#12 output file which holds:
    * The CA certificate
    * The CA's private key

If you elect to generate PEM format certificates (the -pem option), then the output will
be a zip file containing individual files for the CA certificate and private key

Copy the certificates

$ pwd
/home/user/elasticsearch-8.2.0
$ cd tmp/cert
$ unzip certs.zip -d certs
Archive:  certs.zip
   creating: certs/lab-ES-node-0/
  inflating: certs/lab-ES-node-0/lab-ES-node-0.crt  
  inflating: certs/lab-ES-node-0/lab-ES-node-0.key  
   creating: certs/lab-ES-node-1/
  inflating: certs/lab-ES-node-1/lab-ES-node-1.crt  
  inflating: certs/lab-ES-node-1/lab-ES-node-1.key  
   creating: certs/lab-ES-node-2/
  inflating: certs/lab-ES-node-2/lab-ES-node-2.crt  
  inflating: certs/lab-ES-node-2/lab-ES-node-2.key  
$ unzip ca.zip -d ca
Archive:  ca.zip
   creating: ca/ca/
  inflating: ca/ca/ca.crt            
  inflating: ca/ca/ca.key
$ cd /home/user/elasticsearch-8.2.0
$ cd config && mkdir certs && cd certs
$ cp ../../tmp/cert/certs/lab-ES-node-0/lab-ES-node-0.* .
$ cp ../../tmp/cert/ca/ca/ca.* .

Edit elasticsearch.yml

$ vi config/elasticsearch.yml

xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.key: certs/lab-ES-node-0.key
xpack.security.transport.ssl.certificate: certs/lab-ES-node-0.crt
xpack.security.transport.ssl.certificate_authorities: certs/ca.crt

Log after starting

[2022-05-11T13:21:39,896][INFO ][o.e.i.r.RecoverySettings ] [lab-ES-node-0] using rate limit [40mb] with [default=40mb, read=0b, write=0b, max=0b]
[2022-05-11T13:21:39,925][INFO ][o.e.d.DiscoveryModule    ] [lab-ES-node-0] using discovery type [multi-node] and seed hosts providers [settings]
[2022-05-11T13:21:41,024][INFO ][o.e.n.Node               ] [lab-ES-node-0] initialized
[2022-05-11T13:21:41,024][INFO ][o.e.n.Node               ] [lab-ES-node-0] starting ...
[2022-05-11T13:21:41,051][INFO ][o.e.x.s.c.f.PersistentCache] [lab-ES-node-0] persistent cache index loaded
[2022-05-11T13:21:41,053][INFO ][o.e.x.d.l.DeprecationIndexingComponent] [lab-ES-node-0] deprecation component started
[2022-05-11T13:21:41,159][INFO ][o.e.t.TransportService   ] [lab-ES-node-0] publish_address {<PUBLIC IP>:9300}, bound_addresses {<node 0 ip>:9300}
[2022-05-11T13:21:41,314][INFO ][o.e.b.BootstrapChecks    ] [lab-ES-node-0] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2022-05-11T13:21:51,329][WARN ][o.e.c.c.ClusterFormationFailureHelper] [lab-ES-node-0] master not discovered yet, this node has not previously joined a bootstrapped cluster, and [cluster.initial_master_nodes] is empty on this node: have discovered [{lab-ES-node-0}{vykaBmGHRxaCN15cp7xavQ}{sYcf7BbzSgSPD_5EwbYVnQ}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}]; discovery will continue using [<node 1 ip>:9300, <node 2 ip>:9300] from hosts providers and [{lab-ES-node-0}{vykaBmGHRxaCN15cp7xavQ}{sYcf7BbzSgSPD_5EwbYVnQ}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
[2022-05-11T13:22:01,332][WARN ][o.e.c.c.ClusterFormationFailureHelper] [lab-ES-node-0] master not discovered yet, this node has not previously joined a bootstrapped cluster, and [cluster.initial_master_nodes] is empty on this node: have discovered [{lab-ES-node-0}{vykaBmGHRxaCN15cp7xavQ}{sYcf7BbzSgSPD_5EwbYVnQ}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}]; discovery will continue using [<node 1 ip>:9300, <node 2 ip>:9300] from hosts providers and [{lab-ES-node-0}{vykaBmGHRxaCN15cp7xavQ}{sYcf7BbzSgSPD_5EwbYVnQ}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
[2022-05-11T13:22:11,331][WARN ][o.e.n.Node               ] [lab-ES-node-0] timed out while waiting for initial discovery state - timeout: 30s
[2022-05-11T13:22:11,335][WARN ][o.e.c.c.ClusterFormationFailureHelper] [lab-ES-node-0] master not discovered yet, this node has not previously joined a bootstrapped cluster, and [cluster.initial_master_nodes] is empty on this node: have discovered [{lab-ES-node-0}{vykaBmGHRxaCN15cp7xavQ}{sYcf7BbzSgSPD_5EwbYVnQ}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}]; discovery will continue using [<node 1 ip>:9300, <node 2 ip>:9300] from hosts providers and [{lab-ES-node-0}{vykaBmGHRxaCN15cp7xavQ}{sYcf7BbzSgSPD_5EwbYVnQ}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
[2022-05-11T13:22:11,342][INFO ][o.e.h.AbstractHttpServerTransport] [lab-ES-node-0] publish_address {<PUBLIC IP>:9200}, bound_addresses {<node 0 ip>:9200}
[2022-05-11T13:22:11,342][INFO ][o.e.n.Node               ] [lab-ES-node-0] started
[2022-05-11T13:22:21,337][WARN ][o.e.c.c.ClusterFormationFailureHelper] [lab-ES-node-0] master not discovered yet, this node has not previously joined a bootstrapped cluster, and [cluster.initial_master_nodes] is empty on this node: have discovered [{lab-ES-node-0}{vykaBmGHRxaCN15cp7xavQ}{sYcf7BbzSgSPD_5EwbYVnQ}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}]; discovery will continue using [<node 1 ip>:9300, <node 2 ip>:9300] from hosts providers and [{lab-ES-node-0}{vykaBmGHRxaCN15cp7xavQ}{sYcf7BbzSgSPD_5EwbYVnQ}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
as not previously joined a bootstrapped cluster, and [cluster.initial_master_nodes] is empty on this node: have discovered [{lab-ES-node-0}{vykaBmGHRxaCN15cp7xavQ}{sYcf7BbzSgSPD_5EwbYVnQ}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}]; discovery will continue using [<node 1 ip>:9300, <node 2 ip>:9300] from hosts providers and [{lab-ES-node-0}{vykaBmGHRxaCN15cp7xavQ}{sYcf7BbzSgSPD_5EwbYVnQ}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
[2022-05-11T13:22:41,341][WARN ][o.e.c.c.ClusterFormationFailureHelper] [lab-ES-node-0] master not discovered yet, this node has not previously joined a bootstrapped cluster, and [cluster.initial_master_nodes] is empty on this node: have discovered [{lab-ES-node-0}{vykaBmGHRxaCN15cp7xavQ}{sYcf7BbzSgSPD_5EwbYVnQ}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}]; discovery will continue using [<node 1 ip>:9300, <node 2 ip>:9300] from hosts providers and [{lab-ES-node-0}{vykaBmGHRxaCN15cp7xavQ}{sYcf7BbzSgSPD_5EwbYVnQ}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
[2022-05-11T13:22:51,343][WARN ][o.e.c.c.ClusterFormationFailureHelper] [lab-ES-node-0] master not discovered yet, this node has not previously joined a bootstrapped cluster, and [cluster.initial_master_nodes] is empty on this node: have discovered [{lab-ES-node-0}{vykaBmGHRxaCN15cp7xavQ}{sYcf7BbzSgSPD_5EwbYVnQ}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}]; discovery will continue using [<node 1 ip>:9300, <node 2 ip>:9300] from hosts providers and [{lab-ES-node-0}{vykaBmGHRxaCN15cp7xavQ}{sYcf7BbzSgSPD_5EwbYVnQ}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
[2022-05-11T13:22:52,150][INFO ][o.e.x.m.p.NativeController] [lab-ES-node-0] Native controller process has stopped - no new native processes can be started
[2022-05-11T13:38:04,330][WARN ][o.e.d.PeerFinder         ] [lab-ES-node-0] address [<node 1 ip>:9300], node [null], requesting [false] discovery result: [][<node 1 ip>:9300] connect_exception: Connection refused: /<node 1 ip>:9300: Connection refused
...
...
...
[2022-05-11T13:38:04,331][WARN ][o.e.d.PeerFinder         ] [lab-ES-node-0] address [<node 2 ip>:9300], node [null], requesting [false] discovery result: [][<node 2 ip>:9300] connect_exception: Connection refused: /<node 2 ip>:9300: Connection refused
[2022-05-11T13:38:05,330][WARN ][o.e.d.PeerFinder         ] [lab-ES-node-0] address [<node 1 ip>:9300], node [null], requesting [false] discovery result: [][<node 1 ip>:9300] connect_exception: Connection refused: /<node 1 ip>:9300: Connection refused
[2022-05-11T13:38:05,331][WARN ][o.e.d.PeerFinder         ] [lab-ES-node-0] address [<node 2 ip>:9300], node [null], requesting [false] discovery result: [][<node 2 ip>:9300] connect_exception: Connection refused: /<node 2 ip>:9300: Connection refused
[2022-05-11T13:38:06,331][WARN ][o.e.d.PeerFinder         ] [lab-ES-node-0] address [<node 1 ip>:9300], node [null], requesting [false] discovery result: [][<node 1 ip>:9300] connect_exception: Connection refused: /<node 1 ip>:9300: Connection refused
[2022-05-11T13:38:06,332][WARN ][o.e.d.PeerFinder         ] [lab-ES-node-0] address [<node 2 ip>:9300], node [null], requesting [false] discovery result: [][<node 2 ip>:9300] connect_exception: Connection refused: /<node 2 ip>:9300: Connection refused
[2022-05-11T13:38:07,333][WARN ][o.e.d.PeerFinder         ] [lab-ES-node-0] address [<node 1 ip>:9300], node [null], requesting [false] discovery result: [][<node 1 ip>:9300] connect_exception: Connection refused: /<node 1 ip>:9300: Connection refused
[2022-05-11T13:38:07,333][WARN ][o.e.d.PeerFinder         ] [lab-ES-node-0] address [<node 2 ip>:9300], node [null], requesting [false] discovery result: [][<node 2 ip>:9300] connect_exception: Connection refused: /<node 2 ip>:9300: Connection refused
[2022-05-11T13:38:08,332][WARN ][o.e.d.PeerFinder         ] [lab-ES-node-0] address [<node 1 ip>:9300], node [null], requesting [false] discovery result: [][<node 1 ip>:9300] connect_exception: Connection refused: /<node 1 ip>:9300: Connection refused
[2022-05-11T13:38:08,332][WARN ][o.e.d.PeerFinder         ] [lab-ES-node-0] address [<node 2 ip>:9300], node [null], requesting [false] discovery result: [][<node 2 ip>:9300] connect_exception: Connection refused: /<node 2 ip>:9300: Connection refused
[2022-05-11T13:38:09,332][WARN ][o.e.d.PeerFinder         ] [lab-ES-node-0] address [<node 1 ip>:9300], node [null], requesting [false] discovery result: [][<node 1 ip>:9300] connect_exception: Connection refused: /<node 1 ip>:9300: Connection refused
[2022-05-11T13:38:09,332][WARN ][o.e.d.PeerFinder         ] [lab-ES-node-0] address [<node 2 ip>:9300], node [null], requesting [false] discovery result: [][<node 2 ip>:9300] connect_exception: Connection refused: /<node 2 ip>:9300: Connection refused
[2022-05-11T13:38:10,332][WARN ][o.e.d.PeerFinder         ] [lab-ES-node-0] address [<node 1 ip>:9300], node [null], requesting [false] discovery result: [][<node 1 ip>:9300] connect_exception: Connection refused: /<node 1 ip>:9300: Connection refused
[2022-05-11T13:38:10,333][WARN ][o.e.d.PeerFinder         ] [lab-ES-node-0] address [<node 2 ip>:9300], node [null], requesting [false] discovery result: [][<node 2 ip>:9300] connect_exception: Connection refused: /<node 2 ip>:9300: Connection refused
...
...
...
[2022-05-11T13:22:52,151][INFO ][o.e.n.Node               ] [lab-ES-node-0] stopping ...
[2022-05-11T13:22:52,156][INFO ][o.e.x.w.WatcherService   ] [lab-ES-node-0] stopping watch service, reason [shutdown initiated]
[2022-05-11T13:22:52,158][INFO ][o.e.x.w.WatcherLifeCycleService] [lab-ES-node-0] watcher has stopped and shutdown
[2022-05-11T13:22:52,380][INFO ][o.e.n.Node               ] [lab-ES-node-0] stopped
[2022-05-11T13:22:52,381][INFO ][o.e.n.Node               ] [lab-ES-node-0] closing ...
[2022-05-11T13:22:52,389][INFO ][o.e.n.Node               ] [lab-ES-node-0] closed
[2022-05-11T13:21:41,024][INFO ][o.e.n.Node ] [lab-ES-node-0] initialized
[2022-05-11T13:21:41,024][INFO ][o.e.n.Node ] [lab-ES-node-0] starting ...
...
[2022-05-11T13:22:11,342][INFO ][o.e.n.Node ] [lab-ES-node-0] started

It starts, at least. The two errors encountered above seem to be resolved.

[2022-05-11T13:38:04,331][WARN ][o.e.d.PeerFinder ] [lab-ES-node-0] address [<node 2 ip>:9300], node [null], requesting [false] discovery result: [][<node 2 ip>:9300] connect_exception: Connection refused: /<node 2 ip>:9300: Connection refused
[2022-05-11T13:38:05,330][WARN ][o.e.d.PeerFinder ] [lab-ES-node-0] address [<node 1 ip>:9300], node [null], requesting [false] discovery result: [][<node 1 ip>:9300] connect_exception: Connection refused: /<node 1 ip>:9300: Connection refused
...
...
...
[2022-05-11T13:22:52,151][INFO ][o.e.n.Node ] [lab-ES-node-0] stopping ...
[2022-05-11T13:22:52,156][INFO ][o.e.x.w.WatcherService ] [lab-ES-node-0] stopping watch service, reason [shutdown initiated]
[2022-05-11T13:22:52,158][INFO ][o.e.x.w.WatcherLifeCycleService] [lab-ES-node-0] watcher has stopped and shutdown
[2022-05-11T13:22:52,380][INFO ][o.e.n.Node ] [lab-ES-node-0] stopped
[2022-05-11T13:22:52,381][INFO ][o.e.n.Node ] [lab-ES-node-0] closing ...
[2022-05-11T13:22:52,389][INFO ][o.e.n.Node ] [lab-ES-node-0] closed

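It appears that this node cannot communicate with the other nodes: "Connection refused" means the connection to port 9300 on nodes 1 and 2 was actively rejected, typically because nothing is listening there yet or a firewall rejects it.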

2-2. Troubleshooting (certificate problem)


Error Log

[2022-05-11T13:45:09,613][WARN ][o.e.t.OutboundHandler    ] [lab-ES-node-0] send message failed [channel: Netty4TcpChannel{localAddress=/<node 0 ip>:33006, remoteAddress=/<node 1 ip>:9300, profile=default}]
javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
        at sun.security.ssl.Alert.createSSLException(Alert.java:131) ~[?:?]
        at sun.security.ssl.TransportContext.fatal(TransportContext.java:371) ~[?:?]
        at sun.security.ssl.TransportContext.fatal(TransportContext.java:314) ~[?:?]
        at sun.security.ssl.TransportContext.fatal(TransportContext.java:309) ~[?:?]
        at sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1351) ~[?:?]
        at sun.security.ssl.CertificateMessage$T13CertificateConsumer.onConsumeCertificate(CertificateMessage.java:1226) ~[?:?]
        at sun.security.ssl.CertificateMessage$T13CertificateConsumer.consume(CertificateMessage.java:1169) ~[?:?]
        at sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:396) ~[?:?]
        at sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:480) ~[?:?]
        at sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1277) ~[?:?]
        at sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1264) ~[?:?]
        at java.security.AccessController.doPrivileged(AccessController.java:712) ~[?:?]
        at sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:1209) ~[?:?]
        at io.netty.handler.ssl.SslHandler.runDelegatedTasks(SslHandler.java:1548) [netty-handler-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1394) [netty-handler-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1235) [netty-handler-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1284) [netty-handler-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:510) [netty-codec-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:449) [netty-codec-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:279) [netty-codec-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) [netty-transport-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722) [netty-transport-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:623) [netty-transport-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:586) [netty-transport-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) [netty-transport-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) [netty-common-4.1.74.Final.jar:4.1.74.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.74.Final.jar:4.1.74.Final]
        at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
        at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:388) ~[?:?]
        at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:271) ~[?:?]
        at sun.security.validator.Validator.validate(Validator.java:256) ~[?:?]
        at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:285) ~[?:?]
        at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:144) ~[?:?]
        at org.elasticsearch.common.ssl.DiagnosticTrustManager.checkServerTrusted(DiagnosticTrustManager.java:102) ~[?:?]
        at sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1329) ~[?:?]
        ... 30 more
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
        at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141) ~[?:?]
        at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126) ~[?:?]
        at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:297) ~[?:?]
        at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:383) ~[?:?]
        at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:271) ~[?:?]
        at sun.security.validator.Validator.validate(Validator.java:256) ~[?:?]
        at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:285) ~[?:?]
        at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:144) ~[?:?]
        at org.elasticsearch.common.ssl.DiagnosticTrustManager.checkServerTrusted(DiagnosticTrustManager.java:102) ~[?:?]
        at sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1329) ~[?:?]
        ... 30 more
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
$ vi config/elasticsearch.yml

xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.key: certs/lab-ES-node-0.key
xpack.security.transport.ssl.certificate: certs/lab-ES-node-0.crt
xpack.security.transport.ssl.certificate_authorities: certs/ca.crt

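Even though I specified the certificate paths as above, validation fails. (The message does not mean the files could not be found: "PKIX path building failed" means the peer's certificate could not be chained to a trusted CA. A likely cause is that the node certificates were generated with --self-signed, so the separately generated ca.crt never signed them.) I had no background knowledge of certificates, so I consulted other blogs.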
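
One way to test whether a node certificate was really issued by the CA listed in certificate_authorities is to verify it outside Elasticsearch. The sketch below assumes the openssl CLI is installed (it is not used anywhere in the original steps) and a recent OpenSSL release; the helper name signed_by_ca is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when node_cert chains to ca_cert.
# Assumes the openssl command-line tool is available on the host.
signed_by_ca() {
    ca_cert=$1
    node_cert=$2
    # openssl prints "<file>: OK" only when the verification succeeds.
    openssl verify -CAfile "$ca_cert" "$node_cert" 2>/dev/null | grep -q ': OK$'
}

# Example with the file names used above:
# if signed_by_ca config/certs/ca.crt config/certs/lab-ES-node-0.crt; then
#     echo "signed by ca.crt"
# else
#     echo "NOT signed by ca.crt"
# fi
```

If the check fails for a certificate created with --self-signed, that is expected, since such a certificate is its own issuer.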

Resolution

$ ./bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12
$ mv *.p12 config/certs/
$ cd config/certs/
$ scp *.p12 user@ES-node-1:/home/user/distributed-pipeline/elasticsearch-8.2.0/config/certs
$ scp *.p12 user@ES-node-2:/home/user/distributed-pipeline/elasticsearch-8.2.0/config/certs
# vi elasticsearch.yml (on all nodes)

xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: certs/elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: certs/elastic-certificates.p12
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.verification_mode: certificate
xpack.security.http.ssl.keystore.path: certs/elastic-certificates.p12
xpack.security.http.ssl.truststore.path: certs/elastic-certificates.p12

Blog post I referred to

 

[Elasticsearch] security 적용하기 - 2 ("Applying security", part 2)

llnote.tistory.com

2-3. Troubleshooting (specifying the TCP port, discovery.seed_hosts settings)

Error Log

[2022-05-11T15:26:24,855][WARN ][o.e.d.HandshakingTransportAddressConnector] [lab-ES-node-0] completed handshake with [{lab-ES-node-1}{2cl6rxCKQq6V8FwbXW3KCA}{EtMQL88yTTaPjW3u7BrpZw}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}] at [<node 1 ip>:9300] but followup connection to [<PUBLIC IP>:9300] failed
org.elasticsearch.transport.ConnectTransportException: [lab-ES-node-1][<PUBLIC IP>:9300] connect_timeout[30s]
        at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onTimeout(TcpTransport.java:1112) ~[elasticsearch-8.2.0.jar:8.2.0]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:714) ~[elasticsearch-8.2.0.jar:8.2.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
        at java.lang.Thread.run(Thread.java:833) [?:?]
[2022-05-11T15:26:25,708][WARN ][o.e.d.HandshakingTransportAddressConnector] [lab-ES-node-0] completed handshake with [{lab-ES-node-2}{bVpgWOQaTYWYjqKAg35s9A}{T009khcJTuuHEgBPT-95Uw}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}] at [<node 2 ip>:9300] but followup connection to [<PUBLIC IP>:9300] failed
org.elasticsearch.transport.ConnectTransportException: [lab-ES-node-2][<PUBLIC IP>:9300] connect_exception
        at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:1106) ~[elasticsearch-8.2.0.jar:8.2.0]
        at org.elasticsearch.action.ActionListener.lambda$toBiConsumer$0(ActionListener.java:305) ~[elasticsearch-8.2.0.jar:8.2.0]
        at org.elasticsearch.core.CompletableContext.lambda$addListener$0(CompletableContext.java:33) ~[elasticsearch-core-8.2.0.jar:8.2.0]
        at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[?:?]
        at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) ~[?:?]
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
        at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) ~[?:?]
        at org.elasticsearch.core.CompletableContext.completeExceptionally(CompletableContext.java:48) ~[elasticsearch-core-8.2.0.jar:8.2.0]
        at org.elasticsearch.transport.netty4.Netty4TcpChannel.lambda$addListener$0(Netty4TcpChannel.java:63) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:571) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:550) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609) ~[?:?]
        at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117) ~[?:?]
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:262) ~[?:?]
        at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[?:?]
        at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170) ~[?:?]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) ~[?:?]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:503) ~[?:?]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) ~[?:?]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
        at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: io.netty.channel.ConnectTimeoutException: connection timed out: <PUBLIC IP>/<PUBLIC IP>:9300
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:261) ~[?:?]
        ... 8 more

Resolution

# vi config/elasticsearch.yml

transport.port: 9300   # node 0
transport.port: 9301   # node 1
transport.port: 9302   # node 2

discovery.seed_hosts: ["<node 0 ip>:9300", "<node 1 ip>:9301", "<node 2 ip>:9302"]   # all nodes
  • Every node published the same PUBLIC IP and the same port, so I changed transport.port on each node and also wrote the port explicitly in discovery.seed_hosts.
  • It is possible to keep 9300 internally on every node and vary only the external port, but I simply went with this.
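
Before restarting the nodes, each discovery.seed_hosts entry can be probed from every machine. This is a rough sketch under a few assumptions: entries are IPv4 addresses or hostnames in host:port form, an entry without a port falls back to 9300, and bash and the coreutils timeout command are installed. The names seed_host_port and probe_seed are made up for illustration.

```shell
# Split a seed_hosts entry into "host port"; assume 9300 when no port is given.
seed_host_port() {
  entry="$1"
  host="${entry%%:*}"
  port="9300"
  if [ "$entry" != "$host" ]; then
    port="${entry##*:}"
  fi
  echo "$host $port"
}

# Try to open a TCP connection to the entry, giving up after 3 seconds.
probe_seed() {
  # Intentional word splitting: $1 becomes the host, $2 the port.
  set -- $(seed_host_port "$1")
  if timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo "reachable $1:$2"
  else
    echo "unreachable $1:$2"
    return 1
  fi
}
```

Running probe_seed with each private address and then with each published address shows which side of the network the connection fails on.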

2-4. Trouble shooting (forwarding internal TCP ports to external ports)

  • Internally, <node 0 ip>:9300, <node 1 ip>:9301, and <node 2 ip>:9302 could reach one another, but externally PUBLIC_IP:9300, PUBLIC_IP:9301, and PUBLIC_IP:9302 could not.

Node 0 Error Log

[2022-05-11T15:47:16,784][WARN ][o.e.d.HandshakingTransportAddressConnector] [lab-ES-node-0] completed handshake with [{lab-ES-node-1}{2cl6rxCKQq6V8FwbXW3KCA}
{8NyYzIP-TJOti5tPuBajQg}{<PUBLIC IP>}{<PUBLIC IP>:9301}{cdfhilmrstw}] at [<node 1 ip>:9301] but followup connection to [<PUBLIC IP>:9301] failed    
org.elasticsearch.transport.ConnectTransportException: [lab-ES-node-1][<PUBLIC IP>:9301] connect_timeout[30s]                                             
        at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onTimeout(TcpTransport.java:1112) ~[elasticsearch-8.2.0.jar:8.2.0]              
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:714) ~[elasticsearch-8.2.0.jar:8.2.0]      
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]                                                              
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]                                                              
        at java.lang.Thread.run(Thread.java:833) [?:?]                                                                                                        
[2022-05-11T15:47:18,776][WARN ][o.e.d.HandshakingTransportAddressConnector] [lab-ES-node-0] completed handshake with [{lab-ES-node-2}{bVpgWOQaTYWYjqKAg35s9A}
{E-ZnuWSHSL2_HqSSbMjFkQ}{<PUBLIC IP>}{<PUBLIC IP>:9302}{cdfhilmrstw}] at [<node 2 ip>:9302] but followup connection to [<PUBLIC IP>:9302] failed    
org.elasticsearch.transport.ConnectTransportException: [lab-ES-node-2][<PUBLIC IP>:9302] connect_exception                                                
        at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:1106) ~[elasticsearch-8.2.0.jar:8.2.0]              
        at org.elasticsearch.action.ActionListener.lambda$toBiConsumer$0(ActionListener.java:305) ~[elasticsearch-8.2.0.jar:8.2.0]                            
        at org.elasticsearch.core.CompletableContext.lambda$addListener$0(CompletableContext.java:33) ~[elasticsearch-core-8.2.0.jar:8.2.0]                   
        at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[?:?]                                                          
        at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) ~[?:?]                                                  
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]                                                             
        at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) ~[?:?]                                                   
        at org.elasticsearch.core.CompletableContext.completeExceptionally(CompletableContext.java:48) ~[elasticsearch-core-8.2.0.jar:8.2.0]                  
        at org.elasticsearch.transport.netty4.Netty4TcpChannel.lambda$addListener$0(Netty4TcpChannel.java:63) ~[?:?]                                          
        at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578) ~[?:?]                                                            
        at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:571) ~[?:?]                                                           
        at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:550) ~[?:?]                                                         
        at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491) ~[?:?]                                                            
        at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616) ~[?:?]                                                                  
        at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609) ~[?:?]                                                                
        at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117) ~[?:?]                                                                 
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:262) ~[?:?]                                                
        at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[?:?]                                                                           
        at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170) ~[?:?]                                                              
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) ~[?:?]                                                  
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469) ~[?:?]                                          
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:503) ~[?:?]                                                                                
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) ~[?:?]                                                
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]                                                                   
        at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: io.netty.channel.ConnectTimeoutException: connection timed out: <PUBLIC IP>/<PUBLIC IP>:9302
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:261) ~[?:?]
        ... 8 more

Node 1 Error Log

[2022-05-11T15:46:47,262][WARN ][o.e.d.HandshakingTransportAddressConnector] [lab-ES-node-1] completed handshake with [{lab-ES-node-2}{bVpgWOQaTYWYjqKAg35s9A}
{E-ZnuWSHSL2_HqSSbMjFkQ}{<PUBLIC IP>}{<PUBLIC IP>:9302}{cdfhilmrstw}] at [<node 2 ip>:9302] but followup connection to [<PUBLIC IP>:9302] failed    
org.elasticsearch.transport.ConnectTransportException: [lab-ES-node-2][<PUBLIC IP>:9302] connect_timeout[30s]                                             
        at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onTimeout(TcpTransport.java:1112) ~[elasticsearch-8.2.0.jar:8.2.0]              
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:714) ~[elasticsearch-8.2.0.jar:8.2.0]      
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]                                                              
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]                                                              
        at java.lang.Thread.run(Thread.java:833) [?:?]
[2022-05-11T15:47:16,115][WARN ][o.e.d.HandshakingTransportAddressConnector] [lab-ES-node-1] completed handshake with [{lab-ES-node-0}{vykaBmGHRxaCN15cp7xavQ}
{Itd9jlfgSO-ESlNsv7xYvg}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}] at [<node 0 ip>:9300] but followup connection to [<PUBLIC IP>:9300] failed    
org.elasticsearch.transport.ConnectTransportException: [lab-ES-node-0][<PUBLIC IP>:9300] connect_exception                                                
        at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:1106) ~[elasticsearch-8.2.0.jar:8.2.0]              
        at org.elasticsearch.action.ActionListener.lambda$toBiConsumer$0(ActionListener.java:305) ~[elasticsearch-8.2.0.jar:8.2.0]                            
        at org.elasticsearch.core.CompletableContext.lambda$addListener$0(CompletableContext.java:33) ~[elasticsearch-core-8.2.0.jar:8.2.0]                   
        at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[?:?]                                                          
        at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) ~[?:?]                                                  
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]                                                             
        at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) ~[?:?]                                                   
        at org.elasticsearch.core.CompletableContext.completeExceptionally(CompletableContext.java:48) ~[elasticsearch-core-8.2.0.jar:8.2.0]                  
        at org.elasticsearch.transport.netty4.Netty4TcpChannel.lambda$addListener$0(Netty4TcpChannel.java:63) ~[?:?]                                          
        at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578) ~[?:?]                                                            
        at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:571) ~[?:?]                                                           
        at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:550) ~[?:?]                                                         
        at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491) ~[?:?]                                                            
        at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616) ~[?:?]                                                                  
        at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609) ~[?:?]                                                                
        at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117) ~[?:?]                                                                 
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:262) ~[?:?]                                                
        at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[?:?]                                                                           
        at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170) ~[?:?]                                                              
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) ~[?:?]                                                  
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469) ~[?:?]                                          
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:503) ~[?:?]                                                                                
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) ~[?:?]                                                
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]                                                                   
        at java.lang.Thread.run(Thread.java:833) [?:?]                                                                                                        
Caused by: io.netty.channel.ConnectTimeoutException: connection timed out: <PUBLIC IP>/<PUBLIC IP>:9300                                               
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:261) ~[?:?]                                                
        ... 8 more

Node 2 Error Log

[2022-05-11T15:46:43,704][WARN ][o.e.n.Node               ] [lab-ES-node-0] timed out while waiting for initial discovery state - timeout: 30s
[2022-05-11T15:46:43,710][WARN ][o.e.c.c.ClusterFormationFailureHelper] [lab-ES-node-0] master not discovered yet, this node has not previously joined a bootstrapped cluster, and this node must discover master-eligible nodes [lab-ES-node-0, lab-ES-node-1, lab-ES-node-2] to bootstrap a cluster: have discovered [{lab-ES-node-0}{vykaBmGHRxaCN15cp7xavQ}{Itd9jlfgSO-ESlNsv7xYvg}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}]; discovery will continue using [<node 1 ip>:9301, <node 2 ip>:9302] from hosts providers and [{lab-ES-node-0}{vykaBmGHRxaCN15cp7xavQ}{Itd9jlfgSO-ESlNsv7xYvg}{<PUBLIC IP>}{<PUBLIC IP>:9300}{cdfhilmrstw}] from last-known cluster state; node term 0, last-accepted version 0 in term 0
[2022-05-11T15:46:43,716][INFO ][o.e.h.AbstractHttpServerTransport] [lab-ES-node-0] publish_address {<PUBLIC IP>:9200}, bound_addresses {<node 0 ip>:9200}
[2022-05-11T15:46:43,716][INFO ][o.e.n.Node               ] [lab-ES-node-0] started
[2022-05-11T15:46:45,845][WARN ][o.e.d.HandshakingTransportAddressConnector] [lab-ES-node-0] completed handshake with [{lab-ES-node-1}{2cl6rxCKQq6V8FwbXW3KCA}{8NyYzIP-TJOti5tPuBajQg}{<PUBLIC IP>}{<PUBLIC IP>:9301}{cdfhilmrstw}] at [<node 1 ip>:9301] but followup connection to [<PUBLIC IP>:9301] failed
org.elasticsearch.transport.ConnectTransportException: [lab-ES-node-1][<PUBLIC IP>:9301] connect_timeout[30s]
        at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onTimeout(TcpTransport.java:1112) ~[elasticsearch-8.2.0.jar:8.2.0]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:714) ~[elasticsearch-8.2.0.jar:8.2.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
        at java.lang.Thread.run(Thread.java:833) [?:?]
[2022-05-11T15:46:47,795][WARN ][o.e.d.HandshakingTransportAddressConnector] [lab-ES-node-0] completed handshake with [{lab-ES-node-2}{bVpgWOQaTYWYjqKAg35s9A}{E-ZnuWSHSL2_HqSSbMjFkQ}{<PUBLIC IP>}{<PUBLIC IP>:9302}{cdfhilmrstw}] at [<node 2 ip>:9302] but followup connection to [<PUBLIC IP>:9302] failed
org.elasticsearch.transport.ConnectTransportException: [lab-ES-node-2][<PUBLIC IP>:9302] connect_timeout[30s]
        at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onTimeout(TcpTransport.java:1112) ~[elasticsearch-8.2.0.jar:8.2.0]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:714) ~[elasticsearch-8.2.0.jar:8.2.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
        at java.lang.Thread.run(Thread.java:833) [?:?]
  • Set up port forwarding so that each external port on the PUBLIC IP reaches the matching internal port.
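
The warnings above all follow one pattern: the handshake at the private address succeeds, and the follow-up connection to the published address fails. To see which published addresses still need forwarding, they can be pulled out of the log. This is only a sketch that assumes the log wording shown above; failed_publish_addrs is a hypothetical name.

```shell
# Print the distinct publish addresses whose follow-up connection failed,
# based on the "followup connection to [...] failed" wording in the log.
failed_publish_addrs() {
  sed -n 's/.*followup connection to \[\([^]]*\)\] failed.*/\1/p' "$1" | sort -u
}
```

For example, failed_publish_addrs logs/LAB-ES-1.log lists one address per line. An empty result after forwarding is configured suggests the published ports are reachable.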

Resulting log

[2022-05-11T15:55:28,915][INFO ][o.e.n.Node               ] [lab-ES-node-0] version[8.2.0], pid[19623], build[default/tar/b174af62e8dd9f4ac4d25875e9381ffe2b9282c5/2022-04-20T10:35:10.180408517Z], OS[Linux/4.15.0-29-generic/amd64], JVM[Eclipse Adoptium/OpenJDK 64-Bit Server VM/18/18+36]
[2022-05-11T15:55:28,921][INFO ][o.e.n.Node               ] [lab-ES-node-0] JVM home [/home/ubuntu/distributed-pipeline/elasticsearch-8.2.0/jdk], using bundled JDK [true]
[2022-05-11T15:55:28,921][INFO ][o.e.n.Node               ] [lab-ES-node-0] JVM arguments [-Xshare:auto, -Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -Djava.security.manager=allow, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -XX:+ShowCodeDetailsInExceptionMessages, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Dlog4j2.formatMsgNoLookups=true, -Djava.locale.providers=SPI,COMPAT, --add-opens=java.base/java.io=ALL-UNNAMED, -XX:+UseG1GC, -Djava.io.tmpdir=/tmp/elasticsearch-3041196443150992640, -XX:+HeapDumpOnOutOfMemoryError, -XX:+ExitOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -Xms31744m, -Xmx31744m, -XX:MaxDirectMemorySize=16642998272, -XX:InitiatingHeapOccupancyPercent=30, -XX:G1ReservePercent=25, -Des.path.home=/home/ubuntu/distributed-pipeline/elasticsearch-8.2.0, -Des.path.conf=/home/ubuntu/distributed-pipeline/elasticsearch-8.2.0/config, -Des.distribution.flavor=default, -Des.distribution.type=tar, -Des.bundled_jdk=true]
[2022-05-11T15:55:31,138][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [aggs-matrix-stats]
[2022-05-11T15:55:31,139][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [analysis-common]
[2022-05-11T15:55:31,139][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [constant-keyword]
[2022-05-11T15:55:31,139][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [data-streams]
[2022-05-11T15:55:31,140][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [frozen-indices]
[2022-05-11T15:55:31,140][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [ingest-common]
[2022-05-11T15:55:31,140][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [ingest-geoip]
[2022-05-11T15:55:31,140][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [ingest-user-agent]
[2022-05-11T15:55:31,140][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [kibana]
[2022-05-11T15:55:31,141][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [lang-expression]
[2022-05-11T15:55:31,141][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [lang-mustache]
[2022-05-11T15:55:31,141][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [lang-painless]
[2022-05-11T15:55:31,141][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [legacy-geo]
[2022-05-11T15:55:31,141][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [mapper-extras]
[2022-05-11T15:55:31,142][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [mapper-version]
[2022-05-11T15:55:31,142][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [parent-join]
[2022-05-11T15:55:31,142][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [percolator]
[2022-05-11T15:55:31,142][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [rank-eval]
[2022-05-11T15:55:31,142][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [reindex]
[2022-05-11T15:55:31,143][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [repositories-metering-api]
[2022-05-11T15:55:31,143][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [repository-azure]
[2022-05-11T15:55:31,143][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [repository-encrypted]
[2022-05-11T15:55:31,143][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [repository-gcs]
[2022-05-11T15:55:31,143][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [repository-s3]
[2022-05-11T15:55:31,144][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [repository-url]
[2022-05-11T15:55:31,144][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [runtime-fields-common]
[2022-05-11T15:55:31,144][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [search-business-rules]
[2022-05-11T15:55:31,144][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [searchable-snapshots]
[2022-05-11T15:55:31,145][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [snapshot-based-recoveries]
[2022-05-11T15:55:31,145][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [snapshot-repo-test-kit]
[2022-05-11T15:55:31,145][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [spatial]
[2022-05-11T15:55:31,145][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [transform]
[2022-05-11T15:55:31,145][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [transport-netty4]
[2022-05-11T15:55:31,146][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [unsigned-long]
[2022-05-11T15:55:31,146][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [vector-tile]
[2022-05-11T15:55:31,146][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [vectors]
[2022-05-11T15:55:31,146][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [wildcard]
[2022-05-11T15:55:31,146][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-aggregate-metric]
[2022-05-11T15:55:31,147][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-analytics]
[2022-05-11T15:55:31,147][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-async]
[2022-05-11T15:55:31,147][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-async-search]
[2022-05-11T15:55:31,147][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-autoscaling]
[2022-05-11T15:55:31,147][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-ccr]
[2022-05-11T15:55:31,148][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-core]
[2022-05-11T15:55:31,148][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-deprecation]
[2022-05-11T15:55:31,148][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-enrich]
[2022-05-11T15:55:31,148][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-eql]
[2022-05-11T15:55:31,148][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-fleet]
[2022-05-11T15:55:31,148][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-graph]
[2022-05-11T15:55:31,149][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-identity-provider]
[2022-05-11T15:55:31,149][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-ilm]
[2022-05-11T15:55:31,149][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-logstash]
[2022-05-11T15:55:31,149][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-ml]
[2022-05-11T15:55:31,149][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-monitoring] 
[2022-05-11T15:55:31,149][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-ql]
[2022-05-11T15:55:31,149][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-rollup]
[2022-05-11T15:55:31,150][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-security]
[2022-05-11T15:55:31,150][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-shutdown]
[2022-05-11T15:55:31,150][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-sql]
[2022-05-11T15:55:31,150][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-stack]
[2022-05-11T15:55:31,150][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-text-structure]
[2022-05-11T15:55:31,150][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-voting-only-node]
[2022-05-11T15:55:31,151][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] loaded module [x-pack-watcher]
[2022-05-11T15:55:31,151][INFO ][o.e.p.PluginsService     ] [lab-ES-node-0] no plugins loaded
[2022-05-11T15:55:31,197][INFO ][o.e.e.NodeEnvironment    ] [lab-ES-node-0] using [1] data paths, mounts [[/home (/dev/sda5)]], net usable_space [5.2tb], net total_space [5.5tb], types [ext4]
[2022-05-11T15:55:31,198][INFO ][o.e.e.NodeEnvironment    ] [lab-ES-node-0] heap size [31gb], compressed ordinary object pointers [true]
[2022-05-11T15:55:31,248][INFO ][o.e.n.Node               ] [lab-ES-node-0] node name [lab-ES-node-0], node ID [vykaBmGHRxaCN15cp7xavQ], cluster name [LAB-ES-1], roles [data_hot, ml, data_frozen, ingest, data_cold, data, remote_cluster_client, master, data_warm, data_content, transform] 
[2022-05-11T15:55:35,120][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [lab-ES-node-0] [controller/19966] [Main.cc@123] controller (64 bit): Version 8.2.0 (Build a8c0a88ede0ff2) Copyright (c) 2022 Elasticsearch BV
[2022-05-11T15:55:35,240][INFO ][o.e.x.s.Security         ] [lab-ES-node-0] Security is enabled
[2022-05-11T15:55:35,512][INFO ][o.e.x.s.a.Realms         ] [lab-ES-node-0] license mode is [trial], currently licensed security realms are [reserved/reserved,file/default_file,native/default_native]
[2022-05-11T15:55:35,521][INFO ][o.e.x.s.a.s.FileRolesStore] [lab-ES-node-0] parsed [0] roles from file [/home/ubuntu/distributed-pipeline/elasticsearch-8.2.0/config/roles.yml]
[2022-05-11T15:55:36,562][INFO ][o.e.t.n.NettyAllocator   ] [lab-ES-node-0] creating NettyAllocator with the following configs: [name=elasticsearch_configured, chunk_size=1mb, suggested_max_allocation_size=1mb, factors={es.unsafe.use_netty_default_chunk_and_page_size=false, g1gc_enabled=true, g1gc_region_size=16mb}]
[2022-05-11T15:55:36,588][INFO ][o.e.i.r.RecoverySettings ] [lab-ES-node-0] using rate limit [40mb] with [default=40mb, read=0b, write=0b, max=0b]
[2022-05-11T15:55:36,615][INFO ][o.e.d.DiscoveryModule    ] [lab-ES-node-0] using discovery type [multi-node] and seed hosts providers [settings]
[2022-05-11T15:55:37,598][INFO ][o.e.n.Node               ] [lab-ES-node-0] initialized
[2022-05-11T15:55:37,598][INFO ][o.e.n.Node               ] [lab-ES-node-0] starting ...
[2022-05-11T15:55:37,619][INFO ][o.e.x.s.c.f.PersistentCache] [lab-ES-node-0] persistent cache index loaded
[2022-05-11T15:55:37,620][INFO ][o.e.x.d.l.DeprecationIndexingComponent] [lab-ES-node-0] deprecation component started
[2022-05-11T15:55:37,709][INFO ][o.e.t.TransportService   ] [lab-ES-node-0] publish_address {<PUBLIC IP>:9300}, bound_addresses {<node 0 ip>:9300}
[2022-05-11T15:55:37,857][INFO ][o.e.b.BootstrapChecks    ] [lab-ES-node-0] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2022-05-11T15:55:38,333][INFO ][o.e.c.c.Coordinator      ] [lab-ES-node-0] setting initial configuration to VotingConfiguration{vykaBmGHRxaCN15cp7xavQ,{bootstrap-placeholder}-lab-ES-node-1,bVpgWOQaTYWYjqKAg35s9A}
[2022-05-11T15:55:39,191][INFO ][o.e.c.s.ClusterApplierService] [lab-ES-node-0] master node changed {previous [], current [{lab-ES-node-2}{bVpgWOQaTYWYjqKAg35s9A}{eh0-pmfbSxGBuYba28MGCg}{<PUBLIC IP>}{<PUBLIC IP>:9302}{cdfhilmrstw}]}, added {{lab-ES-node-1}{2cl6rxCKQq6V8FwbXW3KCA}{aET7ghaYSRaXxrqYjNj4cQ}{<PUBLIC IP>}{<PUBLIC IP>:9301}{cdfhilmrstw}, {lab-ES-node-2}{bVpgWOQaTYWYjqKAg35s9A}{eh0-pmfbSxGBuYba28MGCg}{<PUBLIC IP>}{<PUBLIC IP>:9302}{cdfhilmrstw}}, term: 1, version: 33, reason: ApplyCommitRequest{term=1, version=33, sourceNode={lab-ES-node-2}{bVpgWOQaTYWYjqKAg35s9A}{eh0-pmfbSxGBuYba28MGCg}{<PUBLIC IP>}{<PUBLIC IP>:9302}{cdfhilmrstw}{ml.machine_memory=134664908800, ml.max_jvm_size=33285996544, xpack.installed=true}}
[2022-05-11T15:55:39,271][INFO ][o.e.x.s.a.TokenService   ] [lab-ES-node-0] refresh keys
[2022-05-11T15:55:39,419][INFO ][o.e.x.s.a.TokenService   ] [lab-ES-node-0] refreshed keys
[2022-05-11T15:55:39,443][INFO ][o.e.h.AbstractHttpServerTransport] [lab-ES-node-0] publish_address {<PUBLIC IP>:9200}, bound_addresses {<node 0 ip>:9200}
[2022-05-11T15:55:39,444][INFO ][o.e.n.Node               ] [lab-ES-node-0] started
[2022-05-11T15:55:41,400][INFO ][o.e.l.LicenseService     ] [lab-ES-node-0] license [3985f36f-9a49-4326-b778-f71ad251f4a7] mode [basic] - valid
[2022-05-11T15:55:41,401][INFO ][o.e.x.s.a.Realms         ] [lab-ES-node-0] license mode is [basic], currently licensed security realms are [reserved/reserved,file/default_file,native/default_native]
[2022-05-11T15:55:42,031][INFO ][o.e.i.g.DatabaseNodeService] [lab-ES-node-0] retrieve geoip database [GeoLite2-ASN.mmdb] from [.geoip_databases] to [/tmp/elasticsearch-3041196443150992640/geoip-databases/vykaBmGHRxaCN15cp7xavQ/GeoLite2-ASN.mmdb.tmp.gz]
[2022-05-11T15:55:42,049][INFO ][o.e.i.g.GeoIpDownloader  ] [lab-ES-node-0] successfully downloaded geoip database [GeoLite2-ASN.mmdb]
[2022-05-11T15:55:42,238][INFO ][o.e.i.g.DatabaseNodeService] [lab-ES-node-0] successfully loaded geoip database file [GeoLite2-ASN.mmdb]
[2022-05-11T15:55:45,946][INFO ][o.e.i.g.DatabaseNodeService] [lab-ES-node-0] retrieve geoip database [GeoLite2-City.mmdb] from [.geoip_databases] to [/tmp/elasticsearch-3041196443150992640/geoip-databases/vykaBmGHRxaCN15cp7xavQ/GeoLite2-City.mmdb.tmp.gz]
[2022-05-11T15:55:45,963][INFO ][o.e.i.g.GeoIpDownloader  ] [lab-ES-node-0] successfully downloaded geoip database [GeoLite2-City.mmdb]
[2022-05-11T15:55:46,752][INFO ][o.e.i.g.DatabaseNodeService] [lab-ES-node-0] retrieve geoip database [GeoLite2-Country.mmdb] from [.geoip_databases] to [/tmp/elasticsearch-3041196443150992640/geoip-databases/vykaBmGHRxaCN15cp7xavQ/GeoLite2-Country.mmdb.tmp.gz]
[2022-05-11T15:55:46,766][INFO ][o.e.i.g.GeoIpDownloader  ] [lab-ES-node-0] successfully downloaded geoip database [GeoLite2-Country.mmdb]
[2022-05-11T15:55:46,920][INFO ][o.e.i.g.DatabaseNodeService] [lab-ES-node-0] successfully loaded geoip database file [GeoLite2-Country.mmdb]
[2022-05-11T15:55:47,927][INFO ][o.e.i.g.DatabaseNodeService] [lab-ES-node-0] successfully loaded geoip database file [GeoLite2-City.mmdb]

3. Setting passwords


$ ./bin/elasticsearch-setup-passwords interactive --url https://<node 0 ip>:9200 
******************************************************************************
Note: The 'elasticsearch-setup-passwords' tool has been deprecated. This       command will be removed in a future release.
******************************************************************************

Initiating the setup of passwords for reserved users elastic,apm_system,kibana,kibana_system,logstash_system,beats_system,remote_monitoring_user.
You will be prompted to enter passwords as the process progresses.
Please confirm that you would like to continue [y/N]y


Enter password for [elastic]: 
Reenter password for [elastic]: 
Enter password for [apm_system]: 
Reenter password for [apm_system]: 
Enter password for [kibana_system]: 
Reenter password for [kibana_system]: 
Enter password for [logstash_system]: 
Reenter password for [logstash_system]: 
Enter password for [beats_system]: 
Reenter password for [beats_system]: 
Enter password for [remote_monitoring_user]: 
Reenter password for [remote_monitoring_user]: 
Changed password for user [apm_system]
Changed password for user [kibana_system]
Changed password for user [kibana]
Changed password for user [logstash_system]
Changed password for user [beats_system]
Changed password for user [remote_monitoring_user]
Changed password for user [elastic]

Log after setting the passwords

[2022-05-11T16:11:52,561][INFO ][o.e.c.m.MetadataCreateIndexService] [lab-ES-node-1] [.security-7] creating index, cause [api], templates [], shards [1]/[0]
[2022-05-11T16:11:52,566][INFO ][o.e.c.r.a.AllocationService] [lab-ES-node-1] updating number_of_replicas to [1] for indices [.security-7]
[2022-05-11T16:11:53,306][INFO ][o.e.c.r.a.AllocationService] [lab-ES-node-1] current.health="GREEN" message="Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.security-7][0]]])." previous.health="YELLOW" reason="shards started [[.security-7][0]]"

4. Request test


$ curl https://<host ip>:9200 -k -u elastic:<passwd>
{
  "name" : "lab-ES-node-0",
  "cluster_name" : "LAB-ES-1",
  "cluster_uuid" : "C9ftxwdeSvKYUrOUQBc4BA",
  "version" : {
    "number" : "8.2.0",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "b174af62e8dd9f4ac4d25875e9381ffe2b9282c5",
    "build_date" : "2022-04-20T10:35:10.180408517Z",
    "build_snapshot" : false,
    "lucene_version" : "9.1.0",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "You Know, for Search"
}
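
When repeating this check on each node, the fields I actually look at are the node name, the cluster name, and the version. A minimal sketch of that check, assuming the response body has already been fetched as text; `SAMPLE_RESPONSE` is a trimmed copy of the output above, and the function names are my own, not part of any library:

```python
import json

# Trimmed copy of the response shown above (only the fields used below).
SAMPLE_RESPONSE = """
{
  "name" : "lab-ES-node-0",
  "cluster_name" : "LAB-ES-1",
  "version" : {
    "number" : "8.2.0",
    "build_type" : "tar",
    "lucene_version" : "9.1.0"
  },
  "tagline" : "You Know, for Search"
}
"""


def summarize_root_response(body: str) -> dict:
    """Pick out the fields worth checking after a node comes up."""
    data = json.loads(body)
    return {
        "node": data["name"],
        "cluster": data["cluster_name"],
        "version": data["version"]["number"],
    }


def check_node(body: str, expected_cluster: str, expected_version: str) -> bool:
    """Return True when the node reports the expected cluster name and version."""
    summary = summarize_root_response(body)
    return (
        summary["cluster"] == expected_cluster
        and summary["version"] == expected_version
    )


if __name__ == "__main__":
    print(summarize_root_response(SAMPLE_RESPONSE))
    print(check_node(SAMPLE_RESPONSE, "LAB-ES-1", "8.2.0"))
```

A node that answers with a different cluster_name has most likely formed its own single-node cluster, so this is a quick way to catch that case.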

5. Data indexing test


5-1. First attempt

Initial code

from elasticsearch import Elasticsearch


if __name__ == "__main__":
    es = Elasticsearch(
        hosts=["https://<host ip>:9200"],
        basic_auth=("elastic", "mypasswd")
    )

    index = "sm-test-shopping"
    doc = {
        "title": "Romeo and Juliet:python",
        "author": "William Shakespeare",
        "category": "Tragedies",
        "publish_date": "1562-12-01T00:00:00",
        "pages": 125
    }

    es.index(index=index, document=doc)

Error log

Traceback (most recent call last):
  File "/home/smheo/dev/Common-Datahandler/test/insert-es-test.py", line 29, in <module>
    es.index(index=index, document=doc)
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/elasticsearch/_sync/client/utils.py", line 404, in wrapped
    return api(*args, **kwargs)
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/elasticsearch/_sync/client/__init__.py", line 2218, in index
    return self.perform_request(  # type: ignore[return-value]
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/elasticsearch/_sync/client/_base.py", line 286, in perform_request
    meta, resp_body = self.transport.perform_request(
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/elastic_transport/_transport.py", line 329, in perform_request
    meta, raw_data = node.perform_request(
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/elastic_transport/_node/_http_urllib3.py", line 164, in perform_request
    response = self.pool.urlopen(  # type: ignore[no-untyped-call]
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/urllib3/connectionpool.py", line 382, in _make_request
    self._validate_conn(conn)
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/urllib3/connectionpool.py", line 1010, in _validate_conn
    conn.connect()
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/urllib3/connection.py", line 416, in connect
    self.sock = ssl_wrap_socket(
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 453, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls)
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 495, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock)
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/ssl.py", line 997, in _create
    raise ValueError("check_hostname requires server_hostname")
ValueError: check_hostname requires server_hostname

Process finished with exit code 1
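
The last frame of the traceback is in Python's own ssl module, so the message can be reproduced without Elasticsearch at all. A minimal sketch using only the standard library: a default SSL context has check_hostname turned on, and wrapping a socket without passing server_hostname raises the same ValueError before any connection is made. The function name is my own.

```python
import socket
import ssl


def wrap_without_hostname() -> str:
    """Wrap an unconnected socket without server_hostname and return the error text."""
    ctx = ssl.create_default_context()  # check_hostname is True by default
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # No server_hostname is given, so the context has nothing to check against.
        ctx.wrap_socket(sock)
    except ValueError as err:
        return str(err)
    finally:
        sock.close()
    return "no error"


if __name__ == "__main__":
    print(wrap_without_hostname())
```

This only shows where the message originates. Why the client ended up without a server hostname in my setup is something I have not confirmed.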

Cause and fix

1. Check the instance.yml that was written when generating the cert files.

# vi instance.yml
instances:
  - name: "lab-ES-node-0"
    ip:
      - "<node 0 ip>"
  - name: "lab-ES-node-1"
    ip:
      - "<node 1 ip>"
  - name: "lab-ES-node-2"
    ip: 
      - "<node 2 ip>"

2. Change the hosts argument to use the instance name instead of the IP.

from elasticsearch import Elasticsearch


if __name__ == "__main__":
    es = Elasticsearch(
        hosts=["https://lab-ES-node-0:9200"],  # lab-ES-node-1 or lab-ES-node-2 also works, as long as it is reachable
        basic_auth=("elastic", "mypasswd"),
    )

    index = "sm-test-shopping"
    doc = {
        "title": "Romeo and Juliet:python",
        "author": "William Shakespeare",
        "category": "Tragedies",
        "publish_date": "1562-12-01T00:00:00",
        "pages": 125
    }

    es.index(index=index, document=doc)

3. Edit the hosts file

$ sudo vi /etc/hosts

# hosts
...
...
<node 0 ip>   lab-ES-node-0
<node 1 ip>   lab-ES-node-1
<node 2 ip>   lab-ES-node-2
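
Since the names in instance.yml and the names in the hosts file have to match exactly, generating the lines from one mapping avoids typos. A small sketch, assuming the name-to-IP mapping is kept in a dict; the 192.0.2.x addresses are documentation placeholders, not my real node IPs, and `hosts_entries` is a helper name I made up:

```python
from typing import Dict, List


def hosts_entries(instances: Dict[str, str]) -> List[str]:
    """Return one '<ip>   <name>' line per instance, in insertion order."""
    return [f"{ip}   {name}" for name, ip in instances.items()]


if __name__ == "__main__":
    # Placeholder addresses for illustration only.
    instances = {
        "lab-ES-node-0": "192.0.2.10",
        "lab-ES-node-1": "192.0.2.11",
        "lab-ES-node-2": "192.0.2.12",
    }
    for line in hosts_entries(instances):
        print(line)
```

The printed lines can then be appended to /etc/hosts on the machine that runs the client.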

4. Final code

from elasticsearch import Elasticsearch


if __name__ == "__main__":
    es = Elasticsearch(
        hosts=["https://lab-ES-node-0:9200", "https://lab-ES-node-1:9200", "https://lab-ES-node-2:9200"],
        basic_auth=("elastic", "mypasswd")
    )

    index = "sm-test-shopping"
    doc = {
        "title": "Romeo and Juliet:python",
        "author": "William Shakespeare",
        "category": "Tragedies",
        "publish_date": "1562-12-01T00:00:00",
        "pages": 125
    }

    es.index(index=index, document=doc)

5-2. Second attempt

Error log from running the final code of 5-1

Traceback (most recent call last):
  File "/home/smheo/dev/Common-Datahandler/test/insert-es-test.py", line 21, in <module>
    es.index(index=index, document=doc)
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/elasticsearch/_sync/client/utils.py", line 404, in wrapped
    return api(*args, **kwargs)
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/elasticsearch/_sync/client/__init__.py", line 2218, in index
    return self.perform_request(  # type: ignore[return-value]
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/elasticsearch/_sync/client/_base.py", line 286, in perform_request
    meta, resp_body = self.transport.perform_request(
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/elastic_transport/_transport.py", line 329, in perform_request
    meta, raw_data = node.perform_request(
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/elastic_transport/_node/_http_urllib3.py", line 199, in perform_request
    raise err from None
elastic_transport.TlsError: TLS error caused by: TlsError(TLS error caused by: SSLError([SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1131)))

Process finished with exit code 1

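Cause and fix (← strictly speaking, a workaround)

  • Passing verify_certs=False when creating the Elasticsearch client keeps CERTIFICATE_VERIFY_FAILED from being raised.
  • This is a workaround, though, not the proper fix.
  • That does not mean verify_certs=False leaves no protection at all. Creating the client with verify_certs=False and no user or password raises AuthenticationException, so the safeguards that can still be applied are in place.
  • My guess: since the log says "self signed certificate in certificate chain", CERTIFICATE_VERIFY_FAILED is raised because I issued the certificate myself. Normally a certificate issued by a certificate authority is installed on the server, and the client uses it to check whether the server is forged. My certificate was issued by me rather than by a public CA, and that seems to be the problem.
    • On second thought, this is not a server security problem. The check exists to protect the client, not the server. Silly me. Running the server is unaffected, but the client cannot be sure that this ES is not a forged server.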
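
To make the difference concrete for myself, here is a rough sketch at the level of Python's ssl module. It is my assumption that verify_certs=False boils down to something like the "loose" context below; I have not checked the client source. The "strict" context is the default behaviour, and passing a CA file (a hypothetical PEM export of my own CA, which would first have to be extracted from the .p12) is the direction the proper fix would take. `build_context` is my own helper name.

```python
import ssl
from typing import Optional


def build_context(ca_file: Optional[str] = None, verify: bool = True) -> ssl.SSLContext:
    """Build a client-side SSL context.

    ca_file: optional path to a PEM file with the CA to trust (hypothetical here).
    verify:  False disables both hostname checking and certificate verification.
    """
    ctx = ssl.create_default_context(cafile=ca_file)
    if not verify:
        # check_hostname must be switched off before verify_mode can be CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


if __name__ == "__main__":
    strict = build_context()
    loose = build_context(verify=False)
    print(strict.verify_mode, strict.check_hostname)
    print(loose.verify_mode, loose.check_hostname)
```

As far as I know, the 8.x Python client also accepts a ca_certs argument for pointing at such a CA file, which would let the client verify the server without turning verification off.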

Creating the client with verify_certs=False and no user or password

/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/elasticsearch/_sync/client/__init__.py:375: SecurityWarning: Connecting to 'https://lab-es-node-1:9200' using TLS with verify_certs=False is insecure
  _transport = transport_class(
/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/elasticsearch/_sync/client/__init__.py:375: SecurityWarning: Connecting to 'https://lab-es-node-0:9200' using TLS with verify_certs=False is insecure
  _transport = transport_class(
/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/elasticsearch/_sync/client/__init__.py:375: SecurityWarning: Connecting to 'https://lab-es-node-2:9200' using TLS with verify_certs=False is insecure
  _transport = transport_class(
/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/urllib3/connectionpool.py:1013: InsecureRequestWarning: Unverified HTTPS request is being made to host 'lab-es-node-1'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings
  warnings.warn(
Traceback (most recent call last):
  File "/home/smheo/dev/Common-Datahandler/test/insert-es-test.py", line 21, in <module>
    resp = es.index(index=index, document=doc, id=3)
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/elasticsearch/_sync/client/utils.py", line 404, in wrapped
    return api(*args, **kwargs)
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/elasticsearch/_sync/client/__init__.py", line 2218, in index
    return self.perform_request(  # type: ignore[return-value]
  File "/home/smheo/anaconda3/envs/data_handler/lib/python3.8/site-packages/elasticsearch/_sync/client/_base.py", line 321, in perform_request
    raise HTTP_EXCEPTIONS.get(meta.status, ApiError)(
elasticsearch.AuthenticationException: AuthenticationException(401, 'security_exception', 'missing authentication credentials for REST request [/sm-test-bookstore/_doc/3]')

Process finished with exit code 1
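
The 401 above means the request got through TLS but carried no credentials. With HTTP Basic authentication, a user/password pair travels as an Authorization header, and my understanding is that basic_auth on the client ends up as exactly that. A minimal sketch of how such a header is built; the helper name is my own, and the password is the same placeholder used in the examples:

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    raw = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


if __name__ == "__main__":
    print(basic_auth_header("elastic", "mypasswd"))
```

Base64 is an encoding, not encryption, so this header is only safe because the connection itself is TLS.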
  • Reference: 오복이네 (oboki.net), 오복애비's data engineering notes

6. Final result


elasticsearch.yml

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: "LAB-ES-1"
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: "lab-ES-node-1"
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
#path.data: /path/to/data
path.data: ["/home/user/distributed-pipeline/elasticsearch-8.2.0/data"]
#
# Path to log files:
#
#path.logs: /path/to/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# By default Elasticsearch is only accessible on localhost. Set a different
# address here to expose this node on the network:
#
#network.host: 192.168.0.1
network.bind_host: <node 1 IP>
network.publish_host: <PUBLIC IP>

transport.port: 9301
#
# By default Elasticsearch listens for HTTP traffic on the first free port it
# finds starting at 9200. Set a specific HTTP port here:
#
#http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.seed_hosts: ["host1", "host2"]
discovery.seed_hosts: ["<node 0 IP>:9300", "<node 1 IP>:9301", "<node 2 IP>:9302"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
#cluster.initial_master_nodes: ["node-1", "node-2"]
cluster.initial_master_nodes: ["lab-ES-node-0", "lab-ES-node-1", "lab-ES-node-2"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# --------------------------------- Readiness ----------------------------------
#
# Enable an unauthenticated TCP readiness endpoint on localhost
#
#readiness.port: 9399
#
# ---------------------------------- Various -----------------------------------
#
# Allow wildcard deletion of indices:
#
#action.destructive_requires_name: false
#
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: certs/elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: certs/elastic-certificates.p12
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.verification_mode: certificate
xpack.security.http.ssl.keystore.path: certs/elastic-certificates.p12      
xpack.security.http.ssl.truststore.path: certs/elastic-certificates.p12
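
Only a few lines of this file differ between the three nodes. A small sketch that renders those per-node lines, assuming the transport ports follow the 9300/9301/9302 pattern from discovery.seed_hosts above; the bind address is a documentation placeholder and the helper names are my own:

```python
from typing import Dict

NODE_NAMES = ["lab-ES-node-0", "lab-ES-node-1", "lab-ES-node-2"]
BASE_TRANSPORT_PORT = 9300  # node N is assumed to use 9300 + N


def node_overrides(index: int, bind_host: str) -> Dict[str, object]:
    """Return the settings that differ for the node at the given index."""
    return {
        "node.name": NODE_NAMES[index],
        "network.bind_host": bind_host,
        "transport.port": BASE_TRANSPORT_PORT + index,
    }


def render(settings: Dict[str, object]) -> str:
    """Render settings as 'key: value' lines for elasticsearch.yml."""
    return "\n".join(f"{key}: {value}" for key, value in settings.items())


if __name__ == "__main__":
    # 192.0.2.11 is a placeholder, not the real address of node 1.
    print(render(node_overrides(1, "192.0.2.11")))
```

Everything else (cluster.name, discovery.seed_hosts, cluster.initial_master_nodes, the xpack.security lines) stays identical across the nodes.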

 

Simple python example

from elasticsearch import Elasticsearch


if __name__ == "__main__":
    es = Elasticsearch(
        hosts=["https://lab-ES-node-0:9200", "https://lab-ES-node-1:9200", "https://lab-ES-node-2:9200"],
        basic_auth=("elastic", "mypasswd"),
        verify_certs=False
    )

    index = "sm-test-bookstore"
    doc = {
        "title": "Romeo and Juliet:python",
        "author": "William Shakespeare",
        "category": "Tragedies",
        "publish_date": "1562-12-01T00:00:00",
        "pages": 125
    }

    # Index with an explicit id so that the get() call below can find the document.
    resp = es.index(index=index, id=3, document=doc)
    print(resp)
    resp = es.get(index=index, id=3)
    print(resp)

    # Fetch everything in the index and print one line per document.
    resp = es.search(index=index, query={"match_all": {}})
    for hit in resp.get("hits").get("hits"):
        docu = hit.get('_source')
        print(f"TITLE: {docu.get('title')} | AUTHOR: {docu.get('author')} | PUB DATE: {docu.get('publish_date')}")

Output

TITLE: Romeo and Juliet:python | AUTHOR: William Shakespeare | PUB DATE: 2022-05-17T16:39:16.054243
TITLE: Romeo and Juliet:python | AUTHOR: William Shakespeare | PUB DATE: 2022-05-17T16:40:01.621532
TITLE: Romeo and Juliet:python | AUTHOR: William Shakespeare | PUB DATE: 2022-05-17T16:40:02.723050
TITLE: Romeo and Juliet:python | AUTHOR: William Shakespeare | PUB DATE: 2022-05-17T16:40:15.662012
TITLE: Romeo and Juliet:python | AUTHOR: William Shakespeare | PUB DATE: 2022-05-17T16:40:17.031280
TITLE: Romeo and Juliet:Rewind | AUTHOR: William Shakespeare | PUB DATE: 1562-12-01T13:00:00
TITLE: Romeo and Juliet:python | AUTHOR: William Shakespeare | PUB DATE: 1562-12-01T00:00:00
TITLE: Romeo and Juliet:python | AUTHOR: William Shakespeare | PUB DATE: 2022-05-17T16:38:45.230190
TITLE: Romeo and Juliet:python | AUTHOR: William Shakespeare | PUB DATE: 2022-05-17T16:40:39.427729
TITLE: Romeo and Juliet:python | AUTHOR: William Shakespeare | PUB DATE: 2022-05-17T16:40:43.011230

Process finished with exit code 0

To-do

  • Study security topics: SSL, TLS, HTTPS, CA, etc.