Database/Redis

[Redis] Jedis 에서 Sentinel 의 pub/sub 을 제대로 인지할까?

꽁담 2020. 12. 27. 23:40

개요


Redis 와 Sentinel 로 구성된 레디스가 있습니다.

Jedis 를 사용하는 어플리케이션에서 Sentinel 에서 마스터 IP 를 조회하여 Redis 에 접속하고 있습니다.

 

sentinel failover 를 수행한 후 어플리케이션에서 새로운 마스터를 인지하지 못하는 상황이 발생했습니다.

모든 Sentinel 에서는 새로운 마스터 IP 로 변경되어 있는 상태였습니다.

 

 

원인에 대한 범위 좁히기


왜 어플리케이션에서 새로운 마스터를 찾지 못했을까

Jedis 라이브러리를 사용하는 어플리케이션에서 Pub/Sub 기능을 이용하여 Sentinel 의 메시지를 받고

이 Sentinel 메시지 중 +switch-master 의 메시지를 받으면 Failover 가 수행되도록 쓰레드로 구현이 되어 있습니다.

 

그렇다면.. 아래와 같이 2가지 원인으로 좁힐 수 있습니다.

2번의 경우에는 모든 센티넬에서 마스터 IP 가 변경되었기 때문에 1번의 원인을 좀 더 우선순위로 두었습니다.

1. Jedis 에서 +switch-master 메시지를 받지 못하였거나

2. Sentinel 에서 +switch-master 메시지를 전달을 안했다는게 원인이 됩니다.

 

추가 근거

redisgate (레디스 전문 사이트) 에서는 아래와 같은 문구가 기록되어 있습니다.

http://redisgate.kr/redis/clients/jedis_pubsub.php

 

 

근거에 대한 객관적 자료를 확보하기 위해 테스트를 진행해 봅니다.

 

 

환경


버전

테스트를 위해 사용한 버전은 다음과 같습니다.

종류 버전
OS Ubuntu 18.04
JDK openjdk 1.8
Jedis 2.9.0
Redis 2.8.4

 

Redis 와 Sentinel 구조

Redis 와 Sentinel 의 IP 와 Port 입니다.

Redis1 과 Redis2 는 마스터/슬레이브 구조입니다.

Sentinel1 / Sentinel2 / Sentinel3 에서는 위 Redis1, Redis2 를 모니터링하고 있습니다.

레디스/센티넬 IP 와 Port
Redis1 127.0.0.1 6161
Redis2 127.0.0.1 6162
Sentinel1 127.0.0.1 26379
Sentinel2 127.0.0.1 26380
Sentinel3 127.0.0.1 26381

Redis 와 Sentinel 의 Property 입니다.

레디스/센티넬 프로퍼티
Redis1 $ cat /home/psh/redis/redis_6161.conf
daemonize yes

port 6161

save 900 1
save 300 10
save 60 10000

masterauth "P@ssword"
requirepass "P@ssword"

pidfile "/home/psh/redis/redis_6161.pid"
logfile "/home/psh/redis/redis_6161.log"
dir "/home/psh/redis"
dbfilename "6161.rdb"
Redis2 $ cat /home/psh/redis/redis_6162.pid
daemonize yes 


port 6162 

save 900 1 
save 300 10 
save 60 10000 

masterauth "P@ssword"
requirepass "P@ssword"

pidfile "/home/psh/redis/redis_6162.pid"
logfile "/home/psh/redis/redis_6162.log"
dir "/home/psh/redis"
dbfilename "6162.rdb"
Sentinel1 $ cat /home/psh/redis/sentinel_26379.conf
daemonize yes
pidfile "/home/psh/redis/sentinel_26379.pid"
logfile "/home/psh/redis/sentinel_26379.log"
dir "/home/psh/redis"

port 26379

sentinel monitor test 127.0.0.1 6161 2
sentinel down-after-milliseconds test 3000
sentinel auth-pass test P@ssword
Sentinel2 $ cat /home/psh/redis/sentinel_26380.conf
daemonize yes
pidfile "/home/psh/redis/sentinel_26380.pid"
logfile "/home/psh/redis/sentinel_26380.log"
dir "/home/psh/redis"

port 26380

sentinel monitor test 127.0.0.1 6161 2
sentinel down-after-milliseconds test 3000
sentinel auth-pass test P@ssword
Sentinel3 $ cat /home/psh/redis/sentinel_26381.conf
daemonize yes
pidfile "/home/psh/redis/sentinel_26381.pid"
logfile "/home/psh/redis/sentinel_26381.log"
dir "/home/psh/redis"

port 26381

sentinel monitor test 127.0.0.1 6161 2
sentinel down-after-milliseconds test 3000
sentinel auth-pass test P@ssword

 

 

센티넬 Pub/Sub 메시지 받기


Java 코드에서 메시지 받기

Java 코드를 간략하게 설명하면 jedis 2.9.0 라이브러리를 import 하였고

Sentinel1 / 2 / 3 에 접속하여 쓰레드로 Sentinel 의 Pub 메시지를 받도록 구현시켜 놓은 코드입니다.

switch-master 로그만 받도록 해놓은게 아닌 모든 메시지를 받도록 구현하였습니다.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
import java.text.SimpleDateFormat;
import redis.clients.jedis.*;
 
import java.util.Set;
import java.util.HashSet;
 
public class test {
        public static void main(String[] args) {
                Runnable r1 = new classRun1();
                Thread[] t1 = new Thread[1];
 
                Runnable r2 = new classRun2();
                Thread[] t2 = new Thread[1];
 
                Runnable r3 = new classRun3();
                Thread[] t3 = new Thread[1];
 
                t1[0= new Thread(r1);
                t2[0= new Thread(r2);
                t3[0= new Thread(r3);
                t1[0].start();
                t2[0].start();
                t3[0].start();
        }
}
 
class classRun1 implements Runnable {
        public void run()
        {
        final String host1 = "127.0.0.1";
        final int port1 = 26379;
        final Jedis sentinelJedis1 = new Jedis(host1, port1);
 
        SimpleDateFormat format1 = new SimpleDateFormat ( "yyyy-MM-dd HH:mm:ss" );
 
                while(!Thread.currentThread().isInterrupted())
                {
                        try {
                                Thread.sleep(1000);
 
                                System.out.println("Ping 체크시작");
                                for(int j = 0 ; j < 10 ; j ++)
                                {
                                        try {
                                                sentinelJedis1.ping();
                                                break;
                                        }catch(Exception e) {
                                                continue;
                                        }
                                }
                                sentinelJedis1.psubscribe ( new JedisPubSub() {
                                        public void onMessage(String channel, String message){}
                                        public void onSubscribe(String channel, int num){}
                                        public void onUnsubscribe(String channel, int num){}
                                        public void onPSubscribe(String channel, int num){}
                                        public void onPUnsubscribe(String channel, int num){}
                                        public void onPMessage(String pattern, String event, String message) {
                        String format_time = format1.format(System.currentTimeMillis());
                                                System.out.println("[" + format_time + "] sentinelJedis1 : pattern : " + pattern);
                                                System.out.println("[" + format_time + "] sentinelJedis1 : event : " + event);
                                                System.out.println("[" + format_time + "] sentinelJedis1 : message : " + message);
                    }
                                }, "*");
                        }catch (Exception e){
                                System.out.println("[ERROR] Exception : " + e.getMessage());
                                continue;
                        }
                }
                try {
                        Thread.sleep(1000);
                }
                catch (InterruptedException e) {
                        System.out.println("[ERROR] InterruptedException : " + e.getMessage());
                }
        }
}
 
class classRun2 implements Runnable {
        public void run()
        {
        final String host2 = "127.0.0.1";
        final int port2 = 26380;
        final Jedis sentinelJedis2 = new Jedis(host2, port2);
 
        SimpleDateFormat format1 = new SimpleDateFormat ( "yyyy-MM-dd HH:mm:ss" );
 
                while(!Thread.currentThread().isInterrupted())
                {
                        try {
                                Thread.sleep(1000);
 
                                System.out.println("Ping 체크시작");
                                for(int j = 0 ; j < 10 ; j ++)
                                {
                                        try {
                                                sentinelJedis2.ping();
                                                break;
                                        }catch(Exception e) {
                                                continue;
                                        }
                                }
                                sentinelJedis2.psubscribe ( new JedisPubSub() {
                                        public void onMessage(String channel, String message){}
                                        public void onSubscribe(String channel, int num){}
                                        public void onUnsubscribe(String channel, int num){}
                                        public void onPSubscribe(String channel, int num){}
                                        public void onPUnsubscribe(String channel, int num){}
                                        public void onPMessage(String pattern, String event, String message) {
                        String format_time = format1.format(System.currentTimeMillis());
                                                System.out.println("[" + format_time + "] sentinelJedis2 : pattern : " + pattern);
                                                System.out.println("[" + format_time + "] sentinelJedis2 : event : " + event);
                                                System.out.println("[" + format_time + "] sentinelJedis2 : message : " + message);
                    }
                                }, "*");
                        }catch (Exception e){
                                System.out.println("[ERROR] Exception : " + e.getMessage());
                                continue;
                        }
                }
                try {
                        Thread.sleep(1000);
                }
                catch (InterruptedException e) {
                        System.out.println("[ERROR] InterruptedException : " + e.getMessage());
                }
        }
}
 
class classRun3 implements Runnable {
        public void run()
        {
        final String host3 = "127.0.0.1";
        final int port3 = 26381;
        final Jedis sentinelJedis3 = new Jedis(host3, port3);
 
        SimpleDateFormat format1 = new SimpleDateFormat ( "yyyy-MM-dd HH:mm:ss" );
 
                while(!Thread.currentThread().isInterrupted())
                {
                        try {
                                Thread.sleep(1000);
 
                                System.out.println("Ping 체크시작");
                                for(int j = 0 ; j < 10 ; j ++)
                                {
                                        try {
                                                sentinelJedis3.ping();
                                                break;
                                        }catch(Exception e) {
                                                continue;
                                        }
                                }
                                sentinelJedis3.psubscribe ( new JedisPubSub() {
                                        public void onMessage(String channel, String message){
                        System.out.println("sentinelJedis3 onMessage : " +  message);
                    }
                                        public void onSubscribe(String channel, int num){}
                                        public void onUnsubscribe(String channel, int num){}
                                        public void onPSubscribe(String channel, int num){}
                                        public void onPUnsubscribe(String channel, int num){}
                                        public void onPMessage(String pattern, String event, String message) {
                        String format_time = format1.format(System.currentTimeMillis());
                                                System.out.println("[" + format_time + "] sentinelJedis3 : pattern : " + pattern);
                                                System.out.println("[" + format_time + "] sentinelJedis3 : event : " + event);
                                                System.out.println("[" + format_time + "] sentinelJedis3 : message : " + message);
                    }
                                }, "*");
                        }catch (Exception e){
                                System.out.println("[ERROR] Exception : " + e.getMessage());
                                continue;
                        }
                }
                try {
                        Thread.sleep(1000);
                }
                catch (InterruptedException e) {
                        System.out.println("[ERROR] InterruptedException : " + e.getMessage());
                }
        }
}
 
cs

 

redis-cli 로 직접 붙어서 subscribe 에서 메시지 받기

센티넬 명령어
Sentinel1 $ redis-cli -p 26379
127.0.0.1:26379> psubscribe *
Sentinel2 $ redis-cli -p 26380 
127.0.0.1:26380> psubscribe *
Sentinel3 $ redis-cli -p 26381 
127.0.0.1:26381> psubscribe *

 

 

테스트 진행


Sentinel1 에서 Failover 를 실행

127.0.0.1:26379> sentinel failover test
OK

 

Sentinel 로그

센티넬 로그파일에 switch-master 로그내용이 기록되었습니다. (진하게 표기)

센티넬 로그 파일의 내용
Sentinel1 [16863] 27 Dec 22:57:16.459 # Executing user requested FAILOVER of 'test' 
[16863] 27 Dec 22:57:16.459 # +new-epoch 15 
[16863] 27 Dec 22:57:16.459 # +try-failover master test 127.0.0.1 6161 
[16863] 27 Dec 22:57:16.474 # +vote-for-leader 7c8f6176dfbcf5397ddf30e42df90c2ba8778279 15 
[16863] 27 Dec 22:57:16.475 # +elected-leader master test 127.0.0.1 6161 
[16863] 27 Dec 22:57:16.475 # +failover-state-select-slave master test 127.0.0.1 6161 
[16863] 27 Dec 22:57:16.534 # +selected-slave slave 127.0.0.1:6162 127.0.0.1 6162 @ test 127.0.0.1 6161 
[16863] 27 Dec 22:57:16.534 * +failover-state-send-slaveof-noone slave 127.0.0.1:6162 127.0.0.1 6162 @ test 127.0.0.1 6161 
[16863] 27 Dec 22:57:16.610 * +failover-state-wait-promotion slave 127.0.0.1:6162 127.0.0.1 6162 @ test 127.0.0.1 6161 
[16863] 27 Dec 22:57:17.905 # +promoted-slave slave 127.0.0.1:6162 127.0.0.1 6162 @ test 127.0.0.1 6161 
[16863] 27 Dec 22:57:17.905 # +failover-state-reconf-slaves master test 127.0.0.1 6161 
[16863] 27 Dec 22:57:17.905 # +failover-end master test 127.0.0.1 6161 
[16863] 27 Dec 22:57:17.905 # +switch-master test 127.0.0.1 6161 127.0.0.1 6162 
[16863] 27 Dec 22:57:17.906 * +slave slave 127.0.0.1:6161 127.0.0.1 6161 @ test 127.0.0.1 6162 
[16863] 27 Dec 22:57:28.096 * +convert-to-slave slave 127.0.0.1:6161 127.0.0.1 6161 @ test 127.0.0.1 6162
Sentinel2 [16867] 27 Dec 22:57:17.458 # +new-epoch 15 
[16867] 27 Dec 22:57:19.516 # +switch-master test 127.0.0.1 6161 127.0.0.1 6162 
[16867] 27 Dec 22:57:19.517 * +slave slave 127.0.0.1:6161 127.0.0.1 6161 @ test 127.0.0.1 6162
Sentinel3 [16871] 27 Dec 22:57:17.458 # +new-epoch 15 
[16871] 27 Dec 22:57:19.515 # +switch-master test 127.0.0.1 6161 127.0.0.1 6162 
[16871] 27 Dec 22:57:19.516 * +slave slave 127.0.0.1:6161 127.0.0.1 6161 @ test 127.0.0.1 6162

 

 

redis-cli 로 직접 붙어서 subscribe 에서 메시지 받았을 때

switch-master 로그가 기록되어 있는 것을 확인할 수 있습니다. (진하게 표기)

센티넬 psubscribe 로 확인한 로그
Sentinel1 1) "pmessage"
2) "*"
3) "+new-epoch"
4) "15"
1) "pmessage"
2) "*"
3) "+try-failover"
4) "master test 127.0.0.1 6161"
1) "pmessage"
2) "*"
3) "+vote-for-leader"
4) "7c8f6176dfbcf5397ddf30e42df90c2ba8778279 15"
1) "pmessage"
2) "*"
3) "+elected-leader"
4) "master test 127.0.0.1 6161"
1) "pmessage"
2) "*"
3) "+failover-state-select-slave"
4) "master test 127.0.0.1 6161"
1) "pmessage"
2) "*"
3) "+selected-slave"
4) "slave 127.0.0.1:6162 127.0.0.1 6162 @ test 127.0.0.1 6161"
1) "pmessage"
2) "*"
3) "+failover-state-send-slaveof-noone"
4) "slave 127.0.0.1:6162 127.0.0.1 6162 @ test 127.0.0.1 6161"
1) "pmessage"
2) "*"
3) "+failover-state-wait-promotion"
4) "slave 127.0.0.1:6162 127.0.0.1 6162 @ test 127.0.0.1 6161"
1) "pmessage"
2) "*"
3) "+promoted-slave"
4) "slave 127.0.0.1:6162 127.0.0.1 6162 @ test 127.0.0.1 6161"
1) "pmessage"
2) "*"
3) "+failover-state-reconf-slaves"
4) "master test 127.0.0.1 6161"
1) "pmessage"
2) "*"
3) "+failover-end"
4) "master test 127.0.0.1 6161"
1) "pmessage"
2) "*"
3) "+switch-master"
4) "test 127.0.0.1 6161 127.0.0.1 6162"
1) "pmessage"
2) "*"
3) "+slave"
4) "slave 127.0.0.1:6161 127.0.0.1 6161 @ test 127.0.0.1 6162"
1) "pmessage"
2) "*"
3) "+convert-to-slave"
4) "slave 127.0.0.1:6161 127.0.0.1 6161 @ test 127.0.0.1 6162"
Sentinel2 1) "pmessage"
2) "*"
3) "+new-epoch"
4) "15"
1) "pmessage"
2) "*"
3) "+switch-master"
4) "test 127.0.0.1 6161 127.0.0.1 6162"
1) "pmessage"
2) "*"
3) "+slave"
4) "slave 127.0.0.1:6161 127.0.0.1 6161 @ test 127.0.0.1 6162"
Sentinel3 1) "pmessage"
2) "*"
3) "+new-epoch"
4) "15"
1) "pmessage"
2) "*"
3) "+switch-master"
4) "test 127.0.0.1 6161 127.0.0.1 6162"
1) "pmessage"
2) "*"
3) "+slave"
4) "slave 127.0.0.1:6161 127.0.0.1 6161 @ test 127.0.0.1 6162"

 

 

Java 코드에서 메시지 받기

switch-master 로그가 정상적으로 기록되었습니다.

센티넬 자바 로그
Sentinel1 [2020-12-27 22:57:16] sentinelJedis1 : pattern : *
[2020-12-27 22:57:16] sentinelJedis1 : event : +new-epoch
[2020-12-27 22:57:16] sentinelJedis1 : message : 15 
[2020-12-27 22:57:16] sentinelJedis1 : pattern : *
[2020-12-27 22:57:16] sentinelJedis1 : event : +try-failover
[2020-12-27 22:57:16] sentinelJedis1 : message : master test 127.0.0.1 6161
[2020-12-27 22:57:16] sentinelJedis1 : pattern : *
[2020-12-27 22:57:16] sentinelJedis1 : event : +vote-for-leader
[2020-12-27 22:57:16] sentinelJedis1 : 7c8f6176dfbcf5397ddf30e42df90c2ba8778279 15
[2020-12-27 22:57:16] sentinelJedis1 : pattern : *
[2020-12-27 22:57:16] sentinelJedis1 : event : +elected-leader
[2020-12-27 22:57:16] sentinelJedis1 : message : master test 127.0.0.1 6161
[2020-12-27 22:57:16] sentinelJedis1 : pattern : *
[2020-12-27 22:57:16] sentinelJedis1 : event : +failover-state-select-slave
[2020-12-27 22:57:16] sentinelJedis1 : message : master test 127.0.0.1 6161
[2020-12-27 22:57:16] sentinelJedis1 : pattern : *
[2020-12-27 22:57:16] sentinelJedis1 : event : +selected-slave
[2020-12-27 22:57:16] sentinelJedis1 : message : slave 127.0.0.1:6162 127.0.0.1 6162 @ test 127.0.0.1 6161
[2020-12-27 22:57:16] sentinelJedis1 : pattern : *
[2020-12-27 22:57:16] sentinelJedis1 : event : +failover-state-send-slaveof-noone
[2020-12-27 22:57:16] sentinelJedis1 : message : slave 127.0.0.1:6162 127.0.0.1 6162 @ test 127.0.0.1 6161
[2020-12-27 22:57:16] sentinelJedis1 : pattern : *
[2020-12-27 22:57:16] sentinelJedis1 : event : +failover-state-wait-promotion
[2020-12-27 22:57:16] sentinelJedis1 : message : slave 127.0.0.1:6162 127.0.0.1 6162 @ test 127.0.0.1 6161
[2020-12-27 22:57:17] sentinelJedis1 : pattern : *
[2020-12-27 22:57:17] sentinelJedis1 : event : +promoted-slave
[2020-12-27 22:57:17] sentinelJedis1 : message : slave 127.0.0.1:6162 127.0.0.1 6162 @ test 127.0.0.1 6161
[2020-12-27 22:57:17] sentinelJedis1 : pattern : *
[2020-12-27 22:57:17] sentinelJedis1 : event : +failover-state-reconf-slaves
[2020-12-27 22:57:17] sentinelJedis1 : message : master test 127.0.0.1 6161
[2020-12-27 22:57:17] sentinelJedis1 : pattern : *
[2020-12-27 22:57:17] sentinelJedis1 : event : +failover-end
[2020-12-27 22:57:17] sentinelJedis1 : message : master test 127.0.0.1 6161
[2020-12-27 22:57:17] sentinelJedis1 : pattern : *
[2020-12-27 22:57:17] sentinelJedis1 : event : +switch-master
[2020-12-27 22:57:17] sentinelJedis1 : message : test 127.0.0.1 6161 127.0.0.1 6162
[2020-12-27 22:57:17] sentinelJedis1 : pattern : *
[2020-12-27 22:57:17] sentinelJedis1 : event : +slave
[2020-12-27 22:57:17] sentinelJedis1 : message : slave 127.0.0.1:6161 127.0.0.1 6161 @ test 127.0.0.1 6162
[2020-12-27 22:57:27] sentinelJedis1 : pattern : *
[2020-12-27 22:57:27] sentinelJedis1 : event : +convert-to-slave
[2020-12-27 22:57:27] sentinelJedis1 : message : slave 127.0.0.1:6161 127.0.0.1 6161 @ test 127.0.0.1 6162
Sentinel2 [2020-12-27 22:57:17] sentinelJedis2 : pattern : *
[2020-12-27 22:57:17] sentinelJedis2 : event : +new-epoch
[2020-12-27 22:57:17] sentinelJedis2 : message : 15 
[2020-12-27 22:57:19] sentinelJedis2 : pattern : *
[2020-12-27 22:57:19] sentinelJedis2 : event : +switch-master
[2020-12-27 22:57:19] sentinelJedis2 : message : test 127.0.0.1 6161 127.0.0.1 6162
[2020-12-27 22:57:19] sentinelJedis2 : pattern : *
[2020-12-27 22:57:19] sentinelJedis2 : event : +slave
[2020-12-27 22:57:19] sentinelJedis2 : message : slave 127.0.0.1:6161 127.0.0.1 6161 @ test 127.0.0.1 6162
Sentinel3 [2020-12-27 22:57:17] sentinelJedis3 : pattern : *
[2020-12-27 22:57:17] sentinelJedis3 : event : +new-epoch
[2020-12-27 22:57:17] sentinelJedis3 : message : 15 
[2020-12-27 22:57:19] sentinelJedis3 : pattern : *
[2020-12-27 22:57:19] sentinelJedis3 : event : +switch-master
[2020-12-27 22:57:19] sentinelJedis3 : message : test 127.0.0.1 6161 127.0.0.1 6162
[2020-12-27 22:57:19] sentinelJedis3 : pattern : *
[2020-12-27 22:57:19] sentinelJedis3 : event : +slave
[2020-12-27 22:57:19] sentinelJedis3 : message : slave 127.0.0.1:6161 127.0.0.1 6161 @ test 127.0.0.1 6162

 

 

레디스 작업도중 센티넬에서 어떠한 이유에 의해.. switch-master event 를 던져주지 못했다..? 로 추정..

은 다음 포스팅에서 기회되면 정리