Configuration and detailed description of controller-witness
Description
controller-witness is a service that changes the replication roles of two controllers.
It runs as a systemd service and takes its parameters from a configuration file.
The configuration file must be filled in before the service is started.
The service attempts to establish an SSH connection to both servers using the parameters from /etc/default/controller-witness
and then periodically checks connectivity and the status of each server.
If the server currently acting as replication master becomes unavailable,
the service attempts to change the role of the current slave server to master.
When two masters are detected in the cluster, the service sends the "set role master" command
to the server that has been seen continuously as master for the longer time. This sets the other server's role to slave.
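The decision rules described above can be sketched roughly as follows. This is only an illustration, not the service's actual code: ServerState, choose_promotion and the role strings are hypothetical names, and delays such as master_failure_duration are ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerState:
    """Hypothetical snapshot of one server as seen by the witness."""
    name: str
    role: Optional[str]                   # "master", "slave", or None if unreachable
    master_since: Optional[float] = None  # time since which it was continuously seen as master


def choose_promotion(a: ServerState, b: ServerState) -> Optional[str]:
    """Return the name of the server that should receive "set role master", or None."""
    # Two masters: keep the one that has been master for the longer time,
    # i.e. the one with the earlier master_since timestamp (ties go to a).
    if a.role == "master" and b.role == "master":
        return a.name if a.master_since <= b.master_since else b.name
    # One server unreachable and the other is a slave: promote the slave.
    if a.role is None and b.role == "slave":
        return b.name
    if b.role is None and a.role == "slave":
        return a.name
    # Healthy master/slave pair, or nothing reachable: no action.
    return None
```

In the two-master case, the server that receives the command stays master and the other one becomes slave, as described above.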
Options
Settings are stored in the /etc/default/controller-witness file.
The file consists of two sections, [default] and [optional].
If the settings file is changed manually, the service must be restarted with the command:
systemctl restart controller-witness
[default]
Contains required parameters. Initially these options can be filled in interactively during package installation, provided that a terminal and either the whiptail or the dialog program are available. If all parameters are filled in, the service is enabled and started.
CAUTION: only the presence of the parameters is checked, not their correctness.
If any mandatory parameter is omitted, the service remains inactive. In that case, fill in the config file, then enable and start the service manually.
- server1
Mandatory parameter. IP address (or domain name) of the first server. A port number can optionally be given after a colon (host:port). If no port is given, the default SSH port 22 is assumed.
- server2
The same for the second server.
- username
Username to use when the same user connects to both server1 and server2. This user MUST be able to request the replication status and change roles via the /usr/local/sbin/veil-controller script on both servers. Must be omitted if the usernames for server1 and server2 are different.
- username1
Username for the SSH connection to server1. This user MUST be able to request the replication status and change roles via the /usr/local/sbin/veil-controller script on server1.
- username2
The same for server2.
- password
Password for both servers if password authentication is used and both users have the same password. Must be omitted in case of pkey authentication or different passwords.
- password1
Password for server1 if password authentication is used. May be omitted in case of pkey authentication.
- password2
The same for server2.
- pkey
Path to the PEM file that contains the RSA private key used to authenticate on both servers. The file is generated and downloaded via the controller web interface (Security - Encryption keys - Generate key). After generation, the SSH key must be connected to the corresponding controllers. May be omitted if password authentication is used. Must be omitted if the pkeys for server1 and server2 are different.
- pkey1
The same for server1 only.
- pkey2
The same for server2 only.
- id
Unique string identifying the instance. It is generated automatically as a UUID string during installation, so there is normally no need to create or edit it manually. If the id still has to be changed, it must be no more than 50 characters long and may consist only of characters from the set 0-9a-zA-Z-_.
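The rules for the [default] section can be illustrated with a short sketch. It is not the service's own code: parse_server, resolve and valid_id are hypothetical helpers. The precedence of per-server keys over shared keys, the treatment of empty values as unset, and the minimum id length of one character are assumptions. IPv6 literals are not handled.

```python
import re
from typing import Mapping, Optional, Tuple

# Character set and maximum length as stated for the id parameter.
ID_PATTERN = re.compile(r"[0-9a-zA-Z_-]{1,50}")


def parse_server(value: str, default_port: int = 22) -> Tuple[str, int]:
    """Split "host" or "host:port"; fall back to the default SSH port."""
    host, sep, port = value.rpartition(":")
    if not sep:
        return value, default_port
    return host, int(port)


def resolve(section: Mapping[str, str], key: str, index: int) -> Optional[str]:
    """Return key1/key2 for one server, else the shared key, else None.

    Empty values (for example "pkey1=") are treated as unset (assumption).
    """
    return section.get(f"{key}{index}") or section.get(key) or None


def valid_id(value: str) -> bool:
    """Check the id against the documented length and character set."""
    return ID_PATTERN.fullmatch(value) is not None
```

For example, parse_server("example.mylab.org:2222") yields the host with port 2222, while a value without a colon yields port 22.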
[optional]
Contains additional optional parameters. This section is not created during package installation and can be added manually if needed. If some parameters or the whole section are absent, default values are used.
- loop_timeout
Interval in seconds between status requests to each server.
Default is 10.
- ssh_loop_timeout
Timeout in seconds after which a worker exits if no commands have arrived from the parent.
Default is 300.
- status_timeout
Timeout in seconds for receiving the reply to the status command.
Default is 30.
- set_master_timeout
Timeout in seconds for receiving the reply to the "set role master" command.
Default is 240.
- no_comm_mute
Number of consecutive unsuccessful reconnect attempts to the servers after which further reconnect attempts are made silently.
Default is 10.
- reports_period
Interval in seconds between log reports about the status of the servers.
Default is 3600 (1 hour).
- confirm_2similar_timeout
Timeout in seconds before the "2 similar roles" situation is treated as real. It covers the case where the status is requested while roles are being switched, for example from the CLI. At such moments both servers can theoretically report the same role.
- master_failure_duration
Timeout in seconds before the slave is switched to master when one controller is slave and the other is unavailable. It gives the other controller time to come back online.
Default is 30.
- switch_flag
Path to a file whose appearance forces a switch of the server roles. The file must be created with access rights that allow the controller-witness service to delete it.
Default is /tmp/role-switch.
- debug
Debug level.
Default is 0.
A one- or two-digit decimal number. The least-significant digit controls debug info about role switching commands; the most-significant digit controls debug info about status request commands sent to the servers.
For every digit:
  - 0: no debug info.
  - 1: report the return code and execution time.
  - 2: full report (the short report plus the corresponding command stdout).
Examples:
  - 2: full debug info about role switching commands.
  - 12: the same plus a short report about status command execution.
  - 22: full debug info about all commands sent to the servers.
Debug info can be seen in the controller-witness service log.
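How a debug value splits into its two digits can be shown with a small sketch. decode_debug is a hypothetical helper, not part of the service, and it does not check whether each digit is within the documented range 0-2.

```python
from typing import Tuple


def decode_debug(level: int) -> Tuple[int, int]:
    """Split a debug value into (status_digit, role_switch_digit).

    The most-significant digit applies to status request commands,
    the least-significant digit to role switching commands.
    """
    if not 0 <= level <= 99:
        raise ValueError("debug must be a one- or two-digit decimal number")
    status_digit, role_switch_digit = divmod(level, 10)
    return status_digit, role_switch_digit
```

With this helper, decode_debug(12) gives a status digit of 1 and a role switching digit of 2, matching the second example in the list.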
Configuration example
[default]
server1=example.mylab.org:2222
server2=example.mylab.org
username1=root
password1=MySecPass
pkey1=
username2=root
pkey2=/home/user/key_key2.pem
id=768f026b-a911-4910-835b-c9f7eedb66d5
[optional]
set_master_timeout=120
status_timeout=30
loop_timeout=10
ssh_loop_timeout=240
no_conn_mute=10
reports_period=3600
confirm_2similar_timeout=30
switch_flag=/tmp/role-switch
debug=0
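As an illustration of how the [optional] defaults combine with a file like the one above, the sketch below reads the text with Python's configparser. It assumes the file is plain INI that configparser can read; the service's own parser may behave differently. OPTIONAL_DEFAULTS and load_optional are hypothetical names, and only the defaults stated in this document are listed.

```python
import configparser

# Defaults as documented for the [optional] section.
OPTIONAL_DEFAULTS = {
    "loop_timeout": 10,
    "ssh_loop_timeout": 300,
    "status_timeout": 30,
    "set_master_timeout": 240,
    "no_comm_mute": 10,
    "reports_period": 3600,
    "master_failure_duration": 30,
    "switch_flag": "/tmp/role-switch",
    "debug": 0,
}


def load_optional(text: str) -> dict:
    """Return the optional settings, with defaults for anything absent."""
    # interpolation=None keeps "%" characters (e.g. in passwords) literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    settings = dict(OPTIONAL_DEFAULTS)
    if parser.has_section("optional"):
        for key, value in parser.items("optional"):
            default = OPTIONAL_DEFAULTS.get(key)
            # Convert to int where the documented default is an integer;
            # keys without a documented default stay as strings.
            settings[key] = int(value) if isinstance(default, int) else value
    return settings
```

Passing a file that has no [optional] section returns the defaults unchanged.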
Logging
The service log can be viewed with the command:
journalctl -u controller-witness