Refactor mariadb chart

This is a major refactor of the mariadb chart.  A few things
are accomplished:

* The chart template layout is updated to match our keystone
chart, providing more structure to the chart.

* The chart was updated to leverage StatefulSets, which require
Kubernetes 1.5 and Helm 2.1.0.

* The bootstrapping process was completely overhauled to support
the unique constraints of StatefulSets, namely that their pods come up
one by one, each needing the previous pod to be in a ready state before
the next is provisioned.

* The references to {{ .IP }} were removed and replaced with POD_IP
environment variable passing, and address binding was fixed in several
places for wsrep functionality.  The old binding may explain several oddities
with the previous setup that caused mysterious and intermittent
database consistency issues.
Alan Meadows 2016-12-15 17:20:47 -08:00
parent ac9985cfde
commit 0ca1a7942e
35 changed files with 572 additions and 626 deletions

@ -2,12 +2,24 @@
By default, this chart creates a 3-member mariadb galera cluster.
PetSets would be ideal to use for this purpose, but as they are going through a transition in 1.5.0-beta.1 and not supported by Helm 2.0.0 under their new name, StatefulSets, we have opted to leverage helms template generation ability to create Values.replicas * POD+PVC+PV resources. Essentially, we create a mariadb-0, mariadb-1, and mariadb-2 Pod and associated unique PersistentVolumeClaims for each. This removes the previous daemonset limitations in other mariadb approaches.
This chart leverages StatefulSets, with persistent storage.
You must ensure that your control nodes that should receive mariadb instances are labeled with openstack-control-plane=enabled:
It creates a job that acts as a temporary standalone galera cluster. This host is bootstrapped with authentication and then
the WSREP bindings are exposed publicly. The cluster members, being part of a StatefulSet, are provisioned one at a time. The first host
must be marked as ```Ready``` before the next host will be provisioned. This is determined by the readinessProbes, which actually
validate that MySQL is up and responsive.
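As a rough illustration of what such a readiness check does, the sketch below only tests whether the MySQL port accepts TCP connections. It is a simplified stand-in, not the chart's readiness script (which uses pymysql); the function name `port_is_open` and its default timeout are assumptions made for this example.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Refused, unreachable, or timed out: the server is not ready yet.
        return False
    conn.close()
    return True
```

A probe built this way would report ready as soon as mysqld is listening, which is a weaker guarantee than an actual login attempt.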
The configuration leverages xtrabackup-v2 for synchronization. This may later be augmented to leverage rsync which has
some benefits.
The seed job completes only when galera reports that it is Synced and all cluster members are reporting in,
so that the cluster count seen by the job matches the replica count in the helm values configuration. At that point the job is terminated.
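The exit condition can be sketched as a small predicate. This is an illustrative restatement in Python of the checks performed by the seed script, not code shipped with the chart; the function name `seed_can_exit` and its parameters are hypothetical.

```python
def seed_can_exit(cluster_size, local_state, member_endpoints, replicas):
    """Return True when the seed job may terminate (hypothetical helper)."""
    # The seed counts itself, so a complete cluster has replicas + 1 nodes.
    if cluster_size < replicas + 1:
        return False
    # Galera must report the local node as fully synced.
    if local_state != "Synced":
        return False
    # Every member endpoint plus the seed itself must be accounted for.
    return len(member_endpoints) + 1 == cluster_size
```

With three replicas, for example, the seed would keep waiting until the cluster size reaches four and three member endpoints are visible.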
When the job is no longer active, future StatefulSets provisioned will leverage the existing cluster members as gcomm endpoints. It is only when the job is running that the cluster members leverage the seed job as their gcomm endpoint. This ensures you can restart members and scale the cluster.
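A minimal sketch of that endpoint choice is shown below. It assumes each endpoint is a dict with hypothetical `name` and `ip` keys and that the seed can be recognised by `seed` appearing in its name; the function name `gcomm_address` is likewise an assumption for illustration, not the chart's API.

```python
def gcomm_address(addresses, my_ip, seed_active):
    """Build a gcomm:// URL from endpoint dicts (hypothetical helper)."""
    if seed_active:
        # While the seed job runs, it is the only endpoint members join.
        seeds = [a["ip"] for a in addresses if "seed" in a.get("name", "")]
        if seeds:
            return "gcomm://" + seeds[0]
    # Otherwise join the existing members, excluding our own address.
    peers = [a["ip"] for a in addresses if a["ip"] != my_ip]
    return "gcomm://" + ",".join(peers)
```

Excluding the pod's own IP keeps a restarting member from listing itself as a peer.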
The StatefulSets all leverage PVCs to provide stateful storage to /var/lib/mysql.
You must ensure that your control nodes that should receive mariadb instances are labeled with openstack-control-plane=enabled, or whatever you have configured in values.yaml for the label configuration:
```
kubectl label nodes openstack-control-plane=enabled --all
```
We will continue to refine our labeling so that it is consistent throughout the project.
```

@ -0,0 +1,53 @@
#!/bin/sh
set -ex
SLEEP_TIMEOUT=5
# Initialize system database.
mysql_install_db --datadir=/var/lib/mysql
# Start mariadb and wait for it to be ready.
#
# note that we bind to 127.0.0.1 here because we want
# to interact with the database but we don't want to expose it
# yet, so that other cluster members cannot accidentally connect
mysqld_safe --defaults-file=/etc/my.cnf \
--console \
--wsrep-new-cluster \
--wsrep_cluster_address='gcomm://' \
--bind-address='127.0.0.1' \
--wsrep_node_address="127.0.0.1:{{ .Values.network.port.wsrep }}" \
--wsrep_provider_options="gcache.size=512M; gmcast.listen_addr=tcp://127.0.0.1:{{ .Values.network.port.wsrep }}" &
TIMEOUT=120
while [[ ! -f /var/lib/mysql/mariadb.pid ]]; do
if [[ ${TIMEOUT} -gt 0 ]]; then
let TIMEOUT-=1
sleep 1
else
exit 1
fi
done
# Reset permissions.
# kolla_security_reset must be run from the home directory
cd /var/lib/mysql ; DB_ROOT_PASSWORD="{{ .Values.database.root_password }}" kolla_security_reset
mysql -u root --password="{{ .Values.database.root_password }}" --port="{{ .Values.network.port.mariadb }}" -e "GRANT ALL PRIVILEGES ON *.* TO 'root'@'localhost' IDENTIFIED BY '{{ .Values.database.root_password }}' WITH GRANT OPTION;"
mysql -u root --password="{{ .Values.database.root_password }}" --port="{{ .Values.network.port.mariadb }}" -e "GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '{{ .Values.database.root_password }}' WITH GRANT OPTION;"
# Restart database.
mysqladmin -uroot -p"{{ .Values.database.root_password }}" --port="{{ .Values.network.port.mariadb }}" shutdown
# Wait for the mariadb server to shut down
SHUTDOWN_TIMEOUT=60
while [[ -f /var/lib/mysql/mariadb.pid ]]; do
if [[ ${SHUTDOWN_TIMEOUT} -gt 0 ]]; then
let SHUTDOWN_TIMEOUT-=1
sleep 1
else
echo "MariaDB instance couldn't be properly shut down"
exit 1
fi
done

@ -0,0 +1,91 @@
#!/usr/bin/env python
import json
import os
import urllib2
import ssl
import socket
import sys
import time

URL = ('https://kubernetes.default.svc.cluster.local/api/v1/namespaces/{namespace}'
       '/endpoints/{service_name}')
TOKEN_FILE = '/var/run/secrets/kubernetes.io/serviceaccount/token'


def get_service_endpoints(service_name):
    url = URL.format(namespace=os.environ['NAMESPACE'], service_name=service_name)
    try:
        token = open(TOKEN_FILE, 'r').read()
    except IOError:
        # a missing or unreadable token file raises IOError, not KeyError
        exit("Unable to open a file with token.")
    header = {'Authorization': " Bearer {}".format(token)}
    req = urllib2.Request(url=url, headers=header)

    ctx = create_ctx()
    connection = urllib2.urlopen(req, context=ctx)
    data = connection.read()

    # parse to dict
    json_acceptable_string = data.replace("'", "\"")
    output = json.loads(json_acceptable_string)

    return output


def get_ip_addresses(output, force_only_members=False):
    subsets = output['subsets'][0]
    if 'addresses' not in subsets:
        return []

    # where we are seeding, the only cluster member is the seed job
    if not force_only_members:
        for subset in subsets['addresses']:
            if 'name' in subset:
                if 'seed' in subset['name']:
                    return [subset['ip']]

    # otherwise use the other cluster members
    ip_addresses = [x['ip'] for x in subsets['addresses']]
    my_ip = get_my_ip_address()
    if my_ip in ip_addresses:
        ip_addresses.remove(my_ip)
    return ip_addresses


def get_my_ip_address():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.connect(('kubernetes.default.svc.cluster.local', 0))
    return s.getsockname()[0]


def create_ctx():
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def print_galera_cluster_address(service_name, force_only_members):
    while True:
        output = get_service_endpoints(service_name)
        ips = get_ip_addresses(output, force_only_members)
        #print "=== OUTPUT: %s" % output
        #print "=== IPS: %s" % ips
        if len(ips):
            # reuse the list computed above so force_only_members is honored
            wsrep_cluster_address = '--wsrep_cluster_address=gcomm://{}'.format(
                ','.join(ips))
            print wsrep_cluster_address
            break
        time.sleep(5)


def main():
    if len(sys.argv) != 3:
        exit('peer-finder: You need to pass argument <service name> <1|0 for force cluster members>')
    service_name = sys.argv[1]
    force_only_members = int(sys.argv[2])
    print_galera_cluster_address(service_name, force_only_members)


if __name__ == '__main__':
    main()

@ -0,0 +1,27 @@
#!/usr/bin/env python
import os
import sys
import time

import pymysql

DB_HOST = "127.0.0.1"
DB_PORT = int(os.environ.get('MARIADB_SERVICE_PORT', '3306'))

while True:
    try:
        pymysql.connections.Connection(host=DB_HOST, port=DB_PORT,
                                       connect_timeout=1)
        sys.exit(0)
    except pymysql.err.OperationalError as e:
        code, message = e.args
        if code == 2003 and 'time out' in message:
            print('Connection timeout, sleeping')
            time.sleep(1)
            continue
        if code == 1045:
            print('Mysql ready to use. Exiting')
            sys.exit(0)

        # other error
        raise

@ -0,0 +1,79 @@
#!/bin/sh
set -ex
SLEEP_TIMEOUT=5
function wait_for_cluster {
# Wait for the mariadb server to be "Ready" before starting the security reset with a max timeout
TIMEOUT=600
while [[ ! -f /var/lib/mysql/mariadb.pid ]]; do
if [[ ${TIMEOUT} -gt 0 ]]; then
let TIMEOUT-=1
sleep 1
else
exit 1
fi
done
REPLICAS={{ .Values.replicas }}
# We need to count seed instance here.
MINIMUM_CLUSTER_SIZE=$(( $REPLICAS + 1 ))
# wait until we have at least two more members in a cluster.
while true ; do
CLUSTER_SIZE=`mysql -uroot -h ${POD_IP} -p"{{ .Values.database.root_password }}" --port="{{ .Values.network.port.mariadb }}" -e'show status' | grep wsrep_cluster_size | awk ' { if($2 ~ /[0-9]/){ print $2 } else { print 0 } } '`
if [ "${CLUSTER_SIZE}" -lt ${MINIMUM_CLUSTER_SIZE} ] ; then
echo "Cluster seed not finished, waiting."
sleep ${SLEEP_TIMEOUT}
continue
fi
CLUSTER_STATUS=`mysql -uroot -h ${POD_IP} -p"{{ .Values.database.root_password }}" --port="{{ .Values.network.port.mariadb }}" -e'show status' | grep wsrep_local_state_comment | awk ' { print $2 } '`
if [ "${CLUSTER_STATUS}" != "Synced" ] ; then
echo "Cluster not synced, waiting."
sleep ${SLEEP_TIMEOUT}
continue
fi
# Count number of endpoint separators.
ENDPOINTS_CNT=`python /tmp/peer-finder.py mariadb 1 | grep -o ',' | wc -l`
# TODO(tomasz.paszkowski): Fix a corner case when only one endpoint is on the list.
# Add +1 for seed node and +1 as first item does not have a separator
ENDPOINTS_CNT=$(($ENDPOINTS_CNT+2))
if [ "${ENDPOINTS_CNT}" != "${CLUSTER_SIZE}" ] ; then
echo "Cluster not synced, waiting."
sleep ${SLEEP_TIMEOUT}
continue
fi
echo "Cluster ready, exiting seed."
kill -- -$$
break
done
}
# With the DaemonSet implementation, there may be a difference
# in the number of replicas and actual number of nodes matching
# mariadb node selector label. Problem will be solved when
# the implementation will be switched to Deployment
# (using anti-affinity feature).
REPLICAS={{ .Values.replicas }}
if [ "$REPLICAS" -eq 1 ] ; then
echo "Requested to build one-instance MariaDB cluster. There is no need to run seed. Exiting."
exit 0
elif [ "$REPLICAS" -eq 2 ] ; then
echo "2-instance cluster is not a valid MariaDB configuration."
exit 1
fi
. /tmp/bootstrap-db.sh
mysqld_safe --defaults-file=/etc/my.cnf \
--console \
--wsrep-new-cluster \
--wsrep_cluster_address='gcomm://' \
--bind-address="0.0.0.0" \
--wsrep_node_address="${POD_IP}:{{ .Values.network.port.wsrep }}" \
--wsrep_provider_options="gcache.size=512M; gmcast.listen_addr=tcp://${POD_IP}:{{ .Values.network.port.wsrep }}" &
wait_for_cluster
exit 0

@ -0,0 +1,48 @@
#!/bin/bash
set -ex
trap "trap - SIGTERM && kill -- -$$" SIGINT SIGTERM EXIT
sudo chown mysql: /var/lib/mysql
rm -rf /var/lib/mysql/lost+found
REPLICAS={{ .Values.replicas }}
PETSET_NAME={{ printf "%s" .Values.service_name }}
INIT_MARKER="/var/lib/mysql/init_done"
function join_by { local IFS="$1"; shift; echo "$*"; }
# Remove mariadb.pid if exists
if [[ -f /var/lib/mysql/mariadb.pid ]]; then
if [[ `pgrep -c $(cat /var/lib/mysql/mariadb.pid)` -eq 0 ]]; then
rm -vf /var/lib/mysql/mariadb.pid
fi
fi
if [ "$REPLICAS" -eq 1 ] ; then
if [[ ! -f ${INIT_MARKER} ]]; then
cd /var/lib/mysql
echo "Creating one-instance MariaDB."
bash /tmp/bootstrap-db.sh
touch ${INIT_MARKER}
fi
exec mysqld_safe --defaults-file=/etc/my.cnf \
--console \
--wsrep-new-cluster \
--wsrep_cluster_address='gcomm://'
else
# give the seed more of a chance to be ready by the time
# we start the first pet so we succeed on the first pass
# a little hacky, but prevents restarts as we aren't waiting
# for job completion here so I'm not sure what else
# to look for
sleep 30
export WSREP_OPTIONS=`python /tmp/peer-finder.py mariadb 0`
exec mysqld --defaults-file=/etc/my.cnf \
--console \
--bind-address="0.0.0.0" \
--wsrep_node_address="${POD_IP}:{{ .Values.network.port.wsrep }}" \
--wsrep_provider_options="gcache.size=512M; gmcast.listen_addr=tcp://${POD_IP}:{{ .Values.network.port.wsrep }}" \
$WSREP_OPTIONS
fi

@ -1,56 +0,0 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: bootstrap-db
data:
bootstrap-db.sh: |
#!/bin/sh
set -ex
SLEEP_TIMEOUT=5
# Initialize system .Values.database.
mysql_install_db --datadir=/var/lib/mysql
# Start mariadb and wait for it to be ready.
mysqld_safe --defaults-file=/etc/my.cnf \
--console \
--wsrep-new-cluster \
--wsrep_cluster_address='gcomm://' \
--bind-address='127.0.0.1' \
--wsrep_node_address='127.0.0.1' \
--wsrep_provider_options='gcache.size=512M; gmcast.listen_addr=tcp://127.0.0.1:{{ .Values.network.port.wsrep }}' &
TIMEOUT=120
while [[ ! -f /var/lib/mysql/mariadb.pid ]]; do
if [[ ${TIMEOUT} -gt 0 ]]; then
let TIMEOUT-=1
sleep 1
else
exit 1
fi
done
# Reset permissions.
# kolla_security_reset requires to be run from home directory
cd /var/lib/mysql ; DB_ROOT_PASSWORD="{{ .Values.database.root_password }}" kolla_security_reset
mysql -u root --password="{{ .Values.database.root_password }}" --port="{{ .Values.network.port.mariadb }}" -e "GRANT ALL PRIVILEGES ON *.* TO 'root'@'localhost' IDENTIFIED BY '{{ .Values.database.root_password }}' WITH GRANT OPTION;"
mysql -u root --password="{{ .Values.database.root_password }}" --port="{{ .Values.network.port.mariadb }}" -e "GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '{{ .Values.database.root_password }}' WITH GRANT OPTION;"
# Restart .Values.database.
mysqladmin -uroot -p"{{ .Values.database.root_password }}" --port="{{ .Values.network.port.mariadb }}" shutdown
# Wait for the mariadb server to shut down
SHUTDOWN_TIMEOUT=60
while [[ -f /var/lib/mysql/mariadb.pid ]]; do
if [[ ${SHUTDOWN_TIMEOUT} -gt 0 ]]; then
let SHUTDOWN_TIMEOUT-=1
sleep 1
else
echo "MariaDB instance couldn't be properly shut down"
exit 1
fi
done

@ -1,13 +0,0 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: mariadb-charsets
data:
charsets.cnf: |+
[mysqld]
character_set_server=utf8
collation_server=utf8_unicode_ci
skip-character-set-client-handshake
[client]
default_character_set=utf8

@ -0,0 +1,15 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: mariadb-bin
data:
start.sh: |
{{ tuple "bin/_start.sh.tpl" . | include "template" | indent 4 }}
peer-finder.py: |
{{ tuple "bin/_peer-finder.py.tpl" . | include "template" | indent 4 }}
readiness.py: |
{{ tuple "bin/_readiness.py.tpl" . | include "template" | indent 4 }}
bootstrap-db.sh: |
{{ tuple "bin/_bootstrap-db.sh.tpl" . | include "template" | indent 4 }}
seed.sh: |
{{ tuple "bin/_seed.sh.tpl" . | include "template" | indent 4 }}

@ -0,0 +1,21 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: mariadb-etc
data:
charsets.cnf: |
{{ tuple "etc/_charsets.cnf.tpl" . | include "template" | indent 4 }}
engine.cnf: |
{{ tuple "etc/_engine.cnf.tpl" . | include "template" | indent 4 }}
my.cnf: |
{{ tuple "etc/_galera-my.cnf.tpl" . | include "template" | indent 4 }}
log.cnf: |
{{ tuple "etc/_log.cnf.tpl" . | include "template" | indent 4 }}
pid.cnf: |
{{ tuple "etc/_pid.cnf.tpl" . | include "template" | indent 4 }}
tuning.cnf: |
{{ tuple "etc/_tuning.cnf.tpl" . | include "template" | indent 4 }}
networking.cnf: |
{{ tuple "etc/_networking.cnf.tpl" . | include "template" | indent 4 }}
wsrep.cnf: |
{{ tuple "etc/_wsrep.cnf.tpl" . | include "template" | indent 4 }}

@ -1,57 +1,17 @@
{{- $root := . -}}
{{ range $k, $v := until (atoi .Values.replicas) }}
---
apiVersion: v1
kind: Service
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
name: mariadb-{{$v}}
labels:
release: {{ $root.Release.Name | quote }}
chart: "{{ $root.Chart.Name }}-{{ $root.Chart.Version }}"
name: {{ .Values.service_name }}
spec:
ports:
- name: db
port: {{ $root.Values.network.port.mariadb }}
- name: wsrep
port: {{ $root.Values.network.port.wsrep }}
selector:
app: mariadb
server-id: "{{$v}}"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: mariadb-{{$v}}
annotations:
{{ $root.Values.volume.class_path }}: {{ $root.Values.volume.class_name }}
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: {{ $root.Values.volume.size }}
---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
labels:
app: mariadb
galera: enabled
server-id: "{{$v}}"
name: mariadb-{{$v}}
spec:
replicas: 1
serviceName: "{{ .Values.service_name }}"
replicas: 3
template:
securityContext:
runAsUser: 0
metadata:
name: mariadb-{{$v}}
labels:
app: mariadb
app: {{ .Values.service_name }}
galera: enabled
server-id: "{{$v}}"
annotations:
pod.beta.kubernetes.io/hostname: mariadb-{{$v}}
helm.sh/created: {{ $root.Release.Time.Seconds | quote }}
# alanmeadows: this soft requirement allows single
# host deployments to spawn several mariadb containers
# but in a larger environment, would attempt to spread
@ -71,13 +31,13 @@ spec:
"weight": 10
}]
}
}
}
spec:
nodeSelector:
{{ $root.Values.labels.node_selector_key }}: {{ $root.Values.labels.node_selector_value }}
{{ .Values.labels.node_selector_key }}: {{ .Values.labels.node_selector_value }}
containers:
- name: mariadb-{{$v}}
image: {{ $root.Values.images.mariadb }}
- name: {{ .Values.service_name }}
image: {{ .Values.images.mariadb }}
imagePullPolicy: Always
env:
- name: INTERFACE_NAME
@ -86,22 +46,30 @@ spec:
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: COMMAND
value: "bash /tmp/start.sh"
- name: DEPENDENCY_CONFIG
value: "/etc/my.cnf.d/wsrep.cnf"
ports:
- containerPort: {{ $root.Values.network.port.mariadb }}
- containerPort: {{ $root.Values.network.port.wsrep }}
- containerPort: {{ .Values.network.port.mariadb }}
- containerPort: {{ .Values.network.port.wsrep }}
- containerPort: {{ .Values.network.port.ist }}
# a readinessprobe is a much more complex affair with
# statefulsets, as the container must be "live"
# before the next stateful member is created
# and with galera this is problematic
readinessProbe:
exec:
command:
- python
- /mariadb-readiness.py
initialDelaySeconds: 60
volumeMounts:
- name: mycnfd
mountPath: /etc/my.cnf.d
@ -111,9 +79,12 @@ spec:
- name: bootstrapdb
mountPath: /tmp/bootstrap-db.sh
subPath: bootstrap-db.sh
- name: peer-finder
- name: peerfinder
mountPath: /tmp/peer-finder.py
subPath: peer-finder.py
- name: readiness
mountPath: /mariadb-readiness.py
subPath: readiness.py
- name: charsets
mountPath: /etc/my.cnf.d/charsets.cnf
subPath: charsets.cnf
@ -136,60 +107,56 @@ spec:
mountPath: /etc/my.cnf.d/tuning.cnf
subPath: tuning.cnf
- name: wsrep
mountPath: /configmaps/wsrep.cnf
- name: replicas
mountPath: /tmp/replicas.py
subPath: replicas.py
- name: readiness
mountPath: /mariadb-readiness.py
subPath: mariadb-readiness.py
mountPath: /etc/my.cnf.d/wsrep.cnf
subPath: wsrep.cnf
- name: mysql-data
mountPath: /var/lib/mysql
mountPath: /var/lib/mysql
volumes:
- name: mycnfd
emptyDir: {}
- name: startsh
configMap:
name: mariadb-startsh
name: mariadb-bin
- name: bootstrapdb
configMap:
name: bootstrap-db
- name: peer-finder
name: mariadb-bin
- name: peerfinder
configMap:
name: mariadb-peer-finder
- name: charsets
configMap:
name: mariadb-charsets
- name: engine
configMap:
name: mariadb-engine
- name: log
configMap:
name: mariadb-log
- name: mycnf
configMap:
name: mariadb-mycnf
- name: networking
configMap:
name: mariadb-networking
- name: pid
configMap:
name: mariadb-pid
- name: tuning
configMap:
name: mariadb-tuning
- name: wsrep
configMap:
name: mariadb-wsrep
- name: replicas
configMap:
name: mariadb-replicas
name: mariadb-bin
- name: readiness
configMap:
name: mariadb-readiness
- name: mysql-data
persistentVolumeClaim:
matchLabels:
server-id: "{{$v}}"
claimName: mariadb-{{$v}}
{{ end }}
name: mariadb-bin
- name: charsets
configMap:
name: mariadb-etc
- name: engine
configMap:
name: mariadb-etc
- name: log
configMap:
name: mariadb-etc
- name: mycnf
configMap:
name: mariadb-etc
- name: networking
configMap:
name: mariadb-etc
- name: pid
configMap:
name: mariadb-etc
- name: tuning
configMap:
name: mariadb-etc
- name: wsrep
configMap:
name: mariadb-etc
volumeClaimTemplates:
- metadata:
name: mysql-data
annotations:
{{ .Values.volume.class_path }}: {{ .Values.volume.class_name }}
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: {{ .Values.volume.size }}

@ -1,10 +0,0 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: mariadb-engine
data:
engine.cnf: |+
[mysqld]
default-storage-engine=InnoDB
innodb=FORCE
binlog_format=ROW

@ -0,0 +1,7 @@
[mysqld]
character_set_server=utf8
collation_server=utf8_unicode_ci
skip-character-set-client-handshake
[client]
default_character_set=utf8

@ -0,0 +1,4 @@
[mysqld]
default-storage-engine=InnoDB
innodb=FORCE
binlog_format=ROW

@ -0,0 +1,6 @@
[mysqld]
datadir=/var/lib/mysql
basedir=/usr
[client-server]
!includedir /etc/my.cnf.d/

@ -0,0 +1,11 @@
[mysqld]
slow_query_log=off
slow_query_log_file=/var/log/mysql/mariadb-slow.log
log_warnings=2
# General logging has huge performance penalty therefore is disabled by default
general_log=off
general_log_file=/var/log/mysql/mariadb-error.log
long_query_time=3
log_queries_not_using_indexes=on

@ -0,0 +1,14 @@
[mysqld]
bind_address=0.0.0.0
port={{ .Values.network.port.mariadb }}
# When a client connects, the server will perform hostname resolution,
# and when DNS is slow, establishing the connection will become slow as well.
# It is therefore recommended to start the server with skip-name-resolve to
# disable all DNS lookups. The only limitation is that the GRANT statements
# must then use IP addresses only.
skip_name_resolve
[client]
protocol=tcp
port={{ .Values.network.port.mariadb }}

@ -0,0 +1,2 @@
[mysqld]
pid_file=/var/lib/mysql/mariadb.pid

@ -0,0 +1,47 @@
[mysqld]
user=mysql
max_allowed_packet=256M
open_files_limit=10240
max_connections=8192
max-connect-errors=1000000
## Generally, it is unwise to set the query cache to be larger than 64-128M
## as the costs associated with maintaining the cache outweigh the performance
## gains.
## The query cache is a well known bottleneck that can be seen even when
## concurrency is moderate. The best option is to disable it from day 1
## by setting query_cache_size=0 (now the default on MySQL 5.6)
## and to use other ways to speed up read queries: good indexing, adding
## replicas to spread the read load or using an external cache.
query_cache_size =0
query_cache_type=0
sync_binlog=0
thread_cache_size=16
table_open_cache=2048
table_definition_cache=1024
#
# InnoDB
#
# The buffer pool is where data and indexes are cached: having it as large as possible
# will ensure you use memory and not disks for most read operations.
# Typical values are 50..75% of available RAM.
# TODO(tomasz.paszkowski): This needs to be dynamic based on available RAM.
innodb_buffer_pool_size=4096M
innodb_log_file_size=2000M
innodb_flush_method=O_DIRECT
innodb_flush_log_at_trx_commit=2
innodb_old_blocks_time=1000
innodb_autoinc_lock_mode=2
innodb_doublewrite=0
innodb_file_format=Barracuda
innodb_file_per_table=1
innodb_io_capacity=500
innodb_locks_unsafe_for_binlog=1
innodb_read_io_threads=8
innodb_write_io_threads=8
[mysqldump]
max-allowed-packet=16M

@ -0,0 +1,15 @@
[mysqld]
wsrep_cluster_name="{{ .Values.database.cluster_name }}"
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_provider_options="gcache.size=512M"
wsrep_slave_threads=12
wsrep_sst_auth=root:{{ .Values.database.root_password }}
# xtrabackup-v2 would be more desirable here, but it's
# not in the upstream stackanetes images
# ()[mysql@mariadb-seed-gdqr8 /]$ xtrabackup --version
# xtrabackup version 2.2.13 based on MySQL server 5.6.24 Linux (x86_64) (revision id: 70f4be3)
wsrep_sst_method=xtrabackup-v2
wsrep_node_name={{ .Values.database.node_name }}
datadir=/var/lib/mysql
tmpdir=/tmp
user=mysql

@ -1,12 +0,0 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: mariadb-mycnf
data:
my.cnf: |+
[mysqld]
datadir=/var/lib/mysql
basedir=/usr
[client-server]
!includedir /etc/my.cnf.d/

@ -1,3 +1,4 @@
---
apiVersion: batch/v1
kind: Job
metadata:
@ -9,7 +10,6 @@ spec:
app: mariadb
spec:
restartPolicy: Never
terminationGracePeriodSeconds: 30
containers:
- name: mariadb-init
image: {{ .Values.images.mariadb }}
@ -21,16 +21,20 @@ spec:
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: COMMAND
value: "bash /tmp/seed.sh"
- name: DEPENDENCY_CONFIG
value: "/etc/my.cnf.d/wsrep.cnf"
ports:
- containerPort: {{ .Values.network.port.mariadb }}
- containerPort: {{ .Values.network.port.wsrep }}
- containerPort: {{ .Values.network.port.ist }}
volumeMounts:
- name: mycnfd
mountPath: /etc/my.cnf.d
@ -40,7 +44,7 @@ spec:
- name: bootstrapdb
mountPath: /tmp/bootstrap-db.sh
subPath: bootstrap-db.sh
- name: peer-finder
- name: peerfinder
mountPath: /tmp/peer-finder.py
subPath: peer-finder.py
- name: charsets
@ -65,46 +69,41 @@ spec:
mountPath: /etc/my.cnf.d/tuning.cnf
subPath: tuning.cnf
- name: wsrep
mountPath: /configmaps/wsrep.cnf
- name: replicas
mountPath: /tmp/replicas.py
subPath: replicas.py
mountPath: /etc/my.cnf.d/wsrep.cnf
subPath: wsrep.cnf
volumes:
- name: mycnfd
emptyDir: {}
- name: seedsh
configMap:
name: mariadb-seedsh
name: mariadb-bin
- name: bootstrapdb
configMap:
name: bootstrap-db
- name: peer-finder
name: mariadb-bin
- name: peerfinder
configMap:
name: mariadb-peer-finder
name: mariadb-bin
- name: charsets
configMap:
name: mariadb-charsets
name: mariadb-etc
- name: engine
configMap:
name: mariadb-engine
name: mariadb-etc
- name: log
configMap:
name: mariadb-log
name: mariadb-etc
- name: mycnf
configMap:
name: mariadb-mycnf
name: mariadb-etc
- name: networking
configMap:
name: mariadb-networking
name: mariadb-etc
- name: pid
configMap:
name: mariadb-pid
name: mariadb-etc
- name: tuning
configMap:
name: mariadb-tuning
name: mariadb-etc
- name: wsrep
configMap:
name: mariadb-wsrep
- name: replicas
configMap:
name: mariadb-replicas
name: mariadb-etc

@ -1,17 +0,0 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: mariadb-log
data:
log.cnf: |+
[mysqld]
slow_query_log=off
slow_query_log_file=/var/log/mysql/mariadb-slow.log
log_warnings=2
# General logging has huge performance penalty therefore is disabled by default
general_log=off
general_log_file=/var/log/mysql/mariadb-error.log
long_query_time=3
log_queries_not_using_indexes=on

@ -1,33 +0,0 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: mariadb-readiness
data:
mariadb-readiness.py: |+
#!/usr/bin/env python
import os
import sys
import time
import pymysql
DB_HOST = "127.0.0.1"
DB_PORT = int(os.environ.get('MARIADB_SERVICE_PORT', '3306'))
while True:
try:
pymysql.connections.Connection(host=DB_HOST, port=DB_PORT,
connect_timeout=1)
sys.exit(0)
except pymysql.err.OperationalError as e:
code, message = e.args
if code == 2003 and 'time out' in message:
print('Connection timeout, sleeping')
time.sleep(1)
continue
if code == 1045:
print('Mysql ready to use. Exiting')
sys.exit(0)
# other error
raise

@ -1,10 +0,0 @@
apiVersion: v1
kind: Service
metadata:
name: mariadb
spec:
ports:
- name: db
port: {{ .Values.network.port.mariadb }}
selector:
app: mariadb

@ -1,20 +0,0 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: mariadb-networking
data:
networking.cnf: |+
[mysqld]
bind_address=0.0.0.0
port={{ .Values.network.port.mariadb }}
# When a client connects, the server will perform hostname resolution,
# and when DNS is slow, establishing the connection will become slow as well.
# It is therefore recommended to start the server with skip-name-resolve to
# disable all DNS lookups. The only limitation is that the GRANT statements
# must then use IP addresses only.
skip_name_resolve
[client]
protocol=tcp
port={{ .Values.network.port.mariadb }}

@ -1,84 +0,0 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: mariadb-peer-finder
data:
peer-finder.py: |+
import json
import os
import urllib2
import ssl
import socket
import sys
import time
URL = ('https://kubernetes.default.svc.cluster.local/api/v1/namespaces/{namespace}'
'/endpoints/{service_name}')
TOKEN_FILE = '/var/run/secrets/kubernetes.io/serviceaccount/token'
def get_service_endpoints(service_name):
url = URL.format(namespace=os.environ['NAMESPACE'], service_name=service_name)
try:
token = file (TOKEN_FILE, 'r').read()
except KeyError:
exit("Unable to open a file with token.")
header = {'Authorization': " Bearer {}".format(token)}
req = urllib2.Request(url=url, headers=header)
ctx = create_ctx()
connection = urllib2.urlopen(req, context=ctx)
data = connection.read()
# parse to dict
json_acceptable_string = data.replace("'", "\"")
output = json.loads(json_acceptable_string)
return output
def get_ip_addresses(output):
subsets = output['subsets'][0]
if not 'addresses' in subsets:
return []
ip_addresses = [x['ip'] for x in subsets['addresses']]
my_ip = get_my_ip_address()
if my_ip in ip_addresses:
ip_addresses.remove(my_ip)
return ip_addresses
def get_my_ip_address():
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.connect(('kubernetes.default.svc.cluster.local', 0))
return s.getsockname()[0]
def create_ctx():
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
return ctx
def print_galera_cluster_address(service_name):
while True:
output = get_service_endpoints(service_name)
if len(get_ip_addresses(output)):
wsrep_cluster_address = '--wsrep_cluster_address=gcomm://{}'.format(
','.join(get_ip_addresses(output)))
print wsrep_cluster_address
break
time.sleep(5)
def main():
if len(sys.argv) != 2:
exit('peer-finder: You need to pass argument')
service_name = sys.argv[1]
print_galera_cluster_address(service_name)
if __name__ == '__main__':
main()

@ -1,8 +0,0 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: mariadb-pid
data:
pid.cnf: |+
[mysqld]
pid_file=/var/lib/mysql/mariadb.pid

@ -1,46 +0,0 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: mariadb-replicas
data:
replicas.py: |
#!/usr/bin/env python
import json
import os
import ssl
import sys
import urllib2
URL = ('https://kubernetes.default.svc.{{ .Values.network.dns.kubernetes_domain }}/apis/extensions/v1beta1/deployments')
TOKEN_FILE = '/var/run/secrets/kubernetes.io/serviceaccount/token'
def create_ctx():
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
return ctx
def get_deployments():
url = URL.format()
try:
token = file(TOKEN_FILE, 'r').read()
except KeyError:
exit("Unable to open a file with token.")
header = {'Authorization': " Bearer {}".format(token)}
req = urllib2.Request(url=url, headers=header)
ctx = create_ctx()
response = urllib2.urlopen(req, context=ctx)
output = json.load(response)
return output
def main():
reply = get_deployments()
name = "mariadb"
namespace = "default" if not os.environ["NAMESPACE"] else os.environ["NAMESPACE"]
mariadb = filter(lambda d: d["metadata"]["namespace"] == namespace and d["metadata"]["name"].startswith(name), reply["items"])
print len(mariadb)
if __name__ == "__main__":
main()

@ -1,82 +0,0 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: mariadb-seedsh
data:
  seed.sh: |+
    #!/bin/bash
    set -ex
    SLEEP_TIMEOUT=5
    function wait_for_cluster {
        # Wait for the mariadb server to be "Ready" before starting the security reset with a max timeout
        TIMEOUT=120
        while [[ ! -f /var/lib/mysql/mariadb.pid ]]; do
            if [[ ${TIMEOUT} -gt 0 ]]; then
                # Arithmetic expansion avoids the non-zero exit status that
                # `let` returns under `set -e` when the result is 0.
                TIMEOUT=$((TIMEOUT - 1))
                sleep 1
            else
                exit 1
            fi
        done
        REPLICAS=$(python /tmp/replicas.py)
        # We need to count the seed instance here.
        MINIMUM_CLUSTER_SIZE=$(( $REPLICAS + 1 ))
        # Wait until all replicas plus the seed have joined the cluster.
        while true ; do
            CLUSTER_SIZE=`mysql -uroot -p"{{ .Values.database.root_password }}" --port="{{ .Values.network.port.mariadb }}" -e'show status' | grep wsrep_cluster_size | awk ' { if($2 ~ /[0-9]/){ print $2 } else { print 0 } } '`
            if [ "${CLUSTER_SIZE}" -lt ${MINIMUM_CLUSTER_SIZE} ] ; then
                echo "Cluster seed not finished, waiting."
                sleep ${SLEEP_TIMEOUT}
                continue
            fi
            CLUSTER_STATUS=`mysql -uroot -p"{{ .Values.database.root_password }}" --port="{{ .Values.network.port.mariadb }}" -e'show status' | grep wsrep_local_state_comment | awk ' { print $2 } '`
            if [ "${CLUSTER_STATUS}" != "Synced" ] ; then
                echo "Cluster not synced, waiting."
                sleep ${SLEEP_TIMEOUT}
                continue
            fi
            # Count number of endpoint separators.
            ENDPOINTS_CNT=`python /tmp/peer-finder.py mariadb | grep -o ',' | wc -l`
            # TODO(tomasz.paszkowski): Fix a corner case when only one endpoint is on the list.
            # Add +1 for seed node and +1 as first item does not have a separator
            ENDPOINTS_CNT=$(($ENDPOINTS_CNT+2))
            if [ "${ENDPOINTS_CNT}" != "${CLUSTER_SIZE}" ] ; then
                echo "Endpoint count does not match cluster size, waiting."
                sleep ${SLEEP_TIMEOUT}
                continue
            fi
            echo "Cluster ready, exiting seed."
            kill -- -$$
            break
        done
    }
    # With the DaemonSet implementation, the number of replicas may
    # differ from the actual number of nodes matching the mariadb
    # node selector label. This will be solved once the
    # implementation is switched to a Deployment
    # (using the anti-affinity feature).
    REPLICAS=$(python /tmp/replicas.py)
    if [ "$REPLICAS" -eq 1 ] ; then
        echo "Requested to build one-instance MariaDB cluster. There is no need to run seed. Exiting."
        exit 0
    elif [ "$REPLICAS" -eq 2 ] ; then
        echo "2-instance cluster is not a valid MariaDB configuration."
        exit 1
    fi
    bash /tmp/bootstrap-db.sh
    mysqld_safe --defaults-file=/etc/my.cnf \
        --console \
        --wsrep-new-cluster \
        --wsrep_cluster_address='gcomm://' &
    wait_for_cluster
    exit 0

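The three checks that the removed `seed.sh` loops on before exiting can be restated as a small Python sketch. The function names are hypothetical, and the peer list format (a comma-separated address list, as in a `gcomm://` URL) is an assumption, since `peer-finder.py` is not shown in this diff.

```python
def endpoint_count(peer_finder_output):
    """Mirror the shell arithmetic: separators + 1 (first item) + 1 (seed)."""
    return peer_finder_output.count(",") + 2


def seed_can_exit(replicas, cluster_size, local_state, peer_finder_output):
    """Return True when all three seed.sh conditions hold."""
    minimum_cluster_size = replicas + 1  # the seed instance counts as a member
    if cluster_size < minimum_cluster_size:
        return False  # "Cluster seed not finished, waiting."
    if local_state != "Synced":
        return False  # "Cluster not synced, waiting."
    return endpoint_count(peer_finder_output) == cluster_size


if __name__ == "__main__":
    # Assumed output format; the addresses are made up for illustration.
    peers = "--wsrep_cluster_address=gcomm://10.0.0.1,10.0.0.2,10.0.0.3"
    print(seed_can_exit(3, 4, "Synced", peers))
```

Counting separators cannot tell an empty peer list from a list with a single endpoint, since both contain zero commas and both yield a count of 2. This may be the corner case the TODO in the script refers to.
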
@ -0,0 +1,18 @@
apiVersion: v1
kind: Service
metadata:
  name: {{ .Values.service_name }}
  annotations:
    # This is needed to make the peer-finder work properly and to help avoid
    # edge cases where instance 0 comes up after losing its data and needs to
    # decide whether it should create a new cluster or try to join an existing
    # one. If it creates a new cluster when it should have joined an existing
    # one, we'd end up with two separate clusters listening at the same service
    # endpoint, which would be very bad.
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
spec:
  ports:
    - name: db
      port: {{ .Values.network.port.mariadb }}
  selector:
    app: {{ .Values.service_name }}

@ -1,37 +0,0 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: mariadb-startsh
data:
  start.sh: |+
    #!/bin/bash
    set -ex
    trap "trap - SIGTERM && kill -- -$$" SIGINT SIGTERM EXIT
    sudo chown mysql: /var/lib/mysql
    REPLICAS=$(python /tmp/replicas.py)
    INIT_MARKER="/var/lib/mysql/init_done"
    # Remove a stale mariadb.pid if no matching process is running
    if [[ -f /var/lib/mysql/mariadb.pid ]]; then
        if [[ `pgrep -c $(cat /var/lib/mysql/mariadb.pid)` -eq 0 ]]; then
            rm -vf /var/lib/mysql/mariadb.pid
        fi
    fi
    if [ "$REPLICAS" -eq 1 ] ; then
        if [[ ! -f ${INIT_MARKER} ]]; then
            cd /var/lib/mysql
            echo "Creating one-instance MariaDB."
            bash /tmp/bootstrap-db.sh
            touch ${INIT_MARKER}
        fi
        exec mysqld_safe --defaults-file=/etc/my.cnf \
            --console \
            --wsrep-new-cluster \
            --wsrep_cluster_address='gcomm://'
    else
        export WSREP_OPTIONS=`python /tmp/peer-finder.py mariadb`
        exec mysqld --defaults-file=/etc/my.cnf --console $WSREP_OPTIONS
    fi

@ -1,53 +0,0 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: mariadb-tuning
data:
  tuning.cnf: |+
    [mysqld]
    user=mysql
    max_allowed_packet=256M
    open_files_limit=10240
    max_connections=8192
    max-connect-errors=1000000
    ## Generally, it is unwise to set the query cache to be larger than 64-128M
    ## as the costs associated with maintaining the cache outweigh the performance
    ## gains.
    ## The query cache is a well-known bottleneck that can be seen even when
    ## concurrency is moderate. The best option is to disable it from day 1
    ## by setting query_cache_size=0 (now the default on MySQL 5.6)
    ## and to use other ways to speed up read queries: good indexing, adding
    ## replicas to spread the read load or using an external cache.
    query_cache_size=0
    query_cache_type=0
    sync_binlog=0
    thread_cache_size=16
    table_open_cache=2048
    table_definition_cache=1024
    #
    # InnoDB
    #
    # The buffer pool is where data and indexes are cached: having it as large as possible
    # will ensure you use memory and not disks for most read operations.
    # Typical values are 50..75% of available RAM.
    # TODO(tomasz.paszkowski): This needs to be dynamic based on available RAM.
    innodb_buffer_pool_size=4096M
    innodb_log_file_size=2000M
    innodb_flush_method=O_DIRECT
    innodb_flush_log_at_trx_commit=2
    innodb_old_blocks_time=1000
    innodb_autoinc_lock_mode=2
    innodb_doublewrite=0
    innodb_file_format=Barracuda
    innodb_file_per_table=1
    innodb_io_capacity=500
    innodb_locks_unsafe_for_binlog=1
    innodb_read_io_threads=8
    innodb_write_io_threads=8
    [mysqldump]
    max-allowed-packet=16M

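The TODO in the removed `tuning.cnf` asks for `innodb_buffer_pool_size` to follow the available RAM. A minimal sketch of that arithmetic, assuming the 50..75% guideline from the comment, is shown here. `buffer_pool_size_mb` is a hypothetical helper and is not part of the chart.

```python
def buffer_pool_size_mb(total_ram_mb, fraction=0.5):
    """Return a buffer pool size in MB as a fraction of total RAM.

    The 0.5..0.75 bounds follow the "50..75% of available RAM"
    guideline quoted in the config comment.
    """
    if not 0.5 <= fraction <= 0.75:
        raise ValueError("fraction should be between 0.5 and 0.75")
    return int(total_ram_mb * fraction)


if __name__ == "__main__":
    # An 8 GiB node at the lower bound gives the 4096M value used in the file.
    print("innodb_buffer_pool_size={}M".format(buffer_pool_size_mb(8192)))
```
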
@ -1,15 +0,0 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: mariadb-wsrep
data:
  wsrep.cnf: |+
    [mysqld]
    wsrep_cluster_name="{{ .Values.database.cluster_name }}"
    wsrep_provider=/usr/lib/galera/libgalera_smm.so
    wsrep_provider_options="gcache.size=512M"
    wsrep_slave_threads=12
    wsrep_sst_auth=root:{{ .Values.database.root_password }}
    wsrep_sst_method=xtrabackup-v2
    wsrep_node_name={{ .Values.database.node_name }}
    wsrep_node_address={{ .Values.network.ip_address }}:{{ .Values.network.port.wsrep }}

@ -3,7 +3,12 @@
# Declare name/value pairs to be passed into your templates.
# name: value
replicas: "3" # this must be quoted to deal with atoi
# note that you need to update the gcomm member list
# below when changing this value
replicas: 3
# this drives the service name, and statefulset name
service_name: mariadb
images:
mariadb: quay.io/stackanetes/stackanetes-mariadb:newton
@ -21,6 +26,7 @@ network:
port:
wsrep: 4567
mariadb: 3306
ist: 4444
dns:
kubernetes_domain: cluster.local
ip_address: "{{ .IP }}"