Deploying Grafana HA Kubernetes Cluster on Azure AKS
Introduction
This post demonstrates how to deploy a Grafana high-availability cluster that uses disk persistence and stores its data in a Postgres instance (you can choose another database; see the documentation on the official Grafana website). The cluster is deployed on Azure AKS (Azure Kubernetes Service). To get the most out of the material, avoid copying and pasting: find out how each line works before applying YAML files as if there were no tomorrow. I will leave some useful links along the way to help with your studies.
Grafana HA
Grafana HA works by having all n replicas use the same database to persist the cluster's data. It is worth delving into the subject (recommended) in Grafana's official documentation.
Kind StatefulSet Grafana
To deploy Grafana consistently in this example, we use the StatefulSet kind; the Kubernetes documentation covers it in more detail. What really interested me in this kind is the guarantee that a volume is generated dynamically for each Pod.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: grafana-deployment
  namespace: monitoring
  labels:
    app: grafana-deployment
    component: grafana-core
spec:
  replicas: 3
  selector:
    matchLabels:
      app: grafana-deployment
      component: grafana-core
  serviceName: grafana-service
  template:
    metadata:
      labels:
        app: grafana-deployment
        component: grafana-core
    spec:
      containers:
        - image: grafana/grafana
          name: grafana-deployment
          resources:
            limits:
              cpu: 100m
              memory: 100Mi
            requests:
              cpu: 100m
              memory: 100Mi
          env:
            - name: GF_SERVER_DOMAIN
              value: "sample.server.com"
            - name: GF_SERVER_ROOT_URL
              value: "http://sample.server.com/"
            - name: GF_AUTH_BASIC_ENABLED
              value: "true"
            - name: GF_AUTH_ANONYMOUS_ENABLED
              value: "false"
            # All replicas point at the same Postgres database
            - name: GF_DATABASE_TYPE
              value: "postgres"
            - name: GF_DATABASE_HOST
              value: "postgres-service.monitoring.svc.cluster.local:5432"
            - name: GF_DATABASE_NAME
              value: "grafanadb"
            - name: GF_DATABASE_USER
              valueFrom:
                secretKeyRef:
                  name: grafana-secret-ha
                  key: grafanadb-user
            - name: GF_DATABASE_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: grafana-secret-ha
                  key: grafanadb-password
            - name: GF_SECURITY_ADMIN_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: grafana-secret-ha
                  key: grafana-admin-password
          readinessProbe:
            httpGet:
              path: /login
              port: 3000
          # Only the config file is mounted here; the data volume is
          # mounted by the initContainer below
          volumeMounts:
            - name: config-grafana-ha
              mountPath: /etc/grafana/grafana.ini
              subPath: grafana.ini
      volumes:
        - configMap:
            defaultMode: 420
            name: config-grafana-ha
          name: config-grafana-ha
      initContainers:
        # Runs as root before Grafana starts and sets ownership of the data directory
        - name: fix-volume-permissions-to-grafana
          securityContext:
            runAsUser: 0
            runAsGroup: 0
          image: busybox
          command: ["sh", "-c", "chown -R 0:0 /var/lib/grafana"]
          volumeMounts:
            - name: grafana-volume
              mountPath: /var/lib/grafana
  # One PersistentVolumeClaim is created per replica from this template
  volumeClaimTemplates:
    - metadata:
        name: grafana-volume
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: managed-premium
        resources:
          requests:
            storage: 5Gi
The StatefulSet above has three parts. The first deploys Grafana itself. The second is an initContainer that adjusts the permissions of the "/var/lib/grafana" directory, working around a permissions problem that appeared in newer Grafana versions; I set the ownership to root, which is not advisable in production environments. The third is the volumeClaimTemplates section, which provisions our dynamic volumes using Azure's premium disk class. For Grafana's settings we also need to apply a ConfigMap, shown below:
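As a rough illustration of what the StatefulSet produces, the sketch below prints the Pod and PersistentVolumeClaim names Kubernetes derives from the manifest above. It relies on the standard StatefulSet naming rules (Pods are named `<statefulset-name>-<ordinal>`, claims are named `<template-name>-<pod-name>`); the helper function name is made up for this example.

```shell
#!/bin/sh
# Print the Pod and PVC names a StatefulSet generates.
# Arguments: StatefulSet name, volumeClaimTemplate name, replica count.
# (statefulset_names is a hypothetical helper, not a kubectl feature.)
statefulset_names() {
  sts_name=$1
  claim_template=$2
  replicas=$3
  ordinal=0
  while [ "$ordinal" -lt "$replicas" ]; do
    pod="${sts_name}-${ordinal}"
    # Each Pod gets its own claim: <template-name>-<pod-name>
    echo "${pod} ${claim_template}-${pod}"
    ordinal=$((ordinal + 1))
  done
}

# Usage with the values from the manifest above:
statefulset_names grafana-deployment grafana-volume 3
```

Each of the three output lines pairs a Pod with its own claim, for example `grafana-deployment-0 grafana-volume-grafana-deployment-0`, which is why a rescheduled replica finds its disk again.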
apiVersion: v1
data:
  grafana.ini: |
    ##################### Grafana Configuration Example #####################
    #
    # Everything has defaults so you only need to uncomment things you want to
    # change
    # possible values : production, development
    ; app_mode = development
    # instance name, defaults to HOSTNAME environment variable value or hostname if HOSTNAME var is empty
    ; instance_name = ${HOSTNAME}
    #################################### Paths ####################################
    [paths]
    # Path to where grafana can store temp files, sessions, and the sqlite3 db (if that is used)
    #
    ;data = /var/lib/grafana
    #
    # Directory where grafana can store logs
    #
    ;logs = /var/log/grafana
    #
    # Directory where grafana will automatically scan and look for plugins
    #
    ;plugins = /var/lib/grafana/plugins
    #
    #################################### Server ####################################
    [server]
    # Protocol (http, https, socket)
    ;protocol = http
    # The ip address to bind to, empty will bind to all interfaces
    ;http_addr =
    # The http port to use
    ;http_port = 3000
    # The public facing domain name used to access grafana from a browser
    ;domain = localhost
    # Redirect to correct domain if host header does not match domain
    # Prevents DNS rebinding attacks
    ;enforce_domain = false
    # The full public facing url you use in browser, used for redirects and emails
    # If you use reverse proxy and sub path specify full url (with sub path)
    #root_url = "sample.server.com"
    # Log web requests
    ;router_logging = false
    # The path relative to the working path
    ;static_root_path = public
    # enable gzip
    ;enable_gzip = false
    # https certs & key file
    ;cert_file =
    ;cert_key =
    # Unix socket path
    ;socket =
    #################################### Database ####################################
    [database]
    # You can configure the database connection by specifying type, host, name, user and password
    # as separate properties or as one string using the url property.
    # Either "mysql", "postgres" or "sqlite3", it's your choice
    # type = mysql
    # host = 127.0.0.1:3306
    ;name = grafana
    ;user = root
    # If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
    ;password =
    # Use either URL or the previous fields to configure the database
    # Example: mysql://user:secret@host:port/database
    ;url =
    # For "postgres" only, either "disable", "require" or "verify-full"
    ;ssl_mode = disable
    # For "sqlite3" only, path relative to data_path setting
    ;path = grafana.db
    # Max conn setting default is 0 (means not set)
    ;max_idle_conn =
    ;max_open_conn =
    #################################### Session ####################################
    [session]
    # Either "memory", "file", "redis", "mysql", "postgres", default is "file"
    ;provider = file
    # Provider config options
    # memory: does not have any config yet
    # file: session dir path, is relative to grafana data_path
    # redis: config like redis server e.g. `addr=127.0.0.1:6379,pool_size=100,db=grafana`
    # mysql: go-sql-driver/mysql dsn config string, e.g. `user:password@tcp(127.0.0.1:3306)/database_name`
    # postgres: user=a password=b host=localhost port=5432 dbname=c sslmode=disable
    ;provider_config = sessions
    # Session cookie name
    ;cookie_name = grafana_sess
    # If you use session in https only, default is false
    ;cookie_secure = false
    # Session life time, default is 86400
    ;session_life_time = 86400
    #################################### Data proxy ###########################
    [dataproxy]
    # This enables data proxy logging, default is false
    ;logging = false
    #################################### Analytics ####################################
    [analytics]
    # Server reporting, sends usage counters to stats.grafana.org every 24 hours.
    # No ip addresses are being tracked, only simple counters to track
    # running instances, dashboard and error counts. It is very helpful to us.
    # Change this option to false to disable reporting.
    ;reporting_enabled = true
    # Set to false to disable all checks to https://grafana.net
    # for new versions (grafana itself and plugins), check is used
    # in some UI views to notify that grafana or plugin update exists
    # This option does not cause any auto updates, nor send any information
    # only a GET request to http://grafana.com to get latest versions
    ;check_for_updates = true
    # Google Analytics universal tracking code, only enabled if you specify an id here
    ;google_analytics_ua_id =
    #################################### Security ####################################
    [security]
    # default admin user, created on startup
    ;admin_user = admin
    # default admin password, can be changed before first start of grafana, or in profile settings
    ;admin_password = admin
    # used for signing
    ;secret_key = SW2YcwTIb9zpOOhoPsMm
    # Auto-login remember days
    ;login_remember_days = 7
    ;cookie_username = grafana_user
    ;cookie_remember_name = grafana_remember
    # disable gravatar profile images
    ;disable_gravatar = false
    # data source proxy whitelist (ip_or_domain:port separated by spaces)
    ;data_source_proxy_whitelist =
    [snapshots]
    # snapshot sharing options
    ;external_enabled = true
    ;external_snapshot_url = https://snapshots-origin.raintank.io
    ;external_snapshot_name = Publish to snapshot.raintank.io
    # remove expired snapshot
    ;snapshot_remove_expired = true
    # remove snapshots after 90 days
    ;snapshot_TTL_days = 90
    #################################### Users ####################################
    [users]
    # disable user signup / registration
    ;allow_sign_up = true
    # Allow non admin users to create organizations
    ;allow_org_create = true
    # Set to true to automatically assign new users to the default organization (id 1)
    ;auto_assign_org = true
    # Default role new users will be automatically assigned (if auto_assign_org above is set to true)
    ;auto_assign_org_role = Viewer
    # Background text for the user field on the login page
    ;login_hint = email or username
    # Default UI theme ("dark" or "light")
    ;default_theme = dark
    [auth]
    # Set to true to disable (hide) the login form, useful if you use OAuth, defaults to false
    ;disable_login_form = false
    # Set to true to disable the signout link in the side menu. useful if you use auth.proxy, defaults to false
    ;disable_signout_menu = false
    #################################### Anonymous Auth ##########################
    [auth.anonymous]
    # enable anonymous access
    ;enabled = false
    # specify organization name that should be used for unauthenticated users
    ;org_name = Grafana Sample
    # specify role for unauthenticated users
    ;org_role = Viewer
    #################################### Github Auth ##########################
    [auth.github]
    ;enabled = false
    ;allow_sign_up = true
    ;client_id = some_id
    ;client_secret = some_secret
    ;scopes = user:email,read:org
    ;auth_url = https://github.com/login/oauth/authorize
    ;token_url = https://github.com/login/oauth/access_token
    ;api_url = https://api.github.com/user
    ;team_ids =
    ;allowed_organizations =
    #################################### Google Auth ##########################
    [auth.google]
    ;enabled = false
    ;allow_sign_up = true
    ;client_id = some_client_id
    ;client_secret = some_client_secret
    ;scopes = https://www.googleapis.com/auth/userinfo.profile https://www.googleapis.com/auth/userinfo.email
    ;auth_url = https://accounts.google.com/o/oauth2/auth
    ;token_url = https://accounts.google.com/o/oauth2/token
    ;api_url = https://www.googleapis.com/oauth2/v1/userinfo
    ;allowed_domains =
    #################################### Generic OAuth ##########################
    [auth.generic_oauth]
    ;enabled = false
    ;name = OAuth
    ;allow_sign_up = true
    ;client_id = some_id
    ;client_secret = some_secret
    ;scopes = user:email,read:org
    ;auth_url = https://foo.bar/login/oauth/authorize
    ;token_url = https://foo.bar/login/oauth/access_token
    ;api_url = https://foo.bar/user
    ;team_ids =
    ;allowed_organizations =
    #################################### Grafana.com Auth ####################
    [auth.grafana_com]
    ;enabled = false
    ;allow_sign_up = true
    ;client_id = some_id
    ;client_secret = some_secret
    ;scopes = user:email
    ;allowed_organizations =
    #################################### Auth Proxy ##########################
    [auth.proxy]
    ;enabled = false
    ;header_name = X-WEBAUTH-USER
    ;header_property = username
    ;auto_sign_up = true
    ;ldap_sync_ttl = 60
    ;whitelist = 192.168.1.1, 192.168.2.1
    #################################### Basic Auth ##########################
    [auth.basic]
    ;enabled = true
    #################################### Auth LDAP ##########################
    [auth.ldap]
    ;enabled = false
    ;config_file = /etc/grafana/ldap.toml
    ;allow_sign_up = true
    #################################### SMTP / Emailing ##########################
    [smtp]
    enabled = true
    host = smtp.gmail.com:465
    user = gmailaccount@gmail.com
    # If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
    password = YourPassword
    ;cert_file =
    ;key_file =
    skip_verify = true
    from_address = grafana@sample.server.com
    ;from_name = Grafana
    [emails]
    ;welcome_email_on_sign_up = false
    #################################### Logging ##########################
    [log]
    # Either "console", "file", "syslog". Default is console and file
    # Use space to separate multiple modes, e.g. "console file"
    ;mode = console file
    # Either "debug", "info", "warn", "error", "critical", default is "info"
    ;level = info
    # optional settings to set different levels for specific loggers. Ex filters = sqlstore:debug
    ;filters =
    # For "console" mode only
    [log.console]
    ;level =
    # log line format, valid options are text, console and json
    ;format = console
    # For "file" mode only
    [log.file]
    ;level =
    # log line format, valid options are text, console and json
    ;format = text
    # This enables automated log rotate(switch of following options), default is true
    ;log_rotate = true
    # Max line number of single file, default is 1000000
    ;max_lines = 1000000
    # Max size shift of single file, default is 28 means 1 << 28, 256MB
    ;max_size_shift = 28
    # Segment log daily, default is true
    ;daily_rotate = true
    # Expired days of log file(delete after max days), default is 7
    ;max_days = 7
    [log.syslog]
    ;level =
    # log line format, valid options are text, console and json
    ;format = text
    # Syslog network type and address. This can be udp, tcp, or unix. If left blank, the default unix endpoints will be used.
    ;network =
    ;address =
    # Syslog facility. user, daemon and local0 through local7 are valid.
    ;facility =
    # Syslog tag. By default, the process' argv[0] is used.
    ;tag =
    #################################### AMQP Event Publisher ##########################
    [event_publisher]
    ;enabled = false
    ;rabbitmq_url = amqp://localhost/
    ;exchange = grafana_events
    #################################### Dashboard JSON files ##########################
    [dashboards.json]
    ;enabled = false
    ;path = /var/lib/grafana/dashboards
    #################################### Alerting ############################
    [alerting]
    # Disable alerting engine & UI features
    ;enabled = true
    # Makes it possible to turn off alert rule execution but alerting UI is visible
    ;execute_alerts = true
    #################################### Internal Grafana Metrics ##########################
    # Metrics available at HTTP API Url /api/metrics
    [metrics]
    # Disable / Enable internal metrics
    ;enabled = true
    # Publish interval
    ;interval_seconds = 10
    # Send internal metrics to Graphite
    [metrics.graphite]
    # Enable by setting the address setting (ex localhost:2003)
    ;address =
    ;prefix = prod.grafana.%(instance_name)s.
    #################################### Grafana.com integration ##########################
    # URL used to import dashboards directly from Grafana.com
    [grafana_com]
    ;url = https://grafana.com
    #################################### External image storage ##########################
    [external_image_storage]
    # Used for uploading images to public servers so they can be included in slack/email messages.
    # you can choose between (s3, webdav)
    ;provider =
    [external_image_storage.s3]
    ;bucket_url =
    ;access_key =
    ;secret_key =
    [external_image_storage.webdav]
    ;url =
    ;public_url =
    ;username =
    ;password =
kind: ConfigMap
metadata:
  name: config-grafana-ha
  namespace: monitoring
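Most of the values in this file can stay commented out, because the StatefulSet overrides the ones it needs through environment variables. Grafana maps an ini setting to a variable named `GF_<SECTION>_<KEY>`, uppercased, with dots and dashes replaced by underscores. A minimal sketch of that mapping, with a helper name that is purely illustrative:

```shell
#!/bin/sh
# Build the environment variable name Grafana reads for an ini setting.
# grafana_env_name is a hypothetical helper written for this example.
grafana_env_name() {
  section=$1
  key=$2
  # Uppercase everything, then turn "." and "-" into "_"
  printf 'GF_%s_%s\n' "$section" "$key" \
    | tr '[:lower:]' '[:upper:]' \
    | tr '.-' '__'
}

# Usage with settings that appear in the StatefulSet:
grafana_env_name database host           # GF_DATABASE_HOST
grafana_env_name auth.anonymous enabled  # GF_AUTH_ANONYMOUS_ENABLED
```

This is why `[auth.anonymous] enabled` in the file corresponds to `GF_AUTH_ANONYMOUS_ENABLED` in the Pod spec.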
I used a Kubernetes Secret to supply the database credentials and the Grafana admin password.
apiVersion: v1
kind: Secret
metadata:
  name: grafana-secret-ha
  namespace: monitoring
  labels:
    file: grafana-secret-ha
type: Opaque
data:
  grafanadb-user: '<64value>'
  grafanadb-password: '<64value>'
  grafana-admin-password: '<64value>'
To generate the base64 values, you can use the following command in a Linux shell:
echo -n '<value>' | base64
The -n flag keeps echo from appending a newline, which would otherwise be encoded into the secret.
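To double-check a value before pasting it into the Secret, it can help to encode it and decode it again. The sketch below assumes the GNU coreutils `base64` tool (where `-d` decodes) and uses `grafana` as a stand-in value; the two function names are invented for the example.

```shell
#!/bin/sh
# Encode a value without a trailing newline; printf '%s' is a portable
# alternative to echo -n.
encode_secret() {
  printf '%s' "$1" | base64
}

# Decode a base64 string back to plain text to verify it.
decode_secret() {
  printf '%s' "$1" | base64 -d
}

encoded=$(encode_secret 'grafana')
echo "$encoded"                   # Z3JhZmFuYQ==
decode_secret "$encoded"; echo    # grafana

# With a plain echo the newline is encoded too, giving a different value:
echo 'grafana' | base64           # Z3JhZmFuYQo=
```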
We need a Service so that requests made to the cluster are distributed across all Grafana Pods.
apiVersion: v1
kind: Service
metadata:
  name: grafana-service
  namespace: monitoring
  labels:
    app: grafana-deployment
    component: grafana-core
spec:
  type: ClusterIP
  ports:
    - port: 3000
  selector:
    app: grafana-deployment
    component: grafana-core
Postgres
Now let's deploy our database instance. Below you can find the YAML:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  namespace: monitoring
  labels:
    app: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:9.4
          imagePullPolicy: "IfNotPresent"
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_DB
              value: grafanadb
            # Credentials come from the same Secret that Grafana uses
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: grafana-secret-ha
                  key: grafanadb-user
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: grafana-secret-ha
                  key: grafanadb-password
          volumeMounts:
            - mountPath: /var/lib/postgresql/
              name: postgres-volume
      volumes:
        - name: postgres-volume
          persistentVolumeClaim:
            claimName: postgres-volume
Note that the database instance uses the previously configured Secret.
Let's now deploy the Service for our database.
apiVersion: v1
kind: Service
metadata:
  name: postgres-service
  namespace: monitoring
  labels:
    app: postgres
spec:
  type: ClusterIP
  ports:
    - port: 5432
  selector:
    app: postgres
With the Service applied, we can reach our database instance through internal DNS. If the Pod dies and comes back with a different IP, the change will not affect communication between Grafana and the database.
Always remember to use DNS when possible.
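The name Grafana uses in `GF_DATABASE_HOST` follows the usual pattern for Services, `<service>.<namespace>.svc.<cluster-domain>`. Here is a small sketch of how that string is assembled, assuming the default cluster domain `cluster.local` (the helper name is hypothetical):

```shell
#!/bin/sh
# Build the in-cluster DNS name of a Service.
# Arguments: service name, namespace, optional cluster domain.
service_fqdn() {
  service=$1
  namespace=$2
  cluster_domain=${3:-cluster.local}  # assumed default; clusters can override it
  echo "${service}.${namespace}.svc.${cluster_domain}"
}

# Host and port as used by the Grafana StatefulSet:
echo "$(service_fqdn postgres-service monitoring):5432"
# postgres-service.monitoring.svc.cluster.local:5432
```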
To complete the deployment of our database instance, we need a persistent volume claim so that the data from Postgres (or any other database) survives Pod restarts.
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-volume
  namespace: monitoring
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: managed-premium
  resources:
    requests:
      storage: 10Gi
With the steps above, adjusted to your needs, we have a Grafana HA cluster. I hope you enjoyed the post and that you use the material as a study base to understand what each line of code means. If you want to talk, feel free to contact me. Until the next post!