Deploying Grafana HA Kubernetes Cluster on Azure AKS


Introduction

This post demonstrates how to deploy a Grafana high-availability cluster with disk persistence and data stored in a Postgres instance (you can choose another database according to the documentation on the official Grafana website). The cluster will be deployed on Azure AKS (Azure Kubernetes Service). To get the most out of the material, avoid copying and pasting: find out how each line works before applying YAML files as if there were no tomorrow. I will leave some useful links during each deployment step to help with further study.

Grafana HA

For Grafana to run in high availability, all n replicas must use the same database to persist the cluster's data. You can (and should) delve into the subject in Grafana's official documentation.
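
Every manifest in this post targets a namespace called monitoring, so that namespace has to exist before anything else is applied. If your cluster does not have it yet, a minimal sketch of the manifest would be:

```yaml
# Namespace that all the manifests in this post refer to.
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
```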

Kind StatefulSet Grafana

To deploy Grafana consistently in this example, we use the StatefulSet kind. You can learn more about it in the Kubernetes documentation. What really interested me in this kind is the guarantee that a volume is generated dynamically for each pod.


apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: grafana-deployment
  namespace: monitoring
  labels:
    app: grafana-deployment
    component: grafana-core
spec:
  replicas: 3
  selector:
    matchLabels:
      app: grafana-deployment
      component: grafana-core
  serviceName: grafana-service
  template:
    metadata:
      labels:
        app: grafana-deployment
        component: grafana-core
    spec:
      containers:
      - image: grafana/grafana
        name: grafana-deployment
        resources:
          limits:
            cpu: 100m
            memory: 100Mi
          requests:
            cpu: 100m
            memory: 100Mi
        env:
        - name: GF_SERVER_DOMAIN
          value: "sample.server.com"
        - name: GF_SERVER_ROOT_URL
          value: "http://sample.server.com/"
        - name: GF_AUTH_BASIC_ENABLED
          value: "true"
        - name: GF_AUTH_ANONYMOUS_ENABLED
          value: "false"
        - name: GF_DATABASE_TYPE
          value: "postgres"
        - name: GF_DATABASE_HOST
          value: "postgres-service.monitoring.svc.cluster.local:5432"
        - name: GF_DATABASE_NAME
          value: "grafanadb"
        - name: GF_DATABASE_USER
          valueFrom:
            secretKeyRef:
              name: grafana-secret-ha
              key: grafanadb-user
        - name: GF_DATABASE_PASSWORD
          valueFrom:
            secretKeyRef:
              name: grafana-secret-ha
              key: grafanadb-password
        - name: GF_SECURITY_ADMIN_PASSWORD
          valueFrom:
            secretKeyRef:
              name: grafana-secret-ha
              key: grafana-admin-password
        readinessProbe:
          httpGet:
            path: /login
            port: 3000
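        volumeMounts:
        - name: config-grafana-ha
          mountPath: /etc/grafana/grafana.ini
          subPath: grafana.ini
        # Mount the per-pod volume so that Grafana's data directory is persisted
        - name: grafana-volume
          mountPath: /var/lib/grafana
      volumes:
      - configMap:
          defaultMode: 420
          name: config-grafana-ha
        name: config-grafana-ha
      initContainers:
      # Runs as root at pod start to hand the volume over to the Grafana user
      - name: fix-volume-permissions-to-grafana
        securityContext:
          runAsUser: 0
          runAsGroup: 0
        image: busybox
        # Assumption: 472 is the UID/GID of the grafana user in the official image (5.1 and later)
        command: ["sh", "-c", "chown -R 472:472 /var/lib/grafana"]
        volumeMounts:
        - name: grafana-volume
          mountPath: /var/lib/grafana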
  volumeClaimTemplates:
  - metadata:
      name: grafana-volume
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: managed-premium
      resources:
        requests:
          storage: 5Gi


The above StatefulSet has three parts. The first is the deployment of Grafana itself. The second is an initContainer that adjusts the ownership of the "/var/lib/grafana" directory, which works around a permissions problem that appears with newer versions of the Grafana image. The initContainer runs as root, which is not advisable in production environments. The third is the creation of our dynamic volumes using Azure's premium disk class. Grafana's settings are supplied through a ConfigMap, shown below:
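
Since running an initContainer as root is not ideal, one possible alternative is to let the kubelet set the group ownership of the volume through a pod-level securityContext. The fragment below is only a sketch of that idea, not something I tested in this setup. It assumes the Grafana image runs as UID/GID 472, and it would replace the initContainers entry in the StatefulSet above.

```yaml
# Fragment of the StatefulSet (spec.template.spec), not a complete manifest.
spec:
  template:
    spec:
      securityContext:
        # Assumption: the grafana user in the official image has GID 472.
        # With fsGroup set, the mounted volume is made group-writable for
        # this GID, so no root initContainer is needed for the chown.
        fsGroup: 472
```

Either way, Grafana's settings come from the following ConfigMap: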

apiVersion: v1
data:
  grafana.ini: |
    ##################### Grafana Configuration Example #####################
    #
    # Everything has defaults so you only need to uncomment things you want to
    # change
    # possible values : production, development
    ; app_mode = development
    # instance name, defaults to HOSTNAME environment variable value or hostname if HOSTNAME var is empty
    ; instance_name = ${HOSTNAME}
    #################################### Paths ####################################
    [paths]
    # Path to where grafana can store temp files, sessions, and the sqlite3 db (if that is used)
    #
    ;data = /var/lib/grafana
    #
    # Directory where grafana can store logs
    #
    ;logs = /var/log/grafana
    #
    # Directory where grafana will automatically scan and look for plugins
    #
    ;plugins = /var/lib/grafana/plugins
    #
    #################################### Server ####################################
    [server]
    # Protocol (http, https, socket)
    ;protocol = http
    # The ip address to bind to, empty will bind to all interfaces
    ;http_addr =
    # The http port  to use
    ;http_port = 3000
    # The public facing domain name used to access grafana from a browser
    ;domain = localhost
    # Redirect to correct domain if host header does not match domain
    # Prevents DNS rebinding attacks
    ;enforce_domain = false
    # The full public facing url you use in browser, used for redirects and emails
    # If you use reverse proxy and sub path specify full url (with sub path)
    #root_url = "sample.server.com"
    # Log web requests
    ;router_logging = false
    # the path relative working path
    ;static_root_path = public
    # enable gzip
    ;enable_gzip = false
    # https certs & key file
    ;cert_file =
    ;cert_key =
    # Unix socket path
    ;socket =
    #################################### Database ####################################
    [database]
    # You can configure the database connection by specifying type, host, name, user and password
    # as separate properties or as one string using the url property.
    # Either "mysql", "postgres" or "sqlite3", it's your choice
    # type = mysql
    # host = 127.0.0.1:3306
    ;name = grafana
    ;user = root
    # If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
    ;password =
    # Use either URL or the previous fields to configure the database
    # Example: mysql://user:secret@host:port/database
    ;url =
    # For "postgres" only, either "disable", "require" or "verify-full"
    ;ssl_mode = disable
    # For "sqlite3" only, path relative to data_path setting
    ;path = grafana.db
    # Max conn setting default is 0 (mean not set)
    ;max_idle_conn =
    ;max_open_conn =
    #################################### Session ####################################
    [session]
    # Either "memory", "file", "redis", "mysql", "postgres", default is "file"
    ;provider = file
    # Provider config options
    # memory: not have any config yet
    # file: session dir path, is relative to grafana data_path
    # redis: config like redis server e.g. `addr=127.0.0.1:6379,pool_size=100,db=grafana`
    # mysql: go-sql-driver/mysql dsn config string, e.g. `user:password@tcp(127.0.0.1:3306)/database_name`
    # postgres: user=a password=b host=localhost port=5432 dbname=c sslmode=disable
    ;provider_config = sessions
    # Session cookie name
    ;cookie_name = grafana_sess
    # If you use session in https only, default is false
    ;cookie_secure = false
    # Session life time, default is 86400
    ;session_life_time = 86400
    #################################### Data proxy ###########################
    [dataproxy]
    # This enables data proxy logging, default is false
    ;logging = false
    #################################### Analytics ####################################
    [analytics]
    # Server reporting, sends usage counters to stats.grafana.org every 24 hours.
    # No ip addresses are being tracked, only simple counters to track
    # running instances, dashboard and error counts. It is very helpful to us.
    # Change this option to false to disable reporting.
    ;reporting_enabled = true
    # Set to false to disable all checks to https://grafana.net
    # for new versions (grafana itself and plugins), check is used
    # in some UI views to notify that grafana or plugin update exists
    # This option does not cause any auto updates, nor send any information
    # only a GET request to http://grafana.com to get latest versions
    ;check_for_updates = true
    # Google Analytics universal tracking code, only enabled if you specify an id here
    ;google_analytics_ua_id =
    #################################### Security ####################################
    [security]
    # default admin user, created on startup
    ;admin_user = admin
    # default admin password, can be changed before first start of grafana,  or in profile settings
    ;admin_password = admin
    # used for signing
    ;secret_key = SW2YcwTIb9zpOOhoPsMm
    # Auto-login remember days
    ;login_remember_days = 7
    ;cookie_username = grafana_user
    ;cookie_remember_name = grafana_remember
    # disable gravatar profile images
    ;disable_gravatar = false
    # data source proxy whitelist (ip_or_domain:port separated by spaces)
    ;data_source_proxy_whitelist =
    [snapshots]
    # snapshot sharing options
    ;external_enabled = true
    ;external_snapshot_url = https://snapshots-origin.raintank.io
    ;external_snapshot_name = Publish to snapshot.raintank.io
    # remove expired snapshot
    ;snapshot_remove_expired = true
    # remove snapshots after 90 days
    ;snapshot_TTL_days = 90
    #################################### Users ####################################
    [users]
    # disable user signup / registration
    ;allow_sign_up = true
    # Allow non admin users to create organizations
    ;allow_org_create = true
    # Set to true to automatically assign new users to the default organization (id 1)
    ;auto_assign_org = true
    # Default role new users will be automatically assigned (if disabled above is set to true)
    ;auto_assign_org_role = Viewer
    # Background text for the user field on the login page
    ;login_hint = email or username
    # Default UI theme ("dark" or "light")
    ;default_theme = dark
    [auth]
    # Set to true to disable (hide) the login form, useful if you use OAuth, defaults to false
    ;disable_login_form = false
    # Set to true to disable the signout link in the side menu. useful if you use auth.proxy, defaults to false
    ;disable_signout_menu = false
    #################################### Anonymous Auth ##########################
    [auth.anonymous]
    # enable anonymous access
    ;enabled = false
    # specify organization name that should be used for unauthenticated users
    ;org_name = Grafana Sample
    # specify role for unauthenticated users
    ;org_role = Viewer
    #################################### Github Auth ##########################
    [auth.github]
    ;enabled = false
    ;allow_sign_up = true
    ;client_id = some_id
    ;client_secret = some_secret
    ;scopes = user:email,read:org
    ;auth_url = https://github.com/login/oauth/authorize
    ;token_url = https://github.com/login/oauth/access_token
    ;api_url = https://api.github.com/user
    ;team_ids =
    ;allowed_organizations =
    #################################### Google Auth ##########################
    [auth.google]
    ;enabled = false
    ;allow_sign_up = true
    ;client_id = some_client_id
    ;client_secret = some_client_secret
    ;scopes = https://www.googleapis.com/auth/userinfo.profile https://www.googleapis.com/auth/userinfo.email
    ;auth_url = https://accounts.google.com/o/oauth2/auth
    ;token_url = https://accounts.google.com/o/oauth2/token
    ;api_url = https://www.googleapis.com/oauth2/v1/userinfo
    ;allowed_domains =
    #################################### Generic OAuth ##########################
    [auth.generic_oauth]
    ;enabled = false
    ;name = OAuth
    ;allow_sign_up = true
    ;client_id = some_id
    ;client_secret = some_secret
    ;scopes = user:email,read:org
    ;auth_url = https://foo.bar/login/oauth/authorize
    ;token_url = https://foo.bar/login/oauth/access_token
    ;api_url = https://foo.bar/user
    ;team_ids =
    ;allowed_organizations =
    #################################### Grafana.com Auth ####################
    [auth.grafana_com]
    ;enabled = false
    ;allow_sign_up = true
    ;client_id = some_id
    ;client_secret = some_secret
    ;scopes = user:email
    ;allowed_organizations =
    #################################### Auth Proxy ##########################
    [auth.proxy]
    ;enabled = false
    ;header_name = X-WEBAUTH-USER
    ;header_property = username
    ;auto_sign_up = true
    ;ldap_sync_ttl = 60
    ;whitelist = 192.168.1.1, 192.168.2.1
    #################################### Basic Auth ##########################
    [auth.basic]
    ;enabled = true
    #################################### Auth LDAP ##########################
    [auth.ldap]
    ;enabled = false
    ;config_file = /etc/grafana/ldap.toml
    ;allow_sign_up = true
    #################################### SMTP / Emailing ##########################
    [smtp]
    enabled = true
    host =  smtp.gmail.com:465
    user = gmailaccount@gmail.com
    # If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
    password = YourPassword
    ;cert_file =
    ;key_file =
    skip_verify = true
    from_address = grafana@sample.server.com
    ;from_name = Grafana
    [emails]
    ;welcome_email_on_sign_up = false
    #################################### Logging ##########################
    [log]
    # Either "console", "file", "syslog". Default is console and  file
    # Use space to separate multiple modes, e.g. "console file"
    ;mode = console file
    # Either "debug", "info", "warn", "error", "critical", default is "info"
    ;level = info
    # optional settings to set different levels for specific loggers. Ex filters = sqlstore:debug
    ;filters =
    # For "console" mode only
    [log.console]
    ;level =
    # log line format, valid options are text, console and json
    ;format = console
    # For "file" mode only
    [log.file]
    ;level =
    # log line format, valid options are text, console and json
    ;format = text
    # This enables automated log rotate(switch of following options), default is true
    ;log_rotate = true
    # Max line number of single file, default is 1000000
    ;max_lines = 1000000
    # Max size shift of single file, default is 28 means 1 << 28, 256MB
    ;max_size_shift = 28
    # Segment log daily, default is true
    ;daily_rotate = true
    # Expired days of log file(delete after max days), default is 7
    ;max_days = 7
    [log.syslog]
    ;level =
    # log line format, valid options are text, console and json
    ;format = text
    # Syslog network type and address. This can be udp, tcp, or unix. If left blank, the default unix endpoints will be used.
    ;network =
    ;address =
    # Syslog facility. user, daemon and local0 through local7 are valid.
    ;facility =
    # Syslog tag. By default, the process' argv[0] is used.
    ;tag =
    #################################### AMQP Event Publisher ##########################
    [event_publisher]
    ;enabled = false
    ;rabbitmq_url = amqp://localhost/
    ;exchange = grafana_events
    #################################### Dashboard JSON files ##########################
    [dashboards.json]
    ;enabled = false
    ;path = /var/lib/grafana/dashboards
    #################################### Alerting ############################
    [alerting]
    # Disable alerting engine & UI features
    ;enabled = true
    # Makes it possible to turn off alert rule execution but alerting UI is visible
    ;execute_alerts = true
    #################################### Internal Grafana Metrics ##########################
    # Metrics available at HTTP API Url /api/metrics
    [metrics]
    # Disable / Enable internal metrics
    ;enabled           = true
    # Publish interval
    ;interval_seconds  = 10
    # Send internal metrics to Graphite
    [metrics.graphite]
    # Enable by setting the address setting (ex localhost:2003)
    ;address =
    ;prefix = prod.grafana.%(instance_name)s.
    #################################### Grafana.com integration  ##########################
    # Url used to import dashboards directly from Grafana.com
    [grafana_com]
    ;url = https://grafana.com
    #################################### External image storage ##########################
    [external_image_storage]
    # Used for uploading images to public servers so they can be included in slack/email messages.
    # you can choose between (s3, webdav)
    ;provider =
    [external_image_storage.s3]
    ;bucket_url =
    ;access_key =
    ;secret_key =
    [external_image_storage.webdav]
    ;url =
    ;public_url =
    ;username =
    ;password =
kind: ConfigMap
metadata:
  name: config-grafana-ha
  namespace: monitoring
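
In the ConfigMap above, only the [smtp] section is actually uncommented, and everything else falls back to Grafana's defaults. A trimmed-down version with the same effective settings could look like the sketch below. I left the SMTP password out on purpose: Grafana accepts overrides in the form GF_<SECTION>_<KEY>, so it could come from a GF_SMTP_PASSWORD environment variable fed by a Secret (the Secret key for it would be something you add yourself, it is not part of this post).

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-grafana-ha
  namespace: monitoring
data:
  grafana.ini: |
    # Only the settings that differ from Grafana's defaults.
    [smtp]
    enabled = true
    host = smtp.gmail.com:465
    user = gmailaccount@gmail.com
    skip_verify = true
    from_address = grafana@sample.server.com
    # password is intentionally omitted here; see the note about
    # GF_SMTP_PASSWORD above.
```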


I used a Kubernetes Secret to provide the database credentials and the admin password to Grafana.

apiVersion: v1
kind: Secret
metadata:
    name: grafana-secret-ha
    namespace: monitoring
    labels:
        file: grafana-secret-ha
type: Opaque
data:
    grafanadb-user: '<64value>'
    grafanadb-password: '<64value>'
    grafana-admin-password: '<64value>'


To generate the base64-encoded values, you can use the following command in a Linux shell:

echo -n '<value>' | base64


See the base64 manual page (man base64) for more information about the command.
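
If you would rather not encode the values by hand, Kubernetes also accepts plain-text values under the stringData field and stores them base64-encoded for you. A sketch of the same Secret written that way is below. The values are placeholders that I made up for illustration, so replace them with your own and keep the file out of version control.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: grafana-secret-ha
  namespace: monitoring
type: Opaque
stringData:
  # Placeholder values for illustration only.
  grafanadb-user: grafana
  grafanadb-password: change-me
  grafana-admin-password: change-me-too
```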

We need a Service so that requests made to the cluster can be delivered to all of the Pods.

apiVersion: v1
kind: Service
metadata:
  name: grafana-service
  namespace: monitoring
  labels: 
    app: grafana-deployment
    component: grafana-core
spec:
  type: ClusterIP
  ports:
  - port: 3000
  selector:
    app: grafana-deployment
    component: grafana-core
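
A side note on the StatefulSet's serviceName field: Kubernetes expects it to point at a headless Service if you want each pod to get its own stable DNS name. The ClusterIP Service above is enough to balance traffic, but if you also want per-pod names, a headless Service along these lines could be added. The name grafana-headless is my own choice for this sketch, and serviceName in the StatefulSet would have to be changed to match it.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: grafana-headless   # hypothetical name, not used elsewhere in this post
  namespace: monitoring
  labels:
    app: grafana-deployment
    component: grafana-core
spec:
  clusterIP: None          # headless: DNS resolves to the pod IPs directly
  ports:
  - port: 3000
  selector:
    app: grafana-deployment
    component: grafana-core
```

With that in place, a pod would be reachable at a name such as grafana-deployment-0.grafana-headless.monitoring.svc.cluster.local.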


Postgres

Now let's deploy our database instance. The YAML is below:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  namespace: monitoring
  labels:
    app: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:9.4
        imagePullPolicy: "IfNotPresent"
        ports:
        - containerPort: 5432
        env:
        - name: POSTGRES_DB
          value: grafanadb
        - name: POSTGRES_USER
          valueFrom:
            secretKeyRef:
              name: grafana-secret-ha
              key: grafanadb-user
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: grafana-secret-ha
              key: grafanadb-password
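        volumeMounts:
        # Mount at the image's data directory so the database files land on the claim;
        # subPath keeps the disk's lost+found entry out of the data directory
        - mountPath: /var/lib/postgresql/data
          subPath: pgdata
          name: postgres-volume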
      volumes:
      - name: postgres-volume
        persistentVolumeClaim:
          claimName: postgres-volume


Note that the database instance uses the previously-configured secret settings.

Let's now deploy the Service for our database.

apiVersion: v1
kind: Service
metadata:
  name: postgres-service
  namespace: monitoring
  labels:
    app: postgres
spec:
  type: ClusterIP
  ports:
  - port: 5432
  selector:
    app: postgres


With the Service applied, we can refer to our database instance via internal DNS. If the Pod dies, the resulting IP change will not affect communication between Grafana and the database.

Always remember to use DNS when possible.

To complete the deployment of our database instance, we need a persistent volume so that the data from Postgres (or any other database) survives Pod restarts.

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-volume
  namespace: monitoring
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: managed-premium
  resources:
    requests:
      storage: 10Gi



Following the steps above, adjusted to your needs, we have a Grafana HA cluster. I hope you enjoyed it. Please use the material as a study base and try to understand what each line of code means. If you want to talk, feel free to contact me. Until the next post!