OpenShift Data Foundation on IBM Cloud Using Terraform, Part 1: Deployment

OpenShift Data Foundation (ODF) is a software-defined storage solution that manages and stores large amounts of data across multiple environments in a scalable, efficient, and secure manner. It provides a unified view of data from different sources, including databases, file systems, and cloud storage. With ODF, organizations can manage structured and unstructured data from multiple sources and gain insights from it through analytics and machine learning, enabling them to build a modern data infrastructure that is flexible, agile, and cost-effective.

Under the hood, ODF utilizes Ceph as the underlying distributed storage system. Ceph stores the data and manages data replication and distribution across multiple nodes in the storage cluster, ensuring high availability and data redundancy. Rook, integrated with ODF as a storage orchestrator, automates the deployment and management of Ceph within the Kubernetes cluster. Rook handles tasks like data replication, scaling, and fault tolerance, relieving administrators from dealing with these complexities.

To summarize: ODF provides the unified storage layer, Ceph handles the distributed storage underneath, and Rook orchestrates Ceph on the Kubernetes cluster.
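Once ODF is installed on a cluster, these Ceph and Rook components show up as pods in the openshift-storage namespace. As a quick peek (a sketch, assuming ODF is already deployed, as we do later in this article):

```
# List the Rook/Ceph pods that ODF deploys and manages
kubectl get pods -n openshift-storage
```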

What Is Terraform?

Terraform is an infrastructure as code tool that lets you build, change, and version cloud and on-prem resources safely and efficiently. Think of it as a way to automate the manual process of setting up and managing infrastructure resources, making it faster, more reliable, and easier to maintain over time.
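For this article, the Terraform workflow boils down to three commands, run from the directory that contains the configuration and the input.tfvars file:

```
terraform init                           # download the providers and modules the configuration needs
terraform plan --var-file=input.tfvars   # preview the changes without making them
terraform apply --var-file=input.tfvars  # create or update the resources
```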

Prerequisites

Before you begin, you need an IBM Cloud API key, a Red Hat OpenShift on IBM Cloud (ROKS) VPC cluster at version 4.12 or 4.13, the Terraform CLI installed on your workstation, and a local copy of the repository that provides the ODF add-on module and its input.tfvars file.
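To confirm the tooling is in place, you can run checks along these lines (a sketch; the API key, region, and cluster name are placeholders for your own values):

```
# Check that the Terraform CLI is installed
terraform version

# Log in to IBM Cloud with your API key and target your region
ibmcloud login --apikey <API_KEY> -r <region>

# Confirm the target ROKS cluster exists and check its OpenShift version
ibmcloud oc cluster get --cluster <cluster_name_or_ID>
```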

Deploying OpenShift Data Foundation

To deploy ODF through Terraform, all you need to get started is the input.tfvars file provided in the add-on folder of the repository. Every operation, from creation to deletion, involves only this file.

Let’s get started with a sample input.tfvars file populated with the default values, as shown below:

```
# To enable the ODF add-on on your cluster
ibmcloud_api_key = ""
cluster = ""
region = ""
odfVersion = "4.12.0"

# To create the OcsCluster custom resource with the following spec
autoDiscoverDevices = "false"
billingType = "advanced"
clusterEncryption = "false"
hpcsBaseUrl = null
hpcsEncryption = "false"
hpcsInstanceId = null
hpcsSecretName = null
hpcsServiceName = null
hpcsTokenUrl = null
ignoreNoobaa = "false"
numOfOsd = "1"
ocsUpgrade = "false"
osdDevicePaths = null
osdSize = "250Gi"
osdStorageClassName = "ibmc-vpc-block-metro-10iops-tier"
workerNodes = null
```


For more information on the individual parameters, see the ODF add-on documentation on IBM Cloud.

With this sample configuration, we deploy ODF version 4.12.0 on a ROKS VPC cluster (4.12 or 4.13), provisioning 250 GB of storage. Encryption is disabled, and ODF is deployed on all worker nodes. The number of OSDs is 1; that is, we provision a total of 250 GB x 3 = 750 GB of raw storage, with the two extra replicas acting as backups in different locations.

Now we run the following command:

```
terraform apply --var-file=input.tfvars
```


It’s as simple as that! The apply takes around 20 minutes to complete.
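If you are not already connected to the cluster, one way to download its credentials is through the IBM Cloud CLI (a sketch, assuming the container-service plugin is installed and <cluster_name_or_ID> is your cluster):

```
# Download the cluster's admin kubeconfig so kubectl/oc target this cluster
ibmcloud oc cluster config --cluster <cluster_name_or_ID> --admin
```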

Once logged in to your cluster, run the following command; you should see the corresponding output.

Command:

```
kubectl get ocscluster -o yaml
```


Output:

```
apiVersion: v1
items:
- apiVersion: ocs.ibm.io/v1
  kind: OcsCluster
  metadata:
    creationTimestamp: "2023-09-05T07:22:37Z"
    finalizers:
    - finalizer.ocs.ibm.io
    generation: 1
    name: ocscluster-auto
    resourceVersion: "21595"
    uid: 39d4ab72-50b8-49ed-89f4-b40cbd6fcbaa
  spec:
    autoDiscoverDevices: false
    billingType: advanced
    clusterEncryption: false
    hpcsEncryption: false
    ignoreNoobaa: false
    numOfOsd: 1
    ocsUpgrade: false
    osdDevicePaths:
    - ""
    osdSize: 250Gi
    osdStorageClassName: ibmc-vpc-block-metro-10iops-tier
  status:
    storageClusterStatus: Ready
kind: List
metadata:
  resourceVersion: ""
```


After 10-15 minutes, the OcsCluster status is Ready! We have successfully deployed ODF on our ROKS VPC cluster.
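As an optional extra check, you can look at the storage cluster that ODF created and at the OSD pods; with numOfOsd set to 1 you should see three OSD pods, one per replica (a sketch, assuming the default openshift-storage namespace):

```
# The ODF storage cluster created by the add-on
kubectl get storagecluster -n openshift-storage

# One OSD pod per data replica (3 with numOfOsd = 1)
kubectl get pods -n openshift-storage | grep rook-ceph-osd
```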

Deploying an App With OpenShift Data Foundation

After you have deployed OpenShift Data Foundation for your ROKS cluster through Terraform, you can use the ODF storage classes to create a persistent volume claim (PVC). Then, refer to the PVC in your deployment so that your app can save and use data from the underlying ODF storage device.

First, list the available ODF storage classes:

```
kubectl get sc | grep openshift
```


Output:

```
ocs-storagecluster-ceph-rbd openshift-storage.rbd.csi.ceph.com    Delete
ocs-storagecluster-cephfs   openshift-storage.cephfs.csi.ceph.com Delete
openshift-storage.noobaa.io openshift-storage.noobaa.io/obc       Delete
```

Next, create a PVC definition that references one of these storage classes. This example uses the ocs-storagecluster-cephfs storage class; save it as my-pvc.yaml:

```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: odf-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: ocs-storagecluster-cephfs
```


 
Create the PVC in your cluster:

```
kubectl create -f my-pvc.yaml
```
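Optionally, you can confirm that the PVC was created; once dynamic provisioning completes, its status should show Bound:

```
# The PVC name matches the manifest above
kubectl get pvc odf-pvc
```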


Then create a pod that mounts the PVC. Save the following as pod.yaml:

 
```
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: nginx
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo $(date -u) >> /test/test.txt; sleep 600; done"]
    volumeMounts:
    - name: persistent-storage
      mountPath: /test
  volumes:
  - name: persistent-storage
    persistentVolumeClaim:
      claimName: odf-pvc
```


 
Deploy the pod:

```
kubectl apply -f pod.yaml
```


 
Verify that the pod is running:

```
kubectl get pods
```


 
Open a shell inside the pod:

```
kubectl exec -it <app-pod-name> -- bash
```


 
From inside the pod, check that the app has been writing to the mounted volume:

```
cat /test/test.txt
```


Output:

 
```
Sat Sep 2 20:09:19 UTC 2023
Sat Sep 2 20:09:25 UTC 2023
Sat Sep 2 20:09:31 UTC 2023
Sat Sep 2 20:09:36 UTC 2023
Sat Sep 2 20:09:42 UTC 2023
Sat Sep 2 20:09:47 UTC 2023
```


 
When you are done, exit the pod:

```
exit
```


You have successfully deployed an application that uses OpenShift Data Foundation as the underlying storage!
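When you no longer need ODF on the cluster, the same input.tfvars file drives the removal as well; a minimal sketch of the teardown:

```
terraform destroy --var-file=input.tfvars
```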

 

 

 

 
