Connect to AWS Athena Database in a Mule Application With Generic Database
This is a short tutorial on connecting from MuleSoft to an AWS Athena database. We will use the Database Connector's generic connection together with the Athena JDBC driver.
If you need more background, read through the following resources before getting started:
- Introduction to MuleSoft Anypoint Studio
- Introduction to building API using MuleSoft
- MuleSoft Database Connector
- MuleSoft Database Connector: Configuring a generic connection
- Connect Amazon Athena with JDBC
We will go through the process step by step. Let's assume that we have the following information before getting started:
- Athena S3 Output Location: <S3OutputLocation>
- AWS Region: <AwsRegion>
- UserName: <User>
- Password: <Password>
Get Started
We will start by creating a Mule project in Anypoint Studio. If you want to understand how to do this, please check my blog on Introduction to Building API Using MuleSoft.
Once we have created the application, the first step is to make the Athena JDBC driver JAR available to our POM. I was not able to find a public Maven repository that hosts the JAR, so I downloaded it manually.
The JAR files are available for download from my GitHub space.
Alternatively, you can download the JAR file from the AWS Documentation Connect Amazon Athena with JDBC.
In my case, I am using version 2.1.0.1000, the JDBC driver with AWS SDK.
Once the JAR is downloaded, we need to publish it to the artifact repository that we use. In my case, I will simply install it into my local Maven repository using the following command.
mvn install:install-file \
-Dfile=<path-to-file> \
-DgroupId=<group-id> \
-DartifactId=<artifact-id> \
-Dversion=<version> \
-Dpackaging=<packaging> \
-DgeneratePom=true
In my case:
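mvn install:install-file -Dfile=<path-to-file> -DgroupId=aws.athena -DartifactId=AthenaJDBC42 -Dversion=2.1.0.1000 -Dpackaging=jar -DgeneratePom=true
With -DgeneratePom=true, Maven generates a minimal POM for the installed artifact; this overwrites the POM file that I had downloaded from the repository.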
Set Up My Application POM File
Once that is done, we must update the application's POM file in two places:
- Add the dependency on the AthenaJDBC42 JAR.
<dependency>
    <groupId>aws.athena</groupId>
    <artifactId>AthenaJDBC42</artifactId>
    <version>2.1.0.1000</version>
</dependency>
- Add the shared library configuration.
<plugin>
    <groupId>org.mule.tools.maven</groupId>
    <artifactId>mule-maven-plugin</artifactId>
    <version>${mule.maven.plugin.version}</version>
    <extensions>true</extensions>
    <configuration>
        <sharedLibraries>
            <sharedLibrary>
                <groupId>aws.athena</groupId>
                <artifactId>AthenaJDBC42</artifactId>
            </sharedLibrary>
        </sharedLibraries>
        <classifier>mule-application</classifier>
    </configuration>
</plugin>
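To show where the two fragments above sit, here is a minimal sketch of the POM layout. It assumes the standard structure that Anypoint Studio generates for a Mule 4 project; the comments mark existing content that is not shown here. The shared library entry matters because the Database Connector can only load the driver class if the JAR is exported to it as a shared library.

```xml
<project>
    <!-- ... existing project coordinates and properties ... -->
    <build>
        <plugins>
            <!-- Existing Mule plugin entry, extended with the sharedLibraries block -->
            <plugin>
                <groupId>org.mule.tools.maven</groupId>
                <artifactId>mule-maven-plugin</artifactId>
                <version>${mule.maven.plugin.version}</version>
                <extensions>true</extensions>
                <configuration>
                    <sharedLibraries>
                        <sharedLibrary>
                            <groupId>aws.athena</groupId>
                            <artifactId>AthenaJDBC42</artifactId>
                        </sharedLibrary>
                    </sharedLibraries>
                    <classifier>mule-application</classifier>
                </configuration>
            </plugin>
        </plugins>
    </build>
    <dependencies>
        <!-- ... existing connector dependencies (HTTP, Database, and so on) ... -->
        <!-- Driver installed locally with mvn install:install-file -->
        <dependency>
            <groupId>aws.athena</groupId>
            <artifactId>AthenaJDBC42</artifactId>
            <version>2.1.0.1000</version>
        </dependency>
    </dependencies>
</project>
```

The groupId and artifactId in the sharedLibrary entry must match the dependency coordinates exactly, which in turn must match the values passed to mvn install:install-file.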
Starting the Code
Next, we head back to the Mule configuration XML and create a flow consisting of the following, in order:
- HTTP Listener
- Logger
- Database Select
- Logger
- Transform Message to convert the output to JSON
The Flow should look something like this.
Here is the code that I have in my XML:
<?xml version="1.0" encoding="UTF-8"?>
<mule xmlns:ee="http://www.mulesoft.org/schema/mule/ee/core"
      xmlns:db="http://www.mulesoft.org/schema/mule/db"
      xmlns:http="http://www.mulesoft.org/schema/mule/http"
      xmlns="http://www.mulesoft.org/schema/mule/core"
      xmlns:doc="http://www.mulesoft.org/schema/mule/documentation"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.mulesoft.org/schema/mule/core http://www.mulesoft.org/schema/mule/core/current/mule.xsd
http://www.mulesoft.org/schema/mule/http http://www.mulesoft.org/schema/mule/http/current/mule-http.xsd
http://www.mulesoft.org/schema/mule/db http://www.mulesoft.org/schema/mule/db/current/mule-db.xsd
http://www.mulesoft.org/schema/mule/ee/core http://www.mulesoft.org/schema/mule/ee/core/current/mule-ee.xsd">
    <http:listener-config name="HTTP_Listener_config" doc:name="HTTP Listener config">
        <http:listener-connection host="0.0.0.0" port="8081" />
    </http:listener-config>
    <db:config name="Athena_Database_Config">
        <db:generic-connection
            url="jdbc:awsathena://User=${athena.User};Password=${athena.Password};AwsRegion=${athena.AwsRegion};S3OutputLocation=${athena.S3OutputLocation}"
            driverClassName="com.simba.athena.jdbc.Driver"/>
    </db:config>
    <flow name="test-athena-dbFlow" doc:id="cc20b26a-4cc0-4a81-b66a-bf97e310858a">
        <http:listener doc:name="Listener" doc:id="d6463d2c-4895-43e7-a5f8-ae4e261cf187" config-ref="HTTP_Listener_config" path="/testAthena"/>
        <logger level="INFO" doc:name="Logger" doc:id="21cb94b1-a3cc-436a-a97b-e362c2ab8989" message="Request Received"/>
        <db:select doc:name="Select" doc:id="6210b6c3-bc8e-4e7d-87f1-dafa888fdf45" config-ref="Athena_Database_Config">
            <db:sql><![CDATA[${athena.sqlQuery}]]></db:sql>
        </db:select>
        <logger level="INFO" doc:name="Logger" doc:id="c17e9502-e118-4a06-a33f-ccc816fcd7b5" message="Database Connection Complete"/>
        <ee:transform doc:name="Transform Message" doc:id="7de4dceb-e124-41f6-a51a-6718ae2bc6b5">
            <ee:message>
                <ee:set-payload><![CDATA[%dw 2.0
output application/json
---
payload]]></ee:set-payload>
            </ee:message>
        </ee:transform>
    </flow>
</mule>
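The configuration above relies on several property placeholders (athena.User, athena.Password, athena.AwsRegion, athena.S3OutputLocation and athena.sqlQuery) that must be defined somewhere in the application. One minimal way to do that for a local test, sketched here, is to declare them as global properties inside the mule element. Every value below is a made-up placeholder, not something from my setup, and a real password should not be hard-coded like this.

```xml
<!-- Sketch only: place these inside the <mule> element, next to the db:config. -->
<!-- All values are hypothetical placeholders; replace each one with your own. -->
<global-property name="athena.User" value="REPLACE_WITH_USER"/>
<global-property name="athena.Password" value="REPLACE_WITH_PASSWORD"/>
<global-property name="athena.AwsRegion" value="us-east-1"/>
<global-property name="athena.S3OutputLocation" value="s3://your-bucket/athena-results/"/>
<!-- Any query that is valid for your Athena setup; SELECT 1 is just a connectivity check. -->
<global-property name="athena.sqlQuery" value="SELECT 1"/>
```

The same keys could instead live in an external properties file, which keeps credentials out of the flow XML.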
Testing My Application
Finally, I will run my application in MuleSoft.
Once the application is deployed and running, we can call the API from Postman, for example with a GET request to http://localhost:8081/testAthena when running locally:
Conclusion
This blog explains the steps required to configure a Mule application to use the generic database configuration to connect to Amazon Athena using the AWS Athena JDBC driver. The goal here was only to establish the database connection; a real application should of course also have proper error handling, HTTPS security, and so on. Please let me know your thoughts.
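As a starting point for the error handling mentioned above, here is a minimal sketch of the same flow with an error handler appended. It is not part of the application I built: the doc attributes are dropped for brevity, the log message is made up, and the handler only logs the failure before propagating it to the HTTP listener. The DB:CONNECTIVITY, DB:QUERY_EXECUTION and DB:BAD_SQL_SYNTAX types are error types raised by the Database Connector.

```xml
<!-- Sketch only: the flow from this tutorial, abbreviated, with an error handler added. -->
<flow name="test-athena-dbFlow">
    <http:listener config-ref="HTTP_Listener_config" path="/testAthena"/>
    <db:select config-ref="Athena_Database_Config">
        <db:sql><![CDATA[${athena.sqlQuery}]]></db:sql>
    </db:select>
    <ee:transform>
        <ee:message>
            <ee:set-payload><![CDATA[%dw 2.0
output application/json
---
payload]]></ee:set-payload>
        </ee:message>
    </ee:transform>
    <!-- The error handler must be the last element of the flow. -->
    <error-handler>
        <!-- Log database failures, then let the error propagate to the caller. -->
        <on-error-propagate type="DB:CONNECTIVITY, DB:QUERY_EXECUTION, DB:BAD_SQL_SYNTAX">
            <logger level="ERROR" message="#['Athena query failed: ' ++ (error.description default 'no description')]"/>
        </on-error-propagate>
    </error-handler>
</flow>
```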