Fine-Tuning Performance and Resolving Common Issues in a FinTech Application With MySQL

Relational databases are the bedrock of any FinTech application, especially for OLTP (Online Transaction Processing) workloads. This foundational component of the application architecture usually poses scaling challenges as the business expands rapidly. It is therefore imperative to monitor all database activity closely in the production environment and to track and resolve issues such as long-running queries.

This article explores a case study of a FinTech company that has built a lending platform. The company uses a MySQL database hosted on AWS through the Amazon RDS service, with multiple microservices using different schemas on the same database instance. The MVP product offering was launched a few years back, and the company has been incorporating new features into the platform ever since. We will cover commonly identified database issues and what was done to resolve them.

Common Mistakes

This section discusses the common mistakes identified, the steps taken to resolve them, and additional guidelines.

Using Database Views Inside Stored Procedures

The first major issue identified, once the database started to grow, was that the usage of views inside stored procedures resulted in long-running queries and full table scans. The pattern we saw was that developers built multiple reusable views and referenced them from stored procedures. The example below is a stored procedure invoking a view named uv_consultations:

MySQL
 
CREATE PROCEDURE `usp_get_consultation`(
    in_dso_ref_id VARCHAR(45)
)
BEGIN
    SELECT *
    FROM uv_consultations
    WHERE dso_ref_id = in_dso_ref_id;
END


Here is what the view looks like:

MySQL
 
CREATE OR REPLACE VIEW uv_consultations AS
    SELECT
        c.id AS consultation_id,
        c.dso_ref_id AS dso_ref_id,
        c.pat_id AS pat_id,
        p.first_name AS pat_first_name,
        p.last_name AS pat_last_name,
        p.dso_pat_id AS dso_pat_id,
        p.dob AS pat_dob,
        COUNT(cn.id) AS notifs
    FROM consultation c
        JOIN patient p ON c.pat_id = p.id
        LEFT JOIN application app ON c.id = app.consultation_id
        LEFT JOIN responsible_party rp ON app.primary_rp_id = rp.id
        LEFT JOIN consultation_notif cn ON cn.consultation_id = c.id
            AND cn.`read` = 0
            AND cn.hide = 0
    GROUP BY c.id;


Looking at the view, it has multiple joins and a GROUP BY; it is a complex query. MySQL executes views in one of two ways: view merging or view materialization.

View Merging

In this approach, MySQL merges the view query into the outer query that references the view. The biggest advantage is that the merged query can use the indexes of the underlying base tables, improving query performance. However, there are limitations to view merging: the view's query must be a simple SELECT with no aggregate functions, GROUP BY, HAVING, DISTINCT, UNION, or LIMIT clauses.

View Materialization

When MySQL cannot merge a view, it falls back to view materialization. In this approach, MySQL stores the view's result set in a temporary internal table and then runs the outer query against that temporary table. The drawback is that the temporary table has none of the base tables' indexes, so queries on the view run slower. Views containing aggregation, GROUP BY, DISTINCT, or UNION trigger view materialization.

In the above example, the view has aggregation and multiple joins, so executing the stored procedure results in view materialization, which causes it to run slowly. To mitigate this, the team refactored the stored procedure to run the complex SQL query directly, avoiding the intermediate layer of views.
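Below is a minimal sketch of the refactored procedure, with the view's query inlined so the optimizer can apply the filter against the base tables (assuming, for example, an index on consultation.dso_ref_id; the query body follows the earlier view definition):

MySQL
 
CREATE PROCEDURE `usp_get_consultation`(
    in_dso_ref_id VARCHAR(45)
)
BEGIN
    -- Inline the former view query; filtering directly on the base
    -- table lets MySQL use an index on consultation.dso_ref_id
    -- instead of materializing the whole view into a temporary table.
    SELECT
        c.id AS consultation_id,
        c.dso_ref_id AS dso_ref_id,
        c.pat_id AS pat_id,
        p.first_name AS pat_first_name,
        p.last_name AS pat_last_name,
        p.dso_pat_id AS dso_pat_id,
        p.dob AS pat_dob,
        COUNT(cn.id) AS notifs
    FROM consultation c
        JOIN patient p ON c.pat_id = p.id
        LEFT JOIN application app ON c.id = app.consultation_id
        LEFT JOIN responsible_party rp ON app.primary_rp_id = rp.id
        LEFT JOIN consultation_notif cn ON cn.consultation_id = c.id
            AND cn.`read` = 0
            AND cn.hide = 0
    WHERE c.dso_ref_id = in_dso_ref_id
    GROUP BY c.id;
END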

Leading Wildcard String Comparison

If we use the LIKE operator with a leading wildcard, e.g., LIKE ‘%AMOL’, the query cannot use the index on that column. Below is a sample query:

MySQL
 
SELECT
    COUNT(*) INTO v_ext_partner_service_count
FROM
    loan_ext_payload
WHERE
    loan_acct_id = v_loan_acct_id
AND ext_partner_service LIKE '%CREDIT_PULL'


For LIKE ‘CREDIT_PULL%’, MySQL uses the index efficiently, because B-tree indexes are ordered by the leading characters of the value, which makes prefix searches inherently fast. For the leading wildcard LIKE ‘%CREDIT_PULL’, by contrast, the MySQL execution engine must examine every entry in the index to check whether it ends with ‘CREDIT_PULL’. The index is optimized for prefix matching (CREDIT_PULL%), not suffix matching (%CREDIT_PULL), so its performance benefits are wiped out by leading-wildcard comparisons.

The recommendation is to avoid leading-wildcard string comparisons or searches. If they are unavoidable, use a full-text search instead.
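A sketch of the full-text alternative is below. The index name is illustrative, and note that MATCH ... AGAINST matches whole tokens rather than arbitrary suffixes, so verify the semantics fit the data before adopting it:

MySQL
 
-- One-time setup: full-text index on the searched column
-- (the index name is illustrative).
ALTER TABLE loan_ext_payload
    ADD FULLTEXT INDEX ft_ext_partner_service (ext_partner_service);

-- Token lookup via the full-text index instead of LIKE '%CREDIT_PULL'.
-- The built-in parser treats underscores as word characters, so
-- CREDIT_PULL is indexed as a single token.
SELECT COUNT(*)
FROM loan_ext_payload
WHERE loan_acct_id = v_loan_acct_id
  AND MATCH(ext_partner_service) AGAINST ('CREDIT_PULL' IN BOOLEAN MODE);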

Using Functions as Part of the WHERE Clause

Using functions to filter records in a WHERE condition can harm query performance. For instance, consider the FIND_IN_SET() function in the query below:

MySQL
 
SELECT
    code,
    code_type,
    message,
    message_es
FROM
    locale_messages
WHERE
    code_type = v_code_type
    AND FIND_IN_SET(code, v_codes) > 0;


FIND_IN_SET returns the position of a string within a comma-separated list of strings. Using FIND_IN_SET in the query causes a full table scan because it cannot use an index: MySQL has to evaluate the function for every record, which can be very expensive for larger tables.

Similarly, mathematical, string, and date functions have the same side effect. Below are examples:

MySQL
 
SELECT
    loan_id, loan_acct_id
FROM
    loan_application
WHERE YEAR(create_date) = 2024;
MySQL
 
SELECT
    loan_id, loan_acct_id
FROM loan_application
WHERE ABS(merchant_discount_fee) > 5;


Casting columns in the WHERE clause likewise leads to a full table scan, impacting query performance. An example is below:

MySQL
 
SELECT
    loan_id, loan_acct_id  
FROM
    loan_application
WHERE
    CAST(approval_datetime AS DATE) = '2024-06-01';


As a resolution for FIND_IN_SET, the team enabled full-text search. For the other function-based WHERE conditions, functional indexes (supported in MySQL 8.0.13 and later) were created.
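For instance, a functional index on YEAR(create_date) lets the earlier query be served from an index (a sketch assuming MySQL 8.0.13+; the index name is illustrative):

MySQL
 
-- Functional index on the expression used in the WHERE clause
-- (note the extra parentheses required around the expression).
CREATE INDEX idx_loan_app_create_year
    ON loan_application ((YEAR(create_date)));

-- This query can now use the index instead of a full table scan.
SELECT loan_id, loan_acct_id
FROM loan_application
WHERE YEAR(create_date) = 2024;

On versions before 8.0.13, the same effect can be achieved by rewriting the predicate as a range, e.g., create_date >= '2024-01-01' AND create_date < '2025-01-01', which can use a plain index on create_date.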

Performing Deletes as Part of Regular Transactions

Frequent delete operations as part of regular transaction processing can degrade database performance. Deleting rows from a table leads to index rebalancing: MySQL has to update the indexes on that table's columns to remove the corresponding index entries. For rollback and recovery, transaction logs are written for every delete operation, which can lead to rapid log growth. Rows deleted in random order also fragment tables and indexes, degrading performance through scattered reads.

To mitigate this, rows can be soft-deleted during regular transactions, and an offline batch job can then perform the hard delete during non-peak hours, when the system can use its resources more effectively. Batch or bulk deletes are also more performant than deletes spread across many individual transactions.
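A minimal sketch of the pattern, assuming a hypothetical is_deleted flag column and an illustrative batch size:

MySQL
 
-- During regular transactions: soft-delete instead of DELETE
-- (is_deleted is a hypothetical flag column).
UPDATE loan_application SET is_deleted = 1 WHERE loan_id = 148712;

-- Offline batch job for non-peak hours: hard-delete in small chunks
-- so each statement holds locks only briefly.
CREATE PROCEDURE p_purge_soft_deleted()
BEGIN
    REPEAT
        DELETE FROM loan_application WHERE is_deleted = 1 LIMIT 10000;
    UNTIL ROW_COUNT() = 0 END REPEAT;
END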

Executing Multiple Inserts From the Application Instead of Batching

When inserting multiple child rows for a parent record from the application layer, a common anti-pattern is to execute a stored procedure call for each insert instead of batching them. Each insert then requires a database round trip from the application layer and an individual stored procedure execution per record, which is inefficient.

To avoid multiple round trips to the database server, a multi-row INSERT statement was built in the application layer, as below, and then passed to a database stored procedure that executes it.

MySQL
 
INSERT INTO verif_user_attrib_result
    (id, application_id, user_attrib_id, doc_type_id, verif_result, verif_notes, verif_date, doc_upload_count)
VALUES (148712, 146235, 1, NULL, 1, NULL, NULL, 0),
    (148712, 146235, 2, NULL, 1, NULL, NULL, 0),
    (148712, 146235, 3, NULL, 1, NULL, NULL, 0),
    (148712, 146235, 4, NULL, -1, NULL, NULL, 0);
MySQL
 
CREATE PROCEDURE p_verif_create_user_attrib_results (
    IN v_user_attrib_result_multi_insert TEXT,
    OUT v_status INT)
BEGIN
    SET v_status = 1;
    SET @qry_user_attrib_result = v_user_attrib_result_multi_insert;
    PREPARE stmt_user_attrib_result FROM @qry_user_attrib_result;
    EXECUTE stmt_user_attrib_result;
    DEALLOCATE PREPARE stmt_user_attrib_result;
    SET v_status = 0;
END
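
From the application side, the call passes the multi-row INSERT built above as a single string (a sketch; the values are from the earlier example):

MySQL
 
CALL p_verif_create_user_attrib_results(
    'INSERT INTO verif_user_attrib_result (id, application_id, user_attrib_id, doc_type_id, verif_result, verif_notes, verif_date, doc_upload_count) VALUES (148712,146235,1,NULL,1,NULL,NULL,0),(148712,146235,2,NULL,1,NULL,NULL,0)',
    @v_status);
SELECT @v_status; -- 0 indicates success

Since the procedure runs arbitrary SQL through a prepared statement, the application must build the string only from trusted, validated values to avoid SQL injection.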


Recommendations

Connection Labels

In the microservices world, where multiple microservices connect to different or the same schemas on a database server, it is often difficult to monitor which connection requests come from which microservice. When there is an issue with connection pools, connection draining, or database performance, we cannot identify which connection belongs to which service. This is where the program_name connection attribute comes in handy.

Java
 
jdbc:mysql://db.fintech.com:3306/icwdev?cacheCallableStmts=true
&callableStmtCacheSize=1000&connectionAttributes=program_name:loan-application-api


This attribute labels every connection with the respective program name. The identifier helps attribute database issues to a specific service. Beyond diagnosing issues, it enables enhanced monitoring: we can build query performance metrics and error rates per microservice.
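On the server side, the attribute is visible in the Performance Schema, so current connections can be grouped by service:

MySQL
 
-- Count current connections per service using the program_name
-- attribute sent in each microservice's JDBC URL.
SELECT ATTR_VALUE AS service, COUNT(*) AS connections
FROM performance_schema.session_connect_attrs
WHERE ATTR_NAME = 'program_name'
GROUP BY ATTR_VALUE;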

Purging Large Tables

For large tables, we purge data based on defined purging criteria; for example, utility tables managing user access tokens are purged periodically. To manage purging effectively, we implement time-based partitioning on these tables so that old data can be dropped partition by partition.
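A minimal sketch of the idea, assuming a hypothetical token table range-partitioned by creation date; dropping a partition is a fast metadata operation compared with deleting rows one by one:

MySQL
 
-- Hypothetical token table, range-partitioned by creation date.
-- Note: the partitioning column must be part of the primary key.
CREATE TABLE user_access_token (
    id BIGINT NOT NULL,
    user_id BIGINT NOT NULL,
    token VARCHAR(512) NOT NULL,
    created_at DATETIME NOT NULL,
    PRIMARY KEY (id, created_at)
)
PARTITION BY RANGE (TO_DAYS(created_at)) (
    PARTITION p2024_q1 VALUES LESS THAN (TO_DAYS('2024-04-01')),
    PARTITION p2024_q2 VALUES LESS THAN (TO_DAYS('2024-07-01')),
    PARTITION p_future VALUES LESS THAN MAXVALUE
);

-- Purging a quarter of expired tokens is a quick partition drop,
-- not millions of individual DELETEs.
ALTER TABLE user_access_token DROP PARTITION p2024_q1;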

Index Columns Used in Join

We ensure that all columns used in join conditions are indexed. With indexed join columns, the database engine can filter data efficiently using the index, avoiding a full scan of the joined table.
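For instance, the join in the earlier view benefits from an index on the child table's join column (the index name is illustrative):

MySQL
 
-- Index the join column used in
-- "LEFT JOIN consultation_notif cn ON cn.consultation_id = c.id".
ALTER TABLE consultation_notif
    ADD INDEX idx_cn_consultation_id (consultation_id);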

Keep Transactions Smaller

In high-concurrency applications, we must keep transactions small for optimal database performance. Smaller transactions shorten the time locks are held, which reduces resource contention, helps avoid deadlocks, and improves the transaction success rate.

Conclusion

We have covered common pitfalls and their resolutions for fine-tuning the application and the underlying MySQL database performance through a FinTech case study. Here are the key takeaways: avoid using complex views in stored procedures; stay away from leading-wildcard searches; monitor queries causing full table scans and take corrective action; perform delete operations during off-peak hours through a batch job; avoid multiple round trips to the server by batching multi-row inserts; define a purging strategy; index all join columns; and keep the transaction scope small. With these recommendations incorporated, we can achieve optimal database performance and faster end-user applications. Continuous monitoring and proactive issue resolution help maintain consistent application performance and availability.
