The world’s leading publication for data science, AI, and ML professionals.

How to Enhance SQL Code Security and Maintainability

Introduction to SQL procedures, their applications and benefits, and how to encrypt SQL code

Photo by FlyD on Unsplash
Photo by FlyD on Unsplash

When you’re writing SQL code to accomplish certain tasks for your company, have you ever worried that your SQL code might get leaked and expose critical business logic to competitor companies? Or have you noticed that long and complex SQL code are very difficult to maintain and fix when issues arise? SQL procedures can address the problems mentioned above and serve as a key step for data professionals looking to advance their coding skills while not everyone has paid enough attention to this technique.

A SQL procedure, also known as a SQL stored procedure, is a database object that is created and stored in the database management system and can be executed with a single call. It’s a powerful tool for improving database Security, modularity, and code reusability.

People often confuse SQL UDFs with SQL procedures. Both techniques are used for improving performance, maintainability and security of SQL queries. They share many similarities: both enable developers to write a block of SQL code once and reuse it multiple times throughout the applications, and both can accept input parameters. Due to these similarities, the two techniques can sometimes achieve the same goals.

I’ll use the mock data promo_sales to explain on their similarities. promo_sales is the data for the sales performance from a department store and consists of fieldsSale_Person_ID , Department and Sales_Amount . According to the company policy, 20% of the sales amount is paid as the bonus to each department. We can write the Sql code to query the department summary including the average sales per person and the bonus for each department.

In order to boost the sales during the holiday season, the company’s senior management decided to provide an extra 10% bonus for the departments with sales amount over 2000K USD.

CREATE TABLE promo_sales(
  Sale_Person_ID VARCHAR(40) PRIMARY KEY,
  Department VARCHAR(40),
  Sales_Amount INT
);

INSERT INTO promo_sales VALUES ('001', 'Cosmetics', 500);
INSERT INTO promo_sales VALUES ('002', 'Cosmetics', 700);
INSERT INTO promo_sales VALUES ('003', 'Fashion', 1000);
INSERT INTO promo_sales VALUES ('004', 'Jewellery', 800);
INSERT INTO promo_sales VALUES ('005', 'Fashion', 850);
INSERT INTO promo_sales VALUES ('006', 'Kid', 500);
INSERT INTO promo_sales VALUES ('007', 'Cosmetics', 900);
INSERT INTO promo_sales VALUES ('008', 'Fashion', 600);
INSERT INTO promo_sales VALUES ('009', 'Fashion', 1200);
INSERT INTO promo_sales VALUES ('010', 'Jewellery', 900);
INSERT INTO promo_sales VALUES ('011', 'Kid', 700);
INSERT INTO promo_sales VALUES ('012', 'Fashion', 1500);
INSERT INTO promo_sales VALUES ('013', 'Cosmetics', 850);
INSERT INTO promo_sales VALUES ('014', 'Kid', 750);
INSERT INTO promo_sales VALUES ('015', 'Jewellery', 950);

If we need to update the logic of bonus calculation, we would have to rewrite the code. For projects with complex logics, this may lead to challenges. To resolve this problem, we can write the UDF consisting of the two statements which represent the logic before and after updates. This design significantly improve the code maintainability.

CREATE FUNCTION dbo.MultiStmt_GetDepartmentSummary()
RETURNS @DeptSummary TABLE 
(
    Department VARCHAR(40),
    Total_Sales INT,
    Number_of_Sales_Persons INT,
    Avg_Sales_Per_Person DECIMAL(10, 2),
    Bonus DECIMAL(10, 2) 
)
AS
BEGIN
    -- First Statement: Initialize the table variable with department sales summary
    INSERT INTO @DeptSummary (Department, Total_Sales, Number_of_Sales_Persons, Avg_Sales_Per_Person, Bonus)
    SELECT 
        Department,
        SUM(Sales_Amount) AS Total_Sales,
        COUNT(DISTINCT Sale_Person_ID) AS Number_of_Sales_Persons,
        AVG(Sales_Amount) AS Avg_Sales_Per_Person,
        SUM(Sales_Amount) * 0.2 AS Bonus
    FROM promo_sales
    GROUP BY Department;  

    -- Second Statement: Update rows in the table variable
    UPDATE @DeptSummary
    SET Bonus = Bonus * 1.1  
    WHERE Total_Sales > 2000;

    -- Return the final table
    RETURN;
END;
GO

-- Usage:
SELECT * FROM dbo.MultiStmt_GetDepartmentSummary();

Alternatively, we can use a SQL procedure to generate the same result.

-- Creating the stored procedure to achieve the same functionality
CREATE PROCEDURE dbo.GetDepartmentSummary_Proc
AS
BEGIN
    -- Create a temporary table to store the department summary
    CREATE TABLE #DeptSummary 
    (
        Department VARCHAR(40),
        Total_Sales INT,
        Number_of_Sales_Persons INT,
        Avg_Sales_Per_Person DECIMAL(10, 2),
        Bonus DECIMAL(10, 2)
    );

    -- First Statement: Insert department summary into the temporary table
    INSERT INTO #DeptSummary (Department, Total_Sales, Number_of_Sales_Persons, Avg_Sales_Per_Person, Bonus)
    SELECT 
        Department,
        SUM(Sales_Amount) AS Total_Sales,
        COUNT(DISTINCT Sale_Person_ID) AS Number_of_Sales_Persons,
        AVG(Sales_Amount) AS Avg_Sales_Per_Person,
        SUM(Sales_Amount) * 0.2 AS Bonus
    FROM promo_sales
    GROUP BY Department;

    -- Second Statement: Update rows in the temporary table 
    UPDATE #DeptSummary    
    SET Bonus = Bonus * 1.1  
    WHERE Total_Sales > 2000;

    -- Return the final table
    SELECT * FROM #DeptSummary;

    -- Clean up: Drop the temporary table
    DROP TABLE #DeptSummary;
END;
GO

-- Usage:
EXEC dbo.GetDepartmentSummary_Proc;
Image by the author (SQL procedure output)
Image by the author (SQL procedure output)

Although there are similarities, there are also differences which enable developers to have greater flexibilities when using stored procedure in SQL. Compared to SQL functions which must return a value and can have only input parameters, stored procedures don’t need to return results and have options to implement with input/output parameters. In today’s article, I will focus on the important features of SQL stored procedures.

If you’re interested in SQL User Defined Functions, you can refer to my another article ‘SQL User Defined Functions (UDFs)‘.

SQL User Defined Functions (UDFs)


Syntax For SQL Stored Procedures

Syntax of SQL Stored Procedure

A universal syntax of SQL stored procedure is:

CREATE PROCEDURE procedure_name(parameters)
AS
BEGIN;
//statements;

END;

EXEC procedure_name;

In this syntax, parameters are optional. When creating the SQL procedure dbo.GetDepartmentSummary_Proc , no parameters were assigned. The stored procedure contains several statements which perform different tasks, such as table creation, data insertion, variable calculation, variable updates, summary queries, table deletion, and so on. Unlike a SQL UDF, the RETURN statement is not used here to return a table, although it can optionally be used to return an integer status value. Another key difference is that a SQL procedure use EXEC statement to execute the procedure defined and obtain the intended result.

Creating a Stored Procedure with Default Parameters

A default parameter in a SQL procedure is the value assigned to a parameter that is automatically used if there’s no value provided when the procedure is executed. For the procedure created above, we can include a parameter that filters the results by a specific department. Without specifying the default parameter, the procedure will return the summary of all department.

-- Creating the stored procedure to achieve the same functionality
CREATE PROCEDURE dbo.GetDepartmentSummary_Proc
    @DepartmentFilter VARCHAR(40) = NULL 
AS
BEGIN
    -- Create a temporary table to store the department summaryeh 
    CREATE TABLE #DeptSummary 
    (
        Department VARCHAR(40),
        Total_Sales INT,
        Number_of_Sales_Persons INT,
        Avg_Sales_Per_Person DECIMAL(10, 2),
        Bonus DECIMAL(10, 2)
    );

    -- First Statement: Insert department summary into the temporary table
    INSERT INTO #DeptSummary (Department, Total_Sales, Number_of_Sales_Persons, Avg_Sales_Per_Person, Bonus)
    SELECT 
        Department,
        SUM(Sales_Amount) AS Total_Sales,
        COUNT(DISTINCT Sale_Person_ID) AS Number_of_Sales_Persons,
        AVG(Sales_Amount) AS Avg_Sales_Per_Person,
        SUM(Sales_Amount) * 0.2 AS Bonus
    FROM promo_sales
    WHERE (@DepartmentFilter IS NULL OR Department = @DepartmentFilter)
    GROUP BY Department;

    -- Second Statement: Update rows in the temporary table 
    UPDATE #DeptSummary    
    SET Bonus = Bonus * 1.1  
    WHERE Total_Sales > 2000;

    -- Return the final table
    SELECT * FROM #DeptSummary;

    -- Clean up: Drop the temporary table
    DROP TABLE #DeptSummary;
END;
GO

EXEC dbo.GetDepartmentSummary_Proc @DepartmentFilter = 'Cosmetics';
GO

In this example, the @DepartmentFilter parameter is defined but when it’s set as NULL , we still generate the summary of all departments. By adding the WHERE clause, we can dynamically filter the data based on the parameter’s value. After executing the procedure with the parameter, this approach makes the stored procedure flexible and reusable for different scenarios.

Creating a SQL Procedure with Output Parameters

An output parameter in a SQL procedure is a parameter that can return a value back to the caller after the procedure executes. Using the output parameter in a SQL procedure is considered more efficient by returning a single value rather than a result set. It allows developers to return additional information alongside the result set. This feature of SQL procedures provides more flexibility in handling data.

-- Creating the stored procedure to achieve the same functionality
CREATE PROCEDURE dbo.GetDepartmentSummary_Proc
    @TotalDepartments INT OUTPUT
AS
BEGIN
    -- Create a temporary table to store the department summary
    CREATE TABLE #DeptSummary 
    (
        Department VARCHAR(40),
        Total_Sales INT,
        Number_of_Sales_Persons INT,
        Avg_Sales_Per_Person DECIMAL(10, 2),
        Bonus DECIMAL(10, 2)
    );

    -- First Statement: Insert department summary into the temporary table
    INSERT INTO #DeptSummary (Department, Total_Sales, Number_of_Sales_Persons, Avg_Sales_Per_Person, Bonus)
    SELECT 
        Department,
        SUM(Sales_Amount) AS Total_Sales,
        COUNT(DISTINCT Sale_Person_ID) AS Number_of_Sales_Persons,
        AVG(Sales_Amount) AS Avg_Sales_Per_Person,
        SUM(Sales_Amount) * 0.2 AS Bonus
    FROM promo_sales
    GROUP BY Department;

    -- Set the output parameter to the total number of departments processed
    SELECT @TotalDepartments = COUNT(*) 
    FROM #DeptSummary;

    -- Second Statement: Update rows in the temporary table 
    UPDATE #DeptSummary    
    SET Bonus = Bonus * 1.1  
    WHERE Total_Sales > 2000;

    -- Return the final table
    SELECT * FROM #DeptSummary;

    -- Clean up: Drop the temporary table
    DROP TABLE #DeptSummary;
END;
GO

-- Usage:
-- Declare a variable to hold the output parameter value
DECLARE @TotalDepts INT;

-- Execute the procedure and pass the output parameter
EXEC dbo.GetDepartmentSummary_Proc 
    @TotalDepartments = @TotalDepts OUTPUT; -- Capture the output parameter

-- Display the value of the output parameter
PRINT 'Total Departments Processed: ' + CAST(@TotalDepts AS VARCHAR);

The output parameter @TotalDepartments INT OUTPUT is defined to store the total number of departments and the value of @TotalDepartments is set during the procedure execution and can be printed out after the procedure completes.


Encrypting Stored Procedure in SQL

Nowadays, data security has become a growing concern for many companies. Keeping sensitive data away from unauthorized users, complying with data privacy laws and regulations and maintaining data integrity require the encryption of stored procedures in SQL.

If you’re a data engineer or a data analyst from a key business department, learning how to encrypt your SQL procedures is a must-have skill. Encrypting a stored procedure is quite straightforward – you only need to use the ENCRYPTION keyword while creating the procedure. The keyword will encrypt and hide the source code. If someone attempts to retrieve the source code with the in-built function sp_helptext, the server will respond with "The text for object ‘procedure_name’ is encrypted."

CREATE PROCEDURE procedure_name(parameters)
WITH ENCRYPTION
AS
BEGIN;
//statements;

END;

sp_helptext procedure_name;

Conclusion

As one of the key advanced SQL techniques, stored procedures undoubtedly influence the performance, flexibility, maintainability and efficiency of your code and ultimately generate a significant impact on your database management and data analysis activities. There are always debates about which is better – SQL functions or stored procedures. In my view, the choice depends on the complexity of the data and logic, the functionality aimed to achieve, and the expected outputs. Which technique do you prefer? Please share your thoughts in the comments.

Thank you for reading! If you found this article helpful, please give it some claps! Follow me and subscribe via email to receive notifications whenever I publish a new article.


Related Articles