SQL - Procedural Programming#

Overview#

  • Queries and expressions in SQL are ideal for large set-at-a-time operations:

    • Operations that need to search the whole relation to answer a complex combination of tuples.

    • Using SQL allows database to use the query optimizer to find the best query execution plan for the query.

  • However, SQL is not efficient for some programming tasks:

    • Sequence of code: A block of queries that must be executed in an order

    • Conditionals: if/then/else statements (some simple ones can be handled, but cannot be combined with sequence of code

    • Loops

    • Store values in and use values from program variables

    • Error handling

Example Program#

  • We will see an example program that is hard to write using an SQL query, but trivial with a procedural language.

  • Find the top 3 most frequent y for each x in table R(x,y,z).

    • Not very easy!

    • However, suppose we have a query:

      SELECT x,y, count(*) num FROM R GROUP BY x,y ORDER BY x,y, num;
      

      Now, we can write a simple program to compute this as follows:

      Algorithm Top 3 y for each x:
         Run query:
              SELECT x,y, count(*) num
              FROM R
              GROUP BY x,y
              ORDER BY x,y, num;
         In a loop read each tuple such that:
             if a new x is found:
                 read the next three y values
                 (error handling if less than three y values are found)
                 skip remaining x values
      

Procedural SQL#

  • To enable the use of SQL for costly queries, while making it possible to write code/procedures on top of it, databases support a number of options.

  • The options belong in two main options:

    • Server side

    • Client side

Server side#

  • Languages make it possible to define procedures, functions and triggers

  • These programs are compiled and stored within the database server

  • They can also be called by queries

Client side#

  • Languages allow programs to integrate querying of the database with a procedural language

  • Coding in a host language with db hooks (C, C++, Java, Python, etc.) using the data structures of these languages

  • Coding in frameworks with their own data models (Java, Python, etc) with similar db hooks as in above.

Programming with SQL#

  • All programming paradigms support:

    • Methods to execute queries and update statements

      Executing any SQL statement and catching the outcome

    • Interpret errors returned if any

    • Input values from variables into queries, output query results into variables

    • Loop over query results (for queries returning multiple tuples)

    • Raise exceptions (which results in rollback of transactions)

    • Store and reuse queries in the shape of cursors

    • Starting and committing transactions

  • Client side programs also support:

    • Opening/closing connections, allocating/releasing database resources for queries

  • The actual syntax of the methods for doing this vary, but the principle is the same.

Server side language examples#

  • Generally database specific

  • Postgresql:

    • pl/pgsql: generic procedural language for postgresql

    • pl/pyhton: procedural language that is an extension of python (also see pl/tcl, pl/perl)

Client side language examples#

  • Database specific extensions of host languages, for example

    • libpq: C library for postgresql, uses library calls for specific to postgresql

    • OCCI: Oracle library for C

    Write code in C with hooks to the database in the form of functions and libraries, and compile using C compiler

    • ECPG: embedded programming in SQL, based on the embedded programming standard with a postgresql specific pre-compiler, an the standard C compiler

  • Standard libraries for connecting to the database

    • JDBC is a Java standard library, all databases implement the same functions (some non-standard data types will be different)

      To compile code with JDBC functions, the specific adapter (library) for a database must be used.

    • No specific standard for libraries based on C, but ODBC is a non-language specific standard, .NET tries to achieve the same

    • Python DB-API is a database independent framework for Python

      Similar to Java, a module for each database management system is needed. The postgresql module is called psycopg2.

Frameworks#

  • Frameworks are based on specific design principles for developing database backed applications

  • Examples:

    • Object-relational-mapping used by Rails, Hibernate, Django, WebObjects, .NET (different frameworks have different models)

    • Note that the frameworks can be built on top of other languages (such as Java + JDBC)

Client-side programming#

  • Client-side programming languages have constructs to connect to a database, keep a pool of open connections and close connections, in additions to constructors for executing queries.

Running and Debugging#

  • Depending on how tight the integration with the database, error checking becomes an issue.

  • Two sources of syntax errors:

    • Host language syntax errors

    • SQL syntax errors: note that often these are not checked until you send the query to the server. So, it is difficult to debug.

  • Some client-side programming paradigms are a hybrid of host language and additional constructs.

    • Example: Embedded SQL is an SQL standard for writing programs in a host language.

    • There is a precompilation phase for only the SQL code and variables with better error checking.

    • SQL calls are changed to calls in the host language after precompilation.

    • Though writing programs in this paradigm can be a bit awkward and it is a less popular platform as a result.

Performance#

  • One must be especially careful about sending large data sets over the network when writing client code.

  • When writing client-side code, you must balance the work that must be done in the server side and client side.

    • Very complex queries slow down performance.

    • Very simple queries lead to very large data sets being sent over the network.

  • Your programs must balance (A) the cost of query execution at the server, (B) cost of data transfer and © cost of processing the data at the client application.

    • In most CS courses, you learn how to improve C, run time cost.

    • In this course, we are learning about A and B in detail, to be coupled with C.

Call level interfaces for client-side programming#

  • Supplies the constructs for opening connections, running queries, looping over them etc.

  • SQL is completely handled with special function calls in the host language

  • JDBC is an industry standard for Java, supported by all databases using drivers.

  • Other database specific examples: OCCI for C in Oracle, Libpq for C in Postgresql

Python DB-API#

  • DB-API is a generic db interface for python (like JDBC).

  • psycopg2 is a python adapter that implements DB-API.

    import psycopg2 as dbapi2
    
    db = dbapi2.connect (database="python", user="python", password="python")
    cur = db.cursor()
    cur.execute ("INSERT INTO versions VALUES ('2007-10-18', '2.4.4', 'stable')")
    conn.commit ()
    cur.close()
    db.close()
    
  • Example 2:

    import psycopg2 as dbapi2
    
    db = dbapi2.connect (database="python", user="python", password="python")
    cur = db.cursor()
    
    cur.execute ("SELECT * FROM versions");
    rows = cur.fetchall()
    for i, row in enumerate(rows):
        print ("Row", i, "value = ", row)
    
    try:
        cur.execute ("""UPDATE versions SET status='stable' where version='2.6.0' """)
        cur.execute ("""UPDATE versions SET status='old' where version='2.4.4' """)
        db.commit()
    except Exception, e:
        db.rollback()
    
  • Note that in all these examples, there is no support for database types and any SQL level error checking. Programmer has to use the correct methods and debug errors which requires a high degree of expertise! Frameworks try to make this easier on programmers by providing a tighter connectivity!

  • What these examples do not show is the level of exception processing required to make these useful.

  • Let’s see another example. Note that psycopg is the newest implementation of the DB-API connection module.

    import psycopg
    from psycopg import DatabaseError, OperationalError, ProgrammingError, IntegrityError
    
    def main():
       conn = None
       try:
           # Connect to the PostgreSQL database
           conn = psycopg.connect(
               dbname="mydatabase",
               user="myuser",
               password="mypassword",
               host="localhost",
               port="5432"
           )
    
           # Create a cursor
           cur = conn.cursor()
    
           # --- SELECT statement ---
           print("Fetching employees...")
           cur.execute("SELECT id, name, salary FROM employees ORDER BY id;")
           rows = cur.fetchall()
           for row in rows:
               print(f"ID: {row[0]}, Name: {row[1]}, Salary: {row[2]}")
    
           # --- UPDATE statement ---
           print("\nUpdating salary for employee with ID=1...")
           cur.execute("UPDATE employees SET salary = salary * 1.1 WHERE id = 1;")
           print(f"{cur.rowcount} row(s) updated.")
    
           # Commit the transaction
           conn.commit()
    
       except OperationalError as e:
           print("Operational error (e.g., connection issue):", e)
    
       except ProgrammingError as e:
           print("Programming error (e.g., SQL syntax issue):", e)
    
       except IntegrityError as e:
           print("Integrity error (e.g., constraint violation):", e)
    
       except DatabaseError as e:
           print("General database error:", e)
    
       except Exception as e:
           print("Unexpected error:", e)
    
       finally:
           # Close the cursor and connection
           if conn is not None:
               try:
                  conn.close()
                   print("\nConnection closed.")
               except Exception as e:
                   print("Error closing connection:", e)
    
    
     if __name__ == "__main__":
        main()