data.Sql

The data.Sql class is used to integrate between Python and SQL server.

Initialisation

  • The sql class first needs to be imported.
    import sys
    sys.path.insert(0, r'[path to \lib here]')
    from data import Sql
  • Then initialise the sql object.
    dt = Sql(database, server*)

    database The SQL Server database, eg database123.
    server* The SQL Server (default XXVIR71200,00094).

Functions

  • get_list

    dt.get_list()

    Returns a list of all tables in the current connection. Note that this will also return views and system views.


  • get_cols

    dt.get_cols(table)

    Returns a list of columns in the table specified.

    table The name of the table in SQL server that we want a column list for.

  • get_table

    dt.get_table(table)

    Function used to import data from SQL Server to a Pandas DataFrame.

    table The name of the table we want to retrieve from SQL server

  • push_dataframe

    dt.push_dataframe(df, name*, batchsize*)

    Function used to upload a Pandas DataFrame (df) to SQL Server.

    df The dataframe to be push to the SQL database.
    name* The name of the table once pushed to the SQL database (default raw_data).
    batchsize* The number of rows to upload at a time (default 500).

  • push_raw

    dt.push_raw(path, sheet*, batchsize*, delimiter*, verbose*)

    Function used to upload a raw file (xlsx, xlsm, xls, csv or txt) to SQL Server. Note that xlsx files will take significantly longer to upload.

    path The dataframe to be push to the SQL database.
    sheet* The name or index (counting from 0) of an Excel sheet to upload (default 0). Only used for xlsx, xlsm and xls types.
    batchsize* The number of rows to upload at a time (default 500).
    delimiter* Used for txt files only, this specifies the text file delimiter (default ",").
    verbose* Boolean value indicated whether to print extra detail to the terminal or not (default False).

  • union

    dt.union(table_list, name*, join*, delimiter*, verbose*)

    Pass a list of table names to union them all together.

    path The dataframe to be push to the SQL database.
    table_list* A list of table names, example: to union [dbo].[d1] and [dbo].[d2], table_list = ["d1", "d2"].
    name* the name of the table created by the union (default "union").
    join* The union type, either "UNION" or "UNION ALL" (default "UNION").

  • drop

    dt.table(tables)

    Pass a table or list of table names to drop.

    tables A single table name as a string, or a list of table names as strings. For [dbo].[data] we would input "data".

  • manual

    dt.manual(query, response*, comment*, verbose*)

    Enter a manual statement/query.

    query User built SQL query to run through the SQL connection.
    response* Boolean value to specify whether to return a table from the user query or not (default False).
    comment* string input that translates into a comment in the dt.query string (default "manual query").
    verbose* Boolean value indicating whether to print extra detail to the terminal or not (default False).

  • nullify

    dt.nullify(table, replace*, verbose*)

    Run through all columns in a table and convert a specific string value to NULL values.

    table The name of the target table in SQL, eg [dbo].[data] = "data"
    replace* The cell values that should be replaced by a NULL value (default "").
    verbose* Boolean value indicating whether to print extra detail to the terminal or not (default False).

  • output_query

    dt.output_query(filename*)

    Output the SQL query built by Python into a .txt file either locally or in a user specified directory.

    table The name of the target table in SQL, eg [dbo].[data] = "data"
    replace* The cell values that should be replaced by a NULL value (default "").
    verbose* Boolean value indicating whether to print extra detail to the terminal or not (default False).

  • Troubleshooting

    ERROR 1222: Lock request time out period exceeded.

    The above error shows inside SQL Server Management Studio when there is an open Python-SQL connection. To close the connection simply navigate back into Python, where the Sql object is named dt type dt.cnxn.commit().