Database Access

Database Access

Python Database Access |

Relational databases are a useful tool for data storage and manipulation. SQL (Structured Query Language) is an industry standard for managing data in relational databases. It is supported in nearly every major programming language including Python with the sqlite3 core Python package. See this reference guide from LearnSQL.com for a summary of SQL commands.

Import Libraries

The first step is to import sqlite3. The pandas library is also used to export, analyze, and import SQL tables. Numpy and Matplotlib are used to create a data array and visualize data, respectively.

import sqlite3
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Connect to Database

SQLite is a minimal version of SQL. SQLite stores the entire database in one file or in memory, making it a good choice for small projects or for on-device storage with local data.

:memory: - store the database in temporary memory
filename - write to the hard drive (e.g. database.db)

Write a data file if the database needs to persist or is also needed by another application. Use the connect function to connect to the database.

f = 'database.db' # write to 'database.db' or ':memory:'
cxn = sqlite3.connect(f)

Cursor Position

The database cursor position is needed to access the database.

cur = cxn.cursor()

SQL Queries

SQL Queries are commands to create a new table. Below is a list of basic commands for adding or dropping (deleting) tables:

CREATE TABLE - create a new table
DROP TABLE - delete table

Delete Table

Drop (delete) table tbl if it exists with command DROP TABLE IF EXISTS. The SQL statement is a string that is processed by cur.execute().

cur.execute('DROP TABLE IF EXISTS t1')

Create Table

Create a new table with name store with CREATE TABLE. The new table can have any of the SQLite datatypes:

int - Integer numbers
real - Decimal numbers
text - Character string
blob - Binary large object with photos or other files

cur.execute('CREATE TABLE t1 (id int PRIMARY KEY, \
tag text NOT NULL, \
val real NOT NULL)')

Each part of the SQL statement is explained in detail:

cur.execute() - The execute() command processes SQL statements
id int PRIMARY KEY - The id is the PRIMARY_KEY as a unique integer
tag text NOT NULL - The tag is a text type that cannot be blank
val real NOT NULL - The name val is a real type (floating point number)

Commit Changes

The commit() function saves the changes. Until the commit() function runs, other applications cannot access the database updates.

cxn.commit()

Row Operations

Now that the blank table is created, rows can be inserted, updated, or deleted.

INSERT INTO - add rows to a table
SELECT - find rows in a table
UPDATE - update rows in a table
DELETE - delete rows in a table

The VALUES (?, ?, ?) allows a tuple to determine the (id, name, val) as variables.

cmd = 'INSERT INTO t1 (id, tag, val) VALUES (?, ?, ?)'
cur.execute(cmd, (0,'P',5))
cur.execute(cmd, (1,'I',10.0))
cur.execute(cmd, (2,'D',0.2))
cxn.commit()

✅ Activity

Insert a new record (row) into t1 with the following values:

id = 3
tag = 'FF'
val = -0.01

After adding the row, commit the change to the database.

cur.execute(cmd,(3,'FF',-0.01))
cxn.commit()

Read SQL Table with Pandas

Select all value from table t1 to export to a Pandas DataFrame t1

t1 = pd.read_sql_query('SELECT * from t1',cxn)
t1

	id	tag	val
0	0	P	5.0
1	1	I	10.0
2	2	D	0.2

Write SQL Table with Pandas

Create a new DataFrame table with Pandas. The parameters P, I, D are adjustable values from a PID controller. The FF term is a feedforward element that is optionally added to controllers to reject disturbances.

t2 = pd.DataFrame({'tag':['P','I','D','FF'],
'desc':['Proportional','Integral','Derivative','Feedforward']})
t2.index.names = ['id']
t2

	tag	desc
id
0	P	Proportional
1	I	Integral
2	D	Derivative
3	FF	Feedforward

Write a new table t2 to the database with to_sql(). The if_exists argument has options fail, replace, and append. Use replace to remove any existing table and replace it with the updated table.

t2.to_sql('t2',cxn,if_exists='replace')

Join Tables with Pandas

Pandas joins tables with the df.join() function. Joining tables is also shown in the Pandas Time-Series tutorial.

First, retrieve the t1 table and set the index to id.

t1=pd.read_sql_query('SELECT * from t1',cxn)
t1.set_index('id',inplace=True)
t1

	tag	val
id
0	P	5.0
1	I	10.0
2	D	0.2

Also retrieve table t2 and set the index in the same way.

t2=pd.read_sql_query('SELECT * from t2',cxn)
t2.set_index('id',inplace=True)
t2

	tag	desc
id
0	P	Proportional
1	I	Integral
2	D	Derivative
3	FF	Feedforward

Join the two tables with an inner join that only includes the id rows that are common between t1 and t2. An outer join includes all rows and inserts NaN for any missing values.

t3 = t1.join(t2['desc'],on='id',how='inner')
t3

	tag	val	desc
id
0	P	5.0	Proportional
1	I	10.0	Integral
2	D	0.2	Derivative

SQL statements are more efficient when dealing with large databases because the source tables do not need to be imported before the JOIN operation. However, the sqlite3 commands are limited, such as no OUTER JOIN.

Join Tables with SQL Statement

JOIN - join two tables

JOIN combine tables t1 and t2 into a single table t3.

t3 = pd.read_sql_query('SELECT t1.tag,t1.val,t2.desc \
FROM t1 INNER JOIN t2 \
ON t1.id=t2.id;',cxn)
t3

	tag	val	desc
0	P	5.0	Proportional
1	I	10.0	Integral
2	D	0.2	Derivative

Each part of the SQL statement is explained in more detail:

SELECT t1.tag,t1.val,t2.desc - The new t3 table has tag and val from t1 and desc from t2
FROM t1 INNER JOIN t2 - A join only includes the rows that have common key elements
ON t1.id=t2.id - Join the two tables by matching t1.id and t2.id

Pandas processes the SQL query and returns the result to the new DataFrame t3.

Filter with SQL WHERE

Filters select a subset of rows that meet a set of conditions. The WHERE statement indicates which rows are selected based on a condition.

Table t4 contains a history of P, I, D tuning parameters from a PID controller. A new record (row) is created every time the tuning parameters are changed. Create the new table t4.

t4 = pd.DataFrame({'id':[0,0,1,2,0,0,0,1,1,2],
'hist':[4,4.5,20,0,6.5,6,5,15,10,0.2]})
t4.set_index('id',inplace=True)
t4.to_sql('t4',cxn,if_exists='replace')
t4

	hist
id
0	4.0
0	4.5
1	20.0
2	0.0
0	6.5
0	6.0
0	5.0
1	15.0
1	10.0
2	0.2

Show only the rows with id=0 with WHERE id=0.

pd.read_sql_query('SELECT * FROM t4 WHERE id=0',cxn)

	id	hist
0	0	4.0
1	0	4.5
2	0	6.5
3	0	6.0
4	0	5.0

Merge Tables

Join t1, t2, and t4 with the common id with an INNER JOIN to create table t5.

t5 = pd.read_sql_query("SELECT t1.tag,t2.desc,t4.hist \
FROM t1 \
INNER JOIN t4 ON t1.id=t4.id \
INNER JOIN t2 ON t1.id=t2.id \
WHERE t1.tag='P';",cxn)
t5

	tag	desc	hist
0	P	Proportional	4.0
1	P	Proportional	4.5
2	P	Proportional	6.5
3	P	Proportional	6.0
4	P	Proportional	5.0

Create a plot of the Proportional term P to show how it has changed.

plt.plot(t5['hist'],'r-o',label=t5['desc'].iloc[0])
plt.legend(); plt.ylabel(t5['tag'].iloc[0]); plt.show()

The table merge can also be completed with Pandas with the history of P, I and D shown on a single plot.

!join tables
t5 = t1.join(t2['desc'],on='id').join(t4['hist'],on='id')
!plot values
t5['hist'].plot(kind='bar')
plt.xticks(np.arange(0,len(t5)),t5['tag'],rotation='horizontal')
plt.ylabel('Value'); plt.xlabel('Parameter'); plt.grid()
!filter table to show only 'P' values
t5[t5['tag']=='P']

	tag	val	desc	hist
id
0	P	5.0	Proportional	4.0
0	P	5.0	Proportional	4.5
0	P	5.0	Proportional	6.5
0	P	5.0	Proportional	6.0
0	P	5.0	Proportional	5.0

✅ Activity

Create a new table t0 that contains 3 SQLite datatypes.

opID - Operator ID as integer (primary key)
name - Operator name as text (required)
photo - Operator profile photo as blob (optional)

New records (rows) will be added later. Use DROP TABLE IF EXISTS t0 to drop the table if it already exists.

cur.execute('DROP TABLE IF EXISTS t0')

Create the table t0 and insert records:

opID: 1 name: David Stevens
opID: 2 name: Rebecca Okume
opID: 3 name: Mike Fisher

Commit the changes once the records are added.

The new table t0 can now to be used to record details of who made the changes to the PID parameters by recording the opID in t4 when a new row is added to the history. It is not required to add a photo or augment t4 for this activity.

cur.execute('CREATE TABLE t0 (opID int PRIMARY KEY, \
name text NOT NULL)')
cmd = 'INSERT INTO t0 (opID, name) VALUES (?, ?)'
cur.execute(cmd, (1,'David Stevens'))
cur.execute(cmd, (2,'Rebecca Okume'))
cur.execute(cmd, (3,'Mike Fisher'))
cxn.commit()

Close Database Connection

Close the database connection with cxn.close(). This releases the database so that it prevents a connection after it is closed.

cxn.close()

✅ Knowledge Check

1. Which SQL statement is used to add rows to a table in a relational database?

A. DROP TABLE

Incorrect. DROP TABLE is used to delete a table from the database.

B. SELECT

Incorrect. SELECT is used to fetch rows from a table.

C. INSERT INTO

Correct. INSERT INTO is used to add rows to a table.

D. UPDATE

Incorrect. UPDATE is used to modify existing rows in a table.

2. What is the main difference between SQLite and other SQL databases?

A. SQLite does not use SQL

Incorrect. SQLite does use SQL. It is a minimal version of SQL.

B. SQLite can only store text data

Incorrect. SQLite supports various datatypes including int, real, text, and blob.

C. SQLite stores the entire database in one file or in memory

Correct. SQLite stores the entire database in a single file or in memory, making it suitable for small projects or on-device storage.

D. SQLite requires a server setup

Incorrect. SQLite doesn't require a server setup. It's serverless, zero-configuration, and self-contained.

Options:

fixed
fluid

orange
blue
green
pink
cyan
red
violet