Building a Database
Basic Database Terms
A database is a file or a collection of files; created in a program like
Microsoft Access, Microsoft
SQL Server, MySQL, Oracle, FoxPro, etc.; to assemble your data into a single entity (or file)
in which each piece of data relates to other pieces of data.
The database's organization makes it easy for you, or for that matter for PHP, to find
the information you want from your database.
Here are some definitions that relate to databases:
- table: the main element of a database. A collection of data
grouped by relevance.
- field: a category of information in a table.
- field name: the name that you give to each category (field).
- row: a record, or single set of information consisting of all of the fields
for one item in the table.
- key: a unique identifier (often a number) that refers to a specific row in a
table.
- cross-referenced data: the fact that each piece of data in the table has
two reference points: 1) the row that it is in, and 2) the field (in that row)
that it is in.
- data relationship: a method of grouping data with related data, or by common
traits among the data. This is the main advantage of a database over a plain text
document containing the data.
After you put your data into a table in a database, you need to link your
database to PHP
so PHP can serve up the data to your pages. This link is sometimes
provided by a datasource, and the connection
to your datasource may be through ODBC (Open Database Connectivity),
but your webmaster or PHP administrator will handle that.
For an ODBC connection, your webmaster or PHP administrator must
tell PHP these things about your database:
- The fact that the database does, in fact, exist.
- Where the database exists on the server's hard drive.
- What kind of database it is (Access, SQL Server, MySQL, Oracle, or whatever).
- The fact that we want PHP to create a data pipeline in and out of the
database so it's ready for quick access by our pages.
But in many PHP installations, the link to a MySQL database is made without
a datasource or ODBC, using a set of special mysqli( ) PHP functions.
The connection that we will be making for this course
will be one of these non-ODBC connections. We will see more about this
connection later.
Basic Database Relationships
Each item of data in a table has two reference points:
- Row
- Field
When you run a
query, which means that you are asking for information from the database,
you get
rows (also called
records) from the database. The
fields in each row
contain the data that you want. You normally display, or do calculations with,
the data in these fields, in your Web page(s).
But the data relationships in a database extend even farther, when you link or relate data from
two or more tables. In the schema that we're
using, there are links/data relationships between these tables:
- Customers and Orders
- Orders and Order_Items
- Products and Order_Items
- Suppliers and Products
One of the main things that these
relationships allow us to do is store data in our database
only once, and share it among the other tables that are related to it. These
relationships
allow us to save time and storage space by not having to duplicate a bunch of data.
These relationships are normally made with keys in the tables.
A primary key is a value in a field that uniquely identifies a record. It prevents what
look like duplicates. For instance, if the same customer makes two identical orders on the
same day, the only way we can tell if it is actually two orders and not an accidentally-duplicated
row is the primary key of the Orders table. In this particular example, the primary key is
the OrderID field of the Orders table.
A foreign key is a value in a field that relates that record to a record in another table.
The foreign key field contains the value of a primary key field in the other table.
An example of a foreign key is
the OrderID field of the Order_Items table.
For our purposes as PHP programmers, the data relationships that are established by this
primary key-to-foreign key relationship mainly enforce
data integrity in our tables. For example, if there is an order in the Orders table for
customer John Smith, we wouldn't want to delete John Smith from the Customers table. If we did,
the Orders table would have a CustomerID that no longer referred to a real customer. The data
relationship that we make between the CustomerID in the Customers table (primary key)
and the CustomerID in the Orders table (foreign key) prevents you from deleting John
Smith if there is an order that he made in the Orders table. The database system itself, in our case
MySQL, enforces this restriction by creating an error situation if you try to delete
John Smith, for example.
In order to clarify and explain what the primary key and the
foreign key do for you, look at the following two samples of
data that could be in your Customers and Orders tables (I have
purposely left out some columns, to keep things a bit simpler.) (Also, "PK" means "primary
key", and "FK" means "foreign key".):
Customers
| CustomerID (PK) |
First_Name |
Last_Name |
Street_Address |
| 1 |
Jim |
Link |
123 Main St. |
| 2 |
Sam |
Jones |
547 Elm St. |
| 3 |
Bill |
Smith |
345-B Oak St. |
Orders
| OrderID (PK) |
CustomerID (FK) |
Order_Date |
Delivery_Date |
| 1 |
2 |
9/18/07 |
9/24/07 |
| 2 |
3 |
9/15/07 |
9/21/07 |
| 3 |
2 |
9/15/07 |
9/20/07 |
Please note these points about the above table samples:
- CustomerID is the primary key (PK) in the Customers
table, and is the foreign key (FK) in the Orders table.
-
By using the CustomerID as the foreign key in the Orders table,
we have avoided having duplicate data in our database. In other words,
if we had stored the user's names and addresses in the Orders table
instead of in the Customers table, the same data would (potentially)
exist in more than one record/row of the Orders table. This is a situation
that properly-designed databases seek to avoid.
- The main thing that the PK-FK relationship does for you,
is to prevent your inserting data into the Orders table that
does not belong to an existing customer. If, for example, you
tried to insert a row into the Orders table with a CustomerID
(FK) of 6, the SQL engine would complain
very loudly about your not being able to insert the data due
to referential integrity constraints in the relationship.
What this ultimately means is that you can only insert a CustomerID
into the Orders table which already exists in the Customers table.
- In order to retrieve related data in a normal SQL query, you
have to join the tables in the query. We will see how to
do joins later in the semester.
Building a MySQL Database
When you are going to make a database, you should make a diagram, called a
schema, of
the tables and fields in the database.
A good way to get started is to make a list of the major categories that your data will
be organized into. In this course, the major categories of data will be:
- Customers
- Orders that the customers place
- The items (and their quantities) that are in the orders
- The product details for the items in the orders
- The brand or supplier of the products in the orders
Each major category will probably become a table in the database.
Then you will need to list the fields that will be in each table.
IMPORTANT NOTE: The field names in your database tables CAN NOT BE the same as any
database key words. For instance, your Orders table CAN NOT have a field named `Order` because
one of the the MySQL keyword phrases is ORDER BY.
Indicate how each table will receive its input. In
the schema for this course,
the Customers, Orders, and Order_Items tables will receive input from an
online form or set of forms that the
customers will fill in and submit. But at least two tables will have Admin forms
that you (or your executive assistant) will fill in and submit: Suppliers and
Products.
Look for places where data will be duplicated. You want to avoid, as much as possible,
duplicating data. Here is where you make use of the
data relationships that a database makes possible. Linking or relating
data is what you do when you share data between two or more tables. For instance, you could,
if you wanted, store the customer's name and address in the Orders table, but that's not a
good idea. The same customer can make multiple orders. You don't want the address information
duplicated in the Orders table. It's better to store the address once, in the Customers table,
and relate the Orders and Customers table through the CustomerID field.
A primary key is a value in a field, that uniquely identifies a record in a table.
In MySQL, the primary key should be an
auto_increment data type.
An auto_increment data field is a numeric field that is automatically
filled in by the database as you add rows to the table. (We'll talk about
adding rows later.) This feature guarantees that the field will be
unique for each row, which makes it an ideal data type for the
primary key.
Let's use the above database design concepts to build some of the tables
in the schema
that we will be using for this class.
Each of you should log into your database on your Web site's hosting server.
-
Log in to your hosting service's Control Panel.
-
If you are using Freehostia, click on the "MySQL" icon in the "Manage your sites / databases" panel.
You can also
click on "Databases, MySQL Databases" in the top main menu.
-
If you haven't already created your database, click the "ADD MYSQL" button on the MySQL page.
After you have created your database, make sure you write down
the database name and password.
Let's make the Suppliers table in the class schema:
-
Click "phpMyAdmin" in the listing of your database in the MySQL panel.
-
Click the name of your database, in the left navigation area of phpMyAdmin.
-
In the "Create table..." section, in the "Name:" field, type in Suppliers.
-
Type 8 in the "Number of fields:" field.
-
Click the "Go" button.
-
In the first row of the grid that displays, in the "Column" field, type SupplierID.
-
Select BIGINT in the "Type" field.
-
Select "Primary" in the "Index" dropdown list.
-
Click the "A_I" (auto-increment) checkbox.
-
In the second row, in the "Column" field, type Name.
-
Select VARCHAR in the "Type" field.
-
Type 50 in the "Length/Values" field.
-
In the third row, in the "Column" field, type Address.
-
Select VARCHAR in the "Type" field.
-
Type 50 in the "Length/Values" field.
-
In the fourth row, in the "Column" field, type City.
-
Select VARCHAR in the "Type" field.
-
Type 50 in the "Length/Values" field.
-
In the fifth row, in the "Column" field, type State.
-
Select CHAR in the "Type" field.
-
Type 2 in the "Length/Values" field.
-
In the sixth row, in the "Column" field, type Zip.
-
Select VARCHAR in the "Type" field.
-
Type 10 in the "Length/Values" field.
-
In the seventh row, in the "Column" field, type Phone.
-
Select VARCHAR in the "Type" field.
-
Type 50 in the "Length/Values" field.
-
Click the checkbox in the "Null" column.
-
In the eighth row, in the "Column" field, type Fax.
-
Select VARCHAR in the "Type" field.
-
Type 50 in the "Length/Values" field.
-
Click the checkbox in the "Null" column.
-
Select "InnoDB" in the "Storage Engine:" dropdown list.
-
In the "Collation:" field under the grid, select "ascii_general_ci".
-
Click the "Save" button.
Now let's make the Products table in the class schema:
-
Click the name of your database, in the left navigation area of phpMyAdmin.
-
Click "New" under your database name on the left.
-
In the "Table name:" field, type in Products.
-
In the first column of the grid that displays, in the "Column" field, type ProductID.
-
Select BIGINT in the "Type" field.
-
Select "Primary" in the "Index" dropdown list.
-
Click the "AUTO_INCREMENT" checkbox.
-
In the second column, in the "Column" field, type SupplierID.
-
Select BIGINT in the "Type" field.
-
Select "INDEX" in the "Index" dropdown list. This setting
tells MySQL to use this field as a Foreign Key.
-
Select "InnoDB" in the "Storage Engine:" dropdown list.
-
In the "Collation:" field under the grid, select "ascii_general_ci".
-
Type 6 in the "Add ... column(s)" field.
-
Click "Go".
-
In the third row, in the "Column" field, type Name.
-
Select VARCHAR in the "Type" field.
-
Type 50 in the "Length/Values" field.
-
In the fourth row, in the "Column" field, type Color.
-
Select VARCHAR in the "Type" field.
-
Type 50 in the "Length/Values" field.
-
In the fifth row, in the "Column" field, type UPC.
-
Select VARCHAR in the "Type" field.
-
Type 50 in the "Length/Values" field.
-
In the sixth row, in the "Column" field, type Quantity_In_Stock.
-
Select INT in the "Type" field.
-
In the seventh row, in the "Column" field, type Wholesale_Price.
-
Select DOUBLE in the "Type" field.
-
In the eithth row, in the "Column" field, type Retail_Price.
-
Select DOUBLE in the "Type" field.
-
Click the "Save" button.
Foreign Key Relationships
Now, let's relate the two tables together with a Primary Key-to-Foreign Key
relationship:
-
In the left menu, select the Products table. Please NOTE CAREFULLY
that the Products table is the one which contains the foreign key
in the relationship that is being created.
-
Click on the "Structure" tab.
-
Now select the "Relation view" link. (It may be a tab in some versions of phpMyAdmin.)
- In the DCCCD District's system, the "Relation view" link is underneath the list of column names, next to "Print View".
-
Drop down the "Internal relation" select list next to SupplierID and select
suppliers.SupplierID. (Some versions of phpMyAdmin have separate Table and
Column select lists.)
-
Also drop down the "Foreign key constraint" dropdown list next to SupplierID
and select suppliers.SupplierID.
-
Click the "Save" button.
- In GoDaddy and other similar hosting systems, the "Relation view" link is next to the "Table Structure" button just under the "Structure" tab.
-
You need to name this contraint, so enter a name (such as SupplierID_FK or something like that) in the "Constraint Name" field.
-
Leave "ON DELETE" and "ON UPDATE" as they are.
-
Drop down the first "Column" select list and select
SupplierID.
-
Drop down "Table" and select "suppliers".
-
Drop down the other "Column" select list and select
SupplierID.
-
Click the "Save" button.
[Optional Section]
Varchar vs. Char
Sometimes a question comes up along these lines: "What is the difference between the data types
varchar and char, and when should I use them?"
The simple answer is that you should use whichever type makes sense for the data that you will storing.
A more helpful answer is that the expected data to be stored in the column will usually determine
your
choice of data type. The relevant considerations are:
-
If your expected data can have both very few characters in it, and also a lot of
characters, then VARCHAR makes sense.
-
If your expected data will always have the same amount of characters in it, it makes sense to
use CHAR.
-
If trailing spaces (space characters after normal characters, at the end of the data) must be
retained in the data when you retrieve it from the table, you must use VARCHAR.
This situation occurs because CHAR fields automatically trim
trailing spaces when you retrieve the data from the table, whereas VARCHAR fields keep the
trailing spaces.
The varchar data types take up 1 or 2 extra bytes for each field in the internal storage of the
database, but this extra space is not usually significant nor wasteful. Use whatever data type makes
sense for your data, keeping in mind the considerations above.