Richland College Multimedia Learning Center

Digital Media Programming with PHP and MySQL

Digital Media Programming with PHP and MySQL

Database Security

Introduction

It is possible for bad-meaning programmers to put harmful code into some of the fields in your database tables. This handout will give you some ways to prevent these attacks on your database.


Can We Sanitize the Input?

It is absolutely vital to sanitize (validate) user input to insure that the data do not contain dangerous code, whether to the SQL server or to the HTML code itself. One idea is to strip out "bad stuff", such as quotes or semicolons or escapes, but this is a misguided attempt. Though it is easy to point out some dangerous characters, it is harder to point out all of them.

The language of the web (meaning: HTML and the HTTP protocol) is full of special characters and strange markup (including alternate ways of representing the same characters), and efforts to authoritatively identify all of the "bad stuff" are unlikely to be successful.

Instead, rather than "remove known bad data", it is better to "remove everything but known good data". The distinction is crucial. Stated another way, our goal is to retain the good stuff, and to throw away everything else.

Here is an example. An email address can contain only these characters:

abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
0123456789
@.-_+

If we reject characters that could not be valid, and reject them early in the processing - presumably with an error message - we not only prevent SQL Injection, but we also catch mere typos. We store only good stuff in our database.

But in the real world, "sanitizing the input" is harder than it sounds. We can't just remove the bad stuff, such as extra quotes (which, as we will see a bit later in this handout, can hide some very bad SQL code) because even "regular" characters can be troublesome.

Here is an example of how hard it is to keep out bad code. In the following SELECT SQL statement, an integer ID value is being compared against the user input (a numeric login id). If a bad guy can insert the clause

OR 1=1
into your query, he can get all of the rows of data from the table:

  SELECT fieldlist
  FROM table
  WHERE id = 23 OR 1=1;  -- 1=1 always matches!  But how would you filter out this bad code?

In practice, there are so few database fields for which it is possible to outright exclude many of the dangerous characters, that there needs to be some other way to keep out bad code . We will need to make use of some other ways of avoiding SQL injection, as described in the rest of the sections of this handout.


Validate the Data Types

You need to make sure that the given input has the expected data type. PHP has a wide variety of input validating functions such as is_numeric(), ctype_digit(), and good Perl compatible Regular Expressions support.

If your database is expecting numeric input, consider verifying the data with is_numeric(), or change the data's type using settype(). This approach is useful for form validation, which we will see in the "PHP Form Validation" class in a week or two.

Another very useful approach is to use the sprintf() function to build the query string. This is the approach that many of my example queries have used.

If you do use the sprintf() function, you should use the precision specifier for all string values. The precision specifier limits how many characters are accepted by sprintf() and is an effective way to prevent extra (SQL injection) code from being added to your query by an intruder. Most of the sample code later in this handout, uses this approach.


Escape/Quotesafe the Input

Even if we could sanitize a phone number or email address, we cannot take this approach with a "name" field unless we want to exclude names like "Bill O'Reilly" from our database. A quote/apostrophe is a valid character for this field.

You can include an actual single quote in an SQL string by putting two of them together, so this suggests the obvious - but wrong! - technique of preprocessing every string to replicate the single quotes:

  SELECT fieldlist
  FROM customers
  WHERE name = 'Bill O''Reilly';  -- works OK

However, this naïve approach can be beaten because most databases support other string escape mechanisms. MySQL, for instance, also permits \' to escape a quote. So, for example, if the (malicious) user types

  \''; DROP TABLE users; -- 

into our "name" field and we replicate the single-quotes, our page can wind up running a query like this one:

  SELECT fieldlist
  FROM customers
  WHERE name = '\''; DROP TABLE users; --';       -- Boom!  We just lost our "users" table!

The expression '\'' is a complete string (containing just one single quote), and the SQL statement above drops our "users" table. The trouble doesn't stop with backslashes either; there are Unicode, other encodings, and SQL parsing oddities all hiding in the weeds to damage or destroy our database's data.

Getting quotes right is notoriously difficult, which is why many database interface languages provide a function that does it for you.

One example is the MySQLi function real_escape_string(). This method should be used if it is available. You can see this function in actual use in the addnewtrans.php and updatetrans.php pages in the handout "Changing a Database's Contents with PHP".

This method real_escape_string() replaces certain characters that could be typed into an input form, which could potentially cause errors in the database execution, or could be used by bad people to attack your database.

In particular, if the person typing information into your input form uses an apostrophe (') anywhere in the text that they are typing, the MySQL engine will most likely stop processing because of an internal error. This real_escape_string() method is a handy way to allow the user to include those apostrophes, without risking extra code inserted by bad guys.


Use Bound Parameters

Even if you can't use bound parameters (and for our MySQL installation, you can't), you should use the sprintf() function to build your query string. The sprintf() function, properly used, is a reasonable approximation to the use of bound parameters. See the final paragraph in this section for some more information on using sprintf().

In this technique, a SQL statement (string) is created with placeholders - sprintf() formatting codes - in place of each data value; and the SQL query string is then converted ("prepared", in SQL parlance) into a format that is useful to the SQL query engine in the database.

Then this prepared query is "executed" by the database.

Each piece of information destined for the table is passed as a positional parameter to the query string, and at no point do the contents of these variables have anything to do with SQL statement parsing. Quotes, semicolons, backslashes, SQL comment notation - none of this has any impact, because it's "just data" to the sprintf() function call. There simply is nothing to subvert, so the application is largely immune to SQL injection attacks.

There also may be some performance benefits if this prepared query is reused multiple times (it only has to be parsed once), but this is minor compared to the enormous security benefits. This is probably the single most important step you can take to secure a web application against SQL attacks.

If you do use the sprintf() function, you should use the precision specifier for all string values. The precision specifier limits how many characters are accepted by sprintf() and is an effective way to prevent extra (SQL injection) code from being added to your query by an intruder. Most of my sample code uses this approach.


Segregate the Users

Never connect to the database from your PHP pages as a superuser or as the database owner. Always use customized users with very limited privileges to connect to the database from your PHP pages. Your PHP pages ought to use a database connection with the most limited rights possible: query-only access to most tables, and no access to any admin or security (e.g.: login) tables, if possible.

Note: Your student user accounts for this class are very limited and are safe, in this regard.

If your Web site requires the user to log in, in the real world you should have two separate connections available. When the user is logging in, the most restrictive (limited) database connection should be used. After your PHP login page has determined that the user is legitimate, you can then switch that session to a database connection with more rights.

Talk to your webmaster and/or database administrator about having these two different login accounts, in any real-world sites that you work on.


Storing Passwords

Passwords should never be stored in a database as plain text. If they are stored as plain text, and a hacker is able to gain access to the table that the passwords are stored in, your system is wide open to attack.

PHP has several functions that can be used to encrypt your passwords before they are stored in a database. Some of these functions are:

But md5() is no longer considered secure. The brief research that I did on this topic indicates that the currently-accepted method to use for password encryption is password_hash(). So I suggest that you use password_hash() to encrypt the user's password before you store it in the database. The php.net reference page for password_hash() is here: http://php.net/manual/en/function.password-hash.php

The corresponding built-in PHP function that is used to verify the user's password in a login page is password_verify(). The php.net reference page for password_verify() is here: http://php.net/manual/en/function.password-verify.php

Here is a sample PHP page which encrypts a password using password_hash( and several other, older methods. Note that the encrypted password that is returned from password_hash() is the most complicated and strongest.

This page is testEncryptPassword.php:

<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>Encrypt Password Sample</title>
  <style type="text/css">
    body { font-family: Arial, sans-serif; }
  </style>
</head>

<body>
  <h3>Sample Password Encryption</h3>
  <?php
    $password = "periwinkle";  // Pretend that the user typed in this password in a "Create a Password" page.
    
    $encryptPwd1 = md5($password);
    $encryptPwd2 = hash("md5", $password);
    
    $ctx = hash_init("md5");
    hash_update($ctx, $password);
    $encryptPwd3 = hash_final($ctx);
    
    $encryptPwd4 = password_hash($password, PASSWORD_DEFAULT);
    
    echo "<ul>";
    echo "<li>encryptPwd1: $encryptPwd1</li>";
    echo "<li>encryptPwd2: $encryptPwd2</li>";
    echo "<li>encryptPwd3: $encryptPwd3</li>";
    echo "<li>encryptPwd4: $encryptPwd4</li>";
    echo "</ul>";
  ?>
</body>
</html>

You can see the above page running here.

Now here are some pages that show a more "real-world" way of getting and storing a user login, with a user name and a password.

Please note carefully that these sample pages are being run from my jimlink.net server, not the school server. I want you to see them running on a secure HTTP connection, which I have available on my server.

The first page, which is a standard HTML form, is getEncryptedPassword.html:

<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>Create a Login Account (Sample)</title>
  <link rel="stylesheet" type="text/css" href="PHPstyles.css" />
  <script type="text/javascript" src="includes/jquery.js"></script>
  <script type="text/javascript">
    $(document).ready(function(){
      $(':submit').css("cursor", "pointer");
    });
  </script>
  <script type="text/javascript">
    function verifyInfo()
    {
      var retValue = true;
      var userName = document.getElementById("username").value;
      var pwd1 = document.getElementById("password").value;
      var pwd2 = document.getElementById("password2").value;
      
      if (userName == "" || pwd1 == "" || pwd2 == "")
      {
        alert("Please type in all of the required information.");
        document.getElementById("username").focus();
        retValue = false;
      }
      
      if (pwd1 != pwd2)
      {
        alert("Your password values do not match.  Please try again.");
        document.getElementById("password").value = "";
        document.getElementById("password2").value = "";
        document.getElementById("password").focus();
        retValue = false;
      }
      
      return retValue;
    }
    
    function setCursorPos()
    {
      document.getElementById("username").focus();
    }
  </script>
</head>

<body onload="setCursorPos();">
  <div id="mainDiv">
    <h3>Create a Login Account (Sample)</h3>
    <form method="post" action="storeEncryptedPassword.php"
          onsubmit="return verifyInfo();">
    <table cellspacing="0">
      <tr>
        <td>User name: </td>
        <td><input type="text" name="username" id="username" maxlength="50" /></td>
      </tr>
      <tr>
        <td>Password: </td>
        <td><input type="password" name="password" id="password" maxlength="50" /></td>
      </tr>
      <tr>
        <td>Password (again): </td>
        <td><input type="password" name="password2" id="password2" maxlength="50" /></td>
      </tr>
      <tr>
        <td colspan="2"><input type="submit" value="Create the Login Account" /></td>
      </tr>
    </table>
    </form>
  </div>
</body>
</html>

The action page for the above form is storeEncryptedPassword.php:

<?php
  require("includes/isSQLInject.php");
  require("includes/CheckURLandFORMscope.php");
  require_once "config.php";
  $dbobject = new mysqli(DBSERVER, DBUSER, DBPWD, DBNAME);
  /* check connection */
  $db_error = $dbobject->connect_errno;
  if ($db_error)
  {
      printf("<p>Connect failed: %s</p>\n", $db_error);
      exit();
  }
  if (empty($_POST['username']) || empty($_POST['password']))
  {
    echo "You probably got to this page without going through the input form.";
    echo "<br>";
    echo "<a href='getEncryptedPassword.html'>Go to the input form</a>";
    exit();
  }
?>
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>Store the Encrypted Password (Sample)</title>
  <link rel="stylesheet" type="text/css" href="PHPstyles.css" />
</head>

<body>
  <div id="mainDiv">
    <h3>Store the Encrypted Password (Sample)</h3>
    <?php
      error_reporting (E_ERROR | E_WARNING | E_PARSE | E_NOTICE);
      try
      {
        $usernameLocal = $_POST["username"];
        $pwd1 = $_POST["password"];
        
        if ($usernameLocal == "" || $pwd1 == "")
        {
          echo "<h4>Please go back and provide all of the required information.</h4>";
        }
        else
        {
          $sqlStr1 = sprintf("SELECT * FROM login WHERE UserName = '%.50s'", $usernameLocal);
          
          $result = $dbobject->query ( $sqlStr1 );
          $rowInfo = $result->fetch_assoc();
          if ($result->num_rows > 0)
          {
          	$resultBool = true;
          }
          else
          {
          	$resultBool = false;
          }
        }
          
        if ($resultBool) // We found the login name
        {
          echo "<h4>That user name already exists.  Please try again.</h4>";
        }
        else
        {
          $pwdEncrypted = password_hash($pwd1, PASSWORD_DEFAULT);
          $sqlStr2 = sprintf("Insert Into login (UserName, Pwd) VALUES ('%.50s','%.255s')", $usernameLocal, $pwdEncrypted);
          
          $result = $dbobject->query ( $sqlStr2 );
          
          if ($result === false)
          {
            throw new Exception("Database error 2.");
          }
          else
          {
            echo "<h4>Success!</h4>";
          }
        }
      }
      catch (Exception $e)
      {
        echo 'Caught exception: ',  $e->getMessage(), "<br />";
        exit('Stop processing...');
      }
    ?>
  </div>
</body>
</html>

You can use the above pages to create a fake login account here.

After you use the above pages to create a fake login account, you may use this page to pretend to log in. This is page pretendLogin.php:

<?php
  session_start();
?>
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>Login to your Fake Account</title>
  <link rel="stylesheet" type="text/css" href="PHPstyles.css" />
  <script type="text/javascript" src="includes/jquery.js"></script>
</head>

<body onload="setCursorPos();">
  <div id="mainDiv">
    <h3>Login to Your Account (Sample)</h3>
<?php
  if (isset($_SESSION['error_message']))
  {
    echo "<h3 class='error_emphasis'>{$_SESSION['error_message']}</h3>";
    unset($_SESSION['error_message']);
  }
?>
    <form method="post" action="pretendLoginAction.php">
      <table id="form_table">
        <tr>
          <td class="form_label col1">Username: </td>
          <td class="form_input col2"><input type="text" size="60" name="username" id="username"></td>
        </tr>
        <tr>
          <td class="form_label">Password: </td>
          <td class="form_input"><input type="password" size="60" name="password" id="password"></td>
        </tr>
        <tr>
          <td></td>
          <td><input type="submit" value="Log In"></td>
        </tr>
      </table>
    </form>
  </div>
</body>
</html>

And the fake login processing page is this page, which is pretendLoginAction.php:

<?php
  session_start();
  require_once "config.php";
  if (empty($_POST['username']) || empty($_POST['password']))
	{
		$_SESSION['error_message'] = "Some of your login information is missing.  Please try again.";
		$filename = "pretendLogin.php";
		if (strpos($_SERVER['SERVER_NAME'],'localhost') === false)
		{
			header("Location: " . constant("SECUREURL") . "/PHPStuff/" . $filename);
			exit();
		} else {
			header("Location: " . constant("LOCALURL") . "/PHPStuff/" . $filename);
			exit();
		}
	}
	else
	{
    $mysqli = new mysqli(DBSERVER, DBUSER, DBPWD, DBNAME);
    /* check connection */
    $db_error = $mysqli->connect_errno;
    if ($db_error)
    {
      printf("<p>Connect failed: %s</p>\n", $db_error);
      exit();
    }
    $rawPassword = $_POST['password'];
    $query  = sprintf("
SELECT *
FROM login
WHERE UserName = '%s'", $_POST['username']);
		$result = $mysqli->query($query, MYSQLI_USE_RESULT);
		if ($result) {
			$row = $result->fetch_assoc();
			if (!empty($row['Pwd']) && password_verify($rawPassword, $row['Pwd'])) {
				unset($_SESSION['error_message']);
				$filename = "fakeSuccessfulLogin.php";
			} else {
				$_SESSION['error_message'] = 'Your login information is incorrect.  Please try again.';
  			$filename = "pretendLogin.php";
			}
			if (strpos($_SERVER['SERVER_NAME'],'localhost') === false)
			{
				header("Location: " . constant("SECUREURL") . "/PHPStuff/" . $filename);
				exit();
			} else {
				header("Location: " . constant("LOCALURL") . "/PHPStuff/" . $filename);
				exit();
			}
		}
	}
?>

Finally, if you successfully fake your login, here is the "success" page, which is fakeSuccessfulLogin.php:

<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>Successful Login</title>
  <link rel="stylesheet" type="text/css" href="PHPstyles.css" />
  <script type="text/javascript" src="includes/jquery.js"></script>
</head>

<body">
  <div id="mainDiv">
    <h3>Successful Login!</h3>
  </div>
</body>
</html>

Please note these points about the above code:


Catching SQL Injection

Here is the code for the first include file in the previous section's action page. The include file is isSQLInject.php:

<?php
/**
 * Tests a string for possible SQL injection.
 */
function isSQLInject($input)
{
  $listSQLInject = "<script,select,insert,update,delete,formfields,tablename,drop,iframe,onload,sql," .
      "declare,cast ,cast(,'cast, cast (,char(, CHAR(4000),CHAR(,varchar,exec,execute,sp_sqlExecute," .
      "HTTP-EQUIV,url=,/url,.exe,.js,/script,formfields,server=,datasource,--,values,CATEGORYID," .
      "DECLARE,'-',||,concat,(M),(SM),LOAD_FILE,ASCII,COLLATE,1=1,UNION,queryout,cmdshell,xp_,*.exe," .
      sysobjects,xtype,syscolumns,NOT EXIST,WAITFOR,BENCHMARK,pg_sleep,'x?,(),mysql,verynx,convert," .
      "schema,with value";
  $arraySQLInject = explode(",", $listSQLInject);
  
  foreach($arraySQLInject as $injectValue)
  {
    $pos = stripos($input,$injectValue);
    if ($pos !== false)
    {
      return true;
    }
  }
  return false;
}
?>

And now here is the code for the second include file, CheckURLandFORMscope.php:

<?php
try
{
  foreach ($_REQUEST as $field => $value)
  {
    if (is_array($value))
    {
      //skip it, it's probably checkboxes
    }
    else
    {
      if (isSQLInject($value) === true)
      {
        $displayValue = htmlspecialchars($value);
        throw new Exception("Injection Warning in field '$field', value '$displayValue'!");
      }
    }
  }
}
catch (Exception $e)
{
    echo 'Caught exception: ',  $e->getMessage(), "
"; $to = 'jlink@dcccd.edu'; $subject = 'SQL Injection Warning'; $message = $e->getMessage(); $message = wordwrap($message, 70); $headers = 'From: ' . 'admin@jimlink.net'; if ($to != "" && $subject != "" && $message != "" && $headers != "From: ") { $outcome = mail($to, $subject, $message, $headers); } exit('Stop processing...'); }

Please note that the second include file, CheckURLandFORMscope.php, does a check at the top of the foreach loop to see if the value is an array. If the value is an array, this code skips it. Interestingly, this skipping of the test for arrays is fairly safe, because array values are most like coming into the form from HTML checkboxes, which should not have any injection code in them.

Of course, if you use these two SQL-injection-preventing pages in your own PHP pages, you will need to make at least these modifications:

Here are some additional hints and suggestions regarding the use of this code to catch SQL injection attacks:


Further Reading

The following articles provide some background information regarding this issue of SQL injection. They also provide some hints regarding how to avoid SQL injection. The previous sections in this handout were drawn in part from these articles.