When it comes to validation and sanitization in PHP, many wheels are frequently reinvented. Stop! Did you know that PHP offers a core API for this? The Filter extension has been available since PHP 5.2 (you really should be on 5.3 or higher) and is enabled by default.
This extension filters data by either validating or sanitizing it. This is especially useful when the data comes from an unknown (or foreign) source, such as user-supplied input from an HTML form.
To summarize the documentation, the Filter API filters data in two ways: validation and sanitization. Validation filters return the data if it is valid (converted to the validated type where applicable, so FILTER_VALIDATE_INT returns an integer) and false otherwise, while sanitization filters return a sanitized version of the data. The API also comes in two flavours: filter_var, a general-purpose function for any scalar value, and filter_input, a specialized version for filtering fields of the HTTP-oriented superglobals. Both accept the value (or field name) to be filtered, the filter to apply, and an optional configuration array or set of flags.
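As a rough illustration of the difference between the two filter families, the sketch below runs one validation filter and one sanitization filter through filter_var. The sample values are made up for the example.

```php
<?php
// Validation: the value comes back (converted to the validated type) or false.
$valid   = filter_var('42', FILTER_VALIDATE_INT);        // int 42
$invalid = filter_var('forty-two', FILTER_VALIDATE_INT); // false

// Sanitization: the value is cleaned up rather than rejected.
// FILTER_SANITIZE_NUMBER_INT keeps only digits and the plus and minus signs.
$clean = filter_var('4a2', FILTER_SANITIZE_NUMBER_INT);  // string "42"

var_dump($valid, $invalid, $clean);
```

Note that the sanitizer never reports failure: it always hands back a string, so it is not a substitute for validation when the value must be a real integer.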
Let's work on a practical example now. Suppose that we want to determine the validity of an email address. Use the Filter API, and I repeat, use the Filter API!
$emailAddress = 'jane.doe@vanilla.org';
if( !filter_var( $emailAddress, FILTER_VALIDATE_EMAIL ) ) {
    die( 'email address is invalid!' );
}
Beautifully simple and concise. The best part is avoiding having to reinvent the wheel and worry about the RFC 822 spec. Now suppose that we’d like to check that the value is an integer within some range – a numerical month for example!
$month = 9;
// Filter options must be nested under the 'options' key.
$filterArgs = [ 'options' => [ 'min_range' => 1, 'max_range' => 12 ] ];
if( filter_var( $month, FILTER_VALIDATE_INT, $filterArgs ) === false ) {
    die( 'month is invalid!' );
}
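One caveat with the !filter_var( ... ) idiom used in the email example: a validation filter returns the validated value itself, and some valid values are falsy. The sketch below shows the difference between a loose and a strict check for the integer 0. The variable names are only illustrative.

```php
<?php
// '0' is a perfectly valid integer, but the validated result (int 0) is falsy.
$result = filter_var('0', FILTER_VALIDATE_INT);

// Loose check: treats the valid value 0 as a failure.
$looseCheckRejects = !$result;

// Strict check: only a real validation failure (false) is rejected.
$strictCheckRejects = ($result === false);

var_dump($result, $looseCheckRejects, $strictCheckRejects);
```

For months in the range 1 to 12 the loose check happens to work, but comparing against false with === is the safer habit for integer, float, and boolean filters.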
Now, an invalid month might sometimes be within the tolerance of the logic. Suppose that we would like to default to January when the month is invalid or not set. Validation filters fall back to the default option if one is provided in the configuration array, so let's leverage this feature.
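// 'default' is an option too, so it also lives under the 'options' key.
$filterArgs = [ 'options' => [ 'default' => 1, 'min_range' => 1, 'max_range' => 12 ] ];
$month = filter_var( 13, FILTER_VALIDATE_INT, $filterArgs ); // $month is 1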
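To see the fallback in action, here is a minimal sketch that wraps the same filter in a small helper. The function name normalizeMonth is hypothetical and not part of the Filter API.

```php
<?php
// Hypothetical helper: returns a month number in 1..12, falling back to January.
function normalizeMonth($input)
{
    return filter_var($input, FILTER_VALIDATE_INT, [
        'options' => ['default' => 1, 'min_range' => 1, 'max_range' => 12],
    ]);
}

var_dump(normalizeMonth('9'));   // int(9): valid and in range
var_dump(normalizeMonth(13));    // int(1): out of range, default used
var_dump(normalizeMonth('abc')); // int(1): not an integer, default used
```

With a default in place the filter never returns false, so the caller can use the result directly without a separate error branch.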
Now let's sanitize some input from a webform. When receiving a chunk of text that will not be interpreted as markup, it’s a good idea to sanitize it by encoding HTML special characters, so that any tags it contains are rendered harmless. As a rule of thumb, if the data is being persisted somewhere such as the database, it should be sanitized before that happens.
// Filter $_POST['user_bio']
$shortBio = filter_input( INPUT_POST, 'user_bio', FILTER_SANITIZE_FULL_SPECIAL_CHARS );
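Because filter_input only reads real request data, the effect of this filter is easier to observe with filter_var and a literal string. The following is a small sketch; the sample text is made up and simply stands in for $_POST['user_bio'].

```php
<?php
// Made-up input standing in for $_POST['user_bio'].
$rawBio = '<script>alert(1)</script> Hi & welcome';

// Encodes HTML special characters instead of removing them.
$safeBio = filter_var($rawBio, FILTER_SANITIZE_FULL_SPECIAL_CHARS);

echo $safeBio, PHP_EOL;
```

The angle brackets come back as HTML entities, so the markup is displayed as text rather than executed by the browser.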
The advantages of validation/sanitization aside, using the Filter API to read HTTP input also avoids the notices and warnings that PHP raises when reading undefined array keys: filter_input simply returns null when the field is not set. Even if the input isn’t to be filtered at that point or doesn’t require filtering, the default filter, FILTER_UNSAFE_RAW, can be used to read the input as-is.
/* comparison of reading $_GET[ 'action' ] */
// Reading action with the classic approach
$action = isset( $_GET['action'] ) ? $_GET[ 'action' ] : null;
// Reading action with the Filter API
$action = filter_input( INPUT_GET, 'action', FILTER_UNSAFE_RAW );
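When a validation filter is used with filter_input, the return value distinguishes a field that was never sent (null) from one that was sent but failed validation (false). A minimal sketch of handling both cases follows; the page parameter is a hypothetical example.

```php
<?php
// 'page' is a hypothetical query parameter used only for illustration.
$page = filter_input(INPUT_GET, 'page', FILTER_VALIDATE_INT);

if ($page === null) {
    $status = 'missing'; // the parameter was not sent at all
} elseif ($page === false) {
    $status = 'invalid'; // the parameter was sent but is not an integer
} else {
    $status = 'ok';      // $page now holds an int
}

echo $status, PHP_EOL;
```

Strict comparisons matter here, since null, false, and a valid 0 are all falsy.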
And there it is, the Filter API. I recommend reading the official documentation: we covered the essentials, but there is even more to be leveraged. The API offers a rich set of filters, many of which accept optional flags that affect their behaviour. For example, the integer filter can be set to accept hex and octal notation (FILTER_FLAG_ALLOW_HEX and FILTER_FLAG_ALLOW_OCTAL). The API also supports custom validators and sanitizers through FILTER_CALLBACK.
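As a brief sketch of those two features, the snippet below passes flags to the integer filter and uses PHP's built-in trim function as a custom filter via FILTER_CALLBACK. The input values are invented for the example.

```php
<?php
// Flags can be passed directly as the third argument...
$hex = filter_var('0xFF', FILTER_VALIDATE_INT, FILTER_FLAG_ALLOW_HEX);

// ...or under the 'flags' key of the configuration array.
$octal = filter_var('0755', FILTER_VALIDATE_INT, [
    'flags' => FILTER_FLAG_ALLOW_OCTAL,
]);

// FILTER_CALLBACK runs any callable as a custom filter; here, trim().
$name = filter_var('  Jane  ', FILTER_CALLBACK, ['options' => 'trim']);

var_dump($hex, $octal, $name); // int(255), int(493), string "Jane"
```

Any callable works as the callback, so existing validation or clean-up functions can be plugged in without rewriting them.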
Reasons to use the Filter API
- Saves time with ready-made, out-of-the-box solutions
- Results in more compact code
- Integrates with existing filter logic
- Maintained by PHP core