Escaping Outputs. Ultimate Tutorial

This tutorial shows how to properly protect your code from XSS attacks without over-escaping.

My story begins with how I worked with almost every meta box plugin on the market and couldn’t find one that fit my needs. I needed it to be fast, simple and secure. As a result, I created my own meta box plugin, and now I’m sharing my experience in this tutorial.

Rule #1. Trust No One and Nothing

When we talk about escaping, it is usually about what we get from a database. But remember that the database is not a trusted data source. Let me show you an example:

echo '<label for="' . $id . '">' . $label . '</label>'; ...

We have one variable inside an HTML attribute and another one directly inside an HTML tag. Let’s suppose that $label comes from the database and contains something like this:

'<script>window.location = "https://not-rudrastyh.com";</script>'

Instead of seeing the label in a form, your website users will simply be redirected to some shady website. Looks good? Not.

The same thing applies to $id variable which is inside the HTML attribute. And this is how:

'"><script>window.location = "https://not-rudrastyh.com";</script>'

Maybe redirecting doesn’t look so dangerous to you, but what if there is some kind of bitcoin mining script instead?

The Cure

To prevent that kind of crap from happening in the above example, all we have to do is wrap the output in esc_attr() and esc_html() respectively. Here is how:

echo '<label for="' . esc_attr( $id ) . '">' . esc_html( $label ) . '</label>'; ...

Great, now you know the basics, let’s continue with this 😁

When?

That’s a very good question, because the official WordPress documentation says that some WordPress functions take care of preparing the data for output, and it mentions the_title() as an example.

Let’s check it now! I am not even talking about changing the title in phpMyAdmin, which is also possible of course, so let’s just create a post with a title like this:

JavaScript in WordPress post title

And on the website pages where the title is printed with either the_title() or get_the_title(), we get this:

JavaScript browser alert

But look, WP_Posts_List_Table in /wp-admin is not broken, although it uses the same get_the_title() function to print titles.

Escaped post titles in WordPress admin
Titles are escaped twice here by the way.

What does it mean?! 🤔

On WordPress admin pages, the output of get_the_title() is escaped with esc_html() this way:

add_filter( 'the_title', 'esc_html' );

WordPress escapes everything where it is really important and at the same time it gives the freedom to its users when we talk about a website front end.

And here is what I think: it is up to you whether to escape titles and the like when creating templates for your custom theme, but if you’re developing a plugin or some kind of UI for WordPress admin, escaping is always a must.
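
As a minimal sketch of that advice, here is how a plugin could print post titles on its own admin screen. The function name my_plugin_titles_list() is made up for this illustration, and the code assumes it runs inside WordPress, where get_the_title() and esc_html() are available.

```php
<?php
/**
 * Hypothetical helper: builds an unordered list of post titles for an admin screen.
 * Assumes it runs inside WordPress, where get_the_title() and esc_html() exist.
 */
function my_plugin_titles_list( array $post_ids ): string {
	$html = '<ul>';

	foreach ( $post_ids as $post_id ) {
		// Escape at the point of output, even though the title comes from the database.
		$html .= '<li>' . esc_html( get_the_title( $post_id ) ) . '</li>';
	}

	return $html . '</ul>';
}
```

With this in place, a title containing a script tag should be shown as plain text instead of being executed.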

esc_attr()

As you can guess from the function name, it prepares data for use inside HTML attributes.

  • Removes invalid UTF-8,
  • Converts < (less than), > (greater than), & (ampersand), " (double quote)  and ' (single quote) characters to HTML entities,
  • Will never double encode entities.

Example:

echo '<a href="" title="View post: ' . esc_attr( get_the_title() ) . '">...';

It is also possible to add additional escaping with the attribute_escape filter.

Keep in mind that:

  • Do NOT use esc_attr() to escape data for src and href attributes, use esc_url() instead,
  • Do NOT use it for value attributes either, because esc_attr() doesn’t double encode entities, which can lead to lost HTML entities and incorrect values stored in the database. Use esc_textarea() instead.
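
The double-encoding point is easier to see in plain PHP. The sketch below does not call WordPress at all: it uses htmlspecialchars() with its double_encode flag switched off as a rough stand-in for esc_attr(), and switched on as a rough stand-in for esc_textarea(). Both function names are made up for this illustration, and the real WordPress functions do more than this.

```php
<?php
// Rough stand-in for esc_attr(): entities that already exist are left as they are.
function attr_like_escape( string $text ): string {
	return htmlspecialchars( $text, ENT_QUOTES, 'UTF-8', false );
}

// Rough stand-in for esc_textarea(): every ampersand is encoded again.
function textarea_like_escape( string $text ): string {
	return htmlspecialchars( $text, ENT_QUOTES, 'UTF-8', true );
}

// Suppose the user literally typed "&lt;b&gt;" and that is what the database holds.
$stored = '&lt;b&gt;';

echo attr_like_escape( $stored ), "\n";     // &lt;b&gt; (the browser shows <b> in the field)
echo textarea_like_escape( $stored ), "\n"; // &amp;lt;b&amp;gt; (the browser shows &lt;b&gt;)
```

In the first case the field displays something different from what was stored, so saving the form again would change the value.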

esc_html()

Prepares text for use inside HTML. The only difference from the esc_attr() function is that it has a different filter hook connected to the function output: esc_html instead of attribute_escape.

Example – let’s suppose you have a string like <div class="block"> and you want to display it within your website content.

$string = '<div class="block">';
echo esc_html( $string ); // outputs &lt;div class=&quot;block&quot;&gt;

esc_url()

Checks, tries to fix and cleans URLs. Here is how in the same order:

  • Replaces spaces with %20,
  • Removes symbols that are not allowed in URLs like backslashes,
  • Unless the URL starts with mailto:, removes the sequences %0d, %0a, %0D, %0A from the string using the private _deep_replace() function, which means that a string like %0%0%0AAA will be converted to an empty string rather than the %0%0AA that str_replace() would return,
  • Replaces ;// with :// in case someone made a mistake,
  • If the URL doesn’t contain a scheme, http:// will be prepended (unless it is a relative link starting with /, # or ?, or a PHP file),
  • If the 3rd function parameter $_context is equal to display (which is the default), ampersands will be replaced with &#038; and single quotes with &#039;,
  • Encodes square brackets with %5B and %5D,
  • Checks if URL protocol is allowed, if not – returns empty string,
  • At the very end, the clean_url filter hook is applied to the result.

Example:

echo '<a href="' . esc_url( $url ) . '">...</a>';
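
To see why a single str_replace() pass is not enough for the %0A sequences mentioned in the list, here is a small plain PHP sketch. deep_replace_sketch() is a made-up name and only imitates the idea of repeating the replacement until nothing is left to replace. It is not the WordPress function itself.

```php
<?php
// Hypothetical re-implementation of the "deep replace" idea, for illustration only.
function deep_replace_sketch( array $search, string $subject ): string {
	do {
		// $count receives the number of replacements made in this pass.
		$subject = str_replace( $search, '', $subject, $count );
	} while ( $count > 0 );

	return $subject;
}

$strip = array( '%0d', '%0a', '%0D', '%0A' );
$input = '%0%0%0AAA';

echo str_replace( $strip, '', $input ), "\n";     // %0%0AA
echo deep_replace_sketch( $strip, $input ), "\n"; // prints an empty line
```

Removing one %0A glues the surrounding characters into a new %0A, which is why the loop keeps going until a pass changes nothing.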

Allowed Protocols

By default, WordPress has a list of allowed protocols, which can be retrieved with the wp_allowed_protocols() function. Here is the list:

  • http / https
  • ftp / ftps
  • mailto
  • news
  • irc
  • gopher
  • nntp
  • feed
  • telnet
  • mms
  • sms
  • rtsp
  • svn
  • tel
  • fax
  • xmpp
  • webcal
  • urn

As I said above, if your URL is neither relative nor uses one of these protocols, an empty string will be returned. Now suppose you want to escape a Skype link like skype:rudrastyh?call.

You can add “skype” to the list of allowed protocols using the kses_allowed_protocols filter hook. Example:

add_filter( 'kses_allowed_protocols', function( $protocols ) {
 
	$protocols[] = 'skype';
	return $protocols;
 
});

Another way is to specify a protocol directly while escaping:

$url = 'skype:rudrastyh?call';
echo '<a href="' . esc_url( $url, array( 'skype' ) ) . '">Call Misha</a>';
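
Because esc_url() returns an empty string for a rejected URL, it may be worth checking the result before printing a link. Below is a minimal sketch of that check. my_plugin_safe_link() is a made-up helper name, and the code assumes it runs inside WordPress.

```php
<?php
/**
 * Hypothetical helper: returns a link, or just the escaped text if the URL was rejected.
 * Assumes it runs inside WordPress, where esc_url() and esc_html() exist.
 */
function my_plugin_safe_link( string $url, string $text, ?array $protocols = null ): string {
	// With $protocols set to null, esc_url() falls back to the default allowed list.
	$safe_url = esc_url( $url, $protocols );

	if ( '' === $safe_url ) {
		// The URL was rejected, so print the text without a link.
		return esc_html( $text );
	}

	return '<a href="' . $safe_url . '">' . esc_html( $text ) . '</a>';
}

// Usage inside a template (the second call allows the skype protocol for this one URL):
// echo my_plugin_safe_link( 'skype:rudrastyh?call', 'Call Misha' );
// echo my_plugin_safe_link( 'skype:rudrastyh?call', 'Call Misha', array( 'skype' ) );
```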

esc_js()

Escapes a string for use in inline JavaScript, like onclick="", onsubmit="" or inside a <script> tag. Please note that text strings in JavaScript must always be wrapped in single quotes in this case!

  • Removes invalid UTF-8,
  • Escapes single quotes ',
  • Converts < (less than), > (greater than), & (ampersand), " (double quote) characters into the HTML entities &lt; &gt; &amp; &quot; respectively,
  • Replaces line breaks with the \n escape sequence.

Do you see now the difference between esc_js() and esc_attr()?

Let’s take a look at the example:

<?php
$text = "some single ' quote
then the next line and <b>html code</b>";
?>
<script>
	alert('<?php echo esc_js($text) ?>');
</script>

If you don’t use esc_js() in this example, there will be a JavaScript error and nothing will happen, but in our case we get an alert message in the browser like this:

esc_js() usage example

esc_textarea()

Prepares a string for use inside a <textarea> tag.

  • Converts < (less than), > (greater than), & (ampersand), " (double quote) and ' (single quote) characters to HTML entities.
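
Example, as a minimal sketch: the helper name my_plugin_notes_field() and the my_notes meta key in the usage comment are made up for this illustration, and the code assumes it runs inside WordPress.

```php
<?php
/**
 * Hypothetical helper: builds a <textarea> field with a safely escaped value.
 * Assumes it runs inside WordPress, where esc_attr() and esc_textarea() exist.
 */
function my_plugin_notes_field( string $name, string $value ): string {
	return '<textarea name="' . esc_attr( $name ) . '" rows="5">'
		. esc_textarea( $value ) // the content goes between the tags, not into an attribute
		. '</textarea>';
}

// Usage inside a meta box callback (the meta key is an assumption):
// echo my_plugin_notes_field( 'my_notes', (string) get_post_meta( get_the_ID(), 'my_notes', true ) );
```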

Escaping with Localization

It is also worth mentioning a few localization functions, namely esc_html__(), esc_html_e(), esc_html_x(), esc_attr__(), esc_attr_e() and esc_attr_x(), which not only translate strings but also escape them.

Example:

esc_html_e( 'Hello World', 'some_text_domain' );
// absolutely the same
echo esc_html( __( 'Hello World', 'some_text_domain' ) );
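
The same idea works inside attributes. Here is a minimal sketch that combines esc_attr__() and esc_html__() in one piece of markup. The helper name my_plugin_save_button() is made up for this illustration, and the code assumes it runs inside WordPress.

```php
<?php
/**
 * Hypothetical helper: builds a translated and escaped submit button.
 * Assumes it runs inside WordPress, where esc_attr__() and esc_html__() exist.
 */
function my_plugin_save_button(): string {
	// esc_attr__() for the attribute value, esc_html__() for the text between the tags.
	return '<button type="submit" title="' . esc_attr__( 'Save all settings', 'some_text_domain' ) . '">'
		. esc_html__( 'Save', 'some_text_domain' )
		. '</button>';
}
```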

Misha Rudrastyh

I love WordPress, WooCommerce and Gutenberg so much. 11 yrs of experience.