Escaping a regular expression in PHP. Escaping (or what you need to know to work with text in text) Special characters in single and double quotes


SQL injections, cross-site scripting, corrupted XML... Scary, scary things that we would all like to be protected from, but first we want to know why it all happens. This article explains the fundamental concept behind them all: strings, and handling strings within strings.

The main problem

It's just text. Yes, just text: that is the main problem. Almost everything in a computer system is represented by text (which in turn is represented by bytes). Some texts are intended for computers, while others are intended for people, but both of them are still text. To see what I'm talking about, here's a small example:
<author>Homo Sapiens</author><text>Suppose, there is the English text, which I don't wanna translate into Russian</text>
You won't believe it: this is text. Some people call it XML, but it's just text. It may not be suitable for showing to an English teacher, but it is still just text. You can print it on a poster and go to rallies with it, you can write it in a letter to your mother... it's text.

However, we want certain parts of this text to have some meaning to our computer. We want the computer to be able to extract the author of the text and the text itself separately so that we can do something with it. For example, convert the above to this:
Suppose, there is the English text, which I don't wanna translate into Russian by Homo Sapiens
How does the computer know how to do this? Well, because we very conveniently wrapped certain parts of the text in special words in funny brackets, like <author> and <text>. Since we've done this, we can write a program that looks for these specific parts, extracts the text, and uses it for some purpose of our own.

In other words, we used certain rules in our text to indicate some special meaning that someone else, following the same rules, could use.
Okay, this isn't all that hard to understand. But what if we want to use these funny brackets, which have a special meaning, in our text without invoking that meaning? Something like this:
<author>Homo Sapiens</author><text>Basic math tells us that if x < n and y > n, x cannot be larger than y.</text>
The "<" and ">" characters are nothing special in themselves. They can legally be used anywhere, in any text, as in the example above. But what about our idea of special words, like <author>? Does it mean that "< n and y >" is also some kind of keyword? In XML, perhaps yes. Or perhaps not. It is ambiguous. Since computers are not very good at dealing with ambiguity, something may produce an unexpected result unless we dot the i's ourselves and resolve the ambiguity.
This dilemma can be solved by replacing ambiguous symbols with something unambiguous.
<author>Homo Sapiens</author><text>Basic math tells us that if x &lt; n and y &gt; n, x cannot be larger than y.</text>
Now the text is completely unambiguous: "&lt;" stands for "<" and "&gt;" stands for ">".
The technical term for this is escaping: we escape special characters when we don't want them to have their special meaning.
escape |iˈskāp| [no obj.] break free [with obj.] fail to notice or remember [...] [with obj.] Computing: cause to be interpreted differently [...]
If certain characters or sequences of characters in a text have special meanings, then there must be rules that specify how to handle situations where those characters must be used without invoking their special meaning. Or, in other words, escaping answers the question: “If these symbols are so special, how can I use them in my text?”.
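As you can see in the example above, the ampersand (&) is also a special character. But what if we want to write "&lt;" itself, without it being interpreted as "<"? Then the ampersand has to be escaped as well: "&amp;lt;".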
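As a rough illustration of the escaping described above, here is a minimal PHP sketch. The helper name escape_markup and the choice of flags are assumptions made for this example, not something the article prescribes; htmlspecialchars itself is a standard PHP function.

```php
<?php
// Hypothetical helper: escape text so it can sit safely inside an XML/HTML element.
function escape_markup(string $text): string
{
    // ENT_QUOTES also converts both kinds of quotes; ENT_XML1 selects XML-safe entities.
    return htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

$sentence = 'Basic math tells us that if x < n and y > n, x cannot be larger than y.';
echo '<text>' . escape_markup($sentence) . '</text>', "\n";

// The ampersand is escaped too, so a literal "&lt;" is not mistaken for "<".
echo escape_markup('&lt;'), "\n"; // &amp;lt;
```

The angle brackets in the sentence come out as &lt; and &gt;, while the surrounding tags, which were written by the program rather than taken from the text, keep their special meaning.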


If your users are good and kind, they will post quotes from old philosophers, and the messages will look something like this:

Posted by Plato on January 2, 15:31

I am said to have said "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat."


If users are smart, they will probably talk about math, and the messages will be like this:

Posted by Pascal on November 23, 04:12

Basic math tells us that if x< n and y >n, x cannot be larger than y.


Hmm... Those abusers of our angle brackets again. Well, from a technical point of view they may be ambiguous, but the browser will forgive us for that, right?


Okay, STOP, what the hell? Some prankster put JavaScript tags into your forum? Anyone viewing this message on your site is now downloading and executing scripts, in the context of your site, that can do who knows what. And this is not good.

Not to be taken literally

In the above cases, we want to somehow tell our database or browser: this is just text, don't do anything with it! In other words, we want to "remove" the special meaning of all special characters and keywords from any information provided by users, because we don't trust them. What to do?

What? What are you saying, boy? Oh, you say "escaping"? You're absolutely right, have a cookie!
If we apply escaping to the user data before merging it with the query, then the problem is solved. For our database queries it will be something like:
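$name = $_POST['name'];
// Escape the value according to MySQL's rules before it goes into the query.
// (The mysql_* functions were removed in PHP 7.0; mysqli_real_escape_string() is their successor.)
$name = mysql_real_escape_string($name);
$query = "SELECT phone_number FROM users WHERE name = '$name'";
$result = mysql_query($query);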
Just one line of code, but now no one can "hack" our database anymore. Let's see again what the SQL queries will look like, depending on the user input:
Alex
SELECT phone_number FROM users WHERE name = 'Alex'
Mc'Donalds
SELECT phone_number FROM users WHERE name = 'Mc\'Donalds'
Joe'; DROP TABLE users; --
SELECT phone_number FROM users WHERE name = 'Joe\'; DROP TABLE users; --'
mysql_real_escape_string indiscriminately places a backslash in front of anything that might have some special meaning.


We apply the htmlspecialchars function to all user data before outputting it. Now the pest's message looks like this:

Posted by JackTR on July 18, 12:56


Note that the values received from users are not actually "corrupted". Any browser will parse this as HTML and display everything on the screen in its correct form.
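To make this concrete, here is a small sketch of escaping a forum post before output. The render_post function, the markup around the message and the sample script are all assumptions for illustration; the original message is not shown here.

```php
<?php
// Hypothetical renderer for a forum post; the markup and class names are assumptions.
function render_post(string $author, string $message): string
{
    $flags = ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5;
    return '<div class="post"><p class="author">Posted by '
        . htmlspecialchars($author, $flags, 'UTF-8')
        . '</p><p>'
        . htmlspecialchars($message, $flags, 'UTF-8')
        . '</p></div>';
}

// A made-up malicious message: after escaping it is displayed as text, not executed.
echo render_post('JackTR', '<script>alert("gotcha");</script>'), "\n";
```

Only the user-supplied parts go through htmlspecialchars; the tags the program writes itself are left as real HTML.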

Which brings us back to...

All of the above demonstrates a problem common to many systems: text placed inside other text must be escaped if it is not supposed to carry special meaning. When placing text values in SQL, they must be escaped according to SQL rules. When placing text values in HTML, they must be escaped according to HTML rules. When placing text values in (technology name), they must be escaped according to (technology name) rules. That's all.

For completeness, there are, of course, other ways to deal with user input that may or may not contain special characters:
  • Validation
    You can check whether user input matches a given specification. If you require a number and the user enters something else, the program should inform the user and reject the input. If all this is organized correctly, there is no risk of catching "DROP TABLE users" where the user was supposed to enter "42". This is not very practical for avoiding HTML/SQL injection, because you often need to accept free-form text that may contain tricks. Typically, validation is used in addition to other measures.
  • Sanitization
    You can also "quietly" remove any characters that you consider dangerous. For example, simply remove anything that looks like an HTML tag to keep <script> from being added to your forum. The problem is that you may also remove perfectly legitimate parts of the text.
  • Prepared SQL statements
    There are special functions that do exactly what we wanted: make the database understand the difference between the SQL query itself and the information provided by users. In PHP they look something like this:
    $stmt = $pdo->prepare('SELECT phone_number FROM users WHERE name = ?');
    $stmt->execute([$_POST['name']]); // execute() takes an array of values to bind
    In this case, the query is sent in two stages, clearly separating the query from the values. The database can first understand the structure of the query and then fill in the values.

In the real world, these are all used together for different levels of protection. You should always use validation to make sure the user is entering the correct data. You can then (but are not required to) sanitize the entered data. If a user is clearly trying to slip you some script, you can simply delete it. Then, you should always, always escape user data before putting it into an SQL query (the same goes for HTML).
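A prepared statement can be tried end to end with a short sketch like the one below. It assumes the pdo_sqlite extension and an in-memory SQLite database purely so that the example is self-contained; the article's own examples target MySQL, and the find_phone helper and sample row are made up.

```php
<?php
// Hypothetical lookup helper built on a prepared statement.
function find_phone(PDO $pdo, string $name): ?string
{
    // The query structure is fixed here; the value is bound separately.
    $stmt = $pdo->prepare('SELECT phone_number FROM users WHERE name = ?');
    $stmt->execute([$name]);
    $phone = $stmt->fetchColumn();
    return $phone === false ? null : (string) $phone;
}

// In-memory SQLite database with one sample row (assumptions for this sketch).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (name TEXT, phone_number TEXT)');
$pdo->exec("INSERT INTO users VALUES ('Alex', '555-0100')");

var_dump(find_phone($pdo, 'Alex'));
// The injection attempt is treated as a plain (non-matching) name.
var_dump(find_phone($pdo, "Joe'; DROP TABLE users; --"));
```

Because the value never becomes part of the SQL text, no escaping function has to be called for it at all.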

2007.11.08 16:07

I ran into a problem with PHP automatically escaping quotes when entering information into the database.

After some digging on the Internet, I discovered that the problem can be solved by changing the server settings using the directives in .htaccess: magic_quotes_gpc and magic_quotes_runtime.

They say (and I even believe it) that the developers of the PHP language, being unable to force the bulk of PHP programmers to write high-quality code, decided to take care of the security of our DBMS and introduced the automatic addition of slashes before special characters. Slashes are added based on php.ini directives (magic_quotes_gpc and magic_quotes_runtime).

The directives are collectively called "magic quotes", but I call them "hell quotes". Indeed, a well-written application has no need for automatic quoting; moreover, the extra slashes get in the way and have to be removed.

The first directive, magic_quotes_gpc, means that PHP automatically adds slashes to data coming from the user: from POST and GET requests and from cookies. The second directive, magic_quotes_runtime, means that slashes are added to data obtained during script execution, for example from a file or database. As a result, most functions that return such data escape the quotes in it.

If you want to refuse such an intrusive service, then either disable these configuration variables in php.ini (in that rare and happy situation when you are the full owner of the server), or (unless, of course, the site is on free hosting) make changes to the .htaccess file. This file contains local Apache settings that apply to one directory rather than to the entire server. Add the following lines to it.
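The lines themselves did not survive in this copy of the post. A likely form, assuming PHP runs as an Apache module (so that php_flag is allowed in .htaccess) and a PHP version older than 5.4, where magic quotes still exist, would be:

```apache
# Assumed reconstruction: switch off both magic quotes directives for this directory
php_flag magic_quotes_gpc Off
php_flag magic_quotes_runtime Off
```

On PHP 5.4 and later these settings are unnecessary, because the magic quotes feature was removed from the language.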



PHP quotes: single quotes, double quotes, escaping

Today we will deal with quotes, and not just any quotes but quotes in PHP, plus the options for escaping them.

Let's start with the fact that PHP allows the use of both double and single quotes.

And there are several options for escaping quotes.

Where are the quotes located on the keyboard?

If you are going to work with code, you need to know where the quotes are on the keyboard. In the Latin layout they share one key, the one that carries the letter Э in the Russian layout: pressed on its own it types a single quote ('), and with Shift it types a double quote (").

PHP double and single quote character codes

Naturally, you will also need the character codes for quotes: if you need to print a quote without it taking effect as code, that is what the character codes are for.

Double quote character:

&quot;

Single quote character:

&#39;

PHP escaping quotes.

What do we actually mean by escaping quotes in PHP?

Let's look at an example, because with examples it is always easier to understand what we are talking about!

Let's write the php code:

echo ""php quotes"";

But if we pasted this code into the page as is, I'm afraid you would never see these lines!

Why? Simply because the code will not work.

I made a page especially for you with this code inserted into it, so if you want, you can see what comes of it!

Why did this happen?

Because inside echo there are additional quotes, which PHP treats as part of the code, and if there are more of them than expected, an error occurs!

What to do in this case?

You need to replace the outer double quotes with single quotes.

Let's take the same code and change the outer double quotes to single quotes.

echo '"php quotes"';

Let's see what we got!

That is, we took the code above and pasted it directly into this page, and this is the output:


2. The second option for escaping quotes. There are situations when using single quotes is impossible!

For this case, a backslash is used. Put a backslash before each character that needs to be escaped.

Let's take the previous example and do the same thing, only using a backslash:

echo "\"php quotes\"";

Let's see the result:
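The two options can be compared side by side in a short sketch (the variable names are made up for the example):

```php
<?php
// Option 1: single quotes outside, so the inner double quotes are plain characters.
$with_single = '"php quotes"';

// Option 2: double quotes outside, inner double quotes escaped with a backslash.
$with_backslash = "\"php quotes\"";

// Both spellings produce exactly the same string.
var_dump($with_single === $with_backslash); // bool(true)
echo $with_backslash, "\n";                 // "php quotes"
```

The backslashes exist only in the source code; they are not part of the resulting string.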

How to remove quotes.

To display quotes on the screen without them being treated as PHP code, strange as it may sound, change the quotes to HTML entities, for example:

Single quote via HTML code:

&#39; - single quote ' ' '

Double quote via HTML code:

&quot; - double quote " " "

Such a quote will look like a quote on the screen, but will no longer act as a quote in the PHP code...

One more thing!

All PHP code must be handled carefully! Suppose, for example, you wrote some text in Word and then started writing code in it, as happened to me the first time.

I couldn't understand what the problem was: it simply refused to run even simple code. I was ready to smash the computer with a sledgehammer!!!

It turned out that the typographic quotes Word inserts are different from the straight quotes a code editor uses. And that is hard to figure out if you haven't been through it!

The slash (from the English "slash") here means the backslash that inexplicably appears in your data. It is added before some special characters, mainly quotes. Slashes are only needed when working with a database, and there they are absolutely necessary. In all other cases, they only get in the way. Now we will look at both cases and learn how to write programs that do not depend on PHP settings.


The following php.ini directives are responsible for automatically adding slashes:



magic_quotes_gpc
magic_quotes_runtime



The first one, if enabled, automatically adds slashes to data coming from the user: from POST and GET requests and from cookies. The second does the same for data received during script execution, for example from a file. But there is not always access to PHP settings, especially if the program is written for distribution.


For your own safety, read the ENTIRE text, regardless of your case.


1. If you work WITHOUT a database
This means that you do not need automatically added slashes. If PHP has added them, you need to get rid of them.


You can check whether PHP has added them using the get_magic_quotes_gpc() function.
The stripslashes() function removes slashes.
Now all we have to do is check, and if PHP has added slashes, go through all the variables in the script and remove them. This can be done with one function, using the $GLOBALS array, which contains all the variables present in the script:



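// Legacy code: get_magic_quotes_gpc() exists only in PHP versions before 8.0.
if (get_magic_quotes_gpc()) strips($GLOBALS);

// Recursively remove slashes from a string or from every element of an array.
function strips(&$el) {
    if (is_array($el)) {
        foreach ($el as $k => $v) {
            // $GLOBALS contains a reference to itself; skip it to avoid endless recursion.
            if ($k !== "GLOBALS") {
                strips($el[$k]);
            }
        }
    } else {
        $el = stripslashes($el);
    }
}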



Slashes will be removed both from global arrays and from all variables that are formed when register_globals=on.


Here we need to make a small digression. Iterating through the $GLOBALS array is only required if you have register_globals enabled and you use the variables that are automatically assigned the values passed to the script. If you don't use them, just remove the slashes from the arrays you need:
$_POST, $_GET and so on.
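For the simpler case of cleaning a single array, a variant that returns a cleaned copy instead of modifying variables in place might look like this. The function name strip_slashes_deep and the sample data are assumptions for this sketch; it does not depend on the magic quotes functions, so it also runs on current PHP versions.

```php
<?php
// Hypothetical helper: returns a copy of the input with slashes removed from every string.
function strip_slashes_deep($value)
{
    if (is_array($value)) {
        // Recurse into nested arrays; keys are preserved.
        return array_map('strip_slashes_deep', $value);
    }
    // Leave numbers and other non-string values untouched.
    return is_string($value) ? stripslashes($value) : $value;
}

// Simulated request data, as magic quotes would have delivered it.
$post = ['name' => 'Mc\\\'Donalds', 'tags' => ['say \\"hi\\"', 42]];
$clean = strip_slashes_deep($post);
print_r($clean);
```

In a legacy script it would be applied as, for example, $_POST = strip_slashes_deep($_POST), guarded by the same magic quotes check as above.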


To get rid of adding slashes when retrieving data from a file, just write at the beginning of the script:



set_magic_quotes_runtime(0);



2. If you work with MySQL
Two basic rules for writing queries in MySQL:


  • In all variables, special characters must be escaped with slashes.
    Important note: the added slashes do NOT go into the database. They are only needed in the query. When the data reaches the database, the slashes are discarded. Accordingly, it is a widespread mistake to use stripslashes when retrieving data from the database.

  • All string variables must be enclosed in quotes (single or double, but single ones are more convenient and more common). For simplicity, you can also enclose numeric variables in quotes: MySQL itself converts them to the required type. That is, for reliability, any data inserted into the query should be enclosed in quotes.

    Simply calling addslashes() would be wrong. What if PHP itself has already added slashes? This needs to be checked, and the get_magic_quotes_gpc() function is used for that.
    If the data came from the user’s browser using the GET or POST method, then you should write it like this:

    if (!get_magic_quotes_gpc()) $var=addslashes($var);



    If the data is taken from a file (which happens rarely, but still), then

    if (!get_magic_quotes_runtime()) $var=addslashes($var);



    Interestingly, recent versions of PHP introduced the mysql_escape_string() function specifically for MySQL; it escapes a few more characters than addslashes() (line breaks, for example). It probably makes sense to use it.
    If you have a special function for composing queries, the escaping can be built into it. If not, you can use this function:

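// Add slashes to a string (returned) or to every element of an array (modified in place).
function adds(&$el, $level = 0) {
    if (is_array($el)) {
        foreach ($el as $k => $v) adds($el[$k], $level + 1);
    } else {
        $el = addslashes($el);
        // Only the top-level call on a string returns the escaped value.
        if (!$level) return $el;
    }
}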

This function has two uses.
If you specify a string as a parameter, the function will return it with escaped special characters.
Convenient for inserting into a query, like



"SELECT * FROM table WHERE name='".adds($name)."'";



If the parameter is an array, then the function will not return anything, but will simply “traverse” all its elements recursively. For example, adds($_POST); will do the normal magic_quotes work for this array.


Note that none of the functions that add slashes add them to the "%" and "_" search metacharacters used in the LIKE operator. Therefore, if you use this operator, add the slashes manually.



$data = preg_replace('/([%_])/', '\\\\$1', $data);
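The same idea can be wrapped in a small helper. This is only a sketch: the name escape_like is made up, and it assumes the database uses the backslash as the LIKE escape character, which is MySQL's default.

```php
<?php
// Hypothetical helper: escape LIKE wildcards so that they match literally.
function escape_like(string $term): string
{
    // strtr() with an array works in a single pass, so the backslashes it adds
    // are not escaped a second time.
    return strtr($term, ['\\' => '\\\\', '%' => '\\%', '_' => '\\_']);
}

echo escape_like('50%_off'), "\n"; // 50\%\_off
```

The result still has to go through the normal escaping for the query (or be passed as a bound parameter); this step only takes care of the LIKE metacharacters.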




Escaping rules may differ for other DBMSs.


Note:
When outputting a value into the value attribute of form input tags, slashes do not help. For the entire text to be displayed in such a field, the value must be enclosed in quotes, and the htmlspecialchars function must be applied to the output data.
Example:
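The original example is missing at this point. A minimal sketch of what the note describes might look like this; the text_input helper and the field name are assumptions for illustration.

```php
<?php
// Hypothetical helper that builds a text input with a safely quoted value attribute.
function text_input(string $name, string $value): string
{
    return '<input type="text" name="' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8')
        . '" value="' . htmlspecialchars($value, ENT_QUOTES, 'UTF-8') . '">';
}

// Without escaping, the inner double quotes would end the attribute early.
echo text_input('title', 'The "Great" Escape'), "\n";
```

The browser shows the field with the full text, quotes included, because &quot; is decoded back to a quote only when the value is displayed.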




In the first version (with double quotes), we escaped the dollar special character, so it lost its special purpose (marking a variable) and turned into an ordinary dollar sign.

In the second version (with single quotes), as you already know, the PHP interpreter did not even try to find variables in the string, and therefore escaping was not required.
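The code these two paragraphs refer to is not shown here, so the following is only a sketch of the same behaviour with a made-up variable:

```php
<?php
$price = 100; // sample variable, an assumption for this sketch

// Double quotes: \$ is an escaped dollar sign, so no variable is substituted there.
$escaped = "The variable \$price holds $price";

// Single quotes: PHP does not look for variables at all.
$literal = 'The variable $price holds $price';

echo $escaped, "\n"; // The variable $price holds 100
echo $literal, "\n"; // The variable $price holds $price
```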

Special characters in PHP

Especially for readers of the Site on! blog, I have prepared a small list of special characters in the PHP programming language:

  • \n new line
  • \r carriage return
  • \t horizontal tab
  • \\ backslash
  • \$ dollar sign
  • \" double quote

Let's look at how special characters work using \n as an example. It starts a new line (like Enter). Browsers do not (and should not) interpret it and simply ignore it, but the result of its work can be seen in the source code of the page:

Result:

Source code (Ctrl + U):

If the special character \n is not displayed in any way for visitors in the browser, then what is its meaning?

Firstly, using special characters and \n in particular, you can conveniently format the code on the page (as in the example above).

Secondly, \n can be used, for example, when writing to a file, to insert a line break (Enter) and continue writing on a new line.
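Both points can be checked with a short sketch. The temporary file created with tempnam() and the sample strings are assumptions made for the example.

```php
<?php
// "\n" in double quotes is one newline character; '\n' in single quotes is two characters.
$double = "first\nsecond";
$single = 'first\nsecond';

var_dump(strlen($double)); // int(12)
var_dump(strlen($single)); // int(13)

// Write the text to a temporary file and read it back line by line.
$path = tempnam(sys_get_temp_dir(), 'nl');
file_put_contents($path, $double);
$lines = file($path, FILE_IGNORE_NEW_LINES);
unlink($path);

print_r($lines); // two elements: "first" and "second"
```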

An alternative to this formatting is heredoc syntax.

Heredoc syntax in PHP

Result:

Source code (Ctrl + U):

The result speaks for itself, now let's figure out how everything works:

  • The string starts with three angle brackets (<<<)
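Since the original example did not survive here, this is a minimal heredoc sketch; the identifier HTML and the variable are arbitrary choices for illustration.

```php
<?php
$title = 'Heredoc'; // sample variable, an assumption for this sketch

// The identifier after <<< (here HTML) is arbitrary; the same identifier closes the string.
$html = <<<HTML
<h1>$title</h1>
<p>Quotes like "these" and 'these' need no escaping here.</p>
HTML;

echo $html, "\n";
```

Variables are substituted as in double-quoted strings, while line breaks and both kinds of quotes are kept exactly as typed.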