Skip to content

Fix #67495: pdo_dblib incorrectly escapes binary and unicode values #2017

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
126 changes: 79 additions & 47 deletions ext/pdo_dblib/dblib_driver.c
Original file line number Diff line number Diff line change
Expand Up @@ -143,60 +143,92 @@ static zend_long dblib_handle_doer(pdo_dbh_t *dbh, const char *sql, size_t sql_l
return DBCOUNT(H->link);
}

static int dblib_handle_quoter(pdo_dbh_t *dbh, const char *unquoted, size_t unquotedlen, char **quoted, size_t *quotedlen, enum pdo_param_type paramtype)
static int dblib_handle_quoter(pdo_dbh_t *dbh, const char *unquoted, size_t unquoted_len, char **quoted, size_t *quoted_len, enum pdo_param_type param_type)
{
const char *hex = "0123456789abcdef";
char *q;
*quoted_len = 0;
char is_ascii = 1;
char is_int = 1;

switch (PDO_PARAM_TYPE(param_type)) {
case PDO_PARAM_LOB:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that this type is meant for streams:
http://php.net/manual/en/pdo.lobs.php

I agree that PDO's types are a bit limiting, but it'll potentially confuse things to overload this type for this use case. Do you know how other PDO drivers handle this content? I haven't had experience with something like this. Another approach I've seen, with a different API, involved a data type specific to non-stream binary data.

/*
* Binary safe quoting
*/
*quoted_len += (unquoted_len * 2) + 2; // 2 chars per byte +2 for "0x" prefix
q = *quoted = safe_emalloc(2, *quoted_len, 3);

*q++ = '0';
*q++ = 'x';
for (size_t i = 0; i < unquoted_len; i++) {
*q++ = hex[(*unquoted >> 4) & 0xF];
*q++ = hex[(*unquoted++) & 0xF];
}
break;

int useBinaryEncoding = 0;
const char * hex = "0123456789abcdef";
size_t i;
char * q;
*quotedlen = 0;

/*
* Detect quoted length and if we should use binary encoding
*/
for(i=0;i<unquotedlen;i++) {
if( 32 > unquoted[i] || 127 < unquoted[i] ) {
useBinaryEncoding = 1;
case PDO_PARAM_BOOL:
*quoted_len = 3;
if (!unquoted_len // empty strings treated as FALSE anyway
|| (unquoted_len == 1 && (*unquoted == '0' || *unquoted == ' ')) // one-char strings depends on a value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does a space count as false but two spaces don't? It would be cleaner To say that empty strings and '0' are false, everything else true. But I wonder if there's a case here I'm not thinking of.

) {
*quoted = estrndup("'0'", 3);

} else { // strings with length > 1 and one-char strings with a `non-zero` value treated as TRUE
*quoted = estrndup("'1'", 3);
}
break;
}
if(unquoted[i] == '\'') ++*quotedlen;
++*quotedlen;
}

if(useBinaryEncoding) {
/*
* Binary safe quoting
* Will implicitly convert for all data types except Text, DateTime & SmallDateTime
*
*/
*quotedlen = (unquotedlen * 2) + 2; /* 2 chars per byte +2 for "0x" prefix */
q = *quoted = emalloc(*quotedlen+1); /* Add byte for terminal null */

*q++ = '0';
*q++ = 'x';
for (i=0;i<unquotedlen;i++) {
*q++ = hex[ (*unquoted>>4)&0xF];
*q++ = hex[ (*unquoted++)&0xF];
}
} else {
/* Alpha/Numeric Quoting */
*quotedlen += 2; /* +2 for opening, closing quotes */
q = *quoted = emalloc(*quotedlen+1); /* Add byte for terminal null */
*q++ = '\'';

for (i=0;i<unquotedlen;i++) {
if (unquoted[i] == '\'') {
*q++ = '\'';
*q++ = '\'';
} else {
*q++ = unquoted[i];

case PDO_PARAM_INT:
for (size_t i = 0; i < unquoted_len; i++) { // check if it contains only digits
if (unquoted[i] < '0' || unquoted[i] > '9') {
is_int = 0;
break;
}
}
}
*q++ = '\'';
}

*q = 0;
if (is_int) { // no quoting needed, just copy unquoted string
*quoted_len = unquoted_len;
*quoted = estrndup(unquoted, *quoted_len);
break;
}
// continue as if it's a string


default:
// count quoted length
for (size_t i = 0; i < unquoted_len; i++) {
++*quoted_len;
if (unquoted[i] == '\'') {
++*quoted_len;
};

if (is_ascii && (unquoted[i] < 32 || unquoted[i] > 126)) {
is_ascii = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple practical issues with doing this automatically:

  • While pdo_dblib is commonly used with MSSQL, that's not guaranteed. Some DBs might not like this syntax.
  • It could force undesirable casts. You care about the destination column type, not the data you're passing.

There are probably other scenarios where being clever isn't advantageous. I'd thought about creating syntax like this:
$db->quote('Съешь же ещё этих мягких французских булок, да выпей чаю.', PDO::PARAM_STR | PDO::PARAM_STR_UNICODE);

That way, it's totally explicit. Which is generally a desirable trait when talking to databases. You could also have some driver attributes that let you default into unicode mode if that's desired. You'd then want a PDO::PARAM_STR_ASCII type for overriding it.

}
}

// +2 for opening, closing quotes and +1 for `N` in case of non-ascii string
*quoted_len += (is_ascii ? 2 : 3);
q = *quoted = safe_emalloc(2, *quoted_len, 3);
if (!is_ascii) {
*q++ = 'N';
}
*q++ = '\'';

for (size_t i = 0; i < unquoted_len; i++) {
if (unquoted[i] == '\'') {
*q++ = '\'';
*q++ = '\'';
} else {
*q++ = unquoted[i];
}
}
*q = '\'';
break;
}

return 1;
}
Expand Down
40 changes: 32 additions & 8 deletions ext/pdo_dblib/tests/pdo_dblib_quote.phpt
Original file line number Diff line number Diff line change
Expand Up @@ -3,22 +3,46 @@ PDO_DBLIB: Ensure quote function returns expected results
--SKIPIF--
<?php
if (!extension_loaded('pdo_dblib')) die('skip not loaded');
require dirname(__FILE__) . '/config.inc';
require __DIR__ . '/config.inc';
?>
--FILE--
<?php
require dirname(__FILE__) . '/config.inc';
require __DIR__ . '/config.inc';

var_dump($db->quote(42));
var_dump($db->quote('42', PDO::PARAM_INT));
var_dump($db->quote('4.2', PDO::PARAM_INT));
var_dump($db->quote(true));
var_dump($db->quote(false));
var_dump($db->quote(true, PDO::PARAM_BOOL));
var_dump($db->quote(false, PDO::PARAM_BOOL));
var_dump($db->quote(42, PDO::PARAM_INT));
var_dump($db->quote('42', PDO::PARAM_BOOL));
var_dump($db->quote('', PDO::PARAM_BOOL));
var_dump($db->quote(' ', PDO::PARAM_BOOL));
var_dump($db->quote('0', PDO::PARAM_BOOL));
var_dump($db->quote(null));
var_dump($db->quote(null, PDO::PARAM_NULL));
var_dump($db->quote('\'', PDO::PARAM_STR));
var_dump($db->quote('foo', PDO::PARAM_STR));
var_dump($db->quote('Съешь же ещё этих мягких французских булок, да выпей чаю.'));
var_dump($db->quote('The quick brown fox jumps over the lazy dog.'));
var_dump($db->quote(hex2bin('90243445b265537314e25b1a0ef96e9d'), PDO::PARAM_LOB));
var_dump($db->quote("foo'bar"));

?>
--EXPECT--
string(4) "'42'"
string(2) "42"
string(5) "'4.2'"
string(3) "'1'"
string(2) "''"
string(4) "'42'"
string(3) "'1'"
string(3) "'0'"
string(3) "'1'"
string(3) "'0'"
string(3) "'0'"
string(3) "'0'"
string(2) "''"
string(2) "''"
string(4) "''''"
string(5) "'foo'"
string(106) "N'Съешь же ещё этих мягких французских булок, да выпей чаю.'"
string(46) "'The quick brown fox jumps over the lazy dog.'"
string(34) "0x90243445b265537314e25b1a0ef96e9d"
string(10) "'foo''bar'"