SiteCrafting Blah Blah Blog
Feb. 28, 2007 at 1:25pm
PHP Strings - Gotcha!
Yesterday I was working with a script that saves an uploaded image to a database. If you've ever seen a binary file as a string, you know it's a ton of gobbledygook that is unreadable to people. For some very odd and inexplicable reason, my script only saved the first four characters' worth of data to the database, and that's not very helpful. I did some digging and uncovered a weird gotcha in the way PHP treats strings.
My first debug technique is to print the data I'm working with. The image data looked normal - that is to say it had lots of weird characters. Since I was using my nifty function to generate the SQL for the INSERT, I printed that out as well. The data was fine going into the function, but everything after the fourth character was missing.
I wasn't sure what was going on, so I rewrote the SQL, and noticed a curious thing. There were a lot of instances of the '\' character followed by the '0' character. If you've ever worked with C/C++, you know the single character '\0' (the NUL byte) means 'end of string'. It occurred to me that different PHP functions might treat string data very differently, even though PHP itself is written in C.
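To make the distinction between a real NUL byte and the visible '\' plus '0' pair concrete, here is a minimal sketch. The sample bytes are made up for illustration and are not the image from my script; the only functions used are strlen(), addslashes() and bin2hex().

```php
<?php
// Hypothetical sample: four bytes of "image" data, one NUL byte, then more data.
$binary = "GIF8" . chr(0) . "9a";

// In double quotes, "\0" is an escape for a single NUL byte.
$nul = "\0";
// In single quotes, '\0' is two ordinary characters: a backslash and a zero.
$visible = '\0';

// addslashes() replaces each NUL byte with the two visible characters \ and 0.
$escaped = addslashes($binary);

echo strlen($nul), "\n";      // 1
echo strlen($visible), "\n";  // 2
echo strlen($binary), "\n";   // 7
echo strlen($escaped), "\n";  // 8
echo bin2hex($escaped), "\n"; // 474946385c303961
```

So a '\' followed by a '0' in escaped SQL is what an escaped NUL byte looks like; it is not itself a terminator.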
My guess is that the echo command works something like this (sketched in C):

    #include <stdio.h>

    /* Write exactly `length` bytes, whatever they contain. */
    void echo(const char *str, size_t length) {
        for (size_t i = 0; i < length; i++) {
            putchar(str[i]);
        }
    }

Basically, it puts every character of the string to stdout, using a length that is stored alongside the string. When it encounters an errant '\0', it keeps going because that is not the end of the string. If you run addslashes() on the data first, each NUL byte is replaced by the two visible characters '\' and '0', so no real terminator is left in the string at all.

On the other hand, the code in my insert function, which uses foreach($array as $key => $value) to break apart the data array, behaved much differently, as if something in it worked like this:
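
    size_t i = 0;
    while (str[i] != '\0') {
        /* ...do stuff with str[i]... */
        i++;
    }

Notice the difference? Here '\0' is a conventional end-of-string marker, not the actual end of the data. C has no built-in string type: a string is just a character array, and the terminating '\0' is how most C functions decide where it stops. Anything that reads the data this way stops at the first NUL byte, which in my image came right after the fourth character.

My first suspect was foreach($array as $key => $value), because foreach works on a copy of the array, and I thought foreach($array as $key => &$value) (note the &) might behave differently because it uses a reference to the original data. Strictly speaking, though, foreach is not the culprit: PHP strings store their own length, so foreach hands over every byte, NUL included, with or without the &. The truncation is far more likely to come from a function that is called on the value inside the loop and is not binary safe.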
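Here is a minimal sketch of how to check where the bytes go missing. The array contents and the cStyleView() helper are hypothetical, made up for illustration; the helper only imitates what a null-terminated reader would see and is not a real PHP function.

```php
<?php
// Hypothetical helper: returns the part of $bytes a null-terminated
// (C-style) reader would see, i.e. everything before the first NUL byte.
function cStyleView($bytes) {
    $end = strpos($bytes, "\0");
    return $end === false ? $bytes : substr($bytes, 0, $end);
}

// Hypothetical row: the column names and bytes are made up.
$row = array(
    'name'  => 'logo.gif',
    'image' => "GIF8" . chr(0) . chr(0) . "binary" . chr(0) . "data",
);

$fullLengths = array();
$cLengths = array();
foreach ($row as $key => $value) {
    // strlen() counts every byte and does not stop at a NUL byte.
    $fullLengths[$key] = strlen($value);
    $cLengths[$key] = strlen(cStyleView($value));
}

echo $fullLengths['image'], "\n"; // 17
echo $cLengths['image'], "\n";    // 4
```

If the length printed inside the loop still matches the original, the data survived foreach, and the step to inspect is whatever the value gets passed to next.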
Posted in Coding Techniques, PHP by Dave Poole