Only yesterday I was debating post titles with Niki Van Cleemput, claiming that people of all sorts should be able to fully comprehend what a post is about by reading its title. He rebutted that a bit by saying he isn’t targeting an audience like that; he’s focused on people who are already deep into the subject. Both are true I guess, and they do not mutually exclude each other. I kinda target an audience that needs a push here and there, so, rudely spoken, people who know less than I do at the moment. I just want to return the favour after 26 years in computers, getting help from others who took the time to figure stuff out with me.
This article is a prequel to another one, which will cover storing those uploaded binary files in a CouchDB backend. These files reside on remote machines and will be uploaded to the target server, which will stock them, create a CRC code and use that to store them uniquely in a CouchDB database. I was about to post one big article when I realised that how to upload files with a PHP Curl client program isn’t really easy to find and deserves its own post.
Enable file uploads for PHP
I’m using nginx (of course), so this means I have php-fpm running. Hence any change that I have to make to PHP will NOT be in the cli nor the apache2 subdirectories on the Debian family of supreme operating systems.
drwxr-xr-x 82 root root 4096 2011-11-11 14:02 ../
drwxr-xr-x 2 root root 4096 2011-11-08 15:17 apache2/
drwxr-xr-x 2 root root 4096 2011-11-08 17:16 cgi/
drwxr-xr-x 2 root root 4096 2011-11-08 17:19 cli/
drwxr-xr-x 2 root root 4096 2011-11-08 15:18 conf.d/
drwxr-xr-x 2 root root 4096 2011-11-08 17:17 fpm/
root@cartman:/etc/php5#
As you can see, it’s in the fpm subdirectory. How handy is that! If that doesn’t work for you for some weird reason, try the cgi one next. Search for the section called File Uploads in the php.ini file in that subdirectory. Check that the parameters allow you to upload a file of the size you will be trying, AND check that the target temp directory exists, is writable by the nginx user (www-data) and is set in the config. Glance over the other settings too.
; File Uploads ;
;;;;;;;;;;;;;;;;
; Whether to allow HTTP file uploads.
; http://php.net/file-uploads
file_uploads = On
; Temporary directory for HTTP uploaded files (will use system default if not
; specified).
; http://php.net/upload-tmp-dir
upload_tmp_dir = /var/www/nginx-default/uploads
; Maximum allowed size for uploaded files.
; http://php.net/upload-max-filesize
upload_max_filesize = 2M
; Maximum number of files that can be uploaded via a single request
max_file_uploads = 20
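Before going further it can help to confirm that the settings above are usable. What follows is a minimal sketch and not part of the original setup: the function name check_upload_settings() is made up for illustration, and it only inspects the values you hand it, so it should be fed ini_get() results from a script that is served through php-fpm (the cli has its own php.ini).

```php
<?php
// Hypothetical helper: returns a list of human-readable problems
// found in a set of upload-related settings.
function check_upload_settings(array $settings) {
    $problems = array();
    if (empty($settings['file_uploads'])) {
        $problems[] = 'file_uploads is off';
    }
    $tmp = $settings['upload_tmp_dir'];
    if ($tmp === '' || $tmp === false || $tmp === null) {
        // An empty value is legal: PHP falls back to the system default.
        $tmp = sys_get_temp_dir();
    }
    if (!is_dir($tmp)) {
        $problems[] = "upload_tmp_dir '$tmp' does not exist";
    } elseif (!is_writable($tmp)) {
        $problems[] = "upload_tmp_dir '$tmp' is not writable by this user";
    }
    return $problems;
}

// Typical use, from a script requested through the web server:
$problems = check_upload_settings(array(
    'file_uploads'   => ini_get('file_uploads'),
    'upload_tmp_dir' => ini_get('upload_tmp_dir'),
));
echo $problems ? implode("\n", $problems) . "\n" : "Upload settings look usable\n";
```

An empty list means the two checks passed for the user the script runs as; it says nothing about size limits, which still have to be compared by eye.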
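You will also want to protect that directory, and the incoming directory created further down, since they sit under the webroot. Depending on the level of protection you want, you can set one of these in nginx.conf inside a server section:
# location matches the request URI, not the filesystem path;
# this assumes the server root is /var/www/nginx-default
location /incoming/ {
autoindex off;
}
# or you can totally deny the client access to it (and I suggest you do so);
# repeat for /uploads/ if your upload_tmp_dir lives under the webroot too
location /incoming/ {
deny all;
access_log off;
log_not_found off;
}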
If that is all OK, restart php-fpm and you are ready to create a script that will accept those files. When I test stuff I always try not to customize too much with subdirs and other things that might screw up my work and force me to focus on tangents instead of my actual goal. Later on, I think about deploying it decently. Good code is easy to throw around.
So, just navigate to the nginx-default webroot and create the scripts we use here, plus the directories (incoming, uploaded, etc.), in there:
# mkdir incoming
# mkdir uploaded
Pay attention to file permissions too.
Create php code to accept the files
I want to make one thing clear first: I do NOT support putting make-up into code at all. In fact, what we are doing here is creating an API between CouchDB and a client. The API here will be in PHP, and so will the client, since I will be using PHP Curl. That is not a requirement; the client can be anything. I’m just working on some things that require this, so this approach will be used in real life, although with more detailed, proper coding.
The reason there’s HTML inside my PHP file this time is debugging. The upload mechanism inside PHP is kinda strange to comprehend for a newbie, and even an experienced programmer who never focussed on this before will probably raise an eyebrow figuring out where in the hell that uploaded file went. Also, if you haven’t actually written that curl client, you can’t test with it yet. So at first we will use our browser to test the backend script; once we know that works, we will dig into the client. In the last phase, once the files are able to arrive at the destination, we will introduce them into CouchDB. That sure sounds like a plan, perhaps not how others would do it, but having a plan is a good thing. If you fail to plan, your plan will fail.
So let’s create this backend. We will call it upload.php since we feel so inspired today. Since this is no pre-school class, I’m not going to explain this line by line, but I will point out later where you need to pay attention.
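<?php
error_reporting(E_ALL);
ini_set('memory_limit', '20M');
// upload_max_filesize (a per-file limit) and post_max_size are PHP_INI_PERDIR settings,
// so ini_set() cannot change them at runtime. Set them in php.ini instead,
// e.g. upload_max_filesize = 1M and post_max_size = 2M.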
$allowed_extensions = array("txt","csv","htm","html","xml", "css","doc","xls","rtf","ppt","pdf",
"swf","flv","avi", "wmv","mov","jpg","jpeg","gif","png");
$target_dir = "incoming";
// so we don't have to apply makeup
echo "<pre>";
// Just fix that silly $_FILES layout and never look back
fix_files_superglobal();
/* I will not be handing out too much information to the end user about the upload progress, if it fails I want
that to happen silently, which is why we can nest these checks like below without else's
*/
if(!empty($_FILES)) {
foreach ($_FILES as $file) {
echo sprintf("\n");
// always double check web input
if (strlen($file['tmp_name']) > 0 and $file['error'] === UPLOAD_ERR_OK) {
// I just like shorter notations
$source_file = $file["tmp_name"];
$target_file = $file["name"];
$file_size = $file["size"];
echo sprintf("Uploaded file '%s' accepted\n",$target_file);
// display_filesize() expects a size in bytes
echo "Size was: " . display_filesize($file_size) . "\n";
// Extra security step, see the PHP manual for these functions!
if (is_uploaded_file($source_file)) {
echo sprintf("File comes from a POST operation, that is ok\n");
// Find the last dot in the name (strrpos returns false when there is none)
$dot_pos = strrpos($target_file, '.');
// Strip the extension; one could alternatively use preg_split or even pathinfo here
$file_basename = ($dot_pos === false) ? $target_file : substr($target_file, 0, $dot_pos);
// Lower-cased so that .JPG matches the list above; empty when the name has no dot
$file_ext = ($dot_pos === false) ? '' : strtolower(substr($target_file, $dot_pos + 1));
// Alternatives:
// $file_ext = pathinfo($target_file, PATHINFO_EXTENSION);
// $file_ext = strrchr($target_file, '.'); // note: the result includes the leading dot
// Do not use explode, it doesn't handle filenames like foobar-1.1.1.tar gracefully
// You definitely don't want to use split(): DEPRECATED as of PHP 5.3.0 and removed in PHP 7.0
// Is there an extension at all? We require one!
if (!empty($file_ext)) {
// Separator first: the reversed argument order is no longer accepted as of PHP 8.0
echo sprintf("Accepted extensions are: %s\n",implode(", ",$allowed_extensions));
// See if this type of file is allowed according to our list above
if (in_array($file_ext, $allowed_extensions)) {
echo sprintf("Extension accepted : %s\n",$file_ext);
// Ok, now we are pretty much sure that about everything is in order.
// Prepend a path to the cleaned target file name
$save_file=$target_dir . DIRECTORY_SEPARATOR . clean_name($target_file);
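if (!file_exists($save_file)) {
echo sprintf("Saving to filename : %s\n",$save_file);
// If you don't call move_uploaded_file() the temporary file is deleted when the request ends
if (!move_uploaded_file($source_file, $save_file)) {
echo sprintf("Could not move the upload to : %s\n",$save_file);
}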
} else {
echo sprintf("Not overwriting existing filename : %s\n",$save_file);
}
}
}
}
}
}
}
echo "\n";
echo "</pre>";
/* Fixes the messed up array doing multiple file uploads using a single array post var like : file[1], file[2] */
function fix_files_superglobal() {
$new_files = array();
foreach($_FILES as $key => $attributes ) {
// echo sprintf("%s => %s", $key , $attributes);
foreach($attributes as $tagname => $tags ) {
// echo sprintf("%s => %s\n", $tagname , $val);
if (is_array($tags)) {
foreach($tags as $file_key => $value ) {
$new_files[$file_key][$tagname] = $value;
}
}
}
}
/* Only copy this back if we have content, when we don't we are dealing with
a single file or form fields not like file[f1], file [f2], but just plain 'file' */
if (!empty($new_files)) {
$_FILES = $new_files;
}
}
function display_filesize($filesize){
if(is_numeric($filesize)){
$decr = 1024; $step = 0;
$prefix = array('Byte','KB','MB','GB','TB','PB');
while(($filesize / $decr) > 0.9){
$filesize = $filesize / $decr;
$step++;
}
return round($filesize,2).' '.$prefix[$step];
} else {
return 'NaN';
}
}
function clean_name ($name) {
/* - remove leading/trailing whitespace,
- convert runs of whitespace to _,
- remove characters other than letters, digits, _ . and - */
return preg_replace(array("/\s+/", "/[^-\.\w]+/"), array("_", ""), trim($name));
}
?>
<html>
<head>
<title>Testing multiple file upload functions</title>
</head>
<body>
<form action="upload.php" method="post" enctype="multipart/form-data">
<label for="file1">Filename:</label>
<input type="file" name="file1" id="file1" />
<input type="file" name="file2" id="file2" />
<br />
<input type="submit" name="submit" value="Submit" />
</form>
</body>
</html>
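The extension check in upload.php can also be written around pathinfo(), one of the alternatives named in its comments. This is only a sketch under assumptions: has_allowed_extension() is a made-up name, and the short whitelist is a sample rather than the full list used above.

```php
<?php
// Hypothetical helper: true when $filename has an extension that appears in $allowed.
function has_allowed_extension($filename, array $allowed) {
    // pathinfo() returns the text after the last dot, or '' when there is no dot.
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    return $ext !== '' && in_array($ext, $allowed, true);
}

$allowed = array('txt', 'csv', 'pdf', 'jpg');
var_dump(has_allowed_extension('foobar-1.1.1.txt', $allowed)); // bool(true)
var_dump(has_allowed_extension('PHOTO.JPG', $allowed));        // bool(true)
var_dump(has_allowed_extension('README', $allowed));           // bool(false)
var_dump(has_allowed_extension('shell.php', $allowed));        // bool(false)
```

Keep in mind that an extension only describes the name the client chose to send; it says nothing about what is actually inside the file.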
If you happen to see an error like the one below, you have a permission problem or you are trying to move a file from outside the webserver’s directories. Make sure you keep the directories involved inside the document root and not outside it; this is very hard to figure out if you don’t. The directories should be owned by the user running nginx, by default www-data.
failed to open stream: Permission denied in ....
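Once that is all set up the backend should work and we can start creating the curl client. This one is actually really easy. Notice the @ before the filename; that one is golden, as it makes curl send the file content instead of the literal string. Be aware that the @ prefix was deprecated in PHP 5.5 and removed in PHP 7.0, where the CURLFile class takes its place.
<?php
// same as <input type="file" name="file[one]"> in the frontend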
$post = array(
"file[one]"=>"@file1.ext",
"file[two]"=>"@file2.ext"
);
$url="http://cartman/upload.php";
/* Do it with curl */
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_VERBOSE, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible;)");
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
// Now exec it
$server_output = curl_exec($ch);
$curlinfo = curl_getinfo($ch);
// This is great for debugging: this way you can see everything the backend script pumps out
print_r($server_output);
// Check this if you want info concerning the transfer
print_r($curlinfo);
curl_close($ch);
?>
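On PHP 5.5 and later the same client can be written with CURLFile objects instead of "@" strings. The sketch below is not the code used in this article: build_upload_fields() and upload_files() are made-up names, and the MIME type is simply hard-coded to application/octet-stream.

```php
<?php
// Hypothetical helper: turns array('one' => '/path/a.txt') into POST fields
// named file[one], each carrying the file content.
function build_upload_fields(array $paths) {
    $fields = array();
    foreach ($paths as $key => $path) {
        if (!is_readable($path)) {
            throw new InvalidArgumentException("Cannot read $path");
        }
        // CURLFile (PHP >= 5.5) replaces the "@/path" string prefix.
        $fields["file[$key]"] = new CURLFile($path, 'application/octet-stream', basename($path));
    }
    return $fields;
}

// Hypothetical wrapper: posts the files and returns the response body.
function upload_files($url, array $paths) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_POST, true);
    // Passing an array makes curl send multipart/form-data.
    curl_setopt($ch, CURLOPT_POSTFIELDS, build_upload_fields($paths));
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Upload failed: ' . curl_error($ch));
    }
    // The handle is released when $ch goes out of scope.
    return $body;
}

// Usage, assuming the backend from this article and two local files:
// echo upload_files('http://cartman/upload.php', array('one' => 'file1.ext', 'two' => 'file2.ext'));
```

Splitting the field building from the transfer keeps the part that touches the network small, so the field names can be checked without a server.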
By now this should work; send it 2 files to test it. My uploads are working at this point.
$_FILES is flawed
Another thing you need to be aware of: when you use square brackets in the name attribute of your form inputs to group uploads, you are screwed. $_FILES then creates one entry per field name, in which each file property (name, type, tmp_name, error, size) holds an array indexed by your keys, instead of one array of properties per uploaded file.
Like this:
file[1] = "file2.ext"
will create a $_FILES array in the receiving php script as
(
[file] => Array
(
[name] => Array
(
[0] => file1.ext
[1] => file2.ext
)
[type] => Array
(
[0] => application/octet-stream
[1] => application/octet-stream
)
[tmp_name] => Array
(
[0] => /var/www/nginx-default/uploads/phpZSKUdw
[1] => /var/www/nginx-default/uploads/phpbbE4j9
)
[error] => Array
(
[0] => 0
[1] => 0
)
[size] => Array
(
[0] => 374
[1] => 374
)
)
)
It doesn’t take a genius to figure out that this layout is screwed and totally illogical. We would rather have:
(
[file1] => Array
(
[name] => file1.ext
[type] => application/octet-stream
[tmp_name] => /var/www/nginx-default/uploads/phpeGWPpg
[error] => 0
[size] => 374
)
[file2] => Array
(
[name] => file2.ext
[type] => application/octet-stream
[tmp_name] => /var/www/nginx-default/uploads/phpeGWPpg
[error] => 0
[size] => 374
)
)
The function fix_files_superglobal() was created for that. I’ve seen tons of complicated approaches to handling this conversion, and I wasn’t satisfied with what I could use from others, so I wrote this one, which I find more bullet-proof than most. Note that it only handles one level of square brackets.
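To see the conversion in isolation, here is a sketch of a variant that takes the array as a parameter instead of rewriting $_FILES in place. It is not the function from upload.php: normalize_files() is a made-up name, the temp paths in the sample are invented, and prefixing each key with the field name is my own assumption to keep two grouped fields from colliding.

```php
<?php
// Hypothetical pure variant: takes a $_FILES-shaped array and returns one entry per file.
function normalize_files(array $files) {
    $normalized = array();
    foreach ($files as $field => $attributes) {
        if (!is_array($attributes['name'])) {
            // Plain field such as name="file1": already one file per entry.
            $normalized[$field] = $attributes;
            continue;
        }
        // Grouped field such as name="file[one]":
        // transpose attribute => key into key => attribute.
        foreach ($attributes as $attribute => $values) {
            foreach ($values as $key => $value) {
                $normalized[$field . '_' . $key][$attribute] = $value;
            }
        }
    }
    return $normalized;
}

// Sample input shaped like the dump above (paths are made up).
$sample = array(
    'file' => array(
        'name'     => array('one' => 'file1.ext', 'two' => 'file2.ext'),
        'type'     => array('one' => 'application/octet-stream', 'two' => 'application/octet-stream'),
        'tmp_name' => array('one' => '/tmp/phpAAAAAA', 'two' => '/tmp/phpBBBBBB'),
        'error'    => array('one' => 0, 'two' => 0),
        'size'     => array('one' => 374, 'two' => 374),
    ),
);
print_r(normalize_files($sample));
```

The result has the keys file_one and file_two, each holding name, type, tmp_name, error and size for a single file. Like the original, this sketch only understands one level of square brackets.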
I hope this was helpful for someone.