Large file upload times out, but still uploads

We’re using OnDemand 3.1.14 and have set up /etc/ood/config/nginx_stage.yml with

nginx_file_upload_max: '42949680000' and /etc/ood/config/apps/dashboard/env with FILE_UPLOAD_MAX='42949680000'. When I upload a file larger than about 22GB (anything smaller works fine), I get an error that the file failed to upload. I can watch /var/tmp receive the data, then see Rack start to copy it over to /tmp, and then the upload fails and things clean up.

I found this discussion:

I changed Timeout and ProxyTimeout to 900 in the Apache config /etc/httpd/conf.d/ood-portal.conf and restarted httpd. I still get the failure dialog:

The file does in fact upload and show up in the directory I told it to. I’ve run md5sum on both the source and destination files and they match, so the entire file is uploading correctly even though OOD reports that it failed.
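For anyone who wants to repeat that check without md5sum, here is a minimal Python sketch of the same comparison. The function names and the 1 MiB chunk size are my own choices for illustration, not anything OOD provides; reading in chunks is there so that a 20GB+ file does not have to fit in memory.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (EOF)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def uploads_match(source, destination):
    """Return True when both files have the same MD5 digest."""
    return file_md5(source) == file_md5(destination)
```

Calling uploads_match() with the local source path and the uploaded copy's path (both paths are whatever applies on your system) returns True when the two files are byte-identical as far as MD5 can tell.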

So I am guessing there is some other timeout I need to increase to match the Apache Timeout and ProxyTimeout setting of 900, likely one inside NGINX that tells it to wait longer before reporting the upload as failed, but I can’t find any information on that.

Suggestions? Thanks!

Sorry you’re running into this.

Uploads in OOD aren’t really designed for very big files, so once a file reaches or exceeds roughly 20GB, things can get unreliable because of how Passenger/Rack and temporary storage are handled during the transfer.

We recommend using a dedicated file transfer solution (like Globus) for data that big. OOD integrates with Globus and other providers for exactly this reason, since those tools are built to reliably handle very big transfers with better integrity checking of the data.

For smaller files, tweaking Apache/NGINX timeouts can help, but for >20GB transfers the best path is very likely a tool made specifically for that.

You can find the entry for Globus integration on this page (just ctrl-f for globus):

Thanks for providing that information. It’s good to have an upper size limit for the portal transfers. For anything larger than that we’ll direct our users to an alternate solution.