I’m trying to upload a file of 4 GB or larger, and it fails.
I’ve updated nginx_file_upload_max: ‘12884901888’ (12 GiB).
I remapped pun_tmp_root to the same filesystem that the files are being uploaded to:
pun_tmp_root: ‘/home/ondemand/%{user}’
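For anyone double-checking the byte value used for nginx_file_upload_max, here is a small sketch of the conversion. The helper name `gib_to_bytes` is made up for illustration, and it assumes the limit is meant in binary units (1 GiB = 1024**3 bytes).

```python
def gib_to_bytes(gib):
    """Convert gibibytes (1024**3 bytes each) to a byte count."""
    return gib * 1024 ** 3


# 12 GiB, the value set for nginx_file_upload_max in this thread.
print(gib_to_bytes(12))
```

The same helper gives 10737418240 for a 10 GiB limit.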
Watching the upload, I see the temp file grow to the size of the upload, and then a copy operation starts copying it to its intended target. Once the file size reaches around 1.7 GB, the copy stops, the original temp file zeros out, and then it starts growing again.
Any thoughts or suggestions on how to correct this? We want to allow our users to upload 10 GB files.
Example of it looping:
total 3.8G
drwx------. 2 jwaters root 1 Dec 2 13:30 .
drwxr-xr-x. 7 root root 5 Dec 2 12:15 …
-rw-------. 1 jwaters jwaters 3.8G Dec 2 13:32 .nfsb54965d4c24057f900000014
total 3914121
drwx------. 2 jwaters root 1 Dec 2 13:30 .
drwxr-xr-x. 7 root root 5 Dec 2 12:15 …
-rw-------. 1 jwaters jwaters 4008058880 Dec 2 13:32 .nfsb54965d4c24057f900000014
total 3.8G
drwx------. 2 jwaters root 1 Dec 2 13:30 .
drwxr-xr-x. 7 root root 5 Dec 2 12:15 …
-rw-------. 1 jwaters jwaters 3.8G Dec 2 13:32 .nfsb54965d4c24057f900000014
total 1
drwx------. 2 jwaters root 1 Dec 2 13:33 .
drwxr-xr-x. 7 root root 5 Dec 2 12:15 …
-rw-------. 1 jwaters jwaters 0 Dec 2 13:33 .nfsfc2b247f427d472c00000015
total 1.0K
drwx------. 2 jwaters root 1 Dec 2 13:33 .
drwxr-xr-x. 7 root root 5 Dec 2 12:15 …
-rw-------. 1 jwaters jwaters 0 Dec 2 13:33 .nfsfc2b247f427d472c00000015
total 1
drwx------. 2 jwaters root 1 Dec 2 13:33 .
drwxr-xr-x. 7 root root 5 Dec 2 12:15 …
-rw-------. 1 jwaters jwaters 0 Dec 2 13:33 .nfsfc2b247f427d472c00000015
total 1.0K
drwx------. 2 jwaters root 1 Dec 2 13:33 .
drwxr-xr-x. 7 root root 5 Dec 2 12:15 …
-rw-------. 1 jwaters jwaters 0 Dec 2 13:33 .nfsfc2b247f427d472c00000015
total 146497
drwx------. 2 jwaters root 1 Dec 2 13:33 .
drwxr-xr-x. 7 root root 5 Dec 2 12:15 …
-rw-------. 1 jwaters jwaters 150011904 Dec 2 13:33 .nfsfc2b247f427d472c00000015
total 144M
drwx------. 2 jwaters root 1 Dec 2 13:33 .
drwxr-xr-x. 7 root root 5 Dec 2 12:15 …
-rw-------. 1 jwaters jwaters 144M Dec 2 13:33 .nfsfc2b247f427d472c00000015
total 393217
drwx------. 2 jwaters root 1 Dec 2 13:33 .
drwxr-xr-x. 7 root root 5 Dec 2 12:15 …
-rw-------. 1 jwaters jwaters 381878272 Dec 2 13:33 .nfsfc2b247f427d472c00000015
total 385M
drwx------. 2 jwaters root 1 Dec 2 13:33 .
drwxr-xr-x. 7 root root 5 Dec 2 12:15 …
-rw-------. 1 jwaters jwaters 365M Dec 2 13:33 .nfsfc2b247f427d472c00000015
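The listings above were captured by hand. A rough sketch of automating the same observation is below: it polls a directory and reports any file that shrank or disappeared between polls, which is the "zeros out and starts again" pattern. The function names, the polling interval, and the number of rounds are all assumptions for illustration, not anything provided by Open OnDemand.

```python
import os
import time


def snapshot_sizes(directory):
    """Return {file name: size in bytes} for regular files in `directory`."""
    sizes = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            try:
                if entry.is_file(follow_symlinks=False):
                    sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
            except FileNotFoundError:
                # The file vanished between listing and stat; temp files come and go.
                continue
    return sizes


def find_resets(previous, current):
    """Return names that shrank or disappeared between two snapshots."""
    resets = []
    for name, old_size in previous.items():
        new_size = current.get(name)
        if new_size is None or new_size < old_size:
            resets.append(name)
    return sorted(resets)


def watch(directory, interval_seconds=1.0, rounds=10):
    """Poll `directory` a fixed number of times and print any resets seen."""
    previous = snapshot_sizes(directory)
    for _ in range(rounds):
        time.sleep(interval_seconds)
        current = snapshot_sizes(directory)
        for name in find_resets(previous, current):
            print(f"{name}: shrank or disappeared (was {previous[name]} bytes)")
        previous = current
```

Calling `watch()` on the per-user temp directory while an upload runs should print a line each time the temp file is truncated or replaced.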
Can you please check whether the following settings fit your needs? The values below are the defaults, so you will need to adjust them. If you can check these nginx settings, I’ll take a look at this issue in the morning. It sounds like a connection timeout or a client_max_body_size setting that is too small.
client_max_body_size <%= nginx_file_upload_max %>;
proxy_connect_timeout 600;
proxy_send_timeout 600;
proxy_read_timeout 600;
#send_timeout 600; # nginx said it is a duplicate; not sure where it is set
client_header_timeout 3m;
client_body_timeout 3m;
#client_max_body_size 5M; # defined above
I put them in to force the settings, with no change. I’ll start testing the client_body/client_header settings and see what happens.
To make sure I understand what you did: it looks like you used the values that I sent you.
You will want to change those values to reflect what you need. The timeout values are in seconds, so you may want to try 3600 (1 hour) for the timeouts and work backwards from there.
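To pick a starting point for the timeouts, it can help to estimate how long the upload itself should take. This is only a back-of-the-envelope sketch: `upload_seconds` is a hypothetical helper, the link speeds are assumed examples rather than measurements from this site, and it ignores protocol overhead and the copy to the target filesystem.

```python
def upload_seconds(size_bytes, megabits_per_second):
    """Estimate transfer time in seconds at a sustained link speed."""
    if megabits_per_second <= 0:
        raise ValueError("megabits_per_second must be positive")
    return size_bytes * 8 / (megabits_per_second * 1_000_000)


ten_gib = 10 * 1024 ** 3
for speed in (100, 1000):  # assumed link speeds in Mbit/s
    print(f"{speed} Mbit/s: about {upload_seconds(ten_gib, speed):.0f} s")
```

If the estimate is well under the configured timeout and the upload still fails, the timeouts are less likely to be the cause.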
I set them to their defaults and then bumped them all the way up to 7200 (2 hours) to see if it would make a difference, with no joy.
We are running v2.0.18
on CentOS 7, with SELinux set to permissive.
A 3 GB file uploads, and I can see the file handle growing:
nginx 29708 jwaters 10u REG 0,39 2753312798 14609299515652899220 /home/ondemand/jwaters/client_body/0000000001 (deleted)
Once the file upload hits 100%, the UI says the upload failed.
The move still runs anyway:
cd /home/corvid/jwaters/tmp/upload_test
ls -la *tgz; ls -lah *tgz
-rw-------. 1 jwaters jwaters 982843392 Dec 3 10:04 parallel_studio_xe_2020_cluster_edition.tgz
-rw-------. 1 jwaters jwaters 993M Dec 3 10:04 parallel_studio_xe_2020_cluster_edition.tgz
The move completes. I ran sha1sum on the files, and they match.
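The checksum comparison can also be scripted. The sketch below uses Python's standard hashlib and reads in chunks so that multi-gigabyte files do not have to fit in memory. The function names and the chunk size are illustrative choices, not part of any tool mentioned here.

```python
import hashlib


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-1 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(source_path, uploaded_path):
    """Return True when both files have the same SHA-1 digest."""
    return sha1_of_file(source_path) == sha1_of_file(uploaded_path)
```

`files_match(local_copy, uploaded_copy)` returning True corresponds to the matching sha1sum output described above.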
This seems to be more an issue with the web UI and how it handles a trapped error.
The UI is still failing on files over 3 GB uploaded to a distributed filesystem (BeeGFS kernel mount or NFS mount).
Are there any known issues when using a distributed filesystem or an NFS mount for storage?