S3Drive
Community / support / [Win10] Uploading a huge folder causes absurd memory usage
Erwan
I have a folder containing roughly 4,000 files, about 250 GB in total, with individual file sizes ranging from 16 KB to 4 GB. If I upload it through S3Drive's "Upload folder" menu, memory usage skyrockets to more than 32 GB, putting my computer under severe memory pressure and growing pagefile.sys to as much as 80 GB. I reduced the number of upload workers to 2, but the problem persists. I cannot upload this folder to S3 through S3Drive.
Tom
Hi @Erwan, thanks for your feedback. I assume you're using the Upload folder functionality, or perhaps drag & drop? As an alternative/workaround, you can use Sync in copy mode: select the local folder as the source path and the destination folder at the remote end. In principle this should let you upload that folder while we dig into the memory usage issues of the Upload folder functionality.
I reduced the number of upload workers to 2, but the problem persists.
With that setting, how many uploads do you see running in parallel in the Transfers tab?
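This is not how S3Drive implements uploads; it is only a rough sketch of what the Sync (copy mode) workaround achieves: uploading a folder one file at a time, with an explicit part size and a capped number of in-flight parts, so buffered memory stays around max_concurrency x part size per file. The bucket name and paths below are placeholders, and boto3 is used purely for illustration:

```python
# Illustrative only: upload a local folder to S3 one file at a time,
# with multipart concurrency and part size bounded explicitly.
import os
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")  # credentials/endpoint come from your environment

config = TransferConfig(
    multipart_threshold=16 * 1024 * 1024,   # files above 16 MiB use multipart
    multipart_chunksize=100 * 1024 * 1024,  # ~100 MB parts (provider-dependent)
    max_concurrency=2,                      # at most 2 parts in flight per file
)

def upload_folder(local_root: str, bucket: str, prefix: str) -> None:
    for dirpath, _, filenames in os.walk(local_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            key = prefix + os.path.relpath(path, local_root).replace(os.sep, "/")
            s3.upload_file(path, bucket, key, Config=config)

upload_folder("D:/big-folder", "my-bucket", "backup/")  # placeholder paths
```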
Erwan
With 2 concurrent workers, I see 30 files being transferred.
8:17 PM
Please ignore the low transfer speed, it's caused by my local hard drive.
8:18 PM
Is S3Drive supposed to behave that way? I don't know the implementation details of concurrent workers in this context.
8:19 PM
Here is a screenshot of my profile settings for further context:
8:21 PM
Here the high chunk size is needed so that my S3 provider accepts my object uploads. Some objects can be as large as 100 GB, and my provider has a 1,000-part limit on multipart uploads (MPU).
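To spell out the constraint Erwan mentions (the figures are the ones above, with decimal units assumed for simplicity): the minimum viable part size is the object size divided by the provider's part limit.

```python
# Minimum part size needed so a 100 GB object fits within a 1,000-part
# multipart upload limit (figures taken from the message above).
import math

object_size = 100 * 10**9   # 100 GB object
max_parts = 1_000           # provider's multipart upload limit

min_part_size = math.ceil(object_size / max_parts)
print(min_part_size)        # 100_000_000 bytes, i.e. parts must be at least ~100 MB
```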
Tom (replying to Erwan)
Is S3Drive supposed to behave that way? I don't know the implementation details of concurrent workers in this context.
We've included the Concurrent workers setting recently; it only applies to the number of concurrent multipart-part uploads within a single file upload. Separately, based on the number of files to upload, S3Drive spun up 30 workers to process the upload list. (We don't have a setting yet to limit this, but we may well include one in a future release.) That's maybe too much in general, as it means there may be up to 60 requests going in parallel, that is 30 files multiplied by two workers each (assuming each file is over the Start threshold defined above; for files below that threshold only a single worker is used). Regardless of that setting, I still find it hard to believe that 60 requests would require 32 GB of memory; there must be some issue somewhere in our app.

UPDATE: Ah wait, the big part size might be the culprit. With the current design, each part must fit in memory. If there are 30 file uploads with 2 workers each, that gives a worst case of 30 x 2 x 100 MB of memory usage, though that's still "only" 6 GB. It sounds like we may need to dig into this deeper. It would be interesting to know whether the workaround of creating a Sync entry helps, as that would confirm that something is inherently wrong with our Folder upload logic. (edited)
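The worst-case figure Tom quotes, spelled out (all numbers come from the thread; this assumes each part is buffered fully in memory, as described above):

```python
# Worst-case in-flight buffer memory if every multipart part is held fully in RAM.
files_in_flight = 30   # upload workers spawned for the folder's file list
parts_per_file = 2     # "Concurrent workers" setting
part_size_mb = 100     # configured multipart part size

worst_case_gb = files_in_flight * parts_per_file * part_size_mb / 1000
print(worst_case_gb)   # 6.0 -> roughly 6 GB of buffered parts in the worst case
```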
Erwan
Thank you for the explanation. I will try out the sync feature at a later time, maybe tomorrow evening if I can. If I may suggest, I feel like the Concurrent workers setting might need a tooltip or a better explanation of what it does. Before your message, I understood it as the global maximum number of upload connections to S3. Nonetheless, I would be happy to see a concurrent file upload limit in a later release, if that's not too complicated to implement (I see you have a large number of features being worked on at the moment).
8:32 PM
Please note that the total memory usage might be even higher: I see my system memory maxed out and my C: SSD thrashing hard, which hints that the system is swapping to disk.
8:33 PM
When this issue happened yesterday, my pagefile reached 80 GB. I tried it again today before killing the S3Drive process, and my pagefile only hit 10 GB before my system came to a crawl.
8:34 PM
I stopped it early as I didn't want to reboot my computer to free up the space used by the pagefile.
8:35 PM
I would be happy to provide you with more information, if you need any.
Tom (replying to Erwan)
Nonetheless, I would be happy to see a concurrent file upload limit in a later release
We'll try to squeeze it in. It shouldn't be extremely hard, and it would let users reduce memory hunger and help isolate the problem. For instance, if the issue is still present with a single worker, it may be easier for us to troubleshoot.
If I may suggest, I feel like the Concurrent workers setting might need a tooltip or a better explanation of what it does
You're right, we'll try to address that.
I would be happy to provide you with more information, if you need any.
Thanks, I will get back to you once I know something. Quick question: do you have E2E encryption enabled?
(edited)
👍 1
Erwan
(Ah, I forgot to mention: when S3Drive tries to upload thousands of objects in a single go, the UI freezes very frequently, making it very difficult to abort the upload.) But this is probably a result of the system going OOM. (edited)
Erwan (replying to Tom)
Quick question: do you have E2E encryption enabled?
I do not have E2E encryption enabled.
8:38 PM
All settings are stock, except that concurrent workers are reduced to 2 and the part size is increased to 100 MB.
8:40 PM
I may have had MD5 verification turned on for a brief moment, but I believe this isn't the root cause of the issue.
Tom (replying to Erwan)
(Ah, I forgot to mention: when S3Drive tries to upload thousands of objects in a single go, the UI freezes very frequently, making it very difficult to abort the upload.) But this is probably a result of the system going OOM. (edited)
There is ongoing work to address this issue at least partially, but it's a longer effort. There are multiple causes and we're gradually trying to address them one by one, but it's not entirely straightforward. Basically, whenever we do any processing we need to move the "heavy" work to separate threads, as any processing in the main thread (whilst most convenient from a dev point of view) slows it down, which given our underlying tech makes the UI slow. In most cases these heavy tasks are already moved out to separate threads. The challenge is that when plenty of file uploads are running, all the single-threaded code responsible for preparing uploads, scheduling, transfer updates etc. becomes the culprit itself. Finally, there are parts of the app like Sync and mount where threading works very differently from the rest of the app (it's Golang/Rclone) and we don't always have control over it. Based on your comments it sounds like the UI slowdown actually happened when you used S3Drive's native logic, is that right? Was it Upload folder or Upload files, or both? Thanks for your valuable input so far and I hope we can improve this all up to expectations. (edited)
👍 1
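S3Drive's UI is Flutter/Dart and its Sync/mount side is Golang/Rclone, so the snippet below is only a language-agnostic illustration of the pattern Tom describes: heavy per-file work is handed to a bounded worker pool while the main loop only consumes lightweight completion events. All names here are made up for the example:

```python
# Illustration of the "keep heavy work off the main thread" pattern:
# per-file preparation (e.g. checksumming) runs in a worker pool, while
# the main loop only handles small status updates and stays responsive.
from concurrent.futures import ThreadPoolExecutor, as_completed
import hashlib

def prepare_upload(path: str) -> str:
    """Stand-in for per-file heavy work, here an MD5 checksum."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(block)
    return digest.hexdigest()

def schedule(paths):
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {pool.submit(prepare_upload, p): p for p in paths}
        for done in as_completed(futures):
            # The main loop only receives lightweight completion events here.
            print(futures[done], done.result())
```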
Erwan
The UI slowdown happened during a folder upload through S3Drive's native upload logic. It doesn't happen with mount (and memory usage stays minimal that way, though upload speed isn't as fast). I have a feeling this may be linked to the memory usage issue; I don't know how it is implemented in S3Drive, but building the upload list, rendering it, and moving things between threads may cause some memory overhead.
8:53 PM
I'll conduct more thorough tests tomorrow evening. Thank you for your time so far, I really appreciate your speed and interest in these issues :)
Tom (replying to Erwan)
The UI slowdown happened during a folder upload through S3Drive's native upload logic. It doesn't happen with mount (and memory usage stays minimal that way, though upload speed isn't as fast). I have a feeling this may be linked to the memory usage issue; I don't know how it is implemented in S3Drive, but building the upload list, rendering it, and moving things between threads may cause some memory overhead.
Fair enough, that at least tells us where our focus should be. I have to say (not as an excuse) that it's pretty challenging to provide consistent behaviour across platforms, as the OS underpinnings differ, but we're gradually building the expertise required to handle all these gotchas.
Erwan (replying to Tom)
Fair enough, that at least tells us where our focus should be. I have to say (not as an excuse) that it's pretty challenging to provide consistent behaviour across platforms, as the OS underpinnings differ, but we're gradually building the expertise required to handle all these gotchas.
(An off-topic bit: this reminds me of KDE's 15-Minute Bugs initiative.) I absolutely understand; I've made cross-platform apps with Flutter in the past, I've been there. The second we start talking about cross-platform, we get issues and edge cases like there's no tomorrow.
👍 1
8:56 PM
Not as advanced as S3Drive for sure, but I know how it is.
8:58 PM
I'm quite happy with the product so far, I've even gone as far as purchasing a perpetual license :)
Erwan
----------------------------- An update on the situation: I've tried the sync workaround in copy mode from local to remote, and memory usage stayed minimal that way. That makes me think the problem might be linked to the UI (not 100% certain, but I can confirm on my side that the bug happens only with a folder upload using the native S3Drive method).
Exported 26 message(s)
Timezone: UTC+0