First commercial app, need some advice for storage

Hello fcc forums.

This is the first serious app I want to deploy, serious in that I intend to get paid by visitors for the service it provides.

I need advice on how to store a large amount of files (let's say it can get up to 10GB) that I want the user to download. Can I generate that in the browser, or should I use a cloud storage service?
The files are temporary, meaning they will be deleted after download.

Too little information. Are those video files/streams? Look at video hosting (syntax.fm recommends mux). Big files fetched not too often? AWS, Azure or GCP with edge caching.

Btw, I can recommend a book: https://dataintensive.net/

I’m not sure what you mean by “generate in browser”.

If you need to store files on a server somewhere because they are shared with or available to multiple users, even if only temporarily, you’d probably want some kind of dedicated high-performance “blob” storage, AKA a “bucket”. A prime example in the cloud would be an S3 bucket.

Most cloud providers support some kind of “blob” storage, so I’d shop around. You could also save this data directly to the instance disk on some server, but this isn’t available with all architectures, could get messy, and probably won’t perform as well as dedicated storage.

The 10GB figure could be concerning, or not a big deal, depending on how the app works.

If you’re uploading movies every 2 seconds so someone else can download them, not only is that illegal, but your network usage would go through the roof, as that’s roughly a 10GB movie for every user. Whereas if you’re uploading pictures for some social media app where things disappear (like Snapchat maybe?), you’d go up to 10GB, but then go back down over time. This would mean your network costs are lower, as you “grow” to 10GB more slowly and won’t pay for it all at once.

So as a summary:

  1. Depending on how it works, 10GB could result in other issues
  2. 10GB by itself isn’t a big deal
  3. There are dedicated “bucket” solutions to handle large files or large numbers of files

Costs for this shouldn’t run too high, but there will be costs. So I’d also provide a rough budget. AFAIK there aren’t any free solutions out there that support this sort of scale, so you’d probably be looking for something cheap and limited.

Good luck, keep building, keep learning :+1:

The user will be uploading images, up to 10MB in size, and I return up to 10,000 combinations (so the size I am returning will vary; it can actually surpass 10GB, but I will put a limit).
The files will be up for download for a limited period of time (I will have to check the costs to decide how long).

I don’t expect too much traffic; it would be great if I achieved 500 visits per day.

I am aware there is no free solution for this; I’m just wondering if it can be done. As for whether it can be generated in the browser, I asked because of the download attribute of the <a> tag, as the user will be downloading multiple small files.

I am sorry I am not too clear in my questions, as I am still trying to figure out what options I have before even choosing one (and yes, I should have asked this a month ago before building the front end, didn’t plan too well :grimacing:)

Thank you for helping out.

This sounds like you can “fake it”, and have up to 10k combinations, but only need to “return” a tiny subset of this.

Think of Google results, where millions of hits are returned, but the UI only shows the top 20.

You’d still need to save/hold onto those 10k, but you probably won’t need/use them all the time.
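As a minimal sketch of that idea, and assuming (this is not confirmed anywhere in the thread) that each combination picks exactly one option from each of several “layers” of images, the k-th combination can be computed on demand instead of being generated and stored up front. The names `Layer`, `combinationAt`, and `page` are made up for illustration.

```typescript
// Hypothetical model: each layer is a list of interchangeable options,
// and one combination picks exactly one option per layer.
type Layer = readonly string[];

// The total number of combinations is the product of the layer sizes.
function countCombinations(layers: readonly Layer[]): number {
  return layers.reduce((total, layer) => total * layer.length, 1);
}

// Decode the k-th combination (0-based) by reading k as a mixed-radix number,
// so nothing has to be generated or stored ahead of time.
function combinationAt(layers: readonly Layer[], k: number): string[] {
  const total = countCombinations(layers);
  if (Math.floor(k) !== k || k < 0 || k >= total) {
    throw new RangeError(`index ${k} is outside 0..${total - 1}`);
  }
  const picked: string[] = [];
  let rest = k;
  for (const layer of layers) {
    picked.push(layer[rest % layer.length]);
    rest = Math.floor(rest / layer.length);
  }
  return picked;
}

// Return only one "page" of combinations, like a page of search results.
function page(layers: readonly Layer[], pageIndex: number, pageSize: number): string[][] {
  const total = countCombinations(layers);
  const start = pageIndex * pageSize;
  const end = Math.min(start + pageSize, total);
  const result: string[][] = [];
  for (let k = start; k < end; k++) {
    result.push(combinationAt(layers, k));
  }
  return result;
}
```

With something like this, the UI could call `page(layers, 0, 20)` to preview the first 20 combinations, and the heavy image rendering would only happen for the combinations somebody actually asks for.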

Ah, but the problem is I want the user to download all those files; I already just fake them to show them on the website.
Let’s say the user has up to 10-15MB of data to input. He does some work with it in the app, then downloads the result, which could get very large. Maybe I don’t even need storage for that, as everything can be done in the browser. I am trying to better understand what options I have before committing to a solution that might take some days to build.

The issue with doing this level of work in the browser is that it requires the user’s computer to provide all the RAM needed to handle it. 10GB is in the realm where a large majority of browsers would crash, as the computer won’t have that much, let alone give the user a good experience.

However, if there is 10MB of data for the user and the app returns that level of data, then sure, you can do it in the browser. At that point it becomes less about the data and more about run-time performance. Depending on what you’re doing, 10MB could be easily handled by the browser (load up 10MB of image data), or it could crush run-time performance if done slowly, for example “nodejs go get me 10MB of factorials”, see you at the heat death of the universe!

Overall I think we are still missing what you’re trying to do and are just guessing based on the few data points you’ve given (10MB input, 10GB at some level?).

So it’s tough to give a solid estimate or guess where the pain points would be, since not only is that a wide range, but that data means different things in different contexts. There is also the performance aspect, where you could take large amounts of data, handle it wrong, and end up in a situation where the app just doesn’t work due to performance issues, never mind the available RAM/HD space.

Sorry I was very unclear about what I am trying to do.

The app is just a simple idea I had after receiving multiple requests, from people who are not into coding, to generate a large number of combinations of images. I wanted to simplify it for them and let them do it by themselves in an app. I realize there is already software that can do this, but there is additional functionality that I added that makes the app more appealing to the targeted customer.

So the user starts by inputting some images; the user input will never surpass 10MB no matter how many images he inputs.
I need to return at most 10,000 combinations of those images for the user to download. Because there are so many, the output can get quite large, so what’s the best way to do that?
Maybe I could have made a desktop app and a mobile app, but I have never built anything using Electron, and I’m not too experienced with React Native.
Is the web solution viable? Can it be done with reasonable costs?
Thank you for helping @bradtaniguchi

I think the detail of how you go from 1 image to 10k “combinations to download” is where the biggest concern is. Are you holding only 10k images? Or are you generating those? Generating 10GB of data sounds incredibly CPU+GPU intensive, along with the actual memory concerns of doing so.

There is also the question of what sort of stuff you’re doing to generate/find those combinations. This is probably where specific technical elements matter heavily. “Generating” 10k images in JS would be nearly unusable, whereas using some lower-level libraries wouldn’t be as big of an issue.

The overall idea might be simple, but the execution at this scale won’t be. As I mentioned before, due to the amount of data we are talking about, there are plenty of ways to do this slightly wrong and end up with an unworkable app. The storage concerns are only a small part of this.

If you have reservations about what you’re doing, odds are you are missing some key parts of what you need to do and it won’t actually work out.

We are still missing the details of what you’re actually trying to do beyond some of the technical aspects. The technical requirements are usually built around the “functional requirements” (what you want the app to do for end-users), which are only vaguely described. Some example questions one would have are the following:

  1. What images are returned from the image(s) I upload? Why that many?
  2. Why would I want 10gb of data to download?
  3. How long would this process take?
  4. Why would I want to use this app?
  5. How often would this be used? By whom?

It’s possible you’re being vague about the project requirements for other reasons (like it being proprietary for example). This is fine, it just makes understanding the problem harder when the problem is vague if you know what I mean. :slight_smile:

No, I don’t hold 10k images all at once, only the images the user inputs.
But I need the user to download a max of 10k images (the maximum size for each image will be approximately 10MB in the worst case).

  1. The images returned are combinations of the input with possibly added effects; it’s the user who wants that many.
  2. This is just the worst-case scenario, and it’s also possible not to download it all in one go.
  3. I don’t have an exact idea until I test it. It’s not a big deal if it takes long; I’m just trying to find the correct way to do it.

I am sorry for being vague; I do it out of caution and only expose the parts relevant to the problem.

I am thinking of triggering a multi download, where each file is downloaded before moving to the next one.

So if you’re generating 10k images in the worst case, you’d have to optimize your code to at least handle the worst case. Image manipulation is a difficult problem in regards to CPU usage. You could always generate the images on the fly and directly “send” them to the user’s browser to download. This would avoid having to create all the images upfront and then have the user download them all.
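A minimal sketch of the “generate on the fly and send” idea could look like the following. Everything here is hypothetical: `renderImage` is only a stand-in that returns placeholder bytes (a real app would composite the uploaded images and apply effects there), and `send` stands for whatever actually delivers the file, such as writing to an HTTP response or triggering a browser download.

```typescript
// Hypothetical stand-in for the real renderer: it only returns placeholder
// bytes so the flow can be shown without any image library.
function renderImage(index: number, bytesPerImage: number): Uint8Array {
  const data = new Uint8Array(bytesPerImage);
  if (bytesPerImage > 0) data[0] = index % 256; // tag so each image differs
  return data;
}

// Whatever delivers one finished file to the user (an assumption).
type Send = (name: string, data: Uint8Array) => void;

// Render one image, hand it to `send`, and only then render the next one,
// so this loop never holds more than one image at a time.
// Returns the total number of bytes handed over.
function streamImages(count: number, bytesPerImage: number, send: Send): number {
  let totalBytes = 0;
  for (let i = 0; i < count; i++) {
    const data = renderImage(i, bytesPerImage);
    send(`combination-${i + 1}.png`, data);
    totalBytes += data.byteLength;
  }
  return totalBytes;
}
```

The point of this shape is that memory use depends on the size of one image rather than on the size of the whole batch, provided `send` does not keep its own copy of every file.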

However, the problem you’re describing is the sort of problem that could be solved in a multitude of ways that result in a worthless user experience.

Something as simple as using just JavaScript on the client side could result in the browser crashing, or taking way too long to render each image. JS is fast, but JS is single threaded; if it’s doing complex CPU calculations, such as applying filters to an image, the app will literally freeze until it’s done.
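One common way to soften the freezing (it does not make the total work any faster) is to process the images in small batches and give control back to the event loop between batches. The sketch below assumes a hypothetical `work(index)` callback that renders a single image; the batch size is something you would have to tune by measuring.

```typescript
// Split `total` items into [start, end) ranges of at most `batchSize` items.
function planBatches(total: number, batchSize: number): Array<[number, number]> {
  if (!(batchSize >= 1)) throw new RangeError("batchSize must be at least 1");
  const batches: Array<[number, number]> = [];
  for (let start = 0; start < total; start += batchSize) {
    batches.push([start, Math.min(start + batchSize, total)]);
  }
  return batches;
}

// Wait for the next timer tick so pending events (clicks, repaints) can run.
function yieldToEventLoop(): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, 0));
}

// Run `work(i)` for every item, pausing between batches.
// `onProgress` is optional and receives the number of items finished so far.
async function processInBatches(
  total: number,
  batchSize: number,
  work: (index: number) => void,
  onProgress?: (done: number) => void
): Promise<void> {
  for (const [start, end] of planBatches(total, batchSize)) {
    for (let i = start; i < end; i++) work(i);
    if (onProgress) onProgress(end);
    await yieldToEventLoop();
  }
}
```

Each individual batch still blocks while it runs, so a single slow image would still cause a stall. Moving the work off the main thread with a Web Worker is the other usual option and would be worth comparing.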

Not only that, you could see runtimes in the hours, and that’s if it doesn’t crash due to a memory leak. I’m not sure how you’d apply your filters, or if the way you’re planning to do it would be supported on the client side, but it would be something to test to make sure it scales to the average and worst-case scenarios.
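To put rough numbers on the worst case mentioned in this thread (10,000 images at up to 10MB each), here is a small back-of-the-envelope sketch. The per-image processing time is purely an assumption for illustration; the real figure would have to be measured.

```typescript
// Worst-case figures taken from the thread.
const MAX_IMAGES = 10_000;
const MAX_IMAGE_MB = 10;

// Total output size in gigabytes (decimal units: 1 GB = 1000 MB).
function totalSizeGb(images: number, mbPerImage: number): number {
  return (images * mbPerImage) / 1000;
}

// Total time in hours if images are processed one after another.
// `secondsPerImage` is an assumed cost, not a measured one.
function totalHours(images: number, secondsPerImage: number): number {
  return (images * secondsPerImage) / 3600;
}

console.log(totalSizeGb(MAX_IMAGES, MAX_IMAGE_MB)); // size of the full worst-case batch
console.log(totalHours(MAX_IMAGES, 1));             // hours at an assumed 1 s per image
```

With these inputs the full worst case comes to 100 GB of output rather than 10 GB, and at an assumed 1 second per image the run takes a bit under 3 hours, which is why a cap on the number or size of outputs matters so much here.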

I could see these sorts of problems being more manageable on the back-end, where you could use a faster run-time than nodejs (which is also single threaded). It’s one thing to freeze up the client side for hours on end; it’s another to freeze your back-end server for hours on end. Using a whole other back-end stack, or delegating the complex CPU work to another run-time, might be the only option.

This again depends on how you’re planning on applying the manipulations themselves. There are some workarounds for doing this on the client side only (WebAssembly anyone?), but I’d verify that route works before trying to scale up.
