React - setting up a project with create-react-app

Hi,

I’m currently diving deep into React (the first framework I’ve tried btw) and I’m setting up the environment with create-react-app. Having only coded in plain JavaScript so far, I’m used to my project folders rarely being larger than 100kb, however with React, they are now about 150MB, due to the fat node_modules folder.

As I tend to create tons of small projects when I’m learning something, I was wondering if I really need all that stuff. Is it because I’m going the fast/easy way with create-react-app? Is there a better way to set up a project?

Yes, as the old saying goes, “Node modules are the heaviest objects in the universe.” For better or worse, that is how Node does things.

I’m used to my project folders rarely being larger than 100kb, however with React, they are now about 150MB, due to the fat node_modules folder.

Are you worried about storage space on your computer or are you worried about the size of the eventual app? Most modern computers won’t even notice a project that size, even several. As to the final size, the node modules won’t go into that, just what the app ends up needing. You’ll do a build step where things will get removed and minimized.

As I tend to create tons of small projects when I’m learning something, I was wondering if I really need all that stuff.

Yes. It puts it there because it needs it, at least in the development stage.

Is it because I’m going the fast/easy way with create-react-app? Is there a better way to set up a project?

I doubt it would make much of a difference. You could probably leave off a few dev packages if you did it manually. There might be a leaner way, but I wouldn’t worry about it. Things like Webpack make your life a lot easier.

As I tend to create tons of small projects when I’m learning something, …

Well, again, most modern computers won’t even blink at this. But you can always delete your node_modules from any project you don’t need and easily restore them with an npm install whenever you need it again.
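If you want to see which projects are worth cleaning up first, a small Node script can list every node_modules folder and its size. This is only a rough sketch: the function names (`dirSize`, `findNodeModules`, `report`) are made up for illustration, it only reports and never deletes anything, and it assumes you can read every folder under the root you give it.

```javascript
// Dry-run sketch: report node_modules folders under a root, largest first.
const fs = require("node:fs");
const path = require("node:path");

// Total size in bytes of all regular files below dir (symlinks are not followed).
function dirSize(dir) {
  let total = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) total += dirSize(full);
    else if (entry.isFile()) total += fs.statSync(full).size;
  }
  return total;
}

// Collect every node_modules directory below root, without descending into one.
function findNodeModules(root) {
  const found = [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const full = path.join(root, entry.name);
    if (entry.name === "node_modules") found.push(full);
    else found.push(...findNodeModules(full));
  }
  return found;
}

// Hypothetical helper: one row per node_modules folder, sorted by size.
function report(root) {
  return findNodeModules(root)
    .map((dir) => ({ dir, megabytes: dirSize(dir) / (1024 * 1024) }))
    .sort((a, b) => b.megabytes - a.megabytes);
}

// Example call (the path is a placeholder):
// console.table(report("/path/to/your/projects"));
```

Anything it lists can be deleted by hand and brought back later by running npm install in that project.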

Hi @jsdisco. Like @kevinSmith mentioned above, in production, the files will be minified. Actually my modest laptop became so slow that I had to get rid of the many React projects on it. If you are just learning React and working on small throw-away projects and you are concerned about space, you can make use of codesandbox.io.

My hardware has seen many winters. But if it’s just files lazily lying around there, it shouldn’t become a problem.

As to the final size, the node modules won’t go into that, just what the app ends up needing. You’ll do a build step where things will get removed and minimized.

You could probably leave off a few dev packages if you did it manually. There might be a leaner way, but I wouldn’t worry about it. Things like Webpack make your life a lot easier.

I’m beginning to get an idea of the whole, well, thing. I like to know what’s going on in my projects; in fact, I normally know every byte by name because I typed it myself. I’ll just accept that it doesn’t work this way with frameworks. I’ll check out Webpack, though.

Well, again, most modern computers won’t even blink at this. But you can always delete your node_modules from any project you don’t need and easily restore them with an npm install whenever you need it again.

That gave me the idea to start a project with an empty modules folder and just wait for the error avalanche and install everything manually to get an idea of what’s needed and why. Then I realised that said folder contains 5000 subfolders. But I’ve put it on my bucket list.

@nibble Incidentally, I’ve tried codesandbox.io today and uploaded my current project, it’s supposed to let you browse through a number of retro games, all of which I’ve already built in JavaScript and which I’m now rebuilding in React:

It still has tons of bugs though. I’ll figure them out one by one, but I wouldn’t mind if someone has any comments if I’m doing something obviously wrong, so feel free to check it out.

I’m beginning to get an idea of the whole, well, thing. I like to know what’s going on in my projects; in fact, I normally know every byte by name because I typed it myself. I’ll just accept that it doesn’t work this way with frameworks. I’ll check out Webpack, though.

Yeah, afaik, CRA is using Webpack behind the scenes. That is what allows you to break your app into a bunch of files and use Babel, etc. Then when you run npm run build, it will create a production-ready version of your app in the “build” folder (or “dist” or whatever it’s set to). The node modules don’t end up in the build, just the embedded code that you need.

That gave me the idea to start a project with an empty modules folder and just wait for the error avalanche and install everything manually to get an idea of what’s needed and why.

It’s needed because package.json says so. There is a list of dependencies in there. npm downloads those packages into node_modules. But each of those has its own package.json with dependencies, so those need to be downloaded too, and so on. The files in node_modules are just JS files. You can actually go and mess with them - it’s not a great idea, but I’ve done it a few times to try and debug something that is going wrong inside the package itself.

There are other ways to start React apps. Check out this one–it’s tiny. You might not need all that extra stuff.

There are also other package managers like pnpm which tries to combat the ridiculous node_modules bloat you get.

I have only used it a few times and for the most part, it works well. I did run into a few issues but I guess that is to be expected. As I recall I had an issue with one project where I was testing GitHub actions with a Slack bot, but I can’t really remember what the problem was.

With CRA, out of the box the end result is going to have 2 direct dependencies (React and ReactDOM):

  • React has 3 dependencies:
    • loose-envify for process.env vars, which is tiny, and which in turn has 1 tiny dependency.
    • object-assign, which is a tiny Object.assign polyfill for older browsers.
    • prop-types, which has 3 dependencies: the 2 dependencies above + react-is, which is tiny.
  • ReactDOM has 4 dependencies:
    • the 3 above
    • scheduler, which has 2 dependencies (loose-envify and object-assign)

So in total, 8 dependencies, of which React and ReactDOM are the only large-ish ones (they’re still pretty small). Basically, a few hundred kilobytes of code (a few tens once optimised).
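To check that count, here is a small sketch that walks the dependency lists exactly as described above. The graph is typed in by hand from this post (it matches the React versions current at the time, and newer releases may differ); the one name added is js-tokens, which is the single tiny dependency of loose-envify. The `resolveAll` function is a made-up helper, not something npm exposes.

```javascript
// Dependency lists as described in the post, typed in by hand.
// "js-tokens" is the one small dependency of loose-envify.
const dependencies = {
  "react": ["loose-envify", "object-assign", "prop-types"],
  "react-dom": ["loose-envify", "object-assign", "prop-types", "scheduler"],
  "prop-types": ["loose-envify", "object-assign", "react-is"],
  "scheduler": ["loose-envify", "object-assign"],
  "loose-envify": ["js-tokens"],
  "object-assign": [],
  "react-is": [],
  "js-tokens": [],
};

// Walk the graph from the direct dependencies, visiting each package once.
function resolveAll(direct, graph) {
  const seen = new Set();
  const pending = [...direct];
  while (pending.length > 0) {
    const name = pending.pop();
    if (seen.has(name)) continue;
    seen.add(name);
    pending.push(...(graph[name] || []));
  }
  return [...seen].sort();
}

const installed = resolveAll(["react", "react-dom"], dependencies);
console.log(installed.length, installed); // 8 packages in total
```

The same walk, run over the several hundred tooling packages instead of these two, is what fills up node_modules.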

All the rest is tooling to allow you to build the code. That’s most of the stuff you can see in node_modules. JavaScript as a language has no built-in tooling, so everything has to be a library. With CRA, most everything is built on top of the module bundler Webpack, with Babel (that, critically for React, can turn JSX to JS), and ESLint to find errors in the code. CRA needs to be able to handle most scenarios, so for example it lets you import CSS files or SVG files in your code even though those aren’t JS files – there has to be the tooling in place to let you do that. And all of the tools are modular, so there is normally a core package then lots of packages that provide the various functionalities needed. Lots and lots of packages (this isn’t a bad thing btw, it’s just a tradeoff, and a good tradeoff at that). Here’s an overview:

https://levelup.gitconnected.com/what-does-create-react-app-actually-do-73c899443d61

So it can:

  • build your React app JS code using your code + the react and react dom libraries
  • convert that to code that will run in older browsers even if you’re using newer features.
  • build an optimised version for production.
  • build the HTML index file it’s going to be rendered to
  • take CSS files, allow them to be used as modules in the code but split them out into a single output CSS file
  • add in any browser-specific prefixes needed to the CSS to allow it to work in all browsers.
  • run a development server that lets all this happen locally
  • set up the test runner (Jest) with the assertion libraries and test framework that let you start writing tests immediately.
  • include setup for a service worker that will allow the app to work offline.
  • set up linting to make sure errors get automatically caught as you’re developing.
  • include a11y linting to help highlight accessibility issues that might need fixing.
  • full support for Typescript with a flag at setup.
  • full support for Flow with a flag at setup.
  • full support for plug’n’play (Yarn, doesn’t use node_modules).

So this all works pretty well, but it’s fragile. It’s a load of complex tools glued together with some scripts. And generally more tools get dumped on top of that as you try to do more things. So it’s more than reasonable to worry about the size and the ability to understand it all: lots of people are worried about similar things.

So state of the art & current bleeding edge (sorry, massive wall of text):

node_modules itself is a bit of an issue, as is the explosion of dependencies that can happen. The Yarn package manager (in its plug’n’play mode) doesn’t have node_modules and is very good at deduplication: this works well in practice (CRA for example goes from about 250MB of node modules to about 50MB of cached zip files). The packages are still there, but the size reduction means it’s feasible to commit everything to a repo (which in turn removes any install steps). I think things will move toward the Yarn model, but there’s also pnpm that’s been mentioned (uses a global cache afaik, so you only install a given dependency once on a given machine). I think Yarn will win out, or at least how it works will be absorbed into npm and node_modules will disappear as a thing, but maybe not.

They don’t really reduce the dependency complexity in any way: all the dependencies are still there, just in a different place.

One of the main issues with React is that, because it uses JSX, it needs a compilation step to turn the JSX into JS. This can be done in-browser via (eg) Babel, but that’s not feasible IRL because it’s too slow. React as a library isn’t designed to be loaded directly in a browser as modules either: FB don’t produce ESModule builds, which are necessary to be able to do that, because it is assumed there will always be a build step.
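To make that compilation step concrete, here is roughly what it does to one line of JSX. This is a sketch, not real React: `createElement` below is a simplified, hypothetical stand-in for React.createElement so the snippet runs without installing anything, and the real function returns a richer object.

```javascript
// Hypothetical stand-in for React.createElement so this runs without React installed.
function createElement(type, props, ...children) {
  return { type, props: { ...props, children } };
}

const name = "world";

// JSX as you would write it (not valid JavaScript, so it needs compiling):
//   <h1 className="title">Hello, {name}</h1>
//
// Roughly what a classic JSX transform emits, with React.createElement
// swapped for the stand-in above:
const element = createElement("h1", { className: "title" }, "Hello, ", name);

console.log(JSON.stringify(element));
```

So the JSX is only a nicer way of writing nested function calls, which is why you can’t skip the build step unless you write those calls yourself.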

Preact, which shares the same API as React, can do that, and can also use a library called HTM to write components (it looks like JSX, but uses JS template strings). This does have constraints (particularly w/r/t adding dependencies), but it works pretty well. You technically don’t need node_modules at all if you do this: you can just write JS and import the stuff you need in the JS files from a CDN like Skypack or unpkg or jsDelivr. See Preact’s documentation. Note that the Vue UI library can also do this (and note that one of the highlighted use cases is simply replacing code that would have used jQuery with code that uses Vue).
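The reason the template-string approach needs no compiler is that tagged templates are plain JavaScript. The toy below is not HTM, just an illustration with a made-up tag function called `describe`, showing what any tag function receives from the language itself.

```javascript
// Toy tag function, NOT the HTM library: it only shows what a tag receives.
function describe(strings, ...values) {
  return { strings: Array.from(strings), values };
}

const who = "world";
const parts = describe`<h1 class="title">Hello, ${who}!</h1>`;

console.log(parts.strings); // the literal text around each ${...}
console.log(parts.values);  // the interpolated values, untouched
```

A library like HTM takes those pieces and builds elements from them at runtime, so the work a JSX compiler would do ahead of time happens in the browser instead.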

Generally, though, you do want to bundle your JavaScript unless you can guarantee that the client has a very modern browser, that you don’t need much in the way of extra libraries, and that you’re happy losing out on a lot of the optimisation that a bundler can do.

Webpack is the de facto standard tool for building JS applications. It’s modular, and the ecosystem of plugins is now huge. Just out of the box it actually works pretty well, but it’s extremely flexible, and that flexibility means it’s often not at all simple. It’s also pretty slow.

Parcel is an alternative: it takes a lot less config but the trade-off is that it’s doing basically the same things, but now they’re magic and hidden. Also pretty slow.

Rollup is another alternative, but it’s designed to produce es modules (which it is very good at doing), and as such it’s primarily for bundling libraries rather than applications (it becomes as complicated and unwieldy as Webpack if you try to use it as an app bundler). Somewhat faster than the other two, but not really targeted at general application use.

Snowpack wraps the client-side code up and serves it as ESModules, which makes iterating in development incredibly fast compared to any of the above. It seems kinda jury-rigged atm though and a bit flaky, and the docs are not great.

ESBuild is slightly limited at the minute (not necessarily a bad thing), but generally works really well if it’s just JS (no CSS etc). It’s between 10 and 100 times faster in terms of compilation, and it replaces lots of dependencies with a single binary written in Go. It’s very good, and completely usable, but it’s incomplete atm compared to say Webpack (it can’t completely replace it anyway, but it’s only trying to do that for common use cases).

The Rome build tool is attempting to cover formatting, linting, bundling, etc (ie almost everything those 250MB of development dependencies do, in a single executable). It’s way, way off at the minute though (only works for linting atm). More modern languages (Go, Rust and Elixir for example) ship with tooling. Having a single tool that does this for JS would be very nice, and would solve innumerable headaches.

Deno (an alternative to Node) has the capacity in the future to work as a platform for JS app build tools, particularly as code written with Deno should often Just Work in a browser as-is. But it’s extremely new and can’t do any of the optimisation that a bundler can do (it doesn’t need to, it’s server-side).

Also

  • Babel, which is a JS -> JS compiler, is kinda becoming less necessary with evergreen browsers becoming the norm (ie they automatically update). It or something similar is still needed for JSX or GraphQL or other things that can be compiled to JS code, and for newer features, and for converting modern code to older versions of JS (there are some alternatives, SWC for example).
  • Things will move towards ESModules (so less need for bundling, plus HTTP/2 in theory works better with lots of small files loaded in parallel rather than one or two big bundles) but that’s a way off as there are quite a few hurdles to get over. Bundlers can “tree shake”, ie remove bits of code that aren’t used from the end bundle, and there’s no solution to that with ESModules atm (this is a core issue with attempts to use Deno for building FE code as well, as Deno works the same way).

Back when I was a beginner, I would have found all of this a little overwhelming. My advice to my younger self would have been to not worry about it - just use CRA, learn React really well, and you can get deep into this stuff later.

Of course I won’t learn all that at this moment, and I’ll certainly build my first 10-20 projects with CRA, but it’s good to have at least some fuzzy idea of what’s going on under the hood.

I also get the point why that extra load might make sense. I always prefer to build stuff myself as long as it’s not too complicated, a responsive nav bar for example, a slide show, stuff like that, and admittedly kinda looked down on people who use free tools for that, but I realised that their solutions were often more stable and reliable in terms of browser compatibility.

One thing that I don’t really get though is why this bloated modules folder needs to be literally present in each and every project, why aren’t those modules stored once somewhere on my machine where they can be referenced, and all that the project gets is the package json with the list of dependencies? It just seems awfully ineffective.

I also get the point why that extra load might make sense. I always prefer to build stuff myself as long as it’s not too complicated, a responsive nav bar for example, a slide show, …

But in this case I think the analogy is that you are trying to learn how to drive the car, and you stop the teacher and say, “Wait, how does the engine work? Let’s take a bunch of time away from learning to drive to learn how to build my own engine.”

It is worthwhile to learn how to build an engine. In the end it might even make you a marginally better driver. But in the short term, in the goal of learning to drive, it is a massive distraction.

… and admittedly kinda looked down on people who use free tools for that, but I realised that their solutions were often more stable and reliable in terms of browser compatibility.

It’s also a matter of time and quality. Sure, I could spend a few months developing a library or UI components. Will it ever be as good as one of the canned ones? Will it be as well tested? This is the power of open-source software - all of us helping each other out, building on each other’s work, allowing us to go much further than we might have on our own. Again, if I’m building a house, I just buy a door, or hire a specialist - I don’t get sidetracked learning how to do it. I don’t learn how to blow glass for the windows - I could never do that as well as what I can get from the store and it would take forever to try. I don’t learn how to make paint from scratch.

One thing that I don’t really get though is why this bloated modules folder needs to be literally present in each and every project, why aren’t those modules stored once somewhere on my machine where they can be referenced,

Because different projects may need different versions of those packages. I have a lot of projects on my computer, each created at different times, so they have different versions of libraries, and the same goes for the dependencies of those libraries. Upgrading a package is not always a simple thing, especially as a program gets larger and more aggressive with how it uses edge cases of those libraries. There may also be cases where you need to do things manually with node modules.

And if they had a global library of packages indexed by version, how would that library know when that package wouldn’t be needed anymore? Is it going to search your computer for every folder called “node_modules”?
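To show what that bookkeeping would involve, here is a toy model of a shared store keyed by name and exact version. Everything in it is hypothetical (the `PackageStore` class, the project paths, the method names), and it is not how npm or pnpm are actually implemented; it only illustrates the problem.

```javascript
// Hypothetical global store keyed by "name@version". Not how npm or pnpm are implemented.
class PackageStore {
  constructor() {
    this.users = new Map(); // "name@version" -> Set of project paths using it
  }

  // Record that a project needs an exact version; the copy is stored once.
  link(project, name, version) {
    const key = `${name}@${version}`;
    if (!this.users.has(key)) this.users.set(key, new Set());
    this.users.get(key).add(project);
  }

  // Forget a project (for example after deleting its folder).
  unlinkProject(project) {
    for (const projects of this.users.values()) projects.delete(project);
  }

  // Remove entries no project references any more; return what was removed.
  prune() {
    const removed = [];
    for (const [key, projects] of this.users) {
      if (projects.size === 0) {
        this.users.delete(key);
        removed.push(key);
      }
    }
    return removed;
  }
}

const store = new PackageStore();
store.link("/projects/old-app", "react", "16.14.0");
store.link("/projects/new-app", "react", "17.0.2");
store.link("/projects/old-app", "object-assign", "4.1.1");
store.link("/projects/new-app", "object-assign", "4.1.1");

store.unlinkProject("/projects/old-app");
console.log(store.prune()); // only the version nobody uses any more is removed
```

Note that the store only works if something tells it when a project goes away. If you just delete a project folder by hand, the store never finds out, which is exactly the problem raised above.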

But yeah, you’re not the first person to comment on it. It’s a source of humor for many people, how ludicrously large it can get. But none of that will change what we have. I can see arguments for how it works and arguments against how it works. But it is how it works and I can either accept it or work in a different language.

And storage is cheap - it’s about $50 for a TB. I see a 5TB drive for about $100.

I have about a dozen projects in my project folders. The entire folder (node_modules and everything else) is 5.5 GB. That means that with a 1 TB drive, I could fit about 100 of those programs before I even used up half of that. And it would be much less if I just deleted node_modules in unused projects. Or if I just archived them on github and deleted them.

My work project is big - it is 693 MB. This is a professional, industrial-level React Native app. I could fit a couple of hundred of those without running into trouble.


The world is full of questionable design choices with which society just gets stuck. Do we need to talk about the qwerty keyboard? Yeah, there are complaints about Node. Even Ryan Dahl, the guy that invented Node, has some complaints. That’s why he invented a new runtime to fix those mistakes.

I was a music major before. I saw a lot of guys that would get bogged down in questions of “Why do I need to do it this way?” Some of those guys got so bogged down with questioning things that it stunted their learning, sometimes crippling it.

High level questions like this can wait for later. Just learn how to code for now. And then maybe in 10 years you’ll create a new JS runtime that fixes every problem.