Announcement: Learn Outage

Wishing you the best! Cannot wait to get back into learning!

4 Likes

I found if you are not logged in, for example if you use incognito mode, you have access to all learning material.

1 Like

in that case the database is not updated, so, yes, that would work

2 Likes

We can save it in Notepad and then paste it on the site once it works again.

Thank you for the work-around!

1 Like

Update:

We have made a couple of rounds of releases. We are seeing slight improvements in load times.

The core performance issues still continue to exist.

The team has been working hard to bring things back to normal - you should eventually see the learning platform perform faster as we keep making tiny changes.

We wish we could have predicted this class of issues when operating at scale. We understand how frustrating this must be, and again we request your patience.

Thanks

6 Likes

Thank you for your hard work! Looking forward to studying again :slight_smile:

1 Like

It’s working so much better now!!! Thank you for all your hard work today - I bet it was a day of back-to-back emergency meetings, an all-hands-on-deck type of day! Time to grab a “Pepsi”. Fist bump!!

As soon as I get better at coding, I will definitely try to contribute to the platform - it’s such a great resource that is impacting many lives!

3 Likes

Thanks for your efforts.

Samson

1 Like

Update:

We have rolled out even more updates, and the platform should be in a much more usable state, although we are in our non-peak hours as I post this.

:point_right: We are not completely out of the woods.

2-day scale: [graph not included]

4-hour scale: [graph not included]

As you can see, we have been making steady progress on the issues, and the API is successfully resolving more and more requests.

We will continue to work on the issues and try and return to normalcy as soon as possible.

Thanks, everyone, for your patience and kind words as always.

3 Likes

Thank you for your hard work. Kind of fascinating that FCC decided not to roll back production. I am wondering what can be done here to improve the API endpoints. Scaling vertically/horizontally? Or adding extra caches to handle the API workload, and increasing the time between interactions to avoid overloading?

Would love to learn more!

1 Like

Thanks for the kind words.

Yes - These are great questions.

We did not want to revert because that would mean rolling back a freshly released overhaul of the Responsive Web Design Certification (RWD). The thing is, the new RWD curriculum isn’t the issue. If anything, the response to it has been very positive (overwhelming, actually, hence the heightened load).

The “real” issue is that when we overhauled the platform so that projects can be built on freeCodeCamp, we had not optimized things for scale. When we tested (and re-tested), we did so in environments with a few dozen users, or sometimes a hundred.

But we operate at scales much more significant than that. Here is how many challenge submissions we have gotten in the last week: [chart not included]

This is where we had some oversight. Some of the new logic we put in place ran slower than it should have. Of course, this was unknown to us in our user acceptance testing but became very evident in the last 24 hours.

As our systems struggled to keep up with the newfound popularity, the effect of every slow query started cascading, compounded by users’ impatience. More and more people must have been refreshing their pages out of habit, hammering the backends even harder.
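
To illustrate the refresh effect described above: one common way to keep retries from piling onto a struggling backend is exponential backoff with random jitter. This is only a minimal sketch and not something freeCodeCamp has said it uses; the function name `retry_with_backoff` and all of its parameters are hypothetical.

```python
import random
import time


def retry_with_backoff(operation, attempts=5, base=0.5, cap=30.0,
                       sleep=time.sleep, rng=random):
    """Call operation() until it succeeds, waiting longer after each timeout.

    All names and defaults here are illustrative assumptions.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller see the error
            # The ceiling doubles on each failed attempt, up to `cap` seconds.
            ceiling = min(cap, base * (2 ** attempt))
            # Waiting a random time below the ceiling spreads clients out,
            # so they do not all retry at the same instant.
            sleep(rng.uniform(0, ceiling))
```

Compared with refreshing immediately, each failed attempt here waits longer on average than the last, which gives slow queries room to finish instead of stacking new ones on top.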

Again, this could not have been predicted ahead of time. The systems kept responding to the load at first, before tipping over almost half a week after the new stack went out.

In other words, the infrastructure and the new platform need to become better.

RWD is the first implementation, and we will transition more certifications to the newer stack. But before we do, we will have to pace ourselves and be ready for scale.

So what can be done about this?

You are correct on some fronts, but it’s more complex than it may seem. We have mechanisms where the systems do autoscale horizontally and vertically with demand.

And indeed because of these protections, other stacks of the freeCodeCamp ecosystem like news, forum, chat, etc., were not affected a bit.

However, there is no excuse for unoptimized calls to the DB in this particular case. Because we are on MongoDB and essentially have only one document per user, coupled with the fact that our API is years old, we need to redesign parts of it. The dev team is already working on it.
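
As a rough illustration of why one large document per user can make DB calls expensive: if a submission is saved by sending the whole user document back, the payload grows with the user's history, while a targeted update sends only the new entry. The sketch below is an assumption for illustration, not the actual freeCodeCamp code; the field name `completedChallenges` and the helper names are hypothetical. `$push` is MongoDB's standard array update operator.

```python
import json


def full_rewrite_payload(user_doc, challenge):
    """Naive approach (hypothetical): append in app code, resend everything."""
    updated = dict(user_doc)
    updated["completedChallenges"] = user_doc["completedChallenges"] + [challenge]
    return updated


def targeted_update(challenge):
    """Update spec using $push, so only the new entry travels to the DB."""
    return {"$push": {"completedChallenges": challenge}}


def payload_size(doc):
    """Approximate payload size as the length of the JSON encoding."""
    return len(json.dumps(doc))


# A made-up user with 500 completed challenges.
user = {
    "_id": "u1",
    "completedChallenges": [{"id": f"c{i}", "completedDate": i} for i in range(500)],
}
new_entry = {"id": "c500", "completedDate": 500}

print(payload_size(full_rewrite_payload(user, new_entry)))  # grows with history
print(payload_size(targeted_update(new_entry)))             # stays small
```

With a driver such as PyMongo, a dict like the one from `targeted_update` would be passed as the update argument to `update_one`, alongside a filter on the user's `_id`.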

In the meantime, we have been fixing some of the cruft that we have gathered along the way, and our users should not see this exact issue in the future.

If you are interested, we will share more detailed updates once these fires are entirely out.

9 Likes

Thank you, appreciate all of your hard work and effort.

2 Likes

Thanks for your hard work!

1 Like

Update:

We are seeing more and more requests time out as we enter the peak hours of the day. Our team is still hard at work investigating and trying to resolve these outages.

Thanks for your patience.

2 Likes

Is anyone else unable to submit any challenges? When I go to test them, it just gets stuck on running tests. Is this a known issue or is it new?

1 Like

Same here! I’m unable to submit a challenge; the site is still down for me at the moment. You’re not alone on this one.

I can get into the site, but I can’t test my code, so I can’t progress. I’m going to open an issue on GitHub.

The site is down, and the freeCodeCamp team said they are working on it.

I know, but I can get in. Yesterday it worked; the site was loading really slowly, but that was all that happened. So I think a new issue has arisen, which is why I made the post.