"GET" requests outside of APIs

Hi, FCC folks. I’ve been working on a word game app, and I’d like to be able to test the validity of words that are played. I’m using the Webster’s Dictionary API and the Oxford Dictionary API, but I find they’re missing a ton of words that show up on the Webster’s and Oxford websites, so I’m looking for a better approach. I had what I thought was a great idea, but I’m having trouble making it work, and I wanted to see if it was actually possible or if I’m wasting my time.
Here’s a Webster’s Dictionary page URL: https://www.merriam-webster.com/dictionary/hello
I can replace the “hello” with any word that I want to look up. I open up browser dev tools, network tab, and if the word has an entry in the dictionary, I get a status code of 200. Otherwise, I get a status code of 404. This looks to me like an easy way to determine if a word is valid.
The problem is I don’t know how to get access to those status codes. If I do a fetch request, I get an error in the console, “Cross-Origin Request Blocked”, because this isn’t an API, but I can see the request status code in the network tab of the browser dev tools. I just don’t know how to get access to that information in my program.
So that’s the long and the short of it. Is it even possible to do what I’m trying to do? And if so, what am I missing?
Thanks for your time!

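I don’t think you can really do that from another domain in the JavaScript console or as a browser script. The browser’s same-origin policy stops scripts from reading cross-origin responses unless the server opts in through CORS headers; it’s a security measure that protects against things like CSRF attacks, among others that you can read about online.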

One way developers get around this is by setting up a little server, running either locally or in the cloud (Heroku or Glitch, for example), that makes the request and then forwards the response. If the server exists solely to forward the response, it’s called a “CORS proxy”, and there is already one that does the job:

https://cors-anywhere.herokuapp.com/
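
To make the idea concrete, here is a minimal sketch of what such a proxy does, assuming Node 18 or newer (for the built-in fetch). The targetUrl helper, the port, and the choice to forward only dictionary lookups are my own assumptions for illustration, not how cors-anywhere is actually implemented.

```javascript
const http = require("http");

// Assumption: this sketch only forwards dictionary lookups, not arbitrary URLs.
const TARGET = "https://www.merriam-webster.com/dictionary/";

// Hypothetical helper: maps a request path like "/hello" to the upstream URL.
function targetUrl(requestPath) {
  // Drop any query string, then take everything after the leading slash.
  const { pathname } = new URL(requestPath, "http://localhost");
  const word = decodeURIComponent(pathname.slice(1));
  return TARGET + encodeURIComponent(word);
}

const server = http.createServer(async (req, res) => {
  try {
    const upstream = await fetch(targetUrl(req.url));
    // Pass the upstream status through and add the header that lets
    // browser scripts on other origins read the response.
    res.writeHead(upstream.status, {
      "Access-Control-Allow-Origin": "*",
      "Content-Type": upstream.headers.get("content-type") || "text/plain",
    });
    res.end(await upstream.text());
  } catch (err) {
    // Malformed path or the upstream request itself failed.
    res.writeHead(502, { "Access-Control-Allow-Origin": "*" });
    res.end("upstream request failed");
  }
});

// Uncomment to run locally (the port is arbitrary):
// server.listen(3000);
```

With the server listening, a client-side fetch("http://localhost:3000/hello") would receive the dictionary page along with its original status code.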

I think there used to be crossorigin.me, but IDK if it’s still working as intended. So if you don’t want to make the requests server-side, whether inside your own app or through your own proxy, you can use this little snippet I just tried:

const proxy = "https://cors-anywhere.herokuapp.com"
const dictUrl = "merriam-webster.com/dictionary/hello"

fetch(`${proxy}/${dictUrl}`)
  .then(r => r.text()) // read the body as text (the HTML), whatever the status
  .then(console.log) // log the HTML
  .catch(console.error) // runs only if the request itself fails (e.g. network error), not on a 404

You will eventually get this blob of text that you need to make sense of lol:

<!DOCTYPE html><html lang="en"><head><meta http-equiv="x-ua-compatible" content="ie=edge"> <meta name="referrer" content="unsafe-url"> <meta property="fb:app_id" content="178450008855735"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <link rel="preconnect" href="https://www.google-analytics.com" crossorigin=""> <link rel="preconnect" href="https://js-sec.indexww.com"> <link rel="preconnect" href="https://securepubads.g.doubleclick.net"> <link rel="preconnect" href="https://www.googletagservices.com"> <link rel="preconnect" href="https://www.google.com"> <link rel="preconnect" href="https://adservice.google.com"> <link rel="preconnect" href="https://c.amazon-adsystem.com"> <link rel="preconnect" href="https://aax.amazon-adsystem.com"> <link rel="preconnect" href="https://aan.amazon-adsystem.com"> <link rel="preconnect" href="https://pagead2.googlesyndication.com"> <link rel="preconnect" href="https://tpc.googlesyndication.com"> <link rel="preconnect" href="https://sync.search.spotxchange.com"> <link rel="preconnect" href="https://googleads.g.doubleclick.net"> <link rel="preconnect" href="https://ajax.googleapis.com"> <link rel="preconnect" href="https://connect.facebook.net"> <link rel="preconnect" href="https://cdn.heapanalytics.com"> <link rel="preconnect" href="https://heapanalytics.com"> <link rel="preconnect" href="https://merriam-webster.com/assets"> <link rel="dns-prefetch" href="https://stats.merriam-webster.com"> <link rel="dns-prefetch" href="https://entitlements.jwplayer.com"> <link rel="dns-prefetch" href="https://telemetry.art19.com"> <link rel="dns-prefetch" href="https://p.versod.com"> <title>Hello | Definition of Hello by Merriam-Webster</title> <meta name="description" content="Hello definition is - an expression or gesture of greeting —used interjectionally in greeting, in answering the telephone, or to express surprise. 
How to use hello in a sentence."> <link rel="canonical" href="https://www.merriam-webster.com/dictionary/hello"> <link rel="search" type="application/opensearchdescription+xml" href="/opensearch/dictionary.xml" title="Merriam-Webster Dictionary"> <meta property="og:title" content="Definition of HELLO"> <meta property="og:image" content="https://merriam-webster.com/assets/mw/static/social-media-share/mw-logo-245x245@1x.png"> <meta property="og:url" content="https://www.merriam-

So yeah, have fun doing that in the browser, since it’s very complicated to do :slight_smile:.
If you were using Node.js instead, you could parse it with a library called Cheerio, so that your little proxy does more than proxying and also converts the HTML into something you can explore. Once you can traverse the parsed HTML using jQuery syntax with Cheerio, you can fetch the word definition from the span tag with class dtText.

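He doesn’t need to make sense of the HTML, only the status: response.ok -> word exists, otherwise -> not :wink: (fetch resolves even on a 404, so check the status inside then rather than relying on catch.)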

@LuosRestil, this is a hugely unreliable way, I hope you realize that

That’s what I’m saying: if you don’t want HTML, just find another API that has the words you need. Otherwise you’re stuck with what we call “web scraping”, which is highly unreliable since you depend a lot on the HTML structure of the scraped website.

Another way to avoid having to parse the HTML is by using a simple regular expression that extracts what you need; parsing HTML with regexp is absolutely horrendous but in this case it may work.
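
As a rough illustration of the regex route, here is a small sketch that pulls the page title out of an HTML string. The extractTitle name is hypothetical, and the pattern assumes a single, well-formed title tag like the one in the blob above; it will break as soon as the markup changes.

```javascript
// Hypothetical helper: returns the text of the first <title> tag, or null.
function extractTitle(html) {
  // [\s\S]*? matches lazily across newlines; the i flag ignores tag case.
  const match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  return match ? match[1].trim() : null;
}

// Sample trimmed from the response shown earlier in the thread.
const sample =
  '<head><meta name="viewport" content="width=device-width"> ' +
  "<title>Hello | Definition of Hello by Merriam-Webster</title></head>";

console.log(extractTitle(sample)); // prints "Hello | Definition of Hello by Merriam-Webster"
```

The same approach could target other fragments, but each new pattern is another place that silently breaks when the site is redesigned.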

Well, there’s been significant progress, but I’m still coming up just a hair short. I added an Express/Node server, and I’m sending my request from the client to my server, which then sends a GET request to the target URL.

app.get("/:word", function(req, res) {
  let word = req.params.word;
  fetch(`https://www.merriam-webster.com/dictionary/${word}`)
    .then(response => console.log(response))
})

I’m now getting a response from the request, rather than an error. I’ve logged the response to the console, and I can see the piece of information that I’m looking for, but I’m not quite able to figure out how to access it to send back to the client.
The response is an object with four keys: size, timeout, [Symbol(Body internals)], and [Symbol(Response internals)]. [Symbol(Response internals)] is an object that contains the status of the request, 200 for valid words or 404 for invalid. If I run Object.keys(response), it only shows me two keys, size and timeout, so I’m not sure how to access that last object that contains the info I need. Do I need to use a library to access this, like luishendrix92 mentioned for parsing the response text, or is there some other way? Thanks for the help so far!
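
For reference, the status does not need a library: on a fetch Response (including node-fetch’s), status is a getter defined on the prototype, which is why Object.keys(response) does not list it while response.status still works. Below is a minimal sketch under that assumption; checkWord and its fetchImpl parameter are hypothetical names, and the Express wiring in the comments assumes the app and fetch from the snippet above.

```javascript
// Hypothetical helper: looks a word up and reports whether its page exists.
// fetchImpl is passed in so the function works with node-fetch, the
// browser's fetch, or a stub.
async function checkWord(word, fetchImpl) {
  const url =
    "https://www.merriam-webster.com/dictionary/" + encodeURIComponent(word);
  const response = await fetchImpl(url);
  // status is read through a getter, so it is available here even though
  // it is not an own enumerable property of the response object.
  return { word, status: response.status, valid: response.status === 200 };
}

// Possible Express wiring:
// app.get("/:word", (req, res) => {
//   checkWord(req.params.word, fetch)
//     .then(result => res.json(result))
//     .catch(() => res.status(502).json({ error: "lookup failed" }));
// });
```

The client would then receive a small JSON object and could branch on its valid field instead of inspecting the raw response.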

We have success! Rather than using my own server, I just did exactly as luishendrix92 initially suggested and used https://cors-anywhere.herokuapp.com/ as a proxy, sending the request from the client. This returns an object with the status code easily accessible. The working code from my test looks like so:

let word = document.getElementById("word").value;
const proxy = "https://cors-anywhere.herokuapp.com";
const dictUrl = "merriam-webster.com/dictionary/";

fetch(`${proxy}/${dictUrl}${word}`)
  .then(r => r.status)
  .then(status => {
    if (status === 200) {
      console.log('valid word')
    } else {
      console.log('invalid word')
    }
  })

Thanks so much for your help!