What tech do I need to learn to programmatically parse ingredients from a recipe?

I’m trying to understand either what language, what framework, or what topic in CS I need to learn for the following application I want to make: I’m looking to create an app that can parse ingredients from recipes written in English.

I have a general understanding that the above application is a form of artificial intelligence (AI), but nothing more than this basic understanding. My questions are as follows:

  1. Does the above challenge fall within AI?
  2. I think this falls in the realm of Natural Language Processing (NLP). Is that correct?
  3. What do I need to learn to write a program that can parse ingredients from recipes?
  4. Given the above, does anyone have any suggested courses that I might want to take to learn what I need to learn?
  5. I’m currently learning Kotlin – can I write an app in Kotlin to parse ingredients from a recipe text?
  6. If using Kotlin is a suitable approach, what technologies in addition to Kotlin should I look into for parsing ingredients from a recipe?

I’m no expert on AI/ML …

If the recipes were in a standard format, this would be trivial. But I assume that you mean reading “random” recipes from anywhere… this would be much harder.

Another naive approach might be to have a library of possible ingredients and search for them as keywords. The problem of course is that you would have to have an exhaustive ingredients list, account for misspellings, and you still couldn’t be sure (maybe they say, “I tried this with margarine but I didn’t like the results.” - how would it know what to do?)
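To make that keyword idea concrete, here is a minimal Kotlin sketch. The `knownIngredients` list and the `findIngredients` function are made up for illustration, and the list is nowhere near exhaustive. It also does nothing about misspellings, and the second call in `main` shows the margarine problem directly.

```kotlin
// Hypothetical, deliberately tiny ingredient list; a real one would need far more entries.
val knownIngredients = listOf("flour", "sugar", "butter", "margarine", "olive oil", "egg")

// Return every known ingredient that appears as a whole word or phrase in the text.
fun findIngredients(text: String, vocabulary: List<String> = knownIngredients): Set<String> {
    val lower = text.lowercase()
    return vocabulary.filter { ingredient ->
        // \b keeps "egg" from matching inside "eggplant";
        // the optional "s" or "es" tolerates simple plurals such as "eggs".
        Regex("\\b" + Regex.escape(ingredient) + "(s|es)?\\b").containsMatchIn(lower)
    }.toSet()
}

fun main() {
    println(findIngredients("Cream the butter and sugar, then add 2 eggs."))
    // Not an ingredient line, yet "margarine" is still reported.
    println(findIngredients("I tried this with margarine but I didn't like the results."))
}
```

Matching keywords tells you a word is present, not whether the recipe actually calls for it, which is why the context matters.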

Yeah, so you come back to AI/ML and NLP.

I think this is a pretty advanced concept, but then again, I haven’t done much work in this.

When I think of topics like this, I tend to think of languages like Python, R, and Julia. But I’m sure you can do it in Kotlin.

Did you do a google search? Did you run across this?

  1. Yes
  2. Yes, for the most part
  3. The main thing you need is “labeled data”: data you can train your model on, where you have both the questions (the recipes) and the answers (the ingredients from each recipe).
  4. Unfortunately I don’t know much AI/data science beyond the core fundamentals, so I have no references. What you want to do is more or less language processing. Getting the text from your recipe is something already available off the shelf. The part where you train your own model to pick out specific ingredients is what you would need to build.
  5. Kotlin runs on the JVM and interoperates with Java, and Java being such a behemoth means yes, you should be able to find something.
  6. I’d Google around. Again, I’m not much help implementation-wise, only with core fundamentals.
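As a rough picture of what “labeled data” from point 3 could look like in Kotlin, here is a small sketch. The `LabeledRecipe` class and the `recall` function are hypothetical names chosen for illustration, not part of any library. It pairs each recipe text with the ingredients a person marked as correct, and measures how many of those an extractor recovers.

```kotlin
// One labeled example: the input text (the "question") and the
// ingredients a human marked as correct (the "answer").
data class LabeledRecipe(val text: String, val ingredients: Set<String>)

// Fraction of labeled ingredients that a candidate extractor recovers across all examples.
fun recall(examples: List<LabeledRecipe>, extractor: (String) -> Set<String>): Double {
    val expected = examples.sumOf { it.ingredients.size }
    if (expected == 0) return 0.0
    val found = examples.sumOf { ex -> extractor(ex.text).intersect(ex.ingredients).size }
    return found.toDouble() / expected
}
```

The same labeled examples could be used either to train a model or just to check how well a simpler rule-based approach does.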

There is one thing I’d like to point out. Most recipes provide ingredients in their own section, so you could very easily just tell your code to get the list of text from that section and parse that. There is no need to make things more complex than necessary. Even if you trained a complex model on a bunch of different recipes, it might end up learning to look for that section anyway, because it’s almost always given.
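A minimal Kotlin sketch of that section-based idea follows. It assumes the raw text has a line reading “Ingredients” and that the section ends at another heading; the `endHeadings` set and the `extractIngredientLines` function are guesses made for illustration, not a standard.

```kotlin
// Headings assumed to end the ingredients block; these names are a guess, not a standard.
val endHeadings = setOf("instructions", "directions", "method", "preparation", "steps")

// Collect the non-blank lines between an "Ingredients" heading and the next known heading.
fun extractIngredientLines(rawText: String): List<String> {
    val result = mutableListOf<String>()
    var inSection = false
    for (line in rawText.lines()) {
        // Normalize "Ingredients:" and "INGREDIENTS" to "ingredients" before comparing.
        val heading = line.trim().trimEnd(':').lowercase()
        if (heading == "ingredients") {
            inSection = true
        } else if (inSection && heading in endHeadings) {
            break
        } else if (inSection && line.isNotBlank()) {
            result.add(line.trim())
        }
    }
    return result
}

fun main() {
    val sample = "Best pancakes ever\nIngredients:\n2 cups flour\n1 egg\n\nDirections:\nMix everything."
    println(extractIngredientLines(sample))
}
```

Text before the heading, such as unrelated page content, is simply skipped. Pages that label the section differently would need more heading variants or a different approach.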

Hi @kevinSmith ! Thank you for your response

If the recipes were in a standard format, this would be trivial. But I assume that you mean reading “random” recipes from anywhere… this would be much harder.

I actually have the raw text of each recipe, which I extract from the web. I’m trying to run a process over this text to parse out the recipe’s ingredients. The raw text may also contain other text from the web page that isn’t related to the recipe.

Another naive approach might be to have a library of possible ingredients and search for them as keywords. The problem of course is that you would have to have an exhaustive ingredients list, account for misspellings, and you still couldn’t be sure (maybe they say, “I tried this with margarine but I didn’t like the results.” - how would it know what to do?)

This is a strong observation. Thank you. Along these lines, someone suggested I look at the Python library flashtext. Maybe I can port its logic over to Kotlin.

Did you do a google search?

Yes, I did. The forum will only allow me to post two links because I’m a “new user”, so I cannot show you all the resources I found on Google. Here’s one meta repo:

GitHub - keon/awesome-nlp: A curated list of resources dedicated to Natural Language Processing (NLP)

Did you run across this?

I don’t think I did, actually. Thank you for this suggestion!
