HTML is just a way of tagging pieces of text with meaning. AFAIK almost every other layout system uses XML or HTML for the same job. You need something like HTML, otherwise there is no way to convert a blob of text into a visual layout - there has to be some way to take what the author/designer wants and wrap it into something that the program can convert into a web page.
CSS is a bit different; in terms of layout, which is the thing that you want automated, there are other strategies a bit more advanced than what CSS does. Constraint layout is the main one - for every element, you define horizontal and vertical constraints (this should be at most this high, this should be at least this wide, etc.) which allow the element to respond based on its relationship to other elements. There are lots of implementations of the most common algorithm (the Cassowary constraint solver): http://overconstrained.io/. It actually makes things more complex, but it does make it much easier to build GUI tools for building out layouts - for example, here’s a tutorial on how it can be used in Android development: https://developer.android.com/training/constraint-layout/index.html.
This is the original research paper that describes how it works, if you can wrap your head around that kind of thing: https://constraints.cs.washington.edu/solvers/cassowary-tochi.pdf
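To make the min/max idea a bit more concrete, here is a rough sketch in Python. It is NOT Cassowary (which solves a whole system of linear equalities and inequalities with priorities); it's just a simple greedy pass over one row of boxes, and the names `Box`, `weight` and `layout_row` are made up for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """One element in a row, with hypothetical width constraints."""
    name: str
    min_width: float = 0.0
    max_width: float = float("inf")
    weight: float = 1.0  # share of leftover space relative to siblings


def layout_row(boxes, container_width):
    """Give every box its minimum width, then share out the leftover
    space by weight, never pushing a box past its maximum.

    If the minimums alone exceed the container, the boxes simply
    overflow (a real solver would let you prioritise constraints).
    """
    widths = {b.name: b.min_width for b in boxes}
    remaining = container_width - sum(widths.values())
    flexible = [b for b in boxes if b.max_width > b.min_width]

    while remaining > 1e-9 and flexible:
        total_weight = sum(b.weight for b in flexible)
        still_flexible = []
        distributed = 0.0
        for b in flexible:
            share = remaining * b.weight / total_weight
            room = b.max_width - widths[b.name]
            grant = min(share, room)
            widths[b.name] += grant
            distributed += grant
            if grant < room - 1e-9:
                still_flexible.append(b)
        remaining -= distributed
        if len(still_flexible) == len(flexible):
            break  # nobody hit a maximum, so all space is handed out
        flexible = still_flexible  # redistribute what capped boxes refused

    return widths


# A sidebar that may grow from 200 to 300, next to content that takes the rest.
row = [
    Box("sidebar", min_width=200, max_width=300),
    Box("content", min_width=400),
]
print(layout_row(row, 1000))
```

The point is only that you describe limits and relationships and let the program work out the actual numbers; a proper constraint solver does the same thing for many interdependent elements at once.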
But with that, you’re still limited in that if you want a GUI, it’s only going to let you pick from a set of components. That may or may not be an issue - it’s easier on, say, Android, where you ideally want your app to look and work like every other Android app. Android has a design language that describes what all the components should look like, so you can pick the bits you want and slot them together like Lego.
AI can be used to automatically build layouts - there have been a few demonstrations of this in the past couple of years. This is the paper most of the work is based on: https://arxiv.org/pdf/1705.07962.pdf. Here is someone experimenting with it: https://medium.freecodecamp.org/how-you-can-train-an-ai-to-convert-your-design-mockups-into-html-and-css-cc7afd82fed4. Airbnb tried it as well: https://blog.floydhub.com/turning-design-mockups-into-code-with-deep-learning/. The way they work is they take an image of a design mockup, tag each area of the image, and then convert the tags to HTML/CSS. It’s the same method as automatically tagging images - you give the AI lots of examples, it learns what each area should be called, and the page gets built out from there. It’s kinda impressive, but it’s very, very limited.
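The second half of that pipeline (tags in, HTML out) is the easy bit, and a rough Python sketch shows why the output is so constrained. The tag names, the `TAG_TEMPLATES` mapping and the `render` function here are all invented for illustration, not taken from any of the projects linked above; the hard part, getting a trained model to produce the tag tree from an image, isn't shown at all.

```python
from html import escape

# Hypothetical mapping from layout tags to Bootstrap-style markup.
# Whatever the model recognises, it can only ever emit one of these.
TAG_TEMPLATES = {
    "row": '<div class="row">{children}</div>',
    "half": '<div class="col-6">{children}</div>',
    "title": "<h4>{text}</h4>",
    "text": "<p>{text}</p>",
    "button": '<button class="btn btn-primary">{text}</button>',
}


def render(node):
    """Turn a nested tag tree (dicts with 'tag', optional 'text' and
    'children') into an HTML string using the fixed templates above."""
    template = TAG_TEMPLATES[node["tag"]]
    children = "".join(render(child) for child in node.get("children", []))
    return template.format(children=children, text=escape(node.get("text", "")))


# Pretend this tree is what the model produced from a mockup image.
mockup = {
    "tag": "row",
    "children": [
        {
            "tag": "half",
            "children": [
                {"tag": "title", "text": "Hi & bye"},
                {"tag": "button", "text": "OK"},
            ],
        }
    ],
}
print(render(mockup))
```

Anything that isn't in the template table simply can't come out the other end, which is the limitation in a nutshell.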
The problem is that this only works with a very constrained set of stuff to build from/to. The explorations of this are mainly using Bootstrap, so they convert the mockup image to HTML with Bootstrap classes; to go beyond that you need the set of data the AI learns from to include all possible HTML layouts and all possible CSS, and that would seem to need a couple of orders of magnitude more computing power - it’s possible, but hard and very possibly not worth the effort at the minute. If you can have a starting point where everything is designed already, it might work - so if you use a prototyping tool, like Sketch or Adobe Illustrator, it’s much easier to convert to HTML, and the tools that do this currently work quite well. You are still left with the issue of how to deal with responsiveness, but as long as you can give the AI an indication of what the app should look like at various screen sizes, it’s doable I guess.
You’ve got to bear in mind that HTML/CSS was designed for displaying [academic] text documents, so there are some fundamental limitations/problems with it when it comes to writing GUI applications - the flexbox/grid modules help a bit, but it’s all still built on shaky foundations.
Edit: having everything in one place (HTML + all the behaviour) makes things loads easier in many ways - React is a really good example because you define all the behaviour then also write a description of what the HTML should look like in the same file. PHP is also really good in this regard.