Auditory & Verbal Alternatives to Writing and Reading Code

A Question about Auditory & Verbal options for programmers:

Because my Visual processing of written words is slow relative to my auditory processing, and the amount of information I can recall from reading is minimal in comparison, I’ve always preferred to have a duplicate of everything I read in an Auditory format. When I watch Video tutorials, I prefer the ones where the code is spoken in full, not just written.

Is there a usable equivalent to Text-to-Speech for programmers? Are there any good programs that provide Speech-to-Text? Or is there any way to modify preexisting Text-to-Speech and Speech-to-Text programs so that they can read and write code?

You can use any screen reader or text to speech program.

I’m not sure how helpful having code read to you really is unless you are visually impaired of course. I don’t think relying on text to speech will help you improve your pattern recognition. Being able to scan code and see patterns, spot typos, and syntax errors is a very important skill to learn.

I’m sure speech-to-text exists, not sure how viable it is. Again, it will not help you improve your typing skills so it may be a disservice (again unless you actually need it).

I think perhaps I explained poorly.

Pattern recognition, being able to scan code and see patterns, and spotting typos and syntax errors are not a problem.

Just the initial memorizing…

I database Auditory information, faster and more efficiently in my Crystallized / Long term memory.

Perhaps your TTS is far superior to the ones that I have tried. I have three different ones, and this is the approximated Audio output Transcription :

GREATER THAN ![!unspoken]DOCTYPE html LESS THAN
GREATER THAN html LESS THAN
GREATER THAN head LESS THAN
GREATER THAN meta name="viewport" content="width=device-width, initial-scale=ONE INCH LESS THAN
GREATER THAN style LESS THAN
* {
  box-sizing: border-box;
} [} UNSPOKEN]
  DOT  menu {
  float: left;
  width: 20%;
} [} UNSPOKEN]

To describe this CODE:

<!DOCTYPE html>
<html>
<head>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<style>
* {
  box-sizing: border-box;
}
.menu {
  float: left;
  width: 20%;
}

Because it uses a Natural Language interpretation, it also excludes all semicolons and colons, and unrecognized words are spelled out, letter by letter.

I am wondering if there is either a program, or a modification to the TTS that better caters the needs of a Programmer.

I haven’t used any of the TTS browser extensions, so I can’t speak to those, but this is pretty standard default behavior for screen readers. Most screen readers will not read a lot of punctuation by default, but you can customize a screen reader to be more verbose with punctuation. But as far as I know there is no special “coding” mode for screen readers. I’m guessing a lot of programming with a screen reader involves going through a line character by character to make sure that the syntax is perfect.
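That said, a rough “coding mode” can be approximated by preprocessing the code before it reaches the TTS engine, so that no symbol is silently dropped. Below is a minimal sketch of that idea, not a feature of any particular screen reader: the `SPOKEN_SYMBOLS` table and the `verbalizeLine` / `verbalizeCode` names are made up for illustration, and the wording for each symbol is an arbitrary choice.

```javascript
// Hypothetical symbol-to-word table; extend or reword it to taste.
const SPOKEN_SYMBOLS = new Map([
  ["<", "less than"],
  [">", "greater than"],
  ["{", "open brace"],
  ["}", "close brace"],
  ["(", "open paren"],
  [")", "close paren"],
  [";", "semicolon"],
  [":", "colon"],
  [".", "dot"],
  [",", "comma"],
  ["=", "equals"],
  ['"', "quote"],
  ["!", "bang"],
  ["*", "star"],
  ["%", "percent"],
  ["-", "dash"],
  ["/", "slash"],
]);

// Replace every listed symbol with its spoken name, then collapse the
// whitespace so the result is a plain phrase any TTS engine can read.
function verbalizeLine(line) {
  let spoken = "";
  for (const ch of line) {
    spoken += SPOKEN_SYMBOLS.has(ch) ? ` ${SPOKEN_SYMBOLS.get(ch)} ` : ch;
  }
  return spoken.replace(/\s+/g, " ").trim();
}

// Verbalize a whole snippet, one spoken phrase per source line.
function verbalizeCode(source) {
  return source.split("\n").map(verbalizeLine);
}

// Usage: print the phrases that would be handed to the TTS engine.
const css = ".menu {\n  float: left;\n  width: 20%;\n}";
verbalizeCode(css).forEach((phrase) => console.log(phrase));
```

The output could then be piped into whatever TTS program is already installed. Because the symbols arrive as ordinary words, the engine’s own punctuation settings no longer matter.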


Why do you need to memorize the code?

I think you will find it is the actual writing of code that is causing you to remember, not the reading.

I’m sure you can use some excluded list of symbols not to read out. How that works would depend on the program used.

Again, I really do not think it will be all that useful to have code read to you.

For the sake of Comprehension:

: ) I would invite you to Join me for a moment in a Thought Experiment

Let’s Imagine, for the sake of Argument, that we live in a Cognitively Neurodiverse World, a world in which the Neural patterning structure of the Brain differs (sometimes vastly) from person to person. That the Built-in pathways have different access points and different points of conjunction and Overlap.

In this paradigm, I would propose that it would be reasonable to suggest that, from one individual to another, the tendency to efficiently learn and store information may be influenced by the strength and accessibility of certain Neural Regions over other, weaker regions… And that, for some people, that means a tendency for Visual repetition, as you suggest… And for others, the Information Databases most efficiently in Alternate Paths…

Perhaps Like:

In the second part of the Philosophical Investigations, Wittgenstein writes:

“If a lion could speak, we could not understand him.”

People learn and remember differently, I agree.

However, I will still maintain that you learn to code by writing code, not by reading it. You can learn from reading code but it isn’t the same thing you learn as when you write the code.

If you think you can remember how to write an HTML document (or anything really) just from reading one, I disagree. You can read it 50 times and still not remember how to write it when the time comes to actually write the code.

That is also why watching videos doesn’t really teach you anything until you apply what is being shown.

I have Coded (incompetently) for Many years, and I can tell you, though you may be incredulous, that this Indeed is my preferred Path. That is why I am seeking a more personalized route to competence.

: )

I don’t disagree with the need to Type it out, I just want to improve my own work flow by using a STT.

Anyway, that Video may be helpful, so Thank You.

Learning to code is like learning to play a musical instrument or baseball. There is no substitute for doing.

…Understood. I’ll get to work.


On the other hand, our personal preferences are not always healthy, which is the root cause of bad habits.

This is just an opinion, but it is also possible that you may be doing yourself a disservice by doing things “your way”. It may not be serving as extra help in the way you believe it does. I obviously have no way of knowing this so it’s just speculation. Just some food for thought, I’m not here trying to change your mind.

Just for clarity, I use TTS when writing (normal text) because it helps me catch errors I can’t spot. Having it proofread back to me helps a lot.


I remember reading an article about a blind programmer. There was a picture their co-worker took of them in their office. It’s them with their laptop closed and attached to a dock with a keyboard and mouse setup, but they are literally facing the wall of their cubicle.

They mentioned they use Windows’ built-in accessibility tools for screen reading. I vaguely remember them saying they primarily rely on Visual Studio, which again supports the screen reader.

Another thing that stuck out to me was that their screen reader read at a rate of somewhere around 130 words per minute, and the article provided a sound bite of what they hear. Not only did it sound like incredibly quick gibberish, but they mentioned they are very used to its speed, even with their “unique” use case of having it read code to them.

I could imagine their workflow primarily relying on audio cues to understand the context they are in, and the specifics, but they would primarily “visualize” the code mentally. That’s essentially what I’d do, even though I can still see.

So it’s totally possible to program with text-to-speech, but I’m not sure how easy it would be to use, or what sort of impact it would have, positive or negative.


I can kind of relate to these struggles. I’ve always had a bad memory, and spent a good portion of my life struggling with reading comprehension. Overall this doesn’t bode well for a job where you’re essentially reading most of the time.

However, just like reading a foreign language, reading only works if you comprehend the words you’re reading. Having a conversation with feedback can help you progress faster than trying to memorize and digest the words on the page as-is. That will always be rougher. Some people can do this rather well, but most cannot.

I eventually got over my reading comprehension issues after spending a few summers grinding in “reading camps” to catch back up. The experience was overall horrible, but it did work. Today I’m able to read just fine, but I do have issues every now and then.


There is a concept out there called “learning paradigms”, where people are specifically treated as “visual learners”, “auditory learners”, etc. I think this is what you’re referring to with your thought experiment.

However, there is evidence that this doesn’t actually have a big impact on how someone learns. The subject matter dictates the most sensible learning approach. For example, learning geography makes the most sense visually, whereas learning to build something out of wood makes the most sense with a “hands on” approach, and learning how to compose music would best be taught with some kind of audio factor.

You can consider yourself an auditory learner who learns best when given verbal instructions on a topic. However, most people would find this sort of learning method beneficial when the topic isn’t that self-explanatory and is somewhat open ended, where a human explaining things can help “digest” the problem. This is in comparison to a more “strict” format where deeper explanation isn’t really helpful, such as memorizing words, dates/places or definitions.

Something as simple as a 10 minute video on a specific topic of web development can give you more insight than trying to read a boring documentation page on similar topics. This ignores whatever “paradigm” you consider your best way to learn and depends entirely on the topic you’re learning about. A video on “conditionals” in JavaScript might be more interactive and engaging than reading how an if statement works on MDN. This again is less about specific learning styles your brain is “built for” and more about how accessible the material is given your current experience and the topic itself.

Learning to code has a number of facets, so focusing on only one approach doesn’t always work either. Something like database design would benefit from a visual approach (hello ER diagrams!), whereas something like learning algorithms would remove the syntax of code entirely and probably benefit from in-person instruction to go over the hard underlying math.

This isn’t to say personal preference isn’t a big factor, but not everyone’s personal preferences are always the most beneficial. Who doesn’t like turning on a 1 hour tutorial and then finding themselves looking at their phone 10 minutes in ;D

I think taking an auditory approach might help, but not as much as a “hands on approach”. Going back to my previous example of “building something out of wood”, a “hands on approach” would help the most with memorizing, as you’d literally put things into muscle memory.

Coding is in a similar vein, where using the language of your choice can help you later when presented with a problem. Being comfortable with the language syntax is similar to being familiar with your physical tools when building a physical object. Getting comfortable with the language itself requires you to “get hands on” with its aspects and just play around. Unlike building something with wood, which requires upfront and continual investment, coding requires only a stable internet connection and a device to code on.

Finally, it’s worth remembering that coding is only a means to an end. The code itself is the product of your tools that is used by the computer to solve a problem. Knowing what the problem is, what tools you can leverage, and how to get help when your pre-existing experience and knowledge fall short are all part of the path to success. Focusing only on the tool will have you missing the bigger picture of how to wield it, and missing out on all the experience required to use your “tool” to build something significant.


Hi @ALLESS ,

I think this will probably depend on your operating system:

  • linux: espeak (there are others, but I only tried espeak)
  • windows, macos: I don’t know

In linux espeak’s voices are robotic, sibilant and they have weird pitch changes :?.


Python has an espeak library, I tried to use it plus i3wm (window manager) but I needed something with more features. This is the repo:


Later I started to write the app “text-editor: A modal, line oriented with audio cues text editor”:

As you can see in the video, I’m just using Ed[0] (the relevant code, Node.js):

...
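const { spawn } = require("child_process");
const readline  = require("readline");

// ed is the line editor being driven; espeak reads its output aloud.
const ed  = spawn("ed", ["-v"]);
// --punct tells espeak to speak punctuation; -s sets the speed in words per minute.
// Command-line arguments are given as strings.
const esp = spawn("espeak", ["--punct", "-s", "180"]);

// Deliver keypresses one at a time instead of waiting for Enter.
readline.emitKeypressEvents(process.stdin);
if (process.stdin.isTTY) {
  process.stdin.setRawMode(true);
}

// Speak ed's normal output token by token, and echo it to the terminal.
ed.stdout.on("data", (data) => {
  const input = data.toString().split(/(\s)/);
  input.forEach((elem) => {
    esp.stdin.write(`${elem}\n`);
  });
  process.stdout.write(`${data}\n`);
});

// Announce ed's error output with a spoken prefix so it stands out.
ed.stderr.on("data", (data) => {
  process.stdout.write(`textEditor output ${data}\n`);
  esp.stdin.write(`textEditor output\n`);
  esp.stdin.write(`${data}\n`);
});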
...


Later I thought that it would be a good idea to write a code editor with audio cues. This app uses an abstract syntax tree to generate the code (you are not editing a text file). Here is a playlist of the app:

https://www.youtube.com/playlist?list=PLEJxzxnNnYoXOl6Tfa8UhtbKvt4k4Fp6z

The advantages of using an Abstract Syntax Tree are:

  • (In theory) syntax errors are almost impossible
  • You can customize the syntax of the language
  • You can see your code as “data”
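To illustrate the first two points, here is a minimal sketch (not the app’s actual code). The node shapes, the builder functions, and the `printCode` / `printSpoken` names are simplified assumptions made up for this example. They are much smaller than a real JavaScript AST.

```javascript
// Hypothetical node builders: a tree-based editor would create nodes through
// functions like these instead of accepting raw keystrokes.
const identifier = (name) => ({ type: "Identifier", name });
const literal = (value) => ({ type: "Literal", value });
const call = (callee, args) => ({ type: "CallExpression", callee, args });
const declare = (kind, name, init) => ({ type: "VariableDeclaration", kind, name, init });

// Render the tree as JavaScript text. Brackets, commas, and semicolons are
// emitted here, so the user cannot mistype or forget them.
function printCode(node) {
  switch (node.type) {
    case "Identifier":
      return node.name;
    case "Literal":
      return JSON.stringify(node.value);
    case "CallExpression":
      return `${printCode(node.callee)}(${node.args.map(printCode).join(", ")})`;
    case "VariableDeclaration":
      return `${node.kind} ${node.name} = ${printCode(node.init)};`;
    default:
      throw new Error(`Unknown node type: ${node.type}`);
  }
}

// Render the same tree as a phrase for a TTS engine (a customized "syntax").
function printSpoken(node) {
  switch (node.type) {
    case "Identifier":
      return node.name;
    case "Literal":
      return typeof node.value === "string" ? `string ${node.value}` : String(node.value);
    case "CallExpression": {
      const args = node.args.map(printSpoken).join(" and ");
      return `call ${printSpoken(node.callee)} with ${args || "no arguments"}`;
    }
    case "VariableDeclaration":
      return `declare ${node.kind} ${node.name} equal to ${printSpoken(node.init)}`;
    default:
      throw new Error(`Unknown node type: ${node.type}`);
  }
}

const tree = declare("const", "greeting", call(identifier("format"), [literal("hello"), literal(2)]));
console.log(printCode(tree));   // const greeting = format("hello", 2);
console.log(printSpoken(tree)); // declare const greeting equal to call format with string hello and 2
```

Since the text is always generated from the tree, the same structure can be shown as code on screen and read out as a sentence.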

I think that the option of seeing the code as “data” can be really useful. An example:

With this code I can get a list of every variable:

...
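// Assumed imports for this excerpt: recast parses the source, and its
// bundled ast-types visitor walks the tree.
const recast = require("recast");
const { visit } = recast.types;

function createListVariableNames(ctx) {
  const list = [];
  const ast = recast.parse(ctx.src);
  visit(ast, {
    visitVariableDeclaration(path) {
      const node = path.node;
      // Only the first declarator is recorded, so `let a, b;` yields just "a".
      list.push(node.declarations[0].id.name);
      // Returning false skips this node's children; call this.traverse(path)
      // instead to also find declarations nested inside the initializer.
      return false;
    },
  });

  return list;
}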
...

And I can also get a list of function names, function calls, etc. That means that I can make “queries” (just like with a database) or use that data to create other representations of the code (like a diagram, etc.).
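As a rough illustration of such “queries”, here is a dependency-free sketch. It assumes an ESTree-shaped tree (plain objects with a `type` field), and the hand-built `sampleAst` stands in for parser output. The `walk` and `queryNames` helpers are hypothetical names for this example. With real parser output you would normally use the parser’s own visitor, as in the snippet above, because real trees carry extra bookkeeping fields that a naive walker would also descend into.

```javascript
// Visit every node in an ESTree-shaped tree of plain objects and arrays.
function walk(node, visitor) {
  if (Array.isArray(node)) {
    node.forEach((child) => walk(child, visitor));
    return;
  }
  if (node === null || typeof node !== "object") return;
  if (typeof node.type === "string") visitor(node);
  for (const key of Object.keys(node)) walk(node[key], visitor);
}

// One pass over the tree that answers three "queries" at once.
function queryNames(ast) {
  const result = { variables: [], functions: [], calls: [] };
  walk(ast, (node) => {
    if (node.type === "VariableDeclarator" && node.id.type === "Identifier") {
      result.variables.push(node.id.name);
    } else if (node.type === "FunctionDeclaration" && node.id) {
      result.functions.push(node.id.name);
    } else if (node.type === "CallExpression" && node.callee.type === "Identifier") {
      result.calls.push(node.callee.name);
    }
  });
  return result;
}

// Hand-built stand-in for the parsed form of:
//   function greet(name) { const msg = format(name); speak(msg); }
const sampleAst = {
  type: "Program",
  body: [{
    type: "FunctionDeclaration",
    id: { type: "Identifier", name: "greet" },
    params: [{ type: "Identifier", name: "name" }],
    body: {
      type: "BlockStatement",
      body: [
        {
          type: "VariableDeclaration",
          kind: "const",
          declarations: [{
            type: "VariableDeclarator",
            id: { type: "Identifier", name: "msg" },
            init: {
              type: "CallExpression",
              callee: { type: "Identifier", name: "format" },
              arguments: [{ type: "Identifier", name: "name" }],
            },
          }],
        },
        {
          type: "ExpressionStatement",
          expression: {
            type: "CallExpression",
            callee: { type: "Identifier", name: "speak" },
            arguments: [{ type: "Identifier", name: "msg" }],
          },
        },
      ],
    },
  }],
};

console.log(queryNames(sampleAst));
```

The resulting lists could be read aloud as a spoken outline of the file or fed into a diagram generator.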


Speech-to-Text:


Cheers and happy coding :slight_smile:

Notes:

[0] ed (text editor) - Wikipedia


@bradtaniguchi When I read Your writing, I always think it deserves three hearts, but they only let me give you one. So here are the other two:
:hearts: :hearts:

Thank you for being super Helpful.


No comment. : )


Last, but not least, @JeremyLT: As a person who has played both, I agree with you. And this comment prompted me into action.

I logged out of the Forums, after I read this, and then coded for the next 12 hours. So, Thank You for this response.