Say I have a folder full of .txt files. Within each text file, there is a line that says the following:
Name: [name here]
So for example, the first three text files could contain Bob Smith, Joe Snow, or Mary Fields in the “Name” field (in the place of [name here]).
I’d like to extract the names into a CSV file that looks like this:
file_name, name
1.txt, Bob Smith
2.txt, Joe Snow
3.txt, Mary Fields
Further, I’d like to add a column called “text” that contains the full contents of each .txt file (e.g., the text column for 1.txt would say “Name: Bob Smith”). This is the JS solution that I’ve tried:
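const fs = require('fs');
// Collect every .txt file in the current directory (the dot must be escaped).
const files = fs.readdirSync('./').filter((file) => /\.txt$/.test(file));
if (!files.length) process.exit(1);
// Quote a CSV field and double any embedded quotes, since the text column
// may contain commas, quotes, or line breaks.
const quote = (value) => `"${String(value).replace(/"/g, '""')}"`;
// Write the header row once, with all three columns on one line.
fs.writeFileSync('names.csv', 'file_name,name,text\n');
files.forEach((file) => {
  // Read each file individually; readFileSync cannot read a directory like './'.
  const text = fs.readFileSync(file, { encoding: 'utf8' });
  // Capture whatever follows "Name:" on its own line.
  const match = text.match(/^Name:[ \t]*(.*)$/mi);
  if (match && match[1]) {
    fs.appendFileSync('names.csv', `${[file, match[1].trim(), text].map(quote).join(',')}\n`);
  }
});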
For this problem, I don’t know how to do this using Python, but I do know how to do it using JavaScript with Node.js. The approach uses Node.js’s filesystem (fs) module to read each file, JavaScript string functions to turn the contents into comma-separated text, and the filesystem module again to write the .csv file. If you’re interested, feel free to ask me for more details.
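The read, convert, and write steps described above can be sketched as a few small functions. This is only a minimal sketch, not a definitive implementation: the helper names extractName, toCsvField, and buildCsv are hypothetical, and it assumes each file has at most one line starting with "Name:" and that every field should be double-quoted so the text column can safely hold commas and line breaks.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: return the text after "Name:" on its own line,
// or an empty string when no such line exists.
function extractName(text) {
  const match = text.match(/^Name:[ \t]*(.*)$/mi);
  return match ? match[1].trim() : '';
}

// Hypothetical helper: wrap a value in double quotes and double any
// embedded quotes, the usual way to escape a CSV field.
function toCsvField(value) {
  return `"${String(value).replace(/"/g, '""')}"`;
}

// Hypothetical helper: build the full CSV text for every .txt file in `dir`.
function buildCsv(dir) {
  const files = fs.readdirSync(dir)
    .filter((file) => file.endsWith('.txt'))
    .sort();
  const rows = files.map((file) => {
    const text = fs.readFileSync(path.join(dir, file), 'utf8');
    return [file, extractName(text), text].map(toCsvField).join(',');
  });
  return ['file_name,name,text', ...rows].join('\n') + '\n';
}
```

Calling fs.writeFileSync('names.csv', buildCsv('./')) would then write the result for the current folder. Keeping the CSV construction separate from the file write makes it easy to check the output on a small test folder first.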