Request - Extract features/artifacts from CUCKOO JSON


Would you please help me create a script that can extract features or artifacts from the JSON files (Cuckoo) that I got after malware analysis? I am not a programmer, and someone shared this website with me for help.

Thank you for your time.

What exactly do you want to do? If you are not a programmer, why would you need a script for that?

JSON files can be opened with a basic text editor, if you are just looking for their content.

Thank you for your feedback. I have a dataset of malware and benign samples with a total of 8500 files. I have to execute all of those files in a virtual environment (Cuckoo) for analysis. Cuckoo gives me a summary in JSON format for each file. I need to extract artifacts/features (network, static, memory dump, etc.) from it. Analyzing one file by hand is OK, but with a huge dataset it requires a script that can extract those features/artifacts from the data folder automatically.

Thank you

That certainly explains the need for a script.
But why does a non-programmer need to do this? If this were a routine task, surely there would already be a script for it, maybe an entire program for handling the data, written by programmers. So Google might yield a result.

If it is not a routine task, why do you, a non-programmer, need to do this?
And what do you need it for?
I mean, given 8500 files, the script will give you 8500 results for every feature, which, depending on the purpose, would need another script to evaluate them, unless you’d just like a list counting all the different entries or the like. I also think memory dumps in particular can be quite unique for every single file, resulting in 8500 different dumps that need further analysis.

The reason for the script is that I need the dataset as a CSV covering all malware/benign files, so that I can create a model that does the prediction using a machine learning algorithm.

Hopefully that clarifies the reason.

Thank you

Doesn’t creating a machine-learning model also involve some level of programming?

I can’t write you the script because for a start, I don’t know how the JSON looks.
What I think you need to do is use Python’s os module (the standard module for operating system tasks such as listing files) to loop over the files, open each one, extract the relevant data based on where it is located in the JSON (by index and/or keys), append that to probably a basic list, and then save that list as CSV with something like pandas’ DataFrame.to_csv.
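As a rough illustration of that loop, here is a minimal sketch. I have not seen your reports, so the key paths in FEATURE_PATHS (info/score, network/dns, network/http, signatures) are only my assumptions about what a report might contain, and the function names are made up for this example. It also uses the standard csv module instead of pandas so that it runs without installing anything; the rows could just as well be put into a pandas DataFrame and saved from there.

```python
import csv
import json
import os

# Hypothetical feature map: CSV column name -> key path inside one report.
# These paths are assumptions; check them against your own JSON files.
FEATURE_PATHS = {
    "score": ("info", "score"),
    "dns_requests": ("network", "dns"),
    "http_requests": ("network", "http"),
    "signatures": ("signatures",),
}


def dig(data, path):
    """Follow a sequence of keys into nested dicts; return None if one is missing."""
    for key in path:
        if not isinstance(data, dict) or key not in data:
            return None
        data = data[key]
    return data


def extract_features(report):
    """Flatten one parsed report into a row; lists are reduced to their length."""
    row = {}
    for name, path in FEATURE_PATHS.items():
        value = dig(report, path)
        row[name] = len(value) if isinstance(value, list) else value
    return row


def reports_to_csv(root_dir, csv_path):
    """Walk root_dir, parse every .json file, and write one CSV row per file."""
    rows = []
    for folder, _subdirs, files in os.walk(root_dir):
        for filename in sorted(files):
            if not filename.endswith(".json"):
                continue
            full_path = os.path.join(folder, filename)
            try:
                with open(full_path, encoding="utf-8") as handle:
                    report = json.load(handle)
            except (OSError, ValueError):
                continue  # skip unreadable or malformed files
            row = {"file": full_path}
            row.update(extract_features(report))
            rows.append(row)

    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=["file", *FEATURE_PATHS])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

You would call it with something like reports_to_csv("path/to/your/analyses", "features.csv"), where both paths are placeholders. Missing keys end up as empty cells, so one incomplete report does not stop the whole run. For machine learning you would still need to add a malware/benign label column yourself, since the sketch has no way to know which sample is which.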
