I am not getting the desired result - please help!

thinker · May 15, 2021, 10:57am

Hello,

I have multiple JSON files and want to extract all keys from them into a CSV. Below is my code, but unfortunately I am not getting the desired result. After execution I do get an output file, but it contains only the first level of keys from the JSON file, whereas I expect all keys.
Could you please point out where the issue is?

import json  # For JSON loading
import csv  # For CSV dict writer


def get_leaves(item, key=None, key_prefix=""):
    """
    Flatten a nested dictionary/list structure into a flat dictionary.
    """
    if isinstance(item, dict):
        leaves = {}
        # Iterate over the dictionary and call get_leaves recursively on each value until the leaves are reached.
        for item_key in item.keys():
            # A leaf may share its key with a parent or with another leaf, so prefix each key with its parents' keys to keep them distinct.
            temp_key_prefix = (
                item_key if (key_prefix == "") else (key_prefix + "_" + str(item_key))
            )
            leaves.update(get_leaves(item[item_key], item_key, temp_key_prefix))
        return leaves
    elif isinstance(item, list):
        leaves = {}
        elements = []
        # Iterate over the list: a plain value is appended to the current key's list,
        # while a nested dict or list is passed to get_leaves recursively to reach its leaves.
        for element in item:
            if isinstance(element, dict) or isinstance(element, list):
                leaves.update(get_leaves(element, key, key_prefix))
            else:
                elements.append(element)
        if len(elements) > 0:
            leaves[key] = elements
        return leaves
    else:
        return {key_prefix: item}


with open("4.json") as f_input, open("output.csv", "w", newline="") as f_output:
    json_data = json.load(f_input, strict=False)
    # First pass: parse all entries to collect the unique field names. The file is already loaded in RAM,
    # and also keeping every flattened dictionary in a list or other data structure could exhaust memory,
    # so collect the keys first, then flatten each dictionary again while writing it to the CSV.
    fieldnames = set()
    for entry in json_data:
        fieldnames.update(get_leaves(entry).keys())
    csv_output = csv.DictWriter(f_output, delimiter=";", fieldnames=sorted(fieldnames))
    csv_output.writeheader()
    csv_output.writerows(get_leaves(entry) for entry in json_data)

Sky020 · May 15, 2021, 11:20am

Hello there,

I’ve edited your post for readability. When you enter a code block into a forum post, please precede it with a separate line of three backticks and follow it with a separate line of three backticks to make it easier to read.

You can also use the “preformatted text” tool in the editor (</>) to add backticks around text.

See this post to find the backtick on your keyboard.
Note: Backticks (`) are not single quotes (’).

thinker · May 15, 2021, 12:14pm

I am sorry, I was not aware. Thank you.

andrew-1135 · May 15, 2021, 10:19pm

Can you post an example of input and desired output? I’m honestly not sure what you’re trying to accomplish here.

thinker · May 16, 2021, 4:57am

This is an example in which I have successfully extracted the first level of key names “behavior”, “apistats”, but unfortunately the rest of the keys contain nested dicts and lists.

Please let me know how I can share a sample of the JSON file here so that you have something meaningful to test with.

andrew-1135 · May 16, 2021, 9:40pm

Still unclear. What was the input? What are the numbers?

Jagaya · May 17, 2021, 9:04am

Can you just open the JSON file and print it, then copy and paste the output?
Also consider printing your result to test it before exporting the CSV.
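
A minimal sketch of that kind of check, assuming the top level of the file might be a JSON object rather than an array. `sample_text`, its contents apart from the “behavior” and “apistats” key names, and the helper `as_entries` are hypothetical stand-ins, not taken from the real file. If the top level is an object, looping over it yields only its key strings, which would be one possible explanation for seeing just the first level of keys:

```python
import json

# Hypothetical stand-in for the contents of a file such as 4.json.
sample_text = '{"behavior": {"apistats": {"a": 1}}, "info": {"id": 7}}'

json_data = json.loads(sample_text, strict=False)
print(type(json_data).__name__)  # shows whether the top level is a dict or a list

# Iterating over a dict yields its keys (plain strings), not the nested values.
entries = list(json_data)
print(entries)


def as_entries(data):
    """Return a list of records whether the top level is an object or an array."""
    return data if isinstance(data, list) else [data]


records = as_entries(json_data)
print(json.dumps(records[0], indent=2))  # inspect one record before writing any CSV
```

If the printed type is `dict`, wrapping the data in a list as above makes each loop iteration receive a whole record instead of a key name.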

Personally, I just created a nested dict/list thing to test your function.
It looks fine to me - except it didn’t create the correct “dict_d” key but only “d”, so that’s a minor issue.
So maybe tell us what different output you would expect from this:

thing = {
    "dict": {
        "b":"c",
        "c":"b",
        "d":["list1", "list2"]
    },
    "list": [
             {
                 "l_dict":"e"
             },
             "list_entry"
    ]
}

get_leaves(thing)
# output
{'dict_b': 'c',
 'dict_c': 'b',
 'd': ['list1', 'list2'], 
 'list_l_dict': 'e',
 'list': ['list_entry']
 }
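
The “d” versus “dict_d” difference above comes from the list branch storing plain values under `key` instead of `key_prefix`. A minimal sketch of one way to get the prefixed key, assuming the prefixed name is what is wanted; the function name `flatten` and the in-memory `buffer` are illustrative choices, not part of the original code:

```python
import csv
import io


def flatten(item, key_prefix=""):
    """Flatten nested dicts and lists into {prefixed_key: value}."""
    if isinstance(item, dict):
        leaves = {}
        for item_key, value in item.items():
            prefix = str(item_key) if key_prefix == "" else f"{key_prefix}_{item_key}"
            leaves.update(flatten(value, prefix))
        return leaves
    if isinstance(item, list):
        leaves = {}
        scalars = []
        for element in item:
            if isinstance(element, (dict, list)):
                # Nested containers keep the prefix of the list they sit in.
                leaves.update(flatten(element, key_prefix))
            else:
                scalars.append(element)
        if scalars:
            # Use the full prefix here, so "d" inside "dict" becomes "dict_d".
            leaves[key_prefix] = scalars
        return leaves
    return {key_prefix: item}


thing = {
    "dict": {"b": "c", "c": "b", "d": ["list1", "list2"]},
    "list": [{"l_dict": "e"}, "list_entry"],
}
flat = flatten(thing)
print(flat)

# Write to an in-memory buffer so the result can be inspected before touching a file.
records = [thing]
fieldnames = sorted({name for record in records for name in flatten(record)})
buffer = io.StringIO()
writer = csv.DictWriter(buffer, delimiter=";", fieldnames=fieldnames)
writer.writeheader()
writer.writerows(flatten(record) for record in records)
print(buffer.getvalue())
```

One limitation carries over from the original approach: when a list holds several dicts with the same keys, later values overwrite earlier ones because they flatten to the same column name.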
