Python help with wordcloud

HollyBomb · October 2, 2022, 3:01pm

Hi. So I’ve recently been taking a course in Python with Google Certificates. I’ve managed great so far, but the final project is extremely hard. The software doesn’t show you the progress of the dictionary created so far, so it’s quite difficult to understand where I’m going wrong. The goal is to create a word cloud from the counted words in a dictionary.

Below is the current code I have.
The part that says learner code starts here is the only code written by me; there are other installs and uploads done before this, with some done behind the scenes. Any help would be appreciated.

    punctuations = '''!()-[]{};:'"\,<>./?@#$%^&*_~'''
    uninteresting_words = ["the", "a", "to", "if", "is", "it", "of", "and", "or", "an", "as", "i", "me", "my", \
    "we", "our", "ours", "you", "your", "yours", "he", "she", "him", "his", "her", "hers", "its", "they", "them", \
    "their", "what", "which", "who", "whom", "this", "that", "am", "are", "was", "were", "be", "been", "being", \
    "have", "has", "had", "do", "does", "did", "but", "at", "by", "with", "from", "here", "when", "where", "how", \
    "all", "any", "both", "each", "few", "more", "some", "such", "no", "nor", "too", "very", "can", "will", "just"]
    
    # LEARNER CODE START HERE
    split_text = file_contents.split()
    result_dict = {}
    
    for word in split_text:
        for char in word:
                if char in punctuations:
                    char.replace(punctuations, "")
        
        if word in uninteresting_words:
            pass
        else:
            if word not in result_dict:
                result_dict += word
                result_dict[word] = 1
                
            else:
                result_dict[word] += 1

               
    #wordcloud
    cloud = wordcloud.WordCloud()
    cloud.generate_from_frequencies(result_dict)
    return cloud.to_array()

myimage = calculate_frequencies(result_dict)
plt.imshow(myimage, interpolation = 'nearest')
plt.axis('off')
plt.show()

greentart · October 3, 2022, 2:00am

The immediate problem is:

        if word not in result_dict:
            result_dict += word
            result_dict[word] = 1

result_dict += word tries to add a string to a dictionary ({} + "word"), which dictionaries don’t support, so Python raises a TypeError. result_dict[word] = 1 is all you need to add the new word and its count (1) to the dictionary. Once you fix that, you’ll find that this doesn’t remove any punctuation:

    for word in split_text:
        for char in word:
            if char in punctuations:
                char.replace(punctuations, "")

Strings in Python are immutable. You can’t change them. Any time you make a change, you’re actually creating a new string and the original isn’t affected. char.replace doesn’t change char, and even if it did, it wouldn’t change the word that char came from. It also doesn’t take a set of characters like that, unfortunately: replace looks for its first argument as one whole substring. This will work:

    for word in split_text:
        for punctuation in punctuations:
            if punctuation in word:
                word = word.replace(punctuation, "")

But it’s not very efficient. Maybe you can find a better way?
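One possible answer, as a rough sketch rather than the course's intended solution: str.maketrans and str.translate can delete every punctuation character in a single pass per word. The names count_words and UNINTERESTING are made up for illustration, string.punctuation stands in for the course's punctuations string, and the lowercasing step is an added assumption so that "The" matches "the" in the list.

```python
import string

# Hypothetical short list for illustration; the course supplies a longer one.
UNINTERESTING = {"the", "a", "to", "if", "is", "it", "of", "and"}


def count_words(text):
    """Count words in text, ignoring punctuation and uninteresting words."""
    # Translation table that deletes every punctuation character.
    strip_punct = str.maketrans("", "", string.punctuation)
    counts = {}
    for word in text.split():
        # translate returns a new string; the original word is untouched.
        cleaned = word.translate(strip_punct).lower()
        if not cleaned or cleaned in UNINTERESTING:
            continue
        # get() returns 0 for a new word, so no separate "not in" branch is needed.
        counts[cleaned] = counts.get(cleaned, 0) + 1
    return counts


print(count_words("The cat, the dog; the cat!"))  # {'cat': 2, 'dog': 1}
```

The resulting dictionary has the word-to-count shape that generate_from_frequencies is given in the original post.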

HollyBomb · October 3, 2022, 12:17pm

Thank you, that makes sense.
I’ve replaced the code with what you suggested, but it’s still throwing this error at me:

    NameError                                 Traceback (most recent call last)
    in
          1 # Display your wordcloud image
          2
    ----> 3 myimage = calculate_frequencies(result_dict)
          4 plt.imshow(myimage, interpolation = 'nearest')
          5 plt.axis('off')

    NameError: name 'result_dict' is not defined

This error doesn’t help me much at all, as it doesn’t point to anything specific to fix in the code.

greentart · October 3, 2022, 8:43pm

It literally points to the line: ----> 3 myimage = calculate_frequencies(result_dict) and at the bottom gives the reason: NameError: name 'result_dict' is not defined. Python error messages aren’t the greatest in the universe, but they are very good. You should take more time to read them. For whatever reason, nothing called result_dict is defined. You must not be running the code that creates it. If you’re running this in a Jupyter notebook environment, you have to run all the cells.
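One more thing worth checking, assuming the indented code in the first post sits inside a def calculate_frequencies(file_contents): line that isn't shown: a variable created inside a function only exists inside that function. If that is the setup, the call would need to pass the text (file_contents), not result_dict. A minimal sketch of that situation, with a made-up sample string and a simplified function body that returns the dictionary instead of an image:

```python
def calculate_frequencies(file_contents):
    # result_dict is created here, so it only exists while the function runs.
    result_dict = {}
    for word in file_contents.split():
        result_dict[word] = result_dict.get(word, 0) + 1
    return result_dict


# Hypothetical sample text standing in for the uploaded file.
file_contents = "red blue red"

# Pass the text the function expects, not the function's own local variable.
frequencies = calculate_frequencies(file_contents)
print(frequencies)  # {'red': 2, 'blue': 1}

# Using the local name out here fails the same way as in the traceback.
try:
    result_dict
except NameError as err:
    error_message = str(err)
    print(error_message)  # name 'result_dict' is not defined
```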
