First Matrice project: analyse a simple text and present the outcome in simple terms

Laeti-dev/project1-text_analysis

project1-text_analysis

First Matrice project: analyse a simple text and present the outcome in simple terms

Jupyter notebook in Python, using matplotlib and wordcloud

On the occasion of COP27, I wanted to analyse a text on the topic of the environment. Remembering Greta Thunberg's direct speeches, I found it interesting to analyse word frequencies and make a word cloud out of them, then to compare the 'How dare you' speech, given in 2019 at the U.N. Climate Action Summit, with the 'Blah blah blah' speech given two years later in Milan, shortly before COP26.

After removing unwanted whitespace and punctuation, the words were normalised to lower case, then stored in a list with stop words removed (keeping 'we', 'you', 'i', 'our', 'your' for the analysis).
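The cleaning step could look roughly like the sketch below. It is not the notebook's actual code: the function name `tokenize` and the short `STOP_WORDS` set are assumptions made for illustration, and only the kept pronouns come from the description above.

```python
import re

# Hypothetical stop-word list, for illustration only; the project's real list is not shown here.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "are",
              "it", "that", "this", "for", "on", "with"}

# Pronouns that are deliberately kept for the analysis, as described above.
KEPT_WORDS = {"we", "you", "i", "our", "your"}


def tokenize(text):
    """Lower-case the text, strip punctuation and return the tokens without stop words."""
    lowered = text.lower()
    # Replace every character that is not an ASCII letter, apostrophe or whitespace with a space.
    # (Accented letters would be dropped too; fine for these English speeches.)
    cleaned = re.sub(r"[^a-z'\s]", " ", lowered)
    # split() with no argument also collapses runs of unwanted whitespace.
    tokens = [word.strip("'") for word in cleaned.split()]
    return [word for word in tokens
            if word and (word in KEPT_WORDS or word not in STOP_WORDS)]


print(tokenize("How dare you!  You have stolen my dreams, and the rest."))
```

Keeping the pronouns in a separate set makes the exception explicit, even if a larger stop-word list (from a library, for instance) is swapped in later.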

Finally, a function taking this list of tokens as a parameter counted the frequency of each word and stored the results as key-value pairs in a dictionary.
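A minimal sketch of such a counting function is shown below. The names `count_frequencies` and `top_words` are assumptions, not the notebook's own; the second helper is only there to extract the top 25 words used in the charts.

```python
def count_frequencies(tokens):
    """Return a dictionary mapping each word to the number of times it occurs."""
    frequencies = {}
    for token in tokens:
        frequencies[token] = frequencies.get(token, 0) + 1
    return frequencies


def top_words(frequencies, n=25):
    """Return the n most frequent (word, count) pairs, ties broken alphabetically."""
    ordered = sorted(frequencies.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:n]


freqs = count_frequencies(["you", "dare", "you", "we"])
print(top_words(freqs, n=2))
```

The standard library's `collections.Counter` does the same job (with `most_common` for the ranking); the explicit loop is kept here because it mirrors the description above.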

As a visual aid, the data were presented in a horizontal bar chart. Different colors were used to separate frequencies, as themes seemed to appear within each group.

[Figure: 'How dare you' speech, top 25 words] [Figure: 'Blah blah blah' speech, top 25 words]
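A chart of this kind could be produced along the following lines. This is a sketch under assumptions: giving one color to each distinct frequency value is a guess at how the groups were separated, and `color_by_frequency`, `plot_top_words` and the palette are hypothetical names and choices.

```python
def color_by_frequency(counts, palette=("tab:blue", "tab:orange", "tab:green",
                                        "tab:red", "tab:purple")):
    """Give every distinct frequency value its own color, cycling through the palette."""
    distinct = sorted(set(counts), reverse=True)
    color_of = {value: palette[i % len(palette)] for i, value in enumerate(distinct)}
    return [color_of[value] for value in counts]


def plot_top_words(pairs, title):
    """Draw (word, count) pairs as a horizontal bar chart and return the figure."""
    # Imported here so the color helper above can be used without matplotlib installed.
    import matplotlib.pyplot as plt

    words = [word for word, _ in pairs]
    counts = [count for _, count in pairs]
    fig, ax = plt.subplots(figsize=(8, 6))
    ax.barh(words, counts, color=color_by_frequency(counts))
    ax.invert_yaxis()  # put the first (most frequent) word at the top
    ax.set_xlabel("Frequency")
    ax.set_title(title)
    fig.tight_layout()
    return fig
```

Passing the sorted top 25 pairs for each speech, then calling `plt.show()` or `fig.savefig(...)`, would give one chart per speech.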

For the fun of testing other visuals, comparing the two speeches with a simple word cloud gave:

[Figure: 'How dare you' speech word cloud] [Figure: 'Blah blah blah' speech word cloud]

And to go further, we could play with wordcloud masks:

[Figure: 'How dare you' speech word cloud, world mask] [Figure: 'Blah blah blah' speech word cloud, world mask]
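Both the plain and the masked word clouds can be built from the frequency dictionary. The sketch below assumes the `wordcloud`, `numpy` and `Pillow` packages are installed; `make_wordcloud` and the mask file name in the usage note are hypothetical, not taken from the notebook.

```python
def make_wordcloud(frequencies, mask_path=None):
    """Build a word cloud from a {word: count} dictionary, optionally shaped by a mask image."""
    if not frequencies:
        raise ValueError("frequencies must contain at least one word")

    # Third-party imports are kept local so the check above runs without them.
    from wordcloud import WordCloud

    mask = None
    if mask_path is not None:
        import numpy as np
        from PIL import Image
        # Pure white areas of the mask image are left empty by WordCloud.
        mask = np.array(Image.open(mask_path))

    cloud = WordCloud(background_color="white", mask=mask)
    return cloud.generate_from_frequencies(frequencies)
```

For example, `make_wordcloud(freqs).to_file("cloud.png")` would write a plain cloud, and `make_wordcloud(freqs, "world_mask.png")` would shape it with a (hypothetical) world-map image.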
