Scenario B, final phase

That NLP video you linked me over lunch today was super cool. I already installed VADER on this laptop, and you should see what it says about those polemic HR emails from a while back ;) Cheers, Robin

Using natural language processing, you have gone beyond simply counting string occurrences. Conjugation and spelling are no longer issues. You are even considering automatically translating all the posts into (poorly written) English for the purpose of spam detection.

Hands-on option

Basic stage for the hands-on option

Using a computational tool of your choice (the Python template from class is just fine), carry out stemming and stop-word removal for a text of your choice. Prepare a word cloud in which each (stemmed) word is drawn at a font size that is proportional to its frequency within the text. Provide your code, discuss the process in writing, and include an image of your word cloud in your response.
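
As one possible starting point, the sketch below shows the three steps the basic stage asks for: stop-word removal, stemming, and turning frequencies into proportional font sizes. It is a minimal illustration rather than the class template: the stop-word list, the suffix-stripping `toy_stem` function, and the `max_size` parameter are all assumptions made for this example, and the stemmer is a crude stand-in for an established algorithm such as Porter's.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
    "was", "were", "it", "that", "this", "for", "on", "with",
}

# Suffixes removed by the toy stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word):
    """Strip at most one common suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_frequencies(text):
    """Lowercase, tokenise, drop stop words, stem, and count the stems."""
    tokens = re.findall(r"[a-z]+", text.lower())
    stems = [toy_stem(token) for token in tokens if token not in STOP_WORDS]
    return Counter(stems)


def font_sizes(freqs, max_size=48.0):
    """Give the most frequent stem max_size; scale the rest proportionally."""
    if not freqs:
        return {}
    top = max(freqs.values())
    return {stem: max_size * count / top for stem, count in freqs.items()}


if __name__ == "__main__":
    sample = "The cats were jumping and the cat jumped over a dog; cats!"
    for stem, size in font_sizes(stem_frequencies(sample)).items():
        print(f"{stem}: {size:.1f} pt")
```

The resulting stem-to-size mapping can then be handed to whichever drawing or word-cloud tool you prefer. When you discuss the process in writing, the output of a simple stemmer like this one gives you concrete material, since it will produce some stems that are not real words.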

In-depth stage for the hands-on option

Label each word as positive, negative, or neutral, based on a lexicon of your choice. Modify your word cloud from the basic stage to colour each term according to its sentiment.
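
The labelling step might be sketched as follows. Everything here is hypothetical: `TOY_LEXICON` and its scores are invented for illustration and are not taken from VADER or any other published lexicon, and the `threshold` value and the colour names are arbitrary choices.

```python
# Hypothetical mini-lexicon: word -> polarity score (invented values).
# A real lexicon is far larger and would normally be loaded from a file.
TOY_LEXICON = {
    "good": 1.5,
    "great": 2.5,
    "love": 3.0,
    "bad": -2.0,
    "hate": -3.0,
    "spam": -1.5,
}

# One colour per sentiment class (arbitrary choice).
COLOURS = {"positive": "green", "negative": "red", "neutral": "grey"}


def sentiment_class(word, lexicon=TOY_LEXICON, threshold=0.5):
    """Return 'positive', 'negative', or 'neutral' for a single word."""
    score = lexicon.get(word, 0.0)  # words missing from the lexicon count as neutral
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


def colour_for(word, lexicon=TOY_LEXICON, threshold=0.5):
    """Map a word to the colour of its sentiment class."""
    return COLOURS[sentiment_class(word, lexicon, threshold)]


if __name__ == "__main__":
    for word in ["love", "hate", "table"]:
        print(word, sentiment_class(word), colour_for(word))
```

One design point to address in your write-up: most lexicons list full words, so a stemmed term may not be found and would default to neutral. Looking up the unstemmed word before stemming is one way around this.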

Conceptual option

Basic stage for the conceptual option

Review some scientific literature on the use of natural language processing in hate-speech detection and then discuss, in writing, the benefits and risks of relying on a computational tool to censor hate speech. Remember to clearly cite all your sources.

In-depth stage for the conceptual option

Once you have completed the basic stage, look into the methods used for automated translation. Discuss possible future applications of AI-assisted language technology that are not yet in wide use.