Scenario B, phase 2

Hey, how's it hanging? I wrote a piece of code to calculate how many times each word appears in each forum post. Do you think we could use that to improve our spam-classifier? Cheers, Robin.

Robin was bored over the weekend and decided to write some more software to help moderate the discussion board. Robin’s calculations give you additional data, but you are worried about how to incorporate the word counts into the model you are training. The authors of the posts are multilingual, and their posts contain many sentences, which may include both conjugation errors and spelling errors.
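To make the concern concrete, here is a minimal sketch of per-post word counts of the kind Robin describes. The function name `word_counts`, the tokenisation rule, and the three example posts are all assumptions made up for illustration; Robin's actual code is not shown in the scenario.

```python
import re
from collections import Counter


def word_counts(post):
    """Count lowercase word tokens in one post (hypothetical tokeniser)."""
    # [^\W\d_]+ matches runs of letters, including non-ASCII letters.
    return Counter(re.findall(r"[^\W\d_]+", post.lower()))


# Invented example posts, not taken from any real forum.
posts = [
    "Buy cheap watches now",
    "buy cheep watchs now",          # misspellings become separate words
    "Compra relojes baratos ahora",  # another language shares no words
]

counts = [word_counts(p) for p in posts]

# Every distinct word across all posts becomes one input feature.
vocabulary = sorted(set().union(*counts))

# One row per post, one column per vocabulary word.
matrix = [[c[w] for w in vocabulary] for c in counts]

print(len(vocabulary), "features for", len(posts), "posts")
for row in matrix:
    print(row)
```

In this toy case, three posts that say roughly the same thing already need ten features, and most entries in the matrix are zero. Misspellings, conjugated forms, and additional languages each add further columns, which is one way to read the worry described above.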

Hands-on option

Basic stage for the hands-on option

Review some scientific literature on multi-layer neural networks in which each layer contains one or more perceptrons, and discuss how such networks might help in this situation. Combine writing, diagrams, and pseudocode as needed in your response, and remember to clearly cite all your sources.
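As a point of reference for what "layers of perceptrons" means, the following is a small sketch of a forward pass through a two-layer network with step activations. It is not a spam classifier and involves no training: the weights are hand-picked (an assumption for illustration) so that the network computes XOR, the smallest function a single perceptron cannot represent. All names here (`step`, `perceptron`, `layer`, `forward`, `HIDDEN`, `OUTPUT`) are hypothetical.

```python
def step(x):
    """Threshold activation: fire (1) only when the input is positive."""
    return 1 if x > 0 else 0


def perceptron(inputs, weights, bias):
    """A single perceptron: weighted sum plus bias, then threshold."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def layer(inputs, units):
    """A layer is a list of (weights, bias) pairs sharing the same inputs."""
    return [perceptron(inputs, weights, bias) for weights, bias in units]


# Hand-picked weights, chosen for illustration rather than learned.
HIDDEN = [
    ([1, 1], -0.5),  # fires if at least one input is 1 (OR)
    ([1, 1], -1.5),  # fires only if both inputs are 1 (AND)
]
OUTPUT = [
    ([1, -1], -0.5),  # fires if OR is on and AND is off
]


def forward(inputs):
    """Feed the inputs through the hidden layer, then the output layer."""
    return layer(layer(inputs, HIDDEN), OUTPUT)[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", forward([a, b]))
```

In the forum setting, the inputs would be word counts rather than two binary values, and the weights would be learned from labelled posts instead of set by hand; how to do that is what the literature review should cover.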

In-depth stage for the hands-on option

Review some scientific literature on dimensionality reduction and discuss possible ways of applying it to this situation. Combine writing, diagrams, and pseudocode as needed in your response, and remember to clearly cite all your sources.
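One simple way to see what reducing dimensionality does to word counts is feature hashing, sketched below. This technique is swapped in purely as an illustration because it fits in a few lines of standard-library Python; it is only one of several approaches you may come across, and the scenario does not prescribe it. The function names, the bucket count of 8, and the example post are assumptions.

```python
import hashlib
from collections import Counter


def bucket(word, n_buckets):
    """Map a word to a bucket index using a stable hash (MD5 here)."""
    digest = hashlib.md5(word.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets


def hash_features(counts, n_buckets=8):
    """Fold a word-count mapping into a fixed-length vector.

    Words that land in the same bucket have their counts added together,
    so the vector length no longer depends on the vocabulary size.
    """
    vector = [0] * n_buckets
    for word, n in counts.items():
        vector[bucket(word, n_buckets)] += n
    return vector


# Invented example post, already converted to word counts.
example = Counter("buy cheap watches now buy now".split())
vector = hash_features(example)
print(vector)
```

The trade-off is that unrelated words can collide in one bucket, so some information is lost in exchange for a fixed, smaller input size. Whether that loss is acceptable for spam detection is part of what your discussion should address.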

Conceptual option

Basic stage for the conceptual option

Review some scientific literature on neural networks and put together some rough guidelines for training one, should you need to. Under what circumstances would this kind of approach not be ideal? Remember to clearly cite all your sources.

In-depth stage for the conceptual option

Once you have completed your guidelines, read about deep learning online and discuss how it differs from (vanilla) neural networks. Remember to clearly cite all your sources.