You oversee the online discussion board of a nonprofit organization for which you volunteer on weekends. It is important to allow your target audience to interact, but you worry about hate speech and spam. Assigning volunteer staff to screen all posts and comments seems unfeasible, not only because of the workload but also because of the constant exposure to potential toxicity.
Illustration by Robert Couse-Baker, Wikimedia, reused with Creative Commons licence.
You have been telling your colleagues at the organization about your plans at work to detect pavement defects, and they encourage you to explore whether computational intelligence could do the heavy lifting in this task as well.
Robin has been supporting your data-analysis efforts and has helped clean up a dataset (download the log file) from the forum posts. Your current hypothesis is that people who post abnormally often and get few up-votes might be undesirable users.
Authors are represented by their numerical user IDs since you do not want anyone to think ill of an individual just because your system might, at an early stage, imply that they are a spammer or a troll.
With your preferred computational tool (the Python example from class is just fine), train a perceptron to label the posts that were not manually labelled (those that say "none" in the third column) as either "spam" or "good". Remember that your training set can only contain data points that are manually labelled, and you might want to set a part of those aside for testing purposes. Discuss your code in writing and report the results of the model. Include a confusion matrix.
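As a starting point, the workflow described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the class example: the rows are invented toy data rather than the real log file, the layout (posts per week, mean up-votes, label in the third column) is a guess at what aggregated per-author features might look like, and the helpers train_perceptron, predict and confusion_matrix are hypothetical names. Features are standardized with statistics from the training rows only, and every third labelled row is held out for testing.

```python
import numpy as np

# Invented toy rows, NOT the real log file: (posts_per_week, mean_upvotes, label).
ROWS = [
    (2, 5.0, "good"), (20, 0.5, "spam"), (3, 4.0, "good"), (15, 1.0, "spam"),
    (1, 6.0, "good"), (25, 0.0, "spam"), (4, 7.0, "good"), (18, 0.2, "spam"),
    (2, 3.5, "good"), (22, 1.0, "spam"), (3, 6.0, "good"), (30, 0.5, "spam"),
    (28, 0.1, "none"), (2, 4.5, "none"),
]
LABELS = {"good": 0, "spam": 1}  # "none" marks rows without a manual label


def train_perceptron(X, y, epochs=100, lr=1.0):
    """Classic perceptron learning rule with a step activation."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        errors = 0
        for xi, target in zip(X, y):
            pred = int(xi @ w + b > 0)
            if pred != target:
                # Move the decision boundary toward the misclassified point.
                w += lr * (target - pred) * xi
                b += lr * (target - pred)
                errors += 1
        if errors == 0:  # every training point is classified correctly
            break
    return w, b


def predict(X, w, b):
    """Return 1 (spam) where the weighted sum is positive, else 0 (good)."""
    return (X @ w + b > 0).astype(int)


def confusion_matrix(y_true, y_pred):
    """Rows are actual classes, columns predicted classes, order (good, spam)."""
    cm = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


# Only manually labelled rows may be used for training and testing.
labelled = [r for r in ROWS if r[2] in LABELS]
unlabelled = [r for r in ROWS if r[2] == "none"]
X = np.array([r[:2] for r in labelled], dtype=float)
y = np.array([LABELS[r[2]] for r in labelled])
X_new = np.array([r[:2] for r in unlabelled], dtype=float)

# Hold out every third labelled row as a test set.
is_test = np.arange(len(y)) % 3 == 0
X_train, y_train = X[~is_test], y[~is_test]
X_test, y_test = X[is_test], y[is_test]

# Standardize using training statistics only, so the test set stays unseen.
mean, std = X_train.mean(axis=0), X_train.std(axis=0)


def scale(A):
    return (A - mean) / std


w, b = train_perceptron(scale(X_train), y_train)
cm = confusion_matrix(y_test, predict(scale(X_test), w, b))
new_pred = predict(scale(X_new), w, b)

print("Confusion matrix (rows actual, columns predicted):")
print(cm)
print("Predicted labels for unlabelled rows:", new_pred)
```

With the real dataset, the toy ROWS list would be replaced by features derived from the log file, and a random split is usually preferable to the fixed every-third-row rule used here for reproducibility.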
Select and compute at least three performance measures (either based on the confusion matrix or on some other aspect of the resulting model and/or of the training/testing calculations).
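Four common measures that can be derived from a two-class confusion matrix are accuracy, precision, recall and the F1 score. The sketch below assumes that "spam" is treated as the positive class and that the matrix is laid out with actual classes in rows and predicted classes in columns, both in the order (good, spam). The counts in the example are made up for illustration, and performance_measures is a hypothetical helper name.

```python
def performance_measures(cm):
    """Compute measures from a 2x2 matrix [[TN, FP], [FN, TP]] (spam = positive)."""
    (tn, fp), (fn, tp) = cm
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: share of posts flagged as spam that really are spam.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual spam posts that were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up counts: 50 true negatives, 5 false positives,
# 10 false negatives, 35 true positives.
example_cm = [[50, 5], [10, 35]]
measures = performance_measures(example_cm)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

In a moderation setting, precision and recall are often more informative than accuracy alone, since wrongly flagging a good user and missing a spammer carry different costs.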
Read the first chapter of the online textbook Neural Networks and Deep Learning (Nielsen, 2019), and then sketch a rudimentary illustrated glossary on the main components and elements of a simple neural network.
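To anchor the glossary terms in something concrete, the following minimal sketch shows how inputs, weights, biases, an activation function and layers fit together in a forward pass through a small feedforward network. The layer sizes, the random initial values and the names sigmoid and feedforward are illustrative assumptions, not code taken from the textbook.

```python
import numpy as np


def sigmoid(z):
    """Activation function: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def feedforward(x, layers):
    """Pass input x through each layer; a layer is a (weights, biases) pair."""
    a = x
    for W, b in layers:
        # Each neuron computes a weighted sum of its inputs plus a bias,
        # then applies the activation function.
        a = sigmoid(W @ a + b)
    return a


# Assumed architecture: 2 input values, 3 hidden neurons, 1 output neuron.
sizes = [2, 3, 1]
rng = np.random.default_rng(0)
layers = [
    (rng.standard_normal((n_out, n_in)), rng.standard_normal(n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]

output = feedforward(np.array([0.5, -1.0]), layers)
print("Network output:", output)
```

The weights and biases here are random, so the output is meaningless until the network is trained; the sketch only shows where each component sits.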
Browse through the textbook An Introduction to Neural Networks (Gurney, 1997) and write down any interesting advances, components, and elements that more complex neural networks can contain and apply.