Repository for the CS229 project by Rémy Zawislak (remzawi@stanford.edu) and Albin Forsberg (albinfor@stanford.edu) on theme classification of texts using Naive Bayes and neural networks.
NB.py: Naive Bayes implementation
DNN.py: Dense Neural Network implementation
CNN.py: Convolutional Neural Network implementation
SSandDP.py: theme classification and sentence selection heuristics for classifying entire texts, plus some data processing functions
visualization.py: code to plot the confusion matrices
report.pdf: the report as uploaded to Gradescope
The base dataset used in the project can be found here: https://www.researchgate.net/publication/304742521_CSV_dataset_of_76000_quotes_suitable_for_quotes_recommender_systems_or_other_analysis
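
For readers unfamiliar with the Naive Bayes approach that NB.py is described as implementing, here is a minimal sketch of a multinomial Naive Bayes theme classifier with add-one (Laplace) smoothing. It is not the code from NB.py: the function names (tokenize, train_nb, predict_nb), the whitespace tokenizer, and the four toy quotes with their theme labels are all assumptions made up for illustration and are not taken from the dataset.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase + whitespace split; the project may tokenize differently.
    return text.lower().split()


def train_nb(samples):
    """Count theme frequencies and per-theme word frequencies.

    samples: list of (text, theme) pairs.
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, theme in samples:
        class_counts[theme] += 1
        for word in tokenize(text):
            word_counts[theme][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the theme with the highest log-posterior for the given text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_theme, best_score = None, -math.inf
    for theme, n_docs in class_counts.items():
        # Log prior of the theme.
        score = math.log(n_docs / total_docs)
        # Denominator for add-one smoothing over the vocabulary.
        denom = sum(word_counts[theme].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[theme][word] + 1) / denom)
        if score > best_score:
            best_theme, best_score = theme, score
    return best_theme


# Hypothetical toy data, invented for this sketch only.
samples = [
    ("love is patient and kind", "love"),
    ("to love and be loved", "love"),
    ("life is a long journey", "life"),
    ("live your life fully", "life"),
]
model = train_nb(samples)
print(predict_nb(model, "love is kind"))
print(predict_nb(model, "a long life"))
```

Working in log space avoids numerical underflow when many word probabilities are multiplied, and the smoothing keeps a word that never appeared under a theme from driving that theme's probability to zero.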