A university project created for the "Information Retrieval" course, together with my colleague Nikos Stamopoulos.
The project reads the "BX-Books" dataset and uploads the data to ElasticSearch. Given a book lemma, it then queries the index that holds the uploaded data and returns the matching books in descending order of the BM25 similarity score provided by ElasticSearch.
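The indexing and query step could look roughly like the sketch below. It is an illustration, not the project's actual code: the index name `bx-books`, the local URL, the semicolon delimiter, the `latin-1` encoding and the `ISBN` / `Book-Title` column names are all assumptions about the setup and the CSV layout, and the keyword-style `search` call assumes a recent version of the official `elasticsearch` Python client.

```python
import csv
import io

INDEX = "bx-books"  # hypothetical index name


def book_actions(csv_file, index=INDEX):
    """Yield one bulk-index action per CSV row (assumes ';'-separated columns)."""
    reader = csv.DictReader(csv_file, delimiter=";")
    for row in reader:
        # Assumption: the ISBN column uniquely identifies a book.
        yield {"_index": index, "_id": row["ISBN"], "_source": dict(row)}


def title_query(lemma):
    """Build a match query on the title; ElasticSearch scores it with BM25 by default."""
    return {"match": {"Book-Title": lemma}}


def index_and_search(csv_path, lemma, url="http://localhost:9200", size=10):
    """Upload the CSV and return (title, BM25 score) pairs, best match first."""
    # Imported here so the helpers above work without the client installed.
    from elasticsearch import Elasticsearch, helpers

    client = Elasticsearch(url)
    with open(csv_path, newline="", encoding="latin-1") as f:
        helpers.bulk(client, book_actions(f))
    client.indices.refresh(index=INDEX)  # make the new documents searchable
    response = client.search(index=INDEX, query=title_query(lemma), size=size)
    # Hits are already sorted by _score in descending order.
    return [(hit["_source"]["Book-Title"], hit["_score"])
            for hit in response["hits"]["hits"]]
```

With a running ElasticSearch node, `index_and_search("BX-Books.csv", "mythology")` would return the best-matching titles together with their scores.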
We combine the BM25 ranking, the user's personal score and the average score across all users into our own personalized ranking of the book results, and return the results in descending order of that ranking.
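One simple way to combine the three signals is a weighted sum, sketched below. The weights, the normalization of BM25 by the best score in the result list and the assumed 0 to 10 rating scale are illustrative choices, not necessarily the formula used in the project; `personalized_ranking` is a hypothetical helper name.

```python
def personalized_ranking(hits, user_ratings, avg_ratings,
                         w_bm25=0.5, w_user=0.3, w_avg=0.2):
    """Rank books by a weighted sum of BM25, the user's rating and the average rating.

    hits:         list of (isbn, bm25_score) pairs from the search step
    user_ratings: dict isbn -> this user's rating (assumed 0-10 scale)
    avg_ratings:  dict isbn -> average rating over all users (assumed 0-10 scale)
    """
    if not hits:
        return []
    # Scale BM25 into [0, 1] so it is comparable with the scaled ratings.
    max_bm25 = max(score for _, score in hits) or 1.0
    ranked = []
    for isbn, bm25 in hits:
        bm25_norm = bm25 / max_bm25
        user = user_ratings.get(isbn, 0.0) / 10.0  # missing rating counts as 0
        avg = avg_ratings.get(isbn, 0.0) / 10.0
        combined = w_bm25 * bm25_norm + w_user * user + w_avg * avg
        ranked.append((isbn, combined))
    # Highest combined score first.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked
```

In this sketch a book the user rated highly can overtake a book with a better BM25 score, which is the effect a personalized ranking is after.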
We then try to improve the quality of the sorting by predicting the user's personal score for each book in the results.
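The prediction method is not described here, so the sketch below swaps in a deliberately simple stand-in: a bias baseline that predicts the global mean rating plus a per-user and a per-book offset. The function name `fit_baseline`, the `(user_id, isbn, rating)` tuple layout and the 0 to 10 clamp are assumptions for illustration only.

```python
from collections import defaultdict


def fit_baseline(ratings):
    """Fit a bias baseline on (user_id, isbn, rating) tuples and return a predictor."""
    global_mean = sum(r for _, _, r in ratings) / len(ratings)
    user_devs = defaultdict(list)
    book_devs = defaultdict(list)
    for user, isbn, rating in ratings:
        # Collect each rating's deviation from the global mean.
        user_devs[user].append(rating - global_mean)
        book_devs[isbn].append(rating - global_mean)
    user_bias = {u: sum(d) / len(d) for u, d in user_devs.items()}
    book_bias = {b: sum(d) / len(d) for b, d in book_devs.items()}

    def predict(user, isbn):
        # Unknown users or books fall back to a zero offset.
        raw = global_mean + user_bias.get(user, 0.0) + book_bias.get(isbn, 0.0)
        return min(10.0, max(0.0, raw))  # clamp to the assumed rating scale

    return predict
```

The predicted value can stand in for the missing personal score of books the user has not rated yet when the results are re-sorted.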
Finally, we perform k-means clustering of the books based on cosine similarity and try to extract demographic correlations between the clusters.
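K-means with cosine similarity is often implemented as spherical k-means: vectors are scaled to unit length, each point joins the centroid with the largest dot product, and centroids are re-normalized means. The pure-Python sketch below shows that variant under the assumption that each book is already represented as a numeric vector; how the project builds those vectors is not specified here, and `cosine_kmeans` is a hypothetical name.

```python
import math
import random


def normalize(vector):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def dot(a, b):
    """Dot product; equals cosine similarity when both vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))


def cosine_kmeans(vectors, k, iterations=20, seed=0):
    """Cluster vectors by cosine similarity; returns (labels, centroids)."""
    points = [normalize(v) for v in vectors]
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: pick the most similar centroid for every point.
        labels = [max(range(k), key=lambda c: dot(p, centroids[c]))
                  for p in points]
        # Update step: new centroid is the normalized mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[c] = normalize(mean)
    return labels, centroids
```

Once every book has a cluster label, the labels can be joined with the user data (for example age or location of the users who rated each book) to look for demographic patterns per cluster.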
- BX-Books.csv
- BX-Book-Ratings.csv
- BX-Users.csv
They can be found and downloaded here.