Expedia Hotel Recommendation

By GrandMingLakeのSummerRainLotus

Daodao Wang
Quantitative Research Associate
fatenaught@gmail.com
M.S. Analytics - Data Science
Georgetown University

Weiye Deng
Business Intelligence Engineer
dwy904@gmail.com
M.S. Analytics - Data Science
Georgetown University

Introduction

The goal of this project is to build a multi-class classification model that recommends hotels by predicting which predefined hotel cluster a customer will book. The predictions are based on the customer behavior logs provided by Expedia (searches and other attributes associated with user events). More details can be found on the Kaggle Expedia Hotel Recommendation page.


Environment Setup

Install Spark and sparklyr (open RStudio and run the following code in the RStudio Console):

install.packages("sparklyr")
library(sparklyr)
spark_install(version = "1.6.2")
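
Once the installation finishes, a quick connection test can confirm that the local Spark build works. The following is a minimal sketch, assuming a local single-machine Spark session; the `master = "local"` setting and the variable name `sc` are illustrative assumptions and are not taken from the project code.

```r
library(sparklyr)

# List the Spark versions installed through spark_install()
spark_installed_versions()

# Assumption: Spark runs locally on this machine
sc <- spark_connect(master = "local", version = "1.6.2")

# Report the version of the connected Spark instance
spark_version(sc)

# Close the connection when finished
spark_disconnect(sc)
```

If the connection fails, checking that a compatible Java runtime is available is a reasonable first step, since Spark runs on the JVM.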

Install the other required packages (open RStudio and run the following code in the RStudio Console):

# sparklyr is already installed in the previous step, and utils ships with
# base R, so neither needs to be installed here.
install.packages(c('ggplot2', 'readr', 'lubridate', 'chron', 'glmnet',
                   'dplyr', 'reshape2', 'caret', 'h2o'))

Data Cleaning and Feature Engineering

Data Partitioning

Model Initialization

Parameter Tuning and Cross Validation

Model Combination

Model Finalization