Housing Market Prediction

We use economic data to predict housing market trends, and crashes in particular. We also run sentiment analysis on live tweets to track how people feel about the housing market and what the general conversation around it is. We start with the sentiment analysis and then dive into the economic data.

Sentiment Analysis on Housing Market

The sentiment analysis depends largely on Twitter. We use VADER for the sentiment scoring, and we also build a custom list of stopwords. Those custom words are added to the stopword list imported from the NLTK library and later used to create a word cloud.

Set Up API

Set up Tweepy with the required tokens and access keys. Using the API, we created a function that pulls Tweets from Twitter and runs a sentiment analysis on them.

Keyword search

Created a function that accepts any keyword (or hashtag) and the number of Tweets to retrieve, then searches for Tweets matching that keyword. The output is a list of raw Tweets containing the keyword.
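A minimal sketch of such a keyword search, assuming Tweepy 4.x and the v1.1 standard search endpoint (availability depends on the Twitter/X API access tier). The names `build_api` and `search_tweets_by_keyword` and the credential parameter names are illustrative, not taken from the notebook.

```python
def build_api(consumer_key, consumer_secret, access_token, access_token_secret):
    """Return an authenticated tweepy.API client (assumes Tweepy 4.x)."""
    import tweepy  # imported lazily so the search helper works with any client object

    auth = tweepy.OAuth1UserHandler(
        consumer_key, consumer_secret, access_token, access_token_secret
    )
    return tweepy.API(auth, wait_on_rate_limit=True)


def search_tweets_by_keyword(api, keyword, count=100):
    """Return the raw text of up to `count` recent tweets matching `keyword`.

    A single request to the standard search endpoint returns at most 100 tweets.
    """
    statuses = api.search_tweets(
        q=keyword, count=count, lang="en", tweet_mode="extended"
    )
    tweet_list = []
    for status in statuses:
        # Extended mode exposes `full_text`; fall back to `text` otherwise.
        text = getattr(status, "full_text", None) or getattr(status, "text", "")
        tweet_list.append(text)
    return tweet_list
```

Passing the client in as an argument keeps the credentials out of the search function and makes it easy to swap in a stub while developing.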

Sentiment Analysis

Dataframes were created containing the Tweets with "positive", "negative", and "neutral" sentiment. Created a function that returns the count of Tweets in each dataframe.
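One way to bucket and count the Tweets is sketched below. It assumes the labels come from VADER's compound score with the commonly used cutoffs of +0.05 and -0.05; the notebook's actual thresholds are not stated here. The scorer is passed in as a callable so the example does not depend on the lexicon being installed, and `label_sentiment` and `count_sentiments` are hypothetical names.

```python
from collections import Counter


def label_sentiment(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def count_sentiments(tweets, polarity_scores):
    """Count tweets per label.

    `polarity_scores` is any callable returning a dict with a "compound" key,
    e.g. nltk.sentiment.vader.SentimentIntensityAnalyzer().polarity_scores
    (which requires nltk.download("vader_lexicon") first).
    """
    labels = [label_sentiment(polarity_scores(t)["compound"]) for t in tweets]
    counts = Counter(labels)
    # Always report all three classes, even when one of them is empty.
    return {k: counts.get(k, 0) for k in ("positive", "negative", "neutral")}
```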

Raw Tweets

The variable "tweet_list" contains a list of the most recent tweets matching the parameters passed to the keyword search.

Stopwords

Stopwords were imported from nltk.corpus. We also created a for loop that iterates through each tweet to find frequently mentioned words. These could be adverbs, hashtags, or verbs that don't add much meaning to the project, for example: "a, #housingmarket, realestate, isn't". The goal in finding frequently mentioned words was to create a custom list of stopwords, so we could surface the more "valuable" words that appear when a specific keyword is mentioned.
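The frequency pass described above could look roughly like this. `BASE_STOPWORDS` is a tiny stand-in for `nltk.corpus.stopwords.words("english")`, and the `keep` argument stands in for the manual review of which frequent words are worth keeping; both, along with the function names, are assumptions for illustration.

```python
from collections import Counter

# Stand-in for nltk.corpus.stopwords.words("english"); kept tiny on purpose.
BASE_STOPWORDS = {"a", "the", "is", "in", "to", "of", "and", "isn't"}


def frequent_words(tweets, top_n=10):
    """Return the `top_n` most common lowercase tokens across all tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tweet.lower().split())
    return [word for word, _ in counts.most_common(top_n)]


def build_stopwords(tweets, base=BASE_STOPWORDS, top_n=10, keep=()):
    """Extend `base` with frequent tokens, except those listed in `keep`."""
    custom = {w for w in frequent_words(tweets, top_n) if w not in keep}
    return set(base) | custom
```

For example, `build_stopwords(tweets, top_n=2, keep=("crash",))` adds the two most frequent tokens to the base list unless one of them is "crash".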

Processing Tweets

Created a function that cleans tweets and removes stopwords
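A minimal sketch of a cleaning function, assuming that "cleaning" means lowercasing and stripping URLs, @mentions, and punctuation before dropping stopwords. The exact rules in the notebook may differ, and `clean_tweet` is a hypothetical name.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
# Keep letters, hashtags, apostrophes and whitespace; drop everything else.
NON_WORD_RE = re.compile(r"[^a-z#'\s]")


def clean_tweet(tweet, stopwords):
    """Lowercase a tweet, strip URLs, mentions and punctuation, drop stopwords."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_WORD_RE.sub(" ", text)
    tokens = [tok for tok in text.split() if tok not in stopwords]
    return " ".join(tokens)
```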

Wordcloud

After each tweet has been processed and cleaned for stopwords, a wordcloud is generated containing words that showed up often. The goal of the wordcloud is to see what people say when a specific keyword is searched.

Wordcloud #2

This wordcloud was generated a week later to see if there were common words that showed up.

Economic Data on Housing Market

Most of the data used is public data from Freddie Mac. The data contains fixed and adjustable mortgage rates for houses starting from 1971. The data also contains 15-year and 30-year interest rates, as well as the margin of profit that banks make on those loans. We then run various regression models to create predictions and understand trends.

Median Home Price

Number of Homes Sold (in Millions)

Number of New Homes Sold

Mortgage Applications Submitted

Interest Rates

The interest rate data was later merged with another dataframe containing the number of houses purchased in each region of the US, starting from the 1970s. The data was merged to make the dataframe easier to view and to build a linear regression model.
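A sketch of how such a merge might be done with pandas, assuming the rates are weekly and the purchase counts are annual, so the rates are averaged per year before joining. All column names and values below are made up for illustration and are not the project's data.

```python
import pandas as pd

# Toy weekly rate data; column names and values are placeholders.
rates = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2019-01-03", "2019-07-04", "2020-01-02", "2020-07-02"]
        ),
        "rate_30yr": [4.0, 3.0, 3.0, 2.0],
        "margin": [2.0, 2.0, 3.0, 3.0],
    }
)

# Toy annual purchase counts per region.
houses = pd.DataFrame(
    {"year": [2019, 2020], "northeast": [10, 20], "south": [30, 40]}
)

# Average the weekly rates per calendar year so both frames share a key.
annual_rates = (
    rates.assign(year=rates["date"].dt.year)
    .groupby("year", as_index=False)[["rate_30yr", "margin"]]
    .mean()
)

# Inner join keeps only the years present in both sources.
merged = annual_rates.merge(houses, on="year", how="inner")
```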

Random Forest Regressor

Using the previously mentioned dataframe, we ran a random forest regression to create a predictive model. A random forest is a meta estimator that fits a number of decision tree regressors on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. The sub-sample size is controlled with the max_samples parameter if bootstrap=True (default), otherwise the whole dataset is used to build each tree. X contained interest rates and margin, and y contained houses bought in the US. According to the result, purchases were more strongly related to margin than to housing interest rates.
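The comparison between margin and interest rates can be read from a fitted forest's feature importances. The sketch below uses scikit-learn's `RandomForestRegressor` on synthetic data in which the target depends mostly on margin by construction; it assumes the project's conclusion was drawn from `feature_importances_`, which the text does not confirm, and the variable names are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in data: the target depends mostly on `margin` by construction.
n = 300
interest_rate = rng.uniform(3.0, 8.0, n)
margin = rng.uniform(2.0, 3.0, n)
houses_bought = 500 * margin - 5 * interest_rate + rng.normal(0, 5, n)

X = np.column_stack([interest_rate, margin])
y = houses_bought

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y)

# Impurity-based importances sum to 1 across the features.
importances = dict(zip(["interest_rate", "margin"], model.feature_importances_))
```

Impurity-based importances describe how much each feature reduced error inside the trees; they indicate association in the training data, not causation.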

Linear Regression

We also ran a linear regression and concluded that the model explains about 48% of the variation in houses bought.
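If the 48% figure is the regression's coefficient of determination (R²), which is an assumption since the metric is not named here, it measures the share of variance in the target that the fitted line accounts for. A minimal single-feature sketch in plain Python, with hypothetical helper names:

```python
def fit_line(x, y):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y):
    """Share of the variance in y explained by the fitted line."""
    slope, intercept = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot
```

An R² of 0.48 would mean roughly half of the variation in houses bought is left unexplained by the model's inputs.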

Deep Learning Model

To understand the links between multiple economic indicators and their influence on mortgage rates, we used 8 datasets to create this model: inflation (CPI), changes in mortgage-backed securities prices, average wages, the Fed Funds rate, number of houses sold, unemployment rates, and average adjustable-rate and fixed-rate mortgage rates.

Inflation

Mortgage Backed Securities

Fed Funds Rate

All Dataframes Combined

Relationship between Fixed and Adjustable Rate Mortgages

Price to Interest Rate Relationship

Results

Conclusion

Although sentiment may say the US housing market is on the verge of a crash, the data says otherwise. With the Fed keeping interest rates extremely low, there is no reason to predict that prices will go down. Despite other economic indicators, including rising GDP, rising inflation, low unemployment, more government spending, and increasing wages, the Federal Reserve is intent on keeping interest rates low to keep both stock and housing markets on the rise.

jrrameau2000/Housing_Market_Prediction

Folders and files

Latest commit

History

Repository files navigation

Housing Market Prediction

Sentiment Analysis on Housing Market

Set Up API

Keyword search

Sentiment Analysis

Raw Tweets

Stopwords

Processing Tweets

Wordcloud

Wordcloud #2

Economic Data on Housing Market

Median Home Price

Number of Homes Sold (in Millions)

Number of New Homes Sold

Mortgage Applications Submitted

Interest Rates

Random Forest Regressor

Linear Regression

Deep Learning Model

Inflation

Mortgage Backed Securities

Fed Funds Rate

All Dataframes Combined

Relationship between Fixed and Adjustable Rate Mortgages

Price to Interest Rate Relationship

Results

Conclusion

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages