Income Prediction

Dataset: https://archive.ics.uci.edu/ml/datasets/Adult

Task 1: binary classification

  • prediction_baselines.py --- classification models using sklearn
  • bert_clf.py --- an attempt at the same classification task with a BERT model. TODO: find a larger dataset to address the under-fitting seen with BERT; add preprocessing; try more models
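
As a rough illustration of what an sklearn baseline for Task 1 might look like, here is a minimal sketch. It is not the contents of prediction_baselines.py: the choice of logistic regression, the helper name `build_baseline`, the column names, and the tiny made-up frame are all assumptions for illustration only.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_baseline(numeric_cols, categorical_cols):
    """Hypothetical baseline: scale numeric columns, one-hot encode categorical ones."""
    preprocess = ColumnTransformer([
        ("num", StandardScaler(), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ])
    return Pipeline([
        ("preprocess", preprocess),
        ("clf", LogisticRegression(max_iter=1000)),
    ])


# Made-up rows that only demonstrate the call pattern (not real Adult records).
toy = pd.DataFrame({
    "age": [25, 38, 52, 46, 29, 61, 33, 44],
    "hours_per_week": [40, 50, 60, 45, 35, 55, 40, 50],
    "education": ["HS-grad", "Bachelors", "Masters", "Bachelors",
                  "HS-grad", "Masters", "HS-grad", "Bachelors"],
    "income_gt_50k": [0, 1, 1, 1, 0, 1, 0, 1],  # 1 means income >50K
})

features = toy.drop(columns="income_gt_50k")
model = build_baseline(["age", "hours_per_week"], ["education"])
model.fit(features, toy["income_gt_50k"])
predictions = model.predict(features)
```

On the real dataset the same pipeline would be fitted on a training split and scored on a held-out split; the sketch fits and predicts on the same rows only to stay short.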

Task 2: clustering

  • clustering.py --- finds clusters among individuals with income >50K. TODO: compute the center of each cluster for better interpretation
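
One possible way to approach the cluster-center TODO is sketched below. It assumes k-means on standardized numeric features, which may differ from what clustering.py actually does; the helper name `cluster_high_earners`, the column names, and the made-up rows are hypothetical.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def cluster_high_earners(df, feature_cols, n_clusters=2, seed=0):
    """Cluster rows on scaled features; return labels and centers in original units."""
    scaler = StandardScaler()
    scaled = scaler.fit_transform(df[feature_cols])
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(scaled)
    # Map centers back from standardized space so they read as ages, hours, etc.
    centers = pd.DataFrame(
        scaler.inverse_transform(kmeans.cluster_centers_),
        columns=feature_cols,
    )
    return kmeans.labels_, centers


# Made-up rows standing in for the >50K subset (not real Adult records).
high_earners = pd.DataFrame({
    "age": [30, 32, 31, 60, 62, 61],
    "hours_per_week": [40, 42, 41, 60, 58, 59],
})
labels, centers = cluster_high_earners(high_earners, ["age", "hours_per_week"])
print(centers)
```

Reporting centers in the original units rather than in standardized space makes each cluster easier to describe, for example as a typical age and weekly working hours.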

Task 3: Multivariable analysis (TODO)