olivettigroup/Cementitious-reactivity-prediction-and-discovery

This repository contains the code and data associated with the research paper:

"Material discovery of secondary and natural cementitious precursors"

Authors: Soroush Mahjoubi, Vineeth Venugopal, Ipek Bensu Manav, Hessam Azarijafari, Randolph Kirchain, Elsa Olivetti

Cement production is a significant contributor to global greenhouse gas (GHG) emissions, primarily due to the production of clinker. This project aims to reduce these emissions by expanding the repertoire of secondary and natural cementitious precursors through data-driven, machine learning methods.

We developed models to predict the reactivity of cementitious materials and systematically identified potential substitutes for clinker, including secondary materials like construction and demolition wastes, and natural precursors such as rhyolite. Our approach combines large language models (LLMs) and multi-task neural networks to predict reactivity metrics, enabling the discovery of new materials that can reduce the GHG intensity of cement production.
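As a rough illustration of the multi-task idea, the sketch below shows a shared hidden layer feeding one output head per reactivity metric. It is a minimal, untrained forward pass and not the architecture used in the paper: the class name `MultiHeadRegressor`, the layer sizes, the four-component input vector, and the head names `metric_a` and `metric_b` are all hypothetical placeholders.

```python
import math
import random


def dense(inputs, weights, biases):
    """Affine layer: one output value per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


def init_layer(n_out, n_in, rng):
    """Random weights scaled by 1/sqrt(n_in), zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiHeadRegressor:
    """Shared trunk plus one scalar head per target (hypothetical layout)."""

    def __init__(self, n_features, n_hidden, head_names, seed=0):
        rng = random.Random(seed)
        self.trunk = init_layer(n_hidden, n_features, rng)
        self.heads = {name: init_layer(1, n_hidden, rng)
                      for name in head_names}

    def forward(self, features):
        # The hidden representation is computed once and reused by every head.
        hidden = relu(dense(features, *self.trunk))
        return {name: dense(hidden, *head)[0]
                for name, head in self.heads.items()}


# Assumed input: four normalized composition fractions (illustrative values).
model = MultiHeadRegressor(n_features=4, n_hidden=8,
                           head_names=["metric_a", "metric_b"])
predictions = model.forward([0.55, 0.20, 0.15, 0.10])
print(predictions)
```

Sharing the trunk lets the heads learn from a common representation of the input, which is the usual motivation for multi-task models when the targets are related. Training, loss weighting, and the actual input features are left to the notebooks.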

Repository Contents

The repository is organized into Jupyter notebooks that sequentially perform the tasks described in the paper:

1_keyword_filtering_and_preprocessing.ipynb: Filters relevant literature using keywords and preprocesses the data for analysis.

2_Vector_database_for_tables.ipynb: Creates a vector database for tables extracted from the literature using embeddings.

3_Evaluation_of_table_embeddings.ipynb: Evaluates different embedding models for their effectiveness in retrieving relevant tables.

4_Table_extraction_with_fine-tuned_GPT_3.5.ipynb: Extracts chemical compositions from tables using a fine-tuned GPT-3.5 model.

4_Table_extraction_with_fine-tuned_Mistral_7b.ipynb: Alternative extraction of chemical compositions using a fine-tuned Mistral 7b model.

5_Evaluation_of_table_extraction.ipynb: Assesses the performance of the table extraction methods.

6_Vector_database_for_sentences.ipynb: Builds a vector database for sentences to aid in material classification.

7_Extraction_of_material_descriptors_with_Mistral_7b.ipynb: Extracts material descriptors using the Mistral 7b model.

8_Categorize_materials_with_fine-tuned_GPT_3.5.ipynb: Classifies materials into predefined categories using a fine-tuned GPT-3.5 model.

9_Train_the_multi-headed_NN_for_reactivity_prediction.ipynb: Trains a multi-headed neural network to predict reactivity metrics.

10_Reactivity_prediction_of_extracted_materials.ipynb: Predicts the reactivity of materials extracted from the literature.

11_Reactivity_prediction_of_natural_precursors.ipynb: Applies the trained model to predict the reactivity of natural rock samples.
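
The vector-database and embedding-evaluation notebooks listed above rest on a common retrieval pattern: embed each item, rank items by similarity to a query embedding, and score how many relevant items appear in the top results. The sketch below illustrates that pattern only. It assumes cosine similarity and recall at k as the ranking and evaluation measures, which the notebooks may or may not use, and the hand-written three-dimensional vectors and `table_*` identifiers are hypothetical stand-ins for real embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k):
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(index,
                    key=lambda item_id: cosine(query_vec, index[item_id]),
                    reverse=True)
    return ranked[:k]


def recall_at_k(retrieved, relevant):
    """Fraction of the relevant ids that appear among the retrieved ids."""
    if not relevant:
        return 0.0
    return len(set(retrieved) & set(relevant)) / len(relevant)


# Toy "vector database": id -> embedding (values invented for illustration).
index = {
    "table_1": [0.9, 0.1, 0.0],
    "table_2": [0.1, 0.9, 0.0],
    "table_3": [0.8, 0.2, 0.1],
    "table_4": [0.0, 0.1, 0.9],
}

query = [1.0, 0.0, 0.0]
hits = top_k(query, index, k=2)
score = recall_at_k(hits, relevant={"table_1", "table_3"})
print(hits, score)
```

Comparing embedding models then amounts to rebuilding `index` with each model and comparing the resulting scores over a labeled set of queries. A real vector database would replace the linear scan in `top_k` with an approximate nearest-neighbor index.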
