Feature representations for new Proteins in DiG

Hi,

This is regarding protein generation in DiG.

I would like to know how you obtained the features stored in the protein pickle files. According to Appendix B.1 of the paper, the single and pair representations are simply the outputs of a pre-trained Evoformer model from AlphaFold, given the corresponding protein's FASTA sequence and MSAs.

I set up OpenFold on our systems and saved the Evoformer representations for the corresponding protein to a pickle file. I took them from the `single` and `pair` keys of the `output` dictionary [at this point in OpenFold's model.py](https://github.com/aqlaboratory/openfold/blob/80c85b54e1a81d9a66df3f1b6c257ff97f10acd3/openfold/model/model.py#L583). To get the MSAs for the FASTA sequence, I queried the ColabFold server.

Unfortunately, the representations I obtained from OpenFold's Evoformer were quite different from the representations in the dataset's pickle files.
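For reference, here is a minimal sketch of how such a comparison can be quantified. It assumes both pickle files hold a dictionary with `single` and `pair` entries that convert to NumPy arrays; the actual layout of the DiG pickle files may differ, and the helper names and file paths are hypothetical.

```python
import pickle

import numpy as np


def summarize_difference(a, b):
    """Return simple diagnostics comparing two representation arrays."""
    # Assumption: inputs are array-like. Torch tensors should be moved to
    # CPU and converted with .numpy() before being passed in.
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    report = {"shape_a": a.shape, "shape_b": b.shape}
    if a.shape != b.shape:
        # A shape mismatch already indicates a different pipeline or cropping.
        report["same_shape"] = False
        return report
    report["same_shape"] = True
    diff = np.abs(a - b)
    report["max_abs_diff"] = float(diff.max())
    report["mean_abs_diff"] = float(diff.mean())
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    report["cosine_similarity"] = (
        float(np.dot(a.ravel(), b.ravel()) / denom) if denom > 0 else float("nan")
    )
    return report


def compare_pickles(path_a, path_b, keys=("single", "pair")):
    """Load two pickle files and compare the listed keys (assumed layout)."""
    with open(path_a, "rb") as f:
        feats_a = pickle.load(f)
    with open(path_b, "rb") as f:
        feats_b = pickle.load(f)
    return {k: summarize_difference(feats_a[k], feats_b[k]) for k in keys}


# Hypothetical usage:
# report = compare_pickles("openfold_features.pkl", "dig_features.pkl")
# for key, stats in report.items():
#     print(key, stats)
```

Reporting the shapes together with the maximum absolute difference and the cosine similarity helps separate a small numerical discrepancy from a genuinely different feature pipeline, for example different model weights or different MSAs.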

Could you please share the exact procedure you used to obtain the single and pair representations for a given protein FASTA sequence?
