Description
Hi,
This is regarding protein generation in DiG.
I would like to know how you obtained the features stored in the protein pickle files. According to Appendix B.1 of the paper, the single and pair representations are simply the outputs of a pre-trained Evoformer model from AlphaFold, given the corresponding protein's FASTA sequence and MSAs.
I set up OpenFold on our systems and saved the Evoformer representations for the corresponding protein in a pickle file. I used the `single` and `pair` keys of the output dictionary in this link. To get the MSAs for the FASTA sequence, I queried the ColabFold server.
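For context, the export step was roughly of the following shape. This is only a minimal sketch, not the exact script: `save_representations` is a hypothetical helper name, and it assumes the model output is a plain dictionary whose `single` and `pair` entries are either NumPy arrays or PyTorch tensors.

```python
import pickle

import numpy as np


def save_representations(output, path, keys=("single", "pair")):
    """Pickle selected entries of a model-output dict as NumPy arrays.

    Hypothetical helper: assumes `output` maps each key to a NumPy
    array or a PyTorch tensor.
    """
    reps = {}
    for key in keys:
        value = output[key]
        # PyTorch tensors expose detach(); move them to CPU first.
        if hasattr(value, "detach"):
            value = value.detach().cpu().numpy()
        reps[key] = np.asarray(value)
    with open(path, "wb") as f:
        pickle.dump(reps, f)
    return reps
```

The resulting file contains a dictionary with only the two requested keys, which makes it easy to load next to the dataset's pickle file.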
Unfortunately, the representations I obtained from OpenFold's Evoformer were quite different from the representations in the dataset's pickle file.
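To make "quite different" concrete, a comparison along the following lines can be used. It is a minimal sketch: `compare_representations` is a hypothetical helper, and it assumes both representations have already been loaded as arrays (the key names inside the dataset's pickle file are not assumed here).

```python
import numpy as np


def compare_representations(a, b):
    """Report shape agreement and simple difference metrics for two arrays."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    report = {
        "shape_a": a.shape,
        "shape_b": b.shape,
        "same_shape": a.shape == b.shape,
    }
    # Element-wise metrics are only meaningful when the shapes agree.
    if report["same_shape"] and a.size > 0:
        diff = np.abs(a - b)
        report["max_abs_diff"] = float(diff.max())
        report["mean_abs_diff"] = float(diff.mean())
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        # Cosine similarity over the flattened arrays.
        report["cosine"] = float(np.vdot(a, b) / denom) if denom > 0 else float("nan")
    return report
```

A shape mismatch would point to a different model configuration or sequence length, while matching shapes with a low cosine similarity would point to different weights, MSAs, or recycling settings.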
Could you please share the exact method you used to obtain the single and pair representations for a given protein FASTA sequence?