In‐house Protein Database Creation and Configuration
If you have a Macromolecule Hub licence, you can also create and register in-house protein databases for use within On-Site WebCSD.
To create an in-house protein database, you will need the Python Utilities. These can be downloaded from the CCDC Downloads page under CSD Python API > Python CSD Python API Utilities, or by filling in the form to request download links. If you do not see the CSD Python API listed in the available downloads, you may need to sign in. The relevant script and its comprehensive documentation can be found under ccdc > utilities > create_protein_database. You can run the script using CSD Python API version 3.3.0 or later.
Once you have created your protein database, you can register it by adding it to your docker-compose.db-config.yml file, following the instructions from here. You must then mark the database as a protein database by adding a Speciality entry set to Protein. Your database config should look something like this:
volumes:
- /path/to/ExampleProteinDb.csdsqlx:/csd-data/ExampleProteinDb.csdsqlx
environment:
- ServiceSettings__Databases__2__Name=Example Protein DB
- ServiceSettings__Databases__2__ConnectionString=/csd-data/ExampleProteinDb.csdsqlx
- ServiceSettings__Databases__2__Speciality__0=Protein
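The entries in the example share a naming pattern: a database index (2 in the example) appears in every environment key, and the same container path is used in both the volume mapping and the connection string. The sketch below generates these lines for a given index, name and file. It is a minimal illustration only: `protein_db_config` and `render` are hypothetical helper names, and it assumes a POSIX-style host path, that the `/csd-data` mount directory from the example is used, and that the index should be the next free one in your existing configuration.

```python
from pathlib import PurePosixPath


def protein_db_config(index, name, host_path, container_dir="/csd-data"):
    """Build the volume and environment entries for one protein database.

    Hypothetical helper: it only reproduces the pattern of the example
    config and does not contact Docker or On-Site WebCSD.
    """
    if index < 0:
        raise ValueError("database index must be non-negative")
    # The file keeps its name inside the container (assumption taken
    # from the example, where both sides use ExampleProteinDb.csdsqlx).
    filename = PurePosixPath(host_path).name
    container_path = f"{container_dir}/{filename}"
    prefix = f"ServiceSettings__Databases__{index}__"
    return {
        "volumes": [f"{host_path}:{container_path}"],
        "environment": [
            f"{prefix}Name={name}",
            f"{prefix}ConnectionString={container_path}",
            # This entry marks the database as a protein database.
            f"{prefix}Speciality__0=Protein",
        ],
    }


def render(config):
    """Format the entries in the same layout as the example config."""
    lines = []
    for key in ("volumes", "environment"):
        lines.append(f"{key}:")
        lines.extend(f"- {entry}" for entry in config[key])
    return "\n".join(lines)


if __name__ == "__main__":
    example = protein_db_config(
        2, "Example Protein DB", "/path/to/ExampleProteinDb.csdsqlx"
    )
    print(render(example))
```

Run as a script, this prints the same six lines as the example config. Generating the lines from a single index and path avoids a mismatch between the mounted file and the connection string.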
The PDB database built into Macromolecule Hub is a processed variant of the PDB. We process each structure by extracting the binding sites within 8 Å of its small molecules and protonating these regions to create an optimal hydrogen-bond network. You may occasionally find structures without 3D coordinates. This means that processing failed for that structure (for example, protonation failed or no small molecules were identified), but we were still able to retrieve and keep its other information. Some structures that are deposited in the PDB cannot be found in our processed database. This usually means that we could not read the file, probably because of an unknown space group or unknown atoms. The excluded structures are:
1AIO, 1ARF, 1BC3, 1BUB, 1BZ6, 1CXR, 1D18, 1D19, 1D20, 1EJM, 1FH0, 1FZT, 1G65, 1GGR, 1HYA, 1J4V, 1JNR, 1KBR, 1KD1, 1KDO,
1KDR, 1KDS, 1KDT, 1KDW, 1M2X, 1MH2, 1NH5, 1O2D, 1P09, 1SAP, 1SY3, 1VTO, 1WDW, 1XOC, 1Z44, 2A3A, 2D47, 2F69, 2FNT, 2FXU,
2GHH, 2HG3, 2HG9, 2HH1, 2HIT, 2JE4, 2MYI, 2NO6, 2OCS, 2OVV, 2OYF, 2V5V, 2YQB, 3BEC, 3CI2, 3CND, 3J3Q, 3J9K, 3J9L, 3JCU,
3KD6, 3KD7, 3KDJ, 3KDR, 3KDT, 3LWX, 3NJO, 3PJS, 3QPM, 3QTE, 3SHR, 3VR2, 3VR3, 3VR4, 3VR5, 3VR6, 3VSF, 3VSZ, 3VT0, 3VT2,
3VYP, 3W82, 3WFS, 3ZGO, 3ZJO, 488D, 4A4A, 4A6S, 4A7T, 4AL9, 4AMF, 4AYO, 4AYR, 4B0H, 4B4I, 4BRD, 4BRE, 4BUE, 4C7D, 4C7F,
4CDV, 4CPB, 4CW0, 4CW2, 4D06, 4D13, 4FFH, 4GLO, 4H8R, 4HCQ, 4IXQ, 4IXR, 4KC7, 4KC8, 4KDN, 4L6V, 4LRN, 4M00, 4MDQ, 4MJO,
4N9D, 4NEN, 4NES, 4NET, 4OC4, 4Q7R, 4Q7T, 4Q7U, 4ROM, 4U3M, 4U3N, 4U3U, 4U4N, 4U4O, 4U4Q, 4U4R, 4U4U, 4U4Y, 4U4Z, 4U50,
4U51, 4U52, 4U53, 4U55, 4U56, 4U58, 4U5S, 4U60, 4U6F, 4UB6, 4UB8, 4UHG, 4URH, 4V27, 4V7R, 4V88, 4V8E, 4V8F, 4V8H, 4V9Q,
4XK8, 4XP3, 4YD9, 4YUU, 4ZBQ, 4ZIO, 5A9Z, 5AA0, 5ACH, 5ACI, 5ACJ, 5B5E, 5B5N, 5B66, 5DAT, 5DGE, 5DHQ, 5DHT, 5DHU, 5EDT,
5FCI, 5FCJ, 5G3L, 5H2F, 5HH4, 5I4L, 5I5B, 5JCJ, 5KAF, 5LYB, 5MDX, 5MEI, 5MOT, 5MX2, 5N6V, 5NJ4, 5O4C, 5O5R, 5O5U, 5O64,
5OF0, 5ON6, 5OY0, 5PK6, 5PK7, 5PK8, 5PK9, 5PKA, 5QDM, 5QEZ, 5QFJ, 5QIB, 5QIC, 5QID, 5QIE, 5QIF, 5QIG, 5QIH, 5QTP, 5QXI,
5QXQ, 5QY3, 5QY5, 5QYA, 5QYD, 5QYE, 5QYJ, 5RBP, 5RBQ, 5RBS, 5RBT, 5RBX, 5RBZ, 5RC2, 5RC3, 5RC5, 5RC6, 5RC7, 5RC8, 5RC9,
5RCB, 5RCE, 5RCG, 5RF4, 5RFC, 5RFE, 5RFI, 5RFJ, 5RFS, 5RFX, 5RFZ, 5RG0, 5RGQ, 5TBW, 5TGA, 5TGM, 5V2C, 5WS5, 5XNL, 5XNM,
5XTI, 5Y6P, 5ZF0, 5ZZN, 6CDN, 6DHG, 6DHO, 6DYK, 6ETY, 6EZ9, 6FE5, 6GNQ, 6H7Y, 6HHQ, 6HIF, 6HKJ, 6HKZ, 6HMU, 6IJJ, 6IJO,
6J3Y, 6J3Z, 6J40, 6JEO, 6JLK, 6JLL, 6JLM, 6JLO, 6JLP, 6JLU, 6JO5, 6JO6, 6K33, 6K61, 6KAC, 6KAD, 6KAF, 6KC4, 6KGX, 6KIF,
6KIG, 6KMW, 6KMX, 6L4U, 6LY5, 6M8P, 6NWA, 6O2T, 6PFY, 6PGK, 6PNJ, 6RBC, 6RKD, 6RTI, 6S1X, 6SGP, 6TBV, 6TC3, 6TCL, 6TG0,
6THF, 6TRC, 6TRD, 6U42, 6UBO, 6UBQ, 6UBR, 6UBS, 6UBU, 6UDX, 6UDY, 6UZV, 6VPV, 6W1O, 6W1P, 6Y10, 6Y1C, 6YNZ, 6YP7, 6ZBQ,
6ZBZ, 6ZC3, 6ZC4, 6ZC6, 6ZC7, 6ZC8, 6ZZX, 6ZZY, 7AZO, 7AZS, 7B1E, 7BA5, 7BGI, 7CJJ, 7D0J, 7DR2, 7DZ7, 7DZ8, 7EAP, 7EBT,
7EXT, 7EYD, 7EZX, 7F4V, 7F9O, 7FIX, 7H1L, 7H1N, 7H89, 7KC1, 7KCB, 7KCF, 7KCT, 7LX0, 7M75, 7M7M, 7M7N, 7N61, 7N6G, 7NER,
7NPA, 7NTV, 7NUK, 7O0U, 7O0V, 7O0W, 7O0X, 7OUI, 7PI0, 7PI5, 7PIN, 7PIW, 7Q4X, 7Q7P, 7Q7Q, 7QCO, 7QIL, 7RF1, 7RF3, 7RF4,
7RF6, 7RF7, 7RRO, 7RTH, 7S3D, 7SC7, 7SC9, 7SOM, 7SQC, 7UEA, 7UEB, 7UMH, 7UNG, 7VD5, 7VEA, 7W5Z, 7WG5, 7XQP, 7Y4L, 7Y5E,
7Y7A, 7Y7B, 7YCA, 7YE8, 7YMI, 7YMM, 7YQ2, 7YQ7, 7ZQ9, 7ZQC, 7ZQD, 8AHU, 8ANF, 8AOY, 8B6H, 8B7G, 8BCT, 8BD3, 8BMU, 8BQS,
8C29, 8CKB, 8DXK, 8F4E, 8FB0, 8G2Z, 8G3D, 8GLV, 8GN0, 8GN1, 8GN2, 8GVM, 8GYM, 8H7P, 8HTU, 8I7O, 8I7R, 8IR6, 8IR7, 8IR8,
8IR9, 8IRA, 8IRB, 8IRC, 8IRD, 8IRE, 8IRF, 8IRG, 8IRH, 8IRI, 8IUF, 8IWH, 8IYJ, 8J07, 8J5K, 8JJR, 8JW0, 8JZE, 8KCM, 8OTN,
8OTZ, 8PW7, 8Q0A, 8RNX, 8SF7, 8SNB, 8T7V, 8U8V, 8UEO, 8UEP, 8UGH, 8UGI, 8UGJ, 8UGN, 8UGR, 8WB4, 8WM6, 8WMJ, 8WMV, 8WMW,
8WQL, 8XLP, 8XR6
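
To check whether a PDB entry of interest is one of the excluded structures, the list above can be pasted into a small script. The following is a minimal sketch, not part of Macromolecule Hub: `parse_ids` and `is_excluded` are hypothetical names, and `EXCLUDED_TEXT` holds only the first ten identifiers for brevity, so it should be replaced with the full list before use.

```python
import re

# Illustrative subset only: replace with the full comma-separated list.
EXCLUDED_TEXT = """
1AIO, 1ARF, 1BC3, 1BUB, 1BZ6, 1CXR, 1D18, 1D19, 1D20, 1EJM,
"""

# Classic PDB identifiers: one digit followed by three alphanumerics.
PDB_ID = re.compile(r"^[0-9][A-Z0-9]{3}$")


def parse_ids(text):
    """Return the set of PDB IDs found in comma-separated text."""
    ids = set()
    for token in text.replace("\n", " ").split(","):
        token = token.strip().upper()
        if not token:
            continue  # skip the empty piece after a trailing comma
        if not PDB_ID.match(token):
            raise ValueError(f"not a four-character PDB ID: {token!r}")
        ids.add(token)
    return ids


EXCLUDED = parse_ids(EXCLUDED_TEXT)


def is_excluded(pdb_id):
    """Case-insensitive membership test against the excluded set."""
    return pdb_id.strip().upper() in EXCLUDED


if __name__ == "__main__":
    for query in ("1aio", "1EJM", "9ZZZ"):
        print(query, is_excluded(query))
```

Validating each token while parsing guards against copy-and-paste damage, such as two identifiers merged by a missing comma.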