
Active binding site detection by flood docking a library of molecular fragments

Tamas Lengyel, A. Peter Johnson
School of Chemistry
University of Leeds
Leeds, LS2 9JT
United Kingdom
Poster presented at:
3rd Joint Sheffield Conference on Chemoinformatics
April 2004, Sheffield, United Kingdom

An important step in de novo drug design is to find the binding sites on the receptor, which act as starting points for the structure generation procedure.

SPROUT is a de novo ligand design program that constructs putative ligands in a stepwise fashion from a small library of generic templates, subject to the constraints of the receptor. Prior to the structure generation phase, the active pocket of the receptor is explored. The characterisation of the pocket is followed by docking fragments sequentially to a user-selected set of target sites, providing strong electrostatic constraints for structure generation.

For an average protein, SPROUT typically generates 10-40 acceptor or donor hydrogen bonding target sites, but even tightly binding ligands rarely make more than 5 hydrogen bonds to the receptor. The user therefore has to select a subset of the sites manually, and only these are used in the growing phase. Appropriate site selection is a crucial step in the de novo design procedure: different selections represent different de novo design experiments and will give rise to different answers.
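To give a feel for the size of this selection problem, the short sketch below counts the possible site subsets. It is an illustration only: the function name is hypothetical, and treating 5 as a hard upper limit on the number of selected sites is an assumption made here for the arithmetic, not a rule stated for SPROUT.

```python
from math import comb


def count_site_selections(n_sites: int, max_selected: int = 5) -> int:
    """Count the non-empty subsets of n_sites holding at most max_selected sites.

    max_selected=5 is an assumed cap, motivated by the observation that
    ligands rarely make more than 5 hydrogen bonds to the receptor.
    """
    return sum(comb(n_sites, k) for k in range(1, max_selected + 1))


if __name__ == "__main__":
    # The range of 10-40 target sites is taken from the text above.
    for n in (10, 40):
        print(n, count_site_selections(n))
```

Under these assumptions there are 637 possible selections for 10 sites and 760,098 for 40 sites, which is why manual selection benefits from some form of ranking.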

A novel flood docking method, presented here, has been implemented in SPROUT to assist target site selection by docking a large fragment library to each potential target site, or pair of neighbouring sites, and scoring the results. The library consists of the most frequently occurring fragments obtained by fragmenting the MDDR database. After scoring the docked templates, an overall site score is calculated for each site, taking into account the number of fragments docked to the site and the binding score of each individual fragment. Our hypothesis is that this site score provides a good estimate of the ability of a particular target site to bind strongly to an appropriate ligand.
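The abstract does not give the site score formula, so the following is only a sketch of one way the two ingredients named above (the number of docked fragments and their individual binding scores) could be combined. The names `DockedFragment` and `site_scores`, the `top_n` parameter, the logarithmic weighting, and the convention that a higher binding score is better are all assumptions made for illustration; they are not SPROUT's actual implementation.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DockedFragment:
    site_id: str          # target site (or site pair) the fragment docked to
    fragment_id: str      # identifier of the library fragment
    binding_score: float  # assumed convention: higher means stronger binding


def site_scores(docked, top_n=10):
    """Combine fragment count and fragment scores into one score per site.

    Hypothetical formula: mean of the top_n binding scores at the site,
    weighted by log(1 + number of fragments docked there).
    """
    scores_by_site = defaultdict(list)
    for d in docked:
        scores_by_site[d.site_id].append(d.binding_score)

    result = {}
    for site, scores in scores_by_site.items():
        best = sorted(scores, reverse=True)[:top_n]
        mean_best = sum(best) / len(best)
        result[site] = mean_best * math.log1p(len(scores))
    return result


if __name__ == "__main__":
    # Made-up poses, purely to exercise the function.
    poses = [
        DockedFragment("A", "f1", 2.0),
        DockedFragment("A", "f2", 4.0),
        DockedFragment("A", "f3", 6.0),
        DockedFragment("B", "f1", 6.0),
    ]
    ranked = sorted(site_scores(poses).items(), key=lambda kv: kv[1], reverse=True)
    for site, score in ranked:
        print(site, round(score, 3))
```

In this toy input, site A ranks above site B even though both share the same best fragment score, because more fragments dock to A. Whether the count should be weighted this way is a design choice that would need tuning against known complexes.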

This hypothesis has been validated on six protein families, each containing 15-20 PDB complexes. The method also provides a good starting set of high-scoring docked templates for connection.