Procleave was developed to improve protease substrate cleavage site prediction by incorporating real 3D structural features of substrates. It takes substrates from the latest version of the MEROPS database and maps their sequences to PDB structures through BLAST searches, producing a relatively comprehensive dataset of substrates with 3D structures. A variety of sequence and structural features are then calculated and integrated into a novel conditional random field (CRF) model with a data-smoothing framework to train the cleavage site prediction models. Comprehensive benchmarking with different combinations of sequence and structural features shows that the smoothed structural features greatly improve prediction performance. Building on these findings, a web server was implemented for 27 major proteases and made publicly available.
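
To make the idea of smoothed structural features concrete, the following is a minimal sketch, not the Procleave implementation. It assumes that smoothing means a centered moving average over a per-residue structural value (for example, relative solvent accessibility) and that features are read from a P4 to P4' window around a candidate cleavage site. The function names `smooth` and `site_window`, the window sizes, and the padding value are all hypothetical choices made for illustration.

```python
from typing import List


def smooth(values: List[float], half_width: int = 1) -> List[float]:
    """Centered moving average over per-residue values.

    Assumption: windows are simply truncated at the sequence ends.
    """
    n = len(values)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def site_window(values: List[float], site: int, flank: int = 4,
                pad: float = 0.0) -> List[float]:
    """Values for the 2*flank residues around a candidate cleavage site.

    `site` is the 0-based index of the P1' residue, so the scissile bond
    lies between residues site-1 (P1) and site (P1'). Positions that fall
    outside the sequence are filled with `pad` (a hypothetical choice).
    """
    return [
        values[i] if 0 <= i < len(values) else pad
        for i in range(site - flank, site + flank)
    ]


# Toy per-residue accessibility values (made up for illustration).
accessibility = [0.0, 3.0, 6.0, 3.0, 0.0, 3.0, 6.0, 3.0, 0.0, 3.0]
features = site_window(smooth(accessibility), site=5, flank=4)
```

Under these assumptions, `features` holds eight smoothed values (P4 to P4') that could be passed, together with sequence-derived features, to any sequence-labelling model such as a CRF.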





