IntEnzyDB Homepage


A Brief Introduction

IntEnzyDB is a relational database for enzymes that stores clean, tabulated structural and catalytic data. It contains curated, experimentally characterized kinetics parameters (i.e., 4037 kcat/KM values) with their experimental conditions for 686 enzymes and 2540 single amino acid substitution mutants, covering reactions on 929 substrates. The experimental temperature range of 95% of the curated data is 295.15 to 343.15 K. The database also contains 8086 PDB structures with chain-level, amino acid-level, and atom-level information corresponding to the kinetics parameters. The enzymes in IntEnzyDB catalyze six types of chemical reactions: Oxidoreductases (EC 1), Transferases (EC 2), Hydrolases (EC 3), Lyases (EC 4), Isomerases (EC 5), and Ligases (EC 6).

Why we built this database

We built IntEnzyDB to address the challenges of collecting, cleaning, and integrating enzyme structural and kinetics data that are stored in different databases (e.g., PDB, UniProt, BRENDA, and SABIO-RK) with varying data formats and standards. With IntEnzyDB, it is easier to explore the properties and spatial distributions of rate-enhancing mutations across a diverse range of hydrolase sequences, functions, and substrate types. Designed specifically for advanced statistical analysis and data-driven modeling, IntEnzyDB makes data preprocessing more efficient and provides data in a format that is ready for modeling.

Database architecture

IntEnzyDB has a flattened data structure to facilitate advanced statistical analysis and data-driven modeling. It stores a total of five tables. Three structural tables contain cleaned chain-level, amino acid-level, and atom-level enzyme structural information derived from RCSB PDB. The kinetics table stores cleaned kinetics data derived from BRENDA and SABIO-RK. The reference table contains one-to-one mapping information between kinetics entries and PDB structures, with PDB ID, Chain ID, and UniProtKB as foreign keys. This mapping was generated using in-house alignment and filtration methods, and it links each PDB ID and Chain ID to a UniProtKB entry along with the enzyme active site location.
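
To illustrate how a reference table keyed on PDB ID, Chain ID, and UniProtKB can link kinetics records to structural records, here is a minimal sketch using Python's built-in sqlite3 module. It does not reproduce the actual IntEnzyDB schema: every table name, column name, and data value below is a hypothetical placeholder chosen for illustration, and only three of the five tables are mocked up.

```python
import sqlite3

# NOTE: all table names, column names, and rows are hypothetical placeholders;
# they are NOT the actual IntEnzyDB schema or data.
SCHEMA = """
CREATE TABLE kinetics (
    uniprot_id    TEXT,
    mutation      TEXT,
    substrate     TEXT,
    kcat_km       REAL,
    temperature_k REAL
);
CREATE TABLE reference (
    pdb_id     TEXT,
    chain_id   TEXT,
    uniprot_id TEXT
);
CREATE TABLE chain (
    pdb_id     TEXT,
    chain_id   TEXT,
    n_residues INTEGER
);
"""


def build_demo_db():
    """Create an in-memory database populated with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO kinetics VALUES (?, ?, ?, ?, ?)",
        [
            ("P00000", "WT", "substrate_A", 1.0e4, 298.15),
            ("P00000", "A10G", "substrate_A", 2.5e4, 298.15),
        ],
    )
    conn.execute("INSERT INTO reference VALUES ('1ABC', 'A', 'P00000')")
    conn.execute("INSERT INTO chain VALUES ('1ABC', 'A', 250)")
    conn.commit()
    return conn


def kinetics_with_structure(conn):
    """Join kinetics rows to chain-level rows through the reference table."""
    query = """
        SELECT k.uniprot_id, k.mutation, k.kcat_km,
               r.pdb_id, r.chain_id, c.n_residues
        FROM kinetics AS k
        JOIN reference AS r ON r.uniprot_id = k.uniprot_id
        JOIN chain AS c ON c.pdb_id = r.pdb_id AND c.chain_id = r.chain_id
        ORDER BY k.mutation
    """
    return conn.execute(query).fetchall()


conn = build_demo_db()
rows = kinetics_with_structure(conn)
for row in rows:
    print(row)
```

Because the tables are flat and share explicit keys, the joined result can be loaded directly into a data frame or feature matrix for statistical analysis without further parsing.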