Chemical classifications for biology and medicine

Minoru Kanehisa
Institute for Chemical Research, Kyoto University
ACS National Meeting, San Diego, March 15, 2016

How chemical classifications are implemented in:
  KEGG
How chemical classifications can help understand:
  Genomes
  Drug interactions

KEGG (Kyoto Encyclopedia of Genes and Genomes)

Category              Database name    Content
Systems Information   KEGG PATHWAY     KEGG pathway maps
                      KEGG BRITE       BRITE functional hierarchies
                      KEGG MODULE      KEGG modules
Genomic Information   KEGG ORTHOLOGY   KEGG Orthology (KO) groups
                      KEGG GENOME      KEGG organisms with complete genomes
                      KEGG GENES       Gene catalogs of complete genomes
                      KEGG SSDB        Sequence similarity database for GENES
Chemical Information  KEGG COMPOUND    Metabolites and other small molecules
                      KEGG GLYCAN      Glycans
                      KEGG REACTION    Biochemical reactions
                      KEGG RPAIR       Reactant pairs
                      KEGG RCLASS      Reaction class
                      KEGG ENZYME      Enzyme nomenclature
Health Information    KEGG DISEASE     Human diseases
                      KEGG DRUG        Drugs
                      KEGG DGROUP      Drug groups
                      KEGG ENVIRON     Crude drugs and health-related substances
                      JAPIC            Drug labels in Japan
                      DailyMed         Drug labels in the USA (links only)

http://www.kegg.jp/
http://www.genome.jp/kegg/

Chemical classifications:
  Represent scientific knowledge
  Can be used for inference and prediction

Chemical classifications in KEGG include:
  Ontologies
    KEGG BRITE database
  Class-instance relationships
    KEGG Orthology (KO) for genes and proteins
    Reaction class (RC) for chemical reactions
    Drug group (DG) for drugs

Three types of molecular networks in KEGG

Type                       Instance (Database)        Class (Database)                      Functional unit
Gene/protein network       Gene/Protein (KEGG GENES)  Ortholog (KEGG ORTHOLOGY), KO         KEGG module
Chemical reaction network  Reaction (KEGG REACTION)   Reaction class (KEGG RCLASS), RC      Reaction module
Drug interaction network   Drug (KEGG DRUG)           Drug group (KEGG DGROUP), DG          Interaction unit

KEGG Orthology (KO)
Genome-based generalization of experimental knowledge

KO (K number entry) represents a manually defined ortholog group corresponding to the KEGG pathway node and/or the BRITE hierarchy node (bottom leaf)

Example: K05032 (ABC transporter ABCC8)
  Pathway node: Type II diabetes mellitus (map04930)
  BRITE hierarchy: Transporters (ko02000)
    ABC Transporters, Eukaryotic Type
      ABCA (ABC1) subfamily
      ABCB (MDR/TAP) subfamily
      ABCC (CFTR/MRP) subfamily
        ABCC8, 9 subgroups
          K05032
      ABCD (ALD) subfamily
    ABC Transporters, Prokaryotic Type
    Solute Carrier Family (SLC)
    Major Facilitator Superfamily (MFS)
    Phosphotransferase System (PTS)
    Other Transporters

KEGG Module
Functional unit in the molecular interaction network

Citrate Cycle (TCA Cycle) consists of two modules: M00010 and M00011
  M00010 = K01647 (K01681,K01682) (K00031,K00030)

          Eukaryotes  Bacteria  Archaea
M00010    90.7%       75.4%     51.8%
M00011    62.3%       62.6%     37.2%

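The M00010 definition above reads as a Boolean expression over K numbers: a space separates steps that must all be present, and a comma separates alternatives. A minimal Python sketch of checking a genome's KO set against such a definition follows. The function name module_complete and the example KO sets are hypothetical, and the parser assumes only the subset of the notation shown on the slide (K numbers, spaces, commas, parentheses), with the comma binding more loosely than the space.

```python
import re

# Tokens in the subset of the definition notation used on the slide.
TOKEN = re.compile(r"K\d{5}|[(),]|\s+")


def module_complete(definition, kos):
    """Return True if the KO set satisfies the module definition.

    Assumed grammar (illustrative, covers only the slide's notation):
        expr := seq (',' seq)*     alternatives (OR)
        seq  := atom (' ' atom)*   consecutive steps (AND)
        atom := K number | '(' expr ')'
    """
    text = definition.strip()
    raw = TOKEN.findall(text)
    if "".join(raw) != text:
        raise ValueError("definition contains unsupported symbols")
    tokens = [" " if t.isspace() else t for t in raw]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def atom():
        nonlocal pos
        tok = peek()
        if tok == "(":
            pos += 1
            value = expr()
            if peek() != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1
            return value
        if tok is None or not tok.startswith("K"):
            raise ValueError(f"unexpected token {tok!r}")
        pos += 1
        return tok in kos

    def seq():
        nonlocal pos
        value = atom()
        while peek() == " ":
            pos += 1
            value = atom() and value  # parse first, then combine
        return value

    def expr():
        nonlocal pos
        value = seq()
        while peek() == ",":
            pos += 1
            value = seq() or value  # parse first, then combine
        return value

    result = expr()
    if pos != len(tokens):
        raise ValueError("trailing tokens in definition")
    return result


M00010 = "K01647 (K01681,K01682) (K00031,K00030)"
complete = module_complete(M00010, {"K01647", "K01682", "K00031"})
incomplete = module_complete(M00010, {"K01647", "K01681"})
```

Here the first KO set covers all three steps, while the second lacks both alternatives of the last step.
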
Reaction Module

Reaction R00351
  C00036 + C00024 + C00001 <=> C00158 + C00010
  (Oxaloacetate + Acetyl-CoA + H2O <=> Citrate + CoA)
  KO: K01647   EC: 2.3.3.1
Reactant pair RP00177
  C00036_C00158
Reaction class RC00067
  C1d-C5a:C1b-*:C1b+C6a+O1a-C1b+C6a+O5a
  [Figure: reactant pair structures labeled with atom types C1b, C1d, C5a, C6a, O1a, O5a, O6a]
Reaction module RM001-01 (shown with KEGG module M00010)
  RC00067 (RC00498+RC00618,RC00497) (RC00084+RC00626,RC00114)

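As a small illustration of how the reaction equation and the reactant pair relate, the sketch below splits the R00351 equation into its two sides and checks that a pair such as C00036_C00158 joins one compound from each side. The helper names parse_equation and is_reactant_pair are hypothetical, and the parser assumes the plain form shown on the slide (C numbers joined by " + " and "<=>", no stoichiometric coefficients).

```python
def parse_equation(equation):
    """Split 'A + B <=> C + D' into (left-side list, right-side list)."""
    left, right = equation.split("<=>")

    def compounds(side):
        return [c.strip() for c in side.split("+") if c.strip()]

    return compounds(left), compounds(right)


def is_reactant_pair(pair, equation):
    """True if the two compounds of 'Cxxxxx_Cyyyyy' sit on opposite sides."""
    first, second = pair.split("_")
    left, right = parse_equation(equation)
    return (first in left and second in right) or (
        first in right and second in left
    )


R00351 = "C00036 + C00024 + C00001 <=> C00158 + C00010"
substrates, products = parse_equation(R00351)
pair_ok = is_reactant_pair("C00036_C00158", R00351)
```
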
Correspondence of KEGG modules and Reaction modules

TCA cycle
  M00010 / RM001: oxaloacetate => 2-oxoglutarate
Lysine biosynthesis
  M00433 / RM001: 2-oxoglutarate => 2-oxoadipate

Modular architecture of metabolic network
KEGG modules (genomic units) and reaction modules (chemical units)

[Figure: map01210 2-Oxocarboxylic acid metabolism, annotated with KEGG modules M00010, M00019, M00028, M00031, M00033, M00432, M00433, M00535, M00608, M00763 and with reaction module labels RM001, RM002, RM030, RM032, RM033, RG001]

map00550 Peptidoglycan biosynthesis
  Penicillin binding proteins (PBPs)
map00311 Penicillin and cephalosporin biosynthesis
  Modules: M00672, M00673
  Products: Penicillin, Cephalosporin C, Cephamycin C

map01501 beta-lactam resistance
  Module: M00625
  Resistance mechanisms:
    Altered target site
    Enzymatic inactivation
    Decreased penetration
    Increased efflux

Drug Interaction Unit
A unit of drug-target interaction and drug metabolism for integration with gene/protein and chemical reaction networks

Diagram columns: Drug | Target molecule | Enzyme, transporter, etc.

Drug response
  Warfarin targets VKORC1 (polymorphism)
  CYP2C9 (polymorphism) metabolizes Warfarin
Antimicrobial resistance
  β-lactam targets PBP (mutations)
  β-lactamase (mutations) hydrolyzes β-lactam
Drug-drug interaction
  Clarithromycin targets 50S ribosomal subunit
  Clarithromycin inhibits CYP3A4
  Ergotamine targets HTR1A
  CYP3A4 action on Ergotamine is marked X (blocked)

Drug Group
For generalization of known drug-drug interactions

DG01551 Macrolide antibiotic
  DG01874 14-membered ring macrolide antibiotic
    DG00436 Erythromycin
    DG00603 Oleandomycin
    DG00605 Clarithromycin
      D00276 Clarithromycin
      D07710 Clarithromycin lactobionate
  DG01875 15-membered ring macrolide antibiotic
  DG01876 16-membered ring macrolide antibiotic
Annotation: Strong CYP3A4 inhibitor

DG01472 Dopamine agonist
  DG01467 Dopamine D1-receptor agonist
  DG01468 Dopamine D2-receptor agonist
    DG00833 Dihydroergotamine
      D07837 Dihydroergotamine
      D02211 Dihydroergotamine mesylate
        0187-0245 MIGRANAL
  DG01469 Dopamine D3-receptor agonist
  DG01470 Dopamine D4-receptor agonist
  DG01471 Dopamine D5-receptor agonist
Annotation: CYP3A4 substrate

WARNING
Serious and/or life-threatening peripheral ischemia has been associated with the coadministration of DIHYDROERGOTAMINE with potent CYP 3A4 inhibitors including protease inhibitors and macrolide antibiotics.

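The generalization step on this slide can be sketched as property inheritance along the drug group hierarchy: a property recorded once on a group applies to every drug beneath it, so a label warning about one product extends to its relatives. The Python sketch below is illustrative only. The PARENT table covers just the entries named on the slide, and attaching "CYP3A4 inhibitor" to DG00605 and "CYP3A4 substrate" to DG00833 is an assumption about where the slide's annotations belong, not a statement of how KEGG stores them.

```python
# Child -> parent links, restricted to entries named on the slide.
PARENT = {
    "D00276": "DG00605",
    "D07710": "DG00605",
    "DG00436": "DG01874",
    "DG00603": "DG01874",
    "DG00605": "DG01874",
    "DG01874": "DG01551",
    "D07837": "DG00833",
    "D02211": "DG00833",
    "DG00833": "DG01468",
    "DG01468": "DG01472",
}

# Class-level properties (assumed attachment points, see lead-in).
PROPERTIES = {
    "DG00605": {"CYP3A4 inhibitor"},
    "DG00833": {"CYP3A4 substrate"},
}


def ancestors(entry):
    """Return the entry followed by its chain of parent groups."""
    chain = [entry]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def inherited_properties(entry):
    """Collect properties from the entry and every group above it."""
    props = set()
    for node in ancestors(entry):
        props |= PROPERTIES.get(node, set())
    return props


def flags_interaction(drug_a, drug_b, enzyme="CYP3A4"):
    """True if one drug inhibits the enzyme that the other depends on."""
    a = inherited_properties(drug_a)
    b = inherited_properties(drug_b)
    inhibitor = f"{enzyme} inhibitor"
    substrate = f"{enzyme} substrate"
    return (inhibitor in a and substrate in b) or (
        inhibitor in b and substrate in a
    )


# Clarithromycin lactobionate with dihydroergotamine mesylate.
flagged = flags_interaction("D07710", "D02211")
```

Because the properties sit on groups rather than on individual D numbers, adding a new salt or formulation under DG00605 would be flagged without any further annotation.
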
Drug Group
For assessing drug response from personal genome

DG01646 Proton pump inhibitor
  DG00020 Omeprazole
  DG00021 Pantoprazole
  DG00022 Rabeprazole
  DG00023 Esomeprazole
  D00355 Lansoprazole
  D08903 Dexlansoprazole
Annotation: CYP2C19 substrate

Drug Group
For establishing links from antimicrobial resistance genes and gene sets

DG01710 beta-lactam antibiotic
  DG01713 Penicillin skeleton group
    DG01778 beta-lactamase sensitive penicillin
    DG01779 beta-lactamase resistant penicillin
    DG01780 Extended spectrum penicillin
  DG01479 beta-lactamase inhibitor
  DG01458 Carbapenem
    DG00591 Meropenem
      D08185 Meropenem
      D02222 Meropenem hydrate
        0310-0321 MERREM IV injection 1g
        0310-0325 MERREM IV injection 500mg
    DG00592 Ertapenem
    DG00593 Doripenem
    DG01212 Imipenem
  DG01714 Cephalosporin skeleton group
    DG01774 First-generation cephalosporin
    DG01775 Second-generation cephalosporin
    DG01776 Third-generation cephalosporin
    DG01777 Fourth-generation cephalosporin
  DG01454 Monobactam

Carbapenemase
  K18768 (KPC)
  K18970 (GES)
  K19316 (IMI/SME)
  K18782 (IMP)
  K18781 (VIM)
  K18780 (NDM)
  K19099 (GIM)
  K19216 (IND)
  K18793 (OXA-23)
  K18794 (OXA-51)
  K19318 (OXA-213)
  etc.
Extended-spectrum beta-lactamase

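One way to picture the link from resistance genes to drug groups is a lookup from a genome's KO assignments to the carbapenemase K numbers listed on this slide, which then points at the affected drug group. The sketch below is a hedged illustration: the function names and the example KO set are hypothetical, and pairing the carbapenemase KOs with DG01458 Carbapenem is an assumption read from the slide layout rather than a documented KEGG relation.

```python
# Carbapenemase K numbers as listed on the slide (the slide ends with "etc.").
CARBAPENEMASE_KOS = {
    "K18768": "KPC",
    "K18970": "GES",
    "K19316": "IMI/SME",
    "K18782": "IMP",
    "K18781": "VIM",
    "K18780": "NDM",
    "K19099": "GIM",
    "K19216": "IND",
    "K18793": "OXA-23",
    "K18794": "OXA-51",
    "K19318": "OXA-213",
}

# Assumed link from the gene set to the drug group it compromises.
LINKED_DRUG_GROUP = ("DG01458", "Carbapenem")


def carbapenemase_hits(genome_kos):
    """Return sorted (K number, family) pairs found in the genome."""
    return sorted(
        (ko, CARBAPENEMASE_KOS[ko])
        for ko in genome_kos
        if ko in CARBAPENEMASE_KOS
    )


def resistance_report(genome_kos):
    """Map the linked drug group to its supporting hits, if any."""
    hits = carbapenemase_hits(genome_kos)
    return {LINKED_DRUG_GROUP: hits} if hits else {}


# Hypothetical KO assignments for one genome.
example_kos = {"K01647", "K18780", "K18768"}
report = resistance_report(example_kos)
```
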
Antibiotic Resistance Threats in the United States (CDC, 2013)

A. Urgent threats
  A1. Clostridium difficile
  A2. Carbapenem-resistant Enterobacteriaceae (CRE)
  A3. Drug-resistant Neisseria gonorrhoeae
B. Serious threats
  B1. Multidrug-resistant Acinetobacter
  B2. Drug-resistant Campylobacter
  B3. Fluconazole-resistant Candida (a fungus)
  B4. Extended spectrum beta-lactamase producing Enterobacteriaceae (ESBLs)
  B5. Vancomycin-resistant Enterococcus (VRE)
  B6. Multidrug-resistant Pseudomonas aeruginosa
  B7. Drug-resistant non-typhoidal Salmonella
  B8. Drug-resistant Salmonella enterica serovar Typhi
  B9. Drug-resistant Shigella
  B10. Methicillin-resistant Staphylococcus aureus (MRSA)
  B11. Drug-resistant Streptococcus pneumoniae
  B12. Drug-resistant tuberculosis
C. Concerning threats
  C1. Vancomycin-resistant Staphylococcus aureus (VRSA)
  C2. Erythromycin-resistant group A Streptococcus
  C3. Clindamycin-resistant group B Streptococcus

KEGG BRITE Database

Pathways and Ontologies
Genes and Proteins
  Protein families (K numbers)
  RNA family (K numbers)
Compounds and Reactions
  Compounds (C numbers)
  Reactions (R/RC numbers)
Drugs
  Drug classifications (D/C numbers)
  Other drug information (D/E/C numbers)
Diseases
  Human diseases (H numbers)
Organisms and Cells

BRITE htext files
  ATC classification
  USP drug classification
  Therapeutic category of drugs in Japan
  Classification of Japanese OTC drugs
  Risk category of Japanese OTC drugs
  Pharmaceutical additives in Japan
  Target-based classification of drugs
  Antiinfectives
  etc.

BRITE table files
  Antineoplastics
  Antibacterials
  Antivirals
  Antifungals
  Antiparasitics
  Antidiabetics
  Hypolipidemic agents
  etc.

Summary

Chemical classifications
  Chemical classifications in KEGG include ontologies and class-instance relationships
  Class-instance relationships are used to represent and generalize scientific knowledge, enabling inference and prediction
For biology
  Chemical classifications, such as reaction class, can be used to improve gene/protein annotation
For medicine
  Drug groups are defined to represent drug interactions, including adverse drug-drug interactions, drug-personal genome relationships, and drug-antimicrobial resistance relationships