The Artima Developer Community
Python Buzz Forum
Exceptions to the graph model

Andrew Dalke

Andrew Dalke is a consultant and software developer in computational chemistry and biology.
Exceptions to the graph model Posted: Oct 7, 2003 8:33 AM

This post originated from an RSS feed registered with Python Buzz by Andrew Dalke.
Original Post: Exceptions to the graph model
Feed Title: Andrew Dalke's writings
Feed URL: http://www.dalkescientific.com/writings/diary/diary-rss.xml
Feed Description: Writings from the software side of bioinformatics and chemical informatics, with a heaping of Python thrown in for good measure.

Yesterday I said that many compounds could be represented as a graph. The computer scientist in you might ask what can't be represented as graphs. Salts (like good old NaCl) are one. These have ionic bonds caused by the interactions between strong charges, and not covalent ones caused by the overlap of orbitals. Salt dissolves so easily because a charge doesn't care what it interacts with so long as it's the opposite charge. Water has its own partial charges which love to join in on the fun, and the result is entropy heaven.
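To make the salt case concrete, here is a minimal sketch of a covalent-bond graph in Python. The `Molecule` class and its method names are hypothetical, made up for this illustration and not taken from any chemistry toolkit. Water comes out as one connected graph, while NaCl comes out as two isolated, charged nodes with no edge to record the ionic interaction.

```python
from collections import defaultdict


class Molecule:
    """Hypothetical minimal molecular graph: atoms are nodes, covalent bonds are edges."""

    def __init__(self):
        self.atoms = []                    # list of (element, formal_charge)
        self.neighbors = defaultdict(set)  # atom index -> indices of bonded atoms

    def add_atom(self, element, charge=0):
        self.atoms.append((element, charge))
        return len(self.atoms) - 1

    def add_bond(self, a, b):
        self.neighbors[a].add(b)
        self.neighbors[b].add(a)

    def components(self):
        """Return the connected components as sorted lists of atom indices."""
        seen, result = set(), []
        for start in range(len(self.atoms)):
            if start in seen:
                continue
            seen.add(start)
            stack, component = [start], []
            while stack:
                node = stack.pop()
                component.append(node)
                for nxt in self.neighbors[node]:
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
            result.append(sorted(component))
        return result


# Water: one oxygen covalently bonded to two hydrogens.
water = Molecule()
o = water.add_atom("O")
for _ in range(2):
    water.add_bond(o, water.add_atom("H"))

# Table salt: two ions and no covalent bond to draw between them.
salt = Molecule()
salt.add_atom("Na", +1)
salt.add_atom("Cl", -1)

print(water.components())  # a single component holding all three atoms
print(salt.components())   # two components of one atom each
```

The graph can still store the two ions, but nothing in it says they belong together, which is the sense in which a salt is an exception.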

Crystals are infinite in extent. What's the graph of diamond or graphite?

A graph based on covalent bonds cannot handle metals, which have delocalized electrons and so no real covalent bonds. There are even problems when there's only one metal atom, as in the heme of hemoglobin or in photosynthesis. An atom binds to the delocalized ring of the heme and not to any one atom. There are ways to fake it, but I recall that Cherwell's toolkit supported "ring" as a new graph element, in addition to atoms and bonds.

And then there's hydrogen gas absorbed in platinum. Not only is Pt a metal, but the hydrogens bind reversibly to the Pt atoms and so move about.

Polymers cause more problems. These can have a huge number of different lengths and branchings ("10^20 and more" for polysaccharides!). While each one can be represented as a discrete graph, you'll need more than 64 bits to enumerate them all. Even if you do that, it isn't obvious what you can do with that data.
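The 64-bit remark is easy to check with Python's arbitrary-precision integers. This is a quick sketch that reads the quoted figure as 10^20 (ten to the twentieth power); the constant names are mine.

```python
POLYMER_COUNT = 10**20   # the quoted lower bound, read as ten to the twentieth
MAX_UINT64 = 2**64 - 1   # largest value an unsigned 64-bit integer can hold

# Labelling POLYMER_COUNT distinct structures needs the indices 0 .. POLYMER_COUNT - 1.
bits_needed = (POLYMER_COUNT - 1).bit_length()
fits_in_64_bits = POLYMER_COUNT - 1 <= MAX_UINT64

print(bits_needed, fits_in_64_bits)
```

Since 2^64 is roughly 1.8 x 10^19, the count overflows a 64-bit index by about a factor of five, and that is only the lower bound.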

The nice thing about these exceptions is that they rarely make good drugs. Good drugs are small, have well-defined shapes, and come in one or at most a couple of chiral forms. The exceptions are usually protein-based drugs, like insulin, where non-graph models are more appropriate.

The usefulness of the molecular graph decreases with the size of the molecule. Molecules have a three-dimensional structure (called a conformation) as well as the two-dimensional structure (the graph). Small molecules usually have one or only a few common conformations, and these can be determined reasonably well directly from the graph using Corina, Omicron, Concord, or similar programs. Even when the prediction isn't exactly the minimum energy state, thermal motion and other fluctuations make it likely that the molecule will be close to a predicted conformation at some point.

The larger the molecule, the less likely it will be predicted correctly, and the less likely it will be ergodic. (That's fancy for "explore all phase space." That's fancy for "take on all available conformations.") Huge molecules like proteins are extremely hard to predict ab initio and are not ergodic. That's the whole protein folding problem, and the Levinthal paradox is that folding does happen despite the combinatorial problems.
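A back-of-the-envelope version of the Levinthal argument shows the scale of the combinatorics. The numbers below are illustrative assumptions in the spirit of the usual textbook estimate, not figures from this post: a 100-residue chain, 3 conformations per residue, and 10^13 conformations sampled per second.

```python
# All four constants are assumptions chosen for illustration.
RESIDUES = 100
STATES_PER_RESIDUE = 3
SAMPLES_PER_SECOND = 1e13
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Every residue picks its state independently, so the counts multiply.
conformations = STATES_PER_RESIDUE ** RESIDUES

# Time for an exhaustive random search at the assumed sampling rate.
years = conformations / SAMPLES_PER_SECOND / SECONDS_PER_YEAR

print(f"{conformations:.2e} conformations, about {years:.1e} years to try them all")
```

Under these assumptions the search would take on the order of 10^27 years, far longer than the age of the universe, while real proteins fold in seconds or less. They clearly do not visit every conformation, which is the non-ergodic behavior described above.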

When a molecule gets that large, its conformation plays a much more important role. For example, a serine protease works because of the arrangement of its catalytic triad. You can change a lot of the rest of the molecule, making a very different graph, and still have a serine protease so long as the fold is correct.

A better model for this stage is the bioinformatics one, where proteins and DNA are represented by one-dimensional sequences (a high-level abstraction of the graph). This allows similarity searches, which are extremely hard to do for graphs, and supports concepts like phylogenies to describe the evolutionary relationship between sequences.
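Part of why similarity is so much easier on sequences is that two strings can be compared with a small dynamic program. The sketch below computes plain edit distance (Levenshtein distance) between two sequences. It is only an illustration: real bioinformatics searches use scored alignments with substitution matrices and gap penalties, and the function name and example sequences here are made up.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of single-letter insertions, deletions, and substitutions turning a into b."""
    # previous[j] holds the distance between the first i-1 letters of a and the first j letters of b.
    previous = list(range(len(b) + 1))
    for i, letter_a in enumerate(a, start=1):
        current = [i]  # turning i letters of a into an empty string takes i deletions
        for j, letter_b in enumerate(b, start=1):
            cost = 0 if letter_a == letter_b else 1
            current.append(min(
                previous[j] + 1,         # delete letter_a
                current[j - 1] + 1,      # insert letter_b
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


# Two made-up DNA fragments that differ at two positions.
print(edit_distance("GATTACA", "GACTATA"))
```

The running time grows with the product of the two lengths. No comparably cheap general method is known for measuring how much two molecular graphs have in common.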



Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use