@@ -7,6 +7,15 @@ Adjacency Lists
77
88.. module:: rmgpy.molecule.adjlist
99
10+
11+ .. note::
12+ The adjacency list syntax changed in July 2014.
13+ The minimal requirement for most translations is to prefix the number
14+ of unpaired electrons with the letter `u`.
15+ The new syntax, however, allows much
16+ greater flexibility, including definition of lone pairs, partial charges,
17+ wildcards, and molecule multiplicities.
18+
1019.. note::
1120 To quickly visualize any adjacency list, or to generate an adjacency list from
1221 other types of molecular representations such as SMILES, InChI, or even common
@@ -21,49 +30,100 @@ RMG -- but extended to allow for specification of extra semantic information.
2130The first line of most adjacency lists is a unique identifier for the molecule
2231or pattern the adjacency list represents. This is not strictly required, but
2332is recommended in most cases. Generally the identifier should only use
24- alphanumeric characters and the underscore, as if an identifer in many popular
33+ alphanumeric characters and the underscore, as if an identifier in many popular
2534programming languages. However, strictly speaking any non-space ASCII character
2635is allowed.
2736
28- After the identifier line, each subsequent line describes a single atom and its
37+ The subsequent lines may contain keyword-value pairs. Currently there is only
38+ one keyword, ``multiplicity``.
39+
40+ For species or molecule declarations, the value after ``multiplicity`` defines
41+ the spin multiplicity of the molecule, e.g. ``multiplicity 1`` for most ground state
42+ closed shell species, ``multiplicity 2`` for most radical species,
43+ and ``multiplicity 3`` for a triplet biradical.
44+ If the ``multiplicity`` line is not present then a value of
45+ (1 + number of unpaired electrons) is assumed.
46+ Thus, it can usually be omitted, but if present can be used to distinguish,
47+ for example, singlet CH2 from triplet CH2.
48+
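The default rule above can be sketched in a few lines of Python. This is only an illustration of the stated arithmetic, not RMG's own code; the function name ``default_multiplicity`` is hypothetical.

```python
def default_multiplicity(unpaired_counts):
    """Return the multiplicity assumed when no ``multiplicity`` line is given.

    ``unpaired_counts`` holds the ``u`` value of every atom in the adjacency
    list. The rule from the text is 1 + (number of unpaired electrons).
    """
    return 1 + sum(unpaired_counts)


# Closed-shell molecule: every atom is u0, so a singlet is assumed.
print(default_multiplicity([0, 0, 0]))

# CH2 written with u2 on the carbon: a triplet is assumed by default,
# so the singlet would need an explicit ``multiplicity 1`` line.
print(default_multiplicity([2]))
```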
49+ If defining a functional :class:`~rmgpy.molecule.Group`, then the value must be a list,
50+ which defines the multiplicities that will be matched by the group, e.g.
51+ ``multiplicity [1,2,3,4,5]`` or, for a single value, ``multiplicity [1]``.
52+ If the multiplicity line is omitted, then ``multiplicity [1,2,3,4,5]`` is assumed.
53+
54+ After the identifier line and keyword-value lines,
55+ each subsequent line describes a single atom and its
2956local bond structure. The format of these lines is a whitespace-delimited list
3057with tokens ::
3158
32- <number> [<label>] <element> <radicals> <bondlist>
59+ <number> [<label>] <element> u<unpaired> [p<pairs>] [c<charge>] <bondlist>
3360
3461The first item is the number used to identify that atom. Any number may be used,
3562though it is recommended to number the atoms sequentially starting from one.
3663Next is an optional label used to tag that atom; this should be an
37- asterisk followed by a unique number for the label, e.g. ``*1``. After that is
38- the atom's element, indicated by its atomic symbol, followed by the number of
39- radical electrons on the atom. The last set of tokens is the list of bonds.
64+ asterisk followed by a unique number for the label, e.g. ``*1``.
65+ In some cases (e.g. thermodynamics groups) there is only one labeled atom, and the label
66+ is just an asterisk with no number: ``*``.
67+
68+ After that is
69+ the atom's element or atom type, indicated by its atomic symbol, followed by
70+ a sequence of tokens describing the electronic state of the atom:
71+
72+ * ``u0`` number of **unpaired** electrons (e.g. radicals)
73+ * ``p0`` number of lone **pairs** of electrons, common on oxygen and nitrogen
74+ * ``c0`` formal **charge** on the atom, e.g. ``c-1`` (negatively charged),
75+ ``c0``, ``c+1`` (positively charged)
76+
77+ For :class:`~rmgpy.molecule.Molecule` definitions:
78+ The value must be a single integer (and for charge must have a + or - sign if not equal to 0).
79+ The number of unpaired electrons (i.e. radical electrons) is required, even if zero.
80+ The number of lone pairs and the formal charge are assumed to be zero if omitted.
81+
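As a rough illustration of the molecule rules above, the following sketch splits one atom line into its tokens and applies the stated defaults. It is a simplified stand-in written for this page, not the parser in ``rmgpy.molecule.adjlist``; the function name ``parse_molecule_atom_line`` and the returned dictionary layout are assumptions.

```python
import re


def parse_molecule_atom_line(line):
    """Parse one atom line of a molecule adjacency list (simplified sketch)."""
    tokens = line.split()
    number = int(tokens.pop(0))
    # The label is optional and starts with an asterisk, e.g. ``*1`` or ``*``.
    label = tokens.pop(0) if tokens[0].startswith("*") else None
    element = tokens.pop(0)

    # Lone pairs and charge default to zero; unpaired electrons are required.
    state = {"u": None, "p": 0, "c": 0}
    bonds = []
    for token in tokens:
        bond = re.fullmatch(r"\{(\d+),([A-Za-z]+)\}", token)
        if bond:
            bonds.append((int(bond.group(1)), bond.group(2)))
            continue
        # ``u``/``p`` take a plain integer; a nonzero charge needs a sign.
        match = re.fullmatch(r"([up])(\d+)|(c)(0|[+-]\d+)", token)
        if match is None:
            raise ValueError("unrecognized token: %r" % token)
        key = match.group(1) or match.group(3)
        value = match.group(2) if match.group(2) is not None else match.group(4)
        state[key] = int(value)

    if state["u"] is None:
        raise ValueError("the number of unpaired electrons (u...) is required")
    return {"number": number, "label": label, "element": element,
            "unpaired": state["u"], "pairs": state["p"],
            "charge": state["c"], "bonds": bonds}


print(parse_molecule_atom_line("5 *1 C u0 {4,S} {6,S}"))
```

Omitting the ``u`` token, or writing a nonzero charge without a sign (``c1``), raises a ``ValueError`` in this sketch, mirroring the requirements listed above.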
82+ For :class:`~rmgpy.molecule.Group` definitions:
83+ The value can be an integer or a list of integers (with signs, for charges),
84+ e.g. ``u[0,1,2]`` or ``c[0,+1,+2,+3,+4]``, or may be a wildcard ``x``
85+ which matches any valid value,
86+ e.g. ``px`` is the same as ``p[0,1,2,3,4]`` and ``cx`` is the same as
87+ ``c[-4,-3,-2,-1,0,+1,+2,+3,+4]``. Lists must be enclosed in square brackets,
88+ and separated by commas, without spaces.
89+ If lone pairs or formal charges are omitted from a group definition,
90+ the wildcard is assumed.
91+
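The group rules above (single value, bracketed list, or wildcard, with the wildcard assumed when lone pairs or charges are omitted) might be expressed as follows. This is a hedged sketch, not RMG's implementation: the names ``WILDCARD_VALUES``, ``expand_group_token`` and ``group_atom_state`` are hypothetical, and only the ``px`` and ``cx`` expansions given in the text are included.

```python
# Wildcard expansions taken from the text: px and cx. The text does not
# give an expansion for ``ux``, so this sketch rejects it.
WILDCARD_VALUES = {"p": range(0, 5), "c": range(-4, 5)}


def expand_group_token(token):
    """Expand a group token such as ``u[0,1,2]`` or ``cx`` to (key, values)."""
    key, value = token[0], token[1:]
    if key not in ("u", "p", "c"):
        raise ValueError("unknown token: %r" % token)
    if value == "x":
        if key not in WILDCARD_VALUES:
            raise ValueError("no wildcard expansion known for %r" % key)
        return key, list(WILDCARD_VALUES[key])
    if value.startswith("[") and value.endswith("]"):
        items = value[1:-1].split(",")  # comma-separated, no spaces
    else:
        items = [value]                 # a single integer
    return key, [int(item) for item in items]


def group_atom_state(tokens):
    """Collect the electronic-state tokens of one group atom.

    Lone pairs and charge fall back to the wildcard when omitted.
    """
    state = {"p": list(WILDCARD_VALUES["p"]), "c": list(WILDCARD_VALUES["c"])}
    for token in tokens:
        key, values = expand_group_token(token)
        state[key] = values
    return state


print(expand_group_token("c[0,+1,+2,+3,+4]"))
print(group_atom_state(["u[0,1]"]))
```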
92+
93+ The last set of tokens is the list of bonds.
4094To indicate a bond, place the number of the atom at the other end of the bond
4195and the bond type within curly braces and separated by a comma, e.g. ``{2,S}``.
42- Multiple bonds to the same atom should be separated by whitespace.
96+ Multiple bonds from the same atom should be separated by whitespace.
4397
4498.. note::
4599 You must take care to make sure each bond is listed on the lines of *both*
46100 atoms in the bond, and that these entries have the same bond type. RMG will
47101 raise an exception if it encounters such an invalid adjacency list.
48102
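The consistency requirement in the note above can be checked before handing an adjacency list to RMG. The sketch below is a standalone illustration under the assumption that bonds have already been collected into a dictionary; the function name ``check_bond_symmetry`` and the data layout are hypothetical, and RMG's own exception type and message will differ.

```python
def check_bond_symmetry(bonds_by_atom):
    """Verify that every bond appears on both atoms with the same type.

    ``bonds_by_atom`` maps an atom number to ``{neighbor number: bond type}``.
    Raises ValueError on the first inconsistency found.
    """
    for atom, bonds in bonds_by_atom.items():
        for neighbor, bond_type in bonds.items():
            reverse = bonds_by_atom.get(neighbor, {}).get(atom)
            if reverse is None:
                raise ValueError(
                    "bond {%d,%s} on atom %d is missing from atom %d"
                    % (neighbor, bond_type, atom, neighbor))
            if reverse != bond_type:
                raise ValueError(
                    "bond between atoms %d and %d is %s on one line and %s "
                    "on the other" % (atom, neighbor, bond_type, reverse))


# A consistent fragment: atom 1 and atom 2 both list a double bond.
check_bond_symmetry({1: {2: "D"}, 2: {1: "D", 3: "S"}, 3: {2: "S"}})
```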
103+
49104When writing a molecular substructure pattern, you may specify multiple
50- elements, radical counts, and bond types as a comma-separated list inside curly
51- braces. For example, to specify any carbon or oxygen atom, use the syntax
52- ``{C,O}``. Atom types may also be used as a shorthand. (Atom types can also be
105+ elements, radical counts, and bond types as a comma-separated list inside square
106+ brackets. For example, to specify any carbon or oxygen atom, use the syntax
107+ ``[C,O]``. For a single or double bond to atom 2, write ``{2,[S,D]}``.
108+
109+ Atom types such as ``R!H`` or ``Cdd`` may also be used as a shorthand. (Atom types
110+ like ``Cdd`` can also be
53111used in full molecules, but this use is discouraged, as RMG can compute them
54112automatically for full molecules.)
55113
56114Below is an example adjacency list, for 1,3-hexadiene, with the weakest bond in
57115the molecule labeled with ``*1`` and ``*2``. Note that hydrogen atoms
58- can be omitted if desired, as their presence is inferred::
116+ can be omitted if desired, as their presence is inferred, provided that unpaired
117+ electrons, lone pairs, and charges are all correctly defined::
59118
60119 HXD13
61- 1 C 0 {2,D}
62- 2 C 0 {1,D} {3,S}
63- 3 C 0 {2,S} {4,D}
64- 4 C 0 {3,D} {5,S}
65- 5 *1 C 0 {4,S} {6,S}
66- 6 *2 C 0 {5,S}
120+ multiplicity 1
121+ 1 C u0 {2,D}
122+ 2 C u0 {1,D} {3,S}
123+ 3 C u0 {2,S} {4,D}
124+ 4 C u0 {3,D} {5,S}
125+ 5 *1 C u0 {4,S} {6,S}
126+ 6 *2 C u0 {5,S}
67127
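To see why the hydrogens in the example above can be left out, the following sketch counts the hydrogens implied on each carbon. It assumes a simple valence rule (four for neutral carbon, minus bond orders and unpaired electrons), which is an assumption of this example and not necessarily how RMG performs the inference; the names ``BOND_ORDER`` and ``implied_hydrogens`` are hypothetical, and only the ``S`` and ``D`` bond types used in the example are handled.

```python
# Bond orders for the two bond types that appear in the example.
BOND_ORDER = {"S": 1, "D": 2}

HXD13 = """\
HXD13
multiplicity 1
1    C u0 {2,D}
2    C u0 {1,D} {3,S}
3    C u0 {2,S} {4,D}
4    C u0 {3,D} {5,S}
5 *1 C u0 {4,S} {6,S}
6 *2 C u0 {5,S}
"""


def implied_hydrogens(adjlist, valence=4):
    """Return {atom number: implied H count} for neutral carbon atoms."""
    counts = {}
    for line in adjlist.splitlines():
        tokens = line.split()
        # Skip the identifier and keyword lines; atom lines start with a number.
        if not tokens or not tokens[0].isdigit():
            continue
        order = 0
        unpaired = 0
        for token in tokens[1:]:
            if token.startswith("{"):
                _, bond_type = token.strip("{}").split(",")
                order += BOND_ORDER[bond_type]
            elif token.startswith("u"):
                unpaired = int(token[1:])
        counts[int(tokens[0])] = valence - order - unpaired
    return counts


hydrogens = implied_hydrogens(HXD13)
print(hydrogens)
print(sum(hydrogens.values()))  # total hydrogens, consistent with C6H10
```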
68128The allowed element types, radicals, and bonds are listed in the following table:
69129
@@ -77,10 +137,10 @@ The allowed element types, radicals, and bonds are listed in the following table
77137 | | H | Hydrogen atom |
78138 | +----------+---------------------+
79139 | | S | Sulfur atom |
80- +----------------------+----------+---------------------+
81- | Nonreactive Elements | N | Nitrogen atom |
82140 | +----------+---------------------+
83- | | Si | Silicon atom |
141+ | | N | Nitrogen atom |
142+ +----------------------+----------+---------------------+
143+ | Nonreactive Elements | Si | Silicon atom |
84144 | +----------+---------------------+
85145 | | Cl | Chlorine atom |
86146 | +----------+---------------------+
@@ -89,18 +149,6 @@ The allowed element types, radicals, and bonds are listed in the following table
89149 | | Ar | Argon atom |
90150 | +----------+---------------------+
91151 +----------------------+----------+---------------------+
92- | Free Electrons | 0 | Non-radical |
93- | +----------+---------------------+
94- | | 1 | Mono-radical |
95- | +----------+---------------------+
96- | | 2 | Bi-radical |
97- | +----------+---------------------+
98- | | 2T | Triplet |
99- | +----------+---------------------+
100- | | 2S | Singlet |
101- | +----------+---------------------+
102- | | 3 | Tri-radical |
103- +----------------------+----------+---------------------+
104152 | Chemical Bond | S | Single Bond |
105153 | +----------+---------------------+
106154 | | D | Double Bond |