File Parsing Tools
Extract Data from CML
- group_decomposition.utils.all_data_from_cml(data)
Gets the symbols, xyz coordinates, bonds and charge of a molecule from a .cml file
- Parameters:
data – lines of a .cml file
- Returns:
dictionary with relevant data from cml file. Keys included are ‘geom’, ‘atom_types’, ‘bonds’, ‘labels’, ‘charge’, ‘multiplicity’, ‘smiles’
Note
Not used in group_decomposition.fragfunctions.identify_connected_fragments. This is used in an AiiDA workflow employing this package. This is designed to parse files specifically from the Retrievium database https://retrievium.ca
The SMILES extracted is the one labelled with the tag retrievium:inputSMILES.
The geometry extracted is from the third atomArray block in the .cml file.
- Example Usage:
>>> utils.all_data_from_cml(cml_file)
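Since the function takes the lines of a file rather than a filename, a small amount of glue code may be useful around the call. The sketch below is illustrative only: read_cml_lines and missing_keys are hypothetical helpers that are not part of group_decomposition, and whether the function expects lines with or without trailing newlines is not stated here, so stripping them is an assumption.

```python
from pathlib import Path

# Keys the documentation above lists for the returned dictionary.
DOCUMENTED_KEYS = {
    "geom", "atom_types", "bonds", "labels",
    "charge", "multiplicity", "smiles",
}


def read_cml_lines(path):
    """Hypothetical helper: return the lines of a .cml file as a list of strings.

    Trailing newlines are stripped; this is an assumption, not documented behaviour.
    """
    return Path(path).read_text(encoding="utf-8").splitlines()


def missing_keys(result):
    """Hypothetical helper: documented keys that are absent from a result dict."""
    return sorted(DOCUMENTED_KEYS - set(result))


# Intended use (requires the package and a Retrievium .cml file):
#   data = read_cml_lines("molecule.cml")
#   result = utils.all_data_from_cml(data)
#   assert not missing_keys(result)
```

Checking the keys up front gives an early, readable failure if a file from another source yields an incomplete dictionary.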
Extract Data from CML
- group_decomposition.utils.data_from_cml(cml_file, bonds=False)
Gets the symbols, xyz coordinates, bonds and charge of a molecule from a .cml file
- Parameters:
cml_file – .cml filename
- Returns:
list with relevant data from cml file. Elements in order are: molecular geometry, atom types, list of bonds, list of elements, charge
Note
This is designed to parse files specifically from the Retrievium database https://retrievium.ca
The SMILES extracted is the one labelled with the tag retrievium:inputSMILES.
The geometry extracted is from the third atomArray block in the .cml file.
- Example Usage:
>>> utils.data_from_cml(cml_file)
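The note about the third atomArray block can be illustrated with the standard library alone. This is a minimal sketch of the idea, not the package's implementation: SAMPLE_CML is synthetic text rather than a real Retrievium file, and nth_atom_array, element_types and local_name are hypothetical names introduced for the example.

```python
import xml.etree.ElementTree as ET

# Synthetic CML-like text with three atomArray blocks (not a real Retrievium file).
SAMPLE_CML = """<cml xmlns="http://www.xml-cml.org/schema">
  <molecule id="m1"><atomArray><atom id="a1" elementType="H"/></atomArray></molecule>
  <molecule id="m2"><atomArray><atom id="a1" elementType="He"/></atomArray></molecule>
  <molecule id="m3"><atomArray>
    <atom id="a1" elementType="O"/>
    <atom id="a2" elementType="H"/>
    <atom id="a3" elementType="H"/>
  </atomArray></molecule>
</cml>"""


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def nth_atom_array(xml_text, n=3):
    """Return the n-th atomArray element (1-based) in document order."""
    root = ET.fromstring(xml_text)
    blocks = [el for el in root.iter() if local_name(el.tag) == "atomArray"]
    if len(blocks) < n:
        raise ValueError(f"expected at least {n} atomArray blocks, found {len(blocks)}")
    return blocks[n - 1]


def element_types(atom_array):
    """Return the elementType attribute of each atom directly inside an atomArray."""
    return [a.get("elementType") for a in atom_array if local_name(a.tag) == "atom"]


print(element_types(nth_atom_array(SAMPLE_CML)))  # ['O', 'H', 'H']
```

Matching on the local tag name sidesteps the XML namespace, which is convenient when it is not known in advance whether a file declares one.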
Extract Atom Types from CML
Extract SMILES from CML
- group_decomposition.utils.smiles_from_cml(cml_file, smile_tag='retrievium:inputSMILES')
Finds the Retrievium SMILES in a cml file with a given label
- Parameters:
cml_file – cml file name
smile_tag – the label of the SMILES in the cml file. Defaults to the input SMILES
- Returns:
the input SMILES code tagged in the file with the Retrievium label
- Return type:
string
Note
Must be used on .cml files from the Retrievium database https://retrievium.ca
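One way a tag-based lookup like this could work is sketched below. It assumes the tag appears as an attribute value (for example dictRef) on the element enclosing the SMILES text; that layout is a guess about the file structure and is not confirmed here. SAMPLE_CML, the example:otherTag entry and find_tagged_text are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Synthetic text; the dictRef layout is an assumption about how the tag is stored.
SAMPLE_CML = """<cml xmlns="http://www.xml-cml.org/schema">
  <molecule id="m1">
    <property dictRef="example:otherTag"><scalar>OCC</scalar></property>
    <property dictRef="retrievium:inputSMILES"><scalar>CCO</scalar></property>
  </molecule>
</cml>"""


def find_tagged_text(xml_text, tag="retrievium:inputSMILES"):
    """Return the text inside the first element carrying `tag` as an attribute value.

    Returns None when no element carries the tag.
    """
    root = ET.fromstring(xml_text)
    for el in root.iter():
        if tag in el.attrib.values():
            # Join the element's own text with that of any nested children.
            return "".join(el.itertext()).strip()
    return None


print(find_tagged_text(SAMPLE_CML))  # CCO
```

Passing a different tag value mirrors the role of the smile_tag parameter, which selects which labelled SMILES is returned.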
Get XYZ from CML
- group_decomposition.utils.xyz_from_cml(cml_file)
Extract xyz coordinates from cml file
- Parameters:
cml_file – cml file name
- Returns:
list of length-3 lists, one [x, y, z] entry per atom, containing a molecule’s xyz coordinates
Note
Must be used on .cml files from the Retrievium database https://retrievium.ca
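The shape of the return value can be shown with a short standard-library sketch. It relies on the x3, y3 and z3 atom attributes that CML uses for 3D Cartesian coordinates, and it assumes Retrievium files store coordinates that way. The sample text is synthetic, xyz_rows is a hypothetical name, and unlike the package function this sketch reads every atom in the text rather than a particular atomArray block.

```python
import xml.etree.ElementTree as ET

# Synthetic single-molecule CML-like text (approximate water geometry, illustrative only).
SAMPLE_CML = """<molecule xmlns="http://www.xml-cml.org/schema">
  <atomArray>
    <atom id="a1" elementType="O" x3="0.000" y3="0.000" z3="0.117"/>
    <atom id="a2" elementType="H" x3="0.000" y3="0.757" z3="-0.469"/>
    <atom id="a3" elementType="H" x3="0.000" y3="-0.757" z3="-0.469"/>
  </atomArray>
</molecule>"""


def xyz_rows(xml_text):
    """Return [[x, y, z], ...] as floats for every atom that has an x3 attribute."""
    root = ET.fromstring(xml_text)
    rows = []
    for el in root.iter():
        is_atom = el.tag.rsplit("}", 1)[-1] == "atom"
        if is_atom and "x3" in el.attrib:
            rows.append([float(el.get(axis)) for axis in ("x3", "y3", "z3")])
    return rows


for row in xyz_rows(SAMPLE_CML):
    print(row)
```

A list of length-3 lists in this form converts directly to an N×3 array if numerical work is needed afterwards.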