add_rdkit_compound_descriptors module
Add RDKit-based compound properties to the dataset.
- add_rdkit_compound_descriptors.add_aromaticity_descriptors(dataset: Dataset)[source]
Add the number of aromatic atoms in each compound, specifically:
total # aromatic atoms (aromatic_atoms)
# aromatic carbon atoms (aromatic_c)
# aromatic nitrogen atoms (aromatic_n)
# aromatic hetero atoms (aromatic_hetero)
- Parameters:
dataset (Dataset) – Dataset with compound-target pairs. Will be updated to only include counts of aromatic atoms.
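The update described above amounts to looking up each row's SMILES in per-SMILES count dictionaries and writing the result into the four named columns. The sketch below illustrates that idea with plain Python data. The `rows` list is a hypothetical stand-in for the table held by `Dataset`, whose internals are not documented here, and `add_count_columns` and the `smiles` key are illustrative names rather than part of this module. The counts for benzene and pyridine are written by hand and assume that nitrogen also counts as a hetero atom.

```python
# Hypothetical stand-in for the table inside Dataset:
# one dict per compound-target pair.
rows = [
    {"smiles": "c1ccccc1", "target": "T1"},  # benzene
    {"smiles": "c1ccncc1", "target": "T2"},  # pyridine
    {"smiles": "c1ccccc1", "target": "T3"},  # benzene again, other target
]

# Per-SMILES counts keyed by the column they will populate.
# Assumption: "hetero" covers every aromatic non-carbon atom, including N.
counts = {
    "aromatic_atoms": {"c1ccccc1": 6, "c1ccncc1": 6},
    "aromatic_c": {"c1ccccc1": 6, "c1ccncc1": 5},
    "aromatic_n": {"c1ccccc1": 0, "c1ccncc1": 1},
    "aromatic_hetero": {"c1ccccc1": 0, "c1ccncc1": 1},
}


def add_count_columns(rows, counts, smiles_key="smiles"):
    """Write one value per count column into every row (in place)."""
    for row in rows:
        for column, by_smiles in counts.items():
            # .get() leaves None where a SMILES has no entry.
            row[column] = by_smiles.get(row[smiles_key])
    return rows


add_count_columns(rows, counts)
```

Because the dictionaries are keyed by SMILES, a compound that appears in several compound-target pairs only needs to be processed once.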
- add_rdkit_compound_descriptors.add_built_in_descriptors(dataset: Dataset)[source]
Add RDKit built-in compound descriptors.
- Parameters:
dataset (Dataset) – Dataset with compound-target pairs. Will be updated to only include built-in RDKit compound descriptors.
df_combined (pd.DataFrame) – Pandas DataFrame with compound-target pairs
- add_rdkit_compound_descriptors.add_rdkit_compound_descriptors(dataset: Dataset)[source]
Add RDKit-based compound descriptors (built-in and numbers of aromatic atoms).
- Parameters:
dataset (Dataset) – Dataset with compound-target pairs. Will be updated to only include built-in RDKit compound descriptors and numbers of aromatic atoms.
- add_rdkit_compound_descriptors.calculate_aromatic_atoms(smiles_set: set[str]) → tuple[dict[str, int], dict[str, int], dict[str, int], dict[str, int]] [source]
Get dictionaries with the number of aromatic atoms for each SMILES string.
- Parameters:
smiles_set (set[str]) – Set of SMILES strings to calculate the number of aromatic atoms for.
- Returns:
Dictionaries with:
SMILES -> # aromatic atoms
SMILES -> # aromatic carbon atoms
SMILES -> # aromatic nitrogen atoms
SMILES -> # aromatic hetero atoms
- Return type:
(dict[str, int], dict[str, int], dict[str, int], dict[str, int])
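As a rough illustration of how four such dictionaries could be produced, here is a minimal sketch. It is not the module's implementation: `count_aromatic` and `aromatic_atom_dicts` are hypothetical names, "hetero" is assumed to mean any aromatic atom that is neither carbon nor hydrogen, and skipping unparsable SMILES is a choice made for this example. The RDKit calls used (`Chem.MolFromSmiles`, `Mol.GetAtoms`, `Atom.GetAtomicNum`, `Atom.GetIsAromatic`) are standard RDKit API.

```python
def count_aromatic(atoms):
    """Count aromatic atoms from (atomic_number, is_aromatic) pairs.

    Returns (total, carbon, nitrogen, hetero). Assumption: "hetero" is any
    aromatic atom that is neither carbon nor hydrogen, so nitrogen counts.
    """
    total = carbon = nitrogen = hetero = 0
    for atomic_num, is_aromatic in atoms:
        if not is_aromatic:
            continue
        total += 1
        if atomic_num == 6:
            carbon += 1
        elif atomic_num != 1:
            hetero += 1
            if atomic_num == 7:
                nitrogen += 1
    return total, carbon, nitrogen, hetero


def aromatic_atom_dicts(smiles_set):
    """Hypothetical stand-in for calculate_aromatic_atoms; requires RDKit."""
    # Imported lazily so count_aromatic stays usable without RDKit.
    from rdkit import Chem

    total_d, c_d, n_d, hetero_d = {}, {}, {}, {}
    for smiles in smiles_set:
        mol = Chem.MolFromSmiles(smiles)
        if mol is None:  # RDKit returns None for unparsable SMILES
            continue
        atoms = ((a.GetAtomicNum(), a.GetIsAromatic()) for a in mol.GetAtoms())
        (total_d[smiles], c_d[smiles],
         n_d[smiles], hetero_d[smiles]) = count_aromatic(atoms)
    return total_d, c_d, n_d, hetero_d
```

With RDKit installed, calling `aromatic_atom_dicts({"c1ccccc1", "c1ccncc1"})` should report six aromatic atoms for both benzene and pyridine, with pyridine contributing five aromatic carbons and one aromatic nitrogen. Keeping the counting separate from the SMILES parsing makes the counting rule easy to check on its own.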