Create a fingerprint database file
Use the create_db_file() function to create the fingerprint database file.
Caution
FPSim2 only supports integer molecule ids.
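If your source data uses non-integer identifiers, one possible workaround is to assign integer ids yourself and keep a lookup table to map results back. The sketch below is plain Python and does not call FPSim2. The helper name `to_int_ids` and the external identifiers are hypothetical and used only for illustration.

```python
def to_int_ids(records):
    """Assign consecutive integer ids to (smiles, external_id) records.

    Returns a [[smiles, int_id], ...] list plus an int_id -> external_id lookup.
    """
    mols = []
    id_lookup = {}
    for int_id, (smiles, external_id) in enumerate(records, start=1):
        mols.append([smiles, int_id])
        id_lookup[int_id] = external_id
    return mols, id_lookup


# Hypothetical records keyed by string identifiers
records = [("CC", "MOL_A"), ("CCC", "MOL_B"), ("CCCC", "MOL_C")]
mols, id_lookup = to_int_ids(records)
print(mols)          # [['CC', 1], ['CCC', 2], ['CCCC', 3]]
print(id_lookup[2])  # MOL_B
```

The resulting `mols` list has the same shape as the Python list input shown in this section, so it could be passed as the first argument to create_db_file(). Store `id_lookup` alongside the database file so that search results can be translated back to the original identifiers.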
The fingerprints are calculated with RDKit. Fingerprint types available are:
From a .sdf file
>>> from FPSim2.io import create_db_file
>>> create_db_file('chembl.sdf', 'chembl.h5', 'Morgan', {'radius': 2, 'nBits': 2048}, mol_id_prop='mol_id')
From a .smi file
>>> from FPSim2.io import create_db_file
>>> create_db_file('chembl.smi', 'chembl.h5', 'Morgan', {'radius': 2, 'nBits': 2048})
From a Python list
>>> from FPSim2.io import create_db_file
>>> create_db_file([['CC', 1], ['CCC', 2], ['CCCC', 3]], 'test/10mols.h5', 'Morgan', {'radius': 2, 'nBits': 2048})
From any other Python iterable, such as a SQLAlchemy result proxy
from FPSim2.io import create_db_file
from sqlalchemy.orm import Session
from sqlalchemy import create_engine, text

engine = create_engine('sqlite:///test/test.db')
s = Session(engine)
# Each row must be (molecule string, integer molecule id).
# SQLAlchemy 2.0 rejects raw SQL strings, so wrap the query in text().
sql_query = text("select mol_string, mol_id from structure")
res_prox = s.execute(sql_query)
create_db_file(res_prox, 'test/10mols.h5', 'Morgan', {'radius': 2, 'nBits': 2048})
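A generator is another kind of Python iterable that yields the same (molecule string, integer id) pairs. The sketch below reads them from CSV data using only the standard library. The function name `iter_mols` and the column names `smiles` and `mol_id` are assumptions made for this example, and passing a generator to create_db_file() is inferred from the "any other Python iterable" wording rather than confirmed here.

```python
import csv
import io


def iter_mols(csv_file):
    """Yield (smiles, integer id) pairs from CSV rows with 'smiles' and 'mol_id' columns."""
    for row in csv.DictReader(csv_file):
        # Convert the id explicitly, since only integer molecule ids are supported
        yield row["smiles"], int(row["mol_id"])


# In-memory stand-in for a real file opened with open(path, newline="")
sample = io.StringIO("smiles,mol_id\nCC,1\nCCC,2\nCCCC,3\n")
mols = list(iter_mols(sample))
print(mols)  # [('CC', 1), ('CCC', 2), ('CCCC', 3)]
```

With a real file, the generator would be handed over without materializing it as a list, for example `create_db_file(iter_mols(f), 'test/10mols.h5', 'Morgan', {'radius': 2, 'nBits': 2048})`, which keeps memory use low for large inputs.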