Create a fingerprint database file

Use the create_db_file() function to create the fingerprint database file.

Caution

FPSim2 only supports integer molecule ids.
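
If your molecules carry non-integer identifiers (for example strings such as "CHEMBL25"), one possible workaround is to assign integer ids yourself and keep a lookup table for translating results back. The sketch below is only an illustration: `to_int_ids` is a hypothetical helper, not part of FPSim2, and it assumes the input is an iterable of (molecule string, original id) pairs.

```python
def to_int_ids(records):
    """Map arbitrary molecule ids to consecutive integers.

    `records` is an iterable of (mol_string, original_id) pairs.
    Returns a list of [mol_string, int_id] pairs and a dict that maps
    each integer id back to the original id.
    """
    id_lookup = {}  # int id -> original id
    converted = []
    for int_id, (mol_string, original_id) in enumerate(records, start=1):
        id_lookup[int_id] = original_id
        converted.append([mol_string, int_id])
    return converted, id_lookup


# Hypothetical input with string identifiers
mols, id_lookup = to_int_ids([("CC", "CHEMBL1"), ("CCC", "CHEMBL2")])
```

The resulting `mols` list has the same shape as the Python list example in this section, so it could be passed as the first argument of create_db_file(), while `id_lookup` recovers the original identifiers afterwards.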

The fingerprints are calculated with RDKit. Several fingerprint types are available; the examples below all use the 'Morgan' type with a radius of 2 and 2048 bits.

From a .sdf file

>>> from FPSim2.io import create_db_file
>>> create_db_file('chembl.sdf', 'chembl.h5', 'Morgan', {'radius': 2, 'nBits': 2048}, mol_id_prop='mol_id')

From a .smi file

>>> from FPSim2.io import create_db_file
>>> create_db_file('chembl.smi', 'chembl.h5', 'Morgan', {'radius': 2, 'nBits': 2048})

From a Python list

>>> from FPSim2.io import create_db_file
>>> create_db_file([['CC', 1], ['CCC', 2], ['CCCC', 3]], 'test/10mols.h5', 'Morgan', {'radius': 2, 'nBits': 2048})

From any other Python iterable, such as a SQLAlchemy result proxy

from FPSim2.io import create_db_file
from sqlalchemy.orm import Session
from sqlalchemy import create_engine, text

engine = create_engine('sqlite:///test/test.db')
s = Session(engine)
# Raw SQL strings must be wrapped in text() in SQLAlchemy 2.0
sql_query = text("select mol_string, mol_id from structure")
# The result is iterated row by row as (mol_string, mol_id)
res_prox = s.execute(sql_query)
create_db_file(res_prox, 'test/10mols.h5', 'Morgan', {'radius': 2, 'nBits': 2048})
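
The same idea extends to other lazily evaluated sources. As a rough sketch, a generator can read molecules from a CSV file and yield one pair at a time, so the whole file never has to be held in memory. The helper `iter_mols` and the column names `smiles` and `mol_id` are assumptions made for this illustration; they are not part of FPSim2.

```python
import csv
import io


def iter_mols(handle, smiles_col="smiles", id_col="mol_id"):
    """Lazily yield [mol_string, int_id] pairs from a CSV file handle."""
    for row in csv.DictReader(handle):
        # Ids are converted to int because only integer ids are supported
        yield [row[smiles_col], int(row[id_col])]


# In-memory stand-in for a real file opened with open(...)
sample = io.StringIO("smiles,mol_id\nCC,1\nCCC,2\nCCCC,3\n")
pairs = list(iter_mols(sample))
```

Since a generator is a Python iterable, `iter_mols(handle)` should be usable in place of `res_prox` in the call above, although this has not been verified against every FPSim2 version.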