Create a fingerprint database file
To create a fingerprint database file for running searches, use either:
- The fpsim2-create-db command line tool
- The FPSim2.io.create_db_file Python function
Both methods are described below.
Warning
FPSim2 only supports integer molecule ids.
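Datasets keyed by string identifiers therefore need a mapping step before the database is built. A minimal sketch, assuming a hypothetical list of (SMILES, string id) records; the helper assign_integer_ids and the ids "mol-a", "mol-b", "mol-c" are illustrative and not part of FPSim2:

```python
def assign_integer_ids(records, start=1):
    """Replace arbitrary molecule ids with consecutive integers.

    Returns the [structure, integer_id] rows together with a lookup
    table for translating integer ids back to the original ids.
    """
    rows = []
    id_lookup = {}
    for int_id, (structure, original_id) in enumerate(records, start=start):
        rows.append([structure, int_id])
        id_lookup[int_id] = original_id
    return rows, id_lookup


# Hypothetical input: SMILES paired with non-integer ids
records = [("CC", "mol-a"), ("CCC", "mol-b"), ("CCCC", "mol-c")]
rows, id_lookup = assign_integer_ids(records)

print(rows)
print(id_lookup)
```

The resulting rows have the same [structure, id] shape as a Python list source, and id_lookup can be kept alongside the database file to recover the original identifiers from search results.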
The fingerprints are calculated with RDKit. Fingerprint types available are:
Using the command line
The command line tool runs in parallel and takes .smi files as input. Example usage:
fpsim2-create-db smiles_file.smi fp_db.h5 --fp_type Morgan --fp_params '{"radius": 2, "fpSize": 256}' --processes 32
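When the command is launched from a script, passing the arguments as a list avoids shell quoting problems with the JSON value of --fp_params. A minimal sketch, assuming fpsim2-create-db is on the PATH; the helpers build_create_db_command and run_create_db are hypothetical, and only the flags shown in the example above are used:

```python
import json
import subprocess


def build_create_db_command(smi_file, out_file, fp_type, fp_params, processes):
    """Build the argument list for the fpsim2-create-db call shown above."""
    return [
        "fpsim2-create-db",
        smi_file,
        out_file,
        "--fp_type", fp_type,
        # json.dumps produces the JSON string the CLI example passes in quotes
        "--fp_params", json.dumps(fp_params),
        "--processes", str(processes),
    ]


def run_create_db(cmd):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(cmd, check=True)


cmd = build_create_db_command(
    "smiles_file.smi", "fp_db.h5", "Morgan", {"radius": 2, "fpSize": 256}, 32
)
print(cmd)
```

Calling run_create_db(cmd) then executes the same command as the example above without going through a shell.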
Using Python
Note: When using the Python library, fingerprint calculation is single-threaded.
# Create the database from an .sdf file
from FPSim2.io import create_db_file
create_db_file(
    mols_source='sdf_file.sdf',
    filename='fp_db.h5',
    mol_format=None, # set to None
    fp_type='Morgan',
    fp_params={'radius': 2, 'fpSize': 256},
    mol_id_prop='mol_id' # SDF property that holds the molecule id
)
# Create the database from a .smi file
from FPSim2.io import create_db_file
create_db_file(
    mols_source='smiles_file.smi',
    filename='fp_db.h5',
    mol_format=None, # set to None
    fp_type='Morgan',
    fp_params={'radius': 2, 'fpSize': 256}
)
# Create the database from a Python list of [SMILES, integer id] pairs
from FPSim2.io import create_db_file
mols = [['CC', 1], ['CCC', 2], ['CCCC', 3]]
create_db_file(
    mols_source=mols,
    filename='fp_db.h5',
    mol_format='smiles', # required
    fp_type='Morgan',
    fp_params={'radius': 2, 'fpSize': 256}
)
Using a SQLAlchemy CursorResult as the molecule source, as an example:
from FPSim2.io import create_db_file
from sqlalchemy import create_engine, text
engine = create_engine('sqlite:///test/test.db')
with engine.connect() as conn:
    sql_query = text("select molfile, mol_id from structure")
    cursor = conn.execute(sql_query)
    # The cursor must be consumed while the connection is still open
    create_db_file(
        mols_source=cursor,
        filename='fp_db.h5',
        mol_format='molfile', # required
        fp_type='Morgan',
        fp_params={'radius': 2, 'fpSize': 256}
    )
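The query above selects the structure first and the molecule id second, the same order as the pairs in the Python list example. A minimal sketch of that row shape using only the standard library sqlite3 module; the in-memory database and the smiles column are assumptions made for illustration (the example above reads a molfile column from test/test.db):

```python
import sqlite3

# Hypothetical in-memory table mirroring the "structure" table queried above,
# with a SMILES column instead of molfiles to keep the sketch short.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE structure (smiles TEXT, mol_id INTEGER)")
conn.executemany(
    "INSERT INTO structure (smiles, mol_id) VALUES (?, ?)",
    [("CC", 1), ("CCC", 2), ("CCCC", 3)],
)

# Select the structure first and the integer id second
rows = conn.execute(
    "SELECT smiles, mol_id FROM structure ORDER BY mol_id"
).fetchall()
conn.close()

# Every id must be an integer before the rows are used as a molecule source
all_ids_are_integers = all(isinstance(mol_id, int) for _, mol_id in rows)
print(rows, all_ids_are_integers)
```

Rows shaped like this would be paired with mol_format='smiles', in the same way the molfile rows above are paired with mol_format='molfile'.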