tcutility.job#

Overview#

This module offers you the tools to efficiently and easily build computational workflows with various engines. The module defines useful classes that do the heavy lifting (input and runscript preparation) in the background, while ensuring correctness of the generated inputs.

Job classes#

Jobs are run using subclasses of the Job class. The base Job class handles setting up directories and running the calculations.

The Job subclasses are also context managers, which results in cleaner and less error-prone code:

from tcutility.job import ADFJob

# job classes are also context-managers
# when exiting the context-manager the job will automatically be run
# this ensures you won't forget to start the job
with ADFJob() as job:
    job.molecule('example.xyz')

# you can also not use the context-manager
# in that case, don't forget to run the job
job = ADFJob()
job.molecule('example.xyz')
job.run()
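
The run-on-exit behaviour described above can be pictured with a small, self-contained sketch. SketchJob is a hypothetical toy class and not part of tcutility; in particular, skipping the run when the block raises is a choice made for this sketch, not something the tcutility documentation states.

```python
class SketchJob:
    """Toy stand-in for a job class (hypothetical, not tcutility code)."""

    def __init__(self):
        self.molecule_path = None
        self.has_run = False

    def molecule(self, path):
        # Remember which structure the job should use.
        self.molecule_path = path

    def run(self):
        # A real job would write its inputs and start the calculation here.
        self.has_run = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Assumption of this sketch: only run if the block did not raise.
        if exc_type is None:
            self.run()
        return False  # never swallow exceptions


with SketchJob() as job:
    job.molecule("example.xyz")

print(job.has_run)
```

Because run() is called from __exit__, leaving the with block is enough to start the job, which is the pattern the ADFJob example relies on.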

You can control where a calculation is run by changing the job.name and job.rundir properties.

from tcutility.job import ADFJob

with ADFJob() as job:
    job.molecule('example.xyz')
    job.rundir = './calc_dir/molecule_1'
    job.name = 'ADF_calculation'

print(job.workdir)

This script will run a single point calculation using ADF in the working directory ./calc_dir/molecule_1/ADF_calculation. You can access the full path to the working directory using the job.workdir property.
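
As a rough illustration of how the two properties combine, the working directory named above is simply the run directory with the job name appended. The helper sketch_workdir is hypothetical, and the real job.workdir property may well return an absolute path instead.

```python
import posixpath


def sketch_workdir(rundir, name):
    """Join rundir and name the way the text describes (hypothetical helper)."""
    # posixpath keeps forward slashes on every platform for this illustration.
    return posixpath.join(rundir, name)


print(sketch_workdir("./calc_dir/molecule_1", "ADF_calculation"))
```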

Slurm support#

One useful feature is that the Job class detects whether slurm can be used on the platform the script is running on. If slurm is available, jobs will be submitted using sbatch instead of being run locally. It is possible to set any sbatch option you would like.

from tcutility.job import ADFJob

with ADFJob() as job:
    job.molecule('example.xyz')
    # we can set any sbatch settings using the job.sbatch() method
    # in this case, we set the partition to 'tc' and the number of cores to 32
    job.sbatch(p='tc', n=32)
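
To make the idea concrete, here is a minimal sketch of one way to detect slurm and to turn keyword options into sbatch arguments. It is an assumption about how such a feature could work, not tcutility's actual implementation; slurm_available, sbatch_args and the script name run.sh are hypothetical.

```python
import shutil


def slurm_available():
    """Return True if an sbatch executable is found on the PATH."""
    return shutil.which("sbatch") is not None


def sbatch_args(**options):
    """Turn keyword options into sbatch command-line arguments."""
    args = []
    for key, value in options.items():
        if len(key) == 1:
            # Short option, e.g. p='tc' becomes: -p tc
            args += [f"-{key}", str(value)]
        else:
            # Long option, e.g. mem_per_cpu='2G' becomes: --mem-per-cpu=2G
            args.append(f"--{key.replace('_', '-')}={value}")
    return args


# 'run.sh' is a placeholder name for the generated runscript.
command = ["sbatch", *sbatch_args(p="tc", n=32), "run.sh"]
print(" ".join(command))
print("slurm available:", slurm_available())
```

The first line printed is the command that would be handed to the shell on a slurm machine; on a machine without sbatch the second line reports False, and a job would be run locally instead.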

Job dependencies#

It is possible to set up dependencies between jobs. This allows you to use the results of one calculation as input for a different calculation.

from tcutility.job import ADFJob, CRESTJob

# submit and run a CREST calculation
with CRESTJob() as crest_job:
    crest_job.molecule('input.xyz')
    crest_job.sbatch(p='tc', n=32)

    crest_job.rundir = './calculations/molecule_1'
    crest_job.name = 'CREST'

# get the 10 lowest conformers using the crest_job.get_conformer_xyz() method
for i, conformer_xyz in enumerate(crest_job.get_conformer_xyz(10)):
    # set up the ADF calculation
    with ADFJob() as opt_job:
        # make the ADFJob depend on the CRESTJob
        # slurm will wait for the CRESTJob to finish before starting the ADFJob
        opt_job.dependency(crest_job)
        # the molecule can be an xyz file that does not exist yet
        opt_job.molecule(conformer_xyz)
        opt_job.sbatch(p='tc', n=16)

        opt_job.functional('OLYP-D3(BJ)')
        opt_job.basis_set('TZ2P')
        opt_job.quality('Good')
        opt_job.optimization()

        opt_job.rundir = './calculations/molecule_1'
        opt_job.name = f'conformer_{i}'

This script will first set up and submit a CRESTJob calculation to generate conformers for the structure in input.xyz. It will then submit geometry optimizations for the 10 lowest conformers using ADFJob at the OLYP-D3(BJ)/TZ2P level of theory. Slurm will wait for the CRESTJob calculation to finish before starting the ADFJob calculations.
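
On the slurm side, waiting for another job is usually expressed with sbatch's --dependency option. The sketch below builds such a command; whether tcutility uses the afterok condition specifically is an assumption, and dependent_sbatch_command, the job id and the script name are hypothetical.

```python
def dependent_sbatch_command(script, parent_job_id, **options):
    """Build an sbatch command that waits for parent_job_id to succeed."""
    # afterok: start only after the parent job finished with exit code 0.
    command = ["sbatch", f"--dependency=afterok:{parent_job_id}"]
    for key, value in options.items():
        # This sketch only handles short options such as p and n.
        command += [f"-{key}", str(value)]
    command.append(script)
    return command


# 123456 stands in for the job id slurm assigned to the CREST job.
print(" ".join(dependent_sbatch_command("conformer_0.sh", 123456, p="tc", n=16)))
```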

Rerun prevention#

Before submitting a calculation, tcutility.job will check whether the calculation has already been run or is currently being managed by slurm. This way you do not waste time rerunning calculations when you rerun a script.

For example, we can write a script that performs optimizations using ADFJob on structures stored in a directory:

from tcutility.job import ADFJob
import os


input_xyz_directory = 'molecules'

# get the xyz files we want to optimize
xyz_files = [os.path.join(input_xyz_directory, file) for file in os.listdir(input_xyz_directory) if file.endswith('.xyz')]

for xyz_file in xyz_files:
    with ADFJob() as job:
        job.molecule(xyz_file)
        job.sbatch(p='tc', n=16)

        job.functional('OLYP-D3(BJ)')
        job.basis_set('TZ2P')
        job.quality('Good')
        job.optimization()

        job.rundir = './calculations'
        # name the job after the xyz file, without its extension
        job.name = os.path.split(xyz_file)[1].removesuffix('.xyz')

Every time this script is run, it will loop through the molecules stored in the molecules directory. If you add new molecules to this directory and then rerun it, the script will detect which molecules were previously optimized and skip those. This way you can easily reuse the script multiple times without manually checking for or implementing rerun prevention.
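
The idea behind rerun prevention can be illustrated with a toy stand-in that treats a calculation as done once its working directory contains a marker file. This is not how tcutility decides whether a job has run; the marker file name and the helpers should_run and run_once are hypothetical.

```python
import tempfile
from pathlib import Path

MARKER = "finished.marker"  # hypothetical marker file name


def should_run(workdir):
    """A calculation is considered new if its marker file is missing."""
    return not (Path(workdir) / MARKER).exists()


def run_once(workdir, started):
    """Run the stand-in calculation unless it was already done."""
    workdir = Path(workdir)
    if not should_run(workdir):
        return False
    workdir.mkdir(parents=True, exist_ok=True)
    started.append(workdir.name)  # stand-in for the real calculation
    (workdir / MARKER).touch()
    return True


with tempfile.TemporaryDirectory() as root:
    started = []
    # 'water' appears twice; the second occurrence is skipped.
    for name in ["water", "ammonia", "water"]:
        run_once(Path(root) / name, started)

print(started)
```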

Supported engines#

We currently support the following engines and job classes:

See the API Documentation for an overview of the Job classes offered by the tcutility.job module.

Note

If you want support for new engines/classes, please open an issue on our GitHub page, or let one of the developers know!

Requirements#

To run calculations related to the Amsterdam Modelling Suite (AMS) you will require a license.

For ORCA calculations you will need to add the ORCA executable to your PATH.
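
A quick way to check the PATH requirement from Python is shutil.which, which returns the full path of an executable or None when it is not found. The sketch below assumes the executable is called orca; note that a hit only tells you that something named orca is on the PATH, which on some Linux desktops may be an unrelated program.

```python
import shutil


def find_executable(name):
    """Return the full path of an executable on the PATH, or None."""
    return shutil.which(name)


orca_path = find_executable("orca")
if orca_path is None:
    print("orca not found on PATH")
else:
    print(f"orca found at {orca_path}")
```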

Examples#

A few typical use cases are given below; the examples page has a full overview of all examples. Of course, the scripts shown above are also valid example uses of tcutility.job!

Geometry optimization using ADF#

It is quite easy to set up calculations using the tcutility.job package. For example, if we want to run a simple geometry optimization using ADF we can use the ADFJob class.

In this case we are optimizing the water dimer at the BP86-D3(BJ)/TZ2P level. To handle the ADF settings you can refer to the GUI. For example, to use a specific functional simply enter the name of the functional as it appears in the ADF GUI. The same applies to pretty much all settings. The ADFJob class will handle everything in the background for you.

The job will be run in the ./calculations/GO_water_dimer directory. The tcutility.job package will handle running of the calculation as well. It will detect if your platform supports slurm and if it does, will use sbatch to run your calculations. Otherwise, it will simply run the calculation locally.

import pathlib as pl

from scm.plams import AMSJob, Molecule, Settings, config, finish, init
from tcutility.job import ADFJob

current_file_path = pl.Path(__file__).parent
mol_path = current_file_path / "water_dimer.xyz"


def try_plams_job(mol: Molecule) -> None:
    # Test case with plams for checking if plams works solely on Windows
    run_set = Settings()
    run_set.input.ams.Task = "GeometryOptimization"
    run_set.input.adf.Basis.Type = "DZP"
    run_set.input.adf.XC.GGA = "BP86"

    config.log.file = 7
    config.log.stdout = 7

    init(path=str(current_file_path), folder="GO_water_dimer", config_settings=config)
    AMSJob(molecule=mol, name="water_dimer", settings=run_set).run()
    finish()


def try_tcutility_job(mol: Molecule) -> None:
    # Test case with tcutility for checking if tcutility works solely on Windows
    with ADFJob(use_slurm=False) as job:
        job.molecule(mol)
        job.rundir = str(current_file_path / "calculations")
        job.name = "GO_water_dimer"
        job.functional("BP86-D3(BJ)")
        job.basis_set("TZ2P")
        job.quality("Good")
        job.optimization()


def main():
    current_file_path = pl.Path(__file__).parent
    mol_path = current_file_path / "water_dimer.xyz"

    mol = Molecule(str(mol_path))

    # Use these functions to test if a plams and tcutility job can be run on Windows, Mac, and Linux. Both do not use slurm.
    try_plams_job(mol)
    try_tcutility_job(mol)


if __name__ == "__main__":
    main()

Fragment calculation using ADF#

Another common usage of ADF is running a fragment calculation. This calculation requires setting up three different ADF jobs. Using the tcutility.job package allows you to set up and run these kinds of calculations in as little as 8 lines of code.

In this case we make use of a special xyz file format (see tcutility.molecule.guess_fragments()) which specifies the fragments. This saves us some work in setting up the calculations.

from tcutility.job import ADFFragmentJob
from tcutility import molecule

# load a molecule
mol = molecule.load('NH3BH3.xyz')

# define a new job using the Job context-manager
with ADFFragmentJob() as job:
    # add the molecule
    job.molecule(mol)
    # add the fragments. The fragment atoms are defined in the input xyz file
    for fragment_name, fragment in molecule.guess_fragments(mol).items():
        job.add_fragment(fragment, fragment_name)