tcutility package#

Subpackages#

Submodules#

tcutility.cache module#

timed_cache(delay)[source]#

Decorator that creates a timed cache for the function or method. This cache will expire after a chosen amount of time.

Parameters:: delay (float) – the expiry time for the function cache.
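A minimal sketch of how a timed cache of this kind can work, assuming results are keyed on the positional arguments and expire delay seconds after they were stored. The names timed_cache_sketch and clock are hypothetical and are not part of tcutility; the actual implementation may differ.

```python
import functools
import time


def timed_cache_sketch(delay, clock=time.monotonic):
    """Cache results of the decorated function for `delay` seconds (illustrative only)."""

    def decorator(func):
        store = {}  # maps call arguments -> (time stored, result)

        @functools.wraps(func)
        def wrapper(*args):
            now = clock()
            if args in store:
                stored_at, result = store[args]
                if now - stored_at < delay:
                    return result  # entry is still fresh
            result = func(*args)
            store[args] = (now, result)
            return result

        return wrapper

    return decorator


calls = []


@timed_cache_sketch(delay=60.0)
def square(x):
    calls.append(x)  # record every real evaluation
    return x * x


first = square(4)
second = square(4)  # served from the cache, `square` is not re-run
```

The second call returns the stored value because it happens well within the 60 second window.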

cache(func)[source]#: Function decorator that stores results from previous calls to the function or method.

cache_file(file)[source]#

Function decorator that stores results of a function to a file. Because results are written to a file the values persist between Python sessions. This is useful, for example, for online API calls.

Parameters:: file – the filepath to store function call results to. Files will be stored in the platform dependent temporary file directory.

See also

platformdirs.user_cache_dir for information on the temporary directory.

tcutility.cite module#

cite(doi, style='wiley', mode='html')[source]#

Format an article in a certain style.

Parameters:

doi (str) – the article DOI to generate a citation for.
style (str) – the style formatting to use. Can be ['wiley', 'acs', 'rsc'].
mode – the formatting mode. Can be ['html', 'latex', 'plain'].

Return type:

str

get_pages(data)[source]#

is_accepted(data)[source]#

tcutility.connect module#

class Connection(server=None, username=None, key_filename=None)[source]#

Bases: object

Main class used for creating and using SSH sessions to remote servers. It gives you the option to execute any shell code, but also provides useful default commands (for example, cd and ls). The Connection class also allows you to download and upload files between your local machine and the remote.

Parameters:

server (str) – the address of the server you want to connect to. You can prepend the server address with your username, separated from the address by a @ character. For example: Connection('username@server.address.nl') is the same as Connection('server.address.nl', 'username').
username (str) – the username used to log in to the remote server.
key_filename (str) – if you cannot log in using only the ssh command you can try to give the filename of the private key that matches a public key on the server.

Usage:

This class is a context manager and the with-syntax should be used to open and automatically close connections. For example, to open a connection to the Snellius supercomputer we use the following code:

from tcutility.connect import Connection

with Connection('username@server.address.nl') as server:
    print(server.pwd())  # this will print the home-directory of the logged-in user
    server.cd('example/path/to/some/data')
    print(server.pwd())  # ~/example/path/to/some/data

Warning

Currently we only support logging in using SSH keys. Make sure that you can log in to the remote with SSH keys. There might be server specific instructions on how to enable this authentication method.

open()[source]#

close()[source]#

full_path(path)[source]#

Return type:: str

execute(command)[source]#

Run a command on the server and return the output.

Parameters:: command (str) – the command to run on the server.
Return type:: str
Returns:: Data written in stdout after the command was run.
Raises:: RuntimeError – with the error message if anything was printed to stderr.

Note

The __call__ method redirects to this method. This means you can directly call the Connection object with your command.

ls(path='')[source]#

Run the ls program and return useful information about the paths.

Parameters:

path – the path to run ls on.

Return type:

Result

Returns:

Result object containing information from the output of the ls program.

The keys are the path names and the values contain the information.

owner (str) - the username of the owner of the path.
date (datetime.datetime) - datetime object holding the date the file was created.
is_dir (bool) - whether the path is a directory.
is_hidden (bool) - whether the path is hidden.
permissions (str) - the permissions given to the path.

cd(path='~')[source]#

Run the cd command.

Parameters:: path – the path to change directories to. This is relative to the current directory.

Note

Due to limitations with some servers (e.g. Snellius) we do not actually run the cd command, but update the internal Connection.currdir attribute. Before running any command we prepend with cd {self.currdir}; .... In this way we run commands from the correct directory.
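The directory bookkeeping described in this note can be sketched as follows. This is an illustration only: the class CurrdirSketch and its build_command method are hypothetical, nothing is sent to a server, and the way relative paths are joined is an assumption.

```python
class CurrdirSketch:
    """Tracks a working directory locally instead of running `cd` remotely."""

    def __init__(self, home="~"):
        self.currdir = home

    def cd(self, path="~"):
        # absolute and home-relative paths replace the current directory,
        # anything else is appended to it
        if path.startswith(("/", "~")):
            self.currdir = path
        else:
            self.currdir = f"{self.currdir.rstrip('/')}/{path}"

    def pwd(self):
        # no command is run, the stored directory is returned directly
        return self.currdir

    def build_command(self, command):
        # the string that would be sent to the server
        return f"cd {self.currdir}; {command}"


conn = CurrdirSketch()
conn.cd("example/path")
full_command = conn.build_command("ls -la")
```

Here full_command holds the prefixed command string, so every command starts from the tracked directory even though no cd was ever executed on its own.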

pwd()[source]#

Run the pwd command.

Note

Due to limitations with some servers (e.g. Snellius) we do not actually run the pwd command, instead we return the internal Connection.currdir attribute. See the Connection.cd() method for more details.

Return type:: str

mkdir(dirname)[source]#

Run the mkdir command.

Parameters:: dirname – the name of the directory to make. This is relative to the current directory.

rm(file_path)[source]#

rmtree(dirname)[source]#

download(server_path, local_path)[source]#

Download a file from the server and store it on your local machine.

Parameters:

server_path (str) – the path on the server to the file to download. The path is relative to the current directory.
local_path (str) – the path on the local machine where the file is stored.

upload(local_path, server_path=None)[source]#

Upload a file from your local machine to the server. If the server_path is not given, store it in the current directory.

Parameters:

local_path (str) – the path on the local machine where the file to be uploaded is stored.
server_path (str) – the path to upload the file to. If not given or set to None we upload the file to the current directory with the same filename.

path_exists(path)[source]#

open_file(file_path)[source]#

chmod(rights, file_path)[source]#

class ServerFile(file_path, server)[source]#: Bases: object

class Server(username=None)[source]#

Bases: Connection

Helper subclass of Connection that is used to quickly connect to a specific server. The constructor takes only the username, as the server URL is already set. You can also specify default settings for sbatch calls, for example the partition or time-limits.

server = None#

sbatch_defaults = {}#

preamble_defaults = {}#

postamble_defaults = {}#

program_modules = {}#

class Local[source]#

Bases: object

server = None#

sbatch_defaults = {}#

preamble_defaults = {}#

postamble_defaults = {}#

program_modules = {}#

execute(command)[source]#

Execute a command on the local machine and return the output.

Parameters:: command (str) – the command to run.
Return type:: str

Note

We use subprocess.check_output with the shell=True argument enabled.
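A short sketch of what running a command through subprocess.check_output with shell=True looks like. The function name execute_local is hypothetical, and decoding the output with text=True is an assumption about how the result is turned into a str.

```python
import subprocess


def execute_local(command):
    """Run `command` through the system shell and return its stdout as text."""
    return subprocess.check_output(command, shell=True, text=True)


output = execute_local("echo hello")
```

Because the command is interpreted by the shell, it should only be built from trusted input.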

mkdir(dirname)[source]#

rm(file_path)[source]#

rmtree(dirname)[source]#

download(server_path, local_path)[source]#

upload(local_path, server_path=None)[source]#

path_exists(path)[source]#

Return type:: bool

open_file(file_path)[source]#

chmod(rights, path)[source]#

pwd()[source]#

ls(dirname)[source]#

class Bazis(username=None)[source]#

Bases: Server

Default set-up for a connection to the Bazis cluster. By default we use the tc partition.

server = 'bazis.labs.vu.nl'#

sbatch_defaults = {'N': 1, 'mem': 250000, 'n_tasks_per_node': 16, 'p': 'tc'}#

preamble_defaults = {'AMS': ['export SCM_TMPDIR="/scratch/$SLURM_JOBID"', 'srun mkdir -p $SCM_TMPDIR', 'chmod 700 $SCM_TMPDIR']}#

program_modules = {'AMS': {'2021': 'module load shared ams/2021.102', '2022': 'module load shared ams/2022.103', '2023': 'module load shared ams/2023.101', '2024': 'module load shared ams/2024.102', 'latest': 'module load shared ams/2024.102'}}#

postamble_defaults = {'AMS': ['srun rm -rf $SCM_TMPDIR']}#

class Snellius(username=None)[source]#

Bases: Server

Default set-up for a connection to the Snellius cluster. By default we use the rome partition and a time-limit set to 120:00:00.

server = 'snellius.surf.nl'#

sbatch_defaults = {'N': 1, 'n_tasks_per_node': 16, 'p': 'rome', 't': '120:00:00'}#

program_modules = {'AMS': {'2023': 'module load 2023 AMS/2023.104-intelmpi', '2024': 'module load 2024 AMS/2024.104-intelmpi-aocl', 'latest': 'module load 2024 AMS/2024.104-intelmpi-aocl'}}#

get_current_server()[source]#

Return the Server-subclass of the server location of the current shell. If the server location could not be detected returns Local.

Return type:: Server

tcutility.constants module#

tcutility.environment module#

class OSName(*values)[source]#

Bases: Enum

An enumeration of the different operating systems.

WINDOWS = 1#

LINUX = 2#

MACOS = 3#

get_os_name(server=<tcutility.connect.Local object>)[source]#

Get the name of the operating system. Returns a value from the OSName enumeration.

Return type:: OSName

tcutility.errors module#

Module containing errors to distinguish between tcutility-specific errors and general python errors from other packages / scripts.

exception TCError[source]#

Bases: Exception

Base class for all errors in the tcutility package.

exception TCJobError(job_class, message)[source]#

Bases: TCError

An error that occurs when a job fails to run properly.

exception TCMoleculeError[source]#

Bases: TCError

An error that occurs when a molecule is not in a valid state.

exception TCCompDetailsError(section, message)[source]#

Bases: TCError

An error that occurs when the computation details are not in a valid state. It expects a section such as a “Functional” or “Basis set” and a message.

tcutility.formula module#

parse_molecule(molecule)[source]#

Analyse a molecule and return the molstring describing its parts. Each part will then be separated by a + sign in the new string.

Parameters:: molecule (Molecule) – plams.Molecule object to be parsed.
Return type:: str
Returns:: A string that contains each part of the molecule separated by a + sign, for use in molecule() function for further formatting.

molecule(molecule, mode='unicode')[source]#

Parse and return a string containing a molecular formula that will show up properly in LaTeX, HTML or unicode.

Parameters:

molecule (Union[str, Molecule]) – plams.Molecule object or a string that contains the molecular formula to be parsed. It can be either single molecule or a reaction. Molecules should be separated by + or ->.
mode (str) – the formatter to convert the string to. Should be unicode, html, latex, pyplot.

Return type:

str

Returns:

A string that is formatted to be rendered nicely in either HTML or LaTeX. In the returned strings any numbers will be subscripted and +, -, * and • will be superscripted. For latex and pyplot modes we apply \mathrm to letters.

Examples

>>> molecule('C9H18NO*')
'C₉H₁₈NO•'

>>> molecule('C2H2 + CH3* -> C2H2CH3', mode='html')
'C<sub>2</sub>H<sub>2</sub> + CH<sub>3</sub><sup>•</sup> -> C<sub>2</sub>H<sub>2</sub>CH3'

See also

The parse_molecule() function is used to convert plams.Molecule objects to a molecular formula.
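The unicode formatting in the first example can be approximated with a translation table. This is a simplified sketch, not the function's actual implementation: the helper to_unicode_formula is hypothetical and only handles digits and the radical marker *, not charges or reaction arrows.

```python
# digits become subscripts; '*' is shown as a radical dot
_SUBSCRIPTS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")


def to_unicode_formula(formula):
    """Subscript all digits and replace '*' by '•' (charges are not handled)."""
    return formula.translate(_SUBSCRIPTS).replace("*", "•")


formatted = to_unicode_formula("C9H18NO*")
```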

tcutility.geometry module#

class Transform[source]#

Bases: object

Transformation matrix that handles rotation, translation and scaling of sets of 3D coordinates.

Build and return a transformation matrix. This 4x4 matrix encodes rotations, translations and scaling.

$\textbf{M} = \begin{bmatrix} \textbf{R}\text{diag}(S) & \textbf{T} \\ \textbf{0}_3 & 1 \end{bmatrix}$

where $\textbf{R} \in \mathbb{R}^{3 \times 3}$, $\textbf{T} \in \mathbb{R}^{3 \times 1}$ and $\textbf{0}_3 = [0, 0, 0] \in \mathbb{R}^{1 \times 3}$.

When applied to coordinates $[\textbf{x}, \textbf{y}, \textbf{z}, \textbf{1}]^T \in \mathbb{R}^{n \times 4}$ it will apply these transformations simultaneously.

apply(v)[source]#

Applies the transformation matrix to vector(s) $v \in \mathbb{R}^{N \times 3}$.

Application is a three-step process:

Append row vector of ones to the bottom of $v$
Apply the transformation matrix: $\textbf{M}v$
Remove the bottom row vector of ones and return the result

Return type:: ndarray
Returns:: A new array $v' = \textbf{M}v$ that has been transformed using this transformation matrix.

Note

The Transform.__call__() method redirects to this method. Calling transform.apply(coords) is the same as transform(coords).
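The three-step application can be sketched with plain Python lists. The function apply_transform and the example matrix are hypothetical and only illustrate the homogeneous-coordinate idea; the class itself presumably works on NumPy arrays.

```python
def apply_transform(M, points):
    """Apply a 4x4 homogeneous matrix M to a list of (x, y, z) points."""
    transformed = []
    for x, y, z in points:
        homogeneous = (x, y, z, 1.0)  # step 1: append a one
        # step 2: multiply by M
        product = [sum(M[i][j] * homogeneous[j] for j in range(4)) for i in range(4)]
        transformed.append(tuple(product[:3]))  # step 3: drop the trailing one
    return transformed


# scale z by 2 (upper-left 3x3 block) and translate by (1, 0, 0) (last column)
M = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
moved = apply_transform(M, [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)])
```

Scaling and translation are applied in a single matrix product, which is the reason for appending the extra coordinate.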

combine_transforms(other)[source]#

Combine two different transform objects. This involves creating a new Transform object and multiplying the two transform matrices and assigning it to the new object.

Parameters:: other (Transform) – the transformation matrix object to combine this one with.
Return type:: Transform
Returns:: A new transformation matrix that is a product of the original (left side) and other (right side) matrices.

Note

The Transform.__matmul__() method redirects to this method. Calling new = this.combine_transforms(other) is the same as new = this @ other.

translate(T=None, x=None, y=None, z=None)[source]#

Add a translation component to the transformation matrix. Arguments can be given as a container of x, y, z values, or the x, y and z components can be specified separately.

Example usage:: Transform.translate([2, 3, 0])

Transform.translate(x=2, y=3)

rotate(R=None, x=None, y=None, z=None)[source]#

Add a rotational component to the transformation matrix. Arguments can be given as a rotation matrix $\textbf{R} \in \mathbb{R}^{3 \times 3}$ or by specifying the angles to rotate around the x, y or z axes.

Example usage:: Transform.rotate(get_rotmat(x=1, y=-1))

Transform.rotate(x=1, y=-1)

See also

get_rotmat() rotate()

scale(S=None, x=None, y=None, z=None)[source]#

Add a scaling component to the transformation matrix. Arguments can be given as a container of x, y, z values, or the x, y and z components can be specified separately.

Example usage:: Transform.scale([0, 0, 3])

Transform.scale(z=3)

reflect(normal=None)[source]#

Add a reflection across a plane given by a normal vector to the transformation matrix. The reflection is given as

$R = \mathbb{I} - 2\frac{nn^T}{n^Tn} \in \mathbb{R}^{3 \times 3}$

where $n$ is the normal vector of the plane to reflect across.

Parameters:: normal (ndarray) – the normal vector of the plane to reflect across. If not given or None, it will be set to one unit along the x-axis, i.e. a reflection along the yz-plane.
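The reflection formula can be checked with a small pure-Python sketch. The helpers reflection_matrix and matvec are hypothetical names used only for this illustration.

```python
def reflection_matrix(normal=(1.0, 0.0, 0.0)):
    """Return R = I - 2 n n^T / (n^T n) as a nested list."""
    norm_sq = sum(c * c for c in normal)
    return [
        [(1.0 if i == j else 0.0) - 2.0 * normal[i] * normal[j] / norm_sq for j in range(3)]
        for i in range(3)
    ]


def matvec(R, v):
    """Multiply a 3x3 matrix with a vector of length 3."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]


R = reflection_matrix()  # default normal along x: mirror in the yz-plane
mirrored = matvec(R, [1.0, 2.0, 3.0])
```

With the default normal only the x component changes sign, as expected for a mirror in the yz-plane.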

References

https://en.wikipedia.org/wiki/Reflection_(mathematics)

get_rotmat()[source]#

get_translation()[source]#

to_vtkTransform()[source]#

class KabschTransform(X, Y)[source]#

Bases: Transform

Use Kabsch-Umeyama algorithm to calculate the optimal transformation matrix $T_{Kabsch}$ that minimizes the RMSD between two sets of coordinates $X \in \mathbb{R}^{N \times 3}$ and $Y \in \mathbb{R}^{N \times 3}$, such that

$\text{arg}\min_{T_{Kabsch}} \text{RMSD}(T_{Kabsch}(X), Y)$

It is numerically stable and works when the covariance matrix is singular. Both sets of points must be the same size for this algorithm to work. The coordinates are first centered onto their centroids before determining the optimal rotation matrix.

Parameters:

X (ndarray) – array containing the first set of coordinates. The Kabsch transformation matrix will be made such that applying it to X will yield Y.
Y (ndarray) – array containing the second set of coordinates. These coordinates are the target to transform to.

Warning

In principle, the Kabsch-Umeyama algorithm does not care about the dimensions of the coordinates, however we will always assume 3D coordinates as that is our most common use-case. Further, the Transform class also assumes 3D coordinates. If you would like to make use of 2D or 1D Transforms we suggest you simply set the correct axes to zero.

See also

Transform: The main transformation class.

Example

from tcutility import geometry
import numpy as np

# create two arrays that are the same
X, Y = np.arange(5 * 3).reshape(5, 3), np.arange(5 * 3).reshape(5, 3)

# create a transformation matrix to change X
Tx = geometry.Transform()
Tx.rotate(x=1, y=1, z=1)
Tx.translate(x=1, y=1, z=1)

X = Tx(X)

# get the Kabsch transformation matrix
Tkabsch = geometry.KabschTransform(X, Y)

# check if applying the transformation matrix to X yields Y
assert np.isclose(Tkabsch(X), Y).all()

References

https://en.wikipedia.org/wiki/Orthogonal_Procrustes_problem

https://en.wikipedia.org/wiki/Kabsch_algorithm

class MolTransform(mol)[source]#

Bases: Transform

A subclass of Transform that is designed to generate transformation for a molecule. It adds, among others, methods for aligning atoms to specific vectors, planes, or setting the centroid of the molecule. The nice thing is that the class applies the transformations based only on the atom indices given by the user.

Parameters:: mol (Molecule) – the molecule that is used for the alignment.

Note

Indexing starts at 1 instead of 0.

center(*indices)[source]#

Center the molecule on given indices or by its centroid.

Parameters:: indices – the indices that are used to center the molecule. If not given the centering will be done based on all atoms.

align_to_vector(index1, index2, vector=None)[source]#

Align the molecule such that a bond lies on a given vector.

Parameters:

index1 (int) – index of the first atom.
index2 (int) – index of the second atom.
vector (Sequence[float]) – the vector to align the atoms to. If not given or None it defaults to (1, 0, 0).

align_to_plane(index1, index2, index3, vector=None)[source]#

Align a molecule such that the normal of the plane defined by three atoms is aligned to a given vector.

Parameters:

index1 (int) – index of the first atom.
index2 (int) – index of the second atom.
index3 (int) – index of the third atom.
vector (Sequence[float]) – the vector to align the atoms to. If not given or None it defaults to (0, 1, 0).

get_rotmat(x=None, y=None, z=None)[source]#

Create a rotation matrix based on the Tait-Bryan system. In this system, x, y, and z are angles of rotation around the corresponding axes. This function uses the right-handed convention.

Parameters:

x (float) – Rotation around the x-axis in radians.
y (float) – Rotation around the y-axis in radians.
z (float) – Rotation around the z-axis in radians.

Return type:

ndarray

Returns:

the rotation matrix $\textbf{R} \in \mathbb{R}^{3 \times 3}$ with the specified axis rotations.

See also

apply_rotmat(): For applying the rotation matrix to coordinates.
rotate(): For rotating coordinates directly, given Tait-Bryan angles.
Transform.rotate(): The Transform class allows you to also rotate.
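As an illustration of the right-handed convention, the sketch below builds the rotation matrix for a single rotation around the z-axis. The helpers rotmat_z and rotate_point are hypothetical, and the order in which the library combines rotations around several axes is not covered here.

```python
import math


def rotmat_z(angle):
    """Right-handed rotation matrix for a rotation of `angle` radians around z."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        [c, -s, 0.0],
        [s, c, 0.0],
        [0.0, 0.0, 1.0],
    ]


def rotate_point(R, v):
    """Multiply a 3x3 matrix with a vector of length 3."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]


# a quarter turn around z carries the x-axis onto the y-axis
rotated = rotate_point(rotmat_z(math.pi / 2), [1.0, 0.0, 0.0])
```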

rotmat_to_angles(R)[source]#

Return type:: Tuple[float]

apply_rotmat(coords, R)[source]#

Apply a rotation matrix to a set of coordinates.

Parameters:

coords (ndarray) – the coordinates $\in \mathbb{R}^{n \times 3}$ to rotate.
R (ndarray) – the rotation matrix to apply.

Returns:

New coordinates $\in \mathbb{R}^{n \times 3}$ rotated using the given rotation matrix.

See also

get_rotmat(): For creating a rotation matrix.
rotate(): For rotating coordinates directly, given Tait-Bryan angles.

rotate(coords, x=None, y=None, z=None)[source]#

Build and apply a rotation matrix to a set of coordinates.

Parameters:

coords (ndarray) – the coordinates $\in \mathbb{R}^{n \times 3}$ to rotate.
x (float) – Rotation around the x-axis in radians.
y (float) – Rotation around the y-axis in radians.
z (float) – Rotation around the z-axis in radians.

Return type:

ndarray

See also

get_rotmat(): For creating a rotation matrix.

vector_align_rotmat(a, b)[source]#

Calculate a rotation matrix that aligns vector a onto vector b.

Parameters:

a (ndarray) – vector that is to be aligned.
b (ndarray) – vector that is the target of the alignment.

Return type:

ndarray

Returns:

Rotation matrix R, such that geometry.apply_rotmat(a, R) == b.

RMSD(X, Y, axis=None, use_kabsch=True, include_mirror=False)[source]#

Calculate Root Mean Squared Deviations between two sets of points X and Y. By default Kabsch’ algorithm is used to align the sets of points prior to calculating the RMSD. Optionally the axis can be given to calculate the RMSD along different axes.

RMSD is given as

$\text{RMSD}(X, Y) = \sqrt{\frac{1}{N}\sum_i^N (X_i - Y_i)^2}$

When using the Kabsch algorithm to align the two sets of coordinates we first obtain the KabschTransform $T_{Kabsch}$ and then

$\text{RMSD}(X, Y) = \sqrt{\frac{1}{N}\sum_i^N (T_{Kabsch}(X_i) - Y_i)^2}$

Parameters:

X (ndarray) – the first set of coordinates to compare. It must have the same dimensions as Y.
Y (ndarray) – the second set of coordinates to compare. It must have the same dimensions as X.
axis (Optional[int]) – axis to compare. Defaults to None.
use_kabsch (bool) – whether to use Kabsch’ algorithm to align X and Y before calculating the RMSD. Defaults to True.
include_mirror (bool) – return the lowest value between the RMSD of the supplied coordinates and also the RMSD of mirrored X with Y. This will only be done if use_kabsch == True.

Return type:

float

Returns:

RMSD in the units of X and Y. If axis is set to an integer this function will return a vector of RMSDs along that axis.

Note

It is generally recommended to enable the use of the Kabsch-Umeyama algorithm prior to calculating the RMSD. This will ensure you get the lowest possible RMSD for your sets of coordinates.

See also

KabschTransform
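A minimal sketch of the unaligned case (comparable to use_kabsch=False), assuming the conventional definition of the RMSD as the square root of the mean squared distance between paired points. The function rmsd_no_alignment is a hypothetical name and uses plain Python sequences instead of arrays.

```python
import math


def rmsd_no_alignment(X, Y):
    """Root of the mean squared distance between paired points in X and Y."""
    if len(X) != len(Y):
        raise ValueError("X and Y must contain the same number of points")
    squared = [sum((a - b) ** 2 for a, b in zip(p, q)) for p, q in zip(X, Y)]
    return math.sqrt(sum(squared) / len(squared))


X = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
Y = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]  # X shifted by 2 along z
value = rmsd_no_alignment(X, Y)
```

Without alignment the shift of 2 along z is reported in full. An alignment step that first centers both sets would remove such a rigid shift before the deviations are measured.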

random_points_on_sphere(shape, radius=1)[source]#

Generate random points on a sphere with a specified radius.

Parameters:

shape (Tuple[int]) – The shape of the resulting points, generally shape[0] coordinates with shape[1] dimensions
radius (float) – The radius of the sphere to generate the points on.

Return type:

ndarray

Returns:

Array of coordinates on a sphere.
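One common way to obtain uniformly distributed points on a sphere is to normalize vectors drawn from a normal distribution. The sketch below shows that technique with the standard library; it is not necessarily the method used by this function, and points_on_sphere with its seed parameter is a hypothetical helper.

```python
import math
import random


def points_on_sphere(n_points, radius=1.0, seed=None):
    """Draw n_points uniformly distributed on a sphere of the given radius."""
    rng = random.Random(seed)
    points = []
    while len(points) < n_points:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        length = math.sqrt(sum(c * c for c in v))
        if length == 0.0:
            continue  # extremely unlikely, but avoids dividing by zero
        points.append([radius * c / length for c in v])
    return points


samples = points_on_sphere(100, radius=2.5, seed=0)
```

Every sample lies at distance radius from the origin, because each Gaussian vector is rescaled to that length.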

random_points_in_anular_sphere(shape, min_radius=0, max_radius=1)[source]#

Generate random points in a sphere or annular sphere with specified radii. An annular sphere is a hollow sphere of a certain thickness.

Parameters:

shape (Tuple[int]) – The shape of the resulting points, generally shape[0] coordinates with shape[1] dimensions
min_radius (float) – The lowest radius of the sphere to generate the points in.
max_radius (float) – The largest radius of the sphere to generate the points in.

Returns:

Array of coordinates in a sphere or annular sphere.

random_points_on_spheroid(coordinates, Nsamples=1, margin=0)[source]#

Generate random points on a spheroid generated by a set of coordinates.

Parameters:

coordinates (ndarray) – The (n x dim) set of coordinates that is used to generate the minimum-volume spheroid.
Nsamples (int) – The number of samples to return.
margin (float) – the spacing between the sampling spheroid and the minimum-volume spheroid.

Returns:

Array of coordinates on a spheroid.

parameter(coordinates, *indices, pyramidal=False, sum_of_angles=False)[source]#

Return geometry information about a set of coordinates given 1 to 4 indices. If 1 index is given we return the coordinate at that index. If 2 indices are given we return the distance between the coordinates at the indices. If 3 indices are given we return the angle between the vector from index 1 to 2 and the vector from index 2 to 3. If 4 indices are given we return the dihedral angle or the pyramidalization angle (if pyramidal is set to True) or the sum-of-angles (if sum_of_angles is set to True).

Parameters:

coordinates (ArrayLike) – set of coordinates to calculate parameters for.
indices (Sequence[int]) – 1 to 4 integers specifying the indices to use in the coordinates set.
pyramidal (bool) – if 4 indices are given return the pyramidalization angle in degrees.
sum_of_angles (bool) – if 4 indices are given return the sum of the angles between the first index and the rest in degrees.

tcutility.log module#

class Emojis[source]#

Bases: object

Class containing some useful emojis and other characters. Supports dot-notation and indexing to get a character.

E.g. Emojis.wait == Emojis['wait']
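One way to make a class support both access styles is the __class_getitem__ hook, sketched below. The class EmojisSketch is hypothetical and how the real class implements indexing is an assumption; a metaclass with __getitem__ would work as well.

```python
class EmojisSketch:
    """Characters available through attribute access and through indexing."""

    wait = "🕒"
    good = "✅"
    fail = "❌"

    def __class_getitem__(cls, name):
        # lets `EmojisSketch['wait']` look up the class attribute `wait`
        return getattr(cls, name)


same = EmojisSketch.wait == EmojisSketch["wait"]
```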

wait = '🕒'#

good = '✅'#

cancel = '🛑'#

sleep = '💤'#

fail = '❌'#

send = '📤'#

receive = '📥'#

empty = '⠀⠀'#

finish = '🏁'#

warning = '⚠️'#

question = '❔'#

info = 'ℹ️'#

rarrow = '─>'#

larrow = '<─'#

lrarrow = '<─>'#

rlarrow = '<─>'#

angstrom = 'Å'#

class NoPrint(stdout=None, stderr=None)[source]#

Bases: object

Context-manager that suppresses printing. It works by redirecting prints to a temporary output file. This file is deleted after exiting the context-manager.
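The redirection idea can be sketched with contextlib and a temporary file. The function no_print_sketch is a hypothetical stand-in, and the real class may handle its stdout and stderr arguments differently.

```python
import contextlib
import tempfile


@contextlib.contextmanager
def no_print_sketch():
    """Send stdout and stderr to a temporary file that is deleted on exit."""
    with tempfile.TemporaryFile(mode="w") as sink:
        with contextlib.redirect_stdout(sink), contextlib.redirect_stderr(sink):
            yield


with no_print_sketch():
    print("this line is not shown")
print("this line is shown")
```

Only the second print reaches the terminal; the first one is written to the temporary file, which is removed when the block ends.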

time_stamp()[source]#: Return the current timestamp in a “[YYYY/MM/DD HH:MM:SS] ” format.

log(message='', level=20, end='\\n')[source]#

Print a nicely formatted message. This function adds the current timestamp and supports multi-line printing (split on the \n escape character). For verbosity levels we use the following convention:

NOTSET   = 0
DEBUG    = 10
INFO     = 20
WARN     = 30
ERROR    = 40
CRITICAL = 50

Parameters:

message (Any) – the message to send. Before printing we will use the message.__str__ method to get the string representation. If the message is a dict we use the json module to format the message nicely.
level (int) – the level to print the message at. We compare the level against the module-wide log_level variable (by default log_level = 20). If the level is below log_level we do not print it.
end (str) – the end of the string. This is usually the new-line character \n.
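The level filtering described for the level parameter can be sketched as follows. The function log_sketch is hypothetical; its boolean return value exists only to make the filtering visible, and giving each line of a multi-line message its own timestamp is an assumption.

```python
from datetime import datetime

log_level = 20  # module-wide threshold, INFO by default


def log_sketch(message="", level=20, end="\n"):
    """Print `message` with a timestamp if `level` reaches `log_level`."""
    if level < log_level:
        return False  # message is suppressed
    stamp = datetime.now().strftime("[%Y/%m/%d %H:%M:%S] ")
    # every line of a multi-line message gets its own timestamp
    for line in str(message).split("\n"):
        print(stamp + line, end=end)
    return True


shown = log_sketch("calculation finished", level=20)
hidden = log_sketch("intermediate value: 3", level=10)
```

With the default threshold of 20 the INFO message is printed, while the DEBUG message (level 10) is skipped.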

flow(message='', tags=['straight'], level=20)[source]#

Function to create flowchart-like output. It will print a message prepended by flow elements (arrows and lines). The flow elements are determined based on the given tags.

Return type:: None

table(rows, header=None, sep=' ', hline=[], level=20)[source]#

Print a table given rows and a header. Values in rows will be cast to strings first.

Parameters:

rows (List[List[Any]]) – list of nrows sequences containing ncols data inside the table.
header (Optional[List[str]]) – list of ncols strings that represent the column names of the table. They will be printed at the top of the table.
sep (str) – str representing the separation between columns.
hline (List[int]) – list of integers specifying rows after which lines will be drawn. Supports negative indices, e.g. -1 will draw a line at the end of the table.

Returns:

the table in string format, where lines are separated by \n.

Return type:

str

rectangle_list(values, spaces_before=0, level=20)[source]#: This function prints a list of strings in a rectangle to the output. This is similar to what the ls program does in unix.

loadbar(sequence, comment='', Nsegments=50, Nsteps=10, level=20)[source]#

Return values from an iterable sequence and also print a progress bar for the iteration over this sequence.

Parameters:

sequence (Union[Iterable[TypeVar(T)], Sequence[TypeVar(T)]]) – any iterable sequence. Should define the __len__ method.
comment (str) – a string to be printed at the end of the loading bar to give information about the loading bar.
Nsegments (int) – length of the loading bar in characters.
Nsteps (int) – number of times to print the loading bar during iteration. If the output is a tty-type stream Nsteps will be set to the length of sequence.

Return type:

Generator[TypeVar(T), None, None]

boxed(message, title=None, message_align='left', title_align='left', round_corners=True, double_edge=False, level=20)[source]#

Print a message surrounded by a box with optional title.

Parameters:

message (str) – the message to place in the box. Multiline messages are separated by \n.
title (Optional[str]) – the title placed in the top edge of the box.
message_align (str) – alignment of the text inside the box. One of [“left”, “center”, “right”].
title_align (str) – alignment of the title. One of [“left”, “center”, “right”].
round_corners (bool) – whether the corners of the box should be rounded or not. Rounded corners are only available for single-edge boxes.
double_edge (bool) – whether the edges of the box should be double.

Return type:

None

Returns:

The printed message in string format.

debug(message, level=10, caller_level=2)[source]#: Print a debug message.

info(message, level=20, caller_level=2)[source]#: Print an informative message.

warn(message, level=30, caller_level=2)[source]#: Print a warning message.

error(message, level=40, caller_level=2)[source]#: Print an error message.

critical(message, level=50, caller_level=2)[source]#: Print a critical message.

caller_name(level=1)[source]#

Return the full name of the caller of a function.

Parameters:: level (int) – the number of levels to skip when getting the caller name. Level 1 is always this function. When used by a different function it should be set to 2. E.g. when using the log.warn function level is set to 2.
Return type:: str
Returns:: The full name of the caller function.

tcutility.molecule module#

number_of_electrons(mol, charge=0)[source]#

The number of electrons in a molecule.

Parameters:

mol (Molecule) – the molecule to count the number of electrons from.
charge (int) – the charge of the molecule.

Return type:

int

Returns:

The sum of the atomic numbers in the molecule minus the charge of the molecule.
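The counting rule can be illustrated without a plams.Molecule. The sketch below works on a list of element symbols; the lookup table ATOMIC_NUMBERS and the function count_electrons are hypothetical and cover only the elements used here.

```python
# atomic numbers for the few elements used in this example
ATOMIC_NUMBERS = {"H": 1, "C": 6, "N": 7, "O": 8}


def count_electrons(symbols, charge=0):
    """Sum the atomic numbers of `symbols` and subtract the total charge."""
    return sum(ATOMIC_NUMBERS[s] for s in symbols) - charge


water = count_electrons(["O", "H", "H"])
hydroxide = count_electrons(["O", "H"], charge=-1)
hydronium = count_electrons(["O", "H", "H", "H"], charge=1)
```

All three species are isoelectronic: a negative charge adds an electron and a positive charge removes one.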

load(path)[source]#

Load a molecule from a given xyz file path. The xyz file is structured as follows:

[int]
Comment line
[str] [float] [float] [float] atom_tag1 atom_tag2 atom_key1=...
[str] [float] [float] [float] atom_tag1 atom_tag2 atom_key1=...
[str] [float] [float] [float]

mol_tag1
mol_tag2
mol_key1=...
mol_key2 = ...

The xyz file is parsed and returned as a plams.Molecule object. Flags and tags are given as mol.flags and mol.flags.tags respectively. Similarly for the atoms, the flags and tags are given as mol.atoms[i].flags and mol.atoms[i].flags.tags

Return type:: Molecule

from_string(s)[source]#

Load a molecule from a string. Currently only supports simple XYZ-files, i.e. not extended XYZ-files with flags.

Parameters:: s (str) – string containing the molecule to parse. This function only reads the element, x, y and z coordinates on each line. Other lines will not be read.
Return type:: Molecule
Returns:: A new molecule object with the elements and coordinates from the input.

Example

s = """
    O      -0.77012509       2.82058313      -0.00000000
    H      -0.77488739       2.61994920      -0.93878823
    H      -0.75583818       2.00242615       0.50201099
    """
mol = from_string(s)

guess_fragments(mol)[source]#

Guess fragments based on data from the xyz file. Two methods are currently supported; an example file for each is shown below. We also support reading of charges and spin-polarizations for the fragments. They should be given as charge_{fragment_name} and spinpol_{fragment_name} respectively.

8

N       0.00000000       0.00000000      -0.81474153
B      -0.00000000      -0.00000000       0.83567034
H       0.47608351      -0.82460084      -1.14410295
H       0.47608351       0.82460084      -1.14410295
H      -0.95216703       0.00000000      -1.14410295
H      -0.58149793       1.00718395       1.13712667
H      -0.58149793      -1.00718395       1.13712667
H       1.16299585      -0.00000000       1.13712667

frag_Donor = 1, 3-5
frag_Acceptor = 2, 6-8
charge_Donor = -1
spinpol_Acceptor = 2

In this case, fragment atom indices must be provided below the coordinates. The fragment name must be prefixed with frag_. Indices can be given as integers or as ranges using -.
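Expanding index specifications such as 1, 3-5 into explicit atom indices can be sketched like this. The function parse_indices is a hypothetical helper, not the parser used by the library; ranges are treated as inclusive, which matches the eight atoms in the example.

```python
def parse_indices(text):
    """Expand a string such as '1, 3-5' into [1, 3, 4, 5]."""
    indices = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, stop = (int(p) for p in part.split("-"))
            indices.extend(range(start, stop + 1))  # ranges are inclusive
        else:
            indices.append(int(part))
    return indices


donor = parse_indices("1, 3-5")
acceptor = parse_indices("2, 6-8")
```

Together the two lists cover atoms 1 to 8 exactly once, so every atom in the example belongs to one fragment.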

8

N       0.00000000       0.00000000      -0.81474153 frag=Donor
B      -0.00000000      -0.00000000       0.83567034 frag=Acceptor
H       0.47608351      -0.82460084      -1.14410295 frag=Donor
H       0.47608351       0.82460084      -1.14410295 frag=Donor
H      -0.95216703       0.00000000      -1.14410295 frag=Donor
H      -0.58149793       1.00718395       1.13712667 frag=Acceptor
H      -0.58149793      -1.00718395       1.13712667 frag=Acceptor
H       1.16299585      -0.00000000       1.13712667 frag=Acceptor

charge_Donor = -1
spinpol_Acceptor = 2

In this case, fragment atoms are marked with the frag flag which gives the name of the fragment the atom belongs to.

Parameters:: mol (Molecule) – the molecule that is to be split into fragments. It should have defined either method shown above. If it does not define these methods this function returns None.
Return type:: Dict[str, Molecule]
Returns:: A dictionary containing fragment names as keys and plams.Molecule objects as values. Atoms that were not included by either method will be placed in the molecule object with key None.
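
The index notation used by the first method (integers and ranges joined with -) can be illustrated with a small parser. This is a minimal sketch and not the tcutility implementation; the helper names parse_indices and parse_fragment_lines are hypothetical and exist only for this example.

```python
def parse_indices(spec):
    """Expand a string such as '1, 3-5' into [1, 3, 4, 5] (hypothetical helper)."""
    indices = []
    for part in spec.split(','):
        part = part.strip()
        if not part:
            continue
        if '-' in part:
            # a range such as '3-5' includes both end points
            start, stop = part.split('-')
            indices.extend(range(int(start), int(stop) + 1))
        else:
            indices.append(int(part))
    return indices


def parse_fragment_lines(lines):
    """Collect 'frag_<name> = <indices>' lines into {name: [indices]}."""
    fragments = {}
    for line in lines:
        key, sep, value = line.partition('=')
        key = key.strip()
        # lines without the frag_ prefix (charge_, spinpol_, ...) are skipped here
        if sep and key.startswith('frag_'):
            fragments[key[len('frag_'):]] = parse_indices(value)
    return fragments


flags = ['frag_Donor = 1, 3-5', 'frag_Acceptor = 2, 6-8', 'charge_Donor = -1']
print(parse_fragment_lines(flags))
```

For the NH3-BH3 example above this assigns atoms 1, 3, 4 and 5 to Donor and atoms 2, 6, 7 and 8 to Acceptor.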

write_mol_to_xyz_file(out_file, mols, include_n_atoms=False)[source]#

Writes a list of molecules to a file in xyz format.

Return type:: None

write_mol_to_amv_file(out_file, mols, energies, mol_names=None)[source]#

Writes a list of molecules to a file in amv format.

Return type:: None

save(mol, path, comment=None)[source]#: Save a molecule in a custom xyz file format. Molecule and atom flags can be provided as the “flags” parameter of the object (mol.flags and atom.flags).

tcutility.pathfunc module#

split_all(path)[source]#

Split a path into all of its parts.

Parameters:: path (str) – the path to be split, it will be separated using os.path.split().
Return type:: List[str]
Returns:: A list of parts of the original path.

Example

>>> split_all('a/b/c/d')
['a', 'b', 'c', 'd']
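
Repeatedly applying os.path.split() until nothing is left gives the behaviour described above. The following is a sketch under that assumption, not the library source; the name split_all_sketch is hypothetical, and keeping a leading root such as / as its own part is a choice made for this example only.

```python
import os


def split_all_sketch(path):
    """Split a path into its parts by repeatedly calling os.path.split()."""
    parts = []
    while path:
        head, tail = os.path.split(path)
        if head == path:
            # os.path.split() no longer makes progress: we reached a root
            parts.append(head)
            break
        if tail:
            parts.append(tail)
        path = head
    # parts were collected from the end of the path, so reverse them
    return parts[::-1]


print(split_all_sketch('a/b/c/d'))
```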

get_subdirectories(root, include_intermediates=False, max_depth=None, _current_depth=0)[source]#

Get all sub-directories of a root directory.

Parameters:

root (str) – the root directory.
include_intermediates (bool) – whether to include intermediate sub-directories instead of only the lowest levels.
max_depth (int) – the maximum depth to look for subdirectories, e.g. setting it to 1 will return only the contents of the root path.

Return type:

List[str]

Returns:

A list of sub-directories with root included in the paths.

Example

Given a file-structure as follows:

root
|- subdir_a
|  |- subsubdir_b
|  |- subsubdir_c
|- subdir_b
|- subdir_c

Then we get the following outputs.

>>> get_subdirectories('root', include_intermediates=True)
['root',
 'root/subdir_a',
 'root/subdir_a/subsubdir_b',
 'root/subdir_a/subsubdir_c',
 'root/subdir_b',
 'root/subdir_c']

>>> get_subdirectories('root', include_intermediates=False)
['root/subdir_a/subsubdir_b',
 'root/subdir_a/subsubdir_c',
 'root/subdir_b',
 'root/subdir_c']
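
The two outputs above can be reproduced with os.walk(). This is an illustrative sketch rather than the tcutility code: the name subdirectories_sketch is hypothetical, max_depth is left out, and a "lowest level" directory is assumed to be one that contains no further directories.

```python
import os


def subdirectories_sketch(root, include_intermediates=False):
    """List sub-directories of root, optionally including intermediate levels."""
    found = []
    for current, dirnames, _filenames in os.walk(root):
        # sorting in place makes os.walk() visit directories in a stable order
        dirnames.sort()
        # keep every directory, or only those without sub-directories
        if include_intermediates or not dirnames:
            found.append(current)
    return found
```

Note that, with this assumption, a root without any sub-directories is returned as its own lowest level.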

path_depth(path)[source]#

Calculate the depth of a given path.

Return type:: int

match(root, pattern, sort_by=None)[source]#

Find and return information about subdirectories of a root that match a given pattern.

Parameters:

root (str) – the root of the subdirectories to look in.
pattern (str) – a string specifying the pattern the subdirectories should correspond to. It should look similar to a format string, without the f in front of the string. Inside curly braces you can put a variable name, which you can later extract from the results. Anything inside curly braces will be matched to word characters ([a-zA-Z0-9_-]) including dashes and underscores.
sort_by (str) – the key to sort the results by. If not given, the results will be returned in the order they were found.

Return type:

Dict[str, dict]

Returns:

A Result object containing the matched directories as keys and information about those matches (also Result objects) as the values. Each information object contains the variables given in the pattern. E.g. using a pattern such as {a}/{b}/{c} will populate the info.a, info.b and info.c keys of the info Result object.

Example

Given a file-structure as follows:

root
|- NH3-BH3
|   |- BLYP_QZ4P
|   |  |- extra_dir
|   |  |- blablabla
|   |
|   |- BLYP_TZ2P
|   |  |- another_dir
|   |
|   |- M06-2X_TZ2P
|
|- SN2
|   |- BLYP_TZ2P
|   |- M06-2X_TZ2P
|   |  |- M06-2X_TZ2P

We can run the following script to match the subdirectories.

from tcutility import log
from tcutility.pathfunc import match

# get the matches; we want to extract the system name (NH3-BH3 or SN2)
# as well as the functional and basis-set
# we don't want directories nested below that level
matches = match('root', '{system}/{functional}_{basis_set}')

# print the matches as a table
rows = []
for d, info in matches.items():
    rows.append([d, info.system, info.functional, info.basis_set])

log.table(rows, ['Directory', 'System', 'Functional', 'Basis-Set'])

which prints

[2024/01/17 14:39:08] Directory                  System    Functional   Basis-Set
[2024/01/17 14:39:08] ───────────────────────────────────────────────────────────
[2024/01/17 14:39:08] root/SN2/M06-2X_TZ2P       SN2       M06-2X       TZ2P
[2024/01/17 14:39:08] root/NH3-BH3/BLYP_TZ2P     NH3-BH3   BLYP         TZ2P
[2024/01/17 14:39:08] root/NH3-BH3/M06-2X_TZ2P   NH3-BH3   M06-2X       TZ2P
[2024/01/17 14:39:08] root/SN2/BLYP_TZ2P         SN2       BLYP         TZ2P
[2024/01/17 14:39:08] root/NH3-BH3/BLYP_QZ4P     NH3-BH3   BLYP         QZ4P
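
One way to picture the pattern rule is as a translation into a regular expression with one named group per variable. The sketch below assumes that reading; pattern_to_regex is a hypothetical helper and not part of tcutility, and the real implementation may resolve ambiguous patterns differently.

```python
import re


def pattern_to_regex(pattern):
    """Turn '{a}/{b}' into a compiled regex with one named group per variable."""
    regex = ''
    pos = 0
    for m in re.finditer(r'\{(\w+)\}', pattern):
        # literal text between variables is matched verbatim
        regex += re.escape(pattern[pos:m.start()])
        # each variable matches word characters and dashes
        regex += f'(?P<{m.group(1)}>[a-zA-Z0-9_-]+)'
        pos = m.end()
    regex += re.escape(pattern[pos:])
    return re.compile(regex)


rx = pattern_to_regex('{system}/{functional}_{basis_set}')
print(rx.fullmatch('NH3-BH3/M06-2X_TZ2P').groupdict())
```

Because the groups are greedy and underscores are allowed inside a variable, a name with several underscores is split at the last one in this sketch.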

tcutility.report module#

tcutility.slurm module#

has_slurm(server=<tcutility.connect.Local object>)[source]#

Function to check if the current platform uses slurm.

Return type:: bool
Returns:: Whether slurm is available on this platform.

squeue(server=<tcutility.connect.Local object>)[source]#

Get information about jobs managed by slurm using squeue.

Return type:

Result

Returns:

A Result object containing information about the calculation status:

directory (list[str]) – path to slurm directories.

id (list[str]) – slurm job IDs.

status (list[str]) – slurm job status names. See the squeue documentation.

statuscode (list[str]) – slurm job status codes. See the squeue documentation.

Note

By default this function uses a timed cache (see timed_cache) with a 3 second delay to lessen the load on HPC systems.
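
The idea behind a timed cache can be sketched in a few lines. This is an illustration of the technique only and not the tcutility.cache source; the name timed_cache_sketch is hypothetical, and the sketch supports hashable positional arguments only.

```python
import functools
import time


def timed_cache_sketch(delay):
    """Cache results per argument tuple and recompute after `delay` seconds."""
    def decorator(func):
        store = {}  # maps args -> (time of computation, result)

        @functools.wraps(func)
        def wrapper(*args):
            now = time.monotonic()
            if args in store:
                stamp, result = store[args]
                if now - stamp < delay:
                    # entry is still fresh, skip the expensive call
                    return result
            result = func(*args)
            store[args] = (now, result)
            return result
        return wrapper
    return decorator


@timed_cache_sketch(3)
def expensive_query(name):
    print(f'querying {name}')
    return name.upper()


expensive_query('queue')  # runs the function
expensive_query('queue')  # served from the cache if called within 3 seconds
```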

sbatch(runfile, server=<tcutility.connect.Local object>, **options)[source]#

Submit a job to slurm using sbatch.

Parameters:

runfile (str) – the path to the filename to be submitted.
options (dict) – options to be used for sbatch.

Return type:

Result

Returns:

A Result object containing information about the newly submitted slurm job:

id (str) - the ID for the submitted slurm job.

command (str) - the command used to submit the job.

workdir_info(workdir, server=<tcutility.connect.Local object>)[source]#

Function that gets squeue information given a working directory. This will return None if the directory is not being actively referenced by slurm.

Return type:: Result
Returns:: Result object containing information about the calculation status, see squeue().

wait_for_job(slurmid, check_every=60, server=<tcutility.connect.Local object>)[source]#

Wait for a slurm job to finish. We check every check_every seconds if the slurm job id is still present in squeue.

Parameters:

slurmid (int) – the ID of the slurm job we are waiting for.
check_every (int) – the number of seconds to wait before checking squeue again. Don’t put this too low, or you will anger the cluster people.
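
The polling behaviour described here amounts to a simple loop. The sketch below is not the tcutility implementation: wait_until_gone is a hypothetical name, and the list_job_ids callable stands in for whatever returns the IDs currently known to squeue. IDs are compared as strings because squeue() reports them as list[str].

```python
import time


def wait_until_gone(job_id, list_job_ids, check_every=60, sleep=time.sleep):
    """Block until job_id is no longer reported; return the number of waits."""
    waits = 0
    # list_job_ids is a hypothetical callable returning the current job IDs
    while str(job_id) in [str(i) for i in list_job_ids()]:
        waits += 1
        sleep(check_every)
    return waits


# simulated queue: the job disappears on the third check
snapshots = iter([['101', '102'], ['101'], []])
print(wait_until_gone(101, lambda: next(snapshots), check_every=0))
```

Passing sleep as an argument is a choice made for this example so the loop can be exercised without actually waiting.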

tcutility.spell_check module#

naive_recursive(a, b)[source]#

The naïve recursive algorithm to obtain the Levenshtein distance between two strings. We do not recommend using this algorithm as it is quite slow and faster alternatives exist.

Parameters:

a (str) – the first string to compare.
b (str) – the second string to compare.

Return type:

float

Returns:

The Levenshtein distance between the strings a and b.

See also

wagner_fischer(): A more efficient algorithm to obtain the Levenshtein distance (up to 25x faster).
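
For reference, the textbook recursive definition of the Levenshtein distance looks as follows. This is a generic sketch with unit costs, assumed rather than copied from tcutility; the name naive_levenshtein is hypothetical. It recomputes the same sub-problems many times, which is why it scales poorly.

```python
def naive_levenshtein(a, b):
    """Levenshtein distance by direct recursion (exponential running time)."""
    if not a:
        return len(b)  # insert all remaining characters of b
    if not b:
        return len(a)  # delete all remaining characters of a
    if a[0] == b[0]:
        return naive_levenshtein(a[1:], b[1:])
    return 1 + min(
        naive_levenshtein(a[1:], b),      # deletion
        naive_levenshtein(a, b[1:]),      # insertion
        naive_levenshtein(a[1:], b[1:]),  # substitution
    )


print(naive_levenshtein('kitten', 'sitting'))
```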

wagner_fischer(a, b, substitution_cost=1, case_missmatch_cost=1, insertion_cost=1)[source]#

Return the Levenshtein distance using the Wagner-Fischer algorithm. You can also change the penalty for various errors for this algorithm. By default, all types of errors incur a penalty of 1.

Parameters:

a (str) – the first string to compare.
b (str) – the second string to compare.
substitution_cost (float) – the penalty for the erroneous substitution of a character.
case_missmatch_cost (float) – the penalty for mismatching the case of a character.
insertion_cost (float) – the cost for the erroneous insertion or deletion of a character.

Return type:

float

Returns:

The Levenshtein distance between the strings a and b.

Example

>>> wagner_fischer('kitten', 'sitting')
3

See also

naive_recursive(): An alternative (and slower) algorithm to obtain the Levenshtein distance.
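
The Wagner-Fischer algorithm fills a table of distances between all prefixes of the two strings. The sketch below shows one plausible way to apply the three penalties; it is an assumption about how they combine and not the library source. The name wagner_fischer_sketch and the keyword case_mismatch_cost are local to this example.

```python
def wagner_fischer_sketch(a, b, substitution_cost=1, case_mismatch_cost=1, insertion_cost=1):
    """Dynamic-programming Levenshtein distance with adjustable penalties."""
    rows, cols = len(a) + 1, len(b) + 1
    dist = [[0] * cols for _ in range(rows)]
    # turning a prefix into the empty string costs one deletion per character
    for i in range(rows):
        dist[i][0] = i * insertion_cost
    for j in range(cols):
        dist[0][j] = j * insertion_cost
    for i in range(1, rows):
        for j in range(1, cols):
            ca, cb = a[i - 1], b[j - 1]
            if ca == cb:
                cost = 0
            elif ca.lower() == cb.lower():
                cost = case_mismatch_cost
            else:
                cost = substitution_cost
            dist[i][j] = min(
                dist[i - 1][j] + insertion_cost,  # deletion from a
                dist[i][j - 1] + insertion_cost,  # insertion into a
                dist[i - 1][j - 1] + cost,        # match or substitution
            )
    return dist[-1][-1]


print(wagner_fischer_sketch('kitten', 'sitting'))
print(wagner_fischer_sketch('Kitten', 'kitten', case_mismatch_cost=0.5))
```

Lowering the case penalty below the substitution penalty makes strings that differ only in capitalisation rank as closer matches.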

get_closest(a, others, compare_func=<function wagner_fischer>, ignore_case=False, ignore_chars='', maximum_distance=None, **kwargs)[source]#

Return strings that are similar to an input string using the Levenshtein distance.

Parameters:

a (str) – the string to compare the rest to.
others (List[str]) – a collection of strings to compare to a. The returned strings will be taken from this collection.
compare_func – the function to use to compare the strings. Defaults to the efficient wagner_fischer() algorithm.
ignore_case (bool) – whether the case of the strings is taken into account. If enabled, all strings are turned to lower-case before comparison.
ignore_chars (str) – a string specifying characters that should be ignored.
maximum_distance (int) – the maximum Levenshtein distance to allow. If it is lower than the lowest distance for the collection of strings, we return the strings with the lowest distance. If set to None we return the lowest distance strings.

Return type:

List[str]

Returns:

A collection of strings that have a Levenshtein distance to a below maximum_distance or have the lowest distance to a if all strings have a distance greater than maximum_distance. If the lowest distance is 0, return an empty list instead.

Example

>>> closest = get_closest('kitten', ['mitten', 'bitten', 'sitting'])
>>> print(closest)
['mitten', 'bitten']
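
The selection rule described under Returns can be sketched as follows. This is an interpretation and not the tcutility code: closest_sketch and levenshtein are hypothetical names, ignore_chars is omitted, and strings at exactly maximum_distance are assumed to be included.

```python
def levenshtein(a, b):
    """Compact unit-cost Levenshtein distance that keeps one table row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def closest_sketch(a, others, maximum_distance=None, ignore_case=False):
    """Return the strings in `others` that are closest to `a`."""
    def prepare(s):
        return s.lower() if ignore_case else s

    dists = [levenshtein(prepare(a), prepare(o)) for o in others]
    if not dists:
        return []
    lowest = min(dists)
    if lowest == 0:
        # an exact match exists, so there is nothing to suggest
        return []
    # never use a cutoff below the lowest distance found
    cutoff = lowest if maximum_distance is None else max(lowest, maximum_distance)
    return [o for o, d in zip(others, dists) if d <= cutoff]


print(closest_sketch('kitten', ['mitten', 'bitten', 'sitting']))
```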

make_suggestion(a, others, **kwargs)[source]#

Print a warning that gives suggestions for strings that are close to a given string.

Example

>>> make_suggestion('kitten', ['mitten', 'bitten', 'sitting'])
[2024/01/30 15:26:35] [WARNING](main): Could not find "kitten". Did you mean mitten or bitten?

See also

get_closest() for a description of the function arguments.

check(a, others, caller_level=2, **kwargs)[source]#

tcutility.timer module#

class timer(name=None)[source]#

Bases: object

The main timer class. It acts both as a context-manager and decorator.
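
A class that works as both a context-manager and a decorator can be built on contextlib.ContextDecorator from the standard library. The sketch below illustrates that pattern only; timer_sketch and its timings attribute are hypothetical, and the real timer class may store and report its measurements differently.

```python
import time
from contextlib import ContextDecorator


class timer_sketch(ContextDecorator):
    """Record elapsed wall-clock time for a block or a decorated function."""

    timings = {}  # shared mapping of name -> list of elapsed times in seconds

    def __init__(self, name=None):
        self.name = name

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, *exc_info):
        elapsed = time.perf_counter() - self._start
        self.timings.setdefault(self.name, []).append(elapsed)
        return False  # do not swallow exceptions raised in the block


# used as a context-manager
with timer_sketch('setup'):
    sum(range(1000))


# used as a decorator
@timer_sketch('work')
def work():
    return sum(range(1000))


work()
print({name: len(values) for name, values in timer_sketch.timings.items()})
```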

print_timings()[source]#

Module contents#

ensure_list(x)#

squeeze_list(x)#

ensure_2d(x, transposed=False)[source]#
