`sfftk.core` package

Configs

`sfftk.core.configs`

This module defines classes and functions to correctly process persistent configurations. Please see the guide to miscellaneous operations for a complete description of working with configs.

class sfftk.core.configs.Configs(config_fn, *args, **kwargs)[source]

Bases: dict

Class defining configs

Configurations are stored in a subclass of OrderedDict (normal dict for Python 3.7+) with appended methods for reading (Configs.read()), writing (Configs.write()) and clearing (Configs.clear()) configs.

Printing an object of this class displays all configs.

This class is used as an argument to configs.load_configs().

clear()[source]: Clear configs

read()[source]: Read configs from file

write()[source]: Write configs to file

sfftk.core.configs.del_configs(args, configs)[source]

Delete the named config

Parameters:

args (argparse.Namespace) – parsed arguments
configs (dict) – configuration options

Returns:

status

Return type:

int

sfftk.core.configs.get_config_file_path(args, user_folder='~/.sfftk', user_conf_fn='sff.conf', config_class=<class 'sfftk.core.configs.Configs'>)[source]

A function that returns the right config path to use depending on the command specified

The user may specify

sff <cmd> [<sub_cmd>] [--shipped-configs|--config-path] [args...]

and we have to decide which configs to use.

Example:

View the notes in the file. If user configs are available, use them; otherwise, use shipped configs.

sff notes list file.json

View the notes in the file but ONLY use shipped configs.

sff notes list --shipped-configs file.json

View the notes in the file but ONLY use custom configs at path

sff notes list --config-path /path/to/sff.conf file.json

Get available configs. First check for user configs and fall back on shipped configs

sff config get --all

Get configs from the path

sff config get --config-path /path/to/sff.conf --all
# ignore shipped still!
sff config get --config-path /path/to/sff.conf --shipped-configs --all

Get shipped configs even if user configs exist

sff config get --shipped-configs --all

Set configs to user configs. If user configs don’t exist, copy the shipped configs and add the new config.

sff config set NAME VALUE

Set configs to config path. Ignore user and shipped configs

sff config set --config-path /path/to/sff.conf NAME VALUE

Fail! Shipped configs are read-only

sff config set --shipped-configs NAME VALUE
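The precedence implied by the examples above can be summarised in a short sketch. The function name resolve_config_path and its parameters are hypothetical and chosen only for illustration; the behaviour when --config-path and --shipped-configs are combined with config set is an assumption extrapolated from the config get example.

```python
import os
import shutil
import tempfile


def resolve_config_path(user_path, shipped_path, config_path=None,
                        shipped_configs=False, setting=False):
    """Pick the config file to use (hypothetical helper, not sfftk's API)."""
    if config_path is not None:
        # an explicit path wins, even if --shipped-configs is also given
        return config_path
    if shipped_configs:
        if setting:
            # shipped configs are read-only
            raise ValueError("shipped configs are read-only")
        return shipped_path
    if os.path.exists(user_path):
        return user_path
    if setting:
        # seed the user configs from the shipped ones before modifying them
        folder = os.path.dirname(user_path)
        if folder:
            os.makedirs(folder, exist_ok=True)
        shutil.copy(shipped_path, user_path)
        return user_path
    # reading with no user configs: fall back on shipped configs
    return shipped_path
```

Keeping the decision in one function with plain arguments makes each of the listed command lines correspond to a single branch.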

Parameters:

args –
user_folder –
user_conf_fn –

Returns:

sfftk.core.configs.get_configs(args, configs)[source]

Get the value of the named config

Parameters:

args (argparse.Namespace) – parsed arguments
configs (dict) – configuration options

Returns:

status

Return type:

int

sfftk.core.configs.load_configs(config_file_path, config_class=<class 'sfftk.core.configs.Configs'>)[source]

Load configs from the given file

Parameters:

config_file_path (str) – a path to a file with configs
config_class (class) – the config class; default: Configs

Returns:

the configs

Return type:

Configs

sfftk.core.configs.set_configs(args, configs)[source]

Set the config of the given name to have the given value

Parameters:

args (argparse.Namespace) – parsed arguments
configs (dict) – configuration options

Returns:

status

Return type:

int

Parser

`sfftk.core.parser`

A large number of functions in sfftk consume only two arguments: args, which is the direct output of Python’s argparse.ArgumentParser, and configs, a dictionary of all persistent configs. This module extends the parser object sfftkrw.core.parser.Parser and also provides the sfftk.core.parser.parse_args() function, which performs sanity checks on all command-line arguments.

class sfftk.core.parser.UpperAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)[source]: Bases: Action

sfftk.core.parser.check_multi_file_formats(file_names)[source]

Check file names for file formats

When working with multifile segmentations, this function checks that all files are consistent

Parameters:: file_names (list) – a list of file names
Returns:: a tuple consisting of whether or not the set of file formats is valid, the set of file formats observed and the set of invalid file formats
Return type:: tuple[bool, set, set]
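A minimal sketch of a check with this return shape is given below. The function name check_formats, the list of supported formats, and the rule that "consistent" means a single supported format shared by all files are assumptions for illustration, not a description of the sfftk implementation.

```python
import os


def check_formats(file_names, valid_formats=("map", "mrc", "rec", "stl")):
    """Hypothetical consistency check for multi-file segmentations.

    Returns (is_valid, observed_formats, invalid_formats).
    """
    # take the lower-cased extension of each file name as its format
    observed = {
        os.path.splitext(fn)[1].lstrip(".").lower() for fn in file_names
    }
    invalid = observed - set(valid_formats)
    # valid only if every file shares one supported format
    is_valid = len(observed) == 1 and not invalid
    return is_valid, observed, invalid
```

Returning the observed and invalid sets alongside the boolean lets the caller report exactly which formats caused the failure.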

sfftk.core.parser.cli(cmd: str) -> (<class 'argparse.Namespace'>, <class 'configparser.ConfigParser'>)[source]: CLI function

sfftk.core.parser.parse_args(_args, use_shlex=False)[source]

Parse and check command-line arguments and also return configs.

This function does all the heavy lifting in ensuring that command-line arguments are properly formatted and checked for sanity. It also extracts configs from the config files.

In this way, command handlers (defined in sfftk.sff, e.g. sfftk.sff.handle_convert()) can assume correct argument values and concentrate on functionality, which makes the code more readable.

Parameters:

_args (list or str) – list of arguments (use_shlex=False); string of arguments (use_shlex=True)
use_shlex (bool) – treat _args as a string instead for parsing using shlex lib

Returns:

parsed arguments and configs

Return type:

tuple[argparse.Namespace, sfftk.core.configs.Configs]
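The difference between the two input forms can be illustrated with the standard library alone. The helper normalise_args below is hypothetical; it only shows how a single command string could be tokenised with shlex before being handed to a parser, and makes no claim about what parse_args() does internally.

```python
import shlex


def normalise_args(_args, use_shlex=False):
    """Return a list of argument tokens (hypothetical helper)."""
    if use_shlex:
        # shlex.split honours quoting, so paths with spaces stay in one token
        return shlex.split(_args)
    # already a sequence of tokens
    return list(_args)


tokens = normalise_args(
    'notes list --config-path "/path with space/sff.conf" file.json',
    use_shlex=True,
)
```

Accepting a string is mostly a convenience for tests and interactive use, where writing the command as typed in a shell is easier than building a list by hand.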

Preparation utilities

`sfftk.core.prep`

This module consists of preparation utilities to condition segmentation files prior to conversion.

class sfftk.core.prep.MergedMask(data=None, dtype=dtype('int16'), mask_name_prefix='mask_', zfill=4)[source]

Bases: object

This class describes a special mask used to perform mask merging. It automatically handles complex cases involving mask overlaps by constructing a label tree showing the relations between masks. In the trivial case of non-overlapping masks, all labels are children of the root label (0).

There are only three ways that an overlap can happen.

no overlap: the trivial case, in which no elements are shared between masks;
complete overlap: one set of elements is completely contained in another set;
partial overlap: some elements are shared.

For this functionality to work we need several functions:

vectorised addition of masks to the merged mask;
a way to decide the next label to use, which is not necessarily the current label plus one;
a way to capture the relationship between labels.

Consider the simple exercise of merging the following non-trivial (overlapping) masks:

mask1 = [0, 1, 0, 0]
mask2 = [0, 1, 0, 0]
mask3 = [0, 0, 1, 0]
mask4 = [0, 1, 1, 1]
mask5 = [1, 0, 0, 0]
mask6 = [1, 0, 1, 0]

We will build our merged mask by successively adding each mask to the empty mask: [0, 0, 0, 0].

We assume that all masks are positive binary with values 0 (background) and 1 (elements of interest).

At each iteration, we set a new label to be used. This label identifies the particular mask, so we multiply the mask by the label before adding it to the merged mask.

Because elements can overlap, we need a way to keep track of labels so that we can record when we have to assign labels that indicate either complete or partial overlap. We, therefore, examine the resulting labels and from this infer the relationships between labels. To do this, we have a set of admitted labels as well as a set of new labels. By comparing these sets and taking into account the current label, we can determine the label for elements resulting from overlap and which labels they relate to.

merged_mask = [0, 0, 0, 0] # the internal value of MergedMask's array
label = 1
label_set = set()
label_tree = dict()
# mask 1
merged_mask = [0, 0, 0, 0] + [0, 1, 0, 0] * 1 # => [0, 1, 0, 0]
label_set = {1}
label_tree[1] = 0 # 1 is a child of the root (0) => {1: 0}
new_labels = set()
label = numpy.amax(merged_mask) + 1 # => 2
# mask 2
merged_mask = [0, 1, 0, 0] + [0, 1, 0, 0] * 2 # => [0, 3, 0, 0]
label_set = {1, 2}
label_tree[2] = 0 # => {1: 0, 2: 0}
new_labels = {3}
label_tree[3] = [1, 2] # 3 is a child of 1 and 2 (overlap) => {1: 0, 2: 0, 3: [1, 2]}
label_set = {1, 2, 3}
label = numpy.amax(merged_mask) + 1 # => 4
# mask 3
merged_mask = [0, 3, 0, 0] + [0, 0, 1, 0] * 4 # => [0, 3, 4, 0]
label_set = {1, 2, 3, 4}
label_tree[4] = 0 # => {1: 0, 2: 0, 3: [1, 2], 4: 0}
new_labels = set()
label = numpy.amax(merged_mask) + 1 # => 5
# mask 4
merged_mask = [0, 3, 4, 0] + [0, 1, 1, 1] * 5 # => [0, 8, 9, 5]
label_set = {1, 2, 3, 4, 5}
label_tree[5] = 0 # => {1: 0, 2: 0, 3: [1, 2], 4: 0, 5: 0}
new_labels = {8, 9}
label_tree[8] = [3, 5]
label_tree[9] = [4, 5] # => {1: 0, 2: 0, 3: [1, 2], 4: 0, 5: 0, 8: [3, 5], 9: [4, 5]}
label_set = {1, 2, 3, 4, 5, 8, 9}
label = numpy.amax(merged_mask) + 1 # => 10
# mask 5
merged_mask = [0, 8, 9, 5] + [1, 0, 0, 0] * 10 # => [10, 8, 9, 5]
label_set = {1, 2, 3, 4, 5, 8, 9, 10}
label_tree[10] = 0 # => {1: 0, 2: 0, 3: [1, 2], 4: 0, 5: 0, 8: [3, 5], 9: [4, 5], 10: 0}
new_labels = set() # mask 5 overlaps nothing admitted so far
label = numpy.amax(merged_mask) + 1 # => 11
# mask 6
merged_mask = [10, 8, 9, 5] + [1, 0, 1, 0] * 11 # => [21, 8, 20, 5]
label_set = {1, 2, 3, 4, 5, 8, 9, 10, 11}
label_tree[11] = 0 # => {1: 0, 2: 0, 3: [1, 2], 4: 0, 5: 0, 8: [3, 5], 9: [4, 5], 10: 0, 11: 0}
new_labels = {20, 21}
label_tree[20] = [9, 11]
label_tree[21] = [10, 11] # => {1: 0, 2: 0, 3: [1, 2], 4: 0, 5: 0, 8: [3, 5], 9: [4, 5], 10: 0, 11: 0, 20: [9, 11], 21: [10, 11]}
label_set = {1, 2, 3, 4, 5, 8, 9, 10, 11, 20, 21}
label = numpy.amax(merged_mask) + 1 # => 22
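The bookkeeping traced above can be condensed into a small runnable sketch. It uses plain Python lists in place of numpy arrays, and the function name merge_masks is hypothetical; it follows the labelling scheme as described in this section rather than reproducing the MergedMask source.

```python
def merge_masks(masks):
    """Merge binary masks using the labelling scheme described above."""
    merged = [0] * len(masks[0])
    label = 1
    label_set = set()   # labels admitted so far
    label_tree = {}     # label -> 0 (root) or [parent labels]
    for mask in masks:
        # element-wise stand-in for the vectorised `merged + mask * label`
        merged = [m + b * label for m, b in zip(merged, mask)]
        label_set.add(label)
        label_tree[label] = 0  # every mask label is a child of the root
        # values not yet admitted can only come from overlaps with this mask
        new_labels = set(merged) - label_set - {0}
        for new_label in sorted(new_labels):
            label_tree[new_label] = [new_label - label, label]
        label_set |= new_labels
        # the next label must exceed every value currently in use
        label = max(merged) + 1
    return merged, label, label_tree


# the first four masks from the exercise
masks = [[0, 1, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 1, 1, 1]]
merged, next_label, tree = merge_masks(masks)
# merged == [0, 8, 9, 5] and next_label == 10
```

Because each overlap value is the sum of an existing label and the current label, subtracting the current label recovers the other parent, which is how the tree entries are filled in.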

Objects of this class have a number of important properties germane to working with collation of masks:

they know what the next label value is implicitly;
they handle iterative addition of masks to construct the merged mask;
they keep track of the label tree.

The internal array instantiation is lazy—it is only created once we know the size of the masks to be merged.

Using a MergedMask object converts the complexity of the above into the following:

merged_mask = MergedMask()
for mask in masks: # masks is a list of n-dimensional binary-valued arrays
    merged_mask.merge(mask)

Internally, merging is a vectorised addition of arrays implemented by overloading the __add__, __radd__ and __iadd__ protocols. However, it is safest to use the MergedMask.merge() method because numpy arrays also implement the addition protocols, which means that __radd__ fails when a numpy array is the left operand.
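One plausible reading of this caveat is that Python only consults the right operand's __radd__ when the left operand's __add__ returns NotImplemented. The toy classes below (all hypothetical, with no numpy involved) show that dispatch rule: a left operand that accepts any type never gives the right operand a chance, whereas an explicit method call is unambiguous.

```python
class Greedy:
    """Left operand whose __add__ accepts anything (stand-in for an array type)."""

    def __add__(self, other):
        return "Greedy.__add__"


class Polite:
    """Left operand whose __add__ declines unknown types."""

    def __add__(self, other):
        return NotImplemented


class Merged:
    """Right operand that would like to handle the addition itself."""

    def __radd__(self, other):
        return "Merged.__radd__"

    def merge(self, other):
        return "Merged.merge"


handled_by_right = Polite() + Merged()      # left declines, so __radd__ runs
handled_by_left = Greedy() + Merged()       # left accepts, __radd__ is skipped
explicit = Merged().merge(Greedy())         # no operator dispatch involved
```

This is why an explicit merge() call is the more predictable interface when the other operand is an array type with its own arithmetic.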

Once the masks have been merged, we can now interrogate the merged mask for some attributes:

merged_mask.label # the next label to be used; autoincremented appropriately
merged_mask.label_tree # the hierarchy of labels (complex tree of labels)
merged_mask.mask_to_label # the relations between masks and labels

merge(mask: ndarray, mask_name=None)[source]: Merge the sequence of masks in the specified order

class sfftk.core.prep.RelionCompositeStarReader(*args, **kwargs)[source]

Bases: RelionStarReader

Relion composite star file reader

maximum_tomograms = None

sfftk.core.prep.bin_map(args, configs)[source]

Bin the CCP4 map

Parameters:

args (argparse.Namespace) – parsed arguments
configs (sfftk.core.configs.Configs) – configurations object

Returns:

exit status

Return type:

int

sfftk.core.prep.check_mask_is_binary(fn, verbose=False)[source]

Check whether a mask is binary or not

Parameters:

fn (str) – map filename
verbose (bool) – verbosity flag

Returns:

boolean, True if binary mask

Return type:

bool
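As a rough illustration of the kind of test involved, the sketch below checks a flat sequence of voxel values. It assumes that "binary" means the positive binary convention stated for MergedMask (only the values 0 and 1); the function name is_binary is hypothetical, and reading the values from a map file is left out.

```python
def is_binary(values):
    """Return True if the iterable holds only 0s and 1s (assumed definition)."""
    # a set comparison ignores how often each value occurs
    return set(values) <= {0, 1}


binary_example = is_binary([0, 1, 1, 0])
non_binary_example = is_binary([0, 2, 0, 1])
```

Under this definition an all-zero mask also counts as binary; a stricter check could additionally require both values to be present.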

sfftk.core.prep.construct_transformation_matrix(args)[source]

Construct the transformation matrix

Parameters:: args (argparse.Namespace) – parsed arguments
Returns:: transform
Return type:: numpy.ndarray

sfftk.core.prep.mergemask(args, configs)[source]

Merge two or more (max 255) masks into one with a distinct label for each mask

Parameters:

args (argparse.Namespace) – parsed arguments
configs (sfftk.core.configs.Configs) – configurations object

Returns:

exit status

Return type:

int

sfftk.core.prep.starcrop(args, configs)[source]

Crop a star file to have at most the given number of rows

Parameters:

args (argparse.Namespace) – parsed arguments
configs (sfftk.core.configs.Configs) – configurations object

Returns:

exit status

Return type:

int

sfftk.core.prep.starsplit(args, configs)[source]

Split a star file into multiple star files based on the given column

Parameters:

args (argparse.Namespace) – parsed arguments
configs (sfftk.core.configs.Configs) – configurations object

Returns:

exit status

Return type:

int

sfftk.core.prep.transform(args, configs)[source]

Rescale the STL mesh using the params in the arguments namespace

Parameters:

args (argparse.Namespace) – parsed arguments
configs (sfftk.core.configs.Configs) – configurations object

Returns:

exit status

Return type:

int

sfftk.core.prep.transform_stl_mesh(mesh, transform)[source]

Rescale the given STL mesh by the given transform

Parameters:

mesh (numpy.ndarray) – an STL mesh
transform (numpy.ndarray) – numpy array with shape = (4, 4)

Returns:

an STL mesh transformed

Return type:

numpy.ndarray
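To make the role of the (4, 4) transform concrete, here is a pure-Python sketch that applies a homogeneous transformation matrix to a list of (x, y, z) vertices. The function name transform_vertices and the plain-list representation of the mesh are assumptions for the example; the actual function operates on numpy arrays and its mesh layout is not described here.

```python
def transform_vertices(vertices, transform):
    """Apply a 4x4 homogeneous transform to (x, y, z) vertices."""
    out = []
    for x, y, z in vertices:
        point = (x, y, z, 1.0)  # homogeneous coordinates
        # multiply each matrix row by the point
        tx, ty, tz, tw = (
            sum(row[i] * point[i] for i in range(4)) for row in transform
        )
        # divide by the homogeneous component (1 for affine transforms)
        out.append((tx / tw, ty / tw, tz / tw))
    return out


# scale by 2 along every axis and translate by 1 along x
scale_and_shift = [
    [2, 0, 0, 1],
    [0, 2, 0, 0],
    [0, 0, 2, 0],
    [0, 0, 0, 1],
]
moved = transform_vertices([(1, 1, 1)], scale_and_shift)
```

A single 4x4 matrix is convenient because scaling, rotation and translation can all be combined into one matrix and applied in a single pass.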
