Documentation

Phybers is a Python library that provides several tools for brain tractography dataset analysis. With the aim of improving its usability, the library has been separated into 4 primary modules: Segment, Clustering, Utils, and Visualization. This section explains each of its modules, functions, and input/output arguments.

Segment

This module incorporates a white matter fiber bundle segmentation algorithm based on a multi-subject atlas [3, 5, 6], called FiberSeg.

FiberSeg Description

The FiberSeg uses as a measure of similarity between pairs of fibers the maximum Euclidean distance between corresponding points ( \(d_{ME}\)), defined as:

\[\begin{equation} \label{eq:ecua1} d_{ME}(A,B) = \min(\max_{i}(|a_{i}-b_{i}|),\max_{i}(|a_{i}-b_{N_{p}-i}|)) \quad \qquad [1] \end{equation}\]

where \(a_{i}\) and \(b_{i}\) are the 3D coordinates of the points of fibers A and B, respectively. Both fibers have the same number of points (\(N_{p}\)) and are traversed in direct order: the points of fiber A are \(a_{i}\) = [\(a_{1}\), \(a_{2}\), …, \(a_{N_{p}}\)], and those of B are \(b_{i}\) = [\(b_{1}\), \(b_{2}\), …, \(b_{N_{p}}\)]. The reverse order of fiber B is then expressed as \(b_{N_{p}-i}\) = [\(b_{N_{p}}\), \(b_{N_{p}-1}\), …, \(b_{1}\)]. Taking the minimum over both orders makes the distance independent of the direction in which each fiber was stored.
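The distance in Equation [1] can be sketched in a few lines of plain Python. This is an illustrative, dependency-free version and not the library's own implementation; the function name d_me and the toy fibers are assumptions made for the example.

```python
import math

def d_me(fiber_a, fiber_b):
    """Maximum Euclidean distance between corresponding points (Equation [1])."""
    if len(fiber_a) != len(fiber_b):
        raise ValueError("both fibers must have the same number of points")
    # Direct order: a_i is compared with b_i.
    direct = max(math.dist(p, q) for p, q in zip(fiber_a, fiber_b))
    # Reverse order: fiber B is traversed backwards.
    flipped = max(math.dist(p, q) for p, q in zip(fiber_a, reversed(fiber_b)))
    return min(direct, flipped)

# Toy fibers with three points each (hypothetical data, coordinates in mm).
fiber_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
# Same shape as fiber_a, shifted 3 mm along y and stored in the opposite direction.
fiber_b = [(2.0, 3.0, 0.0), (1.0, 3.0, 0.0), (0.0, 3.0, 0.0)]

print(d_me(fiber_a, fiber_b))  # 3.0
```

The reversed comparison is what gives 3.0 here; the direct comparison alone would report a larger distance only because the two fibers were stored in opposite directions.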

FiberSeg classifies the subject fibers according to a multi-subject bundle atlas. The bundle atlas consists of a set of representative bundles and an information file. The fibers of the atlas bundles are called centroids. Each subject fiber is compared with the atlas centroids using \(d_{ME}\), and every bundle has its own maximum distance threshold. A fiber is labeled with the closest atlas bundle, provided that the distance is smaller than the threshold of that bundle; otherwise it remains unlabeled.
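The labeling rule described above can be sketched as follows. This is a simplified brute-force illustration under stated assumptions: the function label_fibers, the dictionary layout of the atlas, and the bundle names are hypothetical, and the real algorithm is considerably more optimized.

```python
import math

def d_me(a, b):
    """Maximum Euclidean distance between corresponding points, both orders."""
    direct = max(math.dist(p, q) for p, q in zip(a, b))
    flipped = max(math.dist(p, q) for p, q in zip(a, reversed(b)))
    return min(direct, flipped)

def label_fibers(fibers, atlas, thresholds):
    """Return one atlas label (or None) per subject fiber.

    atlas:      dict mapping bundle name -> list of centroids (assumed layout).
    thresholds: dict mapping bundle name -> maximum d_ME in mm.
    """
    labels = []
    for fiber in fibers:
        best_name, best_dist = None, math.inf
        for name, centroids in atlas.items():
            dist = min(d_me(fiber, centroid) for centroid in centroids)
            # Accept the bundle only if it is under its own threshold
            # and closer than any bundle accepted so far.
            if dist < thresholds[name] and dist < best_dist:
                best_name, best_dist = name, dist
        labels.append(best_name)
    return labels

# Hypothetical two-bundle atlas with one centroid per bundle.
atlas = {
    "bundle_x": [[(0, 0, 0), (1, 0, 0), (2, 0, 0)]],
    "bundle_y": [[(0, 10, 0), (1, 10, 0), (2, 10, 0)]],
}
thresholds = {"bundle_x": 2.0, "bundle_y": 2.0}
fibers = [
    [(0, 1, 0), (1, 1, 0), (2, 1, 0)],  # 1 mm from bundle_x
    [(2, 9, 0), (1, 9, 0), (0, 9, 0)],  # 1 mm from bundle_y, reversed
    [(0, 5, 0), (1, 5, 0), (2, 5, 0)],  # 5 mm from both, left unlabeled
]
labels = label_fibers(fibers, atlas, thresholds)
print(labels)  # ['bundle_x', 'bundle_y', None]
```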

Atlases Download

We provide one atlas of Deep White Matter (DWM) bundles [3] and two atlases of Superficial White Matter (SWM) bundles, [4] and [7]. The following links provide access to these three atlases:

  1. Download DWM bundle atlas (Guevara et al. 2012)

  2. Download SWM bundle atlas (Claudio Roman et al. 2017)

  3. Download SWM bundle atlas (Claudio Roman et al. 2022)

In addition to these atlases, we have tested our algorithm with the atlas [8], which contains both DWM and SWM bundles. If you need this atlas in bundles format, feel free to contact us via email.

phybers.segment.fiberseg(file_in: str, subj_name: str, atlas_dir: str, atlas_info: str, dir_out: str) → None

White matter fiber bundle segmentation algorithm based on a multi-subject atlas.

Parameters:
  • file_in (str) – Tractography dataset file in bundles format.

  • subj_name (str) – Subject name, used to label the results.

  • atlas_dir (str) – Bundle atlas directory, with bundles in separate files, sampled at 21 equidistant points. The bundle atlases provided are in the same folders.

  • atlas_info (str) – Text file associated with the atlas in use. It stores the information needed to apply the segmentation algorithm, i.e., a list of the atlas fascicles containing the name, the segmentation threshold (in mm) and the size of each fascicle. Note that the segmentation threshold can be adjusted depending on the database to be used.

  • dir_out (str) – Directory name to store all the results generated by the algorithm.

Return type:

None

Notes

This function generates the following files in the specified directory:

final_bundles (bundles files)

The directory contains individual files for each segmented fascicle in the bundles/bundlesdata format, sampled at 21 points or at all points, and in the atlas space (MNI). Each file is named using the subject’s name followed by the atlas label.

centroids (bundles files)

A directory storing two files corresponding to a tractography dataset in bundles/bundlesdata format, containing one centroid for each segmented fascicle. This dataset is named centroids.bundles/centroids.bundlesdata, and is sampled at 21 points in the atlas space (MNI).

bundles_id (text files)

A directory containing one text file for each segmented bundle. Each text file lists the indices of the bundle's fibers in the subject’s original tractography dataset file. The name of each file corresponds to the name of the segmented bundle file with the '.txt' extension. These files are used to obtain the segmented fascicles in the subject’s original space and with all the points of the brain fibers.

Clustering

The module comprises two fiber clustering algorithms for whole-brain tractography datasets: HClust (Hierarchical Clustering, [4, 7]) and FFClust (Fast Fiber Clustering, [9]).

HClust Description

HClust is an average-link hierarchical agglomerative clustering algorithm which allows finding bundles based on a pairwise fiber distance measure. The algorithm calculates a distance matrix between all fiber pairs for a bundles dataset (\(d_{ij}\)), by using the maximum of the Euclidean distance between fiber pairs (Equation [1]). Then, it computes an affinity graph on the \(d_{ij}\) matrix for the fiber pairs that have Euclidean distance below a maximum distance threshold (fiber_thr). The Affinity [10] is given by the following equation:

\[\begin{equation} a_{ij} = e^{\frac{-d_{ij}}{\sigma^{2}}} \end{equation}\]

where \(d_{ij}\) is the distance between the elements \(i\) and \(j\), and \(\sigma\) is a parameter that defines the similarity scale in mm.

From the affinity graph, the hierarchical tree is generated using an agglomerative average-link hierarchical clustering algorithm. The tree is adaptively partitioned using a distance threshold (partition_thr).
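The affinity step can be illustrated with a short sketch that follows the equation above. It is only an assumption-laden outline: the function name affinity_graph and the edge dictionary are hypothetical, the distance matrix is toy data, and the default values simply mirror the documented defaults of fiber_thr and variance.

```python
import math

def affinity_graph(distances, fiber_thr=70.0, variance=3600.0):
    """Build sparse affinity edges {(i, j): a_ij} from a symmetric distance matrix.

    variance plays the role of sigma squared in a_ij = exp(-d_ij / sigma^2).
    """
    edges = {}
    n = len(distances)
    for i in range(n):
        for j in range(i + 1, n):
            d_ij = distances[i][j]
            # Only fiber pairs below the maximum distance threshold become edges.
            if d_ij < fiber_thr:
                edges[(i, j)] = math.exp(-d_ij / variance)
    return edges

# Toy d_ME matrix for three fibers, in mm (hypothetical values).
distances = [
    [0.0, 10.0, 80.0],
    [10.0, 0.0, 36.0],
    [80.0, 36.0, 0.0],
]
edges = affinity_graph(distances)
print(sorted(edges))  # [(0, 1), (1, 2)]
```

The pair (0, 2) is dropped because its distance exceeds the threshold, which is what keeps the graph sparse before the average-link tree is built.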

phybers.clustering.hclust(file_in: str, dir_out: str, fiber_thr: int, partition_thr: int, variance: int) → None

Average-link hierarchical agglomerative clustering algorithm which allows finding bundles based on a pairwise fiber distance measure.

Parameters:
  • file_in (str) – Tractography dataset file in bundles format.

  • dir_out (str) – Directory to store all the results generated by the algorithm.

  • fiber_thr (int) – Maximum distance threshold in mm, default: 70.

  • partition_thr (int) – Partition threshold in mm, default: 70.

  • variance (int) – Variance (\(\sigma^{2}\) in the affinity equation), which sets the similarity scale, default: 3600.

Return type:

None

Notes

This function generates the following files in the specified directory:

final_bundles (bundles files)

Directory that stores all the fiber clusters identified in separated datasets (bundles/bundlesdata files), sampled at 21 points. The file names are labeled with integer numbers ranging from zero to the total number of fiber clusters (N-1) identified.

final_bundles_allpoints (bundles files)

Directory that stores all the fiber clusters identified in separated datasets (bundles/bundlesdata files). The file names are labeled with integer numbers ranging from zero to the total number of fiber clusters (N-1) identified. This directory is only generated when the input tractography dataset file has not been sampled at 21 points.

centroids (bundles files)

A directory storing two files corresponding to a tractography dataset in bundles/bundlesdata format, containing one centroid for each created cluster. This dataset is named centroids.bundles/centroids.bundlesdata, and is sampled at 21 points in the atlas space (MNI).

bundles_id (text file)

Text file that stores the fiber indices from the original input tractography dataset file for each detected cluster. Each line in the file corresponds to one cluster, in consecutive order.

outputs (several formats)

Temporary directory with intermediate results, such as the distance matrix, the affinity graph, and the dendrogram.
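As an illustration of how the bundles_id text file could be consumed, the sketch below parses one list of fiber indices per line. The exact layout of the file is an assumption here (whitespace-separated integers, one cluster per line), and read_cluster_indices is a hypothetical helper, not part of the library; check an actual output file before relying on it.

```python
import io

def read_cluster_indices(handle):
    """Return one list of fiber indices per cluster (one cluster per line)."""
    return [[int(token) for token in line.split()]
            for line in handle if line.strip()]

# Hypothetical file content: three clusters, indices separated by spaces.
example = io.StringIO("0 4 7\n1 2\n3 5 6 8\n")
clusters = read_cluster_indices(example)

# Stand-in for the fibers of the original dataset (hypothetical names).
fibers = [f"fiber_{i}" for i in range(9)]
# Cluster k is recovered by indexing the original dataset with line k.
cluster_1 = [fibers[i] for i in clusters[1]]
print(cluster_1)  # ['fiber_1', 'fiber_2']
```

With a real file, the in-memory StringIO object would be replaced by an open file handle.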

FFClust Description

FFClust is an intra-subject clustering algorithm that aims to identify compact and homogeneous fiber clusters in a large tractography dataset. The algorithm consists of four stages. Stage 1 applies Minibatch K-Means clustering to five fiber points. Stage 2 merges the fibers that share the same point clusters (map clustering). Stage 3 reassigns small clusters to bigger ones, considering the distance between fibers in direct and reverse order. Finally, stage 4 groups the clusters that share the central point and merges close clusters, represented by their centroids. The distance between fibers is defined as the maximum of the Euclidean distance between corresponding points.
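Stage 2 (map clustering) can be pictured with the following sketch. It assumes that stage 1 has already produced one K-Means label per selected point for every fiber; the function map_clustering and the label values are hypothetical and only illustrate the grouping idea, not the library's implementation.

```python
from collections import defaultdict

def map_clustering(point_labels):
    """Group fibers whose selected points fall in the same point clusters.

    point_labels[f] holds one K-Means label per selected point of fiber f.
    """
    groups = defaultdict(list)
    for fiber_index, labels in enumerate(point_labels):
        # The tuple of point-cluster labels acts as the key of the map.
        groups[tuple(labels)].append(fiber_index)
    return list(groups.values())

# Hypothetical labels for five fibers at the five selected points.
point_labels = [
    (4, 1, 7, 2, 9),
    (4, 1, 7, 2, 9),
    (4, 1, 7, 2, 8),
    (0, 3, 5, 2, 9),
    (4, 1, 7, 2, 8),
]
clusters = map_clustering(point_labels)
print(clusters)  # [[0, 1], [2, 4], [3]]
```

Because the grouping is a single pass over a dictionary, its cost grows linearly with the number of fibers, which is consistent with the aim of handling large tractography datasets.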

phybers.clustering.ffclust(file_in: str, dir_out: str, points: [0, 3, 10, 17, 20], ks: [200, 300, 300, 300, 200], assign_thr: int = 6, join_thr: int = 6) → bool

Intra-subject clustering algorithm that aims to identify compact and homogeneous fiber clusters in a large tractography dataset.

Parameters:
  • file_in (str) – Tractography data file in ‘.bundles’ format.

  • dir_out (str) – Directory to store all the results generated by the algorithm.

  • points (list()) – Index of the points to be used in the point clustering (K-means), default: 0, 3, 10, 17, 20.

  • ks (list()) – Number of clusters to be computed for each point using K-Means, default: 300, 200, 200, 200, 300.

  • assign_thr (int) – Maximum distance threshold for the cluster reassignment in mm, default: 6.

  • join_thr (int) – Maximum distance threshold for the cluster merge in mm, default: 6.

Return type:

None

Notes

This function generates the following files in the specified directory:

final_bundles (bundles files)

Directory that stores all the fiber clusters identified in separated datasets (bundles/bundlesdata files), sampled at 21 points. The file names are labeled with integer numbers ranging from zero to the total number of fiber clusters (N-1) identified.

final_bundles_allpoints (bundles files)

Directory that stores all the fiber clusters identified in separated datasets (bundles/bundlesdata files). The file names are labeled with integer numbers ranging from zero to the total number of fiber clusters (N-1) identified. This directory is only generated when the input tractography dataset file has not been sampled at 21 points.

centroids (bundles files)

A directory storing two files corresponding to a tractography dataset in bundles/bundlesdata format, containing one centroid for each created cluster. This dataset is named centroids.bundles/centroids.bundlesdata, and is sampled at 21 points in the atlas space (MNI).

bundles_id (text file)

Text file that stores the fiber indices from the original input tractography dataset file for each detected cluster. Each line in the file corresponds to one cluster, in consecutive order.

outputs (several formats)

Temporary directory with intermediate results, such as the point clusters.

Utils

The Utils module is a set of tools used for tractography dataset pre-processing and for the analysis of brain fiber clustering and segmentation results. The module includes tools for reading and writing brain fiber files in bundles format, transforming the fibers to a reference coordinate system based on a deformation field, sampling fibers at a defined number of equidistant points, calculating the intersection between sets of brain fibers, and extracting measures from and filtering fiber clusters or segmented bundles. The extracted measures include the size, the mean length (in mm), and the distance between the fibers of each cluster or fascicle (in mm). The source code is mostly developed in C/C++.

Deformation Description

The deformation sub-module transforms a tractography dataset file to another space using a nonlinear deformation file. The maps must be stored in NIfTI format, where the voxels contain the transformation to be applied to each voxel 3D space location. deformation applies the transformation to the 3D coordinates of the fiber points. The deformation can be employed on the Human Connectome Project (HCP) database [2] during the pre-processing stage before applying the segmentation algorithm.

phybers.utils.deform(deform_file: str, file_in: str, file_out: str) → None

Transforms a tractography file to another space using a non-linear deformation image file.

Parameters:
  • deform_file (str) – Deformation image (image in NIfTI format containing the deformations).

  • file_in (str) – Input tractography dataset in bundles format.

  • file_out (str) – Path to the transformed tractography dataset.

Return type:

None

Notes

This function generates the following files in the specified directory:

Tractography (bundles files)

Tractography dataset that has been transformed into the MNI space.

Sampling Description

Tractography datasets are usually composed of a large number of 3D polylines with a variable number of points. The sampling sub-module performs a sampling of the fibers, recalculating their points using a defined number of equidistant points. The input data of the algorithm are the path of the tractography dataset file to be sampled, the output file with the fibers with n points, and the number of points (npoints). The sampling sub-module is used in the pre-processing stage of the segmentation and clustering algorithms.
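The resampling idea can be sketched with linear interpolation along the arc length of the polyline. This is a minimal illustration that assumes piecewise-linear fibers; the function resample is hypothetical and the interpolation scheme used by the library may differ.

```python
import math

def resample(fiber, npoints=21):
    """Resample a polyline at npoints equidistant positions along its length."""
    if npoints < 2 or len(fiber) < 2:
        raise ValueError("need at least two input points and two output points")
    # Cumulative arc length at each original point.
    cumulative = [0.0]
    for p, q in zip(fiber, fiber[1:]):
        cumulative.append(cumulative[-1] + math.dist(p, q))
    total = cumulative[-1]

    resampled = []
    segment = 0
    for k in range(npoints):
        target = total * k / (npoints - 1)
        # Advance to the segment that contains the target arc length.
        while segment < len(fiber) - 2 and cumulative[segment + 1] < target:
            segment += 1
        start, end = fiber[segment], fiber[segment + 1]
        span = cumulative[segment + 1] - cumulative[segment]
        t = 0.0 if span == 0 else (target - cumulative[segment]) / span
        resampled.append(tuple(s + t * (e - s) for s, e in zip(start, end)))
    return resampled

# Unevenly spaced toy fiber, 4 mm long (hypothetical data).
fiber = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
sampled = resample(fiber, npoints=5)
print([round(p[0], 6) for p in sampled])  # [0.0, 1.0, 2.0, 3.0, 4.0]
```

After resampling, every fiber has the same number of points, which is the precondition for comparing fibers point by point with \(d_{ME}\).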

phybers.utils.sampling(file_in: str, file_out: str, npoints: int = 21) None[source]

Performs a fiber sampling by recalculating their points using a specified number of equidistant points.

Parameters:
  • file_in (str) – Tractography dataset file in bundles format.

  • file_out (str) – Path to save the tractography dataset sampled at n equidistant points.

  • npoints (int) – Number of sampling points (n). Default: 21.

Return type:

None

Notes

This function generates the following files in the specified directory:

Tractography (bundles files)

Tractography dataset sampled at n equidistant points in bundles format.

Intersection Description

The intersection sub-module calculates a similarity measure between two sets of brain fibers, which may have been generated with different algorithms, such as fiber clustering (fiber clusters) and bundle segmentation (segmented bundles). It uses a maximum distance threshold (in mm) to consider two fibers as similar. Both sets of fibers must be in the same space. First, a Euclidean distance matrix is calculated between the fibers of the two sets. Then, for each set, the fibers that have a similar fiber in the other set are counted. The similarity measure yields a value between 0 and 100 %. The inputs of the intersection algorithm are the two sets of fibers and the maximum distance threshold, while the output is the similarity percentage of each set with respect to the other.
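The counting step can be sketched as follows. The sketch assumes that the fiber distance is \(d_{ME}\) on fibers with the same number of points and that a distance equal to the threshold still counts as similar; both choices, as well as the function name intersection_percentages, are assumptions made for illustration.

```python
import math

def d_me(a, b):
    """Maximum Euclidean distance between corresponding points, both orders."""
    direct = max(math.dist(p, q) for p, q in zip(a, b))
    flipped = max(math.dist(p, q) for p, q in zip(a, reversed(b)))
    return min(direct, flipped)

def intersection_percentages(set1, set2, distance_thr=10.0):
    """Percentage of fibers in each set that have a similar fiber in the other."""
    def matched(source, target):
        if not source:
            return 0.0
        count = sum(
            1 for fiber in source
            if any(d_me(fiber, other) <= distance_thr for other in target)
        )
        return 100.0 * count / len(source)
    return matched(set1, set2), matched(set2, set1)

# Hypothetical fiber sets, coordinates in mm.
set1 = [
    [(0, 0, 0), (10, 0, 0), (20, 0, 0)],    # has a close match in set2
    [(0, 50, 0), (10, 50, 0), (20, 50, 0)],  # far from every fiber in set2
]
set2 = [
    [(20, 2, 0), (10, 2, 0), (0, 2, 0)],     # 2 mm from the first fiber, reversed
]
result = intersection_percentages(set1, set2)
print(result)  # (50.0, 100.0)
```

The two values differ because the measure is not symmetric: half of the first set is covered by the second, while the whole second set is covered by the first.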

phybers.utils.intersection(file1_in: str, file2_in: str, dir_out: str, distance_thr: float = 10.0) → Tuple[float, float]

Calculates a similarity measure between two sets of brain fibers (fiber clusters or segmented bundles). The similarity measure yields a value between 0 and 100%.

Parameters:
  • file1_in (str) – Path of the first fiber bundle.

  • file2_in (str) – Path of the second fiber bundle.

  • dir_out (str) – Directory name to store all the results generated by the algorithm.

  • distance_thr (float) – Distance threshold in mm used to consider two fibers similar, default: 10.

Returns:

Tuple[float, float], The first value indicates the percentage of intersection of the first set of fibers compared to the second set of fibers, and the second value indicates the reverse scenario, intersection of the second set of fibers compared to the first set of fibers.

Return type:

[float, float]

Notes

This function generates three tractography dataset files in the output directory, as follows:

fiber1-fiber2.bundles/fiber1-fiber2.bundlesdata:

Containing the fibers that are considered similar (or intersecting) for both fascicles.

only_fiber1.bundles/only_fiber1.bundlesdata:

Containing the fibers that are only in the first fascicle.

only_fiber2.bundles/only_fiber2.bundlesdata:

Containing the fibers that are only in the second fascicle.

Postprocessing Description

The PostProcessing sub-module contains a set of algorithms that can be applied to the results of the clustering and segmentation algorithms. It builds a pandas DataFrame in which each entry corresponds to the name of a fiber set (cluster or segmented fascicle), followed by measures defined on that fiber set, such as the number of fibers (size), the intra-bundle fiber distance (in mm) and the mean length (in mm). The DataFrame can be used to filter the clustering or segmentation results by one or several features. The input of the algorithm is the directory with the bundle sets to be analyzed, and the output is a pandas DataFrame with the calculated metrics.

phybers.utils.postprocessing(dir_in: str) → DataFrame

Sets of algorithms that can be applied on the results of clustering and segmentation algorithms.

Parameters:

dir_in (str) – Root directory where all segmentation or clustering algorithm results are stored.

Returns:

pd.DataFrame, contains the following list of keys: size: number of fibers in the bundle, len: centroid length per bundle, intra_mean: mean intra-bundle Euclidean distance.

Return type:

pd.DataFrame

Read and Write bundle Description

The functions read_bundle() and write_bundle() allow reading and writing files in the '.bundles/bundlesdata' format, respectively. Below, their inputs and outputs are described.

phybers.utils.read_bundle(file_in: str, points: int = 0) → list

Read tractography in bundles format.

Parameters:

file_in (str) – Path of the tractography dataset file to read, in bundles format.

Return type:

List where each position contains a NumPy array representing the 3D coordinates of a brain fiber.

phybers.utils.write_bundle(file_out: str, points: list, buffer_size=1048576) → None

Write tractography in bundles format.

Parameters:
  • file_out (str) – Path where the tractography dataset file is saved, in bundles format.

  • points (list) – List where each position contains a NumPy array representing the 3D coordinates of a brain fiber.

Return type:

None

Visualization

The tractography dataset files can be rendered with lines or cylinders. In the case of lines, the software loads the streamlines with a fixed normal per vertex, which corresponds to the normalized direction of the particular segment of the streamline. Furthermore, a Phong lighting algorithm [11] is implemented in a vertex shader to compute the color of the streamline. The MRI data is rendered using specific shaders for slice visualization and volume rendering. Meshes can be displayed using points, wireframes or shaded triangles. The graphical user interface (GUI) allows viewing several objects simultaneously, performing camera operations (zooming, rotating and panning), modifying material properties (color and transparency) and applying linear transformations to the brain tractography dataset.

Interactive 3D ROI-based fiber segmentation

This function enables users to extract bundles using 3D objects and labeled 3D images. It builds a point-based data structure for fast queries, known as an octree, which stores points inside a bounding box with a capacity of N. When a node is full and a new point is added, the node subdivides its bounding box into eight new non-overlapping nodes, and the points are moved to the new nodes. The fibers selected by each object can be combined with logical operations (and, or, xor, not). This allows the use of multiple ROIs (Regions of Interest) to find fibers that connect specific areas while excluding fibers selected by other areas.
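The subdivision rule and the ROI combinations can be sketched with a small point octree. This is an illustrative outline only: the class Octree, its method names, the box-shaped ROI query and the minimum node size are assumptions, and it presumes that all points lie inside the root bounding box.

```python
class Octree:
    """Point octree: a node holds up to `capacity` points, then splits in eight."""

    def __init__(self, center, half_size, capacity=8):
        self.center = center
        self.half_size = half_size
        self.capacity = capacity
        self.points = []      # (point, fiber_index) pairs stored in a leaf
        self.children = None  # eight child nodes once subdivided

    def _child_for(self, point):
        # One bit per axis selects the octant that contains the point.
        index = sum(1 << axis for axis in range(3)
                    if point[axis] >= self.center[axis])
        return self.children[index]

    def insert(self, point, fiber_index):
        if self.children is not None:
            self._child_for(point).insert(point, fiber_index)
            return
        self.points.append((point, fiber_index))
        # The size guard stops endless splitting on coincident points.
        if len(self.points) > self.capacity and self.half_size > 1e-3:
            self._subdivide()

    def _subdivide(self):
        quarter = self.half_size / 2
        self.children = []
        for index in range(8):
            center = tuple(
                self.center[axis] + (quarter if (index >> axis) & 1 else -quarter)
                for axis in range(3)
            )
            self.children.append(Octree(center, quarter, self.capacity))
        # Move the stored points down to the new child nodes.
        stored, self.points = self.points, []
        for point, fiber_index in stored:
            self._child_for(point).insert(point, fiber_index)

    def query_box(self, low, high, found=None):
        """Collect indices of fibers with at least one point inside [low, high]."""
        found = set() if found is None else found
        for axis in range(3):
            # Skip nodes whose bounding box does not overlap the query box.
            if (self.center[axis] + self.half_size < low[axis]
                    or self.center[axis] - self.half_size > high[axis]):
                return found
        for point, fiber_index in self.points:
            if all(low[a] <= point[a] <= high[a] for a in range(3)):
                found.add(fiber_index)
        for child in self.children or []:
            child.query_box(low, high, found)
        return found

# Hypothetical fibers: 0 stays on the left, 1 on the right, 2 connects both sides.
fibers = {
    0: [(-50.0, 0.0, 0.0), (-40.0, 0.0, 0.0)],
    1: [(40.0, 0.0, 0.0), (50.0, 0.0, 0.0)],
    2: [(-45.0, 5.0, 0.0), (45.0, 5.0, 0.0)],
}
tree = Octree(center=(0.0, 0.0, 0.0), half_size=100.0, capacity=2)
for fiber_index, points in fibers.items():
    for point in points:
        tree.insert(point, fiber_index)

roi_left = tree.query_box((-60.0, -10.0, -10.0), (-30.0, 10.0, 10.0))
roi_right = tree.query_box((30.0, -10.0, -10.0), (60.0, 10.0, 10.0))
print(roi_left & roi_right)  # {2}: fibers that pass through both ROIs
print(roi_left - roi_right)  # {0}: fibers in the left ROI but not in the right one
```

Representing each ROI selection as a set of fiber indices makes the logical operations direct: `&` for and, `|` for or, `^` for xor, and set difference for excluding a region.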

phybers.fibervis.start_fibervis(bundles=(), mri=(), mesh=(), args: list[str] = [])

Initializes the graphical user interface (GUI).

Parameters:

None

Examples

To test fibervis(), download the data from the links provided above. Then, open a Python terminal and run the following commands:

from phybers.fibervis import start_fibervis
start_fibervis()

Alternatively, you can run Fibervis from the Anaconda Prompt terminal by entering the following command:

fibervis

For your convenience when using fibervis(), a video showcasing its highlighted features is available through the following link.